2 changes: 1 addition & 1 deletion docs/automq-kafka-source.md
@@ -28,5 +28,5 @@ Click **Next**. Timeplus will connect to the server and list all topics. Choose
In the next step, confirm the schema of the Timeplus stream and specify a name. At the end of the wizard, an external stream will be created in Timeplus. You can query data or even write data to the AutoMQ topic with SQL.

See also:
* [Kafka External Stream](/proton-kafka)
* [Kafka External Stream](/kafka-source)
* [Tutorial: Streaming ETL from Kafka to ClickHouse](/tutorial-sql-etl-kafka-to-ch)
34 changes: 34 additions & 0 deletions docs/bigquery-external.md
@@ -0,0 +1,34 @@
# BigQuery

Leveraging the HTTP external stream, you can write or materialize data to BigQuery directly from Timeplus.

## Write to BigQuery {#example-write-to-bigquery}

Assume you have created a table in BigQuery with two columns:
```sql
CREATE TABLE `PROJECT.DATASET.http_sink_t1` (
  num INT,
  str STRING
);
```

Follow [the guide](https://cloud.google.com/bigquery/docs/authentication) to choose the proper authentication method for Google Cloud, for example via the gcloud CLI: `gcloud auth application-default print-access-token`.

Create the HTTP external stream in Timeplus:
```sql
CREATE EXTERNAL STREAM http_bigquery_t1 (num int,str string)
SETTINGS
type = 'http',
http_header_Authorization='Bearer $OAUTH_TOKEN',
url = 'https://bigquery.googleapis.com/bigquery/v2/projects/$PROJECT/datasets/$DATASET/tables/$TABLE/insertAll',
data_format = 'Template',
format_template_resultset_format='{"rows":[${data}]}',
format_template_row_format='{"json":{"num":${num:JSON},"str":${str:JSON}}}',
format_template_rows_between_delimiter=','
```

Replace `OAUTH_TOKEN` with the output of `gcloud auth application-default print-access-token`, or obtain an OAuth token in another secure way. Replace `PROJECT`, `DATASET`, and `TABLE` to match your BigQuery table path. Also change `format_template_row_format` to match the table schema.

Then you can insert data via a materialized view or directly via the `INSERT` command:
```sql
INSERT INTO http_bigquery_t1 VALUES(10,'A'),(11,'B');
```
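
To continuously materialize data into BigQuery, you can also attach a materialized view to the external stream. A minimal sketch, assuming a local stream `events` with matching `num` and `str` columns (the stream and view names below are only examples):

```sql
-- Assumes a local stream `events(num int, str string)` exists; adjust names to your setup.
CREATE MATERIALIZED VIEW mv_to_bigquery INTO http_bigquery_t1 AS
SELECT num, str FROM events;
```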
2 changes: 1 addition & 1 deletion docs/changelog-stream.md
@@ -403,7 +403,7 @@ Debezium also read all existing rows and generate messages like this

### Load data to Timeplus

You can follow this [guide](/proton-kafka) to add 2 external streams to load data from Kafka or Redpanda. For example:
You can follow this [guide](/kafka-source) to add 2 external streams to load data from Kafka or Redpanda. For example:

* Data source name `s1` to load data from topic `doc.public.dim_products` and put in a new stream `rawcdc_dim_products`
* Data source name `s2` to load data from topic `doc.public.orders` and put in a new stream `rawcdc_orders`
2 changes: 1 addition & 1 deletion docs/cli-migrate.md
@@ -8,7 +8,7 @@ This tool is available in Timeplus Enterprise 2.5. It supports [Timeplus Enterpr

## How It Works

The migration is done via capturing the SQL DDL from the source deployment and rerunning those SQL DDL in the target deployment. Data are read from source Timeplus via [Timeplus External Streams](/timeplus-external-stream) and write to the target Timeplus via `INSERT INTO .. SELECT .. FROM table(tp_ext_stream)`. The data files won't be copied among the source and target Timeplus, but you need to ensure the target Timeplus can access to the source Timeplus, so that it can read data via Timeplus External Streams.
The migration is done via capturing the SQL DDL from the source deployment and rerunning those SQL DDL in the target deployment. Data are read from source Timeplus via [Timeplus External Streams](/timeplus-source) and write to the target Timeplus via `INSERT INTO .. SELECT .. FROM table(tp_ext_stream)`. The data files won't be copied among the source and target Timeplus, but you need to ensure the target Timeplus can access to the source Timeplus, so that it can read data via Timeplus External Streams.


## Supported Resources
@@ -1,5 +1,7 @@
# ClickHouse External Table

## Overview

Timeplus can read or write ClickHouse tables directly. This unlocks a set of new use cases, such as

- Use Timeplus to efficiently process real-time data in Kafka/Redpanda, apply flat transformation or stateful aggregation, then write the data to the local or remote ClickHouse for further analysis or visualization.
@@ -41,7 +43,7 @@ The required settings are type and address. For other settings, the default valu

The `config_file` setting is available since Timeplus Enterprise 2.7. You can specify the path to a file that contains the configuration settings. The file should be in the format of `key=value` pairs, one pair per line. You can set the ClickHouse user and password in the file.

Please follow the example in [Kafka External Stream](/proton-kafka#config_file).
Please follow the example in [Kafka External Stream](/kafka-source#config_file).

You don't need to specify the columns, since the table schema will be fetched from the ClickHouse server.

14 changes: 6 additions & 8 deletions docs/ingestion.md → docs/connect-data-in.md
@@ -1,9 +1,9 @@
# Getting Data In
# Connect Data In

Timeplus supports multiple ways to load data into the system, or access the external data without copying them in Timeplus:

- [External Stream for Apache Kafka](/external-stream), Confluent, Redpanda, and other Kafka API compatible data streaming platform. This feature is also available in Timeplus Proton.
- [External Stream for Apache Pulsar](/pulsar-external-stream) is available in Timeplus Enterprise 2.5 and above.
- [External Stream for Apache Pulsar](/pulsar-source) is available in Timeplus Enterprise 2.5 and above.
- Source for extra wide range of data sources. This is only available in Timeplus Enterprise. This integrates with [Redpanda Connect](https://redpanda.com/connect), supporting 200+ connectors.
- On Timeplus web console, you can also [upload CSV files](#csv) and import them into streams.
- For Timeplus Enterprise, [REST API](/ingest-api) and SDKs are provided to push data to Timeplus programmatically.
@@ -15,12 +15,12 @@ Timeplus supports multiple ways to load data into the system, or access the exte
Choose "Data Collection" from the navigation menu to setup data access to other systems. There are two categories:
* Timeplus Connect: directly supported by Timeplus Inc, with easy-to-use setup wizards.
* Demo Stream: generate random data for various use cases. [Learn more](#streamgen)
* Timeplus: read data from another Timeplus deployment. [Learn more](/timeplus-external-stream)
* Timeplus: read data from another Timeplus deployment. [Learn more](/timeplus-source)
* Apache Kafka: setup external streams to read from Apache Kafka. [Learn more](#kafka)
* Confluent Cloud: setup external streams to read from Confluent Cloud
* Redpanda: setup external streams to read from Redpanda
* Apache Pulsar: setup external streams to read from Apache Pulsar. [Learn more](#pulsar)
* ClickHouse: setup external tables to read from ClickHouse, without duplicating data in Timeplus. [Learn more](/proton-clickhouse-external-table)
* ClickHouse: setup external tables to read from ClickHouse, without duplicating data in Timeplus. [Learn more](/clickhouse-external-table)
* NATS: load data from NATS to Timeplus streams
* WebSocket: load data from WebSocket to Timeplus streams
* HTTP Stream: load data from HTTP stream to Timeplus streams
@@ -29,19 +29,17 @@ Choose "Data Collection" from the navigation menu to setup data access to other
* Stream Ingestion: a wizard to guide you to push data to Timeplus via Ingest REST API. [Learn more](/ingest-api)
* Redpanda Connect: available since Timeplus Enterprise 2.5 or above. Set up data access to other systems by editing a YAML file. Powered by Redpanda Connect, supported by Redpanda Data Inc. or Redpanda Community.



### Load streaming data from Apache Kafka {#kafka}

As of today, Kafka is the primary data integration for Timeplus. With our strong partnership with Confluent, you can load your real-time data from Confluent Cloud, Confluent Platform, or Apache Kafka into the Timeplus streaming engine. You can also create [external streams](/external-stream) to analyze data in Confluent/Kafka/Redpanda without moving data.

[Learn more.](/proton-kafka)
[Learn more.](/kafka-source)

### Load streaming data from Apache Pulsar {#pulsar}

Apache® Pulsar™ is a cloud-native, distributed, open source messaging and streaming platform for real-time workloads. Since Timeplus Enterprise 2.5, Pulsar External Streams can be created to read or write data for Pulsar.

[Learn more.](/pulsar-external-stream)
[Learn more.](/pulsar-source)

### Upload local files {#csv}

39 changes: 39 additions & 0 deletions docs/databricks-external.md
@@ -0,0 +1,39 @@
# Databricks

Leveraging the HTTP external stream, you can write or materialize data to Databricks directly from Timeplus.

## Write to Databricks {#example-write-to-databricks}

Follow [the guide](https://docs.databricks.com/aws/en/dev-tools/auth/pat) to create an access token for your Databricks workspace.

Assume you have created a table in a Databricks SQL warehouse with two columns:
```sql
CREATE TABLE sales (
product STRING,
quantity INT
);
```

Create the HTTP external stream in Timeplus:
```sql
CREATE EXTERNAL STREAM http_databricks_t1 (product string, quantity int)
SETTINGS
type = 'http',
http_header_Authorization='Bearer $TOKEN',
url = 'https://$HOST.cloud.databricks.com/api/2.0/sql/statements/',
data_format = 'Template',
format_template_resultset_format='{"warehouse_id":"$WAREHOUSE_ID","statement": "INSERT INTO sales (product, quantity) VALUES (:product, :quantity)", "parameters": [${data}]}',
format_template_row_format='{ "name": "product", "value": ${product:JSON}, "type": "STRING" },{ "name": "quantity", "value": ${quantity:JSON}, "type": "INT" }',
format_template_rows_between_delimiter=''
```

Replace `TOKEN`, `HOST`, and `WAREHOUSE_ID` to match your Databricks settings. Also change `format_template_resultset_format` and `format_template_row_format` to match the table schema.

Then you can insert data via a materialized view or directly via the `INSERT` command:
```sql
INSERT INTO http_databricks_t1(product, quantity) VALUES('test',95);
```

This will insert one row per request. We plan to support batch inserts and a Databricks-specific format for different table schemas in the future.
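
To stream data continuously instead of ad-hoc inserts, you can attach a materialized view to the external stream. A minimal sketch, assuming a local stream `sales_events` with matching columns (the stream and view names are only examples):

```sql
-- Assumes a local stream `sales_events(product string, quantity int)` exists; adjust names to your setup.
CREATE MATERIALIZED VIEW mv_to_databricks INTO http_databricks_t1 AS
SELECT product, quantity FROM sales_events;
```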


24 changes: 24 additions & 0 deletions docs/datadog-external.md
@@ -0,0 +1,24 @@
# Datadog

Leveraging the HTTP external stream, you can write or materialize data to Datadog directly from Timeplus.

## Write to Datadog {#example-write-to-datadog}

Create a new API key, or use an existing one, with the proper permissions for sending data.

Create the HTTP external stream in Timeplus:
```sql
CREATE EXTERNAL STREAM datadog_t1 (message string, hostname string)
SETTINGS
type = 'http',
data_format = 'JSONEachRow',
output_format_json_array_of_rows = 1,
http_header_DD_API_KEY = 'THE_API_KEY',
http_header_Content_Type = 'application/json',
url = 'https://http-intake.logs.us3.datadoghq.com/api/v2/logs' --make sure you set the right region
```

Then you can insert data via a materialized view or directly via the `INSERT` command:
```sql
INSERT INTO datadog_t1(message, hostname) VALUES('test message','a.test.com'),('test2','a.test.com');
```
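
To forward logs continuously, you can attach a materialized view to the external stream. A minimal sketch, assuming a local stream `app_logs` with `message` and `hostname` columns (the stream and view names are only examples):

```sql
-- Assumes a local stream `app_logs(message string, hostname string)` exists; adjust names to your setup.
CREATE MATERIALIZED VIEW mv_to_datadog INTO datadog_t1 AS
SELECT message, hostname FROM app_logs;
```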
25 changes: 25 additions & 0 deletions docs/elastic-external.md
@@ -0,0 +1,25 @@
# Elasticsearch

Leveraging the HTTP external stream, you can write data to Elasticsearch or OpenSearch directly from Timeplus.

## Write to OpenSearch / Elasticsearch {#example-write-to-es}

Assuming you have created an index `students` in a deployment of OpenSearch or Elasticsearch, you can create the following external stream to write data to the index.

```sql
CREATE EXTERNAL STREAM opensearch_t1 (
name string,
gpa float32,
grad_year int16
) SETTINGS
type = 'http',
data_format = 'OpenSearch', --can also use the alias "ElasticSearch"
url = 'https://opensearch.company.com:9200/students/_bulk',
username='admin',
password='..'
```

Then you can insert data via a materialized view or directly via the `INSERT` command:
```sql
INSERT INTO opensearch_t1(name,gpa,grad_year) VALUES('Jonathan Powers',3.85,2025);
```
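
To index documents continuously, you can attach a materialized view to the external stream. A minimal sketch, assuming a local stream `student_records` with matching columns (the stream and view names are only examples):

```sql
-- Assumes a local stream `student_records(name string, gpa float32, grad_year int16)` exists; adjust names to your setup.
CREATE MATERIALIZED VIEW mv_to_opensearch INTO opensearch_t1 AS
SELECT name, gpa, grad_year FROM student_records;
```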