Snowflake connector configuration dialog #859
ryzhyk added the Web Console (Related to the browser-based UI) and adapters (Issues related to the adapters crate) labels on Oct 9, 2023.
ryzhyk pushed a commit that referenced this issue on Oct 11, 2023:
We use the Snowflake Kafka connector with Snowpipe Streaming to send a stream of output changes from Feldera into Snowflake. The main challenge is that Snowpipe currently supports inserts but not updates or deletions. The workaround is to write all updates into a set of landing tables that mirror the target database schema and use a combination of snowflake streams to incrementally apply these updates to target tables by converting them into insert/update/delete commands. So the end-to-end process is:

* Feldera outputs updates to Kafka.
* The Snowflake Kafka connector converts them into a stream of inserts into landing tables.
* We attach a Snowpipe Stream to each landing table to track changes to the table. A periodic task reads updates from the stream, applies them to the target table and removes them from the landing tables.

At the moment the landing tables and the data ingestion logic (Snowflake streams and tasks) must be written by the user, but they can in principle be automatically generated.

TODO:

- Docs (#867)
- WebConsole support (#859)
- Support Snowflake's `TIMESTAMP` format (#862)
- Figure out how to apply multiple updates atomically (see snowflakedb/snowflake-kafka-connector#717)
- Test under load.
- Automate the generation of landing tables and data ingestion tasks.
- Figure out downtime and schema evolution ops.

Addresses #774

Signed-off-by: Leonid Ryzhyk <leonid@feldera.com>
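As a rough illustration of the hand-written ingestion logic the commit describes, a minimal sketch for a single hypothetical `customer` table might look like the following. The table, column, and warehouse names, and the `__action` metadata column, are assumptions for illustration, not taken from the project:

```sql
-- Landing table mirrors the target table, plus the change-action metadata
-- emitted by Feldera (the __action column name is an assumption).
CREATE OR REPLACE TABLE landing.customer (
    id       NUMBER,
    name     VARCHAR,
    __action VARCHAR      -- 'insert' or 'delete'
);

-- Stream tracks rows appended to the landing table by the Kafka connector.
CREATE OR REPLACE STREAM landing.customer_stream
    ON TABLE landing.customer APPEND_ONLY = TRUE;

-- Periodic task drains the stream and applies the changes to the target table.
-- (A complete version would also purge consumed rows from the landing table.)
CREATE OR REPLACE TASK landing.apply_customer_changes
    WAREHOUSE = compute_wh
    SCHEDULE  = '1 MINUTE'
WHEN SYSTEM$STREAM_HAS_DATA('landing.customer_stream')
AS
MERGE INTO target.customer t
USING landing.customer_stream s ON t.id = s.id
WHEN MATCHED AND s.__action = 'delete' THEN DELETE
WHEN MATCHED AND s.__action = 'insert' THEN UPDATE SET t.name = s.name
WHEN NOT MATCHED AND s.__action = 'insert' THEN INSERT (id, name) VALUES (s.id, s.name);

-- Tasks are created suspended and must be resumed explicitly.
ALTER TASK landing.apply_customer_changes RESUME;
```

Automating the generation of exactly this kind of per-table boilerplate is one of the TODO items above.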
The design is very similar to what we did for Debezium, except that this one is based on the output Kafka connector rather than the input one.
Example config this connector should produce:
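A rough sketch of what such a config might look like, assuming Feldera's `kafka_output` transport and `json` format names; the broker, topic, and auth values are placeholders and the exact field names may differ:

```yaml
transport:
  name: kafka_output
  config:
    bootstrap.servers: "broker.example.com:9092"   # placeholder
    topic: "snowflake.landing.customer"            # placeholder
    # security/auth settings (e.g. SASL options) would go here
format:
  name: json
  config:
    update_format: "snowflake"
```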
The `transport` section is just a standard Kafka transport config. The `format` section specifies the `json` format, with `update_format` set to "snowflake" (our normal JSON format config dialog does not allow choosing the update format).

Design
Add a `Kafka-Snowflake` output connector type to WebConsole. Since this is just a Kafka connector, the WebConsole must rely on the `update_format` field to distinguish it from normal Kafka connectors when listing connectors or when opening a connector edit dialog.
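A minimal sketch of that detection logic, assuming a connector descriptor shaped like the config above (the type and field layout are assumptions, not the WebConsole's actual types):

```typescript
// Hypothetical connector descriptor; the real WebConsole types may differ.
interface ConnectorDescriptor {
  name: string;
  config: {
    transport: { name: string; config: Record<string, unknown> };
    format?: { name: string; config?: { update_format?: string } };
  };
}

// Treat a connector as "Kafka-Snowflake output" rather than plain Kafka output
// when its JSON format uses the "snowflake" update format.
function isSnowflakeOutputConnector(c: ConnectorDescriptor): boolean {
  return (
    c.config.transport.name === 'kafka_output' &&
    c.config.format?.name === 'json' &&
    c.config.format?.config?.update_format === 'snowflake'
  );
}
```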
The `Details`, `Server` and `security` tabs - no modifications. The `Format` tab only has one config option: `Data format`, and the only supported data format is "JSON" (we will add "AVRO" in the future, so maybe we can show it as a disabled option).