Snowflake connector configuration dialog #859

Closed
2 tasks
ryzhyk opened this issue Oct 9, 2023 · 0 comments · Fixed by #881
Labels
adapters (Issues related to the adapters crate), Web Console (Related to the browser based UI)

Comments

@ryzhyk
Contributor

ryzhyk commented Oct 9, 2023

The design is very similar to what we did for Debezium, except that this one is based on the output Kafka connector rather than the input one.

Example config this connector should produce:

    stream: PREFERRED_VENDOR
    transport:
      name: kafka
      config:
        topic: snowflake.preferred_vendor
    format:
      name: json
      config:
        update_format: snowflake
    max_buffered_records: 1000000

The transport section is just a standard Kafka transport config. The format section selects the json format with update_format set to "snowflake" (our normal JSON format config dialog does not allow choosing the update format).
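
For illustration only, here is the same config expressed as a typed object, roughly how the dialog might assemble it before serializing. The SnowflakeKafkaOutputConfig shape and the bootstrap.servers placeholder are assumptions for this sketch, not the actual WebConsole types:

    // Sketch only: the connector config above, expressed as a typed object the
    // way the WebConsole dialog might assemble it before serialization.
    interface SnowflakeKafkaOutputConfig {
      stream: string
      transport: { name: 'kafka'; config: Record<string, string> }
      format: { name: 'json'; config: { update_format: 'snowflake' } }
      max_buffered_records: number
    }

    const exampleConfig: SnowflakeKafkaOutputConfig = {
      stream: 'PREFERRED_VENDOR',
      transport: {
        name: 'kafka',
        config: {
          topic: 'snowflake.preferred_vendor',
          // Hypothetical: whatever standard Kafka transport settings the
          // Server and security tabs collect would also land here.
          'bootstrap.servers': 'localhost:9092'
        }
      },
      format: {
        name: 'json',
        config: { update_format: 'snowflake' }
      },
      max_buffered_records: 1000000
    }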

Design

  • Add a new Kafka-Snowflake output connector type to WebConsole. Since this is just a Kafka connector, the WebConsole must rely on the update_format field to distinguish it from normal Kafka connectors when listing connectors or when opening a connector edit dialog (see the sketch after this list).
  • The config dialog is based on the Kafka config dialog with the following changes:
    • "New Kafka Output" -> "New Kafka-Snowflake output"
    • "Add a Kafka output" -> "Output to a Snowflake table via a Kafka topic"
    • The Details, Server and security tabs need no modifications.
    • The Format tab has only one config option, Data format, and the only supported data format is "JSON" (we will add "AVRO" in the future, so we could show it as a disabled option).
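
As a rough sketch of the detection rule in the first bullet above, assuming a hypothetical ConnectorDescriptor shape (the real WebConsole types may differ):

    // Minimal sketch, not actual WebConsole code: classify a stored connector by
    // inspecting its transport name and the JSON update_format field.
    type ConnectorKind = 'KafkaOutput' | 'KafkaSnowflakeOutput' | 'Other'

    interface ConnectorDescriptor {
      transport?: { name?: string }
      format?: { name?: string; config?: { update_format?: string } }
    }

    function classifyConnector(c: ConnectorDescriptor): ConnectorKind {
      if (c.transport?.name !== 'kafka') {
        return 'Other'
      }
      // A Kafka output whose JSON format uses the snowflake update format is
      // listed and edited as a Kafka-Snowflake output connector.
      if (c.format?.name === 'json' && c.format.config?.update_format === 'snowflake') {
        return 'KafkaSnowflakeOutput'
      }
      return 'KafkaOutput'
    }

    // e.g. classifyConnector(exampleConfig) === 'KafkaSnowflakeOutput'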
ryzhyk added the adapters and Web Console labels Oct 9, 2023
ryzhyk added this to the October 24, 2023 milestone Oct 9, 2023
ryzhyk pushed a commit that referenced this issue Oct 11, 2023
We use the Snowflake Kafka connector with Snowpipe Streaming to send
a stream of output changes from Feldera into Snowflake.  The main
challenge is that Snowpipe currently supports inserts but not updates
or deletions.  The workaround is to write all updates into a set of
landing tables that mirror the target database schema and use a
combination of Snowflake streams to incrementally apply these updates
to target tables by converting them into insert/update/delete commands.

So the end-to-end process is:

* Feldera outputs updates to Kafka
* The Snowflake Kafka connector converts them into a stream of inserts
  into landing tables.
* We attach a Snowpipe Stream to each landing table to track changes
  to the table.  A periodic task reads updates from the stream,
  applies them to the target table and removes them from the landing
  tables.

At the moment the landing tables and the data ingestion logic (Snowflake
streams and tasks) must be written by the user, but they can in
principle be automatically generated.

TODO:
- Docs (#867)
- WebConsole support (#859)
- Support Snowflake's `TIMESTAMP` format (#862)
- Figure out how to apply multiple updates atomically
  (See: snowflakedb/snowflake-kafka-connector#717)
- Test under load.
- Automate the generation of landing tables and data ingestion tasks.
- Figure out downtime and schema evolution ops.

Addresses #774

Signed-off-by: Leonid Ryzhyk <leonid@feldera.com>