We use the Snowflake Kafka connector with Snowpipe Streaming to send a stream of output changes from Feldera to Snowflake. The main challenge is that Snowpipe currently supports inserts but not updates or deletes. The workaround is to write all updates to a set of landing tables that mirror the target database schema, and to use a combination of Snowflake streams and tasks to incrementally apply these updates to the target tables by converting them into insert/update/delete commands.

The end-to-end process is:

* Feldera outputs updates to Kafka.
* The Snowflake Kafka connector converts them into a stream of inserts into landing tables.
* We attach a Snowflake stream to each landing table to track changes to the table. A periodic task reads updates from the stream, applies them to the target table, and removes them from the landing table.

At the moment the landing tables and the data ingestion logic (Snowflake streams and tasks) must be written by the user, but they can in principle be generated automatically.

TODO:
- Docs (#867)
- WebConsole support (#859)
- Support Snowflake's `TIMESTAMP` format (#862)
- Figure out how to apply multiple updates atomically (see: snowflakedb/snowflake-kafka-connector#717)
- Test under load.
- Automate the generation of landing tables and data ingestion tasks.
- Figure out downtime and schema evolution ops.

Addresses #774

Signed-off-by: Leonid Ryzhyk <leonid@feldera.com>
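For illustration, the landing-table plus stream/task pattern described above might look roughly like the following Snowflake SQL. All table, column, and task names here are hypothetical, and the `__action` column encoding insert/delete is an assumption about how the change records land, not the connector's actual schema:

```sql
-- Hypothetical target table and a landing table mirroring its schema,
-- extended with an __action column carrying the change type (assumption).
CREATE TABLE t (id INT, s VARCHAR);
CREATE TABLE t_landing (id INT, s VARCHAR, __action VARCHAR);

-- A Snowflake stream tracks rows appended to the landing table.
CREATE STREAM t_stream ON TABLE t_landing;

-- A periodic task drains the stream and applies the changes to the target.
-- Consuming the stream in a DML statement advances its offset.
CREATE TASK t_apply
  SCHEDULE = '1 minute'
  WHEN SYSTEM$STREAM_HAS_DATA('t_stream')
AS
  MERGE INTO t USING t_stream ON t.id = t_stream.id
    WHEN MATCHED AND t_stream.__action = 'delete' THEN DELETE
    WHEN NOT MATCHED AND t_stream.__action = 'insert'
      THEN INSERT (id, s) VALUES (t_stream.id, t_stream.s);

ALTER TASK t_apply RESUME;
```

This sketch omits purging consumed rows from the landing table and the atomicity question tracked in snowflakedb/snowflake-kafka-connector#717; a real version would handle updates (delete followed by insert for the same key) and run the drain-and-purge steps in one transaction.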