Implement Cassandra Table Assets #366

Merged Sep 7, 2023 (7 commits)

@eolivelli (Member) commented Sep 6, 2023

Summary:

  • implement support for the "cassandra-keyspace" and "cassandra-table" assets
  • see below for the usage of the new assets
  • in vector-db-sink with Cassandra/Astra, "table-name" and "keyspace" are now separate settings (previously there was only "table", which included the keyspace)
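
As a rough sketch of that last point, the two YAML documents below contrast the old and new shape of a vector-db-sink configuration. The "keyspace.table" form of the old "table" value is an assumption based on the note that it "included the keyspace"; the values themselves are illustrative.

```yaml
# Before (assumed form): one "table" setting carrying keyspace and table together
configuration:
  datasource: "AstraDatasource"
  table: "vsearch.products"
---
# After: keyspace and table name are configured separately
configuration:
  datasource: "AstraDatasource"
  table-name: "products"
  keyspace: "vsearch"
```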

Other internal changes:

  • renamed AstraDBDataSource to CassandraDataSource; both Astra and Cassandra are now supported almost everywhere (query, vector-db-sink and the Cassandra assets)
  • removed the DataSourceConfig object, as it worked only for Astra

This is what the Cassandra assets look like:

name: "Write to Astra DB"
topics:
  - name: "input-topic"
    creation-mode: create-if-not-exists
assets:
  - name: "vsearch-keyspace"
    asset-type: "cassandra-keyspace"
    creation-mode: create-if-not-exists
    config:
      keyspace: "products"
      datasource: "CassandraDatasource"
      create-statements:
        - "CREATE KEYSPACE products WITH REPLICATION = {'class' : 'SimpleStrategy','replication_factor' : 1};"
  - name: "products-table"
    asset-type: "cassandra-table"
    creation-mode: create-if-not-exists
    config:
      table-name: "products"
      keyspace: "products"
      datasource: "CassandraDatasource"
      create-statements:
        - "CREATE TABLE IF NOT EXISTS products.products (id int PRIMARY KEY,name TEXT,description TEXT);"
pipeline:
  - name: "Write to Cassandra"
    type: "vector-db-sink"
    input: "input-topic"
    configuration:
      datasource: "CassandraDatasource"
      table-name: "products"
      keyspace: "products"
      mapping: "id=value.id,description=value.description,name=value.name"

In this example we declare a "cassandra-keyspace" asset that ensures the Cassandra keyspace exists; if the keyspace doesn't exist, the asset creates it.

Then we have a "cassandra-table" asset that creates the table in that keyspace.

Please note that "create-statements" is a list, so you can perform multiple operations. For instance, you can create the table and then create an index on a vector field (or on other fields).
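
A minimal sketch of the multi-statement case: a "cassandra-table" asset that creates the table and then an index on a vector column. The "embeddings" column, its dimension (1536) and the index name are hypothetical and not part of this PR. The VECTOR type and the StorageAttachedIndex syntax assume a Cassandra or Astra version with vector search support, and the statements are presumably executed in list order.

```yaml
assets:
  - name: "products-table"
    asset-type: "cassandra-table"
    creation-mode: create-if-not-exists
    config:
      table-name: "products"
      keyspace: "products"
      datasource: "CassandraDatasource"
      create-statements:
        # First statement: the table, with a hypothetical vector column
        - "CREATE TABLE IF NOT EXISTS products.products (id int PRIMARY KEY, name TEXT, description TEXT, embeddings VECTOR<FLOAT, 1536>);"
        # Second statement: an index on the vector column (hypothetical index name)
        - "CREATE CUSTOM INDEX IF NOT EXISTS products_ann_index ON products.products(embeddings) USING 'StorageAttachedIndex';"
```

Using IF NOT EXISTS in each statement keeps the list safe to re-run if one statement has already been applied.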

@eolivelli eolivelli marked this pull request as ready for review September 7, 2023 11:44
@eolivelli eolivelli merged commit 20e07b1 into main Sep 7, 2023
8 checks passed
@eolivelli eolivelli deleted the impl/cassandra-table-asset branch September 7, 2023 12:11
benfrank241 pushed a commit to vectorize-io/langstream that referenced this pull request May 2, 2024