Rework support of AstraDB and Cassandra #548

Merged
merged 8 commits into from Feb 8, 2024

Collaborator

@clun clun commented Jan 25, 2024

DataStax Astra DB, the SaaS offering, has introduced a new way to integrate with vector databases: an HTTP API instead of a connection to the Cassandra cluster. It is called the Data API and follows MongoDB principles, organizing data into collections.

The pull request includes the following:

Update on previous implementations

  • The previous embedding store implementations are now grouped into a single CassandraEmbeddingStore. It can be instantiated for Astra or OSS Cassandra through two different builders; everything else is the same.

  • The previous chat memory store implementations are now grouped into a single CassandraChatMemoryStore. It can be instantiated for Astra or OSS Cassandra through two different builders; everything else is the same.

  • Integration tests for OSS Cassandra now use Testcontainers (now that the Cassandra 5-alpha2 image is out).

Usage

// Using Astra (Cassandra as a service in the cloud)
CassandraEmbeddingStore.builderAstra()
  .token(token)
  .databaseId(dbId)
  .databaseRegion(TEST_REGION)
  .keyspace(KEYSPACE)
  .table(TEST_INDEX)
  .dimension(11)
  .metric(CassandraSimilarityMetric.COSINE)
  .build();

// Using OSS Cassandra
CassandraEmbeddingStore.builder()
  .contactPoints(Arrays.asList(contactPoint.getHostName()))
  .port(contactPoint.getPort())
  .localDataCenter(DATACENTER)
  .keyspace(KEYSPACE)
  .table(TEST_INDEX)
  .dimension(11)
  .metric(CassandraSimilarityMetric.COSINE)
  .build();
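
For readers unfamiliar with the metric setting in the builders above, here is a minimal sketch of what a cosine metric computes between two vectors of the same dimension. It is plain Java for illustration only: the class CosineSketch and its method are hypothetical and not part of langchain4j, and in the actual store the similarity search is presumably evaluated by the database rather than in client code.

```java
// Illustrative only: CosineSketch is a hypothetical helper, not part of langchain4j.
final class CosineSketch {

    /** Cosine similarity of two equal-length vectors: dot(a, b) / (|a| * |b|). */
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                "Dimension mismatch: " + a.length + " vs " + b.length);
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException(
                "Cosine similarity is undefined for a zero vector");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        float[] x = {1f, 0f};
        float[] y = {0f, 1f};
        System.out.println(cosine(x, x)); // same direction
        System.out.println(cosine(x, y)); // orthogonal directions
    }
}
```

The dimension check mirrors why the builders take a fixed dimension: every stored and queried vector has to have the same length for the metric to be defined.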

  • Adding JDK 11 in the pom

<properties>
  <maven.compiler.source>11</maven.compiler.source>
  <maven.compiler.target>11</maven.compiler.target>
</properties>
  • Introducing insertMany(), now used for all bulk loading

  • Extending the variables EmbeddingStoreIT

  • Using MessageWindowChatMemory for the tests.
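
The insertMany() item above concerns bulk loading. As a rough illustration of the general idea only, and not the code in this pull request, a bulk insert can split the rows into fixed-size batches and submit the batches concurrently. Everything in this sketch is an assumption: the class BulkInsertSketch, the insertBatch callback standing in for a database call, and the batch size.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Illustrative only: a hypothetical helper, not the insertMany() from this pull request.
final class BulkInsertSketch {

    /** Splits items into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    /**
     * Submits every batch asynchronously and waits for all of them.
     * insertBatch stands in for the real database call and returns
     * the number of rows it wrote.
     */
    static <T> int insertMany(List<T> items, int batchSize,
                              Function<List<T>, Integer> insertBatch) {
        List<CompletableFuture<Integer>> futures = new ArrayList<>();
        for (List<T> batch : partition(items, batchSize)) {
            futures.add(CompletableFuture.supplyAsync(() -> insertBatch.apply(batch)));
        }
        int written = 0;
        for (CompletableFuture<Integer> future : futures) {
            written += future.join(); // rethrows (wrapped) if a batch failed
        }
        return written;
    }

    public static void main(String[] args) {
        List<String> rows = List.of("a", "b", "c", "d", "e");
        // Fake "database" that just reports how many rows each batch held.
        int written = insertMany(rows, 2, List::size);
        System.out.println(written);
    }
}
```

Waiting on every future before returning keeps the call synchronous from the caller's point of view while still letting the batches run in parallel.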

@langchain4j
Owner

Hi @clun thanks a lot for your contribution! I will check this next week after the release.

langchain4j previously approved these changes Feb 8, 2024
Owner

@langchain4j langchain4j left a comment

@clun great job, thank you!

@langchain4j langchain4j merged commit cd006b1 into langchain4j:main Feb 8, 2024
6 checks passed