Skip to content

Conversation

rohanshah18
Copy link
Contributor

@rohanshah18 rohanshah18 commented Mar 13, 2024

Problem

  1. The SDK currently lacks support for upserting a list of vectors.
  2. We will like to rename the clients and config classes so they can match Node and Python SDK's nomenclature.
  3. Also, for data plane operations, users have to instantiate PineconeConnection object, so to improve the UX and align it with Node and Python SDK, can we get/create a blocking and async data plane connection using Pinecone object?

Solution

Firstly, we have added the support to upsert a list of vectors.
Example usage:

List<String> upsertIds = Arrays.asList("v1", "v2", "v3");
String namespace = "namespace";
List<Float> values = Arrays.asList(1f, 2f, 3f);
List<Long> sparseIndices = Arrays.asList(0L, 1L, 2L);
List<Float> sparseValues = Arrays.asList(1.5f, 3.5f);
Struct metadataStruct = null;

// Create a list of vectors of size 3
List<VectorWithUnsignedIndices> vectors = new ArrayList<>(3);

for (String id : upsertIds) {
    vectors.add(buildUpsertVectorWithUnsignedIndices(id, values, sparseIndices, sparseValues, metadataStruct));
}
index.upsert(vectors, namespace);

We have renamed the following clients and config classes:

  1. PineconeControlPlaneClient is renamed to Pinecone
  2. PineconeBlockingDataPlaneClient is renamed to Index
  3. PineconeFutureDataPlaneClient is renamed to AsyncIndex

Next, we have added createIndexConnection() and createAsyncIndexConnection() to Pinecone class to improve the usage:

String apiKey = "PINECONE_API_KEY";
String indexName = "PINECONE_INDEX_NAME";
Pinecone pinecone = new Pinecone(apiKey);

// All control plane operations can be called using pinecone object
IndexModel indexModel = pinecone.describeIndex(indexName);

// For data plane operations:
Index index = pinecone.createAsyncIndexConnection(indexName);
List<String> upsertIds = Arrays.asList("v1", "v2", "v3");
FetchResponse fetchResponse = index.fetch(upsertIds);

// For async data plane operations:
Index index = pinecone.createAsyncIndexConnection(indexName);
List<String> upsertIds = Arrays.asList("v1", "v2", "v3");
ListenableFuture<FetchResponse> fetchResponse = index.fetch(upsertIds);

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update
  • Infrastructure change (CI configs, etc)
  • Non-code change (docs, etc)
  • None of the above: (explain here)

Test Plan

I have added UpsertAndQueryTestServerlessTest to test both createIndexConnection() and createAsyncIndexConnection() methods. Furthermore, I have renamed existing dataplane tests for pod indexes by adding pod to the class name. I'll be working on adding more data plane tests using both index creation methods for serverless indexes as a part of my next PR.

@rohanshah18 rohanshah18 changed the title Rename clients, add getIndex() and getAsyncIndex() in Pinecone Rename clients, add createIndexConnection() and createAsyncIndexConnection() Mar 13, 2024
@rohanshah18 rohanshah18 marked this pull request as ready for review March 13, 2024 21:22
@ssmith-pc
Copy link
Contributor

@rohanshah18, @austin-denoble, does it make sense to combine your efforts on integration tests and we can review them together? If so, I'll focus my review on src/main in this PR

@rohanshah18
Copy link
Contributor Author

@ssmith-pc I wouldn't mind it in normal circumstances but since this is the last ticket for milestone 2 and to get the next release out v1.0.0.RC1, I'd like to merge this as it is.

@austin-denoble
Copy link
Contributor

@ssmith-pc I wouldn't mind it in normal circumstances but since this is the last ticket for milestone 2 and to get the next release out v1.0.0.RC1, I'd like to merge this as it is.

I'm definitely fine having you merge first and I'll rebase my work on top. Working on finishing up reviewing now.

Copy link
Contributor

@austin-denoble austin-denoble left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks good, I really appreciate the renaming and the helper functions added to Pinecone. This feels like it helps move us further along with our UX desires. I'm good to merge this in as is.

I've left some questions and concerns. I think we need to follow up as a group on handling upsert.

@rohanshah18 rohanshah18 changed the title Rename clients, add createIndexConnection() and createAsyncIndexConnection() Add upsert(List<Vectors>, String namespace), Rename clients, add createIndexConnection() and createAsyncIndexConnection() Mar 14, 2024
@rohanshah18 rohanshah18 merged commit 7536ac0 into main Mar 14, 2024
@rohanshah18 rohanshah18 deleted the rshah/renameClients branch March 14, 2024 17:52
Copy link
Contributor

@austin-denoble austin-denoble left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks very much for getting the list vector upsert support in place. I've left some questions and comments but we can address those in follow-up PRs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants