jhamon (Collaborator) commented Feb 7, 2025

Problem

In preparation for the v6 release, we need to merge recent changes from the release-candidate/2025-01 branch.

Solution

Merge conflicts seem limited to just a few things in the pyproject.toml and poetry.lock files, as well as a few method signatures modified to indicate None as the return type. So I think this should go pretty smoothly.

Dave Rigby found that upgrading this protobuf dependency delivered
a significant performance improvement, but I needed to temporarily
revert it to push out a 5.x release without a breaking change to
dependencies.
## Problem

Needed to consume the 2024-10 API spec.

## Solution

The spec updates involved many changes to operation ids. This should not
touch the user experience, but does mean this diff is somewhat larger
than usual due to the large number of tests for existing functionality
that needed to be adjusted for the new names:
- upsert > upsert_vector
- fetch > fetch_vectors
- query > query_vectors
- delete > delete_vectors
- list > list_vectors
- start_import > start_bulk_import
- describe_import > describe_bulk_import
- etc

### Other changes

#### Cleanup prerelease code no longer needed
- Deleted `pinecone/core_ea/` which was the home of some code generated
off of the 2024-10 spec when bulk import functionality was in a
pre-release state. We no longer need this. That's why this PR has 21k
lines removed.

#### Changes to generation process
- Placed copies of classes such as `ApiClient`, `Endpoint`,
`ModelNormal`, etc., which were previously generated, into a new folder,
`pinecone/openapi_support`. Even though these used to be generated, they
contain no dynamic content. So taking them out of the generation process
should make it significantly easier to work on core improvements to the
UX and performance of the generated code, since the source of truth will
no longer be a mustache file (FML)
- Updated the codegen script (`./codegen/build-oas.sh 2024-10 false`) to
delete rather than de-dupe copies of shared classes like `ApiClient`, and
to edit the model and API files that continue to be generated so they
import the classes they need from the new `openapi_support` folder.

#### mypy fixes
- I needed to make adjustments in some `openapi_support` classes to
satisfy the mypy type checker (`pinecone/core` has been ignored in the
past, and our goal should be to eventually have 100% of code
typechecked).
- Wrote some initial unit tests to characterize some of the behavior in
`ApiClient`. The tests were to help give me more confidence I wasn't
breaking something along the way while making adjustments for mypy.
- Added a dev dependency on a package with types for datetime

## Type of Change

- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [x] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [ ] This change requires a documentation update
- [ ] Infrastructure change (CI configs, etc)
- [ ] Non-code change (docs, etc)
- [ ] None of the above: (explain here)
## Problem

Sometimes users incorrectly instantiate the Index client like this:

```python
import pinecone

# Initialize the Index with the host
index = pinecone.Index(index_name, host=index_host)
```

Then they will later get an authentication exception when using it,
since the `Index` class does not have the configuration values it needs
when attempting to perform vector operations.

```python
ForbiddenException: (403) Reason: Forbidden HTTP response headers: HTTPHeaderDict({'Date': 'Wed, 13 Nov 2024 02:06:45 GMT', 'Content-Type': 'text/plain', 'Content-Length': '9', 'Connection': 'keep-alive', 'x-pinecone-auth-rejected-reason': 'Wrong API key', 'www-authenticate': 'Wrong API key', 'server': 'envoy'}) HTTP response body: Forbidden
```

## Solution

- Rename the `Index` implementation to `_Index` so that people will not
accidentally interact with it directly.
- Add a new stub implementation of `Index` that raises an informative
error, sketched below.
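
A condensed sketch of the stub, based on the traceback output shown in the Usage section below (the real error message also echoes the user's args/kwargs in the INCORRECT/CORRECT USAGE examples):

```python
class IndexClientInstantiationError(Exception):
    def __init__(self, index_args, index_kwargs):
        # The real message also renders INCORRECT/CORRECT USAGE examples
        # built from the args/kwargs the user attempted to pass.
        super().__init__(
            "You are attempting to access the Index client directly from the "
            "pinecone module. The Index client must be instantiated through "
            "the parent Pinecone client instance so that it can inherit "
            "shared configurations such as API key."
        )


class Index:
    def __init__(self, *args, **kwargs):
        raise IndexClientInstantiationError(args, kwargs)
```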

Bonus:
- Move the Index client docstrings off the implementation class and into
a related abstract base class. This wasn't strictly necessary, but I was
feeling a bit overwhelmed by the size of the `index.py` file. I copied
this approach from the grpc module. pdoc seems to still find and render
the docs okay when doing this.

## Usage

The error message incorporates the args/kwargs the user was attempting
to pass.

One positional arg
```python
>>> import pinecone
>>> i = pinecone.Index('my-index')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/jhamon/workspace/pinecone-python-client/pinecone/data/index.py", line 113, in __init__
    raise IndexClientInstantiationError(args, kwargs)
pinecone.data.index.IndexClientInstantiationError: You are attempting to access the Index client directly from the pinecone module. The Index client must be instantiated through the parent Pinecone client instance so that it can inherit shared configurations such as API key.

    INCORRECT USAGE:
        ```
        import pinecone

        pc = pinecone.Pinecone(api_key='your-api-key')
        index = pinecone.Index('my-index')
        ```

    CORRECT USAGE:
        ```
        from pinecone import Pinecone

        pc = Pinecone(api_key='your-api-key')
        index = pc.Index('my-index')
        ```
```

Multiple positional args

```python
>>> i = pinecone.Index('my-index', 'https://my-index.blahblah.com')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/jhamon/workspace/pinecone-python-client/pinecone/data/index.py", line 113, in __init__
    raise IndexClientInstantiationError(args, kwargs)
pinecone.data.index.IndexClientInstantiationError: You are attempting to access the Index client directly from the pinecone module. The Index client must be instantiated through the parent Pinecone client instance so that it can inherit shared configurations such as API key.

    INCORRECT USAGE:
        ```
        import pinecone

        pc = pinecone.Pinecone(api_key='your-api-key')
        index = pinecone.Index('my-index', 'https://my-index.blahblah.com')
        ```

    CORRECT USAGE:
        ```
        from pinecone import Pinecone

        pc = Pinecone(api_key='your-api-key')
        index = pc.Index('my-index', 'https://my-index.blahblah.com')
        ```

```


One keyword arg:

```python
>>> i = pinecone.Index(host='https://my-index.blahblah.com')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/jhamon/workspace/pinecone-python-client/pinecone/data/index.py", line 113, in __init__
    raise IndexClientInstantiationError(args, kwargs)
pinecone.data.index.IndexClientInstantiationError: You are attempting to access the Index client directly from the pinecone module. The Index client must be instantiated through the parent Pinecone client instance so that it can inherit shared configurations such as API key.

    INCORRECT USAGE:
        ```
        import pinecone

        pc = pinecone.Pinecone(api_key='your-api-key')
        index = pinecone.Index(host='https://my-index.blahblah.com')
        ```

    CORRECT USAGE:
        ```
        from pinecone import Pinecone

        pc = Pinecone(api_key='your-api-key')
        index = pc.Index(host='https://my-index.blahblah.com')
        ```
```

Multiple kwargs

```python
>>> i = pinecone.Index(name='my-index', host='https://my-index.blahblah.com', pool_threads=20)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/jhamon/workspace/pinecone-python-client/pinecone/data/index.py", line 113, in __init__
    raise IndexClientInstantiationError(args, kwargs)
pinecone.data.index.IndexClientInstantiationError: You are attempting to access the Index client directly from the pinecone module. The Index client must be instantiated through the parent Pinecone client instance so that it can inherit shared configurations such as API key.

    INCORRECT USAGE:
        ```
        import pinecone

        pc = pinecone.Pinecone(api_key='your-api-key')
        index = pinecone.Index(name='my-index', host='https://my-index.blahblah.com', pool_threads=20)
        ```

    CORRECT USAGE:
        ```
        from pinecone import Pinecone

        pc = Pinecone(api_key='your-api-key')
        index = pc.Index(name='my-index', host='https://my-index.blahblah.com', pool_threads=20)
        ```
```

Mixed args/kwargs

```python
>>> i = pinecone.Index('my-index', host='https://my-index.blahblah.com', pool_threads=20)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/jhamon/workspace/pinecone-python-client/pinecone/data/index.py", line 113, in __init__
    raise IndexClientInstantiationError(args, kwargs)
pinecone.data.index.IndexClientInstantiationError: You are attempting to access the Index client directly from the pinecone module. The Index client must be instantiated through the parent Pinecone client instance so that it can inherit shared configurations such as API key.

    INCORRECT USAGE:
        ```
        import pinecone

        pc = pinecone.Pinecone(api_key='your-api-key')
        index = pinecone.Index('my-index', host='https://my-index.blahblah.com', pool_threads=20)
        ```

    CORRECT USAGE:
        ```
        from pinecone import Pinecone

        pc = Pinecone(api_key='your-api-key')
        index = pc.Index('my-index', host='https://my-index.blahblah.com', pool_threads=20)
```

## Type of Change

- [x] New feature (non-breaking change which adds functionality)
- [x] Breaking change 

This change is a UX feature that could be considered breaking if someone
was importing `Index` directly and going out of their way to set it up
correctly. This was never documented usage, but somebody skilled at
reading code could have figured out how to do this.
## Problem

With my asyncio client in progress (not in this diff), I found myself
needing to reuse a lot of the openapi request-building boilerplate.

## Solution

Rather than copy/paste and create a lot of duplication, I want to pull
that logic out into a separate class for building request objects. This
change should not break existing behavior in the Index client.
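
The shape of the extraction looks roughly like this; the method names and signatures here are illustrative, not the exact factory API:

```python
class IndexRequestFactory:
    """Centralizes translation of user input into openapi request objects."""

    @staticmethod
    def query_request(top_k, vector=None, namespace=None, filter=None, **kwargs):
        # Validation and request construction live in one place so that any
        # client implementation (sync today, asyncio later) can reuse them.
        if not isinstance(top_k, int) or top_k < 1:
            raise ValueError("top_k must be a positive integer")
        args = {"top_k": top_k, **kwargs}
        if vector is not None:
            args["vector"] = vector
        if namespace is not None:
            args["namespace"] = namespace
        if filter is not None:
            args["filter"] = filter
        return args  # in the SDK this would be an openapi QueryRequest object
```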

## Todo

This was pretty much just a mechanical extraction. I'd like to clean up
and standardize some stuff regarding `_check_type`, an openapi param
that we're going to silly lengths to flip the default behavior on, in a
follow-up diff.

## Type of Change

- [x] None of the above: Refactor

## Test Plan

Existing tests should still pass.
## Problem

In order to merge results across multiple queries, the SDK must know
which similarity metric an index is using. For dotproduct and cosine
indexes a larger score is better, while for euclidean a smaller score is
better. Unfortunately, the data plane API does not currently expose the
metric type, and a separate call to the control plane to find out seems
undesirable from a resiliency and performance perspective.

As a workaround, in the initial implementation of `query_namespaces` the
SDK would infer the similarity metric needed to merge results by
checking whether the scores of query results were ascending or
descending. This worked well, but imposed an implicit requirement that
at least 2 results be returned.

We initially believed this would not be a problem, but have since
learned that applications using filtering can sometimes filter out all
or most results. So an approach where the user explicitly tells the SDK
which similarity metric is in use is preferred, since it handles the
edge cases with 1 or 0 results.

## Solution

- Add a required kwarg to `query_namespaces` to specify the index
similarity metric.
- Modify `QueryResultsAggregator` to use this similarity metric, and
strip out the code that was involved in inferring whether results were
ascending or descending.
- Adjust integration tests to pass the new metric kwarg. Except for
adding the new kwarg, the query_namespaces integration tests did not
need to change, which indicates the underlying behavior is still working
as before.
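
For illustration, a minimal sketch of metric-aware merging, assuming each namespace's result list arrives sorted best-first and each match exposes a `score` attribute (names here are illustrative, not the aggregator's internals):

```python
import heapq
import itertools

def merge_results(result_sets, metric, top_k):
    # dotproduct/cosine: larger scores are better, so inputs are sorted
    # descending and we merge in reverse; for euclidean, smaller is better.
    larger_is_better = metric in ("dotproduct", "cosine")
    merged = heapq.merge(
        *result_sets,
        key=lambda match: match.score,
        reverse=larger_is_better,
    )
    return list(itertools.islice(merged, top_k))
```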

## Type of Change

- [x] Bug fix (non-breaking change which fixes an issue)
## Problem

By renaming the `Index` class to `_Index` in
#418, I
accidentally broke plugin installation.

## Solution

Rename the class back to `Index` and instead alias it to `_Index`
through import statements. We still need to alias it within the
`pinecone/data/__init__.py` module definition to avoid collision with
the Index class designed to give users a more informative error when
they incorrectly import the `Index` class from the `pinecone` module.

I went ahead and also cleaned up some stuff where we were unnecessarily
exposing some internal openapi details in the public interface. Users
should never realistically need to construct openapi request objects.
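
The aliasing amounts to something like this (module paths assumed from the description above):

```python
# pinecone/data/__init__.py
from .index import Index as _Index  # real implementation, aliased so it
                                    # doesn't collide with the stub

# The `Index` name exported from the top-level `pinecone` module remains
# the stub that raises IndexClientInstantiationError on direct use.
```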

## Type of Change

- [x] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [ ] This change requires a documentation update
- [ ] Infrastructure change (CI configs, etc)
- [ ] Non-code change (docs, etc)
- [ ] None of the above: (explain here)
## Problem

Need to incorporate change for upcoming sparse index support.

## Solution

Incorporating these changes presents a few different types of
challenges:

### Control plane changes
- `create_index` adjusted to accept a new optional field, `vector_type`.

### Data plane changes
- Codegen for proto changes
- Needed to add codegen scripts to support generating new grpc message
formats from an updated proto file. This used to be done with scripts
directly invoking the `protoc` CLI from the now-deprecated private repo
`pinecone-protos`, but since we've moved away from that pattern, we
needed to implement something new here.
- I decided to use the [Buf CLI](https://buf.build/product/cli), which
gives a more modern way to manage the required dependencies for stub
generation via a plugin system.

- Dependency changes
- Had to bump our protobuf and googleapis-common-protos deps to match
the Buf CLI's plugin output. But that's a good thing, as we were on
fairly old versions, and updating protobuf in particular is supposed to
unlock some nontrivial performance benefit.

- Vector object switcheroo for UX 
- The `Vector` object generated by OpenAPI for the REST pathway was
unfortunately insisting that an empty array `values=[]` be sent when
upserting for sparse indexes. This seems annoying and stupid if you are
focused on the sparse use case, so I ended up moving away from having
users use that generated object in favor of my own custom Vector
dataclass. The blast radius of this change was fairly large, requiring
updates to the various vector factory classes and hundreds of tests.
- For those keeping score, there are now 5 similar things that can be
called `Vector` inside the SDK:
    - a class generated with OpenAPI
    - a grpc Message object
    - a user-supplied dictionary with specific keys defined by a `TypedDict`
    - a user-supplied tuple with either 2 or 3 elements representing id, values, and metadata
    - **New: a `@dataclass` I created in this PR**
- Out of these "vector" things listed above, the SDK used to provide the
OpenAPI vector class to those who write code along the lines of `from
pinecone import Vector`. This was already bad due to the
interoperability problems between the GRPC and REST code paths this
brought into the vector factory classes (e.g. if you were attempting to
pass an OpenAPI Vector object into the GRPC-based upsert implementation
extra logic was needed to convert that into the grpc Message object).
But with the introduction of sparse, there was no workaround that would
give the user the ability to instantiate the OpenAPI vector without also
passing `values=[]` which just seemed like a bridge too far from a UX
standpoint. So the decision was made to use this name `Vector` to refer
to the custom dataclass instead and have that be the main vehicle for
user input.
- This object switch also caused a bunch of mypy type checks and tests
to start failing in subtle ways that required extra time and effort to
track down; even though the new and old objects were functionally
equivalent to the user, they are not the same to the type checker, so
many type annotations and tests had to be adjusted. But it seems worth
it for the better UX, and it also gets us onto a standard footing for
how input should be supplied to both REST and GRPC index clients.
- I started writing new integration tests for sparse indexes, but
they're disabled in this PR because they currently can't run on prod.


### Other changes

- I disabled tests on the CI dev release workflow. Seems like I end up
doing this every time I am working on a complex change and want to do
some hands-on testing with Google Colab even when 100% of tests aren't
passing. I think I will leave the tests commented out until I can
figure out how to wire up a config option that controls them.
- I created some named type definitions under `pinecone/data/types` for
dictionary objects the user can pass in various places for vector,
sparse values, metadata, filter, etc. (see the sketch after this list).
This is to increase the consistency and readability of types in the
code; when they were defined inline, small inconsistencies sometimes
crept in. This wasn't really planned in advance, but I had enough
challenges wrangling mypy that it seemed simpler to centralize some of
these definitions in one place than to try to iron out minor
inconsistencies any other way.
- A lot of fixture stuff was moved out of
tests/integration/data/conftest and into individual test files. This
unified "upsert everything, then run all the tests" approach first began
when the product surface was much smaller and freshness was a much
bigger problem so it was more advantageous to do all the post-upsert
waiting at one time before running any tests. In practice, this has been
a big obstacle to development because it means you can't easily run a
single test file without the heavy duty global setup and it creates a
big opportunity for test pollution if any tests modify state in the
index. Breaking this up and putting setup closer to the places where it
is tested solves this problem and makes it easier to work with
individual test files.
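
An illustrative sketch of what centralizing these definitions looks like; the actual names in `pinecone/data/types` may differ:

```python
from typing import Any, Dict, List, Tuple, TypedDict, Union

class SparseValuesTypedDict(TypedDict):
    indices: List[int]
    values: List[float]

VectorMetadataTypedDict = Dict[str, Any]

# A user-supplied vector tuple is (id, values) or (id, values, metadata)
VectorTuple = Tuple[str, List[float]]
VectorTupleWithMetadata = Tuple[str, List[float], VectorMetadataTypedDict]

FilterTypedDict = Dict[str, Union[str, int, float, bool, list, dict]]
```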

## Usage

```python

import random
from pinecone import Pinecone, ServerlessSpec, Vector, SparseValues

pc = Pinecone(api_key='key')

# Create sparse index
index_name = f"sparse-testing-{random.randint(0,10000)}"
pc.create_index(
    name=index_name,
    metric='dotproduct',
    spec=ServerlessSpec(cloud='aws', region='us-east-1'),
    vector_type='sparse'
)

# Get the index client
sparse_index = pc.Index(name=index_name)

# Upsert sparse (random data just for illustration purposes)
def unique_random_integers(n, range_start, range_end):
    if n > (range_end - range_start + 1):
        raise ValueError("Range too small for the requested number of unique integers")
    return random.sample(range(range_start, range_end + 1), n)

# Generate some random sparse vectors
sparse_index.upsert(
    vectors=[
        Vector(
            id=str(i),
            sparse_values=SparseValues(
                indices=unique_random_integers(10, 0, 10000),
                values=[random.random() for j in range(10)]
            )
        ) for i in range(10000)
    ],
    batch_size=500,
)

# Query sparse
sparse_index.query(
    top_k=10,
    sparse_vector={
        "indices": [1,2,3,4,5],
        "values": [random.random()]*5
    }
)
```


## Type of Change

- [x] New feature (non-breaking change which adds functionality)

## Problem

Want to expose kwargs for setting and modifying index tags

## Solution

The usual. The generated bits backing this feature already got created
when I generated off the `2025-01` spec.

I added a small amount of logic to handle merging new and existing tags
when using configure.
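
The merge semantics described here and demonstrated below boil down to something like this (a sketch, not the exact implementation):

```python
def merge_tags(existing: dict, updates: dict) -> dict:
    # New values win on key collisions; sending an empty string for a key
    # removes that tag entirely.
    merged = {**existing, **updates}
    return {k: v for k, v in merged.items() if v != ""}
```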

## Usage

```python
>>> from pinecone import Pinecone, ServerlessSpec
>>> pc = Pinecone(api_key='key')

# Create index with optional tags dictionary
>>> pc.create_index(
...   name='myindex',
...   dimension=10,
...   metric='cosine',
...   spec=ServerlessSpec(cloud='aws', region='us-west-2'),
...   vector_type='dense',
...   tags={'env': 'testing', 'author': 'jhamon'}
... )
>>> pc.describe_index(name='myindex')
{
    "name": "myindex",
    "metric": "cosine",
    "host": "myindex-dojoi3u.svc.apw5-4e34-81fa.pinecone.io",
    "spec": {
        "serverless": {
            "cloud": "aws",
            "region": "us-west-2"
        }
    },
    "status": {
        "ready": true,
        "state": "Ready"
    },
    "vector_type": "dense",
    "dimension": 10,
    "deletion_protection": "disabled",
    "tags": {
        "author": "jhamon",
        "env": "testing"
    }
}

# Update a tag value. This will be merged with existing tags.
>>> pc.configure_index(
...   name='myindex',
...   tags={'env': 'production'}
... )
>>> pc.describe_index(name='myindex')
{
    "name": "myindex",
    "metric": "cosine",
    "host": "myindex-dojoi3u.svc.apw5-4e34-81fa.pinecone.io",
    "spec": {
        "serverless": {
            "cloud": "aws",
            "region": "us-west-2"
        }
    },
    "status": {
        "ready": true,
        "state": "Ready"
    },
    "vector_type": "dense",
    "dimension": 10,
    "deletion_protection": "disabled",
    "tags": {
        "author": "jhamon",
        "env": "production"
    }
}

# Remove a tag by sending empty string value
>>> pc.configure_index(
...   name='myindex',
...   tags={'author': ''}
... )
>>> pc.describe_index(name='myindex')
{
    "name": "myindex",
    "metric": "cosine",
    "host": "myindex-dojoi3u.svc.apw5-4e34-81fa.pinecone.io",
    "spec": {
        "serverless": {
            "cloud": "aws",
            "region": "us-west-2"
        }
    },
    "status": {
        "ready": true,
        "state": "Ready"
    },
    "vector_type": "dense",
    "dimension": 10,
    "deletion_protection": "disabled",
    "tags": {
        "env": "production"
    }
}
```

## Type of Change
- [x] New feature (non-breaking change which adds functionality)
## Problem

The purpose of this PR is to introduce a class, `AsyncioIndex`, that
provides an async version of the functionality found in the `Index`
client. This includes standard index data plane operations such as
`upsert`, `query`, etc., as well as bulk import operations
(`start_import`, `list_imports`, etc.).

## Solution

This is a very complex diff with many moving parts.

- New dependency on `aiohttp`, an asyncio-compatible http client.
- New dev dependency on `pytest-asyncio` to support async testing
- Heavy refactoring in `pinecone/openapi_support` to introduce
asyncio variants of existing classes: `AsyncioApiClient`,
`AsyncioEndpoint`, and `AiohttpRestClient`. I don't love the way any of
these are currently laid out, but for simplicity's sake I decided to hew
close to the existing organization since this was already going to be a
complex change.
- Adjustments to our private python openapi templates in order to
generate asyncio versions of api client (e.g.
`AsyncioVectorOperationsApi`) objects and reference the objects named
above.
- Create a new class, `AsyncioIndex`, that uses these asyncio variant
objects. Since the majority of the logic (validation, etc.) inside each
data plane method of `Index` was previously extracted into
`IndexRequestFactory`, the amount of actual new code needed inside this
class was minimal aside from signature changes to use `async` / `await`.
- Add new integration test covering asyncio usage with both sparse and
dense indexes.
- Very mechanical refactoring to also bring bulk import functionality
into the AsyncioIndex class as a mixin. I did not add automated tests
for these due to the external dependencies required to properly
integration test this (e.g. parquet files hosted on S3). Will need to
manually verify these in testing.

Also:
- Drop python 3.8, which is now end of life
- Removed `ddtrace` dev dependency for logging test info in datadog.
This was giving me a lot of annoying errors when running tests locally.
I will troubleshoot and bring it back in later.
- Updated `jinja` and `virtualenv` versions in our poetry.lock file to
resolve dependabot alerts
- Work to implement the asyncio codepath for GRPC was previously handled
in a different diff

## Usage

In a standalone script, you might do something like this:

```python
import random
import asyncio
from pinecone import Pinecone

async def main():
    pc = Pinecone(api_key="key")
    async with pc.AsyncioIndex(name="index-name") as index:
        tasks = [
            index.query(
                vector=[random.random()] * 1024,
                namespace="ns1",
                include_values=False,
                include_metadata=True,
                top_k=2,
            )
            for _ in range(20)
        ]

        # Execute 20 queries in parallel
        results = await asyncio.gather(*tasks)
        print(results)

asyncio.run(main())
```

## Type of Change

- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [x] This change requires a documentation update
- [x] Infrastructure change (CI configs, etc)
- [ ] Non-code change (docs, etc)
- [ ] None of the above: (explain here)

## Problem

We want to migrate the content of the inference plugin into the core
SDK. This should improve UX by allowing code completion to function
properly.

## Solution

- Begin by copying & modifying code from
[plugin](https://github.com/pinecone-io/python-plugin-inference)
- Lazily instantiate the `Inference` class from the `Pinecone` parent
class when someone attempts to use inference capabilities (a sketch of
this pattern appears after this list). This saves a little bit of
startup overhead if someone is not using these functions.
- Migrated tests from the plugin and added coverage of some additional
models
- Removed dependency on the `pinecone-plugin-inference` package, since
we now have all the functionality of that plugin in the core SDK.

In addition to copy & modify:
- Implemented Enum objects, `RerankModel` and `EmbedModel`, to serve
mainly as documentation and support for code suggestions. You can still
pass in any string you like to keep this forward compatible with any
future models.
- Inference class now scans for plugins extending its own capability
during initialization. This is so that we can add more experimental
things into the `inference` namespace via future plugins if desired. I
should really define a base class that does this scan for all classes by
default but have not gotten around to it yet.
- Added some warning messages if the user attempts to interact with the
Inference class directly instead of going through the parent object.
This is following a pattern we started for the Index client based on
customer success team feedback.
- Removed old integration tests verifying the old plugin was being
correctly installed.
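
A sketch of the lazy instantiation pattern mentioned above; attribute and constructor details are illustrative, and `Inference` stands in for the class migrated from the plugin:

```python
class Pinecone:
    def __init__(self, api_key: str):
        self._api_key = api_key
        self._inference = None  # not constructed until first use

    @property
    def inference(self):
        # Build the Inference client only when someone actually reaches
        # for inference capabilities, avoiding startup overhead otherwise.
        if self._inference is None:
            self._inference = Inference(api_key=self._api_key)
        return self._inference
```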

## Usage: Embed

```python
from pinecone import Pinecone

pc = Pinecone(api_key='key')

# You can pass the model name as a string if you know the correct value
embeddings = pc.inference.embed(
    model="multilingual-e5-large",
    inputs=["The quick brown fox jumps over the lazy dog.", "lorem ipsum"],
    parameters={"input_type": "query", "truncate": "END"},
)
```

Or, if you'd like you can import an enum of available options

```python
from pinecone import Pinecone, EmbedModel

pc = Pinecone(api_key='key')

# You can get the model name from an enum
embeddings = pc.inference.embed(
    model=EmbedModel.Multilingual_E5_Large,
    inputs=["The quick brown fox jumps over the lazy dog.", "lorem ipsum"],
    parameters={"input_type": "query", "truncate": "END"},
)
```

Or, for the very clever/lazy, the enum is also attached to the
`inference` namespace

```python
from pinecone import Pinecone

pc = Pinecone(api_key='key')

embeddings = pc.inference.embed(
    model=pc.inference.EmbedModel.Multilingual_E5_Large,
    inputs=["The quick brown fox jumps over the lazy dog.", "lorem ipsum"],
    parameters={"input_type": "query", "truncate": "END"},
)
```

## Usage: Rerank

Same enum mechanic is now available for Rerank options

```python
from pinecone import Pinecone, RerankModel

pc = Pinecone(api_key='key')

model = RerankModel.Bge_Reranker_V2_M3
result = pc.inference.rerank(
    model=model,
    query="i love dogs",
    documents=["dogs are pretty cool", "everyone loves dogs", "I'm a cat person"],
    return_documents=False,
)
```


## Type of Change

- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [x] This change requires a documentation update
- [ ] Infrastructure change (CI configs, etc)
- [ ] Non-code change (docs, etc)
- [ ] None of the above: (explain here)
## Problem

Many configuration fields take string inputs even though there is a
limited range of accepted values. It's poor UX to have to go into
documentation or examples in order to learn which string values are
available. It also means support via type hints from code editors is
not available to keep people moving quickly.

## Solution

- Create Enum classes for control plane configuration fields under
`pinecone.enum`:
  - General index configs: `Metric`, `VectorType`, `DeletionProtection`
  - Serverless spec: `CloudProvider`, `AwsRegion`, `GcpRegion`, `AzureRegion`
  - Pod spec: `PodIndexEnvironment`, `PodType`

Kwargs that accept these values are loosely typed as the union of the
enum type and string. This should prevent unnecessary breaking changes
and maintain flexibility to accept new values that may not be available
or known at the time this SDK release is published. For example, if in
the future Pinecone can deploy to more Azure regions, this loose typing
would allow a person to pass that region as a plain string without
necessarily having to update their SDK to satisfy a type check.
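
One common way to implement this loose typing, sketched here for illustration (the SDK's actual definitions may differ):

```python
from enum import Enum
from typing import Union

class Metric(str, Enum):
    COSINE = "cosine"
    EUCLIDEAN = "euclidean"
    DOTPRODUCT = "dotproduct"

def create_index(name: str, metric: Union[Metric, str] = "cosine") -> None:
    # Both Metric.COSINE and the plain string "cosine" are accepted, so a
    # brand-new service-side value works without an SDK upgrade.
    metric_value = metric.value if isinstance(metric, Metric) else metric
    print(f"creating {name} with metric {metric_value}")
```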

## Usage: Serverless

```python
# Old way, which still works but requires you to know what values are available
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key='key')

pc.create_index(
    name="my-index",
    dimension=1024,
    metric="cosine",
    spec=ServerlessSpec(
        cloud="aws", 
        region="us-west-2"
    ),
    vector_type="sparse"
)
```

```python
# New way, using enum types
from pinecone import (
    Pinecone, 
    ServerlessSpec, 
    Metric, 
    VectorType, 
    CloudProvider, 
    AwsRegion
)

pc = Pinecone(api_key='key')

pc.create_index(
    name="my-index",
    dimension=1024,
    metric=Metric.COSINE,
    spec=ServerlessSpec(
        cloud=CloudProvider.AWS, 
        region=AwsRegion.US_WEST_2
    ),
    vector_type=VectorType.SPARSE
)

```

## Usage: Pods

```python
# old way, you have to know all the magic strings

from pinecone import Pinecone, PodSpec

pc = Pinecone(api_key='key')

pc.create_index(
    name="my-index",
    dimension=1024,
    spec=PodSpec(
        pod_type='s1.x4',
        environment="us-east1-gcp"
    ),
)

# Later, when scaling
pc.configure_index(
    name="my-index",
    pod_type="s1.x8"
)
```

```python
# New way, using enum types
from pinecone import (
    Pinecone, 
    PodSpec, 
    PodIndexEnvironment,
    PodType
)

pc = Pinecone(api_key='key')

pc.create_index(
    name="my-index",
    dimension=1024,
    spec=PodSpec(
        environment=PodIndexEnvironment.US_EAST1_GCP,
        pod_type=PodType.S1_X4
    )
)

# Later, when scaling
pc.configure_index(
    name="my-index",
    pod_type=PodType.S1_X8
)
```

## Type of Change

- [ ] Bug fix (non-breaking change which fixes an issue)
- [X] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [ ] This change requires a documentation update
- [ ] Infrastructure change (CI configs, etc)
- [ ] Non-code change (docs, etc)
- [ ] None of the above: (explain here)
## Problem

We need to migrate `create_index_for_model` functionality from the
records plugin into the core of the SDK to provide improved UX around
code completions and error handling
 
## Solution

- Copy & modify implementation of `create_index_for_model` and
`IndexEmbed` from records plugin
- I did some moderately heavy refactoring of this code and also the
existing `create_index` method to reduce the amount of duplication.
- Add records plugin to list of deprecated plugins that will halt
startup with an exception
- Copy relevant tests from plugin and supplement with additional tests
to ensure enum types and edge cases are handled properly even after
refactoring
- Reuse recently-defined enums to make the method signature more
self-documenting.

## Todo

- Data plane operations in the records plugin will be handled in a
separate PR

## Usage

These would all be considered valid usage. Enums are available to help
know what values are accepted, but you can type the literal strings if
you prefer. This flexibility also keeps compatibility with existing
usage of the plugin.

```python
from pinecone import (
    Pinecone, 
    EmbedModel, 
    CloudProvider, 
    AwsRegion, 
    IndexEmbed, 
    Metric
)

pc = Pinecone(api_key='key')

# All hand-crafted literals
pc.create_index_for_model(
    name='index-name',
    cloud='aws',
    region='us-east-1',
    embed={
        "model": "multilingual-e5-large", 
        "field_map": {"text": "my-sample-text"},
        "metric": "cosine"
    },
)

# All enum values
pc.create_index_for_model(
    name='index-name',
    cloud=CloudProvider.AWS,
    region=AwsRegion.US_EAST_1,
    embed=IndexEmbed(
        model=EmbedModel.Multilingual_E5_Large, 
        field_map={"text": "my-sample-text"},
        metric=Metric.COSINE
    ),
)

# Mixed literals and enums
pc.create_index_for_model(
    name='index-name',
    cloud='aws',
    region=AwsRegion.US_EAST_1,
    embed={
        "model": EmbedModel.Multilingual_E5_Large, 
        "field_map": {"text": "my-sample-text"},
        "metric": "cosine"
    },
)
```

## Type of Change

- [x] New feature (non-breaking change which adds functionality)
## Problem

Migrating `search_records` (aliased to `search`) and `upsert_records`
from the `pinecone-plugin-records` plugin.

## Solution

Working off the content of the records plugin, I have done the
following:

- Adjusted the codegen script to fix the way openapi generator handles
underscore fields such as `_id` and `_score`
- Adjusted the rest library code in `rest_urllib3.py` and
`rest_aiohttp.py` to handle record uploading with content-type
`application/x-ndjson`
- Copied and modified the integration tests from the plugin
- Extracted a lot of the guts of the `upsert_records` and
`search_records` methods into the request factory where they could more
easily be unit tested. The logic around parsing user inputs into the
openapi request objects is surprisingly complicated, so I added quite a
lot of new unit tests checking some of those edge cases.
- Compared to the plugin implementation, the major changes are:
  - Made `search` an alias of `search_records`
  - Moved away from usages of `.pop()`, which mutates the input objects;
this could be confusing for users if they are using those objects for
anything else (see the sketch after this list)
  - Added better typing of dict fields
  - Incorporated optional use of enum values for `RerankModel`
- Added asyncio variants of these methods, although most of the guts are
shared in the request factory.
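
For illustration, here is roughly what building the ndjson request body without mutating the caller's records looks like (the helper name is hypothetical):

```python
import json

def build_ndjson_body(records):
    lines = []
    for record in records:
        # Work on a shallow copy rather than calling .pop() on the caller's
        # dict, so their objects are left untouched.
        body = dict(record)
        lines.append(json.dumps(body))
    # Sent with header Content-Type: application/x-ndjson
    return "\n".join(lines)
```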

I already handled disallowing the records plugin in yesterday's PR #438 

## Usage

```python
from pinecone import Pinecone, CloudProvider, AwsRegion, EmbedModel, RerankModel

pc = Pinecone(api_key="key")

# Create an index for your embedding model
index_model = pc.create_index_for_model(
    name="my-model-index",
    cloud=CloudProvider.AWS,
    region=AwsRegion.US_EAST_1,
    embed={
        "model": EmbedModel.Multilingual_E5_Large,
        "field_map": {"text": "my_text_field"}
    }
)

# Create an index client
index = pc.Index(host=index_model.host)

# Upsert records
namespace = "target-namespace"
index.upsert_records(
    namespace=namespace,
    records=[
        {
            "_id": "test1",
            "my_text_field": "Apple is a popular fruit known for its sweetness and crisp texture.",
        },
        {
            "_id": "test2",
            "my_text_field": "The tech company Apple is known for its innovative products like the iPhone.",
        },
        {
            "_id": "test3",
            "my_text_field": "Many people enjoy eating apples as a healthy snack.",
        },
        {
            "_id": "test4",
            "my_text_field": "Apple Inc. has revolutionized the tech industry with its sleek designs and user-friendly interfaces.",
        },
        {
            "_id": "test5",
            "my_text_field": "An apple a day keeps the doctor away, as the saying goes.",
        },
        {
            "_id": "test6",
            "my_text_field": "Apple Computer Company was founded on April 1, 1976, by Steve Jobs, Steve Wozniak, and Ronald Wayne as a partnership.",
        },
    ],
)

# Search for similar records
response = index.search(
    namespace=namespace,
    query={
        "inputs":{
            "text": "Apple corporation",
        },
        "top_k":3,
    },
    rerank={
        "model": RerankModel.Bge_Reranker_V2_M3,
        "rank_fields": ["my_text_field"],
        "top_n": 3,
    },
)
```

These methods also have asyncio variants available

```python
import asyncio
from pinecone import Pinecone, RerankModel

async def main():
    # Create an index client

    pc = Pinecone(api_key='key')
    index = pc.AsyncioIndex(host='host')

    # Upsert records
    namespace = "target-namespace"
    records = [
        {
            "_id": "test1",
            "my_text_field": "Apple is a popular fruit known for its sweetness and crisp texture.",
        },
        {
            "_id": "test2",
            "my_text_field": "The tech company Apple is known for its innovative products like the iPhone.",
        },
        {
            "_id": "test3",
            "my_text_field": "Many people enjoy eating apples as a healthy snack.",
        },
        {
            "_id": "test4",
            "my_text_field": "Apple Inc. has revolutionized the tech industry with its sleek designs and user-friendly interfaces.",
        },
        {
            "_id": "test5",
            "my_text_field": "An apple a day keeps the doctor away, as the saying goes.",
        },
        {
            "_id": "test6",
            "my_text_field": "Apple Computer Company was founded on April 1, 1976, by Steve Jobs, Steve Wozniak, and Ronald Wayne as a partnership.",
        },
    ]
    await index.upsert_records(
        namespace=namespace,
        records=records,
    )

    # Search for similar records
    response = await index.search(
        namespace=namespace,
        query={
            "inputs":{
                "text": "Apple corporation",
            },
            "top_k":3,
        },
        rerank={
            "model": RerankModel.Bge_Reranker_V2_M3,
            "rank_fields": ["my_text_field"],
            "top_n": 3,
        },
    )
    
asyncio.run(main())
```

## Type of Change

- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [ ] This change requires a documentation update
- [ ] Infrastructure change (CI configs, etc)
- [ ] Non-code change (docs, etc)
- [ ] None of the above: (explain here)
## Problem

Previous work implemented asyncio for the db data plane, and now we want
to roll out a similar approach for the db control plane and inference as
well.

## Solution

- Extract request construction logic out of `Pinecone` and move it to a
request factory
- Implement `PineconeAsyncio` using the request factory to keep most of
the method-specific logic the same.
- Add new integration tests using the asyncio code path. These are
mostly modified from the existing serverless integration tests.
- Update tests for the asyncio index client to reflect new setup steps
- Some refactorings around async context management to address log
warnings being shown from aiohttp

## Usage

The async version of the client has some async setup/teardown related to
the underlying aiohttp library being used. You can either use the `async
with` syntax to have the async context automatically managed for you,
or, if you prefer, take responsibility for closing the async context
yourself by calling `close()`.

#### Context management option 1: Using `async with`

```python
import asyncio
from pinecone import (
    PineconeAsyncio,
    ServerlessSpec,
    CloudProvider,
    AwsRegion,
)

async def main():
    async with PineconeAsyncio(api_key="key") as pc:
        await pc.create_index(
            name="my-index",
            metric="cosine",
            spec=ServerlessSpec(
                cloud=CloudProvider.AWS, 
                region=AwsRegion.US_EAST_1
            ),
        )
        
asyncio.run(main())
```

#### Context management option 2: Manually `close()`

```python
import asyncio
from pinecone import (
    PineconeAsyncio,
    ServerlessSpec,
    CloudProvider,
    AwsRegion,
)

async def main():
    pc = PineconeAsyncio(api_key="key")
    await pc.create_index(
        name="my-index",
        metric="cosine",
        spec=ServerlessSpec(
            cloud=CloudProvider.AWS, 
            region=AwsRegion.US_EAST_1
        ),
    )
    await pc.close() # <-- Don't forget to close the client when you are done making network calls
        
asyncio.run(main())
```

#### Sparse index example

```python
import asyncio
import random
from pinecone import (
    PineconeAsyncio,
    ServerlessSpec,
    CloudProvider,
    AwsRegion,
    Metric,
    VectorType,
    Vector,
    SparseValues,
)

async def main():
    async with PineconeAsyncio() as pc:
        # Create a sparse index
        index_name = "my-index2"
        
        if not await pc.has_index(index_name):
            await pc.create_index(
                name=index_name,
                metric=Metric.DOTPRODUCT,
                spec=ServerlessSpec(
                    cloud=CloudProvider.AWS, 
                    region=AwsRegion.US_EAST_1
                ),
                vector_type=VectorType.SPARSE,
                tags={
                    "env": "testing",
                }
            )
        
        # Get the index host
        description = await pc.describe_index(name=index_name)
        
        # Make an index client
        async with pc.Index(host=description.host) as idx:

            # Upsert some sparse vectors
            await idx.upsert(
                vectors=[
                    Vector(
                        id=str(i), 
                        sparse_values=SparseValues(
                            indices=[j for j in range(100)], 
                            values=[random.random() for _ in range(100)]
                        )
                    ) for i in range(50)
                ]
            )
            
            # Query the index
            query_results = await idx.query(
                top_k=5,
                sparse_vector=SparseValues(
                    indices=[5, 10, 20], 
                    values=[0.5, 0.5, 0.5]
                ),
            )
            print(query_results)

asyncio.run(main())
```

## Type of Change

- [x] New feature (non-breaking change which adds functionality)
## Problem

The more type hints we have in our package, the better people will be
able to understand how to use it.

## Solution

Much of the untyped code in the package is derived from old generated
code that we have extracted. So there's a fair amount of refactoring in
this diff to try to break out some smaller classes and functions with
seams that you can start to analyze and type; when everything is just
big mutable state blobs, it is quite hard to reason about.

Along the way I uncovered that bulk import features were in a broken
state because some of those operation ids were modified since the last
release and these functions are not well covered by automated tests.
This sort of thing really highlights why we need better type coverage in
the package.

## Type of Change

- [x] Bug fix (non-breaking change which fixes an issue)
- [x] None of the above: Refactoring to improve type safety
## Problem

Want to support python 3.13, drop support for python 3.8

## Solution

Dependency changes:
- Remove httpx, explored in testing but not needed for the release
- Move pytest-asyncio from dependencies into dev dependencies
- Adjust grpcio and pandas (dev dependency) for 3.13 compatibility

## Type of Change

- [x] Bug fix (non-breaking change which fixes an issue)
## Problem

Rather than adding a new dependency for everyone, we want asyncio to be
an extras install similar to how grpc is handled.

## Solution

Adjust pyproject.toml to migrate aiohttp into an asyncio extras that
will be installed like `pinecone[asyncio]`. Adjust test configuration to
install the required dependencies.

## Type of Change

- [x] Infrastructure change (CI configs, etc)
## Problem

We want to deprecate a few kwargs and give more helpful error messages.

## Solution

- Deprecate the `config` and `openapi_config` kwargs; as far as I know
these were only used for tests, and they have not appeared in
documentation. It adds quite a bit of complexity trying to merge
configuration from all these sources, so going forward we would prefer
to expect configuration from a single source, which is named kwargs.
- Add `NotImplementedError` messages that clearly explain some features
that are not implemented yet for PineconeAsyncio. This is preferable to
allowing users to pass kwargs that have no effect and then getting
frustrated when they seem not to work.
- Give a more informative error when it looks like the user has passed
an invalid value as the host

## Type of Change

- [x] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [x] This change requires a documentation update
## Problem

We want to minimize the number of required dependencies we have, and
tqdm is non-essential. Moreover, common notebook environments like
Google Colab will already have tqdm loaded even if we do not declare
this an explicit dependency.

## Solution

Instead of having a specific dependency on tqdm, we want to detect and
use it if it is available in the environment. Otherwise, we just no-op
with a stub implementation of our own.
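
A minimal sketch of the detect-or-stub pattern (the SDK's stub may differ in detail):

```python
try:
    from tqdm.auto import tqdm
except ImportError:

    class tqdm:  # noop stand-in with the same call shape we rely on
        def __init__(self, iterable=None, total=None, desc=None, **kwargs):
            self._iterable = iterable

        def __iter__(self):
            return iter(self._iterable if self._iterable is not None else [])

        def update(self, n=1):
            pass

        def __enter__(self):
            return self

        def __exit__(self, *exc):
            return False
```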

## Type of Change

- [x] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [x] This change requires a documentation update
## Problem

If you only care about doing async index operations, it's cumbersome to
have to manage nested async contexts by going through the
`PineconeAsyncio` class.

## Solution

This is a very simple change that enables people to create an async
index client.

## Usage

You can now do index operations via asyncio like this:

```python
import asyncio
from pinecone import Pinecone

async def main():
    async with Pinecone().IndexAsyncio(host='myhost') as idx:
        await idx.upsert(...) # do async things

asyncio.run(main())
```

## Type of Change

- [x] New feature (non-breaking change which adds functionality)
## Problem

Need to wire up several properties that pass configurations in to the
underlying aiohttp library.

## Solution

Wire up several properties for `PineconeAsyncio` that are already
supported in `Pinecone`:
- `additional_headers` to pass extra headers on each request. Useful for
internal testing.
- `ssl_verify` to turn off ssl verification. Sometimes useful for
testing.
- `proxy_url` to send traffic through a proxy
- `ssl_ca_certs` to specify a custom cert bundle. Expecting a path to a
PEM file.

Currently `proxy_headers` is not accepted by `PineconeAsyncio` because I
haven't figured out how to use these with aiohttp.
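
Putting the new kwargs together (the values here are placeholders):

```python
import asyncio
from pinecone import PineconeAsyncio

async def main():
    async with PineconeAsyncio(
        api_key="key",
        additional_headers={"X-My-Header": "testing"},
        proxy_url="http://localhost:8080",
        ssl_ca_certs="/path/to/custom-bundle.pem",  # path to a PEM file
        ssl_verify=True,  # set False to disable verification while testing
    ) as pc:
        print(await pc.list_indexes())

asyncio.run(main())
```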

## Type of Change

- [x] New feature (non-breaking change which adds functionality)
## Problem

When migrating `embed` and `rerank` over from the plugin, I forgot to
include these custom return objects.
 
## Solution

Add custom return types, adjust tests to ensure result is iterable.

## Type of Change

- [x] Bug fix (non-breaking change which fixes an issue)
## Problem

Need to overhaul README, other markdown docs, and docstrings to reflect
recent changes and additions to the SDK.

## Solution

<img width="600" alt="Kermit typing meme"
src="https://media1.giphy.com/media/v1.Y2lkPTc5MGI3NjExeHV4cTM1cjhsYWtmbG03aHgwZ3hleXM2OTUzbzI1YnN2dmloN3BidSZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw/LmBsnpDCuturMhtLfw/giphy.gif"
/>
jhamon marked this pull request as ready for review February 7, 2025 14:45
jhamon merged commit 74fd5bc into main on February 7, 2025 (58 checks passed)
jhamon deleted the release-candidate/2025-01 branch February 7, 2025 14:46