jhamon (Collaborator) commented Feb 7, 2025

Problem

In preparation for the v6 release, we need to merge recent changes from the release-candidate/2025-01 branch.

Solution

Merge conflicts seem limited to just a few things in the pyproject.toml and poetry.lock files, as well as a few method signatures modified to indicate None as the return type. So I think this should go pretty smoothly.

Dave Rigby found that upgrading this protobuf dependency delivered
a significant performance improvement, but I needed to temporarily
revert it to push out a 5.x release without a breaking change to
dependencies.
## Problem

Needed to consume the 2024-10 API spec.

## Solution

The spec updates involved many changes to operation ids. This should not
touch the user experience, but does mean this diff is somewhat larger
than usual due to the large number of tests for existing functionality
that needed to be adjusted for the new names:
- upsert > upsert_vector
- fetch > fetch_vectors
- query > query_vectors
- delete > delete_vectors
- list > list_vectors
- start_import > start_bulk_import
- describe_import > describe_bulk_import
- etc

### Other changes

#### Cleanup prerelease code no longer needed
- Deleted `pinecone/core_ea/` which was the home of some code generated
off of the 2024-10 spec when bulk import functionality was in a
pre-release state. We no longer need this. That's why this PR has 21k
lines removed.

#### Changes to generation process
- Placed copies of classes such as `ApiClient`, `Endpoint`,
`ModelNormal`, etc., which were previously generated, into a new folder,
`pinecone/openapi_support`. Even though these used to be generated, they
contain no dynamic content. So taking them out of the generation process
should make it significantly easier to work on core improvements to the
UX and performance of the generated code, since the source of truth will
no longer be a mustache file (FML)
- Updated the codegen script (`./codegen/build-oas.sh 2024-10 false`) to
delete rather than de-dupe copies of shared classes like `ApiClient`, and
to edit the model and API files that continue to be generated so they
import the classes they need from the new `openapi_support` folder.

#### mypy fixes
- I needed to make adjustments in some `openapi_support` classes to
satisfy the mypy type checker (`pinecone/core` has been ignored in the
past, and our goal should be to eventually have 100% of code
typechecked).
- Wrote some initial unit tests to characterize some of the behavior in
`ApiClient`. The tests were to help give me more confidence I wasn't
breaking something along the way while making adjustments for mypy.
- Added a dev dependency on a package with types for datetime

## Type of Change

- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [x] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [ ] This change requires a documentation update
- [ ] Infrastructure change (CI configs, etc)
- [ ] Non-code change (docs, etc)
- [ ] None of the above: (explain here)
## Problem

Sometimes users incorrectly instantiate the Index client like this:

```python
import pinecone

# Initialize the Index with the host
index = pinecone.Index(index_name, host=index_host)
```

Then they will later get an authentication exception when using it,
since the `Index` class does not have the configuration values it needs
when attempting to perform vector operations.

```python
ForbiddenException: (403) Reason: Forbidden HTTP response headers: HTTPHeaderDict({'Date': 'Wed, 13 Nov 2024 02:06:45 GMT', 'Content-Type': 'text/plain', 'Content-Length': '9', 'Connection': 'keep-alive', 'x-pinecone-auth-rejected-reason': 'Wrong API key', 'www-authenticate': 'Wrong API key', 'server': 'envoy'}) HTTP response body: Forbidden
```

## Solution

- Rename the `Index` implementation to `_Index` so that people will not
accidentally interact with it directly.
- Add a new stub implementation of `Index` that raises an informative
error, sketched below.
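
A condensed sketch of the stub, based on the traceback output shown in the Usage section below (the real error message also echoes the user's args/kwargs in the INCORRECT/CORRECT USAGE examples):

```python
class IndexClientInstantiationError(Exception):
    def __init__(self, index_args, index_kwargs):
        # The real message also renders INCORRECT/CORRECT USAGE examples
        # built from the args/kwargs the user attempted to pass.
        super().__init__(
            "You are attempting to access the Index client directly from the "
            "pinecone module. The Index client must be instantiated through "
            "the parent Pinecone client instance so that it can inherit "
            "shared configurations such as API key."
        )


class Index:
    def __init__(self, *args, **kwargs):
        raise IndexClientInstantiationError(args, kwargs)
```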

Bonus:
- Move the Index client docstrings off the implementation class and into
a related abstract base class. This wasn't strictly necessary, but I was
feeling a bit overwhelmed by the size of the `index.py` file. I copied
this approach from the grpc module. pdoc seems to still find and render
the docs okay when doing this.

## Usage

The error message incorporates the args/kwargs the user was attempting
to pass.

One positional arg
```python
>>> import pinecone
>>> i = pinecone.Index('my-index')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/jhamon/workspace/pinecone-python-client/pinecone/data/index.py", line 113, in __init__
    raise IndexClientInstantiationError(args, kwargs)
pinecone.data.index.IndexClientInstantiationError: You are attempting to access the Index client directly from the pinecone module. The Index client must be instantiated through the parent Pinecone client instance so that it can inherit shared configurations such as API key.

    INCORRECT USAGE:
        ```
        import pinecone

        pc = pinecone.Pinecone(api_key='your-api-key')
        index = pinecone.Index('my-index')
        ```

    CORRECT USAGE:
        ```
        from pinecone import Pinecone

        pc = Pinecone(api_key='your-api-key')
        index = pc.Index('my-index')
        ```
```

Multiple positional args

```python
>>> i = pinecone.Index('my-index', 'https://my-index.blahblah.com')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/jhamon/workspace/pinecone-python-client/pinecone/data/index.py", line 113, in __init__
    raise IndexClientInstantiationError(args, kwargs)
pinecone.data.index.IndexClientInstantiationError: You are attempting to access the Index client directly from the pinecone module. The Index client must be instantiated through the parent Pinecone client instance so that it can inherit shared configurations such as API key.

    INCORRECT USAGE:
        ```
        import pinecone

        pc = pinecone.Pinecone(api_key='your-api-key')
        index = pinecone.Index('my-index', 'https://my-index.blahblah.com')
        ```

    CORRECT USAGE:
        ```
        from pinecone import Pinecone

        pc = Pinecone(api_key='your-api-key')
        index = pc.Index('my-index', 'https://my-index.blahblah.com')
        ```

```


One keyword arg:

```python
>>> i = pinecone.Index(host='https://my-index.blahblah.com')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/jhamon/workspace/pinecone-python-client/pinecone/data/index.py", line 113, in __init__
    raise IndexClientInstantiationError(args, kwargs)
pinecone.data.index.IndexClientInstantiationError: You are attempting to access the Index client directly from the pinecone module. The Index client must be instantiated through the parent Pinecone client instance so that it can inherit shared configurations such as API key.

    INCORRECT USAGE:
        ```
        import pinecone

        pc = pinecone.Pinecone(api_key='your-api-key')
        index = pinecone.Index(host='https://my-index.blahblah.com')
        ```

    CORRECT USAGE:
        ```
        from pinecone import Pinecone

        pc = Pinecone(api_key='your-api-key')
        index = pc.Index(host='https://my-index.blahblah.com')
        ```
```

Multiple kwargs

```python
>>> i = pinecone.Index(name='my-index', host='https://my-index.blahblah.com', pool_threads=20)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/jhamon/workspace/pinecone-python-client/pinecone/data/index.py", line 113, in __init__
    raise IndexClientInstantiationError(args, kwargs)
pinecone.data.index.IndexClientInstantiationError: You are attempting to access the Index client directly from the pinecone module. The Index client must be instantiated through the parent Pinecone client instance so that it can inherit shared configurations such as API key.

    INCORRECT USAGE:
        ```
        import pinecone

        pc = pinecone.Pinecone(api_key='your-api-key')
        index = pinecone.Index(name='my-index', host='https://my-index.blahblah.com', pool_threads=20)
        ```

    CORRECT USAGE:
        ```
        from pinecone import Pinecone

        pc = Pinecone(api_key='your-api-key')
        index = pc.Index(name='my-index', host='https://my-index.blahblah.com', pool_threads=20)
        ```
```

Mixed args/kwargs

```python
>>> i = pinecone.Index('my-index', host='https://my-index.blahblah.com', pool_threads=20)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/jhamon/workspace/pinecone-python-client/pinecone/data/index.py", line 113, in __init__
    raise IndexClientInstantiationError(args, kwargs)
pinecone.data.index.IndexClientInstantiationError: You are attempting to access the Index client directly from the pinecone module. The Index client must be instantiated through the parent Pinecone client instance so that it can inherit shared configurations such as API key.

    INCORRECT USAGE:
        ```
        import pinecone

        pc = pinecone.Pinecone(api_key='your-api-key')
        index = pinecone.Index('my-index', host='https://my-index.blahblah.com', pool_threads=20)
        ```

    CORRECT USAGE:
        ```
        from pinecone import Pinecone

        pc = Pinecone(api_key='your-api-key')
        index = pc.Index('my-index', host='https://my-index.blahblah.com', pool_threads=20)
```

## Type of Change

- [x] New feature (non-breaking change which adds functionality)
- [x] Breaking change 

This change is a UX feature that could be considered breaking if someone
was importing `Index` directly and going out of their way to set it up
correctly. This was never documented usage, but somebody skilled at
reading code could have figured out how to do this.
## Problem

With my asyncio client in progress (not in this diff), I found myself
needing to reuse a lot of the openapi request-building boilerplate.

## Solution

Rather than copy/paste and create a lot of duplication, I want to pull
that logic out into a separate class for building request objects. This
change should not break existing behavior in the Index client.
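
The shape of the extraction looks roughly like this; the method names and signatures here are illustrative, not the exact factory API:

```python
class IndexRequestFactory:
    """Centralizes translation of user input into openapi request objects."""

    @staticmethod
    def query_request(top_k, vector=None, namespace=None, filter=None, **kwargs):
        # Validation and request construction live in one place so that any
        # client implementation (sync today, asyncio later) can reuse them.
        if not isinstance(top_k, int) or top_k < 1:
            raise ValueError("top_k must be a positive integer")
        args = {"top_k": top_k, **kwargs}
        if vector is not None:
            args["vector"] = vector
        if namespace is not None:
            args["namespace"] = namespace
        if filter is not None:
            args["filter"] = filter
        return args  # in the SDK this would be an openapi QueryRequest object
```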

## Todo

This was pretty much just a mechanical extraction. I'd like to clean up
and standardize some stuff regarding `_check_type`, an openapi param
that we're going to silly lengths to flip the default behavior on, in a
follow-up diff.

## Type of Change

- [x] None of the above: Refactor

## Test Plan

Existing tests should still pass.
## Problem

In order to merge results across multiple queries, the SDK must know
which similarity metric an index is using. For dotproduct and cosine
indexes a larger score is better, while for euclidean a smaller score is
better. Unfortunately, the data plane API does not currently expose the
metric type, and a separate call to the control plane to find out seems
undesirable from a resiliency and performance perspective.

As a workaround, in the initial implementation of `query_namespaces` the
SDK would infer the similarity metric needed to merge results by
checking whether the scores of query results were ascending or
descending. This worked well, but imposed an implicit requirement that
at least 2 results be returned.

We initially believed this would not be a problem, but have since
learned that applications using filtering can sometimes filter out all
or most results. So an approach where the user explicitly tells the SDK
which similarity metric is in use is preferred, since it handles the
edge cases with 1 or 0 results.

## Solution

- Add a required kwarg to `query_namespaces` to specify the index
similarity metric.
- Modify `QueryResultsAggregator` to use this similarity metric, and
strip out the code that was involved in inferring whether results were
ascending or descending.
- Adjust integration tests to pass the new metric kwarg. Except for
adding the new kwarg, the query_namespaces integration tests did not
need to change, which indicates the underlying behavior is still working
as before.
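
For illustration, a minimal sketch of metric-aware merging, assuming each namespace's result list arrives sorted best-first and each match exposes a `score` attribute (names here are illustrative, not the aggregator's internals):

```python
import heapq
import itertools

def merge_results(result_sets, metric, top_k):
    # dotproduct/cosine: larger scores are better, so inputs are sorted
    # descending and we merge in reverse; for euclidean, smaller is better.
    larger_is_better = metric in ("dotproduct", "cosine")
    merged = heapq.merge(
        *result_sets,
        key=lambda match: match.score,
        reverse=larger_is_better,
    )
    return list(itertools.islice(merged, top_k))
```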

## Type of Change

- [x] Bug fix (non-breaking change which fixes an issue)
## Problem

By renaming the `Index` class to `_Index` in
#418, I
accidentally broke plugin installation.

## Solution

Rename the class back to `Index` and instead alias it to `_Index`
through import statements. We still need to alias it within the
`pinecone/data/__init__.py` module definition to avoid collision with
the Index class designed to give users a more informative error when
they incorrectly import the `Index` class from the `pinecone` module.

I went ahead and also cleaned up some stuff where we were unnecessarily
exposing some internal openapi details in the public interface. Users
should never realistically need to construct openapi request objects.
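
The aliasing amounts to something like this (module paths assumed from the description above):

```python
# pinecone/data/__init__.py
from .index import Index as _Index  # real implementation, aliased so it
                                    # doesn't collide with the stub

# The `Index` name exported from the top-level `pinecone` module remains
# the stub that raises IndexClientInstantiationError on direct use.
```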

## Type of Change

- [x] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [ ] This change requires a documentation update
- [ ] Infrastructure change (CI configs, etc)
- [ ] Non-code change (docs, etc)
- [ ] None of the above: (explain here)
## Problem

Need to incorporate change for upcoming sparse index support.

## Solution

Incorporating these changes presents a few different types of
challenges:

### Control plane changes
- `create_index` adjusted to accept a new optional field, `vector_type`.

### Data plane changes
- Codegen for proto changes
- Needed to add codegen scripts to support generating new grpc message
formats from an updated proto file. This used to be done with scripts
directly invoking the `protoc` CLI from the now-deprecated private repo
`pinecone-protos`, but since we've moved away from that pattern, we
needed to implement something new here.
- I decided to use the [Buf CLI](https://buf.build/product/cli), which
gives a more modern way to manage the required dependencies for stub
generation via a plugin system.

- Dependency changes
- Had to bump our protobuf and googleapis-common-protos deps to match
the Buf CLI's plugin output. But that's a good thing, as we were on
fairly old versions, and updating protobuf in particular is supposed to
unlock some nontrivial performance benefit.

- Vector object switcheroo for UX 
- The `Vector` object generated by OpenAPI for the REST pathway was
unfortunately insisting that an empty array `values=[]` be sent when
upserting for sparse indexes. This seems annoying and stupid if you are
focused on the sparse use case, so I ended up moving away from having
users use that generated object in favor of my own custom Vector
dataclass. The blast radius of this change was fairly large, requiring
updates to the various vector factory classes and hundreds of tests.
- For those keeping score, there are now 5 similar things that can be
called `Vector` inside the SDK:
    - a class generated with OpenAPI
    - a grpc Message object
    - a user-supplied dictionary with specific keys defined by a `TypedDict`
    - a user-supplied tuple with either 2 or 3 elements representing id, values, and metadata
    - **New: a `@dataclass` I created in this PR**
- Out of these "vector" things listed above, the SDK used to provide the
OpenAPI vector class to those who write code along the lines of `from
pinecone import Vector`. This was already bad due to the
interoperability problems between the GRPC and REST code paths this
brought into the vector factory classes (e.g. if you were attempting to
pass an OpenAPI Vector object into the GRPC-based upsert implementation
extra logic was needed to convert that into the grpc Message object).
But with the introduction of sparse, there was no workaround that would
give the user the ability to instantiate the OpenAPI vector without also
passing `values=[]` which just seemed like a bridge too far from a UX
standpoint. So the decision was made to use this name `Vector` to refer
to the custom dataclass instead and have that be the main vehicle for
user input.
- This object switch also caused a bunch of mypy type checks and tests
to start failing in subtle ways that required extra time and effort to
track down; even though the new and old objects were functionally
equivalent to the user, they are not the same to the type checker, so
many type annotations and tests had to be adjusted. But it seems worth
it for the better UX, and it also gets us onto a standard footing for
how input should be supplied to both REST and GRPC index clients.
- I started writing new integration tests for sparse indexes, but
they're disabled in this PR because they currently can't run on prod.


### Other changes

- I disabled tests on the CI dev release workflow. Seems like I end up
doing this every time I am working on a complex change and want to do
some hands-on testing with Google Colab even when 100% of tests aren't
passing. I think I will leave the tests commented out until I can
figure out how to wire up a config option that controls them.
- I created some named type definitions under `pinecone/data/types` for
dictionary objects the user can pass in various places for vector,
sparse values, metadata, filter, etc. (see the sketch after this list).
This is to increase the consistency and readability of types in the
code; when they were defined inline, small inconsistencies sometimes
crept in. This wasn't really planned in advance, but I had enough
challenges wrangling mypy that it seemed simpler to centralize some of
these definitions in one place than to try to iron out minor
inconsistencies any other way.
- A lot of fixture stuff was moved out of
tests/integration/data/conftest and into individual test files. This
unified "upsert everything, then run all the tests" approach first began
when the product surface was much smaller and freshness was a much
bigger problem so it was more advantageous to do all the post-upsert
waiting at one time before running any tests. In practice, this has been
a big obstacle to development because it means you can't easily run a
single test file without the heavy duty global setup and it creates a
big opportunity for test pollution if any tests modify state in the
index. Breaking this up and putting setup closer to the places where it
is tested solves this problem and makes it easier to work with
individual test files.
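
An illustrative sketch of what centralizing these definitions looks like; the actual names in `pinecone/data/types` may differ:

```python
from typing import Any, Dict, List, Tuple, TypedDict, Union

class SparseValuesTypedDict(TypedDict):
    indices: List[int]
    values: List[float]

VectorMetadataTypedDict = Dict[str, Any]

# A user-supplied vector tuple is (id, values) or (id, values, metadata)
VectorTuple = Tuple[str, List[float]]
VectorTupleWithMetadata = Tuple[str, List[float], VectorMetadataTypedDict]

FilterTypedDict = Dict[str, Union[str, int, float, bool, list, dict]]
```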

## Usage

```python

import random
from pinecone import Pinecone, ServerlessSpec, Vector, SparseValues

pc = Pinecone(api_key='key')

# Create sparse index
index_name = f"sparse-testing-{random.randint(0,10000)}"
pc.create_index(
    name=index_name,
    metric='dotproduct',
    spec=ServerlessSpec(cloud='aws', region='us-east-1'),
    vector_type='sparse'
)

# Get the index client
sparse_index = pc.Index(name=index_name)

# Upsert sparse (random data just for illustration purposes)
def unique_random_integers(n, range_start, range_end):
    if n > (range_end - range_start + 1):
        raise ValueError("Range too small for the requested number of unique integers")
    return random.sample(range(range_start, range_end + 1), n)

# Generate some random sparse vectors
sparse_index.upsert(
    vectors=[
        Vector(
            id=str(i),
            sparse_values=SparseValues(
                indices=unique_random_integers(10, 0, 10000),
                values=[random.random() for j in range(10)]
            )
        ) for i in range(10000)
    ],
    batch_size=500,
)

# Query sparse
sparse_index.query(
    top_k=10,
    sparse_vector={
        "indices": [1,2,3,4,5],
        "values": [random.random()]*5
    }
)
```


## Type of Change

- [x] New feature (non-breaking change which adds functionality)

## Problem

Want to expose kwargs for setting and modifying index tags

## Solution

The usual. The generated bits backing this feature already got created
when I generated off the `2025-01` spec.

I added a small amount of logic to handle merging new and existing tags
when using configure.
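
The merge semantics described here and demonstrated below boil down to something like this (a sketch, not the exact implementation):

```python
def merge_tags(existing: dict, updates: dict) -> dict:
    # New values win on key collisions; sending an empty string for a key
    # removes that tag entirely.
    merged = {**existing, **updates}
    return {k: v for k, v in merged.items() if v != ""}
```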

## Usage

```python
>>> from pinecone import Pinecone, ServerlessSpec
>>> pc = Pinecone(api_key='key')

# Create index with optional tags dictionary
>>> pc.create_index(
...   name='myindex',
...   dimension=10,
...   metric='cosine',
...   spec=ServerlessSpec(cloud='aws', region='us-west-2'),
...   vector_type='dense',
...   tags={'env': 'testing', 'author': 'jhamon'}
... )
>>> pc.describe_index(name='myindex')
{
    "name": "myindex",
    "metric": "cosine",
    "host": "myindex-dojoi3u.svc.apw5-4e34-81fa.pinecone.io",
    "spec": {
        "serverless": {
            "cloud": "aws",
            "region": "us-west-2"
        }
    },
    "status": {
        "ready": true,
        "state": "Ready"
    },
    "vector_type": "dense",
    "dimension": 10,
    "deletion_protection": "disabled",
    "tags": {
        "author": "jhamon",
        "env": "testing"
    }
}

# Update a tag value. This will be merged with existing tags.
>>> pc.configure_index(
...   name='myindex',
...   tags={'env': 'production'}
... )
>>> pc.describe_index(name='myindex')
{
    "name": "myindex",
    "metric": "cosine",
    "host": "myindex-dojoi3u.svc.apw5-4e34-81fa.pinecone.io",
    "spec": {
        "serverless": {
            "cloud": "aws",
            "region": "us-west-2"
        }
    },
    "status": {
        "ready": true,
        "state": "Ready"
    },
    "vector_type": "dense",
    "dimension": 10,
    "deletion_protection": "disabled",
    "tags": {
        "author": "jhamon",
        "env": "production"
    }
}

# Remove a tag by sending empty string value
>>> pc.configure_index(
...   name='myindex',
...   tags={'author': ''}
... )
>>> pc.describe_index(name='myindex')
{
    "name": "myindex",
    "metric": "cosine",
    "host": "myindex-dojoi3u.svc.apw5-4e34-81fa.pinecone.io",
    "spec": {
        "serverless": {
            "cloud": "aws",
            "region": "us-west-2"
        }
    },
    "status": {
        "ready": true,
        "state": "Ready"
    },
    "vector_type": "dense",
    "dimension": 10,
    "deletion_protection": "disabled",
    "tags": {
        "env": "production"
    }
}
```

## Type of Change
- [x] New feature (non-breaking change which adds functionality)
## Problem

The purpose of this PR is to introduce a class, `AsyncioIndex`, that
provides an async version of the functionality found in the `Index`
client. This includes standard index data plane operations such as
`upsert`, `query`, etc., as well as bulk import operations
(`start_import`, `list_imports`, etc.).

## Solution

This is a very complex diff with many moving parts.

- New dependency on `aiohttp`, an asyncio-compatible http client.
- New dev dependency on `pytest-asyncio` to support async testing
- Heavy refactoring in `pinecone/openapi_support` to introduce
asyncio variants of existing classes: `AsyncioApiClient`,
`AsyncioEndpoint`, and `AiohttpRestClient`. I don't love the way any of
these are currently laid out, but for simplicity's sake I decided to hew
close to the existing organization since this was already going to be a
complex change.
- Adjustments to our private python openapi templates in order to
generate asyncio versions of api client (e.g.
`AsyncioVectorOperationsApi`) objects and reference the objects named
above.
- Create a new class, `AsyncioIndex`, that uses these asyncio variant
objects. Since the majority of the logic (validation, etc.) inside each
data plane method of `Index` was previously extracted into
`IndexRequestFactory`, the amount of actual new code needed inside this
class was minimal aside from signature changes to use `async` / `await`.
- Add new integration test covering asyncio usage with both sparse and
dense indexes.
- Very mechanical refactoring to also bring bulk import functionality
into the AsyncioIndex class as a mixin. I did not add automated tests
for these due to the external dependencies required to properly
integration test this (e.g. parquet files hosted on S3). Will need to
manually verify these in testing.

Also:
- Drop python 3.8, which is now end of life
- Removed `ddtrace` dev dependency for logging test info in datadog.
This was giving me a lot of annoying errors when running tests locally.
I will troubleshoot and bring it back in later.
- Updated `jinja` and `virtualenv` versions in our poetry.lock file to
resolve dependabot alerts
- Work to implement the asyncio codepath for GRPC was previously handled
in a different diff

## Usage

In a standalone script, you might do something like this:

```python
import random
import asyncio
from pinecone import Pinecone

async def main():
    pc = Pinecone(api_key="key")
    async with pc.AsyncioIndex(name="index-name") as index:
        tasks = [
            index.query(
                vector=[random.random()] * 1024,
                namespace="ns1",
                include_values=False,
                include_metadata=True,
                top_k=2,
            )
            for _ in range(20)
        ]

        # Execute 20 queries in parallel
        results = await asyncio.gather(*tasks)
        print(results)

asyncio.run(main())
```

## Type of Change

- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [x] This change requires a documentation update
- [x] Infrastructure change (CI configs, etc)
- [ ] Non-code change (docs, etc)
- [ ] None of the above: (explain here)

## Problem

We want to migrate the content of the inference plugin into the core
SDK. This should improve UX by allowing code completion to function
properly.

## Solution

- Begin by copying & modifying code from
[plugin](https://github.com/pinecone-io/python-plugin-inference)
- Lazily instantiate the `Inference` class from the `Pinecone` parent
class when someone attempts to use inference capabilities (a sketch of
this pattern appears after this list). This saves a little bit of
startup overhead if someone is not using these functions.
- Migrated tests from the plugin and added coverage of some additional
models
- Removed dependency on the `pinecone-plugin-inference` package, since
we now have all the functionality of that plugin in the core SDK.

In addition to copy & modify:
- Implemented Enum objects, `RerankModel` and `EmbedModel`, to serve
mainly as documentation and support for code suggestions. You can still
pass in any string you like to keep this forward compatible with any
future models.
- Inference class now scans for plugins extending its own capability
during initialization. This is so that we can add more experimental
things into the `inference` namespace via future plugins if desired. I
should really define a base class that does this scan for all classes by
default but have not gotten around to it yet.
- Added some warning messages if the user attempts to interact with the
Inference class directly instead of going through the parent object.
This is following a pattern we started for the Index client based on
customer success team feedback.
- Removed old integration tests verifying the old plugin was being
correctly installed.
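
A sketch of the lazy instantiation pattern mentioned above; attribute and constructor details are illustrative, and `Inference` stands in for the class migrated from the plugin:

```python
class Pinecone:
    def __init__(self, api_key: str):
        self._api_key = api_key
        self._inference = None  # not constructed until first use

    @property
    def inference(self):
        # Build the Inference client only when someone actually reaches
        # for inference capabilities, avoiding startup overhead otherwise.
        if self._inference is None:
            self._inference = Inference(api_key=self._api_key)
        return self._inference
```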

## Usage: Embed

```python
from pinecone import Pinecone

pc = Pinecone(api_key='key')

# You can pass the model name as a string if you know the correct value
embeddings = pc.inference.embed(
    model="multilingual-e5-large",
    inputs=["The quick brown fox jumps over the lazy dog.", "lorem ipsum"],
    parameters={"input_type": "query", "truncate": "END"},
)
```

Or, if you'd like you can import an enum of available options

```python
from pinecone import Pinecone, EmbedModel

pc = Pinecone(api_key='key')

# You can get the model name from an enum
embeddings = pc.inference.embed(
    model=EmbedModel.Multilingual_E5_Large,
    inputs=["The quick brown fox jumps over the lazy dog.", "lorem ipsum"],
    parameters={"input_type": "query", "truncate": "END"},
)
```

Or, for the very clever/lazy, the enum is also attached to the
`inference` namespace

```python
from pinecone import Pinecone

pc = Pinecone(api_key='key')

embeddings = pc.inference.embed(
    model=pc.inference.EmbedModel.Multilingual_E5_Large,
    inputs=["The quick brown fox jumps over the lazy dog.", "lorem ipsum"],
    parameters={"input_type": "query", "truncate": "END"},
)
```

## Usage: Rerank

Same enum mechanic is now available for Rerank options

```python
from pinecone import Pinecone, RerankModel

pc = Pinecone(api_key='key')

model = RerankModel.Bge_Reranker_V2_M3
result = pc.inference.rerank(
    model=model,
    query="i love dogs",
    documents=["dogs are pretty cool", "everyone loves dogs", "I'm a cat person"],
    return_documents=False,
)
```


## Type of Change

- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [x] This change requires a documentation update
- [ ] Infrastructure change (CI configs, etc)
- [ ] Non-code change (docs, etc)
- [ ] None of the above: (explain here)
## Problem

Many configuration fields take string inputs even though there is a
limited range of accepted values. It's poor UX to have to go into
documentation or examples in order to learn which string values are
available. It also means support via type hints from code editors is
not available to keep people moving quickly.

## Solution

- Create Enum classes for control plane configuration fields under
`pinecone.enum`:
  - General index configs: `Metric`, `VectorType`, `DeletionProtection`
  - Serverless spec: `CloudProvider`, `AwsRegion`, `GcpRegion`, `AzureRegion`
  - Pod spec: `PodIndexEnvironment`, `PodType`

Kwargs that accept these values are loosely typed as the union of the
enum type and string. This should prevent unnecessary breaking changes
and maintain flexibility to accept new values that may not be available
or known at the time this SDK release is published. For example, if in
the future Pinecone can deploy to more Azure regions, this loose typing
would allow a person to pass that region as a plain string without
necessarily having to update their SDK to satisfy a type check.
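
One common way to implement this loose typing, sketched here for illustration (the SDK's actual definitions may differ):

```python
from enum import Enum
from typing import Union

class Metric(str, Enum):
    COSINE = "cosine"
    EUCLIDEAN = "euclidean"
    DOTPRODUCT = "dotproduct"

def create_index(name: str, metric: Union[Metric, str] = "cosine") -> None:
    # Both Metric.COSINE and the plain string "cosine" are accepted, so a
    # brand-new service-side value works without an SDK upgrade.
    metric_value = metric.value if isinstance(metric, Metric) else metric
    print(f"creating {name} with metric {metric_value}")
```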

## Usage: Serverless

```python
# Old way, which still works but requires you to know what values are available
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key='key')

pc.create_index(
    name="my-index",
    dimension=1024,
    metric="cosine",
    spec=ServerlessSpec(
        cloud="aws", 
        region="us-west-2"
    ),
    vector_type="sparse"
)
```

```python
# New way, using enum types
from pinecone import (
    Pinecone, 
    ServerlessSpec, 
    Metric, 
    VectorType, 
    CloudProvider, 
    AwsRegion
)

pc = Pinecone(api_key='key')

pc.create_index(
    name="my-index",
    dimension=1024,
    metric=Metric.COSINE,
    spec=ServerlessSpec(
        cloud=CloudProvider.AWS, 
        region=AwsRegion.US_WEST_2
    ),
    vector_type=VectorType.SPARSE
)

```

## Usage: Pods

```python
# old way, you have to know all the magic strings

from pinecone import Pinecone, PodSpec

pc = Pinecone(api_key='key')

pc.create_index(
    name="my-index",
    dimension=1024,
    spec=PodSpec(
        pod_type='s1.x4',
        environment="us-east1-gcp"
    ),
)

# Later, when scaling
pc.configure_index(
    name="my-index",
    pod_type="s1.x8"
)
```

```python
# New way, using enum types
from pinecone import (
    Pinecone, 
    PodSpec, 
    PodIndexEnvironment,
    PodType
)

pc = Pinecone(api_key='key')

pc.create_index(
    name="my-index",
    dimension=1024,
    spec=PodSpec(
        environment=PodIndexEnvironment.US_EAST1_GCP,
        pod_type=PodType.S1_X4
    )
)

# Later, when scaling
pc.configure_index(
    name="my-index",
    pod_type=PodType.S1_X8
)
```

## Type of Change

- [ ] Bug fix (non-breaking change which fixes an issue)
- [X] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [ ] This change requires a documentation update
- [ ] Infrastructure change (CI configs, etc)
- [ ] Non-code change (docs, etc)
- [ ] None of the above: (explain here)
## Problem

We need to migrate `create_index_for_model` functionality from the
records plugin into the core of the SDK to provide improved UX around
code completions and error handling
 
## Solution

- Copy & modify implementation of `create_index_for_model` and
`IndexEmbed` from records plugin
- I did some moderately heavy refactoring of this code and also the
existing `create_index` method to reduce the amount of duplication.
- Add records plugin to list of deprecated plugins that will halt
startup with an exception
- Copy relevant tests from plugin and supplement with additional tests
to ensure enum types and edge cases are handled properly even after
refactoring
- Reuse recently-defined enums to make the method signature more
self-documenting.

## Todo

- Data plane operations in the records plugin will be handled in a
separate PR

## Usage

These would all be considered valid usage. Enums are available to help
know what values are accepted, but you can type the literal strings if
you prefer. This flexibility also keeps compatibility with existing
usage of the plugin.

```python
from pinecone import (
    Pinecone, 
    EmbedModel, 
    CloudProvider, 
    AwsRegion, 
    IndexEmbed, 
    Metric
)

pc = Pinecone(api_key='key')

# All hand-crafted literals
pc.create_index_for_model(
    name='index-name',
    cloud='aws',
    region='us-east-1',
    embed={
        "model": "multilingual-e5-large", 
        "field_map": {"text": "my-sample-text"},
        "metric": "cosine"
    },
)

# All enum values
pc.create_index_for_model(
    name='index-name',
    cloud=CloudProvider.AWS,
    region=AwsRegion.US_EAST_1,
    embed=IndexEmbed(
        model=EmbedModel.Multilingual_E5_Large, 
        field_map={"text": "my-sample-text"},
        metric=Metric.COSINE
    ),
)

# Mixed literals and enums
pc.create_index_for_model(
    name='index-name',
    cloud='aws',
    region=AwsRegion.US_EAST_1,
    embed={
        "model": EmbedModel.Multilingual_E5_Large, 
        "field_map": {"text": "my-sample-text"},
        "metric": "cosine"
    },
)
```

## Type of Change

- [x] New feature (non-breaking change which adds functionality)
## Problem

Migrating `search_records` (aliased to `search`) and `upsert_records`
from the `pinecone-plugin-records` plugin.

## Solution

Working off the content of the records plugin, I have done the
following:

- Adjusted the codegen script to fix the way openapi generator handles
underscore fields such as `_id` and `_score`
- Adjusted the rest library code in `rest_urllib3.py` and
`rest_aiohttp.py` to handle record uploading with content-type
`application/x-ndjson`
- Copied and modified the integration tests from the plugin
- Extracted a lot of the guts of the `upsert_records` and
`search_records` methods into the request factory where they could more
easily be unit tested. The logic around parsing user inputs into the
openapi request objects is surprisingly complicated, so I added quite a
lot of new unit tests checking some of those edge cases.
- Compared to the plugin implementation, the major changes are:
  - Made `search` an alias of `search_records`
  - Moved away from usages of `.pop()`, which mutates the input objects;
this could be confusing for users if they are using those objects for
anything else (see the sketch after this list)
  - Added better typing of dict fields
  - Incorporated optional use of enum values for `RerankModel`
- Added asyncio variants of these methods, although most of the guts are
shared in the request factory.
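
For illustration, here is roughly what building the ndjson request body without mutating the caller's records looks like (the helper name is hypothetical):

```python
import json

def build_ndjson_body(records):
    lines = []
    for record in records:
        # Work on a shallow copy rather than calling .pop() on the caller's
        # dict, so their objects are left untouched.
        body = dict(record)
        lines.append(json.dumps(body))
    # Sent with header Content-Type: application/x-ndjson
    return "\n".join(lines)
```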

I already handled disallowing the records plugin in yesterday's PR #438 

## Usage

```python
from pinecone import Pinecone, CloudProvider, AwsRegion, EmbedModel, RerankModel

pc = Pinecone(api_key="key")

# Create an index for your embedding model
index_model = pc.create_index_for_model(
    name="my-model-index",
    cloud=CloudProvider.AWS,
    region=AwsRegion.US_EAST_1,
    embed={
        "model": EmbedModel.Multilingual_E5_Large,
        "field_map": {"text": "my_text_field"}
    }
)

# Create an index client
index = pc.Index(host=index_model.host)

# Upsert records
namespace = "target-namespace"
index.upsert_records(
    namespace=namespace,
    records=[
        {
            "_id": "test1",
            "my_text_field": "Apple is a popular fruit known for its sweetness and crisp texture.",
        },
        {
            "_id": "test2",
            "my_text_field": "The tech company Apple is known for its innovative products like the iPhone.",
        },
        {
            "_id": "test3",
            "my_text_field": "Many people enjoy eating apples as a healthy snack.",
        },
        {
            "_id": "test4",
            "my_text_field": "Apple Inc. has revolutionized the tech industry with its sleek designs and user-friendly interfaces.",
        },
        {
            "_id": "test5",
            "my_text_field": "An apple a day keeps the doctor away, as the saying goes.",
        },
        {
            "_id": "test6",
            "my_text_field": "Apple Computer Company was founded on April 1, 1976, by Steve Jobs, Steve Wozniak, and Ronald Wayne as a partnership.",
        },
    ],
)

# Search for similar records
response = index.search(
    namespace=namespace,
    query={
        "inputs":{
            "text": "Apple corporation",
        },
        "top_k":3,
    },
    rerank={
        "model": RerankModel.Bge_Reranker_V2_M3,
        "rank_fields": ["my_text_field"],
        "top_n": 3,
    },
)
```

These methods also have asyncio variants available

```python
import asyncio
from pinecone import Pinecone, RerankModel

async def main():
    # Create an index client

    pc = Pinecone(api_key='key')
    index = pc.AsyncioIndex(host='host')

    # Upsert records
    namespace = "target-namespace"
    records = [
        {
            "_id": "test1",
            "my_text_field": "Apple is a popular fruit known for its sweetness and crisp texture.",
        },
        {
            "_id": "test2",
            "my_text_field": "The tech company Apple is known for its innovative products like the iPhone.",
        },
        {
            "_id": "test3",
            "my_text_field": "Many people enjoy eating apples as a healthy snack.",
        },
        {
            "_id": "test4",
            "my_text_field": "Apple Inc. has revolutionized the tech industry with its sleek designs and user-friendly interfaces.",
        },
        {
            "_id": "test5",
            "my_text_field": "An apple a day keeps the doctor away, as the saying goes.",
        },
        {
            "_id": "test6",
            "my_text_field": "Apple Computer Company was founded on April 1, 1976, by Steve Jobs, Steve Wozniak, and Ronald Wayne as a partnership.",
        },
    ]
    await index.upsert_records(
        namespace=namespace,
        records=records,
    )

    # Search for similar records
    response = await index.search(
        namespace=namespace,
        query={
            "inputs":{
                "text": "Apple corporation",
            },
            "top_k":3,
        },
        rerank={
            "model": RerankModel.Bge_Reranker_V2_M3,
            "rank_fields": ["my_text_field"],
            "top_n": 3,
        },
    )
    
asyncio.run(main())
```

## Type of Change

- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [ ] This change requires a documentation update
- [ ] Infrastructure change (CI configs, etc)
- [ ] Non-code change (docs, etc)
- [ ] None of the above: (explain here)
## Problem

Previous work implemented asyncio for the db data plane, and now we want
to roll out a similar approach for the db control plane and inference as
well.

## Solution

- Extract request construction logic out of `Pinecone` and move it to a
request factory
- Implement `PineconeAsyncio` using the request factory to keep most of
the method-specific logic the same.
- Add new integration tests using the asyncio code path. These are
mostly modified from the existing serverless integration tests.
- Update tests for the asyncio index client to reflect new setup steps
- Some refactorings around async context management to address log
warnings being shown from aiohttp

## Usage

The async version of the client has some async setup/teardown related to
the underlying aiohttp library being used. You can either use the `async
with` syntax to have the async context automatically managed for you,
or, if you prefer, take responsibility for closing the async context
yourself by calling `close()`.

#### Context management option 1: Using `async with`

```python
import asyncio
from pinecone import (
    PineconeAsyncio,
    ServerlessSpec,
    CloudProvider,
    AwsRegion,
)

async def main():
    async with PineconeAsyncio(api_key="key") as pc:
        await pc.create_index(
            name="my-index",
            metric="cosine",
            spec=ServerlessSpec(
                cloud=CloudProvider.AWS, 
                region=AwsRegion.US_EAST_1
            ),
        )
        
asyncio.run(main())
```

#### Context management option 2: Manually `close()`

```python
import asyncio
from pinecone import (
    PineconeAsyncio,
    ServerlessSpec,
    CloudProvider,
    AwsRegion,
)

async def main():
    pc = PineconeAsyncio(api_key="key")
    await pc.create_index(
        name="my-index",
        metric="cosine",
        spec=ServerlessSpec(
            cloud=CloudProvider.AWS, 
            region=AwsRegion.US_EAST_1
        ),
    )
    await pc.close() # <-- Don't forget to close the client when you are done making network calls
        
asyncio.run(main())
```

#### Sparse index example

```python
import asyncio
import random
from pinecone import (
    PineconeAsyncio,
    ServerlessSpec,
    CloudProvider,
    AwsRegion,
    Metric,
    VectorType,
    Vector,
    SparseValues,
)

async def main():
    async with PineconeAsyncio() as pc:
        # Create a sparse index
        index_name = "my-index2"
        
        if not await pc.has_index(index_name):
            await pc.create_index(
                name=index_name,
                metric=Metric.DOTPRODUCT,
                spec=ServerlessSpec(
                    cloud=CloudProvider.AWS, 
                    region=AwsRegion.US_EAST_1
                ),
                vector_type=VectorType.SPARSE,
                tags={
                    "env": "testing",
                }
            )
        
        # Get the index host
        description = await pc.describe_index(name=index_name)
        
        # Make an index client
        async with pc.Index(host=description.host) as idx:

            # Upsert some sparse vectors
            await idx.upsert(
                vectors=[
                    Vector(
                        id=str(i), 
                        sparse_values=SparseValues(
                            indices=[j for j in range(100)], 
                            values=[random.random() for _ in range(100)]
                        )
                    ) for i in range(50)
                ]
            )
            
            # Query the index
            query_results = await idx.query(
                top_k=5,
                sparse_vector=SparseValues(
                    indices=[5, 10, 20], 
                    values=[0.5, 0.5, 0.5]
                ),
            )
            print(query_results)

asyncio.run(main())
```

## Type of Change

- [x] New feature (non-breaking change which adds functionality)
## Problem

The more type hints we have in our package, the better people will be
able to understand how to use it.

## Solution

Much of the untyped code in the package is derived from old generated
code that we have extracted. So there's a fair amount of refactoring in
this diff to try to break out some smaller classes and functions with
seams that you can start to analyze and type; when everything is just
big mutable state blobs, it is quite hard to reason about.

Along the way I uncovered that bulk import features were in a broken
state because some of those operation ids were modified since the last
release and these functions are not well covered by automated tests.
This sort of thing really highlights why we need better type coverage in
the package.

## Type of Change

- [x] Bug fix (non-breaking change which fixes an issue)
- [x] None of the above: Refactoring to improve type safety
## Problem

Want to support python 3.13, drop support for python 3.8

## Solution

Dependency changes:
- Remove httpx, explored in testing but not needed for the release
- Move pytest-asyncio from dependencies into dev dependencies
- Adjust grpcio and pandas (dev dependency) for 3.13 compatibility

## Type of Change

- [x] Bug fix (non-breaking change which fixes an issue)
## Problem

Rather than adding a new dependency for everyone, we want asyncio to be
an extras install similar to how grpc is handled.

## Solution

Adjust pyproject.toml to migrate aiohttp into an asyncio extras that
will be installed like `pinecone[asyncio]`. Adjust test configuration to
install the required dependencies.

## Type of Change

- [x] Infrastructure change (CI configs, etc)
## Problem

We want to deprecate a few kwargs and give more helpful error messages.

## Solution

- Deprecate the `config` and `openapi_config` kwargs; as far as I know
these were only used for tests, and they have not appeared in
documentation. It adds quite a bit of complexity trying to merge
configuration from all these sources, so going forward we would prefer
to expect configuration from a single source, which is named kwargs.
- Add `NotImplementedError` messages that clearly explain some features
that are not implemented yet for PineconeAsyncio. This is preferable to
allowing users to pass kwargs that have no effect and then getting
frustrated when they seem not to work.
- Give a more informative error when it looks like the user has passed
an invalid value as the host

## Type of Change

- [x] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [x] This change requires a documentation update
## Problem

We want to minimize the number of required dependencies we have, and
tqdm is non-essential. Moreover, common notebook environments like
Google Colab will already have tqdm loaded even if we do not declare
this an explicit dependency.

## Solution

Instead of having a specific dependency on tqdm, we want to detect and
use it if it is available in the environment. Otherwise, we just no-op
with a stub implementation of our own.
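
A minimal sketch of the detect-or-stub pattern (the SDK's stub may differ in detail):

```python
try:
    from tqdm.auto import tqdm
except ImportError:

    class tqdm:  # noop stand-in with the same call shape we rely on
        def __init__(self, iterable=None, total=None, desc=None, **kwargs):
            self._iterable = iterable

        def __iter__(self):
            return iter(self._iterable if self._iterable is not None else [])

        def update(self, n=1):
            pass

        def __enter__(self):
            return self

        def __exit__(self, *exc):
            return False
```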

## Type of Change

- [x] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [x] This change requires a documentation update
## Problem

If you only care about doing async index operations, it's cumbersome to
have to manage nested async contexts by going through the
`PineconeAsyncio` class.

## Solution

This is a very simple change that enables people to create an async
index client.

## Usage

You can now do index operations via asyncio like this:

```python
import asyncio
from pinecone import Pinecone

async def main():
    async with Pinecone().IndexAsyncio(host='myhost') as idx:
        await idx.upsert(...) # do async things

asyncio.run(main())
```

## Type of Change

- [x] New feature (non-breaking change which adds functionality)
## Problem

Need to wire up several properties that pass configurations in to the
underlying aiohttp library.

## Solution

Wire up several properties for `PineconeAsyncio` that are already
supported in `Pinecone`:
- `additional_headers` to pass extra headers on each request. Useful for
internal testing.
- `ssl_verify` to turn off ssl verification. Sometimes useful for
testing.
- `proxy_url` to send traffic through a proxy
- `ssl_ca_certs` to specify a custom cert bundle. Expecting a path to a
PEM file.

Currently `proxy_headers` is not accepted by `PineconeAsyncio` because I
haven't figured out how to use these with aiohttp.
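
Putting the new kwargs together (the values here are placeholders):

```python
import asyncio
from pinecone import PineconeAsyncio

async def main():
    async with PineconeAsyncio(
        api_key="key",
        additional_headers={"X-My-Header": "testing"},
        proxy_url="http://localhost:8080",
        ssl_ca_certs="/path/to/custom-bundle.pem",  # path to a PEM file
        ssl_verify=True,  # set False to disable verification while testing
    ) as pc:
        print(await pc.list_indexes())

asyncio.run(main())
```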

## Type of Change

- [x] New feature (non-breaking change which adds functionality)
## Problem

When migrating `embed` and `rerank` over from the plugin, I forgot to
include these custom return objects.
 
## Solution

Add custom return types, adjust tests to ensure result is iterable.

## Type of Change

- [x] Bug fix (non-breaking change which fixes an issue)
## Problem

Need to overhaul README, other markdown docs, and docstrings to reflect
recent changes and additions to the SDK.

## Solution

<img width="600" alt="Kermit typing meme"
src="https://media1.giphy.com/media/v1.Y2lkPTc5MGI3NjExeHV4cTM1cjhsYWtmbG03aHgwZ3hleXM2OTUzbzI1YnN2dmloN3BidSZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw/LmBsnpDCuturMhtLfw/giphy.gif"
/>
jhamon marked this pull request as ready for review February 7, 2025 14:45
jhamon merged commit 74fd5bc into main on February 7, 2025 (58 checks passed)
jhamon deleted the release-candidate/2025-01 branch February 7, 2025 14:46