[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pinecone-io/examples/blob/master/docs/quick-tour/interacting-with-the-index.ipynb) [![Open nbviewer](https://raw.githubusercontent.com/pinecone-io/examples/master/assets/nbviewer-shield.svg)](https://nbviewer.org/github/pinecone-io/examples/blob/master/docs/quick-tour/interacting-with-the-index.ipynb)

# Interacting with a Pinecone index

Pinecone creates an index for your input vectors,
and it lets you query their nearest neighbors.
A Pinecone index supports the following operations:

* `upsert`: insert data formatted as `(id, vector)` tuples into the index, or replace the vector values stored under an existing id. Optionally, attach metadata to each vector so that queries can filter on it; an upserted item then has the form `(id, vector, metadata)`.
* `delete`: delete vectors by id.
* `query`: query the index and retrieve the top-k nearest neighbors, ranked by the metric the index was created with (dot product, cosine similarity, or Euclidean distance).
* `fetch`: fetch vectors stored in the index by id.
* `describe_index_stats`: get statistics about the index.
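
To make the semantics of these operations concrete before touching a real index, here is a minimal in-memory sketch. `ToyIndex` is a hypothetical stand-in written for illustration only; it is not Pinecone's implementation or API, and it uses a brute-force Euclidean search.

```python
import math


class ToyIndex:
    """In-memory stand-in that mimics the semantics of the five operations.

    Hypothetical illustration only: not Pinecone's implementation or API.
    """

    def __init__(self, dimension):
        self.dimension = dimension
        self.vectors = {}  # id -> (values, metadata)

    def upsert(self, vectors):
        # Each item is (id, values) or (id, values, metadata).
        count = 0
        for item in vectors:
            vec_id, values = item[0], list(item[1])
            metadata = item[2] if len(item) > 2 else {}
            if len(values) != self.dimension:
                raise ValueError(f"expected {self.dimension} values, got {len(values)}")
            self.vectors[vec_id] = (values, metadata)  # insert or overwrite
            count += 1
        return {"upserted_count": count}

    def fetch(self, ids):
        # Ids that are not stored are simply left out of the result.
        return {i: self.vectors[i] for i in ids if i in self.vectors}

    def delete(self, ids):
        for i in ids:
            self.vectors.pop(i, None)

    def query(self, vector, top_k):
        # Brute-force Euclidean distance; a smaller distance means a closer match.
        scored = [
            (math.dist(vector, values), vec_id)
            for vec_id, (values, _) in self.vectors.items()
        ]
        return [vec_id for _, vec_id in sorted(scored)[:top_k]]

    def describe_index_stats(self):
        return {"dimension": self.dimension, "total_vector_count": len(self.vectors)}


toy = ToyIndex(dimension=2)
toy.upsert([("A", [1.0, 1.0]), ("B", [2.0, 2.0], {"color": "red"})])
toy.upsert([("A", [0.0, 0.0])])  # same id: the stored values are replaced
nearest = toy.query([1.9, 1.9], top_k=1)
toy.delete(["A"])
remaining = toy.fetch(["A", "B"])
stats = toy.describe_index_stats()
print(nearest, list(remaining), stats)
```

The rest of this notebook performs the same sequence of steps against a real Pinecone index.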

## Prerequisites

Install dependencies.

In [1]:
!pip install -qU \
  pinecone-client==3.0.0 \
  pandas==2.0.3

Set up Pinecone.

Before getting started, decide whether to use a serverless or a pod-based index.

In [None]:
import os

use_serverless = os.environ.get("USE_SERVERLESS", "False").lower() == "true"
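
Note that the comparison above only treats the string `true` (in any casing) as enabling serverless; values such as `1` or `yes` leave it disabled. The short check below illustrates this. The helper name `flag_enabled` is hypothetical and only mirrors the expression used in the cell.

```python
def flag_enabled(raw):
    # Mirrors the comparison used above: only "true" (any casing) enables the flag.
    return raw.lower() == "true"


for raw in ["true", "True", "TRUE", "False", "1", "yes", ""]:
    print(f"{raw!r:>8} -> {flag_enabled(raw)}")
```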

## Creating an Index

Next, we set up an index to store our vectors.

We begin by initializing our connection to Pinecone. To do this we need a [free API key](https://app.pinecone.io).

In [None]:
from pinecone import Pinecone

# initialize connection to pinecone (get API key at app.pinecone.io)
api_key = os.environ.get('PINECONE_API_KEY') or 'PINECONE_API_KEY'
environment = os.environ.get('PINECONE_ENVIRONMENT') or 'PINECONE_ENVIRONMENT'

# configure client
pc = Pinecone(api_key=api_key)

Now we set up our index specification, which defines the cloud provider and region where the index is deployed. You can find a list of all [available providers and regions here](https://docs.pinecone.io/docs/projects).

In [None]:
from pinecone import ServerlessSpec, PodSpec

if use_serverless:
    spec = ServerlessSpec(cloud='aws', region='us-west-2')
else:
    spec = PodSpec(environment=environment)

In [3]:
index_name = "interacting-with-the-index"

In [4]:
import time

# Delete index if exists
if index_name in pc.list_indexes().names():
    pc.delete_index(index_name)

# Create index
pc.create_index(
    name=index_name, 
    dimension=2, 
    metric="euclidean",
    spec=spec
)

# wait for index to be ready before connecting
while not pc.describe_index(index_name).status['ready']:
    time.sleep(1)

# Connect to the index
index = pc.Index(index_name)
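
The readiness loop above polls forever if the index never becomes ready. One possible variation is a polling helper with a timeout. This is a sketch, not part of the Pinecone client: `wait_until`, its parameters, and the `fake_ready` stand-in are all hypothetical names introduced here.

```python
import time


def wait_until(is_ready, timeout=60.0, interval=1.0):
    """Poll is_ready() until it returns True, or raise after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while not is_ready():
        if time.monotonic() >= deadline:
            raise TimeoutError(f"not ready after {timeout} seconds")
        time.sleep(interval)


# Stand-in readiness check that succeeds on the third poll.
polls = {"count": 0}


def fake_ready():
    polls["count"] += 1
    return polls["count"] >= 3


wait_until(fake_ready, timeout=5.0, interval=0.01)
print(polls["count"])
```

With the objects defined in this notebook, the predicate could be `lambda: pc.describe_index(index_name).status['ready']`.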

### Insert vectors


In [5]:
# Generate some data
import pandas as pd

df = pd.DataFrame()
df["id"] = ["A", "B", "C", "D", "E"]
df["vector"] = [[1., 1.], [2., 2.], [3., 3.], [4., 4.], [5., 5.]]
df

| | id | vector |
|---|---|---|
| 0 | A | [1.0, 1.0] |
| 1 | B | [2.0, 2.0] |
| 2 | C | [3.0, 3.0] |
| 3 | D | [4.0, 4.0] |
| 4 | E | [5.0, 5.0] |


We now upsert the first two vectors into the index. The call inserts a new vector, or overwrites the stored vector if the id is already present.

In [6]:
# Upsert the vectors
AB_df = df[:2]
index.upsert(vectors=list(zip(AB_df.id, AB_df.vector)))

{'upserted_count': 2}
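
For larger datasets, a common pattern is to build `(id, vector, metadata)` tuples and send them in batches rather than in a single call. The sketch below shows only the local preparation step in plain Python. The `chunked` helper, the batch size of 2, and the `position` metadata field are assumptions made for illustration.

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


ids = ["A", "B", "C", "D", "E"]
vectors = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0], [5.0, 5.0]]
# Hypothetical metadata, added for illustration only.
metadata = [{"position": n} for n in range(len(ids))]

records = list(zip(ids, vectors, metadata))
batches = list(chunked(records, size=2))
for batch in batches:
    # With a live index, each batch could be passed to index.upsert(vectors=batch).
    print([record[0] for record in batch])
```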

### Fetch vectors

In [7]:
# Fetch vectors by ID
fetch_results = index.fetch(ids=["A", "B"])
fetch_results

{'namespace': '',
 'vectors': {'A': {'id': 'A', 'metadata': {}, 'values': [1.0, 1.0]},
             'B': {'id': 'B', 'metadata': {}, 'values': [2.0, 2.0]}}}

### Query top-k vectors

In [9]:
# Query top-k nearest neighbors
query_results = index.query(vector=[1.1, 1.1], top_k=2)
query_results

{'matches': [{'id': 'A', 'score': 0.0199999809, 'values': []},
             {'id': 'B', 'score': 1.61999989, 'values': []}],
 'namespace': ''}
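
Because this index uses the `euclidean` metric, lower scores mean closer matches. The scores above are consistent with squared Euclidean distance, which the plain-Python check below reproduces. This is a local calculation, not Pinecone code; the tiny differences from the notebook output are presumably due to lower-precision floats on the service side.

```python
def squared_euclidean(a, b):
    # Sum of squared coordinate differences (no square root taken).
    return sum((x - y) ** 2 for x, y in zip(a, b))


query = [1.1, 1.1]
stored = {"A": [1.0, 1.0], "B": [2.0, 2.0]}

scores = {vec_id: squared_euclidean(query, values) for vec_id, values in stored.items()}
ranked = sorted(scores, key=scores.get)  # ascending: closest first
print(ranked, scores)
```

The result is approximately 0.02 for `A` and 1.62 for `B`, matching the ranking returned by the query.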

### Update vectors by ID

In [10]:
# Fetch current vectors by ID
fetch_result = index.fetch(ids=["A"])
fetch_result

{'namespace': '',
 'vectors': {'A': {'id': 'A', 'metadata': {}, 'values': [1.0, 1.0]}}}

In [11]:
# Update vectors by ID
index.upsert(vectors=[("A",[0., 0.])])

{'upserted_count': 1}

In [12]:
# Fetch vector by the same ID again
fetch_result = index.fetch(ids=["A"])
fetch_result

{'namespace': '',
 'vectors': {'A': {'id': 'A', 'metadata': {}, 'values': [0.0, 0.0]}}}

### Delete vectors by ID

In [13]:
# Delete vectors by ID
index.delete(ids=["A"])

{}

In [14]:
# Deleted vectors no longer appear in the fetch results
fetch_results = index.fetch(ids=["A", "B"])
fetch_results

{'namespace': '',
 'vectors': {'B': {'id': 'B', 'metadata': {}, 'values': [2.0, 2.0]}}}
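
Since a fetch omits ids that are not stored, the requested ids can be compared with the returned ones to see which are missing. The sketch below works on a plain dictionary copied from the output above, so it runs without a connection; the variable names are illustrative.

```python
# Response shape copied from the fetch output above.
response = {
    "namespace": "",
    "vectors": {"B": {"id": "B", "metadata": {}, "values": [2.0, 2.0]}},
}

requested = ["A", "B"]
missing = [vec_id for vec_id in requested if vec_id not in response["vectors"]]
print(missing)
```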

### Get index statistics

In [15]:
# Index statistics
index.describe_index_stats()

{'dimension': 2,
 'index_fullness': 1e-05,
 'namespaces': {'': {'vector_count': 1}},
 'total_vector_count': 1}

### Delete the index

In [16]:
# delete the index
pc.delete_index(index_name)