# Interacting with a Pinecone service

A Pinecone service builds an index over your input vectors
and lets you query that index for their nearest neighbors.
A Pinecone service supports the following operations:

* `upsert`: insert data formatted as `(id, vector)` tuples into the index, or replace existing `(id, vector)` tuples with new vector values.
* `delete`: delete vectors by id.
* `query`: query the index and retrieve the top-k nearest neighbors, ranked by a metric such as dot product, cosine similarity, or Euclidean distance.
* `fetch`: fetch vectors stored in the index by id.
* `info`: get statistics about the index.
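
To make the semantics of these operations concrete, here is a minimal in-memory sketch in plain Python. `ToyIndex` and its methods are hypothetical names used only for illustration. It is not the Pinecone client, and it says nothing about how the service is implemented: it does a brute-force scan and, as an assumption, scores candidates by negative squared Euclidean distance so that higher scores mean closer vectors.

```python
class ToyIndex:
    """Hypothetical in-memory stand-in for an index; not the Pinecone API."""

    def __init__(self):
        self._vectors = {}  # id -> list of floats

    def upsert(self, items):
        # Insert new (id, vector) pairs or overwrite existing ids.
        for item_id, vector in items:
            self._vectors[item_id] = [float(x) for x in vector]

    def delete(self, ids):
        # Remove ids that are present; silently skip the rest.
        for item_id in ids:
            self._vectors.pop(item_id, None)

    def fetch(self, ids):
        # Return the stored vector for each id, or an empty list if missing.
        return [self._vectors.get(item_id, []) for item_id in ids]

    def query(self, vector, top_k):
        # Brute-force scan: score every stored vector, keep the best top_k ids.
        def score(stored):
            return -sum((a - b) ** 2 for a, b in zip(vector, stored))

        ranked = sorted(
            self._vectors, key=lambda i: score(self._vectors[i]), reverse=True
        )
        return ranked[:top_k]

    def info(self):
        # The only statistic tracked here is the number of stored vectors.
        return {"index_size": len(self._vectors)}


index = ToyIndex()
index.upsert([("A", [1, 1]), ("B", [2, 2]), ("C", [3, 3])])
index.upsert([("A", [0, 0])])  # same id: replaces the old vector
nearest = index.query([2.1, 2.1], top_k=2)
index.delete(["A"])
size = index.info()["index_size"]
```

Because `upsert` is keyed by id, writing `A` a second time changes its vector without changing the number of stored items.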

## Prerequisites

Install dependencies.

In [1]:
!pip install -qU pip pinecone-client pandas

Set up Pinecone.

In [2]:
import pinecone
import os

api_key = os.getenv("PINECONE_API_KEY") or "USE_YOUR_API_KEY"
pinecone.init(api_key=api_key)
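
The cell above falls back to a placeholder string when `PINECONE_API_KEY` is not set, so a missing key only surfaces later as a failed request. One possible alternative is to fail fast. The helper below is a hypothetical sketch (the name `require_api_key` is not part of the Pinecone client) that uses only the standard library.

```python
import os


def require_api_key(env_var="PINECONE_API_KEY"):
    """Return the key from the environment, or raise with a clear message."""
    api_key = os.getenv(env_var)
    if not api_key:
        raise RuntimeError("Set the {} environment variable first.".format(env_var))
    return api_key
```

The returned value could then be passed to `pinecone.init(api_key=...)` in place of the placeholder fallback.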

Check Pinecone version compatibility.

In [3]:
import pinecone.info

version_info = pinecone.info.version()
server_version = ".".join(version_info.server.split(".")[:2])
client_version = ".".join(version_info.client.split(".")[:2])
notebook_version = "0.8"

assert (
    notebook_version == server_version
), "This notebook is outdated. Consider using the latest version of the notebook."
assert client_version == server_version, "Please upgrade pinecone-client."
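
The check above compares only the major and minor components of each version, so patch releases are treated as compatible. A small standalone sketch of that comparison follows; `major_minor` is a hypothetical helper and the version strings are made up for illustration.

```python
def major_minor(version: str) -> str:
    """Keep only the first two dot-separated components of a version string."""
    return ".".join(version.split(".")[:2])


# Made-up version strings, for illustration only.
same_release_line = major_minor("0.8.12") == major_minor("0.8.3")
different_release_line = major_minor("0.8.12") == major_minor("0.9.0")
```

Comparing the truncated strings is adequate for an equality check. Ordering versions this way would not be, since `"0.10" < "0.9"` when compared as strings.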

## Interacting with a service

In [4]:
import pinecone.graph
import pinecone.service
import pinecone.connector
import pandas as pd
import numpy as np

In [5]:
service_name = "pinecone-example"

# Deploy a service
graph = pinecone.graph.IndexGraph(metric="euclidean")
pinecone.service.deploy(service_name=service_name, graph=graph)

# Create a connection
conn = pinecone.connector.connect(service_name)

### Insert vectors

In [6]:
# Generate some data

df = pd.DataFrame()
df["id"] = ["A", "B", "C", "D", "E"]
df["vector"] = [[1, 1], [2, 2], [3, 3], [4, 4], [5, 5]]
df

|   | id | vector |
|---|----|--------|
| 0 | A  | [1, 1] |
| 1 | B  | [2, 2] |
| 2 | C  | [3, 3] |
| 3 | D  | [4, 4] |
| 4 | E  | [5, 5] |


In [7]:
# Method 1: in-memory, upsert everything at once using `collect()`
AB_df = df[:2]
acks_AB = conn.upsert(items=zip(AB_df.id, AB_df.vector)).collect()
acks_AB

[IndexResult(id='A'), IndexResult(id='B')]

In [8]:
# Method 2: in-memory, batch-by-batch
CDE_df = df[2:]
db_cursor = conn.upsert(items=zip(CDE_df.id, CDE_df.vector))
acks_C = db_cursor.take(1)
acks_DE = db_cursor.take(2)
print(acks_C)
print(acks_DE)

[IndexResult(id='C')]
[IndexResult(id='D'), IndexResult(id='E')]


In [9]:
# Method 3: stream with generators.
# For instance, this interface lets you consume Kafka streams.
stream_iterator = (("A{}".format(ii), [101, 201]) for ii in range(10))
print(stream_iterator)

for ack in conn.upsert(items=stream_iterator).stream():
    print(ack)

<generator object <genexpr> at 0x7f6d13812cf0>
IndexResult(id='A0')
IndexResult(id='A1')
IndexResult(id='A2')
IndexResult(id='A3')
IndexResult(id='A4')
IndexResult(id='A5')
IndexResult(id='A6')
IndexResult(id='A7')
IndexResult(id='A8')
IndexResult(id='A9')
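
The three methods above differ only in how the acknowledgements are consumed: all at once, in batches, or one by one. The sketch below imitates that behavior with a plain Python iterator. `ToyCursor` and the `"ack:..."` strings are hypothetical and are not Pinecone's cursor implementation; the sketch assumes results are produced lazily, which the notebook does not confirm.

```python
from itertools import islice


class ToyCursor:
    """Hypothetical cursor over a lazy stream of acknowledgements."""

    def __init__(self, items):
        # Nothing is processed until results are requested.
        self._acks = ("ack:{}".format(item_id) for item_id, _vector in items)

    def collect(self):
        # Drain everything that is left into a list.
        return list(self._acks)

    def take(self, n):
        # Pull at most n results and leave the rest for later calls.
        return list(islice(self._acks, n))

    def stream(self):
        # Hand results to the caller one at a time.
        yield from self._acks


items = [("C", [3, 3]), ("D", [4, 4]), ("E", [5, 5])]
cursor = ToyCursor(items)
first = cursor.take(1)       # one result
rest = cursor.take(2)        # the next two results
leftover = cursor.collect()  # nothing remains

generated = (("A{}".format(i), [101, 201]) for i in range(3))
streamed = list(ToyCursor(generated).stream())
```

In this sketch a cursor can be consumed only once: after `take` has pulled every result, `collect` returns an empty list.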


### Fetch vectors

In [10]:
# Fetch vectors by ID
fetch_results = conn.fetch(ids=["A", "B", "C"]).collect()
for result in fetch_results:
    print(result)

FetchResult(id='A', vector=array([1., 1.], dtype=float32))
FetchResult(id='B', vector=array([2., 2.], dtype=float32))
FetchResult(id='C', vector=array([3., 3.], dtype=float32))

### Query top-k vectors

In [11]:
# Query top-k nearest neighbors
query_results = conn.query(queries=[[1.1, 1.1], [2.2, 2.2]], top_k=2).collect()
for result in query_results:
    print(result)

QueryResult(ids=['A', 'B'], scores=[-0.019999980926513672, -1.619999885559082], data=None)
QueryResult(ids=['B', 'C'], scores=[-0.07999992370605469, -1.279998779296875], data=None)
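
The scores printed for these queries are consistent with the negative squared Euclidean distance between the query and each stored vector, so a score closer to zero means a closer neighbor. This reading is inferred from the printed numbers rather than taken from Pinecone documentation. The plain-Python sketch below recomputes the scores under that assumption; `neg_squared_euclidean` and `top_k` are hypothetical helpers, and the streamed `A0` to `A9` vectors are left out because they are far from both queries.

```python
def neg_squared_euclidean(query, vector):
    """Negative squared Euclidean distance: 0 is a perfect match, lower is farther."""
    return -sum((q - v) ** 2 for q, v in zip(query, vector))


# The five vectors upserted from the DataFrame.
stored = {"A": [1, 1], "B": [2, 2], "C": [3, 3], "D": [4, 4], "E": [5, 5]}


def top_k(query, k):
    # Score every stored vector, then keep the k highest scores.
    scored = [(item_id, neg_squared_euclidean(query, vec)) for item_id, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


for query in ([1.1, 1.1], [2.2, 2.2]):
    print(top_k(query, 2))
```

For the first query, `A` differs by 0.1 in each coordinate, giving -(0.01 + 0.01) = -0.02, and `B` differs by 0.9, giving -(0.81 + 0.81) = -1.62. The small deviations in the service output match what 32-bit floating-point rounding would produce.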

### Update vectors by ID

In [12]:
# Fetch current vectors by ID
fetch_results = conn.fetch(ids=["A"]).collect()
for result in fetch_results:
    print(result)

FetchResult(id='A', vector=array([1., 1.], dtype=float32))

In [13]:
# Update vectors by ID
acks = conn.upsert(items=[("A", [0, 0])]).collect()
acks

[IndexResult(id='A')]

In [14]:
# Fetch vectors by the same ID again
fetch_results = conn.fetch(ids=["A"]).collect()
for result in fetch_results:
    print(result)

FetchResult(id='A', vector=array([0., 0.], dtype=float32))

### Delete vectors by ID

In [15]:
# Delete vectors by ID
acks = conn.delete(ids=["A"]).collect()
print(acks)

[DeleteResult(id='A')]


In [16]:
# Fetching a deleted ID returns an empty vector
fetch_results = conn.fetch(ids=["A", "B"]).collect()
for result in fetch_results:
    print(result)

FetchResult(id='A', vector=array([], dtype=float32))
FetchResult(id='B', vector=array([2., 2.], dtype=float32))

### Get index statistics

In [17]:
# Index statistics
conn.info()

InfoResult(index_size=14)
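
The reported size of 14 matches the ids written in this notebook: five from the DataFrame, ten from the generator stream, minus the deleted `A`. The sketch below reproduces that count with a plain set, assuming `index_size` counts the ids currently stored (an inference from the output, not a documented definition).

```python
# IDs upserted from the DataFrame and from the generator stream.
dataframe_ids = {"A", "B", "C", "D", "E"}
streamed_ids = {"A{}".format(i) for i in range(10)}

# Re-upserting "A" with a new vector replaced it, so it added no new id.
# Deleting "A" afterwards removed one id.
live_ids = (dataframe_ids | streamed_ids) - {"A"}
print(len(live_ids))  # 14
```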

### Stop the service

In [18]:
# Stop the service
pinecone.service.stop(service_name=service_name)

{'success': True}