# **Pinecone tutorial**
This is the sample notebook from the Pinecone site.

# **How to run in Colab:**

This notebook runs on Google Colab and in standalone Python development environments. Click the badge below to open it in Colab:

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/datariders/tutorials/blob/main/vectordb/pinecone/pinecone_tutorial.ipynb)


# **References:**
https://docs.pinecone.io/guides/get-started/quickstart


In [1]:
!pip3 install pinecone-client



In [2]:
import os
from getpass import getpass

# Prompt for the key without echoing it, then store it in the environment for this session
pinecone_api_key = getpass("🔑 Enter your Pinecone API key and hit Enter:")
os.environ["PINECONE_API_KEY"] = pinecone_api_key
print("Pinecone key has been entered")

🔑 Enter your Pinecone API key and hit Enter:··········
Pinecone key has been entered


In [3]:
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key=pinecone_api_key)
print(" pc: ", pc)

 pc:  <pinecone.control.pinecone.Pinecone object at 0x7f884caa64a0>


# **Create index**

Next, create a serverless index named "hello-pinecone" that performs nearest-neighbor search on 3-dimensional vectors using the cosine metric. If an index with that name already exists, it is deleted first:

In [4]:
# Giving our index a name
index_name = "hello-pinecone"

In [5]:
# Delete the index, if an index of the same name already exists
if index_name in pc.list_indexes().names():
    pc.delete_index(index_name)

In [6]:
import time

dimensions = 3
pc.create_index(
    name=index_name,
    dimension=dimensions,
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)

# wait for index to be ready before connecting
while not pc.describe_index(index_name).status['ready']:
    time.sleep(1)

index = pc.Index(index_name)
print(" index: ", index)

 index:  <pinecone.data.index.Index object at 0x7f884cb69180>


In [7]:
import pandas as pd

df = pd.DataFrame(
    data={
        "id": ["A", "B"],
        "vector": [[1., 1., 1.], [1., 2., 3.]]
    })

display(df)

  id           vector
0  A  [1.0, 1.0, 1.0]
1  B  [1.0, 2.0, 3.0]


In [8]:
index.upsert(vectors=zip(df.id, df.vector))  # insert vectors

{'upserted_count': 2}
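
The `zip` call above passes each record as an `(id, values)` tuple. As a minimal sketch, the same records can be written as explicit dictionaries with `id` and `values` keys, a form `upsert` also accepts and that leaves room for an optional `metadata` field. The helper name `to_records` is hypothetical, and plain lists stand in for the DataFrame columns so the sketch runs without pandas or a Pinecone connection.

```python
def to_records(ids, vectors):
    """Pair each id with its vector as an {"id": ..., "values": ...} dict."""
    ids, vectors = list(ids), list(vectors)
    if len(ids) != len(vectors):
        raise ValueError("ids and vectors must have the same length")
    return [
        {"id": str(i), "values": [float(x) for x in v]}
        for i, v in zip(ids, vectors)
    ]


# Same two vectors as the DataFrame above (hypothetical stand-in for df.id / df.vector)
records = to_records(["A", "B"], [[1.0, 1.0, 1.0], [1.0, 2.0, 3.0]])
print(records)
```

With a live index, `index.upsert(vectors=records)` should send the same two vectors as the `zip` form.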

In [9]:
index.describe_index_stats()

{'dimension': 3,
 'index_fullness': 0.0,
 'namespaces': {},
 'total_vector_count': 0}
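
The stats above report `total_vector_count: 0` even though the upsert returned `upserted_count: 2`. A likely explanation (an assumption, not something this notebook verifies) is that newly written vectors take a short time to become visible. One way to handle that is to poll the stats until the expected count appears, sketched below. `wait_for_vector_count` and `fake_stats` are hypothetical names; the fake stands in for `index.describe_index_stats` so the sketch runs without a Pinecone connection, and it assumes the stats response supports key access the way `describe_index(...).status['ready']` does above.

```python
import time


def wait_for_vector_count(get_stats, expected, timeout=30.0, interval=1.0):
    """Poll get_stats() until it reports at least `expected` vectors.

    Returns True once the count is reached, False if `timeout` seconds pass first.
    """
    deadline = time.monotonic() + timeout
    while True:
        if get_stats()["total_vector_count"] >= expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Hypothetical stand-in for index.describe_index_stats: reports 0 twice, then 2.
_fake_counts = iter([0, 0, 2])


def fake_stats():
    return {"total_vector_count": next(_fake_counts)}


ready = wait_for_vector_count(fake_stats, expected=2, timeout=5.0, interval=0.01)
print(ready)
```

Against the real index this would be called as `wait_for_vector_count(index.describe_index_stats, expected=2)` before running the query.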

In [10]:
index.query(
    vector=[2., 2., 2.],
    top_k=5,
    include_values=True) # returns top_k matches

{'matches': [], 'namespace': '', 'usage': {'read_units': 1}}
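
The query above returned no matches, most likely because it ran before the upserted vectors became visible (an assumption; the notebook does not confirm the cause). Once the vectors are searchable, the ranking should follow cosine similarity, which can be checked locally. The sketch below computes the similarity of the query `[2., 2., 2.]` to vectors A and B in plain Python; `cosine_similarity` is a hypothetical helper, not part of the Pinecone client.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between a and b: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


vectors = {"A": [1.0, 1.0, 1.0], "B": [1.0, 2.0, 3.0]}
query = [2.0, 2.0, 2.0]

scores = {key: cosine_similarity(query, vec) for key, vec in vectors.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)  # ['A', 'B']
print({key: round(score, 4) for key, score in scores.items()})
```

A points in the same direction as the query, so its similarity is 1 up to floating-point rounding, while B scores roughly 0.926. Cosine similarity ignores vector length, which is why `[1, 1, 1]` and `[2, 2, 2]` count as a perfect match.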

In [11]:
pc.delete_index(index_name)