
# Pinecone: Quickstart — Dump & Retrieve with Open‑Source Embeddings

This notebook shows how to:
- Initialize the Pinecone client (serverless)
- Create an index
- Embed sample data with a free, open-source model (`sentence-transformers/all-MiniLM-L6-v2`)
- Upsert vectors into Pinecone
- Run retrieval (vector similarity search)

> **Note:** You need a Pinecone API key. Create an account by signing in with an Outlook or Google account at https://login.pinecone.io/
> This notebook uses Pinecone **serverless** (no pods to manage).


Detailed tutorial and docs: https://docs.pinecone.io/guides/get-started/quickstart


## Prerequisites

- Python 3.9+
- A Pinecone API key (set as `PINECONE_API_KEY` in your environment)
- Internet access (to install packages and call Pinecone)


In [1]:

# Install dependencies. Uncomment the next line first if you also want to upgrade pip.
# !pip install --upgrade pip
!python -m pip install pinecone sentence-transformers langchain-pinecone langchain langchain-huggingface --quiet


In [2]:

import os
import time
from dataclasses import dataclass
from typing import List, Dict, Any
from pinecone import Pinecone, ServerlessSpec

from langchain_huggingface import HuggingFaceEmbeddings
from langchain_pinecone import PineconeVectorStore




## 1) Configure credentials and client
Make sure your API key is available as an environment variable:


In [None]:
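# Skip this cell if PINECONE_API_KEY is already set in your environment (for example via an env file).
# Otherwise replace the placeholder with your own key, and avoid committing a real key.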
import os
os.environ['PINECONE_API_KEY'] = "<PINECONE_API_KEY>"

In [4]:

PINECONE_API_KEY = os.getenv("PINECONE_API_KEY")
if not PINECONE_API_KEY:
    raise RuntimeError("PINECONE_API_KEY not found. Please set it in your environment.")
else:
    print("PINECONE API KEY found")

pc = Pinecone(api_key=PINECONE_API_KEY)

PINECONE API KEY found



## 2) Embedding Model

In [5]:
# You can swap model_name to 'sentence-transformers/all-MiniLM-L12-v2',
# 'BAAI/bge-small-en-v1.5' etc. If you do, the dimension will be re-detected below.
MODEL_NAME = "sentence-transformers/all-MiniLM-L6-v2"
embeddings = HuggingFaceEmbeddings(
    model_name=MODEL_NAME,
    model_kwargs={"device": "cpu"},                 # change to "cuda" if you have a GPU
    encode_kwargs={"normalize_embeddings": True},   # cosine works well with normalized vectors
)

# Dynamically detect the embedding dimension
test_dim = len(embeddings.embed_query("dimension probe"))
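
The `normalize_embeddings` setting matters because, for unit-length vectors, cosine similarity reduces to a plain dot product. The following is a minimal standard-library sketch of that relationship using made-up 3-dimensional vectors rather than real embeddings; the helper names `l2_normalize` and `cosine_similarity` are illustrative and are not part of LangChain or Pinecone.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (the effect requested by normalize_embeddings=True)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]

def cosine_similarity(a, b):
    """Cosine similarity computed from scratch, without assuming unit length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors (assumptions for illustration only)
a = [3.0, 4.0, 0.0]
b = [1.0, 2.0, 2.0]

a_unit, b_unit = l2_normalize(a), l2_normalize(b)
dot_of_units = sum(x * y for x, y in zip(a_unit, b_unit))

# After normalization, the dot product matches the cosine similarity.
print(math.isclose(dot_of_units, cosine_similarity(a, b)))
```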


## 3) Create (or reuse) a serverless index
We will create a small `cosine` index sized for 384‑dimensional vectors (the embedding size of `all-MiniLM-L6-v2`).  
Change the name if you want to keep multiple test indexes.


In [6]:

# You can change the name. A trial account may be limited in how many indexes it can hold;
# if creation fails, log in to Pinecone and delete an unused index.
INDEX_NAME = "tredenceb3"
METRIC = "cosine"

# Create the index if it doesn't exist
existing = [idx["name"] for idx in pc.list_indexes()]
if INDEX_NAME not in existing:
    print(f"Creating index '{INDEX_NAME}' ...")
    pc.create_index(
        name=INDEX_NAME,
        dimension=test_dim,
        metric=METRIC,
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    )
    # optional: wait a moment for the index to be ready
    time.sleep(5)
else:
    print(f"Index '{INDEX_NAME}' already exists, reusing it.")

index = pc.Index(INDEX_NAME)
print(index.describe_index_stats())

Index 'tredenceb3' already exists, reusing it.
{'dimension': 384,
 'index_fullness': 0.0,
 'metric': 'cosine',
 'namespaces': {'': {'vector_count': 5}},
 'total_vector_count': 5,
 'vector_type': 'dense'}


In [7]:
texts = [
    "LangChain helps developers build LLM applications with composable tools and chains.",
    "FastAPI is a modern, high-performance web framework for building APIs with Python.",
    "Vector databases store high-dimensional vectors and enable efficient similarity search.",
    "Pinecone is a fully managed vector database service with serverless indexes.",
    "Transformers use self-attention to capture long-range dependencies in sequences.",
]
metadatas = [
    {"source": "docs",  "topic": "LLM apps"},
    {"source": "docs",  "topic": "APIs"},
    {"source": "notes", "topic": "vector db"},
    {"source": "notes", "topic": "pinecone"},
    {"source": "wiki",  "topic": "transformers"},
]
ids = [f"doc-{i+1}" for i in range(len(texts))]  # Optional: control your own IDs




## 4) Upsert vectors into Pinecone
Each record carries an `id`, `values` (the vector), and optional `metadata`. LangChain's `add_texts` embeds the texts and builds these records for us.


In [8]:
vectorstore = PineconeVectorStore(
    index_name=INDEX_NAME,
    embedding=embeddings,
    namespace=None,           # set a namespace string if you want to isolate data
    pinecone_api_key=PINECONE_API_KEY,  # optional; will default to env var
)

# Upsert (add) texts into Pinecone via LangChain:
vectorstore.add_texts(texts=texts, metadatas=metadatas, ids=ids)
print("Upsert complete.")

Upsert complete.
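
For reference, the record shape described in this section (`id`, `values`, optional `metadata`) can also be assembled by hand. The sketch below uses short made-up vectors instead of real 384-dimensional embeddings, and `build_records` is a hypothetical helper, not a library function. A list in this form is what you would hand to the Pinecone client's `index.upsert(vectors=...)` if you bypassed LangChain.

```python
def build_records(ids, vectors, metadatas=None):
    """Zip ids, vectors, and optional metadata into upsert-style record dicts."""
    if metadatas is None:
        metadatas = [{} for _ in ids]
    if not (len(ids) == len(vectors) == len(metadatas)):
        raise ValueError("ids, vectors, and metadatas must have the same length")
    records = []
    for id_, vec, meta in zip(ids, vectors, metadatas):
        record = {"id": id_, "values": list(vec)}
        if meta:  # metadata is optional, so omit the key when it is empty
            record["metadata"] = dict(meta)
        records.append(record)
    return records

# Toy inputs (assumptions for illustration only)
demo_records = build_records(
    ids=["doc-1", "doc-2"],
    vectors=[[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
    metadatas=[{"source": "docs", "topic": "LLM apps"}, {}],
)
print(demo_records[0])
```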



## 5) Retrieval (semantic search)
We will embed a query and search for the top‑k nearest neighbors by cosine similarity.


In [9]:
query = "How do I build applications with large language models?"
k = 3

# Get documents and scores:
docs_and_scores = vectorstore.similarity_search_with_score(query, k=k)

print(f"\nTop {k} results for query: {query!r}\n")
for rank, (doc, score) in enumerate(docs_and_scores, start=1):
    # 'doc' is a LangChain Document with .page_content and .metadata
    print(f"[{rank}] score={score:.4f}")
    print("   text   :", doc.page_content)
    print("   meta   :", doc.metadata)
    print()


Top 3 results for query: 'How do I build applications with large language models?'

[1] score=0.4255
   text   : LangChain helps developers build LLM applications with composable tools and chains.
   meta   : {'source': 'docs', 'topic': 'LLM apps'}

[2] score=0.3577
   text   : FastAPI is a modern, high-performance web framework for building APIs with Python.
   meta   : {'source': 'docs', 'topic': 'APIs'}

[3] score=0.1905
   text   : Transformers use self-attention to capture long-range dependencies in sequences.
   meta   : {'source': 'wiki', 'topic': 'transformers'}
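
Conceptually, the search in this section scores stored vectors against the query vector and keeps the `k` best. The brute-force version below is only meant to build intuition; a vector database is designed to avoid scanning every vector. The 2-dimensional vectors and the names `cosine` and `top_k` are assumptions for illustration.

```python
import heapq
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, stored, k):
    """Return the k (id, score) pairs with the highest cosine similarity, best first."""
    scored = ((id_, cosine(query_vec, vec)) for id_, vec in stored.items())
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])

# Toy "index": id -> vector
toy_store = {
    "doc-a": [1.0, 0.0],
    "doc-b": [0.0, 1.0],
    "doc-c": [1.0, 1.0],
}

toy_results = top_k([1.0, 0.2], toy_store, k=2)
for rank, (doc_id, score) in enumerate(toy_results, start=1):
    print(f"[{rank}] {doc_id} score={score:.4f}")
```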




## 6) (Optional) Clean up
Uncomment to delete the index when done.


In [None]:

# pc.delete_index(INDEX_NAME)
# print(f"Deleted index: {INDEX_NAME}")



---

### Troubleshooting Tips

- **Auth**: Ensure `PINECONE_API_KEY` is set (and valid).
- **Region**: If your account is set to a specific region/cloud, adjust `ServerlessSpec(cloud, region)` accordingly.
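- **Model**: If you prefer another open-source embedding model (e.g., `all-MiniLM-L12-v2`, `bge-small-en`), just swap `MODEL_NAME`; the dimension is re-detected into `test_dim`. An existing index keeps the dimension it was created with, so use a new `INDEX_NAME` (or delete and recreate the index) if the new model's dimension differs.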
- **Throughput**: For larger data, batch your upserts and consider concurrency with backoff.
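
The throughput tip can be sketched as two small helpers: one that splits records into batches and one that retries a call with exponential backoff and jitter. Both `batched` and `with_backoff` are hypothetical helpers written for this notebook, and `flaky_upsert` is a stand-in for a real upsert call so the example runs offline. In real code you would likely catch only the transient error types raised by your client rather than every `Exception`.

```python
import random
import time

def batched(items, batch_size):
    """Yield consecutive slices of items, each with at most batch_size elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

def with_backoff(fn, max_attempts=5, base_delay_s=0.5):
    """Call fn(); on failure, sleep with exponential backoff plus jitter and retry."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:  # broad on purpose for the sketch; narrow this in real code
            if attempt == max_attempts:
                raise
            delay = base_delay_s * 2 ** (attempt - 1)
            time.sleep(delay + random.uniform(0, delay / 2))

# Demo: a fake upsert that fails once, then succeeds (assumption for illustration).
sent_batches = []
failures_left = {"n": 1}

def flaky_upsert(batch):
    if failures_left["n"] > 0:
        failures_left["n"] -= 1
        raise ConnectionError("simulated transient failure")
    sent_batches.append(list(batch))

demo_items = [f"doc-{i}" for i in range(1, 8)]
for batch in batched(demo_items, batch_size=3):
    with_backoff(lambda b=batch: flaky_upsert(b), base_delay_s=0.01)

print([len(b) for b in sent_batches])  # → [3, 3, 1]
```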
