# Pinecone Rerank

> This notebook shows how to use **PineconeRerank** for two-stage retrieval: a first-stage retriever fetches candidate documents, and Pinecone's hosted reranking API then reorders them by relevance to the query. The integration is demonstrated in `langchain_pinecone/libs/pinecone/rerank.py`.

## Setup
Install the `langchain-pinecone` package.

In [None]:
%pip install -qU "langchain-pinecone"

## Credentials
Set your Pinecone API key to use the reranking API.

In [None]:
import os
from getpass import getpass

os.environ["PINECONE_API_KEY"] = os.getenv("PINECONE_API_KEY") or getpass(
    "Enter your Pinecone API key: "
)

## Instantiation
Use `PineconeRerank` to rerank a list of documents by relevance to a query.

In [3]:
from langchain_core.documents import Document
from langchain_pinecone import PineconeRerank

# Initialize reranker
reranker = PineconeRerank(model="bge-reranker-v2-m3")

# Sample documents
documents = [
    Document(page_content="Paris is the capital of France."),
    Document(page_content="Berlin is the capital of Germany."),
    Document(page_content="The Eiffel Tower is in Paris."),
]

# Rerank documents
query = "What is the capital of France?"
reranked_docs = reranker.compress_documents(documents, query)

# Print results
for doc in reranked_docs:
    score = doc.metadata.get("relevance_score")
    print(f"Score: {score:.4f} | Content: {doc.page_content}")

Score: 0.9998 | Content: Paris is the capital of France.
Score: 0.1950 | Content: The Eiffel Tower is in Paris.
Score: 0.0042 | Content: Berlin is the capital of Germany.


## Usage
### Reranking with Top-N
Specify `top_n` to limit the number of returned documents.

In [4]:
# Return only top-1 result
reranker_top1 = PineconeRerank(model="bge-reranker-v2-m3", top_n=1)
top1_docs = reranker_top1.compress_documents(documents, query)
print("Top-1 Result:")
for doc in top1_docs:
    print(f"Score: {doc.metadata['relevance_score']:.4f} | Content: {doc.page_content}")

Top-1 Result:
Score: 0.9998 | Content: Paris is the capital of France.
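Conceptually, `top_n` keeps only the highest-scoring documents after reranking. The sketch below reproduces that idea in plain Python using the scores printed earlier. The dictionary layout and the `take_top_n` helper are hypothetical and only for illustration; the hosted API performs the actual ranking, and this is not the library's implementation.

```python
# Scores copied from the example output above; the dict layout is illustrative only.
scored = [
    {"content": "Paris is the capital of France.", "relevance_score": 0.9998},
    {"content": "Berlin is the capital of Germany.", "relevance_score": 0.0042},
    {"content": "The Eiffel Tower is in Paris.", "relevance_score": 0.1950},
]


def take_top_n(items, n):
    """Hypothetical helper: sort by score, highest first, and keep n items."""
    ranked = sorted(items, key=lambda item: item["relevance_score"], reverse=True)
    return ranked[:n]


top1 = take_top_n(scored, 1)
top2 = take_top_n(scored, 2)
print([item["content"] for item in top1])
```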


### Reranking with Custom Rank Fields
If your documents are dictionaries or have custom fields, use `rank_fields` to specify the field to rank on.

In [5]:
# Sample dictionary documents with 'text' field
docs_dict = [
    {
        "id": "doc1",
        "text": "Article about renewable energy.",
        "title": "Renewable Energy",
    },
    {"id": "doc2", "text": "Report on economic growth.", "title": "Economic Growth"},
    {
        "id": "doc3",
        "text": "News on climate policy changes.",
        "title": "Climate Policy",
    },
]

# Initialize reranker with rank_fields
reranker_text = PineconeRerank(model="bge-reranker-v2-m3", rank_fields=["text"])
climate_docs = reranker_text.rerank(docs_dict, "Latest news on climate change.")

# Show IDs and scores
for res in climate_docs:
    print(f"ID: {res['id']} | Score: {res['score']:.4f}")

ID: doc3 | Score: 0.9892
ID: doc1 | Score: 0.0006
ID: doc2 | Score: 0.0000


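The same reranker can be reused with a new query. It still ranks on the `text` field, but each result also includes the original document, so other fields such as `title` can be printed alongside the score.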

In [6]:
economic_docs = reranker_text.rerank(docs_dict, "Economic forecast.")

# Show IDs and scores
for res in economic_docs:
    print(
        f"ID: {res['id']} | Score: {res['score']:.4f} | Title: {res['document']['title']}"
    )

ID: doc2 | Score: 0.8918 | Title: Economic Growth
ID: doc3 | Score: 0.0002 | Title: Climate Policy
ID: doc1 | Score: 0.0000 | Title: Renewable Energy
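Because each result carries a numeric `score`, a simple cutoff can drop clearly irrelevant documents before they reach later steps. The following is a minimal sketch: the result shape (`id`, `score`, `document`) mirrors the output printed above, the values are copied from that output rather than fetched from the API, and `MIN_SCORE` and `keep_relevant` are assumed names introduced here.

```python
# Values copied from the printed output above; no API call is made here.
results = [
    {"id": "doc2", "score": 0.8918, "document": {"title": "Economic Growth"}},
    {"id": "doc3", "score": 0.0002, "document": {"title": "Climate Policy"}},
    {"id": "doc1", "score": 0.0000, "document": {"title": "Renewable Energy"}},
]

MIN_SCORE = 0.5  # assumed cutoff; tune it for your own data


def keep_relevant(items, min_score):
    """Hypothetical helper: keep results whose score meets the cutoff."""
    return [item for item in items if item["score"] >= min_score]


relevant = keep_relevant(results, MIN_SCORE)
relevant_titles = [item["document"]["title"] for item in relevant]
print(relevant_titles)
```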


### Reranking with Additional Parameters
You can pass model-specific parameters (e.g., `truncate`) directly to `.rerank()`.

The `truncate` parameter controls how inputs longer than the model's token limit are handled. Accepted values are `END` and `NONE`: `END` truncates the input sequence at the input token limit, while `NONE` returns an error when the input exceeds that limit.

In [7]:
# Rerank with custom truncate parameter
docs_simple = [
    {"id": "docA", "text": "Quantum entanglement is a physical phenomenon..."},
    {"id": "docB", "text": "Classical mechanics describes motion..."},
]

reranked = reranker.rerank(
    documents=docs_simple,
    query="Explain the concept of quantum entanglement.",
    truncate="END",
)
# Print reranked IDs and scores
for res in reranked:
    print(f"ID: {res['id']} | Score: {res['score']:.4f}")

ID: docA | Score: 0.6950
ID: docB | Score: 0.0001
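To make the difference between `END` and `NONE` concrete, here is a toy model of the two behaviours. It is a sketch under loud assumptions: it splits on whitespace instead of using the model's real tokenizer, the limit of 3 tokens is arbitrary, and `apply_truncate` is a hypothetical helper, not part of `langchain-pinecone` or the Pinecone API.

```python
def apply_truncate(text, max_tokens, truncate="END"):
    """Toy model of the truncate option using whitespace tokens (an assumption)."""
    tokens = text.split()
    if len(tokens) <= max_tokens:
        return text  # within the limit: nothing to do for either mode
    if truncate == "END":
        # Keep the first max_tokens tokens and drop the rest.
        return " ".join(tokens[:max_tokens])
    if truncate == "NONE":
        # Refuse over-long input instead of silently shortening it.
        raise ValueError(f"input has {len(tokens)} tokens, limit is {max_tokens}")
    raise ValueError(f"unsupported truncate value: {truncate!r}")


shortened = apply_truncate("quantum entanglement links particle states", 3)
print(shortened)
```

`END` is the forgiving choice for long passages, while `NONE` is useful when silently dropping text would hide a problem in your chunking step.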


## Use within a chain


Create a retriever from a vector store.

In [8]:
# First setup a vector store
from pinecone import Pinecone, ServerlessSpec
from langchain_openai import OpenAIEmbeddings
from langchain_pinecone import PineconeVectorStore
import os

pinecone_client = Pinecone(api_key=os.environ.get("PINECONE_API_KEY"))
index_name = "langchain-test-index"  # change if desired

if not pinecone_client.has_index(index_name):
    pinecone_client.create_index(
        name=index_name,
        dimension=1536,
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    )

# Pinecone index
pinecone_index = pinecone_client.Index(index_name)


# Embeddings
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Vector store
vector_store = PineconeVectorStore(index=pinecone_index, embedding=embeddings)

### Manage vector store

Once you have created your vector store, you can interact with it by adding and deleting items.

#### Add items to vector store

We can add items to our vector store by using the `add_documents` function.

In [9]:
from uuid import uuid4

from langchain_core.documents import Document

document_1 = Document(
    page_content="I had chocolate chip pancakes and scrambled eggs for breakfast this morning.",
    metadata={"source": "tweet"},
)

document_2 = Document(
    page_content="The weather forecast for tomorrow is cloudy and overcast, with a high of 62 degrees.",
    metadata={"source": "news"},
)

document_3 = Document(
    page_content="Building an exciting new project with LangChain - come check it out!",
    metadata={"source": "tweet"},
)

document_4 = Document(
    page_content="Robbers broke into the city bank and stole $1 million in cash.",
    metadata={"source": "news"},
)

document_5 = Document(
    page_content="Wow! That was an amazing movie. I can't wait to see it again.",
    metadata={"source": "tweet"},
)

document_6 = Document(
    page_content="Is the new iPhone worth the price? Read this review to find out.",
    metadata={"source": "website"},
)

document_7 = Document(
    page_content="The top 10 soccer players in the world right now.",
    metadata={"source": "website"},
)

document_8 = Document(
    page_content="LangGraph is the best framework for building stateful, agentic applications!",
    metadata={"source": "tweet"},
)

document_9 = Document(
    page_content="The stock market is down 500 points today due to fears of a recession.",
    metadata={"source": "news"},
)

document_10 = Document(
    page_content="I have a bad feeling I am going to get deleted :(",
    metadata={"source": "tweet"},
)

documents = [
    document_1,
    document_2,
    document_3,
    document_4,
    document_5,
    document_6,
    document_7,
    document_8,
    document_9,
    document_10,
]
uuids = [str(uuid4()) for _ in range(len(documents))]
vector_store.add_documents(documents=documents, ids=uuids)


['0ffee2f7-7313-4a5c-a4b4-f1be7bcdfe58',
 '1e6e5965-d7fd-498d-b9d2-b6e3a30a25bc',
 'a11e9373-b6aa-447c-95fd-936574eb6e29',
 '40115da7-56ca-4394-b3f0-2c5f195cd321',
 'cec08688-05cc-423e-a8dc-469126c9fa6b',
 '73aaf41e-4d9e-4ea3-9c3e-892e1fc100ac',
 '0a55243c-653c-4325-b972-4a80874608a9',
 'a929a129-4ee6-4134-8438-a09aa3336c57',
 'caf87d2a-6fb6-47bf-be9e-c11d001ddf92',
 '0e37f7dd-317d-4a3e-bd29-dc404f380ba1']
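Note that `uuid4()` produces fresh random IDs on every run, so re-running the cell above adds the same ten documents again under new IDs. One possible alternative, sketched below, derives a deterministic ID from the document's content with `uuid5`, so a repeated upsert targets the same record. The `stable_id` helper and the choice of `NAMESPACE_URL` plus a `source:content` key are assumptions made for this example, not something the notebook or the library prescribes.

```python
from uuid import NAMESPACE_URL, uuid5


def stable_id(page_content, source):
    """Hypothetical helper: the same content and source always give the same ID."""
    return str(uuid5(NAMESPACE_URL, f"{source}:{page_content}"))


first = stable_id("The top 10 soccer players in the world right now.", "website")
again = stable_id("The top 10 soccer players in the world right now.", "website")
other = stable_id("I have a bad feeling I am going to get deleted :(", "tweet")
print(first == again, first == other)
```

With such a helper, the ID list could be built as `[stable_id(d.page_content, d.metadata["source"]) for d in documents]` before calling `add_documents`.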

### Turning into retriever
You can also transform the vector store into a retriever for easier usage in your chains.

In [10]:
retriever = vector_store.as_retriever(
    search_type="similarity_score_threshold",
    search_kwargs={"k": 1, "score_threshold": 0.4},
)

### Combining into chain

In [11]:
from langchain.retrievers.contextual_compression import ContextualCompressionRetriever
from langchain_pinecone import PineconeRerank 
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(temperature=0)
compressor = PineconeRerank(model="bge-reranker-v2-m3")

compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor, base_retriever=retriever
)

compressed_docs = compression_retriever.invoke(
    "What is the weather tomorrow?"
)

for doc in compressed_docs:
    print(f"Document text: {doc.page_content}")
    # Single quotes inside the f-string keep this valid on Python versions before 3.12
    print(f"Relevance score: {doc.metadata.get('relevance_score')}")
    print(f"Source: {doc.metadata.get('source')}")


Document text: The weather forecast for tomorrow is cloudy and overcast, with a high of 62 degrees.
Relevance score: 0.9839091
Source: news
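A reranker can only reorder the candidates that the base retriever hands it, so the first stage usually fetches more documents than the final answer needs. With `"k": 1` in the retriever above, the reranker receives at most one candidate. The sketch below illustrates the two-stage pattern offline with toy word-overlap scorers; `first_stage`, `second_stage`, and both scoring functions are stand-ins invented for this example and do not reflect how Pinecone or the embedding model score documents.

```python
import re

corpus = [
    "The weather forecast for tomorrow is cloudy and overcast, with a high of 62 degrees.",
    "The stock market is down 500 points today due to fears of a recession.",
    "The top 10 soccer players in the world right now.",
    "Is the new iPhone worth the price? Read this review to find out.",
]


def tokens(text):
    """Lowercased set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def first_stage(query, docs, k):
    """Cheap recall step (stand-in for vector search): count shared words, keep k."""
    q = tokens(query)
    return sorted(docs, key=lambda d: len(q & tokens(d)), reverse=True)[:k]


def second_stage(query, docs, top_n):
    """Stand-in reranker: Jaccard overlap between query and document, keep top_n."""
    q = tokens(query)

    def jaccard(doc):
        t = tokens(doc)
        return len(q & t) / len(q | t)

    return sorted(docs, key=jaccard, reverse=True)[:top_n]


query = "What is the weather tomorrow?"
candidates = first_stage(query, corpus, k=3)  # wide net
best = second_stage(query, candidates, top_n=1)  # precise pick
print(best[0])
```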


### Using this retriever within a QA pipeline

In [16]:
from langchain import hub
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

# See full prompt at https://smith.langchain.com/hub/rlm/rag-prompt
prompt = hub.pull("rlm/rag-prompt")


def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


qa_chain = (
    {
        "context": compression_retriever | format_docs,
        "question": RunnablePassthrough(),
    }
    | prompt
    | llm
    | StrOutputParser()
)

qa_chain.invoke("What is the weather tomorrow?")



'The weather tomorrow will be cloudy and overcast with a high of 62 degrees.'
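If the answer should be traceable to its origin, the formatting step can include each document's `source` metadata in the context string. This is a minimal sketch: `format_docs_with_source` is a hypothetical variant of `format_docs`, and `SimpleNamespace` objects stand in for `Document` so the example runs without any LangChain packages installed.

```python
from types import SimpleNamespace


def format_docs_with_source(docs):
    """Hypothetical variant of format_docs that prefixes each text with its source."""
    parts = []
    for doc in docs:
        source = doc.metadata.get("source", "unknown")
        parts.append(f"[source: {source}]\n{doc.page_content}")
    return "\n\n".join(parts)


# Stand-ins for Document objects: only page_content and metadata are used.
sample_docs = [
    SimpleNamespace(page_content="Cloudy with a high of 62 degrees.", metadata={"source": "news"}),
    SimpleNamespace(page_content="Come check out my project!", metadata={}),
]

context = format_docs_with_source(sample_docs)
print(context)
```

Swapping this function in for `format_docs` in the chain above would leave the rest of the pipeline unchanged, since it also maps a list of documents to a single string.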

## API reference
- `PineconeRerank(model, top_n, rank_fields, return_documents)`
- `.rerank(documents, query, rank_fields=None, model=None, top_n=None, truncate="END")`
- `.compress_documents(documents, query)` (returns `Document` objects with `relevance_score` in metadata)

## Related
- Retriever [conceptual guide](https://python.langchain.com/docs/concepts/retrievers/)
- Retriever [how-to guides](https://python.langchain.com/docs/how_to/#retrievers)