<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/node_postprocessor/ColbertRerank.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Colbert Rerank

If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.

In [None]:
!pip install llama-index
!pip install llama-index-core
!pip install --quiet transformers torch
!pip install llama-index-embeddings-openai
!pip install llama-index-llms-openai
!pip install llama-index-postprocessor-colbert-rerank

In [None]:
from llama_index.core import (
    VectorStoreIndex,
    SimpleDirectoryReader,
)
from llama_index.postprocessor.colbert_rerank import ColbertRerank

Download Data

In [None]:
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'

In [None]:
import os

os.environ["OPENAI_API_KEY"] = "sk-"

In [None]:
# load documents
documents = SimpleDirectoryReader("./data/paul_graham/").load_data()

# build index
index = VectorStoreIndex.from_documents(documents=documents)

#### Retrieve top 10 most relevant nodes, then filter with Colbert Rerank

In [None]:
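# keep the 5 best nodes after reranking
colbert_reranker = ColbertRerank(
    top_n=5, model="colbert-ir/colbertv2.0", tokenizer="colbert-ir/colbertv2.0"
)

# retrieve 10 candidates by embedding similarity, then let the reranker filter them
query_engine = index.as_query_engine(
    similarity_top_k=10,
    node_postprocessors=[colbert_reranker],
)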
response = query_engine.query(
    "What did Sam Altman do in this essay?",
)

In [None]:
for node in response.source_nodes:
    print(node.id_)
    print(node.node.get_content()[:100])
    print(node.score)

df822f69-ca67-4525-9d22-6b636f0da813
When I was dealing with some urgent problem during YC, there was about a 60% chance it had to do wit
0.6470144987106323
d5ef44ce-1ead-4bf5-a283-afcfe19c2574
Now that I could write essays again, I wrote a bunch about topics I'd had stacked up. I kept writing
0.6377773284912109
8adbfafc-8fda-4193-bb59-7db4bffe877c
Much to my surprise, the time I spent working on this stuff was not wasted after all. After we start
0.6206888556480408
9f17995f-a462-467a-833c-11f6162a218f
[15] We got 225 applications for the Summer Founders Program, and we were surprised to find that a l
0.6143158674240112
ab7b0041-88aa-4513-8309-7f3af43badb2
Up till that point YC had been controlled by the original LLC we four had started. But we wanted YC 
0.5917402505874634
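
To give some intuition for the scores printed above, here is a minimal, pure-Python sketch of ColBERT-style late-interaction ("MaxSim") scoring followed by a top-n cut. It is not the `ColbertRerank` implementation: the helper names (`normalize`, `maxsim_score`, `rerank`), the toy 2-d "token embeddings", and the choice to average the per-token maxima over the query tokens are all assumptions made for illustration.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def maxsim_score(query_vecs, doc_vecs):
    """For each query token, take its best-matching document token,
    then average those maxima over the query tokens (assumed normalization)."""
    q = [normalize(v) for v in query_vecs]
    d = [normalize(v) for v in doc_vecs]
    best = [max(sum(a * b for a, b in zip(qv, dv)) for dv in d) for qv in q]
    return sum(best) / len(best)


def rerank(query_vecs, candidates, top_n):
    """Score every candidate and keep the top_n highest-scoring ones."""
    scored = [
        (doc_id, maxsim_score(query_vecs, vecs))
        for doc_id, vecs in candidates.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical token embeddings, one short list of vectors per text.
query = [[1.0, 0.0], [0.0, 1.0]]
candidates = {
    "doc_a": [[1.0, 0.0], [0.0, 1.0]],  # matches both query tokens
    "doc_b": [[1.0, 0.0], [1.0, 0.0]],  # matches only the first query token
    "doc_c": [[-1.0, 0.0]],             # points away from the first query token
}

top = rerank(query, candidates, top_n=2)
for doc_id, score in top:
    print(doc_id, score)
```

In this toy setup `doc_a` ranks first and `doc_c` is dropped, which mirrors the retrieve-many, keep-few pattern used in the query engine above.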
