To mitigate the lost-in-the-middle effect, where a model tends to overlook information placed in the middle of a long context, we can reorder documents after retrieval so that the most relevant ones sit at the beginning and end of the list and the least relevant ones sit in the middle.
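
As a rough illustration of the idea, the reordering can be sketched in plain Python. This is a minimal sketch under the assumption that the input list is already sorted from most to least relevant; the function name `reorder_for_long_context` is hypothetical and is not part of any library.

```python
def reorder_for_long_context(ranked):
    """Place the most relevant items at both ends and the least relevant in the middle.

    `ranked` is assumed to be ordered from most to least relevant.
    """
    result = []
    # Walk from least to most relevant, alternately pushing items to the
    # front and to the back, so the strongest items end up at the extremes.
    for i, item in enumerate(reversed(ranked)):
        if i % 2 == 1:
            result.append(item)
        else:
            result.insert(0, item)
    return result


ranks = list(range(10))  # 0 = most relevant, 9 = least relevant
print(reorder_for_long_context(ranks))
# [1, 3, 5, 7, 9, 8, 6, 4, 2, 0]
```

The two best-ranked items (0 and 1) land at the end and the start, while the worst-ranked items (8 and 9) meet in the middle.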

In [2]:
from langchain_chroma import Chroma
from langchain_huggingface import HuggingFaceEmbeddings
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

In [3]:
texts = [
    "Basquetball is a great sport.",
    "Fly me to the moon is one of my favourite songs.",
    "The Celtics are my favourite team.",
    "This is a document about the Boston Celtics",
    "I simply love going to the movies",
    "The Boston Celtics won the game by 20 points",
    "This is a random text",
    "Elden Ring is one of the best games in the last 15 years",
    "L.Rornet is one of the best Celtics players",
    "Larry Bird was a iconic NBA player"
]

In [4]:
# Create the retriever; k=10 returns all ten texts, ranked by similarity to the query
retriever = Chroma.from_texts(texts, embedding=embeddings).as_retriever(search_kwargs={"k": 10})

In [5]:
query = "What can you tell me about Celtics?"

In [6]:
docs = retriever.invoke(query)
docs

[Document(metadata={}, page_content='This is a document about the Boston Celtics'),
 Document(metadata={}, page_content='The Celtics are my favourite team.'),
 Document(metadata={}, page_content='L.Rornet is one of the best Celtics players'),
 Document(metadata={}, page_content='The Boston Celtics won the game by 20 points'),
 Document(metadata={}, page_content='Larry Bird was a iconic NBA player'),
 Document(metadata={}, page_content='Elden Ring is one of the best games in the last 15 years'),
 Document(metadata={}, page_content='Basquetball is a great sport.'),
 Document(metadata={}, page_content='I simply love going to the movies'),
 Document(metadata={}, page_content='Fly me to the moon is one of my favourite songs.'),
 Document(metadata={}, page_content='This is a random text')]

## Using LongContextReorder


In [7]:
from langchain_community.document_transformers import LongContextReorder

In [8]:
# Reorder the documents:
# less relevant documents end up in the middle of the list and
# more relevant documents at the beginning and end.
reordering = LongContextReorder()
reordered_docs = reordering.transform_documents(docs)
reordered_docs

[Document(metadata={}, page_content='The Celtics are my favourite team.'),
 Document(metadata={}, page_content='The Boston Celtics won the game by 20 points'),
 Document(metadata={}, page_content='Elden Ring is one of the best games in the last 15 years'),
 Document(metadata={}, page_content='I simply love going to the movies'),
 Document(metadata={}, page_content='This is a random text'),
 Document(metadata={}, page_content='Fly me to the moon is one of my favourite songs.'),
 Document(metadata={}, page_content='Basquetball is a great sport.'),
 Document(metadata={}, page_content='Larry Bird was a iconic NBA player'),
 Document(metadata={}, page_content='L.Rornet is one of the best Celtics players'),
 Document(metadata={}, page_content='This is a document about the Boston Celtics')]
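
To make the pattern in the output above easier to see, the following sketch maps each reordered entry back to its rank in the retrieved list. It is only an illustration: `original_ranks` is a hypothetical helper, and the two lists are copied by hand from the outputs above (in the notebook they could instead be built with `[d.page_content for d in docs]` and `[d.page_content for d in reordered_docs]`).

```python
def original_ranks(retrieved, reordered):
    """Return, for each reordered text, its 0-based rank in the retrieved list."""
    rank_of = {text: rank for rank, text in enumerate(retrieved)}
    return [rank_of[text] for text in reordered]


# Page contents in retrieval order (most relevant first), copied from the output of docs.
retrieved = [
    "This is a document about the Boston Celtics",
    "The Celtics are my favourite team.",
    "L.Rornet is one of the best Celtics players",
    "The Boston Celtics won the game by 20 points",
    "Larry Bird was a iconic NBA player",
    "Elden Ring is one of the best games in the last 15 years",
    "Basquetball is a great sport.",
    "I simply love going to the movies",
    "Fly me to the moon is one of my favourite songs.",
    "This is a random text",
]

# Page contents in the order shown for reordered_docs.
reordered = [
    "The Celtics are my favourite team.",
    "The Boston Celtics won the game by 20 points",
    "Elden Ring is one of the best games in the last 15 years",
    "I simply love going to the movies",
    "This is a random text",
    "Fly me to the moon is one of my favourite songs.",
    "Basquetball is a great sport.",
    "Larry Bird was a iconic NBA player",
    "L.Rornet is one of the best Celtics players",
    "This is a document about the Boston Celtics",
]

print(original_ranks(retrieved, reordered))
# [1, 3, 5, 7, 9, 8, 6, 4, 2, 0]
```

The top-ranked document (rank 0) is last, the second-ranked (rank 1) is first, and the two least relevant documents (ranks 8 and 9) sit next to each other in the middle.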

In [10]:
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import PromptTemplate
from langchain_openai import OpenAI
from dotenv import load_dotenv

# Load environment variables (such as OPENAI_API_KEY) from a local .env file
load_dotenv()
llm = OpenAI()

In [13]:
prompt_template = """ 
Given the below context:
-------
{context}
--------
Please answer the question based only on the context:
{question}
"""
prompt = PromptTemplate(template=prompt_template, input_variables=["context", "question"])

In [14]:
# Build a chain that stuffs the reordered documents into {context} and queries the LLM
chain = create_stuff_documents_chain(llm=llm, prompt=prompt)
response = chain.invoke({"context": reordered_docs, "question": query})
print(response)


Based on the given context, we can tell that Celtics is a sports team, specifically a basketball team, and is the favorite team of the person speaking. The Boston Celtics, a professional basketball team, won a game by 20 points. Larry Bird, an iconic NBA player, is associated with the Celtics. L. Rornet is also mentioned as one of the best Celtics players. Additionally, the person speaking is a fan of the Celtics and mentions them frequently. It can also be inferred that the person may be from or has a connection to Boston, as they mention the city and a specific song about the city.
