<a href="https://colab.research.google.com/github/kavyajeetbora/nlp_rag/blob/master/langchain_masterclass/04_RAG.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# RAG - Retrieval-Augmented Generation

Retrieval-Augmented Generation (RAG) is an AI framework that enhances the accuracy and relevance of responses generated by large language models (LLMs). It combines two techniques:

1. Retrieval: The model first retrieves relevant information from external sources, such as databases, documents, or the web.
2. Generation: It then uses this retrieved information to inform and enhance the generation of responses.

This approach helps keep the responses accurate, relevant, and grounded in the most up-to-date and specific information available. RAG is particularly useful for building more reliable and effective AI systems across various applications.

# Setup environment

In [21]:
!pip install -q langchain langchain_community langchain-openai chromadb randomname langchain_huggingface

In [69]:
import os
import shutil
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_huggingface import HuggingFaceEmbeddings
from dotenv import load_dotenv

from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import Chroma
from langchain.text_splitter import CharacterTextSplitter, RecursiveCharacterTextSplitter, TokenTextSplitter
import randomname
from glob import glob
from langchain.prompts import ChatPromptTemplate
from langchain.schema.output_parser import StrOutputParser
from langchain.schema.runnable import RunnableLambda

In [3]:
if os.path.exists(".env"):
    os.remove(".env")

from google.colab import files
uploaded = files.upload()
if uploaded:
    if load_dotenv(".env"):
        print("Uploaded and Loaded Successfully")

Saving .env to .env
Uploaded and Loaded Successfully


## 1. Load LLM Model

In [4]:
model = ChatOpenAI(model='gpt-3.5-turbo-0125')

# High level RAG Pipeline using LangChain

## Load the source text

In [5]:
!wget -q https://raw.githubusercontent.com/kavyajeetbora/nlp_rag/refs/heads/master/data/taare_zameen_par.txt -O taare_zameen_par.txt
!wget -q https://raw.githubusercontent.com/kavyajeetbora/nlp_rag/refs/heads/master/data/swades.txt -O swades.txt
!wget -q https://raw.githubusercontent.com/kavyajeetbora/nlp_rag/refs/heads/master/data/munna_bhai.txt -O munna_bhai.txt
!wget -q https://raw.githubusercontent.com/kavyajeetbora/nlp_rag/refs/heads/master/data/lagaan.txt -O lagaan.txt

In [None]:
text_file_path = "taare_zameen_par.txt"

if os.path.exists(text_file_path):
    loader = TextLoader(text_file_path)
    documents = loader.load()
    print("Source Document Loaded Successfully")
else:
    print(f"No file called {text_file_path} found")

Source Document Loaded Successfully


In [None]:
documents[0].metadata

{'source': 'taare_zameen_par.txt'}

## Split the text into Chunks

Why chunking? An LLM can only accept a limited number of tokens as input, so we need to break a large body of text into smaller pieces that fit within that limit and can be retrieved individually.
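To make the idea concrete, here is a minimal sketch of fixed-size chunking with an optional overlap, written in plain Python. The helper `chunk_text` is hypothetical and is not how `CharacterTextSplitter` works internally (that class splits on a separator and merges the pieces); it only illustrates what `chunk_size` and `chunk_overlap` mean.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int = 0) -> list[str]:
    """Cut text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghij" * 3  # 30 characters
no_overlap = chunk_text(sample, chunk_size=10)
with_overlap = chunk_text(sample, chunk_size=10, chunk_overlap=4)

print(len(no_overlap))    # 3
print(len(with_overlap))  # 5
```

A non-zero overlap produces more chunks, but it reduces the chance that a sentence relevant to a query is cut in half at a chunk boundary.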

In [None]:
text_splitter = CharacterTextSplitter(separator = " ", chunk_size=300, chunk_overlap=0)
docs = text_splitter.split_documents(documents)
len(docs)

14

In [None]:
## Here is the first chunk
docs[0]

Document(metadata={'source': 'taare_zameen_par.txt'}, page_content="Ishaan is an 8-year-old boy living in Mumbai, who has trouble following school. He is assumed by all to simply hate learning and deemed a troublemaker, and is belittled for it. He has even repeated the 3rd standard due to his academic failures from the previous year. Ishaan's imagination,")

## Load Embedding Model

Embedding models are mostly encoder-based (for example, the BERT and RoBERTa architectures).

Here is a brief overview of encoder-only models:

1. What they are

    An encoder-only model is a type of machine learning model that focuses on understanding text rather than generating new text.
2. What they do

    Encoder-only models are designed to analyze the meaning of words and sentences in a text, and produce task-specific outputs like labels or token predictions.
3. What they're good for

    They're well-suited for tasks that require understanding text, like text classification, extractive question answering, and sentiment analysis.
4. How they work

    Encoder-only models process input text in a bidirectional manner, considering the context of each word from both the left and right sides. This allows them to understand the full meaning of a text.
5. Examples

    BERT (Bidirectional Encoder Representations from Transformers) and RoBERTa are examples of encoder-only models.

Encoder-only models differ from decoder-only models, which are used for generative tasks such as free-form question answering.

In [None]:
embeddings = OpenAIEmbeddings(model='text-embedding-3-small')

## Create a Vector Database

We will store all the embeddings of the source chunks in a vector database.

We will use [`langchain_community.vectorstores.Chroma.from_documents`](https://api.python.langchain.com/en/latest/vectorstores/langchain_community.vectorstores.chroma.Chroma.html#langchain_community.vectorstores.chroma.Chroma.from_documents) to build the database directly from the chunks.

In [None]:
os.makedirs("db", exist_ok=True)

random_suffix = randomname.get_name()

persistent_directory = f"db/chroma-({random_suffix})"

## If already there, delete and create a new one
for folder in glob("db/chroma*"):
    if os.path.exists(folder):
        shutil.rmtree(folder)

os.mkdir(persistent_directory)

vector_db = Chroma.from_documents(
    documents = docs,
    collection_name = "movie_embeddings_v1",
    embedding = embeddings,
    persist_directory = persistent_directory
)

## Retrieving Relevant Chunks based on a query

In [None]:
query = "Who was Ishaan ?"

retriever = vector_db.as_retriever(
    search_type = "similarity_score_threshold",
    search_kwargs = {"k": 2, "score_threshold": 0.5}
)

relevant_docs = retriever.invoke(query)

In [None]:
for doc in relevant_docs:
    print("Source:",doc.metadata['source'])
    print(doc.page_content)
    print("-"*30)

Source: taare_zameen_par.txt
Ishaan is an 8-year-old boy living in Mumbai, who has trouble following school. He is assumed by all to simply hate learning and deemed a troublemaker, and is belittled for it. He has even repeated the 3rd standard due to his academic failures from the previous year. Ishaan's imagination,
------------------------------


## Creating Metadata for documents

While embedding the documents, it is a good idea to attach metadata to each one, such as title, filename, file size, page count, or author.

This matters at retrieval time: the metadata tells us which source a retrieved chunk came from, so we can validate the text the LLM generates against its reference.
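The sketch below shows why metadata set at the document level is still available after chunking: each chunk gets its own copy. The `Doc` dataclass and `split_with_metadata` helper are hypothetical stand-ins used only for illustration, and the `chunk_index` key is an assumption rather than something LangChain adds.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Minimal stand-in for a document: text plus a metadata dictionary."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def split_with_metadata(doc: Doc, chunk_size: int) -> list[Doc]:
    """Split a document and copy its metadata onto every chunk."""
    chunks = []
    for index, start in enumerate(range(0, len(doc.page_content), chunk_size)):
        chunk_meta = dict(doc.metadata)    # copy, so chunks do not share one dict
        chunk_meta["chunk_index"] = index  # hypothetical extra field for traceability
        chunks.append(Doc(doc.page_content[start:start + chunk_size], chunk_meta))
    return chunks


doc = Doc("x" * 25, {"source": "lagaan.txt", "title": "Lagaan"})
pieces = split_with_metadata(doc, chunk_size=10)

for piece in pieces:
    print(piece.metadata)
```

Because every chunk carries its `source`, a retrieved chunk can always be traced back to the file it came from.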

In [None]:
## creating a vector database from many documents
documents = []
for txt_file_path in glob("*.txt"):

    loader = TextLoader(txt_file_path)
    docs = loader.load()

    for doc in docs:
        doc.metadata = {"source": txt_file_path}
        documents.append(doc)

In [None]:
## Split the documents into chunks
text_splitter = CharacterTextSplitter(separator=" ", chunk_size=500, chunk_overlap=0)
chunks = text_splitter.split_documents(documents)
len(chunks)

62

In [None]:
len(chunks[0].page_content)

494

In [None]:
os.makedirs("db", exist_ok=True)

In [None]:
def create_vector_database(chunks: list,name=""):

    random_suffix = randomname.get_name()

    persistent_directory = f"db/chroma-{name}-({random_suffix})"

    ## If already there, delete and create a new one

    for folder in glob(f"db/chroma-{name}*"):
        if os.path.exists(folder):
            shutil.rmtree(folder)

    os.mkdir(persistent_directory)

    vector_db = Chroma.from_documents(
        documents = chunks,
        collection_name = "movie_embeddings_v2",
        embedding = embeddings,
        persist_directory = persistent_directory
    )

    return vector_db

Retrieve the text using a query:

In [None]:
vector_db = create_vector_database(chunks, name='first')

retriever = vector_db.as_retriever(
    search_type = "similarity_score_threshold",
    search_kwargs = {"k": 3, "score_threshold": 0.1}
)

In [None]:
query = "Explain the character of Munna"
retriever.invoke(query)

[Document(metadata={'source': 'munna_bhai.txt'}, page_content='and forgive him. Munna ends up marrying Suman after learning of her true identity as "Chinki", and together, they open a real hospital in Munna\'s family village. Circuit also gets married a year later and has a son nicknamed "Short Circuit". Asthana resigns as the dean and becomes the head doctor, employing Munna\'s methods, while Rustom succeeds him. As the film concludes, Anand, restored to normal mental health, narrates the story to a few children at the hospital as he is about to leave for'),
 Document(metadata={'source': 'munna_bhai.txt'}, page_content="Munna Bhai film series, the film follows Munna Bhai, a don in the Mumbai underworld, trying to please his father by pretending to be a doctor, but when a doctor, Asthana (Irani), exposes his lies and tarnishes his father's honor, Munna enrolls in a medical college. Chaos ensues when Munna, upon finding that Asthana is the dean of the college, vows revenge, while also spa

In [None]:
query = "A.R Rahman composed the music of which movies ?"
retriever.invoke(query)

[Document(metadata={'source': 'swades.txt'}, page_content='was composed by A. R. Rahman, with lyrics penned by Javed Akhtar.\n\nSwades was theatrically released on 17 December 2004, and it opened to rave reviews from critics, with praise for the performances of Khan, Joshi and Ballal, and the story, screenplay, and soundtrack. However, it emerged as a commercial failure at the box office.\n\nAt the 50th Filmfare Awards, Swades received 8 nominations, including Best Film, Best Director (Gowarikar) and Best Music Director (Rahman), and won Best Actor (Khan)'),
 Document(metadata={'source': 'swades.txt'}, page_content="and Best Background Score (Rahman).\n\nIt was dubbed in Tamil as Desam and released on 26 January 2005, coinciding with Indian Republic Day. Despite its commercial failure, Swades is regarded ahead of its time and is now considered a cult classic of Hindi cinema and one of the best films in Shah Rukh Khan's filmography. [10][11] The film is owned by Red Chillies Entertainment

In [None]:
query = "In which movies the director was Ashutosh Gowariker ?"
retriever.invoke(query)

[Document(metadata={'source': 'lagaan.txt'}, page_content='Lagaan: Once Upon a Time in India, or simply Lagaan, (transl.\u2009Land tax) is a 2001 Indian Hindi-language epic period musical[5] sports drama film written and directed by Ashutosh Gowariker. The film was produced by Aamir Khan, who stars alongside debutant Gracy Singh and British actors Rachel Shelley and Paul Blackthorne. Set in 1893, during the late Victorian period of British colonial rule in India, the film follows the inhabitants of a village in Central India, who, burdened by high taxes and'),
 Document(metadata={'source': 'swades.txt'}, page_content="Swades: We, the People (transl.\u2009Homeland) is a 2004 Indian Hindi-language drama film co-written, directed and produced by Ashutosh Gowariker.[3] The film stars Shah Rukh Khan, Gayatri Joshi and Kishori Ballal while Daya Shankar Pandey, Rajesh Vivek, Lekh Tandon appear in supporting roles.\n\nThe plot was based on two episodes of the series Vaapsi on Zee TV's Yule Love 

## Text Splitters

There are different strategies for splitting the whole text into small chunks, including:

- `RecursiveCharacterTextSplitter`
- `CharacterTextSplitter`
- `TokenTextSplitter`

Refer to the [docs](https://js.langchain.com/docs/concepts/text_splitters/) for more splitters.
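The idea behind recursive splitting can be sketched in plain Python: try the coarsest separator first (paragraphs), and only fall back to finer separators (lines, then words) for pieces that are still too long. The function `recursive_split` below is a simplified, hypothetical illustration; it has no overlap support and is not the `RecursiveCharacterTextSplitter` implementation.

```python
def recursive_split(text: str, chunk_size: int, separators=("\n\n", "\n", " ")) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Separators are tried from coarsest to finest; pieces that fit are
    merged back together greedily so chunks stay as large as possible.
    """
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators:
        # No separator left: fall back to hard cuts
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for part in text.split(sep):
        if len(part) > chunk_size:
            # Too long even on its own: flush what we have, then recurse
            if current:
                chunks.append(current)
                current = ""
            chunks.extend(recursive_split(part, chunk_size, finer))
            continue
        candidate = part if not current else current + sep + part
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            chunks.append(current)
            current = part
    if current:
        chunks.append(current)
    return chunks


text = "Para one is short.\n\nPara two is a bit longer than the first one.\n\nEnd."
chunks = recursive_split(text, chunk_size=30)
for chunk in chunks:
    print(repr(chunk))
# 'Para one is short.'
# 'Para two is a bit longer than'
# 'the first one.'
# 'End.'
```

The short paragraphs survive intact, and only the long paragraph is broken at word boundaries. This tends to keep related sentences together better than cutting on a single fixed separator.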

In [None]:
## Create the text splitters

char_text_splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=0, separator=".")
token_text_splitter = TokenTextSplitter(chunk_size=300)
recursive_text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)

## Split the texts
char_docs = char_text_splitter.split_documents(documents)
token_docs = token_text_splitter.split_documents(documents)
recr_docs = recursive_text_splitter.split_documents(documents)


## Now create the vector database for each of the splitter
char_db = create_vector_database(char_docs, name='char')
token_db = create_vector_database(token_docs, name='token')
recr_db = create_vector_database(recr_docs,name="recr")

## Create the retriever objects
char_db_retriever = char_db.as_retriever(
    search_type = "similarity_score_threshold",
    search_kwargs = {"k": 3, "score_threshold": 0.1}
)

token_db_retriever = token_db.as_retriever(
    search_type = "similarity_score_threshold",
    search_kwargs = {"k": 3, "score_threshold": 0.1}
)

recr_db_retriever = recr_db.as_retriever(
    search_type = "similarity_score_threshold",
    search_kwargs = {"k": 3, "score_threshold": 0.1}
)

In [None]:
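def format_retrieve(retriever, query):
    # retriever.invoke returns a list of Document objects, so print each one
    relevant_docs = retriever.invoke(query)
    for doc in relevant_docs:
        print(doc.page_content)
        print("-"*30)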

In [None]:
query = "In which movies the director was Ashutosh Gowariker ?"
for retriever in [char_db_retriever, token_db_retriever, recr_db_retriever]:
    print("\n\n")
    relevant_docs = retriever.invoke(query)
    for doc in relevant_docs:
        print(doc.page_content)
        print("-"*30)




Swades: We, the People (transl. Homeland) is a 2004 Indian Hindi-language drama film co-written, directed and produced by Ashutosh Gowariker.[3] The film stars Shah Rukh Khan, Gayatri Joshi and Kishori Ballal while Daya Shankar Pandey, Rajesh Vivek, Lekh Tandon appear in supporting roles.

The plot was based on two episodes of the series Vaapsi on Zee TV's Yule Love Stories (1994–95) which had Gowariker playing the role of Mohan Bhargav
------------------------------
Lagaan: Once Upon a Time in India, or simply Lagaan, (transl. Land tax) is a 2001 Indian Hindi-language epic period musical[5] sports drama film written and directed by Ashutosh Gowariker. The film was produced by Aamir Khan, who stars alongside debutant Gracy Singh and British actors Rachel Shelley and Paul Blackthorne
------------------------------
It received widespread critical acclaim for Gowariker's direction, Khan's performance, dialogues, soundtrack, and the film's anti-imperialist stance. With earnings of ₹65.9

## Embedding Models

There are plenty of embedding models out there.

Here are links to some:
1. [OpenAI Embeddings](https://platform.openai.com/docs/guides/embeddings)
2. [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)

You can also explore more on the [Hugging Face Hub](https://huggingface.co/models?pipeline_tag=sentence-similarity&sort=likes).

In this section, we will compare these two embedding models and check how well each one retrieves the relevant information.

Outline

1. Scan all the text files and create a document object for each, along with its metadata
2. Chunk the documents
3. Use each embedding model to convert the chunks to a numerical representation
4. Store these numerical vectors in a vector database
5. Compare the results for a single query to see how each embedding model performs
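One simple way to put a number on the comparison in the last step is to measure how much the two retrievers agree on a query. The helper below is a hypothetical sketch that computes the Jaccard overlap between two lists of retrieved chunk texts; the chunk labels in the example are placeholders, not real retrieval results.

```python
def retrieval_overlap(results_a: list[str], results_b: list[str]) -> float:
    """Jaccard overlap between two sets of retrieved chunk texts (0.0 to 1.0)."""
    set_a, set_b = set(results_a), set(results_b)
    if not set_a and not set_b:
        return 1.0  # both retrievers returned nothing, so they agree
    return len(set_a & set_b) / len(set_a | set_b)


# Placeholder chunk texts standing in for what each retriever returned
from_model_a = ["chunk 1", "chunk 2", "chunk 3"]
from_model_b = ["chunk 2", "chunk 3", "chunk 4"]

print(retrieval_overlap(from_model_a, from_model_b))  # 0.5
```

A high overlap does not say which model is better, only that they behave similarly on that query; the chunks they disagree on are the ones worth reading by hand.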

### Load all the documents

In [11]:
combined_documents = []
for txt_path in glob("*.txt"):
    text_loader = TextLoader(txt_path)
    docs = text_loader.load()
    for doc in docs:
        combined_documents.append(doc)

len(combined_documents)

4

### Chunking the documents

In [20]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size = 500, chunk_overlap = 100)
chunks = text_splitter.split_documents(combined_documents)

## find the average character length in all the chunks

chars_count = [len(c.page_content) for c in chunks]
chars_avg_count = sum(chars_count)/len(chunks)


print(f"All the documents were split into: {len(chunks)} chunks with average chunk length of {chars_avg_count:.0f}.")

All the documents were split into: 62 chunks with average chunk length of 388.


### Embedding the chunks

Loading embeddings from huggingface using [`HuggingFaceEmbeddings`](https://python.langchain.com/api_reference/huggingface/embeddings/langchain_huggingface.embeddings.huggingface.HuggingFaceEmbeddings.html)

In [23]:
open_ai_embedding = OpenAIEmbeddings(model='text-embedding-3-small')
huggingface_embeddding = HuggingFaceEmbeddings(
    model_name = "sentence-transformers/all-MiniLM-L6-v2",
    model_kwargs = {'device': 'cpu'}
)


### Create Vector Database

In [26]:
def create_vector_database_v2(chunks: list, name:str, embeddings):

    random_suffix = randomname.get_name()

    persistent_directory = f"db/chroma-{name}-({random_suffix})"

    ## If already there, delete and create a new one

    for folder in glob(f"db/chroma-{name}*"):
        if os.path.exists(folder):
            shutil.rmtree(folder)

    os.mkdir(persistent_directory)

    vector_db = Chroma.from_documents(
        documents = chunks,
        collection_name = name,
        embedding = embeddings,
        persist_directory = persistent_directory
    )

    return vector_db

In [27]:
os.makedirs("db", exist_ok=True)

In [28]:
openai_vector_db = create_vector_database_v2(
    chunks = chunks,
    name = "open_ai_vector_database",
    embeddings = open_ai_embedding
)

In [29]:
huggingface_vector_db = create_vector_database_v2(
    chunks = chunks,
    name = "huggingface_vector_database",
    embeddings = huggingface_embeddding
)

### Retrieve the information from Vector Database

In [31]:
open_ai_retriever = openai_vector_db.as_retriever(
    search_type = "similarity_score_threshold",
    search_kwargs = {"k": 3, "score_threshold": 0.1}
)

huggingface_ai_retriever = huggingface_vector_db.as_retriever(
    search_type = "similarity_score_threshold",
    search_kwargs = {"k": 3, "score_threshold": 0.1}
)

In [34]:
query = "In which movies the director was Ashutosh Gowariker ?"

for retriever in [open_ai_retriever, huggingface_ai_retriever]:
    print("\n\n")
    relevant_docs = retriever.invoke(query)
    for doc in relevant_docs:
        print(doc.page_content)




foray into film production. Gowariker was inspired by aspects of sports drama Naya Daur (1957) in developing the film. The language featured in the film was based on Awadhi, but was diluted with standard Hindi for modern audiences. Principal photography took place in villages near Bhuj. Nitin Chandrakant Desai served as art director, while Bhanu Athaiya was the costume designer. The original soundtrack was composed by A. R. Rahman, with lyrics written by Javed Akhtar.
Swades: We, the People (transl. Homeland) is a 2004 Indian Hindi-language drama film co-written, directed and produced by Ashutosh Gowariker.[3] The film stars Shah Rukh Khan, Gayatri Joshi and Kishori Ballal while Daya Shankar Pandey, Rajesh Vivek, Lekh Tandon appear in supporting roles.
Lagaan: Once Upon a Time in India, or simply Lagaan, (transl. Land tax) is a 2001 Indian Hindi-language epic period musical[5] sports drama film written and directed by Ashutosh Gowariker. The film was produced by Aamir Khan, who star

## Retriever




To retrieve the most relevant documents from a vector store, there are several methods for scoring and selecting results:

1. Top-K similar results
2. Maximal Marginal Relevance (MMR)
3. Similarity scores with a threshold

[`langchain_core.vectorstores.base.VectorStore.as_retriever`](https://python.langchain.com/api_reference/core/vectorstores/langchain_core.vectorstores.base.VectorStore.html#langchain_core.vectorstores.base.VectorStore.as_retriever)
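The difference between plain top-K and top-K with a score threshold can be sketched with a few already-scored documents. The function names and the scores below are made up for illustration; they are not LangChain APIs or real similarity values.

```python
def top_k(scored_docs: list[tuple[str, float]], k: int) -> list[tuple[str, float]]:
    """Return the k highest-scoring (document, score) pairs."""
    return sorted(scored_docs, key=lambda pair: pair[1], reverse=True)[:k]


def top_k_with_threshold(scored_docs, k: int, score_threshold: float):
    """Drop documents scoring below the threshold, then keep the best k."""
    kept = [pair for pair in scored_docs if pair[1] >= score_threshold]
    return top_k(kept, k)


# Illustrative similarity scores in [0, 1]
scored = [("chunk A", 0.82), ("chunk B", 0.47), ("chunk C", 0.12), ("chunk D", 0.05)]

print(top_k(scored, k=3))
# [('chunk A', 0.82), ('chunk B', 0.47), ('chunk C', 0.12)]
print(top_k_with_threshold(scored, k=3, score_threshold=0.1))
# [('chunk A', 0.82), ('chunk B', 0.47), ('chunk C', 0.12)]
print(top_k_with_threshold(scored, k=3, score_threshold=0.5))
# [('chunk A', 0.82)]
```

With a threshold, `k` is only an upper bound: a strict threshold can return fewer than `k` documents. That is one plausible reason the first query in this notebook, run with `k=2` and `score_threshold=0.5`, printed a single chunk.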

**Ranking Metric**

Two key techniques for ranking document relevance are cosine similarity and Maximal Marginal Relevance (MMR).

1. **Cosine Similarity**: This method focuses solely on finding the most similar documents to the query by measuring the cosine of the angle between the query vector and document vectors. While effective for identifying closely related documents, it often results in a more homogeneous set of results, lacking diversity.

2. **Maximal Marginal Relevance (MMR)**: MMR enhances the diversity of search results by balancing relevance and novelty. Unlike cosine similarity, MMR introduces a mechanism to penalize selections that are too similar to previously chosen items. This ensures that the final set of documents is not only relevant but also diverse, reducing redundancy and providing a broader range of information. This approach is particularly beneficial in scenarios where a wide variety of information is desired.

In summary, while cosine similarity is great for finding the closest matches, MMR goes a step further by ensuring that the results are both relevant and diverse, making it a useful technique for improving the quality of search results in a RAG pipeline.
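The penalty MMR applies can be written as a score per candidate: `lambda_mult * relevance - (1 - lambda_mult) * redundancy`, where redundancy is the candidate's highest similarity to anything already selected. The sketch below is a minimal plain-Python version under that formulation, not LangChain's implementation; the parameter name `lambda_mult` mirrors the one LangChain accepts in `search_kwargs`, and the 2-D vectors are made up.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr(query_vec, doc_vecs, k: int, lambda_mult: float = 0.5) -> list[int]:
    """Return the indices of k documents chosen by Maximal Marginal Relevance."""
    selected: list[int] = []
    candidates = list(range(len(doc_vecs)))

    def mmr_score(i: int) -> float:
        relevance = cosine(query_vec, doc_vecs[i])
        # Highest similarity to any already-selected document (0 if none yet)
        redundancy = max((cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0)
        return lambda_mult * relevance - (1 - lambda_mult) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected


query = [1.0, 1.0]
docs = [
    [1.0, 0.9],  # 0: very relevant
    [1.0, 0.8],  # 1: near-duplicate of document 0
    [0.2, 1.0],  # 2: less relevant, but different from document 0
]

print(mmr(query, docs, k=2, lambda_mult=1.0))  # [0, 1]  pure similarity
print(mmr(query, docs, k=2, lambda_mult=0.5))  # [0, 2]  near-duplicate is penalized
```

With `lambda_mult=1.0` the redundancy term vanishes and MMR reduces to plain similarity ranking; lowering it trades some relevance for diversity.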

In [43]:
open_ai_mmr_retriever = openai_vector_db.as_retriever(
    search_type = "mmr",
    search_kwargs = {"k": 5}
)

open_ai_similarity_retriever = openai_vector_db.as_retriever(
    search_type = "similarity",
    search_kwargs = {"k": 5}
)

query = "In which movies the director was Ashutosh Gowariker ?"
print(query)
for retriever in [open_ai_mmr_retriever, open_ai_similarity_retriever]:
    print("\n\n")
    relevant_docs = retriever.invoke(query)
    for i, doc in enumerate(relevant_docs):
        print(f"{i+1}.",doc.page_content)
        print()

In which movies the director was Ashutosh Gowariker ?



1. foray into film production. Gowariker was inspired by aspects of sports drama Naya Daur (1957) in developing the film. The language featured in the film was based on Awadhi, but was diluted with standard Hindi for modern audiences. Principal photography took place in villages near Bhuj. Nitin Chandrakant Desai served as art director, while Bhanu Athaiya was the costume designer. The original soundtrack was composed by A. R. Rahman, with lyrics written by Javed Akhtar.

2. Swades: We, the People (transl. Homeland) is a 2004 Indian Hindi-language drama film co-written, directed and produced by Ashutosh Gowariker.[3] The film stars Shah Rukh Khan, Gayatri Joshi and Kishori Ballal while Daya Shankar Pandey, Rajesh Vivek, Lekh Tandon appear in supporting roles.

3. Lagaan: Once Upon a Time in India, or simply Lagaan, (transl. Land tax) is a 2001 Indian Hindi-language epic period musical[5] sports drama film written and directed by 

We can see that MMR gives more diverse results.

## Re-ranking to improve RAG

**Reranking techniques to improve Retrieval-Augmented Generation (RAG) pipelines:**

<img src="https://miro.medium.com/v2/resize:fit:1400/format:webp/1*q1QQSnNCu-OoN1Ttgh-AVw.png" height=400/>

1. Initial Retrieval: The process starts by retrieving the top K documents from a vector database based on similarity scores. This step is fast and efficient, using methods like cosine similarity.

2. Reranking: After the initial retrieval, a reranker model re-evaluates these top K documents to improve their relevance to the query. This model can be more sophisticated and context-aware, often using techniques like cross-attention.

3. Cross-Attention: This technique allows the reranker to consider the interaction between the query and each document more deeply, capturing nuanced relationships and improving the ranking quality.

4. Reordering: Based on the new relevance scores from the reranker, the documents are reordered to ensure the most relevant ones are at the top.

5. Final Selection: The top N documents from the reranked list are selected for the next steps in the RAG pipeline, such as being passed to the language model for generating a response.

6. Model Sophistication: While the reranker is often more advanced than the initial retrieval model, it doesn't always have to be. The key is that the reranker provides a different perspective or additional layer of evaluation to enhance the quality of the results.

7. ChromaDB: The documents retrieved from ChromaDB can be passed to a reranker, such as a cross-encoder model, to re-evaluate and improve the initial search results.

Overall, reranking helps refine the initial retrieval results by applying a more detailed and context-aware evaluation, leading to better and more relevant document selection.
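The two-stage flow described above can be sketched as follows. This is an illustration under loud assumptions: `toy_rerank_score` is a hypothetical word-overlap stand-in for a real cross-encoder, and the first-stage hits are shortened paraphrases of chunks from this notebook rather than actual retriever output.

```python
def toy_rerank_score(query: str, document: str) -> float:
    """Stand-in for a cross-encoder: fraction of query words found in the document."""
    query_words = set(query.lower().split())
    doc_words = set(document.lower().split())
    return len(query_words & doc_words) / len(query_words) if query_words else 0.0


def retrieve_then_rerank(query: str, first_stage_hits: list[str], top_n: int) -> list[str]:
    """Re-score the first-stage top K with the reranker and keep the best N."""
    reranked = sorted(
        first_stage_hits,
        key=lambda doc: toy_rerank_score(query, doc),
        reverse=True,
    )
    return reranked[:top_n]


# Pretend these are the top K = 3 chunks from the vector database, in its order
first_stage_hits = [
    "The original soundtrack was composed by A. R. Rahman",
    "Swades is a drama film directed by Ashutosh Gowariker",
    "Lagaan is a sports drama film written and directed by Ashutosh Gowariker",
]
query = "film directed by Ashutosh Gowariker"

for doc in retrieve_then_rerank(query, first_stage_hits, top_n=2):
    print(doc)
# Swades is a drama film directed by Ashutosh Gowariker
# Lagaan is a sports drama film written and directed by Ashutosh Gowariker
```

The reranker only sees the K candidates from the first stage, so it can afford a slower, more careful scoring function than the vector search that produced them.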

References

1. [Rerankers and Two-Stage Retrieval](https://www.pinecone.io/learn/series/rag/rerankers/)
2. [A Hands-on Guide to Enhance RAG with Re-ranking](https://adasci.org/a-hands-on-guide-to-enhance-rag-with-re-ranking/)

## Generation part of RAG

Here we augment the prompt with the relevant information retrieved from the vector database, then pass it to the LLM to answer the query.

In [63]:
prompt_template = ChatPromptTemplate.from_messages(
    messages = [
        ("system", "Based on the relevant information below\n\n{relevant_information}"),
        ("human", "Answer the following: {query}")
    ]
)

query = "In which movies the director was Ashutosh Gowariker ?"

results = open_ai_mmr_retriever.invoke(query)
relevant_information = "\n".join([f"{i+1}. {r.page_content}" for i, r in enumerate(results)])

This is what the prompt looks like:

In [62]:
print(prompt_template.invoke(
    {
        "relevant_information": relevant_information,
        "query": query
    }
).messages[0].content)

Based on the relevant information below

1. foray into film production. Gowariker was inspired by aspects of sports drama Naya Daur (1957) in developing the film. The language featured in the film was based on Awadhi, but was diluted with standard Hindi for modern audiences. Principal photography took place in villages near Bhuj. Nitin Chandrakant Desai served as art director, while Bhanu Athaiya was the costume designer. The original soundtrack was composed by A. R. Rahman, with lyrics written by Javed Akhtar.
2. Swades: We, the People (transl. Homeland) is a 2004 Indian Hindi-language drama film co-written, directed and produced by Ashutosh Gowariker.[3] The film stars Shah Rukh Khan, Gayatri Joshi and Kishori Ballal while Daya Shankar Pandey, Rajesh Vivek, Lekh Tandon appear in supporting roles.
3. Lagaan: Once Upon a Time in India, or simply Lagaan, (transl. Land tax) is a 2001 Indian Hindi-language epic period musical[5] sports drama film written and directed by Ashutosh Gowariker

In [66]:
chain =  prompt_template | model | StrOutputParser()

In [71]:
message = chain.invoke({"query": query, "relevant_information": relevant_information})
message  # StrOutputParser already returns a plain string, so there is no .content

'The director Ashutosh Gowariker was behind the movies "Swades: We, the People" and "Lagaan: Once Upon a Time in India."'