<a href="https://colab.research.google.com/github/Saim-Hassan786/LangChain-RAG-Project/blob/main/LangChain_RAG_Project.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**The command `!pip install -q -U langchain langchain-pinecone langchain-google-genai` is used to install or update specific Python packages in a Google Colab environment. It installs the `langchain` library, which helps with building applications that use language models. The `langchain-pinecone` extension is added for integrating Pinecone, a vector database for machine learning applications. The `langchain-google-genai` package is for integrating Google's Generative AI with LangChain. The `-q` flag ensures that the installation happens quietly, without displaying excessive output, while `-U` ensures that the packages are updated to their latest versions.**

In [None]:
!pip install -q -U langchain langchain-pinecone langchain-google-genai


**This code sets up the environment variables that hold the API keys needed to interact with Pinecone and Google services in a Google Colab environment. First, it imports the `os` module for handling environment variables and `userdata` from `google.colab`. The `os.environ` mapping is then used to assign values to two environment variables: `PINECONE_API_KEY` and `GOOGLE_API_KEY`. These keys are retrieved using the `userdata.get()` method, which reads the values stored under those names in the Colab user's secrets. This setup keeps the keys out of the notebook's source while making them available to later code.**

In [None]:
import os
from google.colab import userdata
os.environ["PINECONE_API_KEY"] = userdata.get('PINECONE_API_KEY')
os.environ["GOOGLE_API_KEY"] = userdata.get('GOOGLE_API_KEY')
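Outside Colab, `google.colab.userdata` is not available. The following is a minimal sketch, assuming the keys have already been exported as environment variables. The helper name `require_env` and the variable `DEMO_API_KEY` are hypothetical and not part of this notebook:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


# Demo with a placeholder value; a real key would come from your secret store.
os.environ["DEMO_API_KEY"] = "not-a-real-key"
demo_key = require_env("DEMO_API_KEY")
print(len(demo_key))
```

Failing early like this gives a clearer error than a rejected request later in the notebook.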

**This code imports the necessary components from the Pinecone library, specifically `Pinecone` and `ServerlessSpec`. It then creates a new instance of the `Pinecone` class by passing the user's Pinecone API key (retrieved via `userdata.get()`) to authenticate the connection. The instance `pc` is now ready to interact with the Pinecone service using this API key.**

In [None]:
from pinecone import Pinecone, ServerlessSpec
pc = Pinecone(api_key=userdata.get("PINECONE_API_KEY"))



**This code is creating a new index in Pinecone, which is used for managing and querying vector data. The variable `index_name` is set to the name of the index, which is `"project-langchain-rag"`. The `pc.create_index()` function is then called to create the index with the specified parameters. The `name` parameter assigns the index its unique name. The `dimension` parameter is set to `768`, which defines the size of the vectors that will be stored in the index. The `metric="cosine"` sets the similarity measurement method to cosine similarity, which is commonly used for comparing vectors. The `ServerlessSpec` defines the configuration for the index, specifying that it should be deployed in AWS (Amazon Web Services) and the region should be `us-east-1`. This setup creates a scalable and efficient environment to store and query high-dimensional vectors.**

#  Do Not Run the Code Below, Respected Teacher: the index has already been created by me in the Pinecone database with the specified parameters and index name, and running this cell again will raise an error.

In [None]:
index_name = "project-langchain-rag"
pc.create_index(
    name=index_name,
    dimension=768,
    metric="cosine",
    spec=ServerlessSpec(
        cloud="aws",
        region="us-east-1"
    )
)
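Because creating an index that already exists raises an error, a guard can make such a cell safe to re-run. The sketch below shows only the guard logic, with a stand-in for the creation call so that it runs without Pinecone. `ensure_index` is a hypothetical helper, and with the real client the list of existing names would have to come from the SDK's index-listing call, which is an assumption to check against the installed version:

```python
from typing import Callable, Iterable


def ensure_index(existing_names: Iterable[str], name: str,
                 create: Callable[[str], None]) -> bool:
    """Call `create(name)` only if `name` is not already present.

    Returns True when the index was created, False when it already existed.
    """
    if name in set(existing_names):
        return False
    create(name)
    return True


# Stand-in for the real creation call: it just records the requested name.
created = []
print(ensure_index([], "project-langchain-rag", created.append))       # True
print(ensure_index(created, "project-langchain-rag", created.append))  # False
```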

# Run This Code Only To Test The Code Further:

In [None]:
index_name = "project-langchain-rag"
index = pc.Index(index_name)

**This code imports the `GoogleGenerativeAIEmbeddings` class from the `langchain_google_genai` module. It then creates an instance of this class, named `embeddings`, which is used to generate embeddings (numerical representations of text) using Google's Generative AI. The `model="models/embedding-001"` specifies the specific model to be used for creating these embeddings. This setup allows you to leverage Google's AI to convert text into embeddings, which can be used for various tasks like search or machine learning. The `embeddings` object is now ready for use in the rest of the code.**

In [None]:
from langchain_google_genai import GoogleGenerativeAIEmbeddings
embeddings = GoogleGenerativeAIEmbeddings(
    model="models/embedding-001"
    )

In [None]:
sample = embeddings.embed_query("Myself Saim Hassan Akhtar")
print(sample[:5])

[0.026417579501867294, -0.04130956903100014, -0.055189963430166245, -0.07683722674846649, 0.05129137635231018]


**This code imports the `PineconeVectorStore` class from the `langchain_pinecone` module, which allows integration between LangChain and Pinecone for storing and querying vector data. The `vector_store` object is created by passing two key arguments: `embedding` and `index`. The `embedding` is set to the previously defined `embeddings` object, which represents the method for converting text into vector embeddings using Google Generative AI. The `index` is the Pinecone index (created earlier in the code) where these embeddings will be stored and queried. By setting up the `vector_store`, you now have a system where text can be converted into embeddings and stored in Pinecone for fast and efficient similarity search. This makes it easier to retrieve relevant information based on vector similarity.**

In [None]:
from langchain_pinecone import PineconeVectorStore
vector_store = PineconeVectorStore(
    embedding = embeddings,
    index = index
)

# All the Code Below Tests Whether My Vector Database Is Working Correctly by Adding Some Text to It in the Form of Documents

In [None]:
from uuid import uuid4
from langchain_core.documents import Document

In [None]:
document_1 = Document(
  page_content="Myself Saim Hassan Akhtar",
  metadata = {"info":"intro"}
)

document_2 = Document(
    page_content="I have been doing this Agentic AI Engineering Course on the platform of PIAIC",
    metadata = {"info":"course"}
)

document_3 = Document(
    page_content="I have been honoured to be guided by the amazing mentors of PIAIC",
    metadata = {"info":"mentors"}
)

document_4 = Document(
    page_content="Agentic AI refers to AI systems that can autonomously make decisions, take actions, and adapt to achieve specific goals",
    metadata = {"info":"Agentic AI"}
)

document_5 = Document(
    page_content="AI systems are computer programs designed to mimic human intelligence by learning, reasoning, and performing tasks autonomously",
    metadata = {"info":"AI systems"}
)

In [None]:
documents = [document_1,document_2,document_3,document_4,document_5]
uuids = [str(uuid4()) for _ in range(len(documents))]

In [None]:
vector_store.add_documents(documents=documents, ids=uuids)

['10141167-06cc-4cad-a5f3-aa7c35e1e707',
 '9d08d9c8-1300-46c6-aa48-49c3bf3612ab',
 '79ab56c4-2981-4b14-a59d-9f38f8395319',
 '7425a18a-4322-44a4-a0ce-38808d0d03ac',
 '7e6ecfef-7bf7-4069-aa89-3a502b2d4ba6']

**This code performs a similarity search using the `vector_store` object created earlier. It queries the Pinecone vector store for the most similar vectors to the given query: "AI systems are technologies that simulate human intelligence...". The `k = 2` parameter specifies that the top 2 most similar results should be returned. The `similarity_search` function compares the query's vector to the stored vectors and retrieves the closest matches. The `results` variable stores the top 2 similar results from the vector store based on the query.**

In [None]:
results = vector_store.similarity_search(
    query = "AI systems are technologies that simulate human intelligence to perform tasks such as learning, problem-solving, and decision-making autonomously.",
    k = 2,
)

In [None]:
results

[Document(id='7e6ecfef-7bf7-4069-aa89-3a502b2d4ba6', metadata={'info': 'AI systems'}, page_content='AI systems are computer programs designed to mimic human intelligence by learning, reasoning, and performing tasks autonomously'),
 Document(id='7425a18a-4322-44a4-a0ce-38808d0d03ac', metadata={'info': 'Agentic AI'}, page_content='Agentic AI refers to AI systems that can autonomously make decisions, take actions, and adapt to achieve specific goals')]

In [None]:
for result in results:
  print(f"-> {result.page_content} {result.metadata}")

-> AI systems are computer programs designed to mimic human intelligence by learning, reasoning, and performing tasks autonomously {'info': 'AI systems'}
-> Agentic AI refers to AI systems that can autonomously make decisions, take actions, and adapt to achieve specific goals {'info': 'Agentic AI'}
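Conceptually, a similarity search ranks stored vectors by how close they are to the query vector. The following is a toy brute-force sketch of that idea using cosine similarity and made-up 3-dimensional vectors. It is not how Pinecone implements search internally, and `cosine_similarity` and `top_k` are illustrative names:

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float], store: Dict[str, Sequence[float]],
          k: int = 2) -> List[Tuple[str, float]]:
    """Return the k stored items most similar to the query, best first."""
    scored = [(key, cosine_similarity(query, vec)) for key, vec in store.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Made-up vectors, purely for illustration (the index here uses 768 dimensions).
toy_store = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}
for key, score in top_k([1.0, 0.0, 0.0], toy_store, k=2):
    print(key, round(score, 3))
```

A vector database answers the same kind of query without comparing against every stored vector, which is what makes it practical for large collections.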


**This code performs a similarity search with scoring using the `vector_store` object. It queries the vector store with the given text: "Agentic AI is the most trendy topic in the world right now". The `k=2` parameter specifies that the top 2 most similar results should be returned. The `similarity_search_with_score` function not only retrieves the closest matching vectors but also includes a score indicating how similar each match is to the query. The results, including both the matching vectors and their scores, are stored in the variable `result_1`.**

In [None]:
result_1 = vector_store.similarity_search_with_score(
    query = "Agentic AI is the most trendy topic in the world right now",
    k=2
)

In [None]:
result_1

[(Document(id='7425a18a-4322-44a4-a0ce-38808d0d03ac', metadata={'info': 'Agentic AI'}, page_content='Agentic AI refers to AI systems that can autonomously make decisions, take actions, and adapt to achieve specific goals'),
  0.705860198),
 (Document(id='9d08d9c8-1300-46c6-aa48-49c3bf3612ab', metadata={'info': 'course'}, page_content='I have been doing this Agentic AI Engineering Course on the platform of PIAIC'),
  0.657747924)]

In [None]:
for res, score in result_1:
  print(f"[{score:.6f}] {res.page_content}")

[0.705860] Agentic AI refers to AI systems that can autonomously make decisions, take actions, and adapt to achieve specific goals
[0.657748] I have been doing this Agentic AI Engineering Course on the platform of PIAIC


**This code converts the `vector_store` into a retriever object using the `as_retriever` method. The `search_type="similarity"` indicates that the retriever will perform a similarity search based on vector comparison. The `search_kwargs` parameter specifies additional search settings, in this case, returning the top 1 most similar result (`k=1`). The resulting `result_2` is now a retriever object that can be used to retrieve the most relevant match for a given query. This setup streamlines searching within the vector store for the closest match.**

In [None]:
result_2 = vector_store.as_retriever(
    search_type = "similarity",
    search_kwargs = {"k" : 1}
)

In [None]:
result_2.invoke("PIAIC Platform offers a number of Cutting-Edge technology course ")

[Document(id='9d08d9c8-1300-46c6-aa48-49c3bf3612ab', metadata={'info': 'course'}, page_content='I have been doing this Agentic AI Engineering Course on the platform of PIAIC')]

**This code imports the `ChatGoogleGenerativeAI` class from the `langchain_google_genai` module, which allows you to interact with Google's generative AI in a conversational manner. It then creates an instance of this class, named `llm`, with the model set to `"gemini-2.0-flash-exp"` and the `temperature` set to `0.3` to control the randomness of the responses (lower values make it more deterministic). The `IPython.display` module is also imported to allow for formatted output in a Jupyter or Colab environment, specifically using `display` and `Markdown` to present responses in a readable format. The `llm` object is now ready to generate AI-driven conversations based on the provided model and settings.**

In [None]:
from langchain_google_genai import ChatGoogleGenerativeAI
llm = ChatGoogleGenerativeAI(
    model = "gemini-2.0-flash-exp",
    temperature = 0.3
)
from IPython.display import display, Markdown

This function `answer_to_question_with_rag` is designed to answer a user query using a Retrieval-Augmented Generation (RAG) approach. The function takes in a query (text input) and performs the following steps:

1. It uses `vector_store.similarity_search(query)` to retrieve the documents most similar to the given query from the Pinecone vector store. The `result` is a list of `Document` objects, ordered from most to least similar, each carrying its text in `page_content`.

2. From the `result`, it selects the first match (`result[0].page_content`) and combines it with the user's query by concatenating them. This is called the **augmented query**, which now includes both the original query and relevant context from the vector store.

3. The augmented query is passed to the `llm.invoke()` function inside a short instruction asking the language model (LLM) to respond to it. Because the retrieved context is part of that text, the LLM can draw on it when generating the answer.

4. The response generated by the LLM is returned as the output of the function.

This approach effectively combines external information from the vector store with the language model's capabilities, enhancing the relevance and accuracy of the response. By augmenting the query with context from similar documents, the model can give a more informed answer.

In [None]:
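def answer_to_question_with_rag(query: str):
  # Retrieve the most similar documents; the best match comes first
  result = vector_store.similarity_search(query)
  # Use the top match as context, or no context if nothing was retrieved
  context = result[0].page_content if result else ""
  # Build the augmented query as a plain string rather than a list
  augmented_query = f"{context} {query}".strip()
  llm_response = llm.invoke(f"Create a response in response to this query: {augmented_query}")
  return llm_response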
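One common variation keeps the retrieved context and the user's question in separately labelled parts of the prompt, so the model is less likely to mistake the context for something the user said. This is a minimal sketch of the prompt construction only. `build_rag_prompt` and its wording are hypothetical and are not what the function above uses:

```python
def build_rag_prompt(context: str, question: str) -> str:
    """Build a prompt with clearly separated context and question sections."""
    context = context.strip() or "(no context retrieved)"
    return (
        "Answer the question using the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question:\n{question.strip()}"
    )


prompt = build_rag_prompt("Myself Saim Hassan Akhtar", "What is my name")
print(prompt)
```

The resulting string could then be passed to `llm.invoke(prompt)` in place of the concatenated query.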

# Below Is the Testing of the Above Function; You Can See the Results for Yourself

In [None]:
query_response = answer_to_question_with_rag("What is my name")
display(Markdown(query_response.content))

Your name is Saim Hassan Akhtar.


In [None]:
query_response_1 = answer_to_question_with_rag("What is agentic ai")
display(Markdown(query_response_1.content))

Okay, you've got a great definition there: **Agentic AI refers to AI systems that can autonomously make decisions, take actions, and adapt to achieve specific goals.**

To expand on that, let's break down what makes Agentic AI so significant and different from other forms of AI:

**Key Characteristics of Agentic AI:**

* **Autonomy:** This is the core of agentic AI. Unlike traditional AI that relies on explicit instructions for each step, agentic AI can operate independently, making choices based on its understanding of the environment and its goals.
* **Decision-Making:** Agentic AI isn't just following a pre-programmed script. It can analyze situations, weigh different options, and choose the best course of action to achieve its objectives.
* **Action-Taking:** It's not just about thinking; agentic AI can interact with the real world (or a simulated environment) by taking actions. This could involve controlling a robot, managing a system, or even communicating with humans.
* **Adaptability:** Agentic AI can learn from its experiences and adjust its behavior accordingly. It can refine its strategies, improve its performance, and handle unexpected situations.
* **Goal-Oriented:** Agentic AI is driven by specific objectives. It's not just performing tasks randomly; it's actively working towards a defined outcome.

**Why is Agentic AI Important?**

* **Automation of Complex Tasks:** Agentic AI has the potential to automate tasks that require complex reasoning, problem-solving, and adaptation, going far beyond simple rule-based automation.
* **Increased Efficiency and Productivity:** By handling tasks autonomously, agentic AI can free up human workers to focus on more creative and strategic endeavors.
* **Improved Decision-Making:** Agentic AI can analyze vast amounts of data and make more informed decisions than humans in certain situations.
* **New Possibilities:** Agentic AI opens up exciting possibilities in various fields, including robotics, healthcare, finance, and scientific research.

**How is Agentic AI Different from Traditional AI?**

* **Traditional AI:** Often relies on supervised learning, where it's trained on labeled data and follows pre-defined rules. It's good at specific tasks but lacks the ability to adapt and make independent decisions.
* **Agentic AI:** Emphasizes reinforcement learning, where the AI learns through trial and error and is rewarded for achieving its goals. It's designed to be more flexible, adaptable, and autonomous.

**Examples of Agentic AI (in development or research):**

* **Autonomous Vehicles:** Cars that can navigate roads and make driving decisions without human intervention.
* **Personal Assistants:** AI that can manage schedules, make appointments, and handle tasks based on user preferences.
* **Robotic Assistants:** Robots that can perform complex tasks in manufacturing, healthcare, or logistics.
* **Trading Bots:** AI that can analyze market data and make investment decisions.
* **Scientific Discovery Systems:** AI that can analyze research data and propose new hypotheses.

**In Summary:**

Agentic AI represents a significant leap forward in AI development. It's about creating systems that can think, act, and learn autonomously to achieve specific goals. While still in its early stages, Agentic AI has the potential to revolutionize many aspects of our lives and reshape the future of technology.

So, your initial definition was spot on! This expanded explanation provides more context and highlights the significance of this emerging field.


In [None]:
query_response_2 = answer_to_question_with_rag("What is the PIAIC")
display(Markdown(query_response_2.content))

Okay, here's a response that acknowledges the user's statement and explains what PIAIC is, while keeping a positive and appreciative tone:

"That's wonderful to hear! It sounds like you've had a truly valuable experience with the mentors at PIAIC. 

For those who might not know, **PIAIC stands for the Presidential Initiative for Artificial Intelligence and Computing.** It's a large-scale program in Pakistan aimed at developing a skilled workforce in the fields of Artificial Intelligence, Cloud Computing, and other cutting-edge technologies. 

PIAIC offers comprehensive training programs, often with a strong emphasis on practical skills and real-world applications. The program's goal is to empower individuals with the knowledge and abilities needed to excel in the rapidly evolving tech landscape.

The fact that you've been so impressed by your mentors speaks volumes about the quality of the program and the dedication of the people involved. It's great to see that PIAIC is having such a positive impact!"

**Here's why this response works:**

* **Acknowledges the user's statement:** It starts by directly responding to their positive experience.
* **Provides the full name:** It clearly spells out the acronym PIAIC.
* **Explains the program's purpose:** It gives a concise overview of what PIAIC is and its goals.
* **Highlights the program's strengths:** It mentions the practical focus and the aim to develop a skilled workforce.
* **Reinforces the user's positive sentiment:** It connects the user's positive experience with the quality of the program.
* **Keeps a positive and appreciative tone:** The overall tone is encouraging and supportive.

This response is informative and also validates the user's positive experience, making it a well-rounded and helpful reply.


In [None]:
query_response_3 = answer_to_question_with_rag("What is the course I am studying and why")
display(Markdown(query_response_3.content))

**This code creates a new index in Pinecone, a vector database, for storing and querying vector data for the project given by Sir Jahanzaib. The `index_name_1` variable is assigned the name `"fun-langchain-rag"` for this index. The `pc.create_index()` function is called to create the index with specific parameters. The `name` parameter assigns the unique name to the index. The `dimension=768` specifies that the vectors to be stored will have 768 dimensions, which is typical for models like BERT or similar embeddings. The `metric="cosine"` indicates that cosine similarity will be used to measure the similarity between vectors. The `ServerlessSpec` defines the configuration of the index, including the cloud provider (`aws`) and the region (`us-east-1`). This setup allows for scalable, efficient storage and retrieval of vector data in the Pinecone index. The index will be serverless, meaning it can scale dynamically based on usage.**

#  Do Not Run the Code Below, Respected Teacher: the index has already been created by me in the Pinecone database with the specified parameters and index name, and running this cell again will raise an error.

In [None]:
index_name_1 = "fun-langchain-rag"
pc.create_index(
    name=index_name_1,
    dimension=768,
    metric="cosine",
    spec=ServerlessSpec(
        cloud="aws",
        region="us-east-1"
    )
)

# Run This Code Only To Test The Code Further:

In [None]:
index_name_1 = "fun-langchain-rag"
index_1 = pc.Index(index_name_1)

In [None]:
from langchain_pinecone import PineconeVectorStore
vector_store_1 = PineconeVectorStore(
    embedding = embeddings,
    index = index_1
)

In [None]:
%pip install -q -U langchain_community

**This code uses the `curl` command to download a file from a given URL directly into the current environment. The `-O` flag tells `curl` to save the file with its original name, in this case, `movie_plot_generator.csv`. The URL provided points to a raw CSV file hosted on GitHub. By executing this command, the file is retrieved from the GitHub repository and saved in the local directory.**

In [None]:
!curl -O "https://raw.githubusercontent.com/JahanzaibTayyab/AI-201/refs/heads/main/class06-20250105/rag/dataset/movie_plot_generator.csv"

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 56011  100 56011    0     0   180k      0 --:--:-- --:--:-- --:--:--  181k


**This code imports the `CSVLoader` class from the `langchain_community.document_loaders.csv_loader` module, which is used for loading CSV files. The `loader` object is created by passing the file path (`"/content/movie_plot_generator.csv"`) of the CSV file to the `CSVLoader` class. The `loader.load()` function is then called to load the contents of the CSV file into a structured format, such as a list of documents. The `data` variable stores the loaded data from the CSV file. This setup enables the processing and use of CSV data for further tasks in the LangChain framework.**

In [None]:
from langchain_community.document_loaders.csv_loader import CSVLoader

loader = CSVLoader("/content/movie_plot_generator.csv")
data = loader.load()

In [None]:
print(len(data))

1000
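As the first split document shown later in this notebook suggests, each CSV row becomes one document whose text lists the row's columns as `column: value` lines. Here is a rough standard-library sketch of that row-to-text conversion, using a made-up two-row sample. It only mimics the shape of the output and is not `CSVLoader`'s actual implementation:

```python
import csv
import io

# Tiny in-memory sample with invented rows; the column names mirror the
# Genre/Plot fields visible in the loaded documents.
sample_csv = "Genre,Plot\nAction,A heist goes wrong\nHorror,A cabin in the woods\n"


def rows_to_documents(csv_text: str) -> list:
    """Turn each CSV row into a dict with `page_content` and `metadata`."""
    documents = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader):
        # One "column: value" line per column, joined with newlines
        content = "\n".join(f"{column}: {value}" for column, value in row.items())
        documents.append({"page_content": content, "metadata": {"row": row_number}})
    return documents


docs = rows_to_documents(sample_csv)
print(len(docs))
print(docs[0]["page_content"])
```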


**This code starts by importing the necessary components: `GoogleGenerativeAIEmbeddings` for generating embeddings from text, and `RecursiveCharacterTextSplitter` from `langchain_text_splitters` to split large documents into smaller chunks. The `text_splitter` object is created using the `RecursiveCharacterTextSplitter` class, with parameters `chunk_size=500` and `chunk_overlap=50`. This means each chunk will have a maximum size of 500 characters, with an overlap of 50 characters between consecutive chunks to maintain context. The `split_documents(data)` method is then called on the `text_splitter` object to split the `data` (which was previously loaded from the CSV) into smaller, manageable pieces. These smaller chunks are stored in the `splits` variable. This is useful for processing large text documents that are too long for models or systems that have input size limitations. The resulting `splits` will be used for further tasks like embedding generation or vector storage.**

In [None]:
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
splits = text_splitter.split_documents(data)

In [None]:
splits[0]

Document(metadata={'source': '/content/movie_plot_generator.csv', 'row': 0, 'text': 'Genre: Action\nPlot: Plot for a Action movie involving unique twists.'}, page_content='Genre: Action\nPlot: Plot for a Action movie involving unique twists.')
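The effect of `chunk_size` and `chunk_overlap` can be illustrated with a plain sliding window. This sketch is a simplified fixed-width splitter, not the recursive, separator-based algorithm that `RecursiveCharacterTextSplitter` uses, and `split_with_overlap` is an illustrative name:

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list:
    """Split text into windows of at most `chunk_size` characters.

    Each window starts `chunk_size - chunk_overlap` characters after the
    previous one, so consecutive windows share `chunk_overlap` characters.
    """
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # The last window already reaches the end of the text
    return chunks


chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)  # ['abcd', 'defg', 'ghij']
```

With the notebook's settings of 500 and 50, a row as short as the one shown above fits in a single chunk, so splitting leaves it unchanged.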

**This code adds the previously split text chunks (`splits`) to the `vector_store_1`. The `vector_store_1.add_documents(splits)` method takes the text chunks and stores them in the vector store, which enables efficient similarity searches later. By adding the documents, the system can query and retrieve the most relevant chunks based on similarity.**

In [None]:
vector_store_1.add_documents(splits)

['58823d82-cfbc-4753-b60c-b72ee996c7d8',
 'a1eb1a69-d599-49fa-9f06-901194675a2e',
 '1c9c950c-6e72-4529-9025-2b42c7257f34',
 'b2768450-399f-484b-8746-b19e7264cc89',
 '0c433956-07d3-4134-b821-5ae243ce9955',
 'ad75c448-18ab-491c-9c1f-0a9b54e961fe',
 'd4ea8991-15a0-44fc-9c58-89f65c22cfc5',
 '142ae3c5-1db2-4622-90ed-d06599de70f4',
 '43cd99d0-997a-4935-84a6-0b5b046f198e',
 '05129f3f-2869-4133-a393-74f04f43c2fc',
 '66ea7e0d-8dc5-4021-b4ed-5df905b49b3f',
 '3133e24b-b8af-43bd-a442-df8a3ae41f2e',
 '0af29b08-279b-4305-86f3-32c5c5b64360',
 '8af06173-7f31-4b55-bf11-9d128cc0a379',
 'ea1c3237-1611-44c9-8811-9d1ec805f9db',
 '9843581e-715f-4db9-9637-a78b7543925b',
 '184dfa69-b3a6-43ab-811b-1c2ae796e962',
 '0453bc70-1b5d-4113-8172-c1f0dd1c7d7c',
 '6f7a5a9d-cbb7-4f13-8fb1-7e1d81b97240',
 '7be9de48-7612-4169-b560-b2f43c71f594',
 '63cc7891-2894-4801-89ac-850f5e1f7a61',
 '8fd3e4a5-83be-4f61-96e5-c2f7f514d689',
 '53a6fee3-2a06-4ede-ba14-efde44775002',
 'f1d08b7d-a52b-4480-ad12-9699662b0839',
 '701386ba-f156-

**This code converts the `vector_store_1` into a retriever object using the `as_retriever` method. The `search_type="similarity"` specifies that the retriever will search for the most similar vectors to a given query. The `search_kwargs` parameter is used to set additional search settings, with `k=3` indicating that the top 3 most similar results should be returned. The resulting `result_search_from_rag` object is now ready to perform similarity-based searches on the stored documents. This setup enables efficient retrieval of relevant information based on vector similarity.**

In [None]:
result_search_from_rag = vector_store_1.as_retriever(
    search_type = "similarity",
    search_kwargs = {"k" : 3}
)

- **Function Purpose**: The `answer_to_question_with_rag_implementation` function answers a user query using the Retrieval-Augmented Generation (RAG) approach, which combines search results from a vector store with the language model's capabilities.
  
- **Input**: The function takes a `query` string as input, which represents the user's question.

- **Search Step**:
  - It invokes `result_search_from_rag.invoke(query)` to perform a similarity search and retrieve the top 3 documents related to the query (the retriever was configured with `k=3`).
  - If the retriever returns an empty list, `result` falls back to a single-space string (`" "`).

- **Result Retrieval**: The retrieved result (or empty string) is stored in the variable `result`.

- **LLM Instruction**:
  - It uses the `llm.invoke()` function to pass the query along with the retrieved result to a pre-trained language model for processing.
  - The model is instructed with a detailed prompt on how to process the request and generate the answer.

- **Prompt Breakdown**: The prompt given to the model includes:
  - **Step 1**: Examine the retrieved search result for the query carefully.
  - **Step 2**: Use the search result to craft a detailed and accurate response to the user's query.
  - **Step 3**: Add relevant thoughts, context, or additional information that could enhance the answer. If any uncertainties exist in the result, the model is asked to fill in the gaps.
  - **Step 4**: Ensure that the tone of the response is friendly, helpful, and professional.
  - **Step 5**: Make sure the answer is well-structured, easy to understand, and addresses the query comprehensively.

- **Output**: The function returns the `llm_response`, which contains the generated answer to the user's query.

- **Result**: The combination of search results and model-generated insights ensures that the response is both informed and contextual, providing a richer and more complete answer than simply returning search results.

In [None]:
def answer_to_question_with_rag_implementation(query:str):
  result = result_search_from_rag.invoke(query) or " "
  llm_response = llm.invoke(f"""
You are a highly knowledgeable assistant with access to information from a database. Your task is to answer user queries using the search results retrieved from the database, while also providing additional insights based on your own understanding.

Here is how you should process the request:
1. First, carefully examine the search result retrieved from the database for the query: '{result}'.
2. Using the result, craft a detailed response to the query '{query}', ensuring that you provide a thorough explanation based on the information you have.
3. Additionally, include any relevant thoughts, context, or advice that might help in fully addressing the query. If there are any ambiguities or uncertainties in the result, feel free to elaborate or share your own understanding to fill in the gaps.
4. Ensure that the tone of your response is friendly, helpful, and professional, as you are acting as an assistant providing valuable information.
5. Lastly, make sure your response is well-structured, easy to understand, and provides a complete answer to the query.
""")

  return llm_response
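In the function above, the retrieved list is interpolated into the prompt as-is, so the model sees the documents' full printed representation, including metadata. If only the text is wanted, the contents can be joined first. This is a small sketch using a stand-in class so that it runs without LangChain. `FakeDocument` and `format_context` are hypothetical names, and the sample text is for illustration only:

```python
from dataclasses import dataclass, field


@dataclass
class FakeDocument:
    """Stand-in for a retrieved document; only the attributes used here."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_context(documents) -> str:
    """Join the text of retrieved documents into a numbered context block."""
    if not documents:
        return "(no results)"
    return "\n\n".join(
        f"[{number}] {doc.page_content}"
        for number, doc in enumerate(documents, start=1)
    )


retrieved = [
    FakeDocument("Genre: Horror\nPlot: Plot for a Horror movie involving unique twists."),
    FakeDocument("Genre: Sci-Fi\nPlot: Plot for a Sci-Fi movie involving unique twists."),
]
print(format_context(retrieved))
```

The returned string could replace `{result}` in the prompt, which keeps the prompt shorter and easier for the model to read.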

# Below Is the Testing of the Code; You Can See the Results for Yourself

In [None]:
query_result = answer_to_question_with_rag_implementation("What is the Genre Of Horror Movies")
display(Markdown(query_result.content))

Okay, I've reviewed the search results you provided. It seems you're asking about the genre of horror movies, specifically based on the data I have.

Here's what I can tell you:

Based on the provided documents, the genre is explicitly stated as **Horror**. Each of the three documents lists "Genre: Horror" followed by a plot description. This indicates that the data source is focused on generating or describing plots specifically for horror movies.

**Additional Insights and Context:**

While the documents themselves don't provide a definition of what constitutes a horror movie, we can infer some things based on common understanding:

*   **Purpose:** Horror movies are designed to elicit feelings of fear, dread, shock, and suspense in the audience.
*   **Common Elements:** They often involve elements such as:
    *   **Monsters/Antagonists:** These can be supernatural (ghosts, demons), natural (animals, diseases), or human (serial killers, psychopaths).
    *   **Threats to Safety:** Characters are typically placed in dangerous situations where their physical or psychological well-being is at risk.
    *   **Suspense and Tension:** The build-up of fear is often as important as the scares themselves.
    *   **Gore and Violence:** While not always present, many horror films utilize graphic imagery to enhance the feeling of dread.
    *   **Psychological Horror:** Some horror films focus on the mental state of characters, exploring themes of paranoia, madness, and trauma.
*   **Subgenres:** The horror genre is very broad and includes many subgenres, such as:
    *   **Slasher:** Focuses on a killer stalking and murdering victims.
    *   **Supernatural Horror:** Involves ghosts, demons, and other paranormal entities.
    *   **Psychological Horror:** Focuses on the mental and emotional states of characters.
    *   **Monster Movies:** Features creatures or monsters as the primary threat.
    *   **Found Footage:** Presents the story as if it were discovered recordings.

**In Summary:**

The search results confirm that the genre being discussed is **Horror**. While the provided data focuses on plot generation, it's important to remember that the horror genre is diverse and encompasses a wide range of themes, styles, and subgenres, all aiming to create a sense of fear and unease in the viewer.

I hope this explanation is helpful! If you have any more questions about horror movies or other genres, feel free to ask.


In [None]:
query_result_1 = answer_to_question_with_rag_implementation("What is Sci-fi")
display(Markdown(query_result_1.content))

Okay, I've examined the search results you provided. It seems you're asking "What is Sci-fi?", and the database returned three entries, all related to Sci-Fi movie plots. Here's a breakdown and explanation:

**Based on the Search Results:**

The search results all contain the following structure:

*   **Genre: Sci-Fi**
*   **Plot: Plot for a Sci-Fi movie involving unique twists.**

From this, we can directly infer that "Sci-Fi" is a **genre**, specifically a genre of movies. The plots associated with it are described as involving "unique twists," suggesting that this genre often incorporates elements of surprise, innovation, or unexpected developments.

**Expanding on the Definition of Sci-Fi:**

While the database entries are quite basic, I can provide a more comprehensive explanation of what Sci-Fi (or Science Fiction) is:

*   **Core Elements:** Science fiction is a genre of speculative fiction that typically deals with imaginative and futuristic concepts such as:
    *   **Advanced Technology:** This often includes things like spaceships, robots, artificial intelligence, advanced weaponry, and other technological marvels that go beyond our current capabilities.
    *   **Space Exploration:** Many sci-fi stories revolve around journeys to other planets, galaxies, or dimensions.
    *   **Time Travel:** The ability to move through time is a common theme, exploring paradoxes and alternate realities.
    *   **Dystopian/Utopian Societies:** Sci-fi often explores the potential consequences of societal structures, often depicting either ideal or deeply flawed future worlds.
    *   **Extraterrestrial Life:** Encounters with alien species are a staple of the genre.
    *   **Scientific and Social Change:** Sci-fi frequently examines the impact of scientific advancements on society and individuals.

*   **Themes and Exploration:** Beyond the technological aspects, Sci-Fi often delves into deeper philosophical and social themes, such as:
    *   **Humanity's Place in the Universe:** What does it mean to be human in a vast and potentially hostile cosmos?
    *   **The Ethics of Technology:** What are the moral implications of advanced scientific capabilities?
    *   **The Nature of Consciousness:** What does it mean to be aware, and can artificial intelligence achieve it?
    *   **Social Commentary:** Sci-fi can be used to critique contemporary society by projecting current trends into the future.

*   **Beyond Movies:** While the database entries focus on movie plots, Sci-Fi is a genre that extends far beyond cinema. It encompasses:
    *   **Literature:** Novels, short stories, and graphic novels.
    *   **Television:** Series and mini-series.
    *   **Video Games:** Interactive narratives and world-building.
    *   **Art and Music:** Sci-fi themes and aesthetics are often explored in various artistic mediums.

**In Summary:**

Sci-Fi, or Science Fiction, is a genre that uses imaginative and often futuristic concepts, grounded in scientific or technological possibilities, to explore a wide range of themes and ideas. It's not just about spaceships and robots; it's about exploring the human condition and our potential future in a universe shaped by science and technology. The "unique twists" mentioned in the database entries are a hallmark of the genre, as it often seeks to surprise and challenge our expectations.

**Additional Thoughts and Context:**

The database entries are very basic, suggesting that they are likely part of a larger dataset used for generating or categorizing movie plots. The repeated phrase "unique twists" highlights a common characteristic of successful sci-fi stories – the ability to offer something unexpected and thought-provoking.

I hope this explanation is helpful! If you have any more questions about Sci-Fi or any other topic, please feel free to ask.


In [None]:
query_result_2 = answer_to_question_with_rag_implementation("What is Drama")
display(Markdown(query_result_2.content))

Okay, I've examined the search results you provided. It seems you're asking about the genre "Drama" based on the information in the database.

Here's what I can tell you:

The search results all indicate the following:

*   **Genre: Drama**
*   **Plot: Plot for a Drama movie involving unique twists.**

Based on this, we can say that "Drama" is a movie genre. The provided data also suggests that plots within the Drama genre often involve "unique twists."

**Here's a more comprehensive explanation of the Drama genre, drawing on my general knowledge:**

The Drama genre is one of the most fundamental and versatile in film. It typically focuses on realistic characters and their emotional journeys, often exploring complex themes and relationships. Here's a breakdown of what makes a movie a "Drama":

*   **Character-Driven:** Dramas are usually centered around the development and experiences of their characters. We often see characters facing internal conflicts, external challenges, and significant life events.
*   **Emotional Depth:** These films aim to evoke strong emotions in the audience, such as sadness, joy, anger, or empathy. They delve into the human condition and explore the complexities of life.
*   **Realistic Settings and Situations:** While not always the case, many dramas strive for realism in their settings, dialogue, and situations. This helps the audience connect with the characters and their struggles.
*   **Exploration of Themes:** Dramas often tackle important social, political, or personal themes, such as love, loss, family, identity, justice, and morality.
*   **Variety of Subgenres:** The Drama genre is incredibly broad and encompasses many subgenres, including:
    *   **Family Dramas:** Focus on the dynamics and relationships within a family.
    *   **Legal Dramas:** Center around courtrooms, lawyers, and legal cases.
    *   **Historical Dramas:** Set in the past and often based on real events.
    *   **Romantic Dramas:** Explore love and relationships, often with a focus on emotional conflict.
    *   **Social Dramas:** Deal with social issues and injustices.

**Regarding the "unique twists" mentioned in the database:**

The fact that the plots are described as having "unique twists" suggests that even within the Drama genre, there's room for unexpected turns and surprises. This could mean that the stories aren't always straightforward and might incorporate elements of suspense or mystery to keep the audience engaged. It's a way to add a layer of complexity to the emotional core of the drama.

**In Summary:**

Drama is a genre that focuses on character development, emotional depth, and realistic situations, often exploring complex themes. It is a very broad genre with many subgenres. The search results indicate that plots within the Drama genre can also include "unique twists," which can make them more engaging and unpredictable.

I hope this explanation is helpful! If you have any further questions about the Drama genre or any other topic, please feel free to ask.
