## Install required packages

In [1]:
pip install langchain langchain-community




In [2]:
pip install google-generativeai

Note: you may need to restart the kernel to use updated packages.


In [3]:
pip install langchain-google-genai

Note: you may need to restart the kernel to use updated packages.


In [4]:
pip install faiss-cpu

Note: you may need to restart the kernel to use updated packages.


## Import modules

In [37]:
import google.generativeai as genai
from langchain_community.document_loaders import TextLoader
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain_community.vectorstores import FAISS

## Initializing an instance of a Generative Model

In [66]:
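# Needs an API key: call genai.configure(api_key=...) first, or set the GOOGLE_API_KEY environment variable.
model = genai.GenerativeModel('gemini-pro')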

## Load the documents

In [48]:
# Load the text file into a list of LangChain Document objects.
raw_documents = TextLoader('xyz.txt').load()

## Chunking the document

In [49]:
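# Rule of thumb: keep chunk_overlap at roughly 1/5 of chunk_size.
# CharacterTextSplitter splits only on "\n\n" by default, so paragraphs longer than
# chunk_size are kept whole and trigger the warnings below.
text_splitter = CharacterTextSplitter(chunk_size=20, chunk_overlap=5)
documents = text_splitter.split_documents(raw_documents)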

Created a chunk of size 471, which is longer than the specified 20
Created a chunk of size 727, which is longer than the specified 20
Created a chunk of size 673, which is longer than the specified 20
Created a chunk of size 399, which is longer than the specified 20
Created a chunk of size 335, which is longer than the specified 20
Created a chunk of size 687, which is longer than the specified 20
Created a chunk of size 392, which is longer than the specified 20
Created a chunk of size 406, which is longer than the specified 20
Created a chunk of size 401, which is longer than the specified 20
Created a chunk of size 676, which is longer than the specified 20
Created a chunk of size 319, which is longer than the specified 20
Created a chunk of size 316, which is longer than the specified 20
Created a chunk of size 218, which is longer than the specified 20
Created a chunk of size 527, which is longer than the specified 20
Created a chunk of size 310, which is longer than the specifie
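
The warnings above appear because the splitter cuts only at its separator and never inside a paragraph, so any paragraph longer than `chunk_size` comes out as a single oversized chunk. The following is a minimal pure-Python sketch of that behavior, not the library's actual implementation: `split_by_separator` and `sample_text` are hypothetical names, and chunk overlap is left out to keep it short.

```python
def split_by_separator(text, separator="\n\n", chunk_size=20):
    """Greedy separator-based splitting that never cuts inside a piece."""
    # Split on the separator and drop empty pieces.
    pieces = [p.strip() for p in text.split(separator) if p.strip()]
    chunks, current = [], ""
    for piece in pieces:
        # Try to merge the next piece into the chunk being built.
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            # The merged text would be too long: close the current chunk.
            if current:
                chunks.append(current)
            # A piece longer than chunk_size is still kept whole.
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample_text = "First paragraph, well over twenty characters.\n\nShort.\n\nTiny."
chunks = split_by_separator(sample_text)
oversized = [c for c in chunks if len(c) > 20]
for c in chunks:
    print(len(c), repr(c))
```

In this sketch the two short paragraphs are merged into one chunk, while the long first paragraph exceeds the limit and is emitted as-is. If chunks must respect the size limit, one option is a larger `chunk_size` or a splitter that falls back to finer separators.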

## Printing Chunks

In [77]:
# list() on a Document yields its (field, value) pairs.
for chunk in documents:
    print(list(chunk))

[('page_content', "LLM, short for Master of Laws, is an advanced and specialized postgraduate degree in law. It's designed for individuals who already possess a foundational understanding of law and wish to deepen their expertise in a specific area. The curriculum typically covers a wide array of legal subjects, offering students the flexibility to focus on specialized fields like international law, corporate law, intellectual property, human rights, or environmental law, among others."), ('metadata', {'source': 'C:\\Users\\Kushal.Sharma\\Desktop\\Example_Rag.txt'}), ('type', 'Document')]
[('page_content', 'The pursuit of an LLM offers numerous benefits. It allows individuals to gain in-depth knowledge, critical analysis skills, and a nuanced understanding of legal frameworks within their chosen concentration. This advanced degree can significantly enhance career prospects, offering opportunities in academia, law practice, government, or international organizations.Moreover, an LLM oft

## Creating embeddings and storing in vectorstore

In [46]:

# Name of the Google embedding model to use
model_name = "models/embedding-001"
api_key = "your_api_key"  # Replace with your Google API key

embeddingsss = GoogleGenerativeAIEmbeddings(model=model_name, google_api_key=api_key)  # embedding client

db = FAISS.from_documents(documents, embeddingsss)  # embed every chunk and index it in FAISS
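
Hardcoding the key in a notebook makes it easy to leak when the notebook is shared. A minimal sketch of one alternative is to read it from an environment variable. The helper name `load_api_key` is hypothetical, and the variable name `GOOGLE_API_KEY` is an assumption about how the key was exported.

```python
import os


def load_api_key(env_var="GOOGLE_API_KEY"):
    """Return the API key stored in an environment variable, or fail loudly."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before running this cell.")
    return key
```

With this in place, `api_key = load_api_key()` could replace the literal string in the cell above.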


## Writing Query

In [78]:
query = "What is chatgpt"

## Retrieving matching results from db (vectorstore)

In [79]:
retriever = db.as_retriever()
docs = retriever.get_relevant_documents(query)  # newer LangChain releases prefer retriever.invoke(query)
docs

[Document(page_content="The model's capabilities encompass understanding nuances in language, syntax, grammar, semantics, and context, allowing it to engage in diverse and coherent conversations. Whether answering questions, assisting with tasks, or engaging in casual dialogue, ChatGPT aims to simulate human-like conversation and generate text that feels natural and contextually appropriate.", metadata={'source': 'C:\\Users\\Kushal.Sharma\\Desktop\\Example_Rag.txt'}),
 Document(page_content='The versatility of ChatGPT allows it to be deployed in numerous applications, including customer service, content generation, language translation, and educational assistance. Its adaptability to different domains and its ability to maintain engaging dialogues have made it a cornerstone in revolutionizing human-computer interactions.', metadata={'source': 'C:\\Users\\Kushal.Sharma\\Desktop\\Example_Rag.txt'}),
 Document(page_content='ChatGPT is based on a deep learning architecture called the Trans

## similarity_search_by_vector in db(vectorstore)

In [27]:
query = "what is chatgpt "
embedding_vector = embeddingsss.embed_query(query)  # embed the query text explicitly
docs = db.similarity_search_by_vector(embedding_vector)  # returns Documents only, no scores
print(docs[0].page_content)

The model's capabilities encompass understanding nuances in language, syntax, grammar, semantics, and context, allowing it to engage in diverse and coherent conversations. Whether answering questions, assisting with tasks, or engaging in casual dialogue, ChatGPT aims to simulate human-like conversation and generate text that feels natural and contextually appropriate.


## similarity_search_with_score in db(vectorstore)

In [28]:
results_with_scores = db.similarity_search_with_score(query)
for doc, score in results_with_scores:
    print(f"Content: {doc.page_content}, Metadata: {doc.metadata}, Score: {score}")

Content: The model's capabilities encompass understanding nuances in language, syntax, grammar, semantics, and context, allowing it to engage in diverse and coherent conversations. Whether answering questions, assisting with tasks, or engaging in casual dialogue, ChatGPT aims to simulate human-like conversation and generate text that feels natural and contextually appropriate., Metadata: {'source': 'C:\\Users\\Kushal.Sharma\\Desktop\\Example_Rag.txt'}, Score: 0.29974237084388733
Content: The versatility of ChatGPT allows it to be deployed in numerous applications, including customer service, content generation, language translation, and educational assistance. Its adaptability to different domains and its ability to maintain engaging dialogues have made it a cornerstone in revolutionizing human-computer interactions., Metadata: {'source': 'C:\\Users\\Kushal.Sharma\\Desktop\\Example_Rag.txt'}, Score: 0.31110721826553345
Content: ChatGPT is based on a deep learning architecture called th
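
The scores printed above are distances, not similarities. Assuming the store was built with the default L2 index, a lower score means a closer match, which is consistent with the first result having the smallest value. The sketch below shows that ranking on toy two-dimensional vectors. The function names and `toy_vectors` are hypothetical, and real embeddings have far more dimensions.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rank_by_distance(query_vec, doc_vecs, k=4):
    """Return the k (name, distance) pairs closest to the query, nearest first."""
    scored = [(name, squared_l2(query_vec, vec)) for name, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1])[:k]


toy_vectors = {
    "chunk_a": [0.9, 0.1],
    "chunk_b": [0.5, 0.5],
    "chunk_c": [0.0, 1.0],
}
ranked = rank_by_distance([1.0, 0.0], toy_vectors, k=2)
for name, distance in ranked:
    print(name, round(distance, 3))
```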

## similarity_search in db(vectorstore)

In [44]:
docs = db.similarity_search(query)
docs

[Document(page_content="The model's capabilities encompass understanding nuances in language, syntax, grammar, semantics, and context, allowing it to engage in diverse and coherent conversations. Whether answering questions, assisting with tasks, or engaging in casual dialogue, ChatGPT aims to simulate human-like conversation and generate text that feels natural and contextually appropriate.", metadata={'source': 'C:\\Users\\Kushal.Sharma\\Desktop\\Example_Rag.txt'}),
 Document(page_content='The versatility of ChatGPT allows it to be deployed in numerous applications, including customer service, content generation, language translation, and educational assistance. Its adaptability to different domains and its ability to maintain engaging dialogues have made it a cornerstone in revolutionizing human-computer interactions.', metadata={'source': 'C:\\Users\\Kushal.Sharma\\Desktop\\Example_Rag.txt'}),
 Document(page_content='ChatGPT is based on a deep learning architecture called the Trans

## Fetching Top K Results from db(vectorstore)

In [80]:
results = db.similarity_search(query, k=10)
for doc in results:
    print(f"Content: {doc.page_content}, Metadata: {doc.metadata}")

Content: The model's capabilities encompass understanding nuances in language, syntax, grammar, semantics, and context, allowing it to engage in diverse and coherent conversations. Whether answering questions, assisting with tasks, or engaging in casual dialogue, ChatGPT aims to simulate human-like conversation and generate text that feels natural and contextually appropriate., Metadata: {'source': 'C:\\Users\\Kushal.Sharma\\Desktop\\Example_Rag.txt'}
Content: The versatility of ChatGPT allows it to be deployed in numerous applications, including customer service, content generation, language translation, and educational assistance. Its adaptability to different domains and its ability to maintain engaging dialogues have made it a cornerstone in revolutionizing human-computer interactions., Metadata: {'source': 'C:\\Users\\Kushal.Sharma\\Desktop\\Example_Rag.txt'}
Content: ChatGPT is based on a deep learning architecture called the Transformer model, specifically trained on vast amount

## Generating Final Prompt

In [70]:
prompt = f"Given the following text chunks:\n{results}\nGenerate a response to the query: {query}"
prompt

'Given the following text chunks:\n[Document(page_content="The model\'s capabilities encompass understanding nuances in language, syntax, grammar, semantics, and context, allowing it to engage in diverse and coherent conversations. Whether answering questions, assisting with tasks, or engaging in casual dialogue, ChatGPT aims to simulate human-like conversation and generate text that feels natural and contextually appropriate.", metadata={\'source\': \'C:\\\\Users\\\\Kushal.Sharma\\\\Desktop\\\\Example_Rag.txt\'}), Document(page_content=\'The versatility of ChatGPT allows it to be deployed in numerous applications, including customer service, content generation, language translation, and educational assistance. Its adaptability to different domains and its ability to maintain engaging dialogues have made it a cornerstone in revolutionizing human-computer interactions.\', metadata={\'source\': \'C:\\\\Users\\\\Kushal.Sharma\\\\Desktop\\\\Example_Rag.txt\'}), Document(page_content=\'Chat
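
As the output above shows, interpolating `results` directly puts the full `Document(...)` representation, including file paths from the metadata, into the prompt. A possible alternative is to join only the chunk text. The sketch below uses a hypothetical `Doc` dataclass as a stand-in for the LangChain `Document` class so that it runs on its own, and `build_prompt` is likewise an illustrative name.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document (only the fields used here)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def build_prompt(docs, query):
    """Build the prompt from chunk text only, leaving out metadata and repr noise."""
    context = "\n\n".join(doc.page_content for doc in docs)
    return (
        f"Given the following text chunks:\n{context}\n"
        f"Generate a response to the query: {query}"
    )


example_docs = [
    Doc("First retrieved chunk.", {"source": "xyz.txt"}),
    Doc("Second retrieved chunk.", {"source": "xyz.txt"}),
]
example_prompt = build_prompt(example_docs, "What is chatgpt")
print(example_prompt)
```

Because the function only reads `page_content`, it should work the same way on the `results` list returned by `db.similarity_search`.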

## API Call to Generate Response

In [76]:
response = model.generate_content(prompt)
print(response.text)  # The response is generated from the top-k chunks stored in results

ChatGPT is a conversational AI model from Google that's trained to understand and generate human language. It's proficient in understanding context, structure, and nuances in conversations, enabling it to provide informative and contextually relevant responses. Its extensive training on vast amounts of text data allows it to generate coherent text across diverse domains and conversational contexts. ChatGPT finds application in customer service, content generation, language translation, educational assistance, and more.


## Overall Code 

In [None]:
pip install langchain langchain-community
pip install google-generativeai
pip install langchain-google-genai
pip install faiss-cpu

In [None]:
import google.generativeai as genai
from langchain_community.document_loaders import TextLoader
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain_community.vectorstores import FAISS


# Load the document, split it into chunks, embed each chunk, and load it into the vector store.
raw_documents = TextLoader('xyz.txt').load()
text_splitter = CharacterTextSplitter(chunk_size=20, chunk_overlap=5)  # rule of thumb: overlap is roughly 1/5 of chunk_size
documents = text_splitter.split_documents(raw_documents)

for chunk in documents:
    print(chunk)

# Name of the Google embedding model to use
model_name = "models/embedding-001"
api_key = "your_api_key"  # Replace with your Google API key

embeddingsss = GoogleGenerativeAIEmbeddings(model=model_name, google_api_key=api_key)

db = FAISS.from_documents(documents, embeddingsss)

query = "what is chatgpt "
embedding_vector = embeddingsss.embed_query(query)
docs = db.similarity_search_by_vector(embedding_vector)  # returns Documents only, no scores

print(docs[0].page_content)

results_with_scores = db.similarity_search_with_score(query)
for doc, score in results_with_scores:
    print(f"Content: {doc.page_content}, Metadata: {doc.metadata}, Score: {score}")

docs = db.similarity_search(query)
docs

results = db.similarity_search(query, k=10)
for doc in results:
    print(f"Content: {doc.page_content}, Metadata: {doc.metadata}")

prompt = f"Given the following text chunks:\n{results}\nGenerate a response to the query: {query}"
prompt

genai.configure(api_key=api_key)  # authenticate the generative client with the same key
model = genai.GenerativeModel('gemini-pro')
response = model.generate_content(prompt)
print(response.text)