In [37]:
# Installing Required Libraries
%pip install python-docx
%pip install python-pptx
%pip install PyPDF2
%pip install langchain
%pip install langchain_community
%pip install langchain_google_genai
%pip install langchain_text_splitters
%pip install sentence-transformers
%pip install faiss-cpu
%pip install cohere



In [38]:
# Necessary imports
from docx import Document
from PyPDF2 import PdfReader
from pptx import Presentation
from langchain_community.llms import Cohere
from langchain_community.vectorstores import FAISS
from langchain_google_genai import GoogleGenerativeAI
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_community.embeddings import HuggingFaceEmbeddings  # the langchain.embeddings path is deprecated
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser
from langchain_core.messages import AIMessage, HumanMessage
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_core.prompts import PromptTemplate, ChatPromptTemplate, MessagesPlaceholder

In [44]:

pdf_file = open('/content/The-Brothers-Karamazov.pdf', 'rb')

In [45]:
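# Extract the text of every page and concatenate it into one string
pdf_text = ""
pdf_reader = PdfReader(pdf_file)
for page in pdf_reader.pages:
    # Guard against pages that yield no text (for example, scanned images)
    pdf_text += page.extract_text() or ""
pdf_file.close()  # the file handle is no longer needed once the text is extracted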

In [46]:
all_text = pdf_text
len(all_text)

1955293

In [47]:
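# Split the text into overlapping chunks for embedding

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,      # maximum characters per chunk
    chunk_overlap=200,    # characters shared between consecutive chunks
    length_function=len,
    # Note: '\n' is listed before '\n\n', so the text is split on single newlines first
    # and the paragraph separator '\n\n' is effectively never used.
    separators=['\n', '\n\n', ' ', '']
)

chunks = text_splitter.split_text(text=all_text)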

In [48]:
len(chunks)

2489
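
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of a fixed sliding-window chunker. It is a simplification and not the algorithm `RecursiveCharacterTextSplitter` uses (that one also tries to break on the given separators); the function name `sliding_chunks` is hypothetical.

```python
def sliding_chunks(text, chunk_size=1000, chunk_overlap=200):
    """Cut `text` into windows of `chunk_size` characters.

    Consecutive windows share `chunk_overlap` characters.
    Simplified illustration only: no separator handling.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    stride = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), stride):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


demo = sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=2)
print(demo)
```

With a window of 1000 and an overlap of 200, each window advances by 800 characters, so a text of 1955293 characters yields a chunk count in the low 2400s under this simplified scheme. That is the same order of magnitude as the 2489 chunks reported above; the separator-aware splitter produces somewhat more because it cuts at newlines and spaces rather than at exact offsets.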

In [49]:
import os
# Placeholders only: replace with your own keys and never commit real keys to a notebook
os.environ['HuggingFaceHub_API_Token'] = 'paste_your_huggingface_token'
os.environ['GOOGLE_API_KEY'] = "AIzaSyB___your_google_api_key__20NTA"
os.environ['cohere_api_key'] = "UC3xPd____your_cohere_api_key___kKgboZKBP"

In [50]:
# Initializing embeddings model

embeddings = HuggingFaceEmbeddings(model_name='sentence-transformers/all-MiniLM-L6-v2')

In [51]:
# Indexing the data using FAISS
vectorstore = FAISS.from_texts(chunks, embedding = embeddings)

In [52]:
# creating retriever
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 6})
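
The retriever above returns the `k` chunks whose embeddings are closest to the query embedding. The sketch below shows that idea with a brute-force search in plain Python. It assumes Euclidean (L2) distance, which as far as I know is the default for an index built by `FAISS.from_texts`; the function name `top_k_l2` and the toy 2-D vectors are hypothetical, and real embeddings from `all-MiniLM-L6-v2` have 384 dimensions.

```python
import math


def top_k_l2(query_vec, doc_vecs, k=6):
    """Return the indices of the k vectors closest to query_vec (L2 distance)."""
    scored = [(math.dist(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    scored.sort()  # smallest distance first
    return [idx for _, idx in scored[:k]]


# Toy 2-D "embeddings" standing in for the chunk vectors
doc_vecs = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [0.1, 0.1]]
nearest = top_k_l2([0.0, 0.0], doc_vecs, k=2)
print(nearest)
```

A real FAISS index performs the same ranking with optimized native code rather than a Python loop.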

In [56]:
retrieved_docs = retriever.invoke("Fyodor Pavlovitch Karamazov")

In [57]:
len(retrieved_docs)

6

In [58]:
print(retrieved_docs[0].page_content)

plicity. At last he succeeded in getting on the track of his runaway wife. The poor woman
turned out to be in Petersburg, where she had gone with her divinity student, and where
she had thrown herself into a life of complete emancipation. Fyodor Pavlovitch at once
began bustling about, making preparations to go to Petersburg, with what object he could
not himself have said. He would perhaps have really gone; but having determined to do so
2Chapter 1 - Fyodor P avlovitch Kar amazo vhe felt at once entitled to fortify himself for the journey by another bout of reckless drinking.
And just at that time his wife's family received the news of her death in Petersburg. She had
died quite suddenly in a garret, according to one story, of typhus, or as another version had
it, of starvation. Fyodor Pavlovitch was drunk when he heard of his wife's death, and the
story is that he ran out into the street and began shouting with joy, raising his hands to


In [59]:
prompt_template = """Answer the question as precisely as possible using the provided context. If the answer is
                not contained in the context, say "answer not available in context" \n\n
                Context: \n {context}\n
                Question: \n {question} \n
                Answer:"""

prompt = PromptTemplate.from_template(template=prompt_template)

In [60]:
# Join the page contents of the documents returned by FAISS into a single string.
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)
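
To see what the model actually receives, the sketch below joins a few documents with `format_docs` and fills a shortened template. `FakeDoc`, `TEMPLATE`, and the sample passages are hypothetical stand-ins so the example runs without LangChain; the real documents come from the retriever and the real template is `prompt_template`.

```python
from dataclasses import dataclass


@dataclass
class FakeDoc:
    """Hypothetical stand-in for a retrieved document; only page_content is used."""
    page_content: str


def format_docs(docs):
    # Same logic as the notebook's helper: blank line between passages
    return "\n\n".join(doc.page_content for doc in docs)


# Shortened stand-in for prompt_template
TEMPLATE = "Context:\n{context}\nQuestion:\n{question}\nAnswer:"

docs = [FakeDoc("Alyosha was the third son."), FakeDoc("Ivan was the second son.")]
context = format_docs(docs)
filled = TEMPLATE.format(context=context, question="Who was the third son?")
print(filled)
```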

In [61]:
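# RAG chain: retrieve context, fill the prompt, call the LLM, parse the reply to a string

def generate_answer(question):
    # The LLM client and the chain are rebuilt on every call, which is fine for a demo
    cohere_llm = Cohere(model="command", temperature=0.1, cohere_api_key=os.getenv('cohere_api_key'))

    rag_chain = (
        # The question goes to the retriever for context and is also passed through unchanged
        {"context": retriever | format_docs, "question": RunnablePassthrough()}
        | prompt
        | cohere_llm
        | StrOutputParser()
    )

    return rag_chain.invoke(question)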
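
The piped chain can be read as four ordinary function calls applied in sequence. The following plain-Python sketch mirrors that data flow under stated assumptions: `fake_retriever` and `fake_llm` are hypothetical stubs (no embeddings, no API call), and `TEMPLATE` is a shortened stand-in for the notebook's prompt. It illustrates the order of operations, not how LangChain implements the `|` operator.

```python
PASSAGES = [
    "Alyosha was the third son of Fyodor Pavlovitch.",
    "Ivan was the second son.",
]

TEMPLATE = "Context:\n{context}\nQuestion:\n{question}\nAnswer:"


def fake_retriever(question):
    # Hypothetical stub: a real retriever would rank passages by similarity
    return list(PASSAGES)


def format_passages(passages):
    # Plays the role of format_docs for plain strings
    return "\n\n".join(passages)


def fake_llm(prompt_text):
    # Hypothetical stub: reports the prompt size instead of generating an answer
    return f"(stub reply to a {len(prompt_text)}-character prompt)"


def run_chain(question):
    # 1. Dict stage: both values are computed from the same input question
    inputs = {"context": format_passages(fake_retriever(question)), "question": question}
    # 2. Prompt stage: fill the template
    prompt_text = TEMPLATE.format(**inputs)
    # 3. Model stage
    raw = fake_llm(prompt_text)
    # 4. Parser stage: ensure a plain string is returned
    return str(raw)


reply = run_chain("Who was the third son?")
print(reply)
```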

In [62]:
ans = generate_answer("who was the third son of Fyodor Pavlovitch Karamazov?")
print(ans)

 Alyosha


In [63]:
ans = generate_answer("who are the brothers?")
print(ans)

 The Brothers Karamazov are the three sons of Fyodor Pavlovitch Karamazov, they are Dmitry, Ivan and Alyosha. 


In [64]:
ans = generate_answer("And always so, all our lives hand in hand! Hurrah for Karamazov!")
print(ans)

 The end of the passage is a part of the novel "The Brothers Karamazov" by Fyodor Dostoevsky, where the protagonist, Ivan Karamazov, is defending his brother, Dimitry Karamazov, who is on trial for the murder of their father. 

The passage describes Dimitry's frantic plan to commit suicide as a way out of his terrible position as a criminal under sentence. The narrator emphasizes Dimitry's complex character, describing him as a man with a broad Karamazov nature capable of combining incongruous contradictions, experiencing the greatest heights and depths.

The quote "And always so, all our lives hand in hand! Hurrah for Karamazov!" is a phrase exclaimed by Kolya, a young boy who is a close friend of Dimitry and the son of a local merchant. This quote highlights the enthusiasm and emotional connection that the characters feel towards each other. 

The author uses this quote to illustrate the complex and contradictory nature of human emotions and experiences, and the deep bonds that can f

In [65]:
ans = generate_answer("who wrote the brothers?")
print(ans)

 Fyodor Dostoevsky


In [66]:
ans = generate_answer("What is the primary goal of the project?")
print(ans)

 The primary goal of the project is to use simple tales from religious texts to "penetrate hearts" and spread grace to people where they are, in their poverty and without immediate expectation of material reward. 


In [67]:
ans = generate_answer("What is the objective of this project?")
print(ans)

 The objective of this project is to subdue a flock of millions and make them obey whoever is behind this project. 
