## Load the document

In [3]:
# Recent LangChain releases ship document loaders in the langchain-community package
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("roman-empire.pdf")

In [4]:
loaded = loader.load()

print(loaded[0].page_content[:1000])  # Print the first 1000 characters of the first page

Download Testbook App 
 
 
 
The Roman Republic, founded in 509 BC was a nation-state of the classical Roman Era that was governed 
by the citizens of Rome. After the Roman Republic was defeated, the Roman Empire was founded in 27 
BCE, and it lasted until the Western Empire finally eclipsed it in the fifth century CE. The Roman Empire 
was centred on the city of Rome. Prior to the Roman Republic, which began in 27 BC and ended in 476 
AD, there was a definite political, social, and cultural heritage that is still evident today. 
Roman Empire is one of the important topics for UPSC IAS exam. It also covers a significant part of GS 
paper-1 syllabus. In this article, we shall discuss the Holy Roman Empire, Roman Civilization, the history 
of Rome, decline and fall of the Roman empire. 
Download World History UPSC Notes here! 
 
https://blogmedia.testbook.com/blog/wp-content/uploads/2022/07/roman-empire-1-a94931b8.png 
 
Roman 
Empire 
UPSC World History Notes


## Chunking

In [5]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200,
)

chunks = text_splitter.split_documents(loaded)

len(chunks)

print(chunks[0].page_content)

Download Testbook App 
 
 
 
The Roman Republic, founded in 509 BC was a nation-state of the classical Roman Era that was governed 
by the citizens of Rome. After the Roman Republic was defeated, the Roman Empire was founded in 27 
BCE, and it lasted until the Western Empire finally eclipsed it in the fifth century CE. The Roman Empire 
was centred on the city of Rome. Prior to the Roman Republic, which began in 27 BC and ended in 476 
AD, there was a definite political, social, and cultural heritage that is still evident today. 
Roman Empire is one of the important topics for UPSC IAS exam. It also covers a significant part of GS 
paper-1 syllabus. In this article, we shall discuss the Holy Roman Empire, Roman Civilization, the history 
of Rome, decline and fall of the Roman empire. 
Download World History UPSC Notes here! 
 
https://blogmedia.testbook.com/blog/wp-content/uploads/2022/07/roman-empire-1-a94931b8.png 
 
Roman 
Empire 
UPSC World History Notes
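
To make `chunk_size` and `chunk_overlap` concrete, here is a minimal fixed-window sketch. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on separators such as paragraph and line breaks); `split_with_overlap` and `sample` are hypothetical names used only for illustration.

```python
def split_with_overlap(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Cut text into windows of chunk_size characters, each sharing
    chunk_overlap characters with the previous window."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Synthetic 2500-character string standing in for a page of text
sample = "".join(str(i % 10) for i in range(2500))
pieces = split_with_overlap(sample)
print([len(p) for p in pieces])
print(pieces[0][-200:] == pieces[1][:200])  # the tail of one chunk is the head of the next
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.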


## Embeddings

In [6]:
from dotenv import load_dotenv
load_dotenv()

True

In [7]:
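import os

# load_dotenv() has already copied the .env entries into os.environ,
# so the key only needs to be checked, not re-assigned.
COHERE_API_KEY = os.getenv('COHERE_API_KEY')
if COHERE_API_KEY is None:
    raise RuntimeError("COHERE_API_KEY is not set; add it to your .env file")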



In [8]:
from langchain_cohere import CohereEmbeddings

embeddings = CohereEmbeddings(model="embed-english-v3.0")

## Vector database

In [9]:
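# Recent LangChain releases ship vector store integrations in the langchain-community package
from langchain_community.vectorstores import FAISS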

vectorstore = FAISS.from_documents(chunks, embeddings)

retriever = vectorstore.as_retriever()

res = vectorstore.similarity_search("Who was Julius Caesar?", k=10)
print(res[0].page_content)



Page - 3 
 
Download Testbook App 
73 BC The slave rebellion is led by the gladiator Spartacus. 
45 BC As the first dictator of Rome, Julius Caesar.  
In order to take over as the only ruler of Rome, Caesar performs his renowned Crossing of the 
Rubicon and defeats Pompey in a civil war.  
The Roman Republic is over as a result of this. 
44 BC Marcus Brutus killed Julius Caesar on the Ides of March. In an effort to restore the republic, 
civil war breaks out. 
27 BC Caesar Augustus becomes the first Roman Emperor, ushering in the Roman Empire. 
64 AD Rome is largely on fire. According to legend, Emperor Nero played the lyre while he observed 
the city burn. 
80 AD The Colosseum is constructed.  
One of the finest works of Roman engineering has been completed. There are seats for 50,000 
people. 
121 
AD 
Buildup of the Hadrian Wall.  
A substantial wall is erected across northern England to keep the barbarians out. 
306 
AD 
Constantine ascends to the throne.
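
Conceptually, `similarity_search` embeds the query and returns the chunks whose vectors lie closest to it. The sketch below does this by brute force over toy 3-dimensional vectors. It assumes Euclidean distance; the metric FAISS actually uses depends on how the index is configured, and real embeddings have far more dimensions. `top_k`, `toy_vectors` and `toy_query` are hypothetical names.

```python
import math


def top_k(query_vec: list[float], doc_vecs: list[list[float]], k: int = 2) -> list[int]:
    """Return the indices of the k vectors closest to query_vec (Euclidean distance)."""
    scored = [(math.dist(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    scored.sort()  # smallest distance first
    return [idx for _, idx in scored[:k]]


# Toy "embeddings": the first two vectors point in nearly the same direction as the query
toy_vectors = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
toy_query = [1.0, 0.05, 0.0]

print(top_k(toy_query, toy_vectors, k=2))
```

A library such as FAISS exists because this linear scan becomes slow once the store holds many vectors.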


## Calling LLM

In [16]:
from langchain_cohere import ChatCohere
llm = ChatCohere()

In [15]:
from langchain.memory import ConversationBufferMemory

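In [18]:
from langchain.chains import ConversationalRetrievalChain

# Conversational RAG chain: retrieves chunks from the FAISS store and keeps
# the running chat history in memory under the 'chat_history' key.
chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=vectorstore.as_retriever(),
    memory=ConversationBufferMemory(memory_key='chat_history', return_messages=True),
)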

In [15]:
res = chain.invoke("Give me a summary of the document")
print(res['answer'])

The document provides an overview of the **Roman Empire** and its historical significance, particularly for the **UPSC IAS exam** and **World History studies**. Here’s a summary of the key points:

1. **Historical Timeline**:  
   - The **Roman Empire** began after the establishment of Rome in **753 BC**, according to legend founded by **Romulus and Remus**.  
   - The **Roman Republic** was founded in **509 BC** after the expulsion of King Tarquin the Proud, marking the start of a 500-year republican period.  
   - The Roman Empire was officially established in **27 BCE** after the fall of the Republic and lasted until the fall of the Western Roman Empire in **476 CE**.  

2. **Key Features**:  
   - Rome evolved from a small settlement into the capital of a vast empire, with a sophisticated economy to support its city and military.  
   - The empire was governed by citizens during the Republic and later by emperors during the Empire.  

3. **Decline and Fall**:  
   - The fall of the

In [52]:
res = chain.invoke("What is Gen AI")
res['answer']  # This will return the answer to the question

'Lo siento, no puedo responder a tu pregunta sobre "Gen AI" porque la información proporcionada en el contexto se refiere principalmente a la historia del Imperio Romano y no contiene detalles sobre inteligencia artificial o términos relacionados. Si tienes alguna pregunta sobre el Imperio Romano, estaré encantado de ayudarte.'

In [12]:
import os

GROQ_API_KEY = os.getenv('GROQ_API_KEY')
if GROQ_API_KEY is None:
    raise RuntimeError("GROQ_API_KEY is not set; add it to your .env file")
# Do not echo the key: cell outputs are saved with the notebook and are easy to leak.

In [13]:

from langchain_groq import ChatGroq
llm = ChatGroq(
    model = 'llama-3.1-8b-instant',
    api_key= GROQ_API_KEY
)
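
When you want to confirm that a key such as `GROQ_API_KEY` was loaded, printing a masked form avoids saving the secret in the notebook output. This is a small sketch; `mask_secret` is a hypothetical helper, not part of any library used here, and the string passed to it is a placeholder.

```python
def mask_secret(value: str, visible: int = 4) -> str:
    """Show only the first `visible` characters and replace the rest with '*'."""
    if len(value) <= visible:
        return "*" * len(value)  # too short to reveal anything safely
    return value[:visible] + "*" * (len(value) - visible)


# Placeholder string, not a real credential
print(mask_secret("example-not-a-real-key"))
```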

In [19]:
from langchain.chains import ConversationalRetrievalChain

# Rebuild the chain so that it uses the Groq-hosted model, with a fresh, empty chat history.
chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=vectorstore.as_retriever(),
    memory=ConversationBufferMemory(memory_key='chat_history', return_messages=True),
)

In [17]:
res = chain.invoke("What is Gen AI")
print(res['answer'])

Gen AI, short for **Generative Artificial Intelligence**, refers to a category of artificial intelligence systems designed to generate new content, such as text, images, music, or other media, based on patterns and data they have been trained on. Unlike traditional AI, which focuses on tasks like classification, prediction, or decision-making, Gen AI creates original outputs that mimic human creativity.

Key characteristics of Gen AI include:

1. **Generative Models**: These models learn the underlying structure of the training data and use it to produce new, similar content. Examples include **Generative Adversarial Networks (GANs)**, **Variational Autoencoders (VAEs)**, and **Transformer-based models** like GPT (Generative Pre-trained Transformer).

2. **Applications**: Gen AI is used in various fields, such as:
   - **Text Generation**: Writing articles, stories, or code (e.g., ChatGPT, GPT-4).
   - **Image Generation**: Creating art, designs, or realistic images (e.g., DALL·E, MidJ

## RetrievalQA Chain

In [25]:
from langchain.chains import RetrievalQA
retrieval_qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever()
)

res = retrieval_qa.invoke("What is RAG")
print(res)

{'query': 'What is RAG', 'result': "RAG stands for **Retrieval-Augmented Generation**. It is a technique in natural language processing (NLP) that combines **information retrieval** with **text generation** to improve the accuracy and relevance of responses generated by language models. Here’s a breakdown of how RAG works and its key components:\n\n\n### **How RAG Works:**\n1. **Retrieval**:\n   - The system first retrieves relevant documents or passages from a large corpus of text (e.g., a database, knowledge base, or external documents) based on the input query.\n   - This retrieval step ensures that the model has access to up-to-date, accurate, and contextually relevant information.\n\n2. **Augmentation**:\n   - The retrieved information is then used to augment the input query, providing additional context to the language model.\n\n3. **Generation**:\n   - The language model generates a response based on both the original query and the retrieved information. This ensures that the ou
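
The `chain_type="stuff"` setting used above means that all retrieved chunks are placed ("stuffed") into a single prompt together with the question. The sketch below shows the idea only. The template wording and the `build_stuff_prompt` helper are assumptions for illustration, LangChain's actual prompt differs, and the two toy chunks are abbreviated from the timeline page printed earlier.

```python
def build_stuff_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Join every retrieved chunk into one context block and append the question."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Toy chunks standing in for what the retriever would return
toy_chunks = [
    "27 BC Caesar Augustus becomes the first Roman Emperor.",
    "80 AD The Colosseum is constructed.",
]
prompt = build_stuff_prompt("Who was the first Roman Emperor?", toy_chunks)
print(prompt)
```

Because every chunk goes into one prompt, this approach is simple but only works while the retrieved text fits in the model's context window.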