## Ingesting PDF

In [1]:
# pip install langchain
# pip install langchain_community
# pip install pypdf

In [2]:
from langchain_community.document_loaders import PyPDFLoader

In [3]:
import os

local_path = "phy_book_ch2.pdf"
# Load the local PDF only if the file actually exists on disk
if os.path.exists(local_path):
    loader = PyPDFLoader(file_path=local_path)
    data = loader.load()  # one Document per page
else:
    print("PDF not found")

In [4]:
# Preview first page
print(data[0].page_content)

26  Physics 
Chapter Two  
MOTION 
 
 
[The object, that we see around us either are stationery or in motion. What do we 
actually understand by the words ``rest’’ and ``motion’’. We need different quantities regarding motion to express the characteristics of motion of a moving object. In this chapter we will discuss different quantities regarding motion, their dimensions, units, the 
relations among them etc.] 
By the end of this chapter we will be able to -  
1. Explain the rest and motion  
2. Find out the difference among different types of motion.  
3. Explain the scalar and vector quantities  
4. Analyze the relation among the quantities regarding motion  5. Explain the motion of freely falling bodies  
6. Analyze the relations among the quantities regarding motion with the help of graph 
7. Realize the effect of motion in our life    


## Vector Embeddings

In [5]:
import ollama

In [6]:
# !ollama pull nomic-embed-text

In [7]:
!ollama list

NAME                   	ID          	SIZE  	MODIFIED     
nomic-embed-text:latest	0a109f422b47	274 MB	30 hours ago	
llama3.1:latest        	f66fc8dc39ea	4.7 GB	31 hours ago	


In [8]:
# pip install chromadb

In [10]:
# Split and chunk
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = text_splitter.split_documents(data)
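
As a rough illustration of what `chunk_size` and `chunk_overlap` control, the sketch below cuts a string into fixed-size windows that share a few characters with their neighbours. It is a simplified stand-in and not the actual `RecursiveCharacterTextSplitter` algorithm, which also tries to break on separators such as paragraphs and newlines. The helper name `sliding_window_chunks` is hypothetical.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size characters; neighbours share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
demo_chunks = sliding_window_chunks(sample, chunk_size=10, chunk_overlap=3)
print(demo_chunks)
```

With `chunk_size=10` and `chunk_overlap=3`, each window starts 7 characters after the previous one, so the last 3 characters of one chunk reappear at the start of the next. The overlap keeps a sentence that straddles a boundary visible in both chunks.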

In [11]:
print(type(chunks))

<class 'list'>


In [12]:
chunks

[Document(metadata={'source': 'phy_book_ch2.pdf', 'page': 0}, page_content='26  Physics \nChapter Two  \nMOTION \n \n \n[The object, that we see around us either are stationery or in motion. What do we \nactually understand by the words ``rest’’ and ``motion’’. We need different quantities regarding motion to express the characteristics of motion of a moving object. In this chapter we will discuss different quantities regarding motion, their dimensions, units, the \nrelations among them etc.] \nBy the end of this chapter we will be able to -  \n1. Explain the rest and motion  \n2. Find out the difference among different types of motion.  \n3. Explain the scalar and vector quantities  \n4. Analyze the relation among the quantities regarding motion  5. Explain the motion of freely falling bodies  \n6. Analyze the relations among the quantities regarding motion with the help of graph \n7. Realize the effect of motion in our life'),
 Document(metadata={'source': 'phy_book_ch2.pdf', 'page':

In [16]:
# for x in chunks: print(x)
# print(chunks[0])

In [13]:
# Extract the texts and metadata from the chunks
texts = [doc.page_content for doc in chunks]
metadatas = [doc.metadata for doc in chunks]

In [14]:
# texts
# metadatas

In [15]:
# Embed the document texts using Ollama
from langchain_community.embeddings import OllamaEmbeddings

embedding = OllamaEmbeddings(model="nomic-embed-text", show_progress=True)
embedded_texts = embedding.embed_documents(texts)

OllamaEmbeddings: 100%|███████████████████| 58/58 [02:04<00:00,  2.14s/it]
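
Each chunk is now represented by a list of floats. One common way to compare such vectors is cosine similarity, sketched below in plain Python on toy two-dimensional vectors. This is only an illustration of the idea: the vector store used later may rank results with a different distance metric, and the function and variable names here are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings
query_vec = [1.0, 0.0]
same_direction = [2.0, 0.0]
orthogonal = [0.0, 3.0]

print(cosine_similarity(query_vec, same_direction))
print(cosine_similarity(query_vec, orthogonal))
```

Vectors pointing in the same direction score close to 1 regardless of their length, while unrelated (orthogonal) vectors score 0.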


In [16]:
embedded_texts

[[1.1463844776153564,
  1.825333833694458,
  -3.1932621002197266,
  -0.6972156763076782,
  1.216423511505127,
  0.4335930347442627,
  1.714890956878662,
  -0.840286374092102,
  0.16558395326137543,
  -1.1353495121002197,
  -0.39517942070961,
  0.3373546600341797,
  0.7954596281051636,
  0.4566328525543213,
  0.37780559062957764,
  -0.73179030418396,
  0.18309469521045685,
  -0.6404469013214111,
  0.11584893614053726,
  1.1190507411956787,
  -0.28205397725105286,
  1.162963628768921,
  -0.7859025001525879,
  -0.549447774887085,
  0.8374054431915283,
  1.1328437328338623,
  -0.6402575969696045,
  -0.32718461751937866,
  -1.1218022108078003,
  -1.4755945205688477,
  1.5231823921203613,
  -0.5194918513298035,
  0.6354901790618896,
  -1.086922526359558,
  -1.8431559801101685,
  -1.2384003400802612,
  1.4208259582519531,
  0.32463008165359497,
  0.3413695693016052,
  0.20880310237407684,
  0.5773135423660278,
  -0.6017276048660278,
  -0.3679463863372803,
  -1.0042507648468018,
  0.6894100904

In [17]:
from langchain_core.documents import Document

# Create Document objects from texts and metadata
documents = [Document(page_content=text, metadata=meta) for text, meta in zip(texts, metadatas)]

In [18]:
from langchain_chroma.vectorstores import Chroma

# Create the Chroma vector store
vector_db = Chroma.from_documents(
    documents=documents,
    embedding=embedding,
    collection_name="local-rag",
    persist_directory="C:/Users/almah/Desktop/Ollama_RAG/db/"
)

print("Embedded Documents stored in ChromaDB successfully!")

OllamaEmbeddings: 100%|███████████████████| 58/58 [02:04<00:00,  2.14s/it]

Embedded Documents stored in ChromaDB successfully!





In [26]:
vector_db

<langchain_chroma.vectorstores.Chroma at 0x25fb9cb38f0>

## Retrieval: llama3.1

In [20]:
# Load the persisted ChromaDB vector store
vector_database = Chroma(
    collection_name="local-rag",
    persist_directory="C:/Users/almah/Desktop/Ollama_RAG/db/",
    embedding_function=embedding
)
print("loaded vector database")

loaded vector database


In [21]:
# Initialize the Ollama LLaMA 3.1 model
from langchain_community.llms import Ollama

llm = Ollama(model="llama3.1")

In [22]:
from langchain.prompts import PromptTemplate

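# MultiQueryRetriever treats each line of the LLM output as a separate query,
# so the prompt asks for one reformulated question per line and nothing else
QUERY_PROMPT = PromptTemplate(
    input_variables=["question"],
    template="""You are an AI language model specialized in physics. Your task is to reformulate the following question into five different versions to retrieve the most relevant physics documents.
    Provide only the five reformulated questions, one per line, with no numbering or extra commentary.
    Original question: {question}""",
)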

In [23]:
from langchain.retrievers.multi_query import MultiQueryRetriever

retriever = MultiQueryRetriever.from_llm(
    vector_database.as_retriever(search_kwargs={"k": 5}),
    llm,
    prompt=QUERY_PROMPT
)
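
The multi-query idea can be sketched without any library: split the LLM output into one query per line, run a search for each, and keep the de-duplicated union of the results. The code below is a simplified illustration under that assumption, not LangChain's implementation. `parse_queries`, `multi_query_retrieve`, and the dictionary-backed `fake_search` are all hypothetical names.

```python
def parse_queries(llm_output: str) -> list[str]:
    """Treat every non-empty line of the LLM output as one search query."""
    return [line.strip() for line in llm_output.splitlines() if line.strip()]


def multi_query_retrieve(queries, search):
    """Run search(query) for each query and merge the results, dropping duplicates."""
    seen = set()
    merged = []
    for query in queries:
        for doc in search(query):
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged


# Hypothetical stand-in for a vector store: maps a query to the chunks it would return
fake_index = {
    "What is motion in a straight line?": ["chunk A", "chunk B"],
    "Define linear motion.": ["chunk B", "chunk C"],
}


def fake_search(query: str) -> list[str]:
    return fake_index.get(query, [])


llm_output = "What is motion in a straight line?\n\nDefine linear motion.\n"
queries = parse_queries(llm_output)
merged_docs = multi_query_retrieve(queries, fake_search)
print(queries)
print(merged_docs)
```

Because every output line becomes a query, extra commentary or blank padding from the model turns into extra searches. That would be consistent with the varying number of `OllamaEmbeddings` progress bars in the outputs further down.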


In [24]:
# RAG prompt
from langchain.prompts import ChatPromptTemplate

template = """You are a helpful assistant trained to answer physics questions based on the provided context.
Use only the context below to answer the following question as clearly as possible:
{context}
Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)

In [25]:
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser


chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
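
In the chain above, the retriever's output is a list of `Document` objects, so `{context}` receives the string form of that list, metadata included. A common pattern is to join only the chunk texts before they reach the prompt. The sketch below shows such a helper on a minimal stand-in class so it runs without LangChain. `Doc` and `format_docs` are hypothetical names, and `Doc` mirrors only the `page_content` and `metadata` attributes.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Minimal stand-in for a retrieved document (illustration only)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs) -> str:
    """Join the text of each retrieved document, separated by blank lines."""
    return "\n\n".join(doc.page_content for doc in docs)


retrieved = [
    Doc("Motion along a straight line is called linear motion.", {"page": 2}),
    Doc("Rotational motion is motion about an axis.", {"page": 3}),
]
context_text = format_docs(retrieved)
print(context_text)
```

Assuming the usual LCEL composition rules, a helper like this could be piped after the retriever, for example `{"context": retriever | format_docs, "question": RunnablePassthrough()}`, so that the prompt contains plain text instead of object representations.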

In [26]:
def post_process_answer(answer):
    # Keep only the text after the last "Answer:" marker, if the model emitted one;
    # otherwise return the whole response with surrounding whitespace stripped
    return answer.split("Answer:")[-1].strip()

result = chain.invoke(input())
final_answer = post_process_answer(result)
print(final_answer)

 What is linear motion?


OllamaEmbeddings: 100%|█████████████████████| 1/1 [00:03<00:00,  3.13s/it]
OllamaEmbeddings: 100%|█████████████████████| 1/1 [00:02<00:00,  2.04s/it]
OllamaEmbeddings: 100%|█████████████████████| 1/1 [00:02<00:00,  2.04s/it]
OllamaEmbeddings: 100%|█████████████████████| 1/1 [00:02<00:00,  2.07s/it]
OllamaEmbeddings: 100%|█████████████████████| 1/1 [00:02<00:00,  2.19s/it]
OllamaEmbeddings: 100%|█████████████████████| 1/1 [00:02<00:00,  2.05s/it]
OllamaEmbeddings: 100%|█████████████████████| 1/1 [00:02<00:00,  2.24s/it]
OllamaEmbeddings: 100%|█████████████████████| 1/1 [00:02<00:00,  2.05s/it]
OllamaEmbeddings: 100%|█████████████████████| 1/1 [00:02<00:00,  2.11s/it]
OllamaEmbeddings: 100%|█████████████████████| 1/1 [00:02<00:00,  2.11s/it]
OllamaEmbeddings: 100%|█████████████████████| 1/1 [00:02<00:00,  2.15s/it]
OllamaEmbeddings: 100%|█████████████████████| 1/1 [00:02<00:00,  2.15s/it]


Linear motion refers to the type of motion where an object moves in a straight line without changing its direction, meaning all particles of the body travel the same distance, at the same time, and in the same direction. This is different from rotational motion, which involves circular or curved paths, and periodic motion, which involves repetitive movements along a path.
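
To make the behaviour of `post_process_answer` concrete, the short sketch below repeats the function and runs it on two made-up responses, one with an `Answer:` marker and one without. The sample strings are invented for illustration and are not model output.

```python
def post_process_answer(answer: str) -> str:
    # Keep only the text after the last "Answer:" marker, if there is one
    return answer.split("Answer:")[-1].strip()


# Invented sample responses (not real model output)
with_marker = post_process_answer("Reasoning...\nAnswer: Motion along a straight line. ")
without_marker = post_process_answer("  Motion along a straight line.\n")

print(with_marker)
print(without_marker)
```

When the marker is absent, `split` returns the whole string as its only element, so the function falls back to simply trimming whitespace.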


## Testing

In [42]:
chain.invoke(input(""))

OllamaEmbeddings: 100%|██████████| 1/1 [00:03<00:00,  3.47s/it]
OllamaEmbeddings: 100%|██████████| 1/1 [00:02<00:00,  2.07s/it]


"Based on the provided text, it seems like a chapter from a physics textbook discussing motion. The question section towards the end includes multiple-choice questions that require ticking the correct answer among options.\n\nHowever, none of the provided text snippets contain specific questions that can be answered with a single final number. They appear to be excerpts from a chapter discussing various aspects of motion and its related quantities.\n\n\nSo, I will select one of the Multiple Choice Questions (MCQ) as per your request. Let's consider question 3:\n\nThe distance traveled in a given time by a freely falling body from rest will be\n\na) Proportional of the time\nb) Proportional to the square of that time\nc) Inversely proportional to that time\nd) Inversely proportional to the square of that time\n\n\nThe correct answer is (b) Proportional to the square of that time"

In [53]:
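# Pass the question directly: input("...") only displays the string as a prompt
# and returns whatever is typed, not the question itself
chain.invoke("What is rotational motion?")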

OllamaEmbeddings: 100%|██████████| 1/1 [00:03<00:00,  3.56s/it]
OllamaEmbeddings: 100%|██████████| 1/1 [00:02<00:00,  2.08s/it]


'Based on the provided context, it appears that the investigation involves measuring the speed of participants running 100 meters in different times.\n\nTo answer this question based on the given context:\n\nThe characteristics of the motion mentioned are periodic and vibratory. This means that the motion is repetitive and oscillatory.\n\nThis motion is periodic because all participants are moving at a constant rate, with some starting earlier than others. The observation that the motion of all participants are periodic and vibratory implies that they are all moving in a rhythmic or oscillating pattern.\n\nTo compare various kinds of motion and write down their differences:\n\n* Linear motion: Refers to motion in a straight line.\n* Periodic motion: Refers to motion that repeats itself over a period of time, such as the motion of participants running 100 meters.\n* Vibratory motion: Refers to motion that involves oscillating or vibrating movements.\n\nThe main difference between these 

In [54]:
chain.invoke("What is the unit of acceleration?")

OllamaEmbeddings: 100%|██████████| 1/1 [00:03<00:00,  3.96s/it]
OllamaEmbeddings: 100%|██████████| 1/1 [00:02<00:00,  2.07s/it]
OllamaEmbeddings: 100%|██████████| 1/1 [00:02<00:00,  2.05s/it]
OllamaEmbeddings: 100%|██████████| 1/1 [00:02<00:00,  2.12s/it]
OllamaEmbeddings: 100%|██████████| 1/1 [00:02<00:00,  2.11s/it]
OllamaEmbeddings: 100%|██████████| 1/1 [00:02<00:00,  2.11s/it]
OllamaEmbeddings: 100%|██████████| 1/1 [00:02<00:00,  2.12s/it]
OllamaEmbeddings: 100%|██████████| 1/1 [00:02<00:00,  2.11s/it]


"Given the text, there are multiple questions that can be answered based on the content. However, since I need to provide a single final answer in the requested format and one question seems most directly addressed:\n\nWhat type of acceleration does the body experience when its speed increases by 5ms^-1 in 1 second towards south?\n\n## Step 1: Identify the type of problem\nThe problem involves understanding the concept of uniform vs. non-uniform acceleration based on a given scenario.\n\n## Step 2: Recall the definition of uniform and non-uniform acceleration\nUniform acceleration occurs when an object's velocity changes at a constant rate over time, meaning its speed increases or decreases by the same amount in the same interval of time each time it is measured. Non-uniform acceleration happens when this condition is not met.\n\n## Step 3: Analyze the given scenario\nThe body's speed increases by 5ms^-1 in 1 second towards south. This indicates a change in velocity over time, but to d

In [55]:
chain.invoke("what is linear motion?")

OllamaEmbeddings: 100%|██████████| 1/1 [00:03<00:00,  3.24s/it]
OllamaEmbeddings: 100%|██████████| 1/1 [00:02<00:00,  2.07s/it]
OllamaEmbeddings: 100%|██████████| 1/1 [00:02<00:00,  2.10s/it]
OllamaEmbeddings: 100%|██████████| 1/1 [00:02<00:00,  2.11s/it]
OllamaEmbeddings: 100%|██████████| 1/1 [00:02<00:00,  2.13s/it]
OllamaEmbeddings: 100%|██████████| 1/1 [00:02<00:00,  2.11s/it]
OllamaEmbeddings: 100%|██████████| 1/1 [00:02<00:00,  2.11s/it]
OllamaEmbeddings: 100%|██████████| 1/1 [00:02<00:00,  2.14s/it]
OllamaEmbeddings: 100%|██████████| 1/1 [00:02<00:00,  2.11s/it]


'The passage discusses different types of motion, scalar and vector quantities, and the concept of acceleration. It explains that motion is a change in position with respect to time and provides examples to illustrate this point.\n\nTo answer the question "What are some activities for demonstrating models of different types of motion?" based on the provided text:\n\nActivities mentioned in the passage for demonstrating models of different types of motion include:\n\n1. Holding a pen in your hand and moving it to and fro.\n2. Marking a long straight line with chalk powder or lime in the school field or play ground nearby, then running along this line to the other end.\n3. Holding hands in a straight line and moving one person\'s right hand straightly by his left hand.\n\nThese activities are meant to help students observe and understand different types of motion such as linear motion, rotational motion, vibratory motion, etc.'