## Data Gathering

In [2]:
from langchain_community.document_loaders import PyPDFLoader

# Load the PDF; PyPDFLoader returns one Document per page
loader = PyPDFLoader('DL 1.pdf')
data = loader.load()

In [3]:
data[0:5]

[Document(metadata={'source': 'DL 1.pdf', 'page': 0}, page_content='Deep Learning\nDnyanesh Sarode'),
 Document(metadata={'source': 'DL 1.pdf', 'page': 1}, page_content='Topics \n•History of AI\n•Type of Data\n•Deep Learning and Its Applications\n•Benefits of Deep Learning over Machine Learning\n•Biological Neuron\n•Perceptron : Structure\n•Structure of Neural Network\n•Optimizers'),
 Document(metadata={'source': 'DL 1.pdf', 'page': 2}, page_content='•1943: \nAlan Turing publishes a paper titled "Computing Machinery and Intelligence" in which he proposes the Turing test \nas a way to measure a machine\'s intelligence.\n•1950: \nJohn McCarthy coins the term "artificial intelligence" at a Dartmouth conference.\n•1956: \nThe Dartmouth conference is held, which is considered to be the start of the modern era of AI research.\n•1957: \nFrank Rosenblatt develops the perceptron, a neural network that can learn to classify patterns.\n•1966: \nMarvin Minsky and Seymour Papert publish the book "P

In [4]:
len(data)

28

## Chunk Operations

In [5]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Split each page into chunks of up to 1000 characters
splitter = RecursiveCharacterTextSplitter(chunk_size=1000)
doc = splitter.split_documents(data)


In [6]:
len(doc)

29
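
The jump from 28 pages to 29 chunks suggests that almost every slide already fits within `chunk_size=1000`. As a rough illustration of what a chunk size and an overlap mean, here is a minimal fixed-window sketch. It is not the algorithm `RecursiveCharacterTextSplitter` uses (that one prefers to break on separators such as paragraph and line boundaries); the helper `chunk_text` and its parameters are hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 0) -> list[str]:
    """Split text into fixed-size character windows with optional overlap.

    Hypothetical helper for illustration only; not LangChain's splitter.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    # Each window starts `step` characters after the previous one,
    # so consecutive windows share `chunk_overlap` characters.
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


chunks = chunk_text("a" * 2500, chunk_size=1000, chunk_overlap=200)
print([len(c) for c in chunks])
```

A larger overlap keeps more shared context between neighbouring chunks at the cost of storing and embedding more text.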

## Embedding

In [7]:
from langchain_chroma import Chroma
from langchain_google_genai import GoogleGenerativeAIEmbeddings

from dotenv import load_dotenv
load_dotenv()

embedding = GoogleGenerativeAIEmbeddings(model='models/embedding-001')
sample=embedding.embed_query('Hello')
sample[0:1]

[0.009476273320615292]

## Vector Database

In [8]:
# Reuse the embedding model created above instead of instantiating it again
vectorstore=Chroma.from_documents(documents=doc,embedding=embedding)

In [9]:
retriever=vectorstore.as_retriever(search_type='similarity',search_kwargs={'k':10})
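
With `search_type='similarity'` and `k=10`, the retriever returns the 10 chunks whose embeddings are closest to the query embedding. The sketch below shows the general idea of a top-k search on toy 2-D vectors. Cosine similarity is an assumption here; the distance actually used depends on how the Chroma collection is configured, and the helpers `cosine_similarity` and `top_k` are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], doc_vecs: list[list[float]], k: int) -> list[int]:
    """Return the indices of the k vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, v), i) for i, v in enumerate(doc_vecs)]
    scored.sort(reverse=True)  # highest similarity first
    return [i for _, i in scored[:k]]


# Toy "embeddings"; real ones have hundreds of dimensions.
doc_vecs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
print(top_k([1.0, 0.0], doc_vecs, k=2))
```

A real vector store replaces the linear scan with an approximate nearest-neighbour index, but the ranking idea is the same.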

In [11]:
retrieve_doc=retriever.invoke('what is gradient descent ?')
retrieve_doc

[Document(metadata={'page': 16, 'source': 'DL 1.pdf'}, page_content='Gradient Descent\nw = w - η * ∂J/∂w\nb = b - η * ∂J/∂b\nw-weight\nb-bias\nη -Learning rate\n∂J/∂w – Gradient of loss \nfunction\nGradient Descent is the iterative Optimization Algorithm'),
 Document(metadata={'page': 18, 'source': 'DL 1.pdf'}, page_content='Stochastic gradient descent\n•\n Stochastic\n gradient descent is an optimization algorithm that updates the \nweights and biases of a neural network by minimizing the loss function \nusing a single data point\n.\nThe gradient of the loss function is a vector \nthat points in the direction of the steepest \nascent of the loss function. In other words, it is \nthe direction in which the loss function will \nincrease most quickly.\nThe gradient of the loss function can be used \nto update the weights and biases of a neural \nnetwork during training. By updating the \nweights and biases in the direction of the \nnegative gradient of the loss function, we can \nmake th

In [12]:
print(retrieve_doc[5].page_content)

Batch gradient descent
• Batch gradient descent is an optimization algorithm that updates the 
weights and biases of a neural network by minimizing the loss function 
using the entire training dataset.
Optimization refers to the task of 
minimizing/maximizing an objective function
Objective Function : The Function whose value is 
either maximize or minimized
w = w - η * ∂J/∂w
b = b - η * ∂J/∂b
Here, 'w' represents the model's weights, 'b' represents the biases, η is the learning rate, and ∂J/∂w and ∂J/∂b 
are the gradients of the cost function with respect to 'w' and 'b', respectively.
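
The update rule in the retrieved slide, `w = w - η * ∂J/∂w` and `b = b - η * ∂J/∂b`, can be tried on a toy problem. The sketch below assumes a one-variable linear model and a mean-squared-error cost, neither of which comes from the slides; the function name, data, learning rate, and step count are all illustrative choices.

```python
def gradient_descent(xs, ys, lr=0.05, steps=2000):
    """Fit y ≈ w*x + b with batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # J = (1/n) * sum((w*x + b - y)^2), so:
        dw = (2 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        db = (2 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        # Step against the gradient, scaled by the learning rate η (lr).
        w -= lr * dw
        b -= lr * db
    return w, b


# Toy data generated from y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = gradient_descent(xs, ys)
print(round(w, 3), round(b, 3))
```

Because every step uses all four points, this corresponds to the batch variant described in the slide; using one randomly chosen point per step would give the stochastic variant.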


## LLM 

In [13]:
from langchain_google_genai import ChatGoogleGenerativeAI
llm=ChatGoogleGenerativeAI(model='gemini-1.5-pro',temperature=0.2,max_tokens=400)

In [14]:
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate

In [15]:
system_prompt=(
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer "
    "the question. If you don't know the answer, say that you "
    "don't know. Use three sentences maximum and keep the "
    "answer concise."
    "\n\n"
    "{context}"
)

prompt=ChatPromptTemplate.from_messages(
    [
        ('system',system_prompt),
        ('human',"{input}"),
    ]
)

In [16]:
question_answer_chain=create_stuff_documents_chain(llm,prompt)
rag_chain=create_retrieval_chain(retriever,question_answer_chain)
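
Conceptually, a "stuff" chain places the text of all retrieved documents into the `{context}` slot of the prompt, and the retrieval chain runs the retriever first to supply those documents. The following is a simplified sketch of that idea in plain Python, not LangChain's implementation; `stuff_documents`, `build_messages`, and the separator are hypothetical.

```python
def stuff_documents(page_contents: list[str], separator: str = "\n\n") -> str:
    """Concatenate the retrieved chunks into one context string."""
    return separator.join(page_contents)


def build_messages(system_template: str, context: str, user_input: str) -> list[tuple[str, str]]:
    """Fill the {context} placeholder and pair it with the user's question."""
    return [
        ("system", system_template.format(context=context)),
        ("human", user_input),
    ]


# Stand-ins for retrieved chunks and for the system prompt defined above.
retrieved = ["Gradient Descent\nw = w - η * ∂J/∂w", "Batch gradient descent ..."]
template = "Use the following pieces of retrieved context.\n\n{context}"

messages = build_messages(template, stuff_documents(retrieved), "What is gradient descent?")
for role, text in messages:
    print(f"[{role}] {text}")
```

Since every retrieved chunk is pasted into the prompt, a large `k` increases prompt length, which is worth keeping in mind alongside the `k=10` setting used for the retriever.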

In [17]:
response=rag_chain.invoke({'input':'What is gradient Descent?'})
print(response['answer'])

Gradient Descent is an iterative optimization algorithm used to find the minimum of a function.  It updates the weights and biases of a model by moving in the direction of the negative gradient of the loss function.  This process is repeated until a minimum is reached.
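
Besides `'answer'`, the dictionary returned by the retrieval chain also carries the retrieved documents under `'context'`, which makes it possible to show which pages an answer drew on. The sketch below uses a stand-in response built with `SimpleNamespace` so it runs without an API key; `cited_pages` and `fake_response` are hypothetical, and the page numbers are copied from the retrieval output earlier in the notebook.

```python
from types import SimpleNamespace


def cited_pages(response: dict) -> list[int]:
    """Return the sorted, de-duplicated page numbers of the retrieved documents."""
    return sorted({doc.metadata["page"] for doc in response["context"]})


# Stand-in for the result of rag_chain.invoke({...}); not real model output.
fake_response = {
    "input": "What is gradient Descent?",
    "context": [
        SimpleNamespace(metadata={"source": "DL 1.pdf", "page": 16}, page_content="..."),
        SimpleNamespace(metadata={"source": "DL 1.pdf", "page": 18}, page_content="..."),
        SimpleNamespace(metadata={"source": "DL 1.pdf", "page": 16}, page_content="..."),
    ],
    "answer": "...",
}
print(cited_pages(fake_response))
```

The page values produced by `PyPDFLoader` are zero-based, as the first document in `data` shows (`'page': 0`), so add 1 when displaying them to readers.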



In [20]:
response=rag_chain.invoke({'input':'what is neural learning rate'})
print(response['answer'])

The learning rate (η) is a hyperparameter that controls how much the model's weights and biases are adjusted during training.  A smaller learning rate means smaller adjustments and slower learning, while a larger learning rate means larger adjustments and faster but potentially unstable learning.  It determines the step size during gradient descent towards minimizing the loss function.

