In [1]:
# Loaders and vector stores now live in langchain_community; the older
# langchain.* import paths for them are deprecated.
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_google_genai import ChatGoogleGenerativeAI



In [2]:
from dotenv import load_dotenv
load_dotenv()

True

In [4]:

loader = PyPDFLoader("../azure.pdf")
docs = loader.load()

In [6]:

# 500-character chunks; consecutive chunks share up to 100 characters of context.
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)

In [7]:

documents = text_splitter.split_documents(docs)

In [9]:
documents[10].page_content

'convincing me to write this title. The book improved in manifold ways through valuable comments from all the reviewers, time and again. Adrian Raposo did a commendable job helping develop the content as well as coordinating the overall project management. This book would not have been in its current shape had it not received the perfect touch of the technical editor, Abhishek Kotian, and also all the proofreaders.\nSpecial thanks to my colleagues, Kamal and Mahananda. Kamal took time to get'
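
The chunk above ends mid-sentence ("Kamal took time to get") because the splitter cuts on size, and the 100-character overlap is what lets the next chunk pick the sentence up again. The sketch below illustrates that overlap with a plain sliding window. It is a simplified stand-in, not the algorithm `RecursiveCharacterTextSplitter` uses (which also prefers to break on separators such as paragraphs and spaces), and `chunk_text` is a hypothetical helper name.

```python
def chunk_text(text, chunk_size=500, chunk_overlap=100):
    """Split text into fixed-size windows that overlap by chunk_overlap characters.

    Simplified illustration only: no separator-aware splitting.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Toy input: 1200 characters of repeating digits.
toy_text = "".join(str(i % 10) for i in range(1200))
toy_chunks = chunk_text(toy_text)
print([len(c) for c in toy_chunks])
# The tail of one chunk is repeated at the head of the next.
print(toy_chunks[0][-100:] == toy_chunks[1][:100])
```

A larger overlap lowers the chance that an answer is split across two chunks, at the cost of more chunks to embed.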

In [10]:
import os
gemini_api_key = os.environ['google_api_key']

In [11]:
from langchain_google_genai import GoogleGenerativeAIEmbeddings

embeddings = GoogleGenerativeAIEmbeddings(
    model="models/embedding-001",
    google_api_key=gemini_api_key,
)

In [12]:
llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash", google_api_key=gemini_api_key)

In [13]:
# Embed every chunk and build an in-memory FAISS index over the vectors.
vector_db = FAISS.from_documents(documents, embeddings)

In [14]:
retriever = vector_db.as_retriever()
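
Conceptually, the retriever embeds the question and returns the chunks whose vectors lie closest to it. The sketch below shows that nearest-neighbour step on toy 2-D vectors using Euclidean distance. The vectors, the helper name `top_k`, and the choice of `k=2` are illustrative assumptions; the real index works on high-dimensional embeddings, and the number of documents returned by `as_retriever()` depends on its `search_kwargs` (the default is believed to be four).

```python
import math


def top_k(query, vectors, k=4):
    """Return the indices of the k vectors closest to query (Euclidean distance)."""
    scored = sorted((math.dist(query, vec), idx) for idx, vec in enumerate(vectors))
    return [idx for _, idx in scored[:k]]


# Toy "embeddings": each tuple stands in for one chunk's vector.
toy_vectors = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0), (0.5, 0.5)]
nearest = top_k((0.4, 0.4), toy_vectors, k=2)
print(nearest)
```

A brute-force scan like this is fine for a single PDF; a FAISS index exists so the same lookup stays fast as the number of chunks grows.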

In [15]:
template = """
You are a PDF document expert specializing in extracting accurate answers from complex texts.
Utilize the provided context to deliver precise and concise answers.

Context:
{context}

Provide a well-informed and detailed answer based on the context:

"""

In [16]:
from langchain_core.prompts import ChatPromptTemplate
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", template),
        ("human", "{input}"),
    ]
)

In [17]:
chain = create_stuff_documents_chain(llm, prompt)

In [18]:
rag_chain = create_retrieval_chain(retriever, chain)
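
The two chain constructors above compose into a simple flow: retrieve documents for the question, "stuff" their text into the `{context}` slot of the prompt, and send the filled prompt to the model. The sketch below mimics that flow with plain functions so it can run without an API key. `make_rag_chain`, `fake_retrieve`, and `fake_generate` are hypothetical stand-ins, not LangChain APIs, and the real chains handle message formatting and document objects that are omitted here.

```python
def make_rag_chain(retrieve, generate, template):
    """Return an invoke(inputs) function that retrieves, stuffs, then generates."""
    def invoke(inputs):
        question = inputs["input"]
        passages = retrieve(question)
        # "Stuffing": all retrieved passages are joined into one context string.
        context = "\n\n".join(passages)
        prompt = template.format(context=context, input=question)
        return {"input": question, "context": passages, "answer": generate(prompt)}
    return invoke


seen_prompts = []  # records what the stand-in model was asked


def fake_retrieve(question):
    # Stand-in for the vector-store retriever.
    return [
        "Accuracy can be deceptive on imbalanced data.",
        "The F1 score balances precision and recall.",
    ]


def fake_generate(prompt):
    # Stand-in for the LLM call.
    seen_prompts.append(prompt)
    return "stub answer"


toy_chain = make_rag_chain(
    fake_retrieve, fake_generate, "Context:\n{context}\n\nQuestion: {input}"
)
toy_result = toy_chain({"input": "Which metric helps with rare classes?"})
print(sorted(toy_result))
print(seen_prompts[0])
```

The returned dictionary has the same three keys (`input`, `context`, `answer`) that appear in the output of `rag_chain.invoke` in this notebook.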

In [19]:
results = rag_chain.invoke({"input": "What is  Evaluation metrics?"})
print(results)

{'input': 'What is  Evaluation metrics?', 'context': [Document(metadata={'source': '../azure.pdf', 'page': 120}, page_content='metrics that are defined. Often, one metric may not be sufficient to take a decision. To start with, you may look at accuracy, but at times it might be deceptive. Consider a case where you are making a prediction for a rare disease where in reality, 99 percent negative cases and 1 percent of positive cases appear. If your classification model predicts all the cases as true negatives, then the accuracy is still 99 percent. In this case, the F1 score might be useful as it would give you a clear'), Document(metadata={'source': '../azure.pdf', 'page': 116}, page_content="Consider a case where you need to predict the housing price not as a number, but as \ncategories, such as greater than 100K or less than 100K. In this case, though you are predicting the housing price, you are indeed predicting a class or category for the \nhousing price and hence, it's a classific
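
Printing the whole result dictionary is hard to read. Since each retrieved document carries `source` and `page` metadata, as the output above shows, a small helper can list which pages the answer was grounded on. The sketch below uses a minimal `Doc` dataclass as a hypothetical stand-in for LangChain's `Document`, with toy data shaped like the output above; `cited_pages` is likewise an illustrative name.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in exposing the two attributes used here."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def cited_pages(result):
    """Return the sorted, de-duplicated page numbers of the retrieved documents."""
    return sorted({doc.metadata["page"] for doc in result["context"]})


# Toy result with the same shape as the rag_chain output.
toy_results = {
    "input": "What is  Evaluation metrics?",
    "context": [
        Doc("metrics that are defined...", {"source": "../azure.pdf", "page": 120}),
        Doc("Consider a case where...", {"source": "../azure.pdf", "page": 116}),
        Doc("another passage from the same page", {"source": "../azure.pdf", "page": 120}),
    ],
    "answer": "stub answer",
}
print(cited_pages(toy_results))
```

With the real chain, the same function could be called as `cited_pages(results)`. The page values come straight from the loader's metadata, so whether they match the numbers printed in the PDF should be checked against the file.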

In [20]:
# Simple question loop; type 'exit' to stop.
while True:
    user_question = input("Enter your question (or type 'exit' to quit): ")
    if user_question.strip().lower() == 'exit':
        print("Exiting...")
        break
    results = rag_chain.invoke({"input": user_question})
    print("Answer:", results['answer'])

Answer: The confusion matrix is a table that helps visualize the performance of a classification model. It shows how well the model predicted the actual classes by comparing the predicted labels with the true labels. 

In the context provided, the confusion matrix is used to analyze the performance of a model that predicts the "High" class.  It shows the number of instances that were correctly and incorrectly classified as "High" compared to the actual class. 

For example, if a model predicts an instance as "High" but the actual class is "Low," it's considered a **false positive**.  If a model predicts an instance as "Low" but the actual class is "High," it's considered a **false negative**. 

Answer: The provided text focuses on the basics of using Azure Machine Learning Studio, but doesn't provide a specific example. However, I can give you a general example of how to use it for a simple machine learning task:

**Scenario:** You want to predict if a customer will click on an ad base