## Basic usage of the Google Gemini LLM in LangChain

In [1]:
import pandas as pd
import numpy as np

# Re-save the FAQ data as UTF-8 so it can be loaded later with an explicit encoding
df = pd.read_csv("faqs_data.csv")
df.to_csv("faqs_data_new.csv", encoding="utf-8", index=False)

In [2]:
from langchain_google_genai.llms import GoogleGenerativeAI
import tqdm as notebook_tqdm

In [3]:
import os
from dotenv import load_dotenv

# Checking if the .env is loaded or not - Returns True
load_dotenv()

True

In [4]:
from huggingface_hub.hf_api import HfFolder

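# Save the token value read from the environment (assumes .env defines HUGGINGFACE_API_KEY),
# not the literal variable name
HfFolder.save_token(os.getenv('HUGGINGFACE_API_KEY'))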

In [5]:
GOOGLE_API_KEY = "paste your google api key here"

In [6]:
google_llm = GoogleGenerativeAI(model="gemini-pro", google_api_key = GOOGLE_API_KEY , temperature=0.4)

In [7]:
poem = google_llm.invoke("Write a 4 line poem of my love for samosa")
print(poem)




In [8]:
essay = google_llm.invoke("Write an email requesting refund for electronic items")
print(essay)

Dear [Customer Service Representative Name],

I hope this email finds you well.

I am writing to request a refund for the following electronic items I recently purchased from your store:

* [Product Name] - Order Number: [Order Number]
* [Product Name] - Order Number: [Order Number]

I received the items on [Date] and upon inspection, I discovered that they were not as advertised. Specifically, the [Product Name] has a faulty [Feature] and the [Product Name] does not meet the [Specification].

I have attempted to resolve the issue with the manufacturer, but they have been unresponsive. I have also tried troubleshooting the issue myself, but I have been unsuccessful.

I am very disappointed with the quality of the products and I would like to request a full refund for both items. I have attached a copy of my receipt for your reference.

I understand that your store has a return policy, but I am unable to return the items in person as I live in a different city. I would be happy to retur

### Now let's load data from FAQ csv file

In [9]:
from langchain_community.document_loaders.csv_loader import CSVLoader

csv_loader = CSVLoader(
    file_path = "faqs_data_new.csv", 
    source_column = "prompt",
    encoding="utf-8"
)

data = csv_loader.load()

In [10]:
data

[Document(metadata={'source': 'I have never done programming in my life. Can I take this bootcamp?', 'row': 0}, page_content='prompt: I have never done programming in my life. Can I take this bootcamp?\nresponse: Yes, this is the perfect bootcamp for anyone who has never done coding and wants to build a career in the IT/Data Analytics industry or just wants to perform better in your current job or business using data.'),
 Document(metadata={'source': 'Why should I trust Codebasics?', 'row': 1}, page_content='prompt: Why should I trust Codebasics?\nresponse: Till now 9000 + learners have benefitted from the quality of our courses. You can check the review section and also we have attached their LinkedIn profiles so that you can connect with them and ask directly.'),
 Document(metadata={'source': 'Is there any prerequisite for taking this bootcamp ?', 'row': 2}, page_content='prompt: Is there any prerequisite for taking this bootcamp ?\nresponse: Our bootcamp is specifically designed for

### Hugging Face Embeddings

In [11]:
# from huggingface_hub import notebook_login
# notebook_login()

In [12]:
from langchain_community.embeddings import HuggingFaceInstructEmbeddings

model_name = "hkunlp/instructor-large"
model_kwargs = {'device': 'cpu'}
encode_kwargs = {'normalize_embeddings': True}

In [13]:
from langchain_huggingface.embeddings.huggingface import HuggingFaceEmbeddings

# model_name = "sentence-transformers/all-mpnet-base-v2"
model_kwargs = {'device': 'cpu'}
encode_kwargs = {'normalize_embeddings': True}
hf = HuggingFaceEmbeddings(
    model_name=model_name,
    model_kwargs=model_kwargs,
    encode_kwargs=encode_kwargs
)


In [14]:
question_embedding = hf.embed_query("What is your refund policy?")
question_embedding

[-0.04449572041630745,
 0.007691530045121908,
 -0.009869121015071869,
 0.020831875503063202,
 0.03185895457863808,
 0.0683884471654892,
 0.009561844170093536,
 0.02428652159869671,
 -0.023269370198249817,
 0.033126119524240494,
 0.039504170417785645,
 -0.00834561325609684,
 0.051079314202070236,
 0.03825225308537483,
 -0.063233882188797,
 -0.048464108258485794,
 -0.07004254311323166,
 -0.0005937233800068498,
 -0.03002571314573288,
 0.013897750526666641,
 0.060373783111572266,
 -0.011695214547216892,
 -0.009512543678283691,
 0.010432074777781963,
 0.016631925478577614,
 0.01704709231853485,
 -0.02367987670004368,
 0.0211300328373909,
 0.06495610624551773,
 -0.061630576848983765,
 0.014581451192498207,
 -0.023546256124973297,
 -0.05309614911675453,
 -0.03272029384970665,
 -0.014265917241573334,
 0.03594067320227623,
 0.020435988903045654,
 0.007273691240698099,
 0.02718525007367134,
 0.018944857642054558,
 0.0028853153344243765,
 0.014749333262443542,
 0.045292776077985764,
 -0.030797313

In [15]:
len(question_embedding)

768

As you can see above, the embedding for the sentence "What is your refund policy?" is a list of 768 numbers. Looking at these numbers doesn't give any intuitive understanding of what they represent, so for now just assume that they capture the meaning of the sentence. If you are curious about embeddings, go to YouTube and search "codebasics word embeddings"; you will find a bunch of videos with simple, intuitive explanations.
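One common way to make use of such vectors is to compare them with cosine similarity: sentences with similar meanings tend to get vectors that point in similar directions. The sketch below is only an illustration. The 4-dimensional vectors are made-up stand-ins for real 768-dimensional embeddings, and `cosine_similarity` is a helper defined here, not something from the notebook.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up toy vectors (assumption): real embeddings from the model have 768 dimensions.
refund_question = [0.9, 0.1, 0.0, 0.2]
refund_faq = [0.8, 0.2, 0.1, 0.3]
unrelated_faq = [0.0, 0.9, 0.4, 0.0]

sim_related = cosine_similarity(refund_question, refund_faq)
sim_unrelated = cosine_similarity(refund_question, unrelated_faq)

# The related pair scores higher than the unrelated pair.
print(sim_related > sim_unrelated)
```

With real embeddings, the same comparison could be run on the output of `hf.embed_query(...)` for two different sentences.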

### Vector store using FAISS

In [16]:
from langchain_community.vectorstores import FAISS

faiss_vectorstore_db = FAISS.from_documents(
    documents = data,
    embedding = hf
)

faiss_retriever = faiss_vectorstore_db.as_retriever(search_kwargs={'score_threshold': 0.8})

In [17]:
# Retrievers are runnables: invoke() replaces the deprecated get_relevant_documents()
retrieved_answer = faiss_retriever.invoke("How about job placement support ?")
retrieved_answer

[Document(metadata={'source': 'Do you provide any job assistance?', 'row': 11}, page_content='prompt: Do you provide any job assistance?\nresponse: Yes, We help you with resume and interview preparation along with that we help you in building online credibility, and based on requirements we refer candidates to potential recruiters.'),
 Document(metadata={'source': 'Do you provide any virtual internship?', 'row': 14}, page_content='prompt: Do you provide any virtual internship?\nresponse: Yes'),
 Document(metadata={'source': 'Will this bootcamp guarantee me a job?', 'row': 15}, page_content='prompt: Will this bootcamp guarantee me a job?\nresponse: The courses included in this bootcamp are done by 9000+ learners and many of them have secured a job which gives us ample confidence that you will be able to get a job. However, we want to be honest and do not want to make any impractical promises! Our guarantee is to prepare you for the job market by teaching the most relevant skills, knowle

### Create a retrieval QA chain along with a prompt template

In [18]:
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

prompt_template = """Given the following context and a question, generate an answer based on this context only.
In the answer try to provide as much text as possible from "response" section in the source document context without making much changes.
If the answer is not found in the context, kindly state "I don't know." Don't try to make up an answer.

CONTEXT: {context}

QUESTION: {question}"""


# Instantiate the prompt template with the constructor, declaring its input variables explicitly
customized_prompt = PromptTemplate(
    template = prompt_template,
    input_variables = ["context", "question"]
)

google_llm = GoogleGenerativeAI(model="gemini-pro", google_api_key = GOOGLE_API_KEY , temperature=0.6)

def format_docs(docs):
    # Join only the retrieved documents passed in (not the full `data` list)
    return "\n\n".join(doc.page_content for doc in docs)


retrieval_chain = (
    {
        "context": faiss_retriever | format_docs,
        "question": RunnablePassthrough(),
    }
    | customized_prompt
    | google_llm
    | StrOutputParser()
)

retrieval_chain.invoke("How about job placement support?")

"I don't know."

In [19]:
retrieval_chain.invoke("Do you provide job assistance and also do you provide job gurantee?")

'Yes, We help you with resume and interview preparation along with that we help you in building online credibility, and based on requirements we refer candidates to potential recruiters. The courses included in this bootcamp are done by 9000+ learners and many of them have secured a job which gives us ample confidence that you will be able to get a job. However, we want to be honest and do not want to make any impractical promises! Our guarantee is to prepare you for the job market by teaching the most relevant skills, knowledge & timeless principles good enough to fetch the job.'

As you can see above, the answer to this question comes from two different FAQs in our CSV file: the chain pulls both entries and merges their responses nicely.
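Merging is possible because the retrieved documents are joined into a single context string before the prompt is filled in. The sketch below shows that joining step in isolation. `FakeDoc` is a hypothetical stand-in for LangChain's `Document` (only `page_content` is modelled), and the two entries are copied from the retriever output shown earlier.

```python
from dataclasses import dataclass

@dataclass
class FakeDoc:
    # Hypothetical stand-in for a LangChain Document; only page_content is needed here.
    page_content: str

def format_docs(docs):
    """Join the page content of the given documents, separated by blank lines."""
    return "\n\n".join(doc.page_content for doc in docs)

retrieved = [
    FakeDoc(
        "prompt: Do you provide any job assistance?\n"
        "response: Yes, We help you with resume and interview preparation along with that "
        "we help you in building online credibility, and based on requirements we refer "
        "candidates to potential recruiters."
    ),
    FakeDoc("prompt: Do you provide any virtual internship?\nresponse: Yes"),
]

context = format_docs(retrieved)
print(context)
```

The resulting string is what would fill the `{context}` slot of the prompt template, so the LLM sees both FAQ entries at once.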

In [20]:
retrieval_chain.invoke("Do you guys provide internship and also do you offer EMI payments?")

"I don't know."

In [21]:
retrieval_chain.invoke("Do you have JavaScript courses?")

Retrying langchain_google_genai.llms._completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised ResourceExhausted: 429 Resource has been exhausted (e.g. check quota)..
Retrying langchain_google_genai.llms._completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised ResourceExhausted: 429 Resource has been exhausted (e.g. check quota)..


"I don't know."

In [22]:
retrieval_chain.invoke("Should I learn Power BI or Tableau ?")

"I don't know."

In [23]:
retrieval_chain.invoke("I've a MAC computer. Can I use powerbi on it?")

"I don't know."