### Trying out Google PaLM

In [1]:
# pip install google-generativeai
# pip install -r requirements.txt
# conda install -c conda-forge sentence-transformers

from langchain.llms import GooglePalm

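import os

# Read the key from the environment rather than hardcoding it in the notebook.
# GOOGLE_API_KEY is an assumed variable name; use whichever name you export.
api_key = os.environ["GOOGLE_API_KEY"]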
llm = GooglePalm(google_api_key = api_key, temperature = 0.1)

In [2]:
poem = llm("Write a haiku about travel")
print(poem)

**Travel the world**
**See new places, meet new people**
**Expand your horizons**


### Loading the CSV file

In [3]:
from langchain.document_loaders.csv_loader import CSVLoader

loader = CSVLoader(file_path = "codebasics_faqs.csv", source_column = "prompt")
data = loader.load()

In [4]:
data[0]

Document(page_content='prompt: I have never done programming in my life. Can I take this bootcamp?\nresponse: Yes, this is the perfect bootcamp for anyone who has never done coding and wants to build a career in the IT/Data Analytics industry or just wants to perform better in your current job or business using data.', metadata={'source': 'I have never done programming in my life. Can I take this bootcamp?', 'row': 0})
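
To make the shape of the loaded `Document` above easier to read, here is a minimal standard-library sketch of how a CSV row can become a `page_content` string plus `source` and `row` metadata. It is an illustration only, not LangChain's `CSVLoader` implementation; `SAMPLE_CSV` and `load_rows` are hypothetical names, and the two rows are a made-up sample in the same prompt/response layout.

```python
import csv
import io

# Hypothetical two-row sample in the same prompt/response layout as the FAQ file.
SAMPLE_CSV = """prompt,response
Do we have an EMI option?,No
Do you provide any virtual internship?,Yes
"""


def load_rows(csv_text, source_column):
    """Turn each CSV row into a dict holding page_content and metadata."""
    docs = []
    for i, row in enumerate(csv.DictReader(io.StringIO(csv_text))):
        # Each column becomes a "name: value" line, joined by newlines.
        content = "\n".join(f"{key}: {value}" for key, value in row.items())
        docs.append({
            "page_content": content,
            "metadata": {"source": row[source_column], "row": i},
        })
    return docs


docs = load_rows(SAMPLE_CSV, source_column="prompt")
print(docs[0]["page_content"])
print(docs[0]["metadata"])
```

Setting `source_column` to `"prompt"` is what makes the question text show up as the `source` in the metadata.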

### Creating Embeddings for the data

- We are using the Instructor Embeddings from HuggingFace (https://instructor-embedding.github.io/)
- Unlike most embedding models, a single Instructor model can generate text embeddings tailored to different downstream tasks and domains, without any further training.
- Guide to embedding: https://huggingface.co/hkunlp/instructor-large
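
As a rough illustration of what "tailored to different tasks" means in practice: Instructor-style models receive each input as an `[instruction, text]` pair, so the same text can be embedded differently depending on the instruction. The sketch below only builds those pairs and does not load any model. The query instruction is the one used in this notebook; `DOC_INSTRUCTION` and `build_pairs` are assumptions made for the example, not confirmed library defaults.

```python
# Instruction taken from this notebook's query embedder.
QUERY_INSTRUCTION = "Represent the query for retrieval: "
# Assumed example instruction for documents (not a confirmed library default).
DOC_INSTRUCTION = "Represent the document for retrieval: "


def build_pairs(instruction, texts):
    """Pair every text with the instruction that describes its role."""
    return [[instruction, text] for text in texts]


query_pairs = build_pairs(QUERY_INSTRUCTION, ["What is your refund policy?"])
doc_pairs = build_pairs(
    DOC_INSTRUCTION,
    ["prompt: Do we have an EMI option?\nresponse: No"],
)

print(query_pairs)
print(doc_pairs)
```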


In [5]:
# Quick embedding example
from langchain.embeddings import HuggingFaceInstructEmbeddings

embeddings = HuggingFaceInstructEmbeddings(query_instruction = "Represent the query for retrieval: ")
e = embeddings.embed_query("What is your refund policy?")

print(len(e))
e[:5]

load INSTRUCTOR_Transformer
max_seq_length  512
768


[-0.04209407791495323,
 0.008169054053723812,
 0.0003134481084998697,
 0.019235705956816673,
 0.03759319335222244]

In [6]:
# Embedding for the project
# Initialize instructor embeddings using the Hugging Face model
instructor_embeddings = HuggingFaceInstructEmbeddings(model_name="hkunlp/instructor-large")

load INSTRUCTOR_Transformer
max_seq_length  512


The embedding for the sentence "What is your refund policy?" is a list of 768 numbers. The individual numbers have no intuitive interpretation; for now, just assume that together they capture the meaning of the sentence.
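
One way to build intuition for "capturing the meaning" is to compare vectors rather than read them: texts with similar meaning should have embeddings that point in similar directions. The sketch below computes cosine similarity in plain Python. The three 3-dimensional vectors are made up for illustration and stand in for real 768-dimensional embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up toy vectors standing in for real embeddings.
refund_policy = [0.9, 0.1, 0.0]   # "What is your refund policy?"
money_back = [0.8, 0.2, 0.1]      # a similar question about getting money back
course_length = [0.1, 0.9, 0.3]   # an unrelated question about course duration

similar = cosine_similarity(refund_policy, money_back)
unrelated = cosine_similarity(refund_policy, course_length)
print(similar > unrelated)
```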

### Vector Store using FAISS

- Usually, people keep the full dataset in a dedicated vector store.
- To keep things simple, we hold the FAISS index in memory and query it through a retriever object.
- When a new question arrives, it is embedded, and the retriever finds the stored entries with the most similar embeddings and returns them.
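
The retrieval step described above can be sketched with a brute-force, in-memory store. This is a stand-in for illustration, not FAISS itself: `ToyVectorStore` is a hypothetical class, the 2-dimensional vectors are made up, and squared Euclidean distance is assumed as the similarity measure.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class ToyVectorStore:
    """Keeps (vector, text) pairs in a list and searches them by brute force."""

    def __init__(self):
        self.entries = []

    def add(self, vector, text):
        self.entries.append((vector, text))

    def search(self, query_vector, k=2):
        # Rank every stored entry by distance to the query, closest first.
        ranked = sorted(self.entries, key=lambda e: squared_l2(query_vector, e[0]))
        return [text for _, text in ranked[:k]]


store = ToyVectorStore()
# Made-up vectors; in the notebook these would come from the embedding model.
store.add([0.9, 0.1], "Lifetime access: Yes")
store.add([0.8, 0.3], "Bootcamp duration: about 3 months")
store.add([0.1, 0.9], "EMI option: No")

query_vector = [0.85, 0.15]  # stands in for the embedded question
results = store.search(query_vector, k=2)
print(results)
```

A real vector index does the same job with optimized data structures, which matters once the collection grows beyond a few thousand entries.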

In [8]:
from langchain.vectorstores import FAISS

# Create a FAISS instance for vector database from 'data'
# This creates a vector DB!
vectordb = FAISS.from_documents(documents = data,
                                 embedding = instructor_embeddings)

In [9]:
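# Create a retriever for querying the vector database.
# Note: LangChain documents the threshold form as
# as_retriever(search_type="similarity_score_threshold", search_kwargs={"score_threshold": 0.7});
# passed directly as below, the threshold may not be applied.
retriever = vectordb.as_retriever(score_threshold = 0.7)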

rdocs = retriever.get_relevant_documents("For how long is this course valid?")
rdocs

[Document(page_content='prompt: Once purchased, is this course available for lifetime access?\nresponse: Yes', metadata={'source': 'Once purchased, is this course available for lifetime access?', 'row': 22}),
 Document(page_content='prompt: What is the duration of this bootcamp? How long will it last?\nresponse: You can complete all courses in 3 months if you dedicate 2-3 hours per day.', metadata={'source': 'What is the duration of this bootcamp? How long will it last?', 'row': 8}),
 Document(page_content='prompt: Will the course be upgraded when there are new features in Power BI?\nresponse: Yes, the course will be upgraded periodically based on the new features in Power BI, and learners who have already bought this course will have free access to the upgrades.', metadata={'source': 'Will the course be upgraded when there are new features in Power BI?', 'row': 27}),
 Document(page_content='prompt: Is this bootcamp enough for me in Microsoft Power BI and\n Excel certifications?\nrespo

### Create RetrievalQA chain along with prompt template

 - Here, "context" is the text of the CSV rows that the retriever returns for the question.

In [10]:
from langchain.prompts import PromptTemplate

prompt_template = """Given the following context and a question, generate an answer based on this context only.
In the answer try to provide as much text as possible from "response" section in the source document context without making much changes.
If the answer is not found in the context, kindly state "I don't know." Don't try to make up an answer.

CONTEXT: {context}

QUESTION: {question}"""


PROMPT = PromptTemplate(
    template=prompt_template, input_variables=["context", "question"]
)
chain_type_kwargs = {"prompt": PROMPT}


from langchain.chains import RetrievalQA

chain = RetrievalQA.from_chain_type(llm=llm,
                            chain_type="stuff",
                            retriever=retriever,
                            input_key="query",
                            return_source_documents=True,
                            chain_type_kwargs=chain_type_kwargs)
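
To show what the "stuff" chain type amounts to, here is a minimal sketch in plain Python: the retrieved documents are concatenated into one context string and substituted into the template together with the question. `stuff_prompt` is a hypothetical helper, the template is shortened, and the blank-line separator between documents is an assumption; the real chain also sends the resulting prompt to the LLM.

```python
# Shortened version of the notebook's template, for illustration only.
TEMPLATE = """Given the following context and a question, generate an answer based on this context only.

CONTEXT: {context}

QUESTION: {question}"""


def stuff_prompt(template, documents, question):
    """Join all retrieved texts into one context block and fill the template."""
    context = "\n\n".join(documents)  # assumed separator between documents
    return template.format(context=context, question=question)


retrieved = [
    "prompt: Do you provide any virtual internship?\nresponse: Yes",
    "prompt: Do we have an EMI option?\nresponse: No",
]
final_prompt = stuff_prompt(TEMPLATE, retrieved, "Do you offer EMI payments?")
print(final_prompt)
```

Because every retrieved document is placed into a single prompt, this approach works best when the retriever returns only a handful of short entries, as it does with these FAQ rows.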

In [11]:
chain("Do you guys provide internship and also do you offer EMI payments?")

{'query': 'Do you guys provide internship and also do you offer EMI payments?',
 'result': "Yes, we provide virtual internship and no we don't offer EMI payments.",
 'source_documents': [Document(page_content='prompt: Do you provide any virtual internship?\nresponse: Yes', metadata={'source': 'Do you provide any virtual internship?', 'row': 14}),
  Document(page_content='prompt: Do we have an EMI option?\nresponse: No', metadata={'source': 'Do we have an EMI option?', 'row': 13}),
  Document(page_content='prompt: Do you provide any job assistance?\nresponse: Yes, We help you with resume and interview preparation along with that we help you in building online credibility, and based on requirements we refer candidates to potential recruiters.', metadata={'source': 'Do you provide any job assistance?', 'row': 11}),
  Document(page_content='prompt: How can I contact the instructors for any doubts/support?\nresponse: We have created every lecture with a motive to explain everything in an ea