### Basic usage of the Google PaLM LLM in LangChain

In [1]:
# Install dependencies
!pip install langchain==0.0.284 python-dotenv==1.0.0 streamlit==1.22.0 tiktoken==0.4.0 faiss-cpu==1.7.4 protobuf~=3.19.0 google-generativeai InstructorEmbedding sentence-transformers



In [13]:
from langchain.vectorstores import FAISS
from langchain.llms import GooglePalm
from langchain.document_loaders.csv_loader import CSVLoader
from langchain.embeddings import HuggingFaceInstructEmbeddings
from langchain.prompts import PromptTemplate
from langchain.chains import RetrievalQA
import logging

# Set the logging level to ignore warnings from the sentence_transformers module
logging.getLogger('sentence_transformers').setLevel(logging.ERROR)

api_key = 'Generate your api key'  # placeholder: replace with a free API key from https://makersuite.google.com/
llm = GooglePalm(google_api_key=api_key, temperature=0.1)
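Hardcoding the key in a notebook makes it easy to leak. Here is a minimal sketch of reading it from the environment instead, assuming the key is stored in a variable named `GOOGLE_API_KEY` (a hypothetical name chosen for this example). The `python-dotenv` package installed above can populate the environment from a `.env` file before this runs.

```python
import os


def read_api_key(env_var: str = "GOOGLE_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    The variable name is an assumption made for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key
```

The result could then be passed along as `GooglePalm(google_api_key=read_api_key(), temperature=0.1)`.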

### Now let's load data from the aboutme CSV file

In [5]:
# Specify the encoding as 'latin-1'
file_encoding = 'latin-1'
loader = CSVLoader(file_path='/content/aboutme.csv', source_column="prompt", encoding=file_encoding)
# Store the loaded data in the 'data' variable
data = loader.load()
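To make the loader's output less opaque, here is a standard-library sketch of roughly what `CSVLoader` produces for each row, inferred from the documents printed later in this notebook: the page content is the row's columns joined as `column: value` lines, and the metadata holds the `source` column's value plus the row index. The function `rows_to_documents` and the sample row are made up for illustration and are not part of LangChain.

```python
import csv
import io


def rows_to_documents(csv_text: str, source_column: str) -> list[dict]:
    """Mimic the shape of the documents CSVLoader returns (illustration only)."""
    documents = []
    for row_index, row in enumerate(csv.DictReader(io.StringIO(csv_text))):
        # One "column: value" line per column, joined with newlines
        page_content = "\n".join(
            f"{key.strip()}: {(value or '').strip()}" for key, value in row.items()
        )
        documents.append({
            "page_content": page_content,
            "metadata": {"source": row[source_column], "row": row_index},
        })
    return documents


# Hypothetical one-row CSV with the same two columns as aboutme.csv
sample = "prompt,response\nWhat is 2 + 2?,4\n"
docs = rows_to_documents(sample, source_column="prompt")
print(docs[0]["page_content"])
print(docs[0]["metadata"])
```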


### Hugging Face Embeddings

In [8]:
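# Initialize embeddings through the Instructor wrapper with a Hugging Face checkpoint.
# Note: deepset/roberta-large-squad2 is a question-answering checkpoint rather than an
# Instructor embedding model, which is why the weight-initialization warning below appears.
instructor_embeddings = HuggingFaceInstructEmbeddings(model_name="deepset/roberta-large-squad2")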



Some weights of RobertaModel were not initialized from the model checkpoint at deepset/roberta-large-squad2 and are newly initialized: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

### Vector store using FAISS

In [9]:
vectordb_file_path = '/content/faiss_index'
# Create a FAISS instance for vector database from 'data'
vectordb = FAISS.from_documents(documents=data, embedding=instructor_embeddings)
# Save vector database locally
vectordb.save_local(vectordb_file_path)



In [14]:
# Load the vector database from the local folder
vectordb = FAISS.load_local(vectordb_file_path, instructor_embeddings)

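# Create a retriever for querying the vector database.
# Note: passing score_threshold directly like this may not actually filter results;
# LangChain's documented form is search_type="similarity_score_threshold" with
# search_kwargs={"score_threshold": 0.7}.
retriever = vectordb.as_retriever(score_threshold=0.7)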
rdocs = retriever.get_relevant_documents("how about job placement support?")
rdocs

[Document(page_content='prompt: What programming languages is Mayank proficient in?\nresponse: ', metadata={'source': 'What programming languages is Mayank proficient in?', 'row': 266}),
 Document(page_content='prompt: How does Mayank contribute to team goals?\nresponse: Mayank focuses on application features, code quality, and process improvement in his role.', metadata={'source': 'How does Mayank contribute to team goals? ', 'row': 37}),
 Document(page_content='prompt: How does Mayank contribute to team goals?\nresponse: Mayank focuses on application features, code quality, and process improvement in his role.', metadata={'source': 'How does Mayank contribute to team goals? ', 'row': 130}),
 Document(page_content='prompt: How does Mayank prioritize goals when dealing with multiple objectives?\nresponse: Mayank emphasizes prioritization when dealing with multiple goals.', metadata={'source': 'How does Mayank prioritize goals when dealing with multiple objectives? ', 'row': 44})]

As you can see above, the retriever built from FAISS and Hugging Face embeddings can now pull the closest-matching documents from our original CSV knowledge store. This retrieval step is what the question-answering chain later in this notebook builds on.
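Conceptually, the retriever embeds the query and returns the stored rows whose vectors lie closest to it. Below is a minimal pure-Python sketch of that nearest-neighbour step, assuming squared Euclidean (L2) distance and tiny hand-written vectors in place of real embeddings. `top_k_nearest` is a hypothetical helper; FAISS does the same job with optimized index structures.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top_k_nearest(query_vector, stored, k=4):
    """Return the k (text, distance) pairs closest to query_vector."""
    scored = [(text, squared_l2(query_vector, vec)) for text, vec in stored]
    scored.sort(key=lambda pair: pair[1])  # smaller distance means closer
    return scored[:k]


# Toy two-dimensional "embeddings" (made up for illustration)
stored = [("row A", [0.0, 1.0]), ("row B", [1.0, 0.0]), ("row C", [0.9, 0.1])]
nearest = top_k_nearest([1.0, 0.0], stored, k=2)
print([text for text, _ in nearest])
```

Note that a plain top-k lookup always returns k rows, however far away they are, so an off-topic query still gets results unless a distance or score cutoff is applied.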

##### Embeddings can also be created with GooglePalm, and Chroma (chromadb) can serve as the vector database. In our experiments, Hugging Face embeddings with FAISS were a better fit for our use case.

### Create RetrievalQA chain along with prompt template 🚀

In [15]:
prompt_template = """Given the following context and a question, generate an answer based on this context only.
In the result only return response.
In the answer try to provide as much text as possible from "response" section in the source document context without making much changes.
If the answer is not found in the context, kindly state "I don't know. You can email your query to my team at msrajawat298@gmail.com." Don't try to make up an answer.

CONTEXT: {context}

QUESTION: {question}"""

PROMPT = PromptTemplate(
    template=prompt_template, input_variables=["context", "question"]
)

chain = RetrievalQA.from_chain_type(llm=llm,
                                    chain_type="stuff",
                                    retriever=retriever,
                                    input_key="query",
                                    return_source_documents=True,
                                    chain_type_kwargs={"prompt": PROMPT})
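With `chain_type="stuff"`, the retrieved documents are concatenated into a single context string and substituted into the prompt, which is then sent to the LLM in one call. Here is a rough sketch of that assembly step using plain string formatting, assuming a blank-line separator between documents. `build_stuff_prompt` is a hypothetical helper, not a LangChain function, and the template is shortened for readability.

```python
# Shortened stand-in for the notebook's prompt template
TEMPLATE = "CONTEXT: {context}\n\nQUESTION: {question}"


def build_stuff_prompt(page_contents, question, separator="\n\n"):
    """Join the retrieved page contents and fill the template's placeholders."""
    context = separator.join(page_contents)
    return TEMPLATE.format(context=context, question=question)


# Placeholder documents in the same "prompt/response" shape as the CSV rows
prompt_text = build_stuff_prompt(
    ["prompt: Q1\nresponse: A1", "prompt: Q2\nresponse: A2"],
    "What is A1?",
)
print(prompt_text)
```

Because everything is packed into one prompt, this approach suits small result sets; many or long documents could exceed the model's context window.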




### We are all set 👍🏼 Let's ask some questions now

In [18]:
chain('Who is Mayank Singh Kushwah?')

{'query': 'Who is Mayank Singh Kushwah?',
 'result': 'Mayank Singh Kushwah is a software engineer with a passion for learning and growing. He is currently working as a DevOps Engineer at Google.',
 'source_documents': [Document(page_content="prompt: What are Mayank's future career goals?\nresponse: Gain insights into Mayank's aspirations and where he envisions taking his career in the future.", metadata={'source': "What are Mayank's future career goals?", 'row': 31}),
  Document(page_content='prompt: What is the online alias used by Mayank Singh Kushwah?\nresponse: The online alias used by Mayank Singh Kushwah is "msrajawat298." |', metadata={'source': 'What is the online alias used by Mayank Singh Kushwah? ', 'row': 73}),
  Document(page_content='prompt: What is msrajawat298 experience with Ansible?\nresponse: Mayank is gaining expertise in Ansible for configuration management, application deployment, and task automation.', metadata={'source': 'What is msrajawat298 experience with Ans