### Basic usage of the Google PaLM LLM in LangChain


In [5]:
from langchain_google_genai import GoogleGenerativeAI
from dotenv import load_dotenv
import os
# get this free api key from https://makersuite.google.com/

load_dotenv()  # load environment variables from .env (here, GOOGLE_API_KEY)


llm = GoogleGenerativeAI(model="models/text-bison-001", google_api_key=os.environ["GOOGLE_API_KEY"], temperature=0.1)

In [6]:
poem = llm("Write a 4 line poem for depression")
print(poem)

**Depression**

A dark cloud that follows,
A weight that never lifts,
A thief of joy and hope,
A prison with no escape.


### Now let's load the data from the dreaddit train.csv file


In [7]:
from langchain.document_loaders.csv_loader import CSVLoader

loader = CSVLoader(file_path='dreaddit/train.csv', source_column="post")

# Store the loaded data in the 'data' variable
data = loader.load()
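
As a rough illustration of what the loader produces, the sketch below mimics it with the standard library only. It is a stand-in, not LangChain's implementation: each CSV row becomes one document whose text is the row's `column: value` pairs joined by newlines, and the value of the source column is recorded as metadata. The two-row sample and the `rows_to_documents` helper are hypothetical and exist only for this example.

```python
import csv
import io

# Hypothetical two-row sample standing in for dreaddit/train.csv.
SAMPLE_CSV = """post,response
"Consider this post: ""I slept well."" Question: Does the poster suffer from stress?","No, the poster does not suffer from stress."
"Consider this post: ""I cannot cope anymore."" Question: Does the poster suffer from stress?","Yes, the poster suffers from stress."
"""


def rows_to_documents(csv_text, source_column):
    """Turn each CSV row into a dict with 'page_content' and 'metadata'."""
    documents = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_index, row in enumerate(reader):
        # One "column: value" line per column, joined by newlines.
        page_content = "\n".join(f"{key}: {value}" for key, value in row.items())
        metadata = {"source": row[source_column], "row": row_index}
        documents.append({"page_content": page_content, "metadata": metadata})
    return documents


docs = rows_to_documents(SAMPLE_CSV, source_column="post")
print(len(docs))
print(docs[0]["page_content"])
```

This matches the shape of the retrieved documents later in the notebook, whose `page_content` starts with `post: ...` followed by a `response: ...` line.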

### Hugging Face Embeddings


In [8]:
from InstructorEmbedding import INSTRUCTOR
model = INSTRUCTOR('hkunlp/instructor-large')
sentence = "3D ActionSLAM: wearable person tracking in multi-floor environments"
instruction = "Represent the Science title:"
embeddings = model.encode([[instruction,sentence]])
print(embeddings)

load INSTRUCTOR_Transformer
max_seq_length  512
[[-6.15552627e-02  1.04199704e-02  5.88438474e-03  1.93768851e-02
   5.71417809e-02  2.57655438e-02 -4.01949983e-05 -2.80044544e-02
  -2.92965565e-02  4.91884872e-02  6.78200200e-02  2.18692329e-02
   4.54528667e-02  1.50187155e-02 -4.84451763e-02 -3.25259715e-02
  -3.56492773e-02  1.19935405e-02 -6.83917757e-03  3.03126313e-02
   5.17491512e-02  3.48140411e-02  4.91032843e-03  6.68928549e-02
   1.52824540e-02  3.54217142e-02  1.07743582e-02  6.89828768e-02
   4.44019474e-02 -3.23419608e-02  1.24268020e-02 -2.15528086e-02
  -1.62690766e-02 -4.15058173e-02 -2.42291158e-03 -3.07157822e-03
   4.27047275e-02  1.56428572e-02  2.57812925e-02  5.92843145e-02
  -1.99174173e-02  1.32361818e-02  1.08408015e-02 -4.00610566e-02
  -1.36213051e-03 -1.57032814e-02 -2.53812131e-02 -1.31972972e-02
  -7.83779565e-03 -1.14009101e-02 -4.82025519e-02 -2.58416049e-02
  -4.98769898e-03  4.98239547e-02  1.19490270e-02 -5.55060506e-02
  -2.82120295e-02 -3.3220872

In [9]:
from langchain_community.embeddings import HuggingFaceInstructEmbeddings

# Initialize instructor embeddings using the Hugging Face model
instructor_embeddings = HuggingFaceInstructEmbeddings(model_name='hkunlp/instructor-large')

e = instructor_embeddings.embed_query("Consider this post: ""When moving into their tiny house, they would be given a state I.D. with that tiny house's address on it as well as a list of strict rules they have to follow lest they lose some privileges or even be evicted from their house depending on the rules broken. So basically they won't be homeless anymore but ""homed"" which is why the place would be called ""Homed"". Anyway, the homed will have to pay rent for their tiny homes by either getting a job (which is why it is important for them to get a state I.D.) or doing volunteer work around the community (e.g."" Question: Does the poster suffer from stress?") 

load INSTRUCTOR_Transformer
max_seq_length  512


In [10]:
len(e)

768

In [11]:
e[:5]

[-0.02135140262544155,
 -0.028076183050870895,
 -0.015025210566818714,
 0.03500403091311455,
 0.02205514721572399]

As you can see above, the embedding for the post is a list of 768 numbers. Looking at the individual numbers doesn't give any intuitive understanding of what they mean; just assume that together they capture the meaning of the text that was embedded. If you are curious about embeddings, go to YouTube and search "codebasics word embeddings" and you will find a bunch of videos with simple, intuitive explanations.
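
One way to build intuition is to compare vectors instead of reading them: texts with similar meaning should get vectors that point in similar directions. The sketch below computes cosine similarity on toy 3-dimensional vectors. The values and variable names are made up for illustration; the real embeddings in this notebook have 768 dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors, not real model output.
stressed_post = [0.9, 0.1, 0.2]
anxious_post = [0.8, 0.2, 0.3]
recipe_post = [0.1, 0.9, 0.1]

similar = cosine_similarity(stressed_post, anxious_post)
dissimilar = cosine_similarity(stressed_post, recipe_post)
print(similar > dissimilar)  # True
```

The same function can be applied to two lists returned by `instructor_embeddings.embed_query` to compare real posts.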


### Vector store using FAISS


In [12]:
from langchain_community.vectorstores import FAISS

In [13]:
# Command to generate vector database from data 

# Create a FAISS instance for vector database from 'data'
#vectordb = FAISS.from_documents(documents=data,embedding=instructor_embeddings)

# Create a retriever for querying the vector database
#retriever = vectordb.as_retriever(score_threshold = 0.7)

In [14]:
# Save the FAISS index to disk. This was already done (the faiss_index folder exists), so there is no need to rerun it.
# vectordb.save_local("faiss_index")

In [19]:
vectordb=FAISS.load_local("faiss_index", instructor_embeddings,allow_dangerous_deserialization=True)
retriever = vectordb.as_retriever(score_threshold = 0.7,search_kwargs={"k": 2})

In [20]:
rdocs = retriever.get_relevant_documents("Consider this post: ""When moving into their tiny house, they would be given a state I.D. with that tiny house's address on it as well as a list of strict rules they have to follow lest they lose some privileges or even be evicted from their house depending on the rules broken. So basically they won't be homeless anymore but ""homed"" which is why the place would be called ""Homed"". Anyway, the homed will have to pay rent for their tiny homes by either getting a job (which is why it is important for them to get a state I.D.) or doing volunteer work around the community (e.g."" Question: Does the poster suffer from stress?")
rdocs  

[Document(page_content='post: Consider this post: "What do you think would happen if you invited an individual with mental health issues who had been homeless for many years to move directly from the street into housing? Loyd Pendleton shares how he went from skeptic to believer in the Housing First approach to homelessness -- providing the displaced with short-term assistance to find permanent housing quickly and without conditions -- and how it led to a 91 percent reduction in chronic homelessness over a ten-year period in Utah. <url>" Question: Does the poster suffer from stress?\nresponse: No, the poster does not suffer from stress. Reasoning: The post is discussing a topic related to mental health and homelessness, but there is no indication of the poster\'s emotional state or personal experiences in the post. The language used is neutral and informational, and there is no expression of stress or emotional distress. Therefore, the poster does not appear to be suffering from stress

As you can see above, the retriever built from FAISS and the Hugging Face embeddings can now pull relevant documents from our original CSV knowledge store. This is very powerful, and it will help us further in our project.
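
Conceptually, retrieval scores every stored vector against the query vector and returns the best `k` matches. The sketch below is a brute-force version in plain Python, not how FAISS is implemented. The `retrieve` helper, the toy vectors, the use of cosine similarity as the score, and the way the threshold is applied are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def retrieve(query_vector, indexed, k=2, score_threshold=None):
    """Return up to k (score, text) pairs, best score first."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    if score_threshold is not None:
        # Drop matches whose score falls below the threshold.
        scored = [pair for pair in scored if pair[0] >= score_threshold]
    return scored[:k]


# Hypothetical toy index: (text, vector) pairs.
indexed = [
    ("post about housing", [0.9, 0.1, 0.0]),
    ("post about exams", [0.1, 0.9, 0.1]),
    ("post about rent", [0.8, 0.2, 0.1]),
]
query = [1.0, 0.1, 0.0]

results = retrieve(query, indexed, k=2)
print([text for _, text in results])  # ['post about housing', 'post about rent']
```

A vector index such as FAISS does the same job without comparing the query against every stored vector one by one in Python, which matters once the CSV holds thousands of posts.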


##### Embeddings can also be created with Google PaLM, and Chroma can be used as the vector database, as shown below. In our experiments, Hugging Face embeddings with FAISS were more appropriate for our use case


In [14]:
# from langchain_community.embeddings import GooglePalmEmbeddings
# google_palm_embeddings = GooglePalmEmbeddings(google_api_key=os.environ["GOOGLE_API_KEY"])

# from langchain.vectorstores import Chroma
# vectordb = Chroma.from_documents(data,
#                            embedding=google_palm_embeddings,
#                            persist_directory='./chromadb')
# vectordb.persist()

### Create RetrievalQA chain along with prompt template 🚀


In [21]:
from langchain.prompts import PromptTemplate
prompt_template = """You are a psychiatrist - a medical doctor (an M.D. or D.O.) who specializes in mental health, including substance use disorders, anxiety, and detecting depression. Using the provided context and your expertise as a psychiatrist, evaluate the social media post provided in the POST section and answer the question.

Example: Consider this post: "This is good and bad... I was a STEM major before and probably would have had a challenging and high paid career by now, had my life taken a different course. However, post PTSD, I found myself extremely connected with the arts. I have intense emotional reactions to aesthetic experiences, and I think that’s largely responsible for the fact that I now work in a creative industry. I experience aesthetics in a unique way that I’m not sure resonates with the typical non-affected individual.",
Question: "Does the poster suffer from stress?",
Response : "No, the poster does not suffer from stress. Reasoning: The poster describes a shift in career choices from STEM to the arts due to their intense emotional reactions to aesthetic experiences. While they mention having PTSD, there is no indication in the post that they are currently experiencing stress or emotional distress related to it. The tone of the post is reflective and positive, focusing on the unique way they experience aesthetics rather than any negative impact on their well-being. Therefore, it is not likely that the poster is currently suffering from stress."


CONTEXT: ```{context}```

POST: ```{post}```

QUESTION: ```Does the poster suffer from stress?```"""


PROMPT = PromptTemplate(
    template=prompt_template, input_variables=["context", "post"]
)
chain_type_kwargs = {"prompt": PROMPT}


from langchain.chains import RetrievalQA

chain = RetrievalQA.from_chain_type(llm=llm,
                            chain_type="stuff",
                            retriever=retriever,
                            input_key="query",
                            return_source_documents=True,
                            chain_type_kwargs=chain_type_kwargs)


### We are all set 👍🏼 Let's ask some questions now


In [22]:
chain("Consider this post: When moving into their tiny house, they would be given a state I.D. with that tiny house's address on it as well as a list of strict rules they have to follow lest they lose some privileges or even be evicted from their house depending on the rules broken. So basically they won't be homeless anymore but ""homed"" which is why the place would be called ""Homed"". Anyway, the homed will have to pay rent for their tiny homes by either getting a job (which is why it is important for them to get a state I.D.) or doing volunteer work around the community")

ValueError: Missing some input keys: {'post'}
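
The `ValueError` above is consistent with how the legacy `RetrievalQA` "stuff" chain fills its prompt: it supplies the retrieved documents as `context` and the query text as `question`, so a template that declares `post` is left with a variable nobody fills. The standard-library sketch below reproduces that check in miniature. The `template_variables` helper and the shortened template are hypothetical, and the set of supplied keys is stated as an assumption about that chain, not read from LangChain at runtime.

```python
from string import Formatter


def template_variables(template):
    """Collect the {placeholder} names used in a format-style template."""
    return {field for _, field, _, _ in Formatter().parse(template) if field}


# Shortened stand-in for the prompt used in this notebook.
template = (
    "CONTEXT: {context}\n\n"
    "POST: {post}\n\n"
    "QUESTION: Does the poster suffer from stress?"
)

# Assumption: the keys the "stuff" RetrievalQA chain passes to its prompt.
supplied_by_chain = {"context", "question"}

missing = template_variables(template) - supplied_by_chain
print(missing)  # {'post'}

# Renaming the placeholder to the key the chain supplies removes the mismatch.
fixed_template = template.replace("{post}", "{question}")
print(template_variables(fixed_template) - supplied_by_chain)  # set()
```

Under that assumption, the matching change in the notebook would be to use `{question}` in the template and `input_variables=["context", "question"]` in `PromptTemplate`.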

In [23]:
## USE OF MISTRAL MODEL
hf_repo_id = 'mistralai/Mistral-7B-Instruct-v0.1'

In [29]:
from langchain.llms import HuggingFaceHub
llm = HuggingFaceHub(
            repo_id=hf_repo_id,
            model_kwargs={"temperature": 0.2, "max_length": 32000}, huggingfacehub_api_token = os.environ["HUGGING_FACE_API_KEY"]
        )

  warn_deprecated(


In [31]:
from langchain.memory import ConversationBufferMemory
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
retriever = vectordb.as_retriever(search_kwargs={"k": 2})
qa = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=retriever, memory=memory)

In [32]:
## Print only the final response
def process_llm_response(llm_response):
    print(llm_response['result'])

In [34]:
# full example
query = "Consider this post: ""When moving into their tiny house, they would be given a state I.D. with that tiny house's address on it as well as a list of strict rules they have to follow lest they lose some privileges or even be evicted from their house depending on the rules broken. So basically they won't be homeless anymore but ""homed"" which is why the place would be called ""Homed"". Anyway, the homed will have to pay rent for their tiny homes by either getting a job (which is why it is important for them to get a state I.D.) or doing volunteer work around the community (e.g."" Question: Does the poster suffer from stress?"
llm_response = qa(query)
process_llm_response(llm_response)

Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.

post: Consider this post: "What do you think would happen if you invited an individual with mental health issues who had been homeless for many years to move directly from the street into housing? Loyd Pendleton shares how he went from skeptic to believer in the Housing First approach to homelessness -- providing the displaced with short-term assistance to find permanent housing quickly and without conditions -- and how it led to a 91 percent reduction in chronic homelessness over a ten-year period in Utah. <url>" Question: Does the poster suffer from stress?
response: No, the poster does not suffer from stress. Reasoning: The post is discussing a topic related to mental health and homelessness, but there is no indication of the poster's emotional state or personal experiences in the post. The language used is neutral and in