# Conversation Retrieval Chain

This chain takes user input and does the following:

1. Creates a standalone question using
    - Few-shot examples
    - Chat history
    - Initial question

2. Provides an answer using
    - Standalone question
    - Retrieved documents based on the standalone question
    - Few-shot examples
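
The two stages above can be sketched in plain Python. This is only an illustration of the data flow, not the implementation used in this notebook: `run_two_stage_chain`, `toy_condense`, `toy_retrieve` and `toy_answer` are hypothetical stand-ins for the LLM and vector-store calls defined in the later cells.

```python
from typing import Callable, List


def run_two_stage_chain(
    question: str,
    chat_history: List[str],
    condense_question: Callable[[str, List[str]], str],
    retrieve: Callable[[str], List[str]],
    answer: Callable[[str, List[str]], str],
) -> dict:
    """Illustrative flow: condense the question, retrieve documents, then answer."""
    # Stage 1: fold relevant chat history into a standalone question.
    standalone = condense_question(question, chat_history)
    # Stage 2: retrieve documents for the standalone question and answer from them.
    docs = retrieve(standalone)
    return {
        "standalone_question": standalone,
        "docs": docs,
        "answer": answer(standalone, docs),
    }


# Toy stand-ins (assumptions for illustration only).
def toy_condense(question: str, history: List[str]) -> str:
    # With no history, the question is returned unchanged.
    return question if not history else f"{question} (context: {history[-1]})"


def toy_retrieve(query: str) -> List[str]:
    corpus = ["ADL means Activities of Daily Living.", "SPED means Special Education."]
    # Keyword overlap instead of a real vector search.
    keywords = {w.strip("?.,!").lower() for w in query.split() if len(w.strip("?.,!")) > 2}
    return [doc for doc in corpus if keywords & set(doc.lower().rstrip(".").split())]


def toy_answer(question: str, docs: List[str]) -> str:
    return docs[0] if docs else "No relevant document found."


result = run_two_stage_chain("What is ADL?", [], toy_condense, toy_retrieve, toy_answer)
print(result["answer"])
```

Keeping the stages as separate callables mirrors how the real chain is composed from smaller runnables, so each stage can be swapped or debugged on its own.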

In [1]:
import sys

! /opt/conda/envs/python3-11-6/bin/python3.11 -m pip install --no-cache-dir google-cloud-discoveryengine google-cloud-aiplatform langchain-core langchain faiss-cpu langchain-community



In [2]:
# !ls /opt/conda/envs/python3-11-6/bin | grep python

In [3]:
# !pip install -r requirements.txt
# !ls


In [4]:
# # Automatically restart kernel after installs so that your environment can access the new packages
# import IPython

# app = IPython.Application.instance()
# app.kernel.do_shutdown(True)

In [5]:
import sys

if "google.colab" in sys.modules:
    from google.colab import auth as google_auth

    google_auth.authenticate_user()

In [6]:
import time
import json
import faiss

from operator import itemgetter
from pydantic import BaseModel
from typing import List, Optional

from langchain.chat_models.vertexai import ChatVertexAI
from langchain.vectorstores import MatchingEngine
from langchain.embeddings import VertexAIEmbeddings
from langchain.memory import ConversationBufferMemory
from langchain.llms import VertexAI
from langchain.prompts import ChatPromptTemplate, PromptTemplate, FewShotChatMessagePromptTemplate, ChatMessagePromptTemplate, SystemMessagePromptTemplate, HumanMessagePromptTemplate
from langchain_core.output_parsers import StrOutputParser, JsonOutputParser, CommaSeparatedListOutputParser
from langchain_core.messages import get_buffer_string
from langchain_core.runnables import RunnableParallel, RunnablePassthrough, RunnableLambda, RunnableSequence
from langchain.schema import format_document
from langchain.docstore import InMemoryDocstore
from langchain.vectorstores import FAISS
from langchain.memory import VectorStoreRetrieverMemory
from langchain.prompts import SemanticSimilarityExampleSelector
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_core.prompt_values import ChatPromptValue, StringPromptValue


import pprint

pp = pprint.PrettyPrinter(indent=2)

PROJECT_ID="engaged-domain-403109"
REGION="asia-southeast1"
GCS_BUCKET="engaged-domain-403109-me-bucket"
ME_INDEX_ID="projects/510519063638/locations/asia-southeast1/indexes/4693366538231611392"
ME_ENDPOINT_ID="projects/510519063638/locations/asia-southeast1/indexEndpoints/3617586769429528576"

In [7]:
SYSTEM_TEMPLATE = """You are a helpful assistant from the Singapore government to caregivers or persons with disabilities, including the elderly. You are able to suggest relevant grants and schemes based on their unique circumstances and type of support they are looking for. You are most familiar with financial grants and schemes, but should still be able to answer generic questions related to caregiving or other support for persons with disabilities. 
You should first understand what type of support they are looking for. Only then, you should proceed to ask them for more information in order to provide more relevant recommendations. If possible, try to anticipate what they need and assess relevant information based on the context they provide. Determine the subject of their responses based on context of the most recent messages they have sent.
As part of your assessment to provide recommendations, you should ideally consider the beneficiary's age, impairment, activities of daily living that they need assistance with, and the average income per capita in their household. With regard to the activities of daily living, there are six pre-defined categories you should look out for: eating, dressing, toileting, bathing, walking or moving around, and transferring from bed to chair and vice versa.

If the user is unwilling to share any information, do not force them to disclose this information, but don't give up. Move on to ask for other details instead. 
The more details you know, the more relevant suggestions you can give by narrowing them down based on the eligibility criteria. The fewer details you know, the more generic your suggestions will be. Even if the user provides no details, try to give them at least the most generic suggestions, even if it means giving them some examples of all available solutions.

English might not be your user's first language. Always ensure that your responses are concise and easy to read and understand. Ask follow-up questions one at a time instead of overwhelming them with multiple questions at once.
Your responses should always be empathetic but not sympathetic, and respectful, so as to preserve the dignity of the caregiver or person with disabilities. Always revise your response to replace or explain technical jargon, and match the complexity of language to the human's inputs, without being condescending or using derogatory terms."""



STANDALONE_TEMPLATE = """Imagine you are assisting someone who specialises in helping caregivers or persons with disabilities. Your task is to refine a given user question by incorporating relevant context from a given conversation history. Respond with an enhanced standalone question that reflects a deeper understanding of the user's needs. Follow these steps in your response: 
1. Begin by analysing the given user question related to caregivers or persons with disabilities. Identify key themes and keywords without making any assumptions.
2. Consider the provided conversation history to understand the context of the ongoing discussion and any relevant topics or details. Think critically and explain your judgement on how relevant each piece of the conversation history is to the user's question.
3. Integrate only pieces from the conversation history that you have evaluated as relevant as context into the user question without making any assumptions or referring to your own knowledge. If there is no relevant context identified from the conversation history, do not alter the given user question at all, and you should return the same given user question as your refined question. 
4. Ensure that the refined question is clear, coherent and reflects a deeper understanding of the user's situation on its own, without a need to refer to any of the given conversation history.
5. Make an overall judgement on how well the refined standalone question incorporates pertinent details from the given conversation history.
6. Evaluate if the refined standalone question requires a response with information about specific grants that are either explicitly mentioned in the question, or you identify as relevant examples.
7. If no, identify the main subject of the refined question and set it as the topic. If yes, identify the name of the grant and set it as the topic.
The previous conversation is: 
{chat_history}


Follow Up Input: {question}
{format_instructions}
"""

# Original 
ANSWER_TEMPLATE = """Try to answer the question based on the following context:
{context}

The examples are:
{examples}

These examples are only teaching you how to navigate a conversation around a specific topic. You should not replace the current question topic with the example topic.

Be precise and concise with your answer. Do not include half-finished sentences.

Question: {question}
Answer:
"""

# COT implementation
# ANSWER_TEMPLATE = """Try to answer the question based on the following context:
# {context}

# Question: {question}

# Topic: {topic}

# Examples: {examples}

# Follow these steps in your response:
# 1. Understanding the intent of the question.
# 2. Use the examples as a reference to help you understand the nature of the input question.
# 3. These examples are only teaching you how to navigate a conversation around a specific topic. You should not replace the topic with the example topic.
# 4. In addition, reference the topic when crafting your answer.
# 5. Be precise and concise with your answer. Do not include half-finished sentences.

# {format_instructions}
# """

# 1. Evaluate if the question requires a response with information about specific grants that are either explicitly mentioned in the question, or you identify as relevant examples. 
# 2. If no, skip to point 3. If yes, craft a response for each relevant grant you have identified by following these steps: 
INFO_TEMPLATE = """Try to provide a list of summarized points based on the following context:
{context}

Question: {question}

Topic: {topic}

Follow these steps in your response:
1. Evaluate if the topic is a grant and/or the question requires a response with information about specific grants that are either explicitly mentioned in the question, or you identify as relevant examples. 
2. If no, skip to point 3. If yes, craft a response for each relevant grant you have identified by following these steps: 
2a. [Grants] In this case, the topic should be the name of the grant.
2b. [Grants] For each of the following sub-topics, first set these sub-topics from 2b - 2f as the title. Then, consider the context and craft a description about it.
2c. [Grants] About the grant
2d. [Grants] Eligibility
2e. [Grants] Expected benefits
2f. [Grants] Application process

3. Evaluate if the question requires a response with information about a general topics. Craft a response for each relevant topic you have identified by following these steps: 
3a. Identify the main subject of each topic and set it as a topic. 
3b. Consider the context, and craft a description about it. Each description should be self-contained, and should not be mentioned in another description point. Where relevant, include examples in this description

{format_instructions}
"""

In [8]:
# Few-shot examples that guide how answers are crafted
FEWSHOT_ANSWER_EXAMPLES = [
    {
        "human": "My grandma just had an accident and I don't know what to do.",
        "ai": "I'm really sorry to hear about your grandma's accident. I'm here to help. Could you please provide more details about the situation? What happened, and what kind of assistance or information are you looking for? I'll do my best to support you."
    },
    {
        "human": "What kind of help can I get when my parents are getting old?",
        "ai": "As your parents age, different types of help are available, like support with daily tasks, healthcare, or financial assistance. To give better suggestions, could you share what your parents need and the type of help you're looking for?"
    },
    {
        "human": "What can I do to help my grandfather?",
        "ai": "I'm glad to help you find support for your grandfather. To give you more relevant suggestions, could you please share what kind of support you're looking for? Are you looking for financial assistance, help with daily tasks, or something else?"
    },
    {
        "human": "What kind of support is available to caregivers?",
        "ai": "Caregivers can get different kinds of help. This includes support with daily tasks, finances, and resources for information. To offer more specific suggestions, could you tell me more about your caregiving situation and the type of support you need?"
    },
    {
        "human": "What are ADLs?",
        "ai": """ADLs, or Activities of Daily Living, are everyday tasks we usually do without help. There are 6 ADLs in Singapore:
1. Eating
2. Bathing
3. Dressing
4. Using the toilet
5. Moving from a bed to chair or chair to bed
6. Walking or moving around
It's good to know about ADLs because they can affect your eligibility for certain support grants and schemes. If you have more questions or need examples, feel free to ask!"""
    },
    {
        "human": "Maybe money?",
        "ai": "That's a good start. To provide more relevant recommendations, could you share more details about your situation? For example, what is the average income per capita in the household?"
    },
    {
        "human": "I don't want to share financial details",
        "ai": "That's okay. You don't have to share that information if you're not comfortable. There are still other ways I could help. Could you tell me more about the situation or challenges your grandparents are facing? "
    },
    {
        "human": "They are on a wheelchair",
        "ai": """I noticed that your grandma uses a wheelchair, which may indicate she might need help with daily activities like:
- Bathing
- Using the toilet
- Moving from bed to chair or chair to bed
- Walking or moving around
Is this information correct? If you can think of any other activities or have any questions, let me know!"""
    },
]

FEWSHOT_STANDALONE_QUESTION_EXAMPLES = [
    {
        "human": "What is ADL?",
        "chat_history": "",
        "ai": "What is ADL?"
    },
    {
        "human": "Name me a few",
        "chat_history": "human:  What is SPED?\nai: ' SPED stands for Special Education. SPED schools cater to children and youths with special needs who require more intensive and specialised assistance.'",
        "ai": "Name me a few SPED schools in Singapore"
    }
]
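
The answer examples above are not all sent to the model: a later cell builds a semantic-similarity example selector that picks the `k` closest examples for each question. The sketch below illustrates that idea with a toy bag-of-words similarity instead of a real embedding model. `bag_of_words`, `cosine`, `select_examples` and `toy_examples` are hypothetical names, and ranking on the `human` field alone is a simplification.

```python
import math
from collections import Counter
from typing import Dict, List


def bag_of_words(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (an assumption, not a real embedding model)."""
    return Counter(text.lower().replace("?", " ").replace(".", " ").split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def select_examples(query: str, examples: List[Dict[str, str]], k: int = 2) -> List[Dict[str, str]]:
    """Return the k examples whose 'human' text is most similar to the query."""
    query_vec = bag_of_words(query)
    ranked = sorted(
        examples,
        key=lambda ex: cosine(query_vec, bag_of_words(ex["human"])),
        reverse=True,
    )
    return ranked[:k]


# Shortened copies of three 'human' prompts from the list above; answers omitted.
toy_examples = [
    {"human": "What are ADLs?", "ai": "..."},
    {"human": "What kind of support is available to caregivers?", "ai": "..."},
    {"human": "Maybe money?", "ai": "..."},
]

picked = select_examples("What support can caregivers get?", toy_examples, k=1)
print(picked[0]["human"])
```

Selecting only the nearest examples keeps the prompt short and reduces the chance that an unrelated example steers the answer off topic.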


In [9]:
pp = pprint.PrettyPrinter(indent=2)

def debug_fn(x):
    """This function takes any value, pretty-prints it, and returns it unchanged so it can sit between two steps of a chain.

    Think of it as middleware.

    Examples: 
    answer = {
            "question": lambda x: x["question"],
            # pylint: disable-next=not-callable
            "answer": final_inputs | ANSWER_PROMPT | debug_fn | self.model,
            "docs": itemgetter("docs"),
        }

    standalone_question = {
            "standalone_question": {
                "question": lambda x: x["question"],
                "chat_history": lambda x: x["chat_history"],
            }
            | CONDENSE_QUESTION_PROMPT
            | debug_fn
            | self.model
            | StrOutputParser(),
        }
    """
    if isinstance(x, (ChatPromptValue, StringPromptValue)):
        prompt_val = x.to_string()
        pp.pprint({
          "len": len(prompt_val),
          "prompt_val": prompt_val,
        })
    else:
        # Prints input as is
        pp.pprint(x)
    
    return x

def get_vector_search_retriever():
    """
    This method returns a retriever using vector search (i.e. Matching Engine)
    """
    # PROJECT_ID = 'PROJECT_ID'
    # REGION = 'REGION'
    # GCS_BUCKET = 'GCS_BUCKET'
    # ME_INDEX_ID = 'ME_INDEX_ID'
    # ME_ENDPOINT_ID = 'ME_ENDPOINT_ID'

    embeddings = VertexAIEmbeddings(location=REGION, model_name="textembedding-gecko@001")

    me = MatchingEngine.from_components(
        project_id=PROJECT_ID,
        region=REGION,
        gcs_bucket_name=GCS_BUCKET,
        embedding=embeddings,
        index_id=ME_INDEX_ID,
        endpoint_id=ME_ENDPOINT_ID,
    )

    NUMBER_OF_RESULTS = 4

    # Expose index to the retriever
    # https://api.python.langchain.com/en/latest/vectorstores/langchain_community.vectorstores.matching_engine.MatchingEngine.html?highlight=matchingengine#langchain_community.vectorstores.matching_engine.MatchingEngine.as_retriever
    retriever = me.as_retriever(
        search_type="similarity",
        search_kwargs={
            "k": NUMBER_OF_RESULTS,
        },
    )

    return retriever

def get_memory_retriever():
    """
    This method returns a vector store retriever that retrieves stored memories
    """
    EMBEDDING_SIZE = 768
    index = faiss.IndexFlatL2(EMBEDDING_SIZE)
    embedding_fn = VertexAIEmbeddings(model_name="textembedding-gecko@001")

    # pylint: disable-next=not-callable
    vectorstore_memory = FAISS(embedding_fn, index, InMemoryDocstore({}), {})

    retriever = vectorstore_memory.as_retriever(search_kwargs={"k": 2})
    memory = VectorStoreRetrieverMemory(
        retriever=retriever,
        return_messages=True,
        input_key="human",
        output_key="ai"
    )

    return memory

def get_fewshot_example_selector(examples, k=2):
    """
    This method returns an example selector that selects the k examples most
    similar to the input, so they can be placed into the prompt as in-context examples.
    """
    embeddings = VertexAIEmbeddings(model_name="textembedding-gecko@001")
    return SemanticSimilarityExampleSelector.from_examples(
        examples,
        embeddings,
        FAISS,
        k=k,
    )

DEFAULT_DOCUMENT_PROMPT = PromptTemplate.from_template(template="{page_content}")

def combine_documents(
    docs, document_prompt=DEFAULT_DOCUMENT_PROMPT, document_separator="\n\n"
):
    """This method formats documents into a string"""
    doc_strings = [format_document(doc, document_prompt) for doc in docs]
    return document_separator.join(doc_strings)
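
As a rough picture of what `combine_documents` produces, the sketch below does the same formatting and joining in plain Python. `Doc` and `combine_docs_plain` are hypothetical stand-ins (not LangChain classes), and the passages are placeholder text.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document (illustration only)."""
    page_content: str
    metadata: Dict[str, str] = field(default_factory=dict)


def combine_docs_plain(
    docs: List[Doc], template: str = "{page_content}", separator: str = "\n\n"
) -> str:
    """Format each document with the template, then join the results with the separator."""
    return separator.join(
        template.format(page_content=doc.page_content, **doc.metadata) for doc in docs
    )


docs = [Doc("First retrieved passage."), Doc("Second retrieved passage.")]
context = combine_docs_plain(docs)
print(context)
```

The joined string is what ends up in the `{context}` slot of the answer and info prompts.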

In [10]:
from typing import List, Optional
from operator import itemgetter

from langchain.chat_models.vertexai import ChatVertexAI
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain.prompts import ChatPromptTemplate, PromptTemplate, FewShotChatMessagePromptTemplate, HumanMessagePromptTemplate, SystemMessagePromptTemplate
from langchain_core.output_parsers import StrOutputParser, CommaSeparatedListOutputParser, JsonOutputParser
from langchain_core.runnables import RunnableParallel, RunnablePassthrough, RunnableSequence


class StandaloneQuestionOutput(BaseModel):
    """Typings for standalone question output item"""
    topic: str = Field(
        description="This is the main topic of the refined standalone question.")
    standalone_question: str = Field(
        description="This is the refined standalone question.")

class InfoItem(BaseModel):
    """Typings for description item"""
    content: str = Field(
        description="This is the content string for each description item.")
    title: str = Field(
        description="This is the title associated with each description item. It summarizes the content string.")


class InfoOutput(BaseModel):
    """Typings for descriptions chain output"""
    details: List[InfoItem] = Field(
        description="This is the list of InfoItems.")
    explanation: str = Field(
        description="This is the explanation of your thought process in crafting the entire output. Be as thorough and detailed as you can be.")


class ConversationalRetrievalChain():
    """
    This class creates a chain that FIRST tries to answer the user's question from the retrieved dataset before falling back on the model's own knowledge.

    final_chain = loaded_memory | standalone_question | retrieved_documents | answer / descriptions | updateMemory    
    """

    def __init__(self) -> None:
        """This method instantiates an instance of ConversationalRetrievalChain"""
        # https://cloud.google.com/vertex-ai/docs/generative-ai/learn/models
        # pylint: disable-next=not-callable
        self.model = ChatVertexAI(
            model_name="chat-bison-32k", temperature=0, max_output_tokens=8192)
        self.memory = get_memory_retriever()
        self.retriever = get_vector_search_retriever()
        self.chain = self.get_chain()

    def get_chain(self) -> RunnableSequence:
        """This method instantiates the chain"""
        loaded_memory = RunnableParallel({
            "question": lambda x: x["question"],
            "chat_history": lambda x: self.memory.load_memory_variables({"human": x["question"]})["history"]
        })

        retrieved_documents = RunnablePassthrough.assign(
            docs=itemgetter("standalone_question") | self.retriever
        )

        # get chains
        standalone_question_chain = self.get_standalone_question_chain()
        answer_chain = self.get_answer_chain()
        info_chain = self.get_info_chain()

        update_memory = RunnablePassthrough.assign(
            _=lambda x: self.save_to_memory(x["question"], x["answer"]),
        )

        final_chain = (
            loaded_memory
            | standalone_question_chain
            | retrieved_documents
            | RunnableParallel({
                "question": lambda x: x["standalone_question"],
                "topic": lambda x: x["topic"],
                "answer": answer_chain,
                "information": info_chain
            })
            | update_memory
        )

        return final_chain

    def save_to_memory(self, question: str, answer: str) -> None:
        """This method saves chat history to memory"""
        self.memory.save_context({"human": question}, {"ai": answer})

    def get_standalone_question_chain(self) -> RunnableSequence:
        """This method returns the standalone question chain"""
        standalone_question_parser = JsonOutputParser(
            pydantic_object=StandaloneQuestionOutput)
        CONDENSE_QUESTION_PROMPT = PromptTemplate.from_template(
            STANDALONE_TEMPLATE, partial_variables={"format_instructions": standalone_question_parser.get_format_instructions()})

        standalone_question_chain = (RunnablePassthrough()
                                     | CONDENSE_QUESTION_PROMPT
                                     | self.model
                                     | standalone_question_parser
                                     )

        return standalone_question_chain

    def get_answer_chain(self) -> RunnableSequence:
        """This method returns the answer chain"""
        ANSWER_PROMPT = ChatPromptTemplate(messages=[
            SystemMessagePromptTemplate.from_template(SYSTEM_TEMPLATE),
            HumanMessagePromptTemplate.from_template(ANSWER_TEMPLATE)
        ])

        FEWSHOT_ANSWER_EXAMPLE_PROMPT = ChatPromptTemplate.from_messages([
            ("human", "{human}"), ("ai", "{ai}")
        ])

        FEWSHOT_ANSWER_PROMPT = FewShotChatMessagePromptTemplate(
            example_prompt=FEWSHOT_ANSWER_EXAMPLE_PROMPT,
            example_selector=get_fewshot_example_selector(
                FEWSHOT_ANSWER_EXAMPLES, k=2)
        )

        final_inputs = {
            "context": lambda x: combine_documents(x["docs"]),
            "topic": itemgetter('topic'),
            "question": itemgetter("standalone_question"),
            "examples": lambda x: FEWSHOT_ANSWER_PROMPT.format(human=x["standalone_question"]),
        }

        # answer_chain = final_inputs | ANSWER_PROMPT | self.model | answer_parser | debug_fn
        answer_chain = final_inputs | ANSWER_PROMPT | self.model | StrOutputParser()

        return answer_chain

    def get_info_chain(self) -> RunnableSequence:
        """This method returns the information chain"""

        # info_parser = CommaSeparatedListOutputParser()
        info_parser = JsonOutputParser(pydantic_object=InfoOutput)
        INFO_PROMPT = PromptTemplate.from_template(INFO_TEMPLATE, partial_variables={
            "format_instructions": info_parser.get_format_instructions()
        })

        info_chain = (
            {
                "context": lambda x: combine_documents(x["docs"]),
                "topic": itemgetter("topic"),
                "question": itemgetter("standalone_question")
            }
            | INFO_PROMPT
            | self.model
            | info_parser
        )

        return info_chain
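
Both `get_standalone_question_chain` and `get_info_chain` end with a JSON parser, so downstream steps receive dictionaries with known keys rather than raw text. The stdlib-only sketch below illustrates that idea. `parse_json_reply` is a hypothetical helper, not LangChain's parser, and the sample reply is made up for the example.

```python
import json
import re
from typing import Any, Dict, Tuple


def parse_json_reply(reply: str, required_keys: Tuple[str, ...]) -> Dict[str, Any]:
    """Extract a JSON object from a model reply and check that the expected keys exist."""
    # Models often wrap JSON in a Markdown code fence; use the fenced part if present.
    match = re.search(r"`{3}(?:json)?\s*(.*?)`{3}", reply, flags=re.DOTALL)
    payload = match.group(1) if match else reply
    data = json.loads(payload)
    missing = [key for key in required_keys if key not in data]
    if missing:
        raise ValueError(f"Reply is missing keys: {missing}")
    return data


fence = "`" * 3  # built at runtime so this example does not contain a literal fence
sample_reply = (
    f"{fence}json\n"
    '{"topic": "Home Caregiving Grant", '
    '"standalone_question": "Tell me about Home Caregiving Grant (HCG)."}\n'
    f"{fence}"
)

parsed = parse_json_reply(sample_reply, ("topic", "standalone_question"))
print(parsed["topic"])
```

Checking the keys up front makes a malformed reply fail at the parsing step instead of surfacing later as a `KeyError` inside the chain.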


In [11]:
cr = ConversationalRetrievalChain()
cr_chain = cr.chain
memory = cr.memory



# Ask your question

The chain __first__ tries to answer the question from relevant documents retrieved from the vector store __before__ falling back on its own knowledge.

I suggest doing your prompt engineering in the following sequence:

1. `cr_chain.invoke({ "question" : "<<YOUR QUESTION>>" })` -- Ask away
2. `memory.load_memory_variables({ "human" : "<<YOUR QUESTION>>" })` -- Check to see if the previous conversation has been saved
3. Repeat step 1
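
The ask, check, repeat loop can be rehearsed without any cloud calls. In the sketch below, `ToyMemory` and `ToyChain` are hypothetical stand-ins for `memory` and `cr_chain`. They only mimic the method names used in this notebook, and the memory lookup passes the question under the `human` key, as `get_chain` does.

```python
from typing import Dict, List


class ToyMemory:
    """Hypothetical stand-in for the notebook's `memory` object."""

    def __init__(self) -> None:
        self.turns: List[str] = []

    def save_context(self, inputs: Dict[str, str], outputs: Dict[str, str]) -> None:
        self.turns.append(f"human: {inputs['human']}\nai: {outputs['ai']}")

    def load_memory_variables(self, inputs: Dict[str, str]) -> Dict[str, List[str]]:
        # The real memory retrieves turns similar to inputs["human"];
        # this toy version simply returns every saved turn.
        return {"history": list(self.turns)}


class ToyChain:
    """Hypothetical stand-in for `cr_chain`: replies with a stub and saves the turn."""

    def __init__(self, memory: ToyMemory) -> None:
        self.memory = memory

    def invoke(self, inputs: Dict[str, str]) -> Dict[str, str]:
        answer = f"(stub answer to: {inputs['question']})"
        self.memory.save_context({"human": inputs["question"]}, {"ai": answer})
        return {"question": inputs["question"], "answer": answer}


toy_memory = ToyMemory()
toy_chain = ToyChain(toy_memory)

for question in ["What are ADLs?", "Name me a few"]:
    # Step 1: ask the question.
    result = toy_chain.invoke({"question": question})
    # Step 2: confirm the turn was saved to memory.
    history = toy_memory.load_memory_variables({"human": question})["history"]
    # Step 3: the loop repeats with the next question.
    print(result["answer"], "| turns saved:", len(history))
```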


In [12]:
question = "Tell me about Home Caregiving Grant (HCG)."

result = cr_chain.invoke({ "question": question })

print("======= answer ========", end="\n")
pp.pprint(result)

# print("======= docs ========", end="\n")
# for i,doc in enumerate(result["docs"]):
#     pp.pprint(f"{i}: {doc}")



{ '_': None,
  'answer': ' The Home Caregiving Grant (HCG) provides financial assistance to '
            "caregivers of persons with disabilities or the elderly. Here's "
            'more information:\n'
            '\n'
            '**Eligibility:**\n'
            '- Caregivers must be Singapore Citizens or Permanent Residents.\n'
            '- Care recipients must be Singapore Citizens or Permanent '
            'Residents with disabilities or aged 60 and above.\n'
            '- Household monthly income per person must be $1,200 or less (or '
            'Annual Value of Residence ≤ $13,000 for households without '
            'income).\n'
            '\n'
            '**Application Process:**\n'
            '- Applications can be made online via the AIC website or in '
            'person at any AIC Link branch.\n'
            '- Required documents include:\n'
            '  - NRIC of caregiver and care recipient\n'
            '  - Proof of income\n'
            "  - Care recip