# Conversational Interface - Chatbot with Titan LLM

> *This notebook should work well with the **`Data Science 2.0`** kernel in SageMaker Studio*

In this notebook, we will build a chatbot using the Foundation Models (FMs) in Amazon Bedrock. For our use case, we use Titan as the FM for building the chatbot.

## Overview

Conversational interfaces such as chatbots and virtual assistants can be used to enhance the user experience for your customers. Chatbots use natural language processing (NLP) and machine learning algorithms to understand and respond to user queries. They can be used in a variety of applications, such as customer service, sales, and e-commerce, to provide quick and efficient responses to users, and they can be accessed through various channels such as websites, social media platforms, and messaging apps.


## Chatbot using Amazon Bedrock

![Amazon Bedrock - Conversational Interface](./images/chatbot_bedrock.png)


## Use Cases

1. **Chatbot (Basic)** - Zero-shot chatbot with an FM
2. **Chatbot using prompt template (LangChain)** - Chatbot with some context provided in the prompt template
3. **Chatbot with persona** - Chatbot with defined roles, e.g. Career Coach and Human interactions
4. **Context-aware chatbot** - Passing in context through an external file by generating embeddings

## LangChain framework for building a chatbot with Amazon Bedrock
In conversational interfaces such as chatbots, it is important to remember previous interactions, both in the short term and in the long term.

LangChain provides memory components in two forms. First, LangChain provides helper utilities for managing and manipulating previous chat messages. These are designed to be modular and useful regardless of how they are used. Second, LangChain provides easy ways to incorporate these utilities into chains.
This lets us define and interact with different types of abstractions, which makes it easier to build powerful chatbots.
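To make the idea of conversational memory concrete, here is a minimal standard-library sketch. `BufferMemory` is a hypothetical stand-in written for illustration, not LangChain's class; it only shows the underlying idea that earlier turns are stored and then re-sent as text with the next prompt.

```python
from dataclasses import dataclass, field


@dataclass
class BufferMemory:
    """Hypothetical stand-in for a conversation buffer: keeps every turn in order."""

    turns: list = field(default_factory=list)

    def add_user_message(self, text: str) -> None:
        self.turns.append(("Human", text))

    def add_ai_message(self, text: str) -> None:
        self.turns.append(("AI", text))

    def render(self) -> str:
        # Flatten the stored turns into the transcript that precedes the next prompt.
        return "\n".join(f"{speaker}: {text}" for speaker, text in self.turns)


memory = BufferMemory()
memory.add_user_message("Hi there!")
memory.add_ai_message("Hello! How can I help you today?")
memory.add_user_message("Give me a few tips on how to start a new garden.")

# The model only "remembers" earlier turns because they are sent again as text.
prompt = f"Current conversation:\n{memory.render()}\nAI:"
print(prompt)
```

A buffer like this grows with every turn, which is why long conversations eventually need a windowed or summarizing strategy.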

## Building Chatbot with Context - Key Elements

The first step in building a context-aware chatbot is to **generate embeddings** for the context. Typically, an ingestion process runs your documents through an embedding model and stores the resulting embeddings in a vector store. In this example we use a Titan embeddings model, accessed through Amazon Bedrock, for this step.

![Embeddings](./images/embeddings_lang.png)

The second step is orchestrating the user request: handling the interaction, invoking the model, and returning the results.

![Chatbot](./images/chatbot_lang.png)

## Architecture [Context Aware Chatbot]
![4](./images/context-aware-chatbot.png)

## Setup

Before running the rest of this notebook, you'll need to run the cells below to ensure the necessary libraries are installed and to connect to Bedrock.

For more details on how the setup works and ⚠️ **whether you might need to make any changes**, refer to the [Bedrock boto3 setup notebook](../00_Intro/bedrock_boto3_setup.ipynb).

In [15]:
# Make sure you ran `download-dependencies.sh` from the root of the repository first!

%pip install --force-reinstall \
    dependencies/awscli-1.29.21-py3-none-any.whl \
    dependencies/boto3-1.28.21-py3-none-any.whl \
    dependencies/botocore-1.31.21-py3-none-any.whl
!pip install PyYAML

Processing ./dependencies/awscli-1.29.21-py3-none-any.whl
Processing ./dependencies/boto3-1.28.21-py3-none-any.whl
Processing ./dependencies/botocore-1.31.21-py3-none-any.whl
Collecting docutils<0.17,>=0.10 (from awscli==1.29.21)
  Using cached docutils-0.16-py2.py3-none-any.whl (548 kB)
Collecting s3transfer<0.7.0,>=0.6.0 (from awscli==1.29.21)
  Using cached s3transfer-0.6.1-py3-none-any.whl (79 kB)
Collecting PyYAML<6.1,>=3.10 (from awscli==1.29.21)
  Obtaining dependency information for PyYAML<6.1,>=3.10 from https://files.pythonhosted.org/packages/29/61/bf33c6c85c55bc45a29eee3195848ff2d518d84735eb0e2d8cb42e0d285e/PyYAML-6.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
  Using cached PyYAML-6.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (2.1 kB)
Collecting colorama<0.4.5,>=0.2.5 (from awscli==1.29.21)
  Using cached colorama-0.4.4-py2.py3-none-any.whl (16 kB)
Collecting rsa<4.8,>=3.1.2 (from awscli==1.29.21)
  Using cached rsa-4.7

In this notebook, we'll also need some extra dependencies:

- [FAISS](https://github.com/facebookresearch/faiss), to store vector embeddings
- [IPyWidgets](https://ipywidgets.readthedocs.io/en/stable/), for interactive UI widgets in the notebook
- [PyPDF](https://pypi.org/project/pypdf/), for handling PDF files

In [5]:
%pip install --quiet "faiss-cpu>=1.7,<2" "ipywidgets>=7,<8" langchain==0.0.249 "pypdf>=3.8,<4"
%pip install opensearch-py

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
spyder 5.3.3 requires pyqt5<5.16, which is not installed.
spyder 5.3.3 requires pyqtwebengine<5.16, which is not installed.
jupyterlab 3.4.4 requires jupyter-server~=1.16, but you have jupyter-server 2.7.0 which is incompatible.
jupyterlab-server 2.10.3 requires jupyter-server~=1.4, but you have jupyter-server 2.7.0 which is incompatible.
sagemaker-datawrangler 0.4.3 requires sagemaker-data-insights==0.4.0, but you have sagemaker-data-insights 0.3.3 which is incompatible.
spyder 5.3.3 requires ipython<8.0.0,>=7.31.1, but you have ipython 8.14.0 which is incompatible.
spyder 5.3.3 requires pylint<3.0,>=2.5.0, but you have pylint 3.0.0a6 which is incompatible.
spyder-kernels 2.3.3 requires ipython<8,>=7.31.1; python_version >= "3", but you have ipython 8.14.0 which is incompatible.
Note: you may ne

In [24]:
%pip install pypdf

Note: you may need to restart the kernel to use updated packages.


In [9]:
import json
import os
import sys

import boto3
import yaml

module_path = ".."
sys.path.append(os.path.abspath(module_path))
from utils import bedrock, print_ww


# ---- ⚠️ Un-comment and edit the below lines as needed for your AWS setup ⚠️ ----

# os.environ["AWS_DEFAULT_REGION"] = "<REGION_NAME>"  # E.g. "us-east-1"
# os.environ["AWS_PROFILE"] = "<YOUR_PROFILE>"
# os.environ["BEDROCK_ASSUME_ROLE"] = "<YOUR_ROLE_ARN>"  # E.g. "arn:aws:..."
# os.environ["BEDROCK_ENDPOINT_URL"] = "<YOUR_ENDPOINT_URL>"  # E.g. "https://..."


with open('config.yml', 'r') as file:
    config = yaml.safe_load(file)

b_endpoint = config['bedrock-preview']['endpoint']
b_region = config['bedrock-preview']['region']


boto3_bedrock = bedrock.get_bedrock_client(
    assumed_role=os.environ.get("BEDROCK_ASSUME_ROLE", None),
    endpoint_url=b_endpoint,
    region=b_region,
)

Create new client
  Using region: us-west-2
boto3 Bedrock client successfully created!
bedrock(https://prod.us-west-2.frontend.bedrock.aws.dev)


## Chatbot (Basic - without context)

#### Using ConversationChain from LangChain to start the conversation

Chatbots need to remember previous interactions. Conversational memory allows us to do that. There are several ways to implement conversational memory. In the context of LangChain, they are all built on top of the ConversationChain.

Note: The model outputs are non-deterministic.

In [10]:
from langchain.chains import ConversationChain
from langchain.llms.bedrock import Bedrock
from langchain.memory import ConversationBufferMemory

titan_llm = Bedrock(model_id="amazon.titan-tg1-large", client=boto3_bedrock)
memory = ConversationBufferMemory()
conversation = ConversationChain(
    llm=titan_llm, verbose=True, memory=memory
)

print_ww(conversation.predict(input="Hi there!"))



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

H: Hi there!
AI:

> Finished chain.
 Hello! How can I help you today?


#### New Questions

The model has responded with an initial message. Let's ask a few questions.

In [11]:
print_ww(conversation.predict(input="Give me a few tips on how to start a new garden."))



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi there!
AI:  Hello! How can I help you today?
Human: Give me a few tips on how to start a new garden.
AI:

> Finished chain.
 Sure, thing! Here are some tips for starting a new garden:
1. Do some research on the best practices for gardening in your area. Different regions may have
different climates and soil types that require specific care.
2. Choose a location for your garden that receives plenty of sunlight for the plants you want to
grow.
3. Prepare the soil by removing any weeds, rocks, or other debris. You can also add compost or other
organic matter to improve soil fertility.
4. Select the plants you want to grow, consid

#### Build on the questions

Let's ask a question without mentioning the word "garden" to see if the model can understand the previous conversation.

In [None]:
print_ww(conversation.predict(input="Cool. Will that work with tomatoes?"))

#### Finishing this conversation

In [None]:
print_ww(conversation.predict(input="That's all, thank you!"))

## Chatbot using prompt template(Langchain)

A PromptTemplate is responsible for constructing the input to the model. LangChain provides several classes and functions to make constructing and working with prompts easy. We will use the default prompt template here. See [PromptTemplate](https://python.langchain.com/en/latest/modules/prompts/getting_started.html).
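As a rough illustration of what a prompt template does, the sketch below fills named placeholders using only the Python standard library. The template text is abbreviated from the verbose output shown earlier; `input_variables` and `render` are hypothetical helpers for this example and are not LangChain functions.

```python
from string import Formatter

# Abbreviated from the default conversation prompt printed in the verbose output.
TEMPLATE = (
    "The following is a friendly conversation between a human and an AI.\n\n"
    "Current conversation:\n{history}\n"
    "Human: {input}\nAI:"
)


def input_variables(template: str) -> list:
    """Return the placeholder names found in a format-style template."""
    return [name for _, name, _, _ in Formatter().parse(template) if name]


def render(template: str, **values: str) -> str:
    """Fill the template, failing loudly if a placeholder has no value."""
    missing = set(input_variables(template)) - set(values)
    if missing:
        raise KeyError(f"missing template variables: {sorted(missing)}")
    return template.format(**values)


print(input_variables(TEMPLATE))
print(render(
    TEMPLATE,
    history="Human: Hi there!\nAI: Hello! How can I help you today?",
    input="Will that work with tomatoes?",
))
```

Checking for missing variables before formatting gives a clearer error than letting `str.format` fail halfway through a long prompt.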

In [None]:
from langchain.memory import ConversationBufferMemory
from langchain import PromptTemplate

chat_history = []

# turn verbose to true to see the full logs and documents
qa= ConversationChain(
    llm=titan_llm, verbose=False, memory=ConversationBufferMemory() #memory_chain
)

print(f"ChatBot:DEFAULT:PROMPT:TEMPLATE: is ={qa.prompt.template}")

In [36]:
import ipywidgets as ipw
from IPython.display import display, clear_output

class ChatUX:
    """ A chat UX using IPWidgets
    """
    def __init__(self, qa, retrievalChain = False):
        self.qa = qa
        self.name = None
        self.b=None
        self.retrievalChain = retrievalChain
        self.out = ipw.Output()


    def start_chat(self):
        print("Starting chat bot")
        display(self.out)
        self.chat(None)


    def chat(self, _):
        if self.name is None:
            prompt = ""
        else: 
            prompt = self.name.value
        if 'q' == prompt or 'quit' == prompt or 'Q' == prompt:
            print("Thank you , that was a nice chat !!")
            return
        elif len(prompt) > 0:
            with self.out:
                thinking = ipw.Label(value="Thinking...")
                display(thinking)
                try:
                    if self.retrievalChain:
                        result = self.qa.run({'question': prompt })
                    else:
                        result = self.qa.run({'input': prompt }) #, 'history':chat_history})
                except Exception:  # a bare except would also swallow KeyboardInterrupt
                    result = "No answer"
                thinking.value=""
                print_ww(f"AI:{result}")
                self.name.disabled = True
                self.b.disabled = True
                self.name = None

        if self.name is None:
            with self.out:
                self.name = ipw.Text(description="You:", placeholder='q to quit')
                self.b = ipw.Button(description="Send")
                self.b.on_click(self.chat)
                display(ipw.Box(children=(self.name, self.b)))

Let's start a chat

In [None]:
chat = ChatUX(qa)
chat.start_chat()

## Chatbot with persona

The AI assistant will play the role of a career coach. Role-play dialogue requires the user message to be set before starting the chat. ConversationBufferMemory is used to pre-populate the dialogue.

In [None]:
memory = ConversationBufferMemory()
memory.chat_memory.add_user_message("You will be acting as a career coach. Your goal is to give career advice to users")
memory.chat_memory.add_ai_message("I am career coach and give career advice")
titan_llm = Bedrock(model_id="amazon.titan-tg1-large",client=boto3_bedrock)
conversation = ConversationChain(
     llm=titan_llm, verbose=True, memory=memory
)

print_ww(conversation.predict(input="What are the career options in AI?"))

##### Let's ask a question that is outside this persona's specialty. The model shouldn't answer it and should give a reason why.

In [None]:
conversation.verbose = False
print_ww(conversation.predict(input="How to fix my car?"))

## Chatbot with Context 
In this use case we will ask the chatbot to answer questions from the context that it was given. We will take a PDF file and use the Titan embeddings model to create vectors from it. These vectors are stored in FAISS. When the chatbot is asked a question, we retrieve the most relevant passages from the vector store and pass them to the model to produce the answer.

#### Use a Titan embeddings model to generate the embeddings for the documents

Embeddings are a way to represent words, phrases or any other discrete items as vectors in a continuous vector space. This allows machine learning models to perform mathematical operations on these representations and capture semantic relationships between them.
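The following sketch shows the kind of mathematical operation this enables: cosine similarity between vectors. The three-dimensional vectors are made up for illustration and do not come from any model; real embedding models return far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention for this sketch: a zero vector matches nothing
    return dot / (norm_a * norm_b)


# Made-up toy "embeddings"; the values are assumptions chosen for the example.
toy_embeddings = {
    "supply chain risk": [0.9, 0.1, 0.0],
    "logistics disruption": [0.8, 0.3, 0.1],
    "tomato gardening": [0.0, 0.2, 0.9],
}

query = toy_embeddings["supply chain risk"]
for text, vector in toy_embeddings.items():
    print(f"{text!r}: {cosine_similarity(query, vector):.3f}")
```

Texts with related meaning should score closer to 1.0 than unrelated ones, which is the property a document search relies on.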


This will be used for the RAG [document search capability](https://labelbox.com/blog/how-vector-similarity-search-works/).

Other possible embeddings are listed here: [LangChain Embeddings](https://python.langchain.com/en/latest/reference/modules/embeddings.html)

Imports

In [18]:
from requests.auth import HTTPBasicAuth
import requests
import logging 
import json
import os

Setup Logging

In [20]:
logger = logging.getLogger('langchain')
logger.setLevel(logging.DEBUG)
logger.addHandler(logging.StreamHandler())

In [38]:
from langchain.embeddings import BedrockEmbeddings
from langchain.vectorstores import FAISS
from langchain.vectorstores import OpenSearchVectorSearch
from langchain import PromptTemplate

br_embeddings = BedrockEmbeddings(client=boto3_bedrock)

In [141]:
with open('config.yml', 'r') as file:
    config = yaml.safe_load(file)

es_username = config['credentials']['username']
es_password = config['credentials']['password']

domain_endpoint = config['domain']['endpoint']
domain_index = config['domain']['index']

In [142]:
URL = f'{domain_endpoint}/{domain_index}'
logger.info(f'URL for OpenSearch index = {URL}')

URL for OpenSearch index = https://search-sematic-search-4vgtrb5lpgqsss26pxewnosnjy.eu-west-1.es.amazonaws.com/rfp-nestle


Define the index mapping with a k-NN vector field

In [143]:
mapping = {
    'settings': {
        'index': {
            'knn': True  # Enable k-NN search for this index
        }
    },
    'mappings': {
        'properties': {
            'embedding': {  # k-NN vector field
                'type': 'knn_vector',
                'dimension': 4096  # Dimension of the vector
            },
            'page': {
                'type': 'long'
            },
            'passage': {
                'type': 'text'
            },
            'doc_name': {
                'type': 'keyword'
            }
        }
    }
}

Create the index with the specified mapping

In [144]:
# Check if the index exists using an HTTP HEAD request
response = requests.head(URL, auth=HTTPBasicAuth(es_username, es_password))

# If the index does not exist (status code 404), create the index
if response.status_code == 404:
    response = requests.put(URL, auth=HTTPBasicAuth(es_username, es_password), json=mapping)
    logger.info(f'Index created: {response.text}')
else:
    logger.error('Index already exists!' + str(response.status_code))

Index created: {"acknowledged":true,"shards_acknowledged":true,"index":"rfp-nestle"}


#### Create the embeddings for document search

#### Vector store indexer. 

The vector store is what stores and matches the embeddings. The FAISS store used in this notebook is transient and held in memory; Chroma and Weaviate are briefly described below. The VectorStore APIs are available [here](https://python.langchain.com/en/harrison-docs-refactor-3-24/reference/modules/vectorstore.html).

We use the Bedrock embeddings client created above (`br_embeddings`) to generate the embeddings. The vector store keeps them in memory and uses them whenever the user runs a query.

#### VectorStore as FAISS 

You can read more about [FAISS](https://arxiv.org/pdf/1702.08734.pdf), the in-memory vector store used here, in the linked paper. For our example, the way we use it stays the same.
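To show what "storing and matching embeddings" means at its simplest, here is a brute-force in-memory store written with the standard library. `TinyVectorStore` is a hypothetical class for illustration only; FAISS uses optimized index structures and this sketch makes no claim about how FAISS or LangChain implement search.

```python
import heapq
import math


class TinyVectorStore:
    """Hypothetical in-memory store: exact nearest-neighbour search by Euclidean distance."""

    def __init__(self, dimension):
        self.dimension = dimension
        self._vectors = []
        self._texts = []

    def _check(self, vector):
        if len(vector) != self.dimension:
            raise ValueError(f"expected {self.dimension} dimensions, got {len(vector)}")

    def add(self, text, vector):
        self._check(vector)
        self._texts.append(text)
        self._vectors.append(list(vector))

    def search(self, query, k=3):
        """Return up to k (distance, text) pairs, closest first."""
        self._check(query)
        scored = ((math.dist(query, v), t) for v, t in zip(self._vectors, self._texts))
        return heapq.nsmallest(k, scored)


store = TinyVectorStore(dimension=2)
store.add("port congestion", [1.0, 0.0])
store.add("trade disputes", [0.9, 0.2])
store.add("garden soil", [0.0, 1.0])

for distance, text in store.search([1.0, 0.1], k=2):
    print(f"{distance:.3f}  {text}")
```

The dimension check mirrors a real constraint: every stored vector and every query vector must have the same length as the index was created with.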

Chroma

[Chroma](https://www.trychroma.com/) is a super simple vector search database. The core-API consists of just four functions, allowing users to build an in-memory document-vector store. By default Chroma uses the Hugging Face transformers library to vectorize documents.

Weaviate

[Weaviate](https://github.com/weaviate/weaviate) is a polished vector database. It offers a GraphQL API with support for vector search, and it also allows users to vectorize their content using Weaviate's built-in modules or custom modules.

In [134]:
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.indexes.vectorstore import VectorStoreIndexWrapper

loader = PyPDFLoader("data/AWS-Response-to-Supply-Risk-Monitor-Solution-RFP.pdf")

documents_aws = loader.load() #
print(f"documents:loaded:size={len(documents_aws)}")

docs = CharacterTextSplitter(chunk_size=2000, chunk_overlap=400, separator=",").split_documents(documents_aws)
print(f"Documents:after split and chunking size={len(docs)}")

pages=[]
pages.extend(loader.load_and_split(CharacterTextSplitter(chunk_size=2000, chunk_overlap=400, separator=",")))

vectorstore_faiss_aws = FAISS.from_documents(
    documents=pages,
    embedding = br_embeddings, 
    #**k_args
)

print(f"vectorstore_faiss_aws:created={vectorstore_faiss_aws}::")

documents:loaded:size=19
Documents:after split and chunking size=29
vectorstore_faiss_aws:created=<langchain.vectorstores.faiss.FAISS object at 0x7f0ff9de5330>::


In [138]:
start = pages[0].metadata['source'].index('/')+1
end = len(pages[0].metadata['source'])
print(f"PageID={int(pages[0].metadata['page'])+1}, {pages[0].metadata['source'][start:end]}")
print(pages[0].page_content[:10])
 
doc_func = lambda x: x.page_content
passages = list(map(doc_func, pages))

embedded_docs = br_embeddings.embed_documents(passages)
print (len(embedded_docs))

PageID=1, AWS-Response-to-Supply-Risk-Monitor-Solution-RFP.pdf
1 
1 9  M 
29


In [196]:
'''service = 'es' # must set the service as 'es'
region = boto3.Session().region_name
credentials = boto3.Session.get_credentials()
awsauth = AWS4Auth(es_username, es_password, region,service, session_token=credentials.token)'''


# vector store index
docsearch = OpenSearchVectorSearch(
            opensearch_url=domain_endpoint,
            is_aoss=False,
            verify_certs = True,
            http_auth=(es_username, es_password),
            index_name = domain_index,
            embedding_function=br_embeddings)

docs = docsearch.similarity_search(
    "Solution Overview",
     vector_field="embedding",
     text_field="passage",
     metadata_field="*",
     k=5,
)
#print(docs)

'''docsearch = OpenSearchVectorSearch.from_documents(
    docs, 
    index_name=domain_index,
    br_embeddings, 
    opensearch_url=domain_endpoint,
    http_auth=(es_username, es_password),
    timeout = 300,
    use_ssl = True,
    verify_certs = True,
    connection_class = RequestsHttpConnection,
)'''



In [166]:
%%time

i = 1
os_bulk_documents_and_index = ''
os_bulk_documents = []; 
os_doc_id = []; 
for page in pages:
    start = page.metadata['source'].index('/')+1
    end = len(page.metadata['source'])
    title = page.metadata['source'][start:end]
    page_num = int(page.metadata['page'])+1
    
    print(f"PageID: {page_num}, Doc Name: {title}")
  
    embedding = embedded_docs[i-1]
    
    #embedding ='test'
   
    document = { 
        'doc_name': title, 
        'page': page_num,
        'passage': page.page_content, 
        'embedding': embedding}
    index= { "index": { "_id": title + "_"+str(i)} }
 
    
    #For bulk insert
    os_doc_id.append(index);
    os_bulk_documents.append(document);
    os_bulk_documents_and_index = os_bulk_documents_and_index + f'{json.dumps(index)}\n{json.dumps(document)}\n'
    i += 1
    #bulk end


PageID: 1, Doc Name: AWS-Response-to-Supply-Risk-Monitor-Solution-RFP.pdf
PageID: 2, Doc Name: AWS-Response-to-Supply-Risk-Monitor-Solution-RFP.pdf
PageID: 3, Doc Name: AWS-Response-to-Supply-Risk-Monitor-Solution-RFP.pdf
PageID: 4, Doc Name: AWS-Response-to-Supply-Risk-Monitor-Solution-RFP.pdf
PageID: 4, Doc Name: AWS-Response-to-Supply-Risk-Monitor-Solution-RFP.pdf
PageID: 5, Doc Name: AWS-Response-to-Supply-Risk-Monitor-Solution-RFP.pdf
PageID: 6, Doc Name: AWS-Response-to-Supply-Risk-Monitor-Solution-RFP.pdf
PageID: 6, Doc Name: AWS-Response-to-Supply-Risk-Monitor-Solution-RFP.pdf
PageID: 7, Doc Name: AWS-Response-to-Supply-Risk-Monitor-Solution-RFP.pdf
PageID: 7, Doc Name: AWS-Response-to-Supply-Risk-Monitor-Solution-RFP.pdf
PageID: 8, Doc Name: AWS-Response-to-Supply-Risk-Monitor-Solution-RFP.pdf
PageID: 8, Doc Name: AWS-Response-to-Supply-Risk-Monitor-Solution-RFP.pdf
PageID: 9, Doc Name: AWS-Response-to-Supply-Risk-Monitor-Solution-RFP.pdf
PageID: 10, Doc Name: AWS-Response-to-

In [164]:
#Bulk insert docs in openSearch Index
#specifying the index in the path means you don’t need to include it in the request body.

response = requests.post(f'{URL}/_bulk', auth=HTTPBasicAuth(es_username, es_password), data=os_bulk_documents_and_index, headers={'Content-Type': 'application/x-ndjson'})
 
if response.status_code not in [200, 201]:
    logger.error(response.status_code)
    logger.error(response.text)
    
print(response)

TypeError: 'Response' object is not subscriptable
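The bulk request body used above is newline-delimited JSON: one action line followed by one document line, per document. The sketch below rebuilds that format and also inspects a parsed bulk response, since a bulk call can return HTTP 200 while individual items fail. `build_bulk_body`, `failed_items`, and `sample_response` are hypothetical names, and the sample response is hand-written for illustration rather than captured from a real cluster. Note that a `requests.Response` has to be parsed with `response.json()` before it can be indexed like a dictionary, which is consistent with the `TypeError` shown above.

```python
import json


def build_bulk_body(documents, id_prefix):
    """Serialize documents as NDJSON: one action line, then one source line, per document."""
    lines = []
    for position, document in enumerate(documents, start=1):
        action = {"index": {"_id": f"{id_prefix}_{position}"}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(document))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


def failed_items(bulk_response):
    """Collect (id, error) pairs from a parsed bulk response; empty when all succeeded."""
    if not bulk_response.get("errors"):
        return []
    failures = []
    for item in bulk_response.get("items", []):
        result = item.get("index", {})
        if "error" in result:
            failures.append((result.get("_id"), result["error"]))
    return failures


body = build_bulk_body([{"page": 1}, {"page": 2}], id_prefix="doc.pdf")
print(body)

# Hand-written example of the response shape; not real cluster output.
sample_response = {
    "errors": True,
    "items": [
        {"index": {"_id": "doc.pdf_1", "status": 201}},
        {"index": {"_id": "doc.pdf_2", "status": 400,
                   "error": {"type": "mapper_parsing_exception"}}},
    ],
}
print(failed_items(sample_response))
```

In the notebook's flow, this would correspond to calling `failed_items(response.json())` after the `requests.post` call.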

In [168]:
start = pages[0].metadata['source'].index('/')+1
end = len(pages[0].metadata['source'])
title = pages[0].metadata['source'][start:end]

response = requests.get(f'{URL}/_doc/{title}_1', auth=HTTPBasicAuth(es_username, es_password))

print (response.text[:300])

{"_index":"rfp-nestle","_id":"AWS-Response-to-Supply-Risk-Monitor-Solution-RFP.pdf_1","_version":2,"_seq_no":2,"_primary_term":1,"found":true,"_source":{"doc_name": "AWS-Response-to-Supply-Risk-Monitor-Solution-RFP.pdf", "page": 1, "passage": "1 \n1 9  M A Y  2 0 2 3  \nResponse to  \nNestl\u00e9  \


#### To run a quick low code test 

We can use a wrapper class provided by LangChain to query the vector store and return the relevant documents. Behind the scenes, this only runs a QA chain with all default values.

In [82]:
wrapper_store_faiss = VectorStoreIndexWrapper(vectorstore=vectorstore_faiss_aws)
print_ww(wrapper_store_faiss.query("solution overview from AWS", llm=titan_llm))


AWS ADA will deliver Nestlé supply risk reduction as set forth in Your RFP. Our Cloud is compliant
and meets Nestlé security and policy requirements and can be ready to support Your Supply risk
processes within weeks. Our Cloud is compliant with all Nestlé policies and standards, including
Nestlé security. Cloud hosting from AWS also enhances Your data security, facilitates Supply Chain
collaboration, and provides a robust infrastructure for Nestlé Cloud journey. For this reason,
Gartner has ranked AWS as the leader in Cloud for the last 12 consecutive years. Overall, AWS
innovations drive Nestlé agility, competitiveness, and cost-effectiveness positioning You as


In [107]:
query = "Nestle's supply chain challenges"

resp = vectorstore_faiss_aws.similarity_search(query, k=3)
for res in resp:
    print(str(res.metadata['page']) + ":", res.page_content[:300])

10: including political instability, trade disputes, and changes in 
import/export regulations, can introduce significant supply risks. Such external factors 
can disrupt established Supply Chains, increase costs, and create uncertainties. 
Companies need to stay informed about political developments, e
10: Executive Summary Error! No style name given. Error! No style name given. Error! No style 
name given. Error! No style name given. Error! No style name given.   
Amazon Confidential  | 19th of May 2023  11 
Nestlé  
Supply Risk Monitor Solution Request for Proposal  
• Port congestion is another ext
9: Executive Summary Error! No style name given. Error! No style name given. Error! No style 
name given. Error! No style name given. Error! No style name given.   
Amazon Confidential  | 19th of May 2023  10 
Nestlé  
Supply Risk Monitor Solution Request for Proposal  
Technical Offer  
[Insert introd


#### Chatbot application

For the chatbot we need context management, history, vector stores, and many other things. We will start with a ConversationalRetrievalChain.

This uses conversation memory and a RetrievalQAChain, which allows passing in chat history that can be used for follow-up questions. Source: https://python.langchain.com/en/latest/modules/chains/index_examples/chat_vector_db.html

Set verbose to True to see what is going on behind the scenes.

In [32]:
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain
from langchain.chains import ConversationalRetrievalChain
from langchain.chains.conversational_retrieval.prompts import CONDENSE_QUESTION_PROMPT


def create_prompt_template():
    _template = """{chat_history}

Answer only with the new question.
How would you ask the question considering the previous conversation: {question}
Question:"""
    CONVO_QUESTION_PROMPT = PromptTemplate.from_template(_template)
    return CONVO_QUESTION_PROMPT

memory_chain = ConversationBufferMemory(memory_key="chat_history", input_key="question", return_messages=True)
chat_history=[]

#### Parameters used for ConversationalRetrievalChain
retriever: We used a VectorStoreRetriever, which is backed by a VectorStore. To retrieve text, you can choose between two search types: search_type="similarity" or search_type="mmr". With "similarity", the retriever selects the text chunk vectors that are most similar to the question vector.

memory: Memory chain to store the history

condense_question_prompt: Given a question from the user, we use the previous conversation and that question to make up a standalone question

chain_type: Controls how the retrieved documents are combined into the prompt. If they are too long to fit in the model's context, use this parameter to pick a different strategy; the options are "stuff", "refine", "map_reduce", and "map_rerank"

Note: If the question asked is outside the scope of context passed then the model will reply it doesn't know the answer
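To illustrate the difference between the two search types, the sketch below compares plain similarity ranking with a simple maximal marginal relevance (MMR) selection. This is a textbook-style version written for this example; LangChain's own implementation may differ in details, and `mmr_select`, `similarity_select`, and the `lambda_mult` weight as used here are assumptions of the sketch. The vectors are made up.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def similarity_select(query, candidates, k=2):
    """Indices of the k candidates most similar to the query."""
    ranked = sorted(range(len(candidates)),
                    key=lambda i: cosine(query, candidates[i]), reverse=True)
    return ranked[:k]


def mmr_select(query, candidates, k=2, lambda_mult=0.5):
    """Pick k indices, trading relevance to the query against redundancy with earlier picks."""
    selected = []
    remaining = list(range(len(candidates)))

    def score(i):
        relevance = cosine(query, candidates[i])
        redundancy = max((cosine(candidates[i], candidates[j]) for j in selected),
                         default=0.0)
        return lambda_mult * relevance - (1 - lambda_mult) * redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 1.0, 0.0]
chunks = [
    [1.0, 0.10, 0.0],  # chunk 0: most relevant
    [1.0, 0.05, 0.0],  # chunk 1: near-duplicate of chunk 0
    [0.0, 1.00, 0.0],  # chunk 2: relevant in a different direction
]

print("similarity:", similarity_select(query, chunks))
print("mmr:", mmr_select(query, chunks))
```

With these toy vectors, similarity search returns the two near-duplicate chunks, while MMR swaps the duplicate for the chunk that covers a different aspect of the question.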

In [213]:
# turn verbose to true to see the full logs and documents
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain
from langchain.chains import ConversationalRetrievalChain
qa = ConversationalRetrievalChain.from_llm(
    llm=titan_llm, 
    #retriever=vectorstore_faiss_aws.as_retriever(), 
    #retriever=vectorstore_faiss_aws.as_retriever(),
    retriever = docsearch.as_retriever(search_type='similarity', search_kwargs={"k": 8, "vector_field":"embedding",  "text_field": "passage", "metadata_field": "*"}),
    memory=memory_chain,
    #verbose=True,
    #condense_question_prompt=CONDENSE_QUESTION_PROMPT, # create_prompt_template(), 
    chain_type='stuff', # 'refine',
    #max_tokens_limit=100
)

qa.combine_docs_chain.llm_chain.prompt = PromptTemplate.from_template("""
{context}

Answer the question inside the <q></q> XML tags. 

<q>{question}</q>

Do not use any XML tags in the answer. If the answer is not in the context say "Sorry, I don't know, as the answer was not found in the context."

Answer:""")

Let's start a chat

In [214]:
chat = ChatUX(qa, retrievalChain=True)
chat.start_chat()

Starting chat bot


Output()

### In this demo we used the Titan LLM to create a conversational interface with the following patterns:

1. Chatbot (Basic - without context)

2. Chatbot using prompt template(Langchain)

3. Chatbot with personas

4. Chatbot with context