# Testing LLM Capabilities for Convey - An Interactive Survey Interface

In this notebook, we explore how Large Language Models (LLMs) can support our project: building an interactive survey interface. We focus on the following tasks:

## 1. RAG (Retrieval-Augmented Generation)
- Implementing and tuning RAG for tasks such as responding to users and asking them follow-up questions in a personalised manner.
- Exploring RAG's ability to provide relevant product-specific responses based on retrieval from a knowledge source.

## 2. Prompt Engineering
- Crafting effective prompts to guide the LLM's responses.
- Experimenting with different prompt formats and strategies to optimise performance.

## 3. Vector Store Manipulation
- Manipulating vector stores to enhance the understanding and generation capabilities of the LLM.
- Examining the impact of vector store modifications on the quality and relevance of generated responses.

We'll use this notebook to test various features and functionalities provided by the LLM and assess its suitability for the Convey platform.

# Getting Started

1. Create and activate a virtual environment, then run the command below to install the necessary Python packages.
2. Create a Hugging Face API token and store it in a `.env` file in the current working directory as follows:

    HUGGINGFACEHUB_API_TOKEN="hf_***************"

In [None]:
#%pip install -r requirements.txt

# Import Packages

In [None]:
import os
from dotenv import load_dotenv, find_dotenv
from pathlib import Path

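# Import from langchain_core / langchain_community, which the rest of this notebook already uses
from langchain_core.documents import Document
from langchain_community.embeddings import HuggingFaceEmbeddings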
from langchain_community.vectorstores import FAISS
from langchain_community.llms import HuggingFaceEndpoint
from langchain.memory import ConversationBufferMemory
from langchain.prompts import PromptTemplate
from operator import itemgetter
from langchain_core.runnables import RunnableParallel
from langchain_core.output_parsers import StrOutputParser
from transformers import pipeline
from transformers.utils import logging

# Loading Hugging Face Hub API Token into OS

In [None]:
# Load API keys from local .env file if available
if os.path.isfile('.env'):
    # Set path to api key
    dotenv_path = Path('.env')
    load_dotenv(dotenv_path=dotenv_path)
else:
    load_dotenv(find_dotenv())
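
Before calling the Hugging Face endpoint, it can help to confirm that the token was actually loaded. The check below is a minimal sketch: `hf_token_configured` is a hypothetical helper (not part of any library), and it only assumes that the token is stored under `HUGGINGFACEHUB_API_TOKEN` and starts with `hf_`, as in the `.env` example above.

```python
import os


def hf_token_configured(env=os.environ) -> bool:
    """Return True if a plausible Hugging Face token is present in `env`."""
    token = env.get("HUGGINGFACEHUB_API_TOKEN", "")
    # Assumption: user access tokens start with "hf_", as in the .env example
    return token.startswith("hf_") and len(token) > len("hf_")


if not hf_token_configured():
    print("HUGGINGFACEHUB_API_TOKEN is missing; check your .env file.")
```

The helper accepts any mapping, so it can be exercised with a plain dictionary instead of the real environment.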

# Vector Store Using Survey Questions

## Defining Survey Questions and Creating Document Objects

In [None]:
demographic_questions = [
    #{'id': 1, 'question': "What is your name?", "check_user_response": 0},   # This question is taken out and assumed as the first survey question
    {'id': 2, 'question': "What is your age group?", "check_user_response": 0},
    {'id': 3, 'question': "What is your gender identity?", "check_user_response": 0},
]

# Creating Document objects for survey questions
demographic_documents = [
    Document(
        page_content=question['question'],
        metadata={
            "id": question['id'],
            "category":"demographics",
            "check": question['check_user_response']
        }
    ) for question in demographic_questions
]

stage_0_questions = [
    {'id': 4, 'question': "What is your hair length?", "check_user_response": 0},
    {'id': 5, 'question': "What is your hair type?", "check_user_response": 0},
    {'id': 6, 'question': "What are your hair concerns?", "check_user_response": 0},
    {'id': 7, 'question': "What is your scalp type?", "check_user_response": 0},
    {'id': 8, 'question': "What are your scalp concerns?", "check_user_response": 0},
    {'id': 9, 'question': "What hair treatments have you done?", "check_user_response": 0},
]
# Creating Document objects for survey questions
stage_0_documents = [
    Document(
        page_content=question['question'],
        metadata={
            "id": question['id'],
            "stage": "0",
            "check": question['check_user_response']
        }
    ) for question in stage_0_questions
]

stage_1_questions = [
    {'id': 10, 'question': "How often do you wash your hair?", "check_user_response": 0},
    {'id': 11, 'question': "What hair products do you use regularly?", "check_user_response": 0},
    {'id': 12, 'question': "What hair styling products do you use regularly?", "check_user_response": 0},
    {'id': 13, 'question': "How often do you switch hair product brands?", "check_user_response": 0},
    {'id': 14, 'question': "How often do you visit hair salons or barber shops?", "check_user_response": 0},
    {'id': 15, 'question': "What is your ideal hair goal?", "check_user_response": 0},
    {'id': 16, 'question': "How important is hair health to you?", "check_user_response": 0},
]
stage_1_documents = [
    Document(
        page_content=question['question'],
        metadata={
            "id": question['id'],
            "stage": "1",
            "check": question['check_user_response']
        }
    ) for question in stage_1_questions
]

stage_2_questions = [
    {'id': 17, 'question': "Which of the following Pantene product series (collections) are you aware of?", "check_user_response": 0},
    {'id': 18, 'question': "Where did you first hear about Pantene?", "check_user_response": 0},
    {'id': 19, 'question': "What is your favorite Pantene product and what do you like about it?", "check_user_response": 0},
    {'id': 20, 'question': "What is your least favorite Pantene product and what do you dislike about it?", "check_user_response": 0},
    {'id': 21, 'question': "How would you rate the effectiveness of your favorite / least favorite Pantene product?", "check_user_response": 0},
    {'id': 22, 'question': "Would you recommend your current hair products to others? Why?", "check_user_response": 1},
    {'id': 23, 'question': "What hair product improvements would you like to see in the future?", "check_user_response": 1},
]
stage_2_documents = [
    Document(
        page_content=question['question'],
        metadata={
            "id": question['id'],
            "stage": "2",
            "check": question['check_user_response']
        }
    ) for question in stage_2_questions
]

stage_3_questions = [
    {'id': 24, 'question': "When choosing hair products, how important are the following factors to you?", "check_user_response": 0},
    {'id': 25, 'question': "What is your preferred price range for hair products?", "check_user_response": 0},
    {'id': 26, 'question': "Do you prefer to purchase hair products online or in-store? If in-store, which stores?", "check_user_response": 1},
]
stage_3_documents = [
    Document(
        page_content=question['question'],
        metadata={
            "id": question['id'],
            "stage": "3",
            "check": question['check_user_response']
        }
    ) for question in stage_3_questions
]

## Initialising an Embedding Model from Hugging Face

In [None]:
# Using an embedding model from Hugging Face
embedding_model = HuggingFaceEmbeddings(
    model_name='all-MiniLM-L6-v2', 
    model_kwargs={'device': 'cpu'},
    encode_kwargs = {'normalize_embeddings': False}
)

## Employing FAISS Vector Store

In [None]:
# Creating a vectorstore for the documents/survey questions
demographic_db = FAISS.from_documents(
    demographic_documents,
    embedding=embedding_model,
)
# Saving the vectorstore in local directory - persistence
demographic_db.save_local("demographic_questions")
# Loading the vectorstore from local directory
demographic_db = FAISS.load_local("demographic_questions", embedding_model, allow_dangerous_deserialization=True)

stage_0_db = FAISS.from_documents(
    stage_0_documents,
    embedding=embedding_model,
)
stage_0_db.save_local("stage_0_questions")
stage_0_db = FAISS.load_local("stage_0_questions", embedding_model, allow_dangerous_deserialization=True)

stage_1_db = FAISS.from_documents(
    stage_1_documents,
    embedding=embedding_model,
)
stage_1_db.save_local("stage_1_questions")
stage_1_db = FAISS.load_local("stage_1_questions", embedding_model, allow_dangerous_deserialization=True)

stage_2_db = FAISS.from_documents(
    stage_2_documents,
    embedding=embedding_model,
)
stage_2_db.save_local("stage_2_questions")
stage_2_db = FAISS.load_local("stage_2_questions", embedding_model, allow_dangerous_deserialization=True)

stage_3_db = FAISS.from_documents(
    stage_3_documents,
    embedding=embedding_model,
)
stage_3_db.save_local("stage_3_questions")
stage_3_db = FAISS.load_local("stage_3_questions", embedding_model, allow_dangerous_deserialization=True)

## Similarity Search

In [None]:
text = "30 years old"

demographic_db.similarity_search_with_score(text, k=1, filter=dict(category='demographics'))
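
To make the result of the cell above easier to read, here is a toy, pure-Python sketch of what a filtered nearest-neighbour search does. It is not the FAISS implementation: the two-dimensional vectors, the `toy_search` helper, and the squared-L2 metric are illustrative assumptions. To our understanding, the score returned by LangChain's FAISS wrapper under its default settings is likewise a distance, so a lower score means a closer match.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def toy_search(query_vec, entries, k=1, metadata_filter=None):
    """Return the k closest (text, distance) pairs among entries passing the filter.

    entries: list of (vector, text, metadata) tuples.
    """
    candidates = [
        entry for entry in entries
        if not metadata_filter
        or all(entry[2].get(key) == value for key, value in metadata_filter.items())
    ]
    scored = sorted((squared_l2(query_vec, vec), text) for vec, text, _ in candidates)
    return [(text, dist) for dist, text in scored[:k]]


# Hypothetical 2-D "embeddings" standing in for real sentence embeddings
entries = [
    ([1.0, 0.0], "What is your age group?", {"category": "demographics"}),
    ([0.0, 1.0], "What is your gender identity?", {"category": "demographics"}),
    ([0.9, 0.1], "What is your hair length?", {"stage": "0"}),
]

filtered = toy_search([0.8, 0.2], entries, k=1,
                      metadata_filter={"category": "demographics"})
unfiltered = toy_search([0.8, 0.2], entries, k=1)
print(filtered)
print(unfiltered)
```

With the filter, only the demographic questions compete; without it, the hair-length question is nearest to the query. This is why each stage keeps its own vector store or metadata tag.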

# RAG Pipeline

## Initialising an Open-source LLM from Hugging Face 

In [None]:
ENDPOINT_URL = "https://api-inference.huggingface.co/models/mistralai/Mixtral-8x7B-Instruct-v0.1"
# ENDPOINT_URL = "mistralai/Mixtral-8x7B-Instruct-v0.1"

# callbacks = [StreamingStdOutCallbackHandler()]
llm = HuggingFaceEndpoint(
    endpoint_url=ENDPOINT_URL,
    task="text-generation",
    max_new_tokens=256,
    #top_k=50,
    temperature=0.01,
    #repetition_penalty=1.03,
    return_full_text=False,
    # callbacks=callbacks,
    streaming=True,
    stop_sequences=['</s>'],
)

## Creating a Retriever with Vector Store

In [None]:
def get_retriever(vectorstore: FAISS):
    # Setting retriever to only retrieve the best follow-up question 
    retriever = vectorstore.as_retriever(search_kwargs={'k': 1})
    return retriever

retriever = get_retriever(demographic_db)

## Simulating First Survey Question

In [None]:
# Ask the first question
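# The name question is assumed to be the first survey question (see the demographic questions above)
first_question = llm.invoke("[INST]I am doing a survey. Greet me excitedly and ask me what is my name. Do not add anything.[/INST]") #in a fairy tale setting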

first_question

## Creating a Chat Log Object 

In [None]:
# Logging of chat
def create_chat_log():
    memory = ConversationBufferMemory(return_messages=False, memory_key='chat_history')
    return memory

def add_to_chat_log(chat_log, message_type: str, message: str):
    if message_type == 'ai':
        chat_log.chat_memory.add_ai_message(message)
    else:
        chat_log.chat_memory.add_user_message(message)

def get_chat_history(chat_log):
    chat_history = chat_log.load_memory_variables({})['chat_history']
    return chat_history


chat_log = create_chat_log()
add_to_chat_log(chat_log, message_type='ai', message=first_question)
get_chat_history(chat_log)

## Initialising RAG Chain

In [None]:
#from langchain_core.runnables import RunnableLambda - to be used for multiple arguments input

def get_rag_chain(retriever):
    # General prompt for all questions
    prompt_template = """You are a friendly survey interface assistant.
        You are given a survey question, a survey user response to that question, the sentiment of the user response and a follow-up question below.
        Reply to the survey user response kindly and just ask the follow-up question. Do not say anything else.
        Do not ask any other questions.

        Question: {previous_question}
        User response: {user_response}
        User sentiment: {sentiment}
        Follow-up question: {next_question}
        
        Reply:"""
    prompt = PromptTemplate(
        template=prompt_template, input_variables=['previous_question', 'user_response', 'next_question', 'sentiment']
    )

    def format_docs(docs):
        return "\n\n".join(doc.page_content for doc in docs)
        # return "\n\n".join(doc.metadata['prompt'] + '\n' + doc.page_content for doc in docs)

    rag_chain = (
        # Retrieve next best question
        RunnableParallel({"docs": itemgetter("user_response") | retriever, "user_response": itemgetter("user_response"), "sentiment": itemgetter("sentiment"), "previous_question": itemgetter("previous_question")})
        # Optional: Format question to ask user
        | ({"docs": lambda x: x['docs'], "user_response": itemgetter("user_response"), "sentiment": itemgetter("sentiment"), "next_question": lambda x: format_docs(x['docs']), "previous_question": itemgetter("previous_question")})
        # Optional: Prompt Engineering - Each question to have their own prompt template for LLM to ask the question
        | ({"docs": lambda x: x['docs'], "prompt": prompt, "user_response": itemgetter("user_response"), "sentiment": itemgetter("sentiment"), "next_question": itemgetter("next_question"), "previous_question": itemgetter("previous_question")}) 
        # Output results
        | ({"answer": itemgetter("prompt") | llm | StrOutputParser(), "docs": lambda x: x['docs'], "user_response": itemgetter("user_response"), "sentiment": itemgetter("sentiment"), "previous_question": itemgetter("previous_question")})
    )
    return rag_chain 


rag_chain = get_rag_chain(retriever)
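
When tuning the prompt, it is useful to see the exact text the LLM receives. The sketch below renders the same template with plain `str.format`, independently of LangChain; `render_prompt` is a hypothetical helper and the example values are made up for illustration.

```python
PROMPT_TEMPLATE = """You are a friendly survey interface assistant.
You are given a survey question, a survey user response to that question, the sentiment of the user response and a follow-up question below.
Reply to the survey user response kindly and just ask the follow-up question. Do not say anything else.
Do not ask any other questions.

Question: {previous_question}
User response: {user_response}
User sentiment: {sentiment}
Follow-up question: {next_question}

Reply:"""


def render_prompt(previous_question, user_response, sentiment, next_question):
    """Fill the template the same way the chain's prompt step would."""
    return PROMPT_TEMPLATE.format(
        previous_question=previous_question,
        user_response=user_response,
        sentiment=sentiment,
        next_question=next_question,
    )


preview = render_prompt(
    previous_question="What is your name?",
    user_response="I am Xiao Ming.",
    sentiment="neutral",
    next_question="What is your age group?",
)
print(preview)
```

Printing the rendered prompt for a few inputs makes it easier to compare prompt variants before spending calls on the hosted model.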

## Invoking RAG Chain with User Response to First Question

In [None]:
user_response = "I am Xiao Ming."
add_to_chat_log(chat_log, message_type='user', message=user_response)
get_chat_history(chat_log)

### Sentiment of User Response

In [None]:
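logging.set_verbosity_error()

# Load the sentiment model once so that it is not re-initialised on every call
sentiment_pipe = pipeline("text-classification", model="cardiffnlp/twitter-roberta-base-sentiment-latest")

def get_user_sentiment(user_response: str):
    # The pipeline returns a list of dicts; keep only the predicted label
    user_sentiment = sentiment_pipe(user_response)[0]['label']
    return user_sentiment

user_sentiment = get_user_sentiment(user_response)
user_sentiment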

In [None]:
def invoke_rag_chain(rag_chain, user_response: str, user_sentiment: str, previous_question: str):
    output = {}
    for chunk in rag_chain.stream(dict(user_response=user_response, sentiment=user_sentiment, previous_question=previous_question)):
        for key in chunk:
            if key not in output:
                output[key] = chunk[key].strip() if key == 'answer' else chunk[key]
            # if key == 'answer':
                # new_token = chunk[key]
                # yield new_token
                # output[key] += new_token
            else:
                output[key] += chunk[key]
            if key == 'answer':
                print(chunk[key], end="", flush=True)
    return output
    
def get_llm_outputs(rag_chain, user_response: str, previous_question: str):
    user_sentiment = get_user_sentiment(user_response)
    output = invoke_rag_chain(rag_chain, user_response, user_sentiment, previous_question)
    # LLM reply to output to frontend
    llm_reply = output['answer']
    # Get document of question asked by LLM 
    next_question_document = output['docs'][0]
    # id of question asked to output to frontend 
    next_question_id = next_question_document.metadata['id']
    return llm_reply, next_question_document, next_question_id


llm_reply, next_question_document, next_question_id = get_llm_outputs(rag_chain, user_response, first_question)
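
The accumulation logic in `invoke_rag_chain` can be tried without calling the LLM. The sketch below is a simplified stand-in: `fake_stream` is a hypothetical generator that mimics the shape of the streamed chunks (one dictionary per chunk, with `answer` arriving in pieces), and `accumulate_stream` merges them key by key.

```python
def accumulate_stream(chunks):
    """Merge streamed dict chunks: the first value per key is stored, later ones appended."""
    output = {}
    for chunk in chunks:
        for key, value in chunk.items():
            if key in output:
                output[key] += value
            else:
                output[key] = value
    return output


def fake_stream():
    """Hypothetical stand-in for rag_chain.stream(...)."""
    yield {"docs": ["What is your age group?"]}
    yield {"answer": "Nice to meet you! "}
    yield {"answer": "What is your age group?"}


result = accumulate_stream(fake_stream())
print(result["answer"])
```

Keys such as `docs` arrive once and are stored as-is, while the `answer` tokens are concatenated in arrival order.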

## Deleting Asked Question from Vector Store Object

In [None]:
def remove_question_from_db(vectorstore: FAISS, document_to_delete: Document):
    # Find the docstore id of the matching document and delete only that entry.
    # If no document matches, the vectorstore is left unchanged.
    for doc_id, item in vectorstore.docstore._dict.items():
        if item == document_to_delete:
            vectorstore.delete([doc_id])
            break
    return vectorstore


print(len(demographic_db.docstore._dict))
demographic_db = remove_question_from_db(demographic_db, next_question_document)
print(len(demographic_db.docstore._dict))

## Response Verification

In [None]:
from langchain.evaluation import Criteria

list(Criteria)

In [None]:
from langchain.evaluation import load_evaluator
from langchain.evaluation import EvaluatorType

evaluator = load_evaluator(EvaluatorType.CRITERIA, criteria="coherence", llm=llm)

In [None]:
response = "ahhahahah"
eval_result = evaluator.evaluate_strings(
        prediction=response,
        input='What is your gender identity?',
    )
eval_result

In [None]:
def verify_user_response(question,response):
    eval_result = evaluator.evaluate_strings(
        prediction=response,
        input=question,
    )

    return eval_result['value']
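
The evaluator depends on the LLM producing a parseable verdict, which an open-source model may not always do. The sketch below shows one defensive way to turn the result into a boolean; `is_coherent` is a hypothetical helper, and it only assumes that the result is a dictionary with a `value` key, as used in `verify_user_response` above.

```python
def is_coherent(eval_result: dict) -> bool:
    """Treat the response as coherent only if the verdict is clearly 'Y'."""
    value = eval_result.get("value")
    # A missing or unparseable verdict is treated as "not coherent"
    return isinstance(value, str) and value.strip().upper() == "Y"


print(is_coherent({"value": "Y"}))
print(is_coherent({"value": None}))
```

Treating an unclear verdict as "not coherent" errs on the side of asking the question again; the opposite default may be preferable if repeated questions annoy users.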

# Conversation Simulation

Make sure to run all the cells above first, since the simulation relies on the functions and objects they define.

## Reload Vector Store From Local Directory

In [None]:
demographic_db = FAISS.load_local("demographic_questions", embedding_model, allow_dangerous_deserialization=True)
stage_0_db = FAISS.load_local("stage_0_questions", embedding_model, allow_dangerous_deserialization=True)
stage_1_db = FAISS.load_local("stage_1_questions", embedding_model, allow_dangerous_deserialization=True)
stage_2_db = FAISS.load_local("stage_2_questions", embedding_model, allow_dangerous_deserialization=True)
stage_3_db = FAISS.load_local("stage_3_questions", embedding_model, allow_dangerous_deserialization=True)

## Begin Loop

In [None]:
# Print a fixed closing message to end the survey
def end_survey():
    print('\n')
    print("It was interesting to get to know more about you! Thank you for participating in the survey!")
    print("If you have any further questions or feedback, feel free to reach out to us.")

# Get question asked
def get_question_asked(question_document):
    # Return the original question text stored in the document
    return question_document.page_content

# get_question_asked(next_question_document)

In [None]:
chat_log = create_chat_log()
stage = None # Change this for testing different stages
db = demographic_db
retriever = get_retriever(demographic_db)
question_asked = "What is your name?"
user_response = ""
next_question_document = None
clarified = False

first_question = llm.invoke(f"[INST]I am starting to answer a survey. Greet me and ask me: {question_asked}[/INST]")
print(f"LLM: {first_question}")

while True:
    # User responded
    if user_response:
        # Check user response for questions that are specified to check
        if (next_question_document is not None) and (next_question_document.metadata['check'] == 1):
            # Check if user response is coherent with the question asked
            is_coherent = verify_user_response(question_asked, user_response) == 'Y'
            # If not coherent, repeat the question, but only once per question
            if not is_coherent and not clarified:
                clarified = True
                # TO DO: Improve the instruction or construct a LLM chain to ask the question again.
                repeat_question = llm.invoke(f"[INST] The survey user did not answer the question clearly. Kindly ask the question again: {question_asked} [/INST]")
                print('\n')
                print(f"LLM: {repeat_question}")
                # Wait for user input
                user_response = input()
                print('\n')
                print("User: ", end='')
                print(user_response)
                add_to_chat_log(chat_log, message_type='user', message=user_response)
                continue
        # Moving on to the next question, so reset the clarification flag
        clarified = False
        
        # Survey flow
        if len(db.docstore._dict) == 0 and stage is None:
            stage = 0
            db = stage_0_db
        elif len(db.docstore._dict) == 0 and stage == 0:
            stage = 1
            db = stage_1_db
        elif len(db.docstore._dict) == 0 and stage == 1:
            stage = 2
            db = stage_2_db
        elif len(db.docstore._dict) == 0 and stage == 2:
            stage = 3
            db = stage_3_db
        elif len(db.docstore._dict) == 0 and stage == 3:
            # To end the survey gracefully
            end_survey()
            break

        ## Ask the next best question based on previous survey user response
        # Create new retriever object with updated vectorstore
        retriever = get_retriever(db)
        # Create new RAG chain with updated retriever
        qa_chain = get_rag_chain(retriever)
        print('\n')
        print("LLM: ", end='')
        # Get LLM reply, next question to ask and its question id
        llm_reply, next_question_document, next_question_id = get_llm_outputs(qa_chain, user_response, question_asked)
        # Get question asked
        question_asked = get_question_asked(next_question_document)
        add_to_chat_log(chat_log, message_type='ai', message=llm_reply)
        # Updated vectorstore with asked question removed
        db = remove_question_from_db(db, next_question_document)
        
        
    # Wait for user input
    user_response = input()
    print('\n')
    print("User: ", end='')
    print(user_response)
    add_to_chat_log(chat_log, message_type='user', message=user_response)
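
The stage transitions in the loop above can also be expressed as a small lookup, which avoids a growing `if`/`elif` chain as stages are added. This is a sketch only: `STAGE_ORDER` and `advance_stage` are hypothetical names, and the order simply mirrors the one used in the loop (demographics as `None`, then stages 0 to 3).

```python
# Order of survey stages; None stands for the demographic questions
STAGE_ORDER = [None, 0, 1, 2, 3]


def advance_stage(stage):
    """Return (next_stage, finished) once the current stage has no questions left."""
    position = STAGE_ORDER.index(stage)
    if position + 1 < len(STAGE_ORDER):
        return STAGE_ORDER[position + 1], False
    # The last stage is exhausted, so the survey is over
    return stage, True


print(advance_stage(None))
print(advance_stage(3))
```

In the loop, the returned stage could index a dictionary of vector stores, and `finished` would trigger `end_survey()`.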