# Testing LLM Capabilities for Convey - An Interactive Survey Interface

In this notebook, we explore the capabilities of Large Language Models (LLMs) for Convey, our interactive survey interface project. We focus on the following tasks:

## 1. RAG (Retrieval-Augmented Generation)
- Implementing and fine-tuning RAG for tasks such as replying to users and asking follow-up questions in a personalised manner.
- Exploring RAG's ability to provide relevant product-specific responses based on retrieval from a knowledge source.

## 2. Prompt Engineering
- Crafting effective prompts to guide the LLM's responses.
- Experimenting with different prompt formats and strategies to optimise performance.

## 3. Vector Store Manipulation
- Manipulating vector stores to enhance the understanding and generation capabilities of the LLM.
- Examining the impact of vector store modifications on the quality and relevance of generated responses.

We'll use this notebook to test various features and functionalities provided by the LLM and assess its suitability for the Convey platform.

# Getting Started

1. Create and activate a virtual environment, then run the command below to install the necessary Python packages.
2. Create a Hugging Face API token and store it in a .env file in the current working directory as follows:

    HUGGINGFACEHUB_API_TOKEN="hf_***************"

In [125]:
#%pip install -r requirements.txt

# Import Packages

In [126]:
import os
from dotenv import load_dotenv, find_dotenv
from pathlib import Path

from langchain.docstore.document import Document
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.llms import HuggingFaceEndpoint
from langchain.memory import ConversationBufferMemory
from langchain.prompts import PromptTemplate
from operator import itemgetter
from langchain_core.runnables import RunnableParallel
from langchain_core.output_parsers import StrOutputParser
from transformers import pipeline
from transformers.utils import logging

# Loading Hugging Face Hub API Token into OS

In [127]:
# Load API keys from local .env file if available
if os.path.isfile('.env'):
    # Set path to api key
    dotenv_path = Path('.env')
    load_dotenv(dotenv_path=dotenv_path)
else:
    load_dotenv(find_dotenv())
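
A quick check after loading can catch a missing or misnamed token before any model call fails. The sketch below is a hypothetical helper (has_hf_token is not part of the notebook) and assumes the token uses the hf_ prefix shown in the .env example.

```python
import os

def has_hf_token(env=None) -> bool:
    """Return True if a plausible Hugging Face token is present.

    Assumption: tokens start with 'hf_', as in the .env example.
    """
    env = os.environ if env is None else env
    token = env.get("HUGGINGFACEHUB_API_TOKEN", "")
    # Require the prefix plus at least one more character
    return token.startswith("hf_") and len(token) > len("hf_")

if not has_hf_token():
    print("HUGGINGFACEHUB_API_TOKEN is missing; check the .env file.")
```

Passing a plain dict as env makes the check easy to try without touching the real environment.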

# Vector Store Using Survey Questions

## Defining Survey Questions and Creating Document Objects

In [128]:
demographic_questions = [
    #"What is your age?",   # This question is taken out and assumed as the first survey question
    "What is your gender identity?",
    "Where do you live?",
    "What is your approximate annual income?",
    "What is your employment status?",
    "What is the highest level of education you have completed?",
    "Do you have children under the age of 18?",
    "What is your marital status?",
    "What is your primary language?",
]

# Creating Document objects for survey questions
documents = [
    Document(
        page_content=question,
        metadata={
            "id":i,
            "category":"demographics",
            #"prompt":question_prompt
        }
    ) for i, question in enumerate(demographic_questions)
]

documents

[Document(page_content='What is your gender identity?', metadata={'id': 0, 'category': 'demographics'}),
 Document(page_content='Where do you live?', metadata={'id': 1, 'category': 'demographics'}),
 Document(page_content='What is your approximate annual income?', metadata={'id': 2, 'category': 'demographics'}),
 Document(page_content='What is your employment status?', metadata={'id': 3, 'category': 'demographics'}),
 Document(page_content='What is the highest level of education you have completed?', metadata={'id': 4, 'category': 'demographics'}),
 Document(page_content='Do you have children under the age of 18?', metadata={'id': 5, 'category': 'demographics'}),
 Document(page_content='What is your marital status?', metadata={'id': 6, 'category': 'demographics'}),
 Document(page_content='What is your primary language?', metadata={'id': 7, 'category': 'demographics'})]

## Initialising an Embedding Model from Hugging Face

In [129]:
# Using an embedding model from Hugging Face
embedding_model = HuggingFaceEmbeddings(
    model_name='all-MiniLM-L6-v2', 
    model_kwargs={'device': 'cpu'},
    encode_kwargs = {'normalize_embeddings': False}
)

## Employing FAISS Vector Store

In [130]:
# Creating a vectorstore for the documents/survey questions
db = FAISS.from_documents(
    documents,
    embedding=embedding_model,
)

# Saving the vectorstore in local directory - persistence
db.save_local("faiss_index")

# Loading the vectorstore from local directory
db = FAISS.load_local("faiss_index", embedding_model, allow_dangerous_deserialization=True)

## Similarity Search

In [131]:
text = "30 years old"

db.similarity_search_with_score(text, k=1, filter=dict(category='demographics'))

[(Document(page_content='Do you have children under the age of 18?', metadata={'id': 5, 'category': 'demographics'}),
  1.2642794)]
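
The number returned next to each document is a distance, not a similarity: with the default settings of LangChain's FAISS wrapper, a lower score means a closer match. The toy sketch below ranks hand-made 2-D vectors by squared Euclidean distance to show the idea. The vectors and the nearest helper are illustrative assumptions, not real embeddings or part of the notebook.

```python
def squared_l2(a, b):
    # Sum of squared coordinate differences between two vectors
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(query, candidates, k=1):
    # Score every candidate, then keep the k smallest distances
    scored = [(name, squared_l2(query, vec)) for name, vec in candidates.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]

# Hypothetical 2-D stand-ins for question embeddings
toy_vectors = {
    "children under 18": [0.9, 0.1],
    "primary language": [0.1, 0.9],
}
print(nearest([1.0, 0.0], toy_vectors))
```

Read this way, the score of about 1.26 above is only meaningful relative to the scores of the other candidate questions.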

# RAG Pipeline

## Initialising an Open-source LLM from Hugging Face 

In [132]:
ENDPOINT_URL = "https://api-inference.huggingface.co/models/mistralai/Mixtral-8x7B-Instruct-v0.1"
# ENDPOINT_URL = "mistralai/Mixtral-8x7B-Instruct-v0.1"

# callbacks = [StreamingStdOutCallbackHandler()]
llm = HuggingFaceEndpoint(
    endpoint_url=ENDPOINT_URL,
    task="text-generation",
    max_new_tokens=128,
    #top_k=50,
    temperature=0.01,
    #repetition_penalty=1.03,
    return_full_text=False,
    # callbacks=callbacks,
    streaming=True,
    stop_sequences=['</s>'],
)

Token has not been saved to git credential helper. Pass `add_to_git_credential=True` if you want to set the git credential as well.
Token is valid (permission: read).
Your token has been saved to C:\Users\user\.cache\huggingface\token
Login successful


## Creating a Retriever with Vector Store

In [133]:
def get_retriever(vectorstore: FAISS):
    # Setting retriever to only retrieve the best follow-up question 
    retriever = vectorstore.as_retriever(search_kwargs={'k': 1})
    return retriever

retriever = get_retriever(db)

## Simulating First Survey Question

In [134]:
# Ask the first question
first_question = llm.invoke("[INST]I am doing a survey. Greet me excitedly and ask me what is my age. Do not add anything.[/INST]") #in a fairy tale setting

first_question

" Hello there! I'm so excited to be part of your survey. I was wondering if you could tell me your age? Thank you!"

## Creating a Chat Log Object 

In [135]:
# Logging of chat
def create_chat_log():
    memory = ConversationBufferMemory(return_messages=False, memory_key='chat_history')
    return memory

def add_to_chat_log(chat_log, message_type: str, message: str):
    if message_type == 'ai':
        chat_log.chat_memory.add_ai_message(message)
    else:
        chat_log.chat_memory.add_user_message(message)

def get_chat_history(chat_log):
    chat_history = chat_log.load_memory_variables({})['chat_history']
    return chat_history


chat_log = create_chat_log()
add_to_chat_log(chat_log, message_type='ai', message=first_question)
get_chat_history(chat_log)

"AI:  Hello there! I'm so excited to be part of your survey. I was wondering if you could tell me your age? Thank you!"

## Initialising RAG Chain

In [136]:
#from langchain_core.runnables import RunnableLambda - to be used for multiple arguments input

def get_rag_chain(retriever):
    # General prompt for all questions
    prompt_template = """You are a friendly survey interface assistant.
        You are given a survey user response, its sentiment and a follow-up question below.
        Reply to the survey user kindly and ask the follow-up question.
        Do not ask any other questions.

        User response: {question}
        User sentiment: {sentiment}
        Follow-up question: {context}
        
        Reply:"""
    prompt = PromptTemplate(
        template=prompt_template, input_variables=['question', 'context', 'sentiment']
    )

    def format_docs(docs):
        return "\n\n".join(doc.page_content for doc in docs)
        # return "\n\n".join(doc.metadata['prompt'] + '\n' + doc.page_content for doc in docs)

    rag_chain = (
        # Retrieve next best question
        RunnableParallel({"docs": itemgetter("question") | retriever, "question": itemgetter("question"), "sentiment": itemgetter("sentiment")})
        # Optional: Format question to ask user
        | ({"docs": lambda x: x['docs'], "question": itemgetter("question"), "sentiment": itemgetter("sentiment"), "context": lambda x: format_docs(x['docs'])})
        # Optional: Prompt Engineering - Each question to have their own prompt template for LLM to ask the question
        | ({"docs": lambda x: x['docs'], "prompt": prompt, "question": itemgetter("question"), "sentiment": itemgetter("sentiment"), "context": itemgetter("context")}) 
        # Output results
        | ({"answer": itemgetter("prompt") | llm | StrOutputParser(), "docs": lambda x: x['docs'], "question": itemgetter("question"), "sentiment": itemgetter("sentiment") })
    )
    return rag_chain 


rag_chain = get_rag_chain(retriever)
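
To see what the LLM actually receives, the filled-in prompt can be previewed without calling the retriever or the model. The sketch below uses plain str.format on a shortened copy of the template as a stand-in for PromptTemplate; preview_prompt and the sample values are illustrative assumptions.

```python
# Shortened copy of the template body, kept to the three input variables
PROMPT_TEMPLATE = (
    "User response: {question}\n"
    "User sentiment: {sentiment}\n"
    "Follow-up question: {context}\n"
    "\n"
    "Reply:"
)

def preview_prompt(question: str, sentiment: str, context: str) -> str:
    # Fill the placeholders the same way an f-string style template would
    return PROMPT_TEMPLATE.format(question=question, sentiment=sentiment, context=context)

print(preview_prompt("I am 24 years old.", "neutral", "What is your marital status?"))
```

Printing the prompt this way helps when comparing different prompt wordings side by side.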

## Invoking RAG Chain with User Response to First Question

In [137]:
user_response = "I am 24 years old."
add_to_chat_log(chat_log, message_type='user', message=user_response)
get_chat_history(chat_log)

"AI:  Hello there! I'm so excited to be part of your survey. I was wondering if you could tell me your age? Thank you!\nHuman: I am 24 years old."

### Sentiment of user response

In [138]:
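logging.set_verbosity_error()

# Load the sentiment classifier once; building the pipeline inside the function reloads the model on every call
sentiment_pipe = pipeline("text-classification", model="cardiffnlp/twitter-roberta-base-sentiment-latest")

def get_user_sentiment(user_response: str):
    # The pipeline returns a list of dicts; keep the label of the top prediction
    user_sentiment = sentiment_pipe(user_response)[0]['label']
    return user_sentiment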

user_sentiment = get_user_sentiment(user_response)
user_sentiment

'neutral'
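
A text-classification pipeline returns a list of dicts, each with a label and a score. Very short replies, such as a bare number, carry little sentiment signal, so one option is to fall back to a neutral label when the score is low. The sketch below works on a mocked pipeline result; the label_with_fallback helper and the 0.6 threshold are assumptions, not tuned values.

```python
def label_with_fallback(results, threshold=0.6, fallback="neutral"):
    # results mimics the pipeline output: [{'label': ..., 'score': ...}]
    top = results[0]
    return top["label"] if top["score"] >= threshold else fallback

# Mocked low-confidence prediction (hypothetical numbers)
mocked = [{"label": "positive", "score": 0.41}]
print(label_with_fallback(mocked))
```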

In [139]:
def invoke_rag_chain(rag_chain, user_response: str, user_sentiment: str):
    output = {}
    for chunk in rag_chain.stream(dict(question=user_response, sentiment=user_sentiment)):
        for key in chunk:
            # Accumulate streamed pieces per key (string tokens for 'answer', lists for 'docs')
            if key not in output:
                output[key] = chunk[key]
            else:
                output[key] += chunk[key]
            # Echo answer tokens as they arrive
            if key == 'answer':
                print(chunk[key], end="", flush=True)
    # Strip once at the end so whitespace between streamed tokens is preserved
    if 'answer' in output:
        output['answer'] = output['answer'].strip()
    return output
    
def get_llm_outputs(rag_chain, user_response: str):
    user_sentiment = get_user_sentiment(user_response)
    output = invoke_rag_chain(rag_chain, user_response, user_sentiment)
    # LLM reply to output to frontend
    llm_reply = output['answer']
    # Get document of question asked by LLM 
    next_question_document = output['docs'][0]
    # id of question asked to output to frontend 
    next_question_id = next_question_document.metadata['id']
    return llm_reply, next_question_document, next_question_id


llm_reply, next_question_document, next_question_id = get_llm_outputs(rag_chain, user_response)

 Thank you for sharing your age with us! Do you have children under the age of 18?

## Deleting Asked Question from Vector Store Object

In [140]:
def remove_question_from_db(vectorstore: FAISS, document_to_delete: Document):
    # Find the docstore id of the matching document (None if it is not in the store)
    doc_id = next(
        (key for key, item in vectorstore.docstore._dict.items() if item == document_to_delete),
        None,
    )
    # Delete only when a match was found, so a miss never removes an unrelated question
    if doc_id is not None:
        vectorstore.delete([doc_id])
    return vectorstore


print(len(db.docstore._dict))
db = remove_question_from_db(db, next_question_document)
print(len(db.docstore._dict))

8
7
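
The find-then-delete idea can be tried without FAISS by using a plain dict as a stand-in for the docstore. In the sketch below, delete_matching and the toy keys are hypothetical; the point is that a miss leaves the store untouched.

```python
def delete_matching(docstore: dict, target) -> bool:
    # Find the key whose value equals the target (None when there is no match)
    doc_id = next((key for key, value in docstore.items() if value == target), None)
    if doc_id is None:
        return False
    del docstore[doc_id]
    return True

# Toy docstore: generated-looking ids mapped to question text
toy_store = {"a1": "What is your marital status?", "b2": "Where do you live?"}
print(delete_matching(toy_store, "Where do you live?"), len(toy_store))  # True 1
print(delete_matching(toy_store, "Not in the store"), len(toy_store))    # False 1
```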


## Response verification

In [141]:
from langchain.evaluation import load_evaluator
from langchain.evaluation import EvaluatorType

evaluator = load_evaluator(EvaluatorType.CRITERIA, criteria="coherence")

In [158]:
response = "I don't know"

In [160]:
eval_result = evaluator.evaluate_strings(
        prediction=response,
        input='What is your gender identity?',
    )
print(eval_result)

{'reasoning': 'The criterion is to assess whether the submission is coherent, well-structured, and organized.\n\nThe input is a question asking for the user\'s gender identity. The submission in response to this is "I don\'t know".\n\nCoherence refers to the logical and consistent interconnection of parts in a text. In this case, the response "I don\'t know" is coherent as it logically answers the question, albeit in a non-specific way.\n\nWell-structured and organized refers to the arrangement and presentation of ideas in a clear and orderly manner. The response "I don\'t know" is well-structured and organized as it is a clear and direct answer to the question.\n\nTherefore, the submission meets the criterion.\n\nY', 'value': 'Y', 'score': 1}


In [159]:
def verify_answer(next_question_document, response):
    # Read the question text directly from the Document
    question = next_question_document.page_content
    eval_result = evaluator.evaluate_strings(
        prediction=response,
        input=question,
    )
    # 'Y' if the response meets the coherence criterion, 'N' otherwise
    return eval_result['value']

verify_answer(next_question_document, response)

'Y'

# Conversation Simulation

Run the cells above first so that the helper functions used below are defined.

## Reload Vector Store From Local Directory

In [16]:
db = FAISS.load_local("faiss_index", embedding_model, allow_dangerous_deserialization=True)
len(db.docstore._dict)

8

## Begin Loop

In [149]:
# End the survey gracefully; defined before the loop so it exists on the first pass
def end_survey():
    print("It was interesting to get to know more about you! Thank you for participating in the survey!")
    print("If you have any further questions or feedback, feel free to reach out to us.")

# Get back original question:
def get_original_question(question_id):
    # Docstore keys are generated ids, so match on the 'id' stored in each document's metadata
    for document in db.docstore._dict.values():
        if document.metadata['id'] == question_id:
            return document.page_content
    # Questions that were already asked have been removed from the vector store
    return None

chat_log = create_chat_log()
retriever = get_retriever(db)

while True:
    if len(db.docstore._dict) == 0:
        # To end the survey gracefully
        end_survey()
        break  # Exit the loop when the survey ends

    chat_history = get_chat_history(chat_log)
    # Begin survey if chat history is empty
    if chat_history == '':
        first_question = llm.invoke("[INST]I am starting to answer a survey. Greet me and ask me what is my age.[/INST]")
        print(f"LLM: {first_question}")
        # Record the opening question so the chat log covers the whole conversation
        add_to_chat_log(chat_log, message_type='ai', message=first_question)
    # Ask the next best question based on previous survey user response
    else:
        # Create new retriever object with updated vectorstore
        retriever = get_retriever(db)
        # Create new RAG chain with updated retriever
        qa_chain = get_rag_chain(retriever)
        print('\n')
        print("LLM: ", end='')
        llm_reply, next_question_document, next_question_id = get_llm_outputs(qa_chain, user_response)
        add_to_chat_log(chat_log, message_type='ai', message=llm_reply)
        # Updated vectorstore with asked question removed
        db = remove_question_from_db(db, next_question_document)

    # Wait for user input
    user_response = input()
    print('\n')
    print("User: ", end='')
    print(user_response)
    add_to_chat_log(chat_log, message_type='user', message=user_response)

LLM:  Hello! I'm happy to assist you with your survey. To get started, could you please tell me your age? I'll make sure to keep your response confidential. Thank you!


User: 5


LLM:  That's great to hear that you gave a positive rating! Now, could you please tell me your marital status?

User: lol


LLM:  That's great to hear! By the way, what is your primary language?

User: 10


LLM:  That's great to hear! On a scale of 1 to 10, 10 being the happiest, you seem to be quite happy. Now, may I ask about your employment status?

User: unemployeddddd 


LLM:  "Thank you for your response. Could you please tell me your approximate annual income?"

User: 5


LLM:  That's great to hear! Could you please tell me where you live?

User: east side


LLM:  That's great! So you live on the east side. May I know, what is the highest level of education you have completed?

User: kinderjoy


LLM:  That's great! To help us better understand our audience, could you please tell us your gender identity
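
The control flow of the loop can be checked offline with stand-ins for the retriever and the LLM. The sketch below is a simplified simulation under stated assumptions: run_survey, pick_first, and the scripted replies are hypothetical, and the question picker simply takes the first remaining question instead of doing a similarity search.

```python
def run_survey(questions, scripted_replies, pick_next):
    # Pool of questions not yet asked (stand-in for the vector store)
    remaining = list(questions)
    # The age question is always asked first, as in the notebook
    transcript = [("ai", "What is your age?")]
    for reply in scripted_replies:
        transcript.append(("user", reply))
        if not remaining:
            break  # Nothing left to ask, so the survey ends
        question = pick_next(reply, remaining)
        remaining.remove(question)  # Never ask the same question twice
        transcript.append(("ai", question))
    return transcript, remaining

def pick_first(reply, remaining):
    # Stand-in for retrieval: ignore the reply and take the first question
    return remaining[0]

transcript, remaining = run_survey(
    ["Where do you live?", "What is your marital status?"],
    ["24", "east side", "single"],
    pick_first,
)
for speaker, text in transcript:
    print(f"{speaker}: {text}")
```

Swapping pick_first for a function that queries a retriever gives the same loop shape as the cell above, without needing an API token to test it.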