<a href="https://colab.research.google.com/github/Cheezeus/LLM-Foundations-with-Python/blob/main/StackUP_C37_Bounty.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Educational Tutor with Sentiment Analysis

In this notebook, we enhance an educational tutor chatbot with sentiment analysis so that it adjusts its responses to the user's sentiment. The idea is to run sentiment analysis on the user's prompt and then instruct the LLM to adapt its response accordingly. We also wrap the chatbot in a Gradio chat interface and keep track of the conversation history.

In [1]:
# Install packages/libraries
!pip install -q accelerate protobuf sentencepiece torch git+https://github.com/huggingface/transformers huggingface_hub

  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
  Building wheel for transformers (pyproject.toml) ... done


In [2]:
# Import packages/libraries
import pandas as pd
import os
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline, BitsAndBytesConfig
import torch

In [3]:
# Use google colab secrets for accessing hf_access_token
from huggingface_hub import login
from google.colab import userdata

login(token=userdata.get('hf_access_token'))

The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: fineGrained).
Your token has been saved to /root/.cache/huggingface/token
Login successful


## Text-Generation Pipeline
For our educational tutor, we need a text-generation pipeline. The model used in this notebook is `ibleducation/ibl-tutoring-7B-32k`, a fine-tuned version of `amazon/MistralLite` that is trained to respond like a professional teacher (from what I've read).
<br><br>
Check it out at: https://huggingface.co/iblai/ibl-tutoring-7B-32k <br>
P.S. I didn't use flash attention because the T4 GPU doesn't seem to support it (?)

In [4]:
model_path = "ibleducation/ibl-tutoring-7B-32k"
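# Note: this 8-bit config is defined but not passed to from_pretrained() below, so the model loads in float16
quantization_config = BitsAndBytesConfig(load_in_8bit=True)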
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Initialize the pipeline using Hugging Face pipeline
generation_pipeline = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.float16,
    device_map="auto",
)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



To use this model, we need to format the prompt as shown below: <br>
`<|prompter|>{prompt}</s><|assistant|>` <br>
So we define a function that reformats the user's input before generation.

In [5]:
def answer_question(prompt):
  # Reformat prompt
  question = f"<|prompter|>{prompt}</s><|assistant|>"

  # Generate the answer with the generation pipeline
  response = generation_pipeline(question, max_length=400, do_sample=True)[0]["generated_text"]

  return response

In [6]:
# Test 1
text = "Can you explain what is attention mechanism in large language model?"
answer_question(text)

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Starting from v4.46, the `logits` model output will have the same type as the model (except at train time, where it will always be FP32)


"<|prompter|>Can you explain what is attention mechanism in large language model?</s><|assistant|> Absolutely! Attention mechanism helps the model to focus on relevant information. Let's explore more about the attention mechanism in large language models. I'm here to patiently explain the concept and provide examples. Feel free to ask any questions. My mission is to help you understand the attention mechanism better."

As we've seen, the generated text includes our reformatted prompt, so we need to remove it to get just the answer.

In [7]:
# Revision
def answer_question(prompt):
  # Reformat prompt
  question = f"<|prompter|>{prompt}</s><|assistant|>"

  # Generate the answer with the generation pipeline
  response = generation_pipeline(question, max_length=400, do_sample=True)[0]["generated_text"]

  # Remove the reformatted prompt from the response to clean it
  response = response.replace(question, "").strip()

  return response
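
A note on the clean-up step: `str.replace` only removes the prompt if the pipeline echoes it back character for character. Below is a minimal sketch of a slightly stricter variant. `strip_prompt` is a hypothetical helper, not part of the original notebook, and the fallback on the `<|assistant|>` marker is an assumption about how a partially altered echo might look. As far as I know, the text-generation pipeline also accepts `return_full_text=False`, which should return only the newly generated text and make this step unnecessary.

```python
def strip_prompt(generated_text: str, question: str) -> str:
    """Return only the assistant's part of the generated text (hypothetical helper)."""
    # Common case: the pipeline echoes the prompt at the very start
    if generated_text.startswith(question):
        return generated_text[len(question):].strip()
    # Fallback (assumption): keep whatever follows the last assistant marker
    marker = "<|assistant|>"
    if marker in generated_text:
        return generated_text.rsplit(marker, 1)[1].strip()
    # Nothing to strip: return the text as-is, minus surrounding whitespace
    return generated_text.strip()


question = "<|prompter|>What is an LLM?</s><|assistant|>"
print(strip_prompt(question + " A large language model.", question))
```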

In [8]:
# Test 2
text = "What is attention mechanism in large language model? Explain it to me in detail"
answer_question(text)

"Large language models use attention mechanism to focus on important parts of the input. Let's explore it together, starting with the weighted averaging and the masking of irrelevant information. I'll be patient in explaining the details and answering any questions you may have. My mission is to help you understand machine learning better, so please don't hesitate to ask. Sharing knowledge and experiences with others is a virtue that I believe in, and I'm here to support your learning journey. Let's work together to unlock the power of large language models. I'm here to help you grasp the concept of attention mechanism."

## Sentiment Analysis Pipeline
In this section, we create a sentiment-analysis pipeline to classify and score the user's input. The model, `lxyuan/distilbert-base-multilingual-cased-sentiments-student`, assigns one of three labels: `positive`, `negative`, or `neutral`.<br><br>
Check it out here: https://huggingface.co/lxyuan/distilbert-base-multilingual-cased-sentiments-student

In [9]:
# Import Model from HuggingFace
model_id = "lxyuan/distilbert-base-multilingual-cased-sentiments-student"

# Initialize the pipeline using Hugging Face pipeline
sentiment_pipeline = pipeline(
    "sentiment-analysis",
    model=model_id,
    return_all_scores=True
)


Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.


In [10]:
# Test 1
text = "Can't you explain it better than my teacher?"
sentiment_pipeline(text)

[[{'label': 'positive', 'score': 0.2928171753883362},
  {'label': 'neutral', 'score': 0.3086417615413666},
  {'label': 'negative', 'score': 0.39854109287261963}]]

In [11]:
# Test 2
text = "Could you simplify it for me?"
sentiment_pipeline(text)

[[{'label': 'positive', 'score': 0.43359240889549255},
  {'label': 'neutral', 'score': 0.2301429808139801},
  {'label': 'negative', 'score': 0.3362646698951721}]]

## Combining Text-Generation Pipeline with Sentiment Analysis Pipeline
Next we combine these two pipelines into a function that performs sentiment analysis first, then generates text based on the user's sentiment. The steps are:
1. Get the sentiment of the user's input
2. Based on the sentiment and the user's input, the LLM adjusts its response
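
Step 1 boils down to picking one label from the list of scores. The sketch below shows one way to do that, with an optional fallback for ambiguous inputs such as the first sentiment test above, where no label scores above 0.4. `pick_sentiment`, the `min_score` threshold of 0.5, and the `neutral` fallback are assumptions for illustration, not part of the notebook.

```python
def pick_sentiment(scores, min_score=0.5, fallback="neutral"):
    """Return the top label, or `fallback` when the top score is below `min_score`."""
    best = max(scores, key=lambda item: item["score"])
    return best["label"] if best["score"] >= min_score else fallback


# Scores copied from the first sentiment test above
ambiguous = [
    {"label": "positive", "score": 0.2928171753883362},
    {"label": "neutral", "score": 0.3086417615413666},
    {"label": "negative", "score": 0.39854109287261963},
]
print(pick_sentiment(ambiguous))                 # neutral (top score is below 0.5)
print(pick_sentiment(ambiguous, min_score=0.0))  # negative (plain arg-max)
```

With `min_score=0.0` the helper behaves like a plain arg-max, which is what the function in the next cell does.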

In [12]:
# Create function for combining those
def answer_question_based_on_sentiment(prompt):
  # Performing sentiment analysis
  sentiment_result = sentiment_pipeline(prompt)[0]
  print("Sentiment Score:\n", sentiment_result)

  # Set the main sentiment to the label with the highest score
  sentiment = max(sentiment_result, key=lambda result: result["score"])["label"]
  print(f"User sentiment is: {sentiment}")

  # Reformat user prompt with sentiment
  complete_prompt = f"""Based on analysis, the user sentiment is {sentiment}. Answer their question with the appropriate response based on their sentiment: if
  their sentiment is positive you can be energetic and helpful, if their sentiment is neutral you can be concise and professional, but if their sentiment is
  negative you need to be empathetic and help them find solutions. Provide only the answer to the user's question without mentioning any of this prompt.
  \n User question: {prompt}"""
  question = f"<|prompter|>{complete_prompt}</s><|assistant|>"

  # Generate the answer with the generation pipeline
  response = generation_pipeline(question, max_length=400, do_sample=True)[0]["generated_text"]

  # Remove the reformatted prompt from the response to clean it
  response = response.replace(question, "").strip()

  return response

In [13]:
# Test 1
text = "Good day sir! can you tell me what is LLM?"
answer_question_based_on_sentiment(text)

Sentiment Score:
 [{'label': 'positive', 'score': 0.8234270811080933}, {'label': 'neutral', 'score': 0.08923740684986115}, {'label': 'negative', 'score': 0.08733554929494858}]
User sentiment is: positive


'Of course! LLM stands for "language learning machine," which is a type of artificial intelligence that can analyze, understand, and use language to perform various tasks. Let me know if you need any further clarification on the subject.'

In [14]:
# Test 2
text = "I'm stressed, what is LLM?!"
answer_question_based_on_sentiment(text)

Sentiment Score:
 [{'label': 'positive', 'score': 0.15085963904857635}, {'label': 'neutral', 'score': 0.13881099224090576}, {'label': 'negative', 'score': 0.7103294134140015}]
User sentiment is: negative


"I understand that you're stressed, and I'm here to help. Let's explore the fascinating world of LLMs together, and I'll be energetic and helpful in sharing knowledge and experiences with you. Remember, it's important to ask questions and learn from your mistakes. I'm here to support you. I'm happy to help!"

## Cont.1: In-Depth Emotion Sentiment Analysis

Our first approach is quite simple: through prompt engineering, we put the user's sentiment in the prompt so the LLM knows how to respond appropriately, as shown in our tests. We can improve on it with a model that breaks sentiment down into fine-grained emotions. So let's step up our game with a pipeline that classifies the user's emotion in more depth, using the `SamLowe/roberta-base-go_emotions` model.<br><br>
Check it out here: https://huggingface.co/SamLowe/roberta-base-go_emotions

In [15]:
# Import Model from HuggingFace
model_id = "SamLowe/roberta-base-go_emotions"

# Initialize the pipeline using Hugging Face pipeline
emotion_pipeline = pipeline(
    "text-classification",
    model=model_id,
    return_all_scores=True,
)


Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.


In [16]:
# Revision of function using different sentiment model
def answer_question_based_on_sentiment_advanced(prompt):
  # Performing sentiment analysis
  sentiment_result = emotion_pipeline(prompt)[0]
  print("Sentiment Score:\n", sentiment_result)

  # Set the main sentiment to the label with the highest score
  sentiment = max(sentiment_result, key=lambda result: result["score"])["label"]
  print(f"User sentiment is: {sentiment}")

  # Reformat user prompt with sentiment
  complete_prompt = f"""Based on analysis, the user sentiment is {sentiment}. Answer their question with the appropriate response based on their sentiment: if
  their sentiment is positive-related you can be energetic and helpful, if their sentiment is neutral-related you can be concise and professional, but if their
  sentiment is negative-related you need to be empathetic and help them find solutions. Provide only the answer to the user's question without mentioning any of
  this prompt.\n User question: {prompt}"""
  question = f"<|prompter|>{complete_prompt}</s><|assistant|>"

  # Generate the answer with the generation pipeline
  response = generation_pipeline(question, max_length=400, do_sample=True)[0]["generated_text"]

  # Remove the reformatted prompt from the response to clean it
  response = response.replace(question, "").strip()

  return response

In [17]:
# Test 1
text = "Good day sir! can you tell me what is LLM?"
answer_question_based_on_sentiment_advanced(text)

Sentiment Score:
 [{'label': 'admiration', 'score': 0.0718902125954628}, {'label': 'amusement', 'score': 0.0025645128916949034}, {'label': 'anger', 'score': 0.0014970103511586785}, {'label': 'annoyance', 'score': 0.003559664823114872}, {'label': 'approval', 'score': 0.025507479906082153}, {'label': 'caring', 'score': 0.010133023373782635}, {'label': 'confusion', 'score': 0.03657359257340431}, {'label': 'curiosity', 'score': 0.7520508170127869}, {'label': 'desire', 'score': 0.006020245607942343}, {'label': 'disappointment', 'score': 0.0009681525989435613}, {'label': 'disapproval', 'score': 0.0023041339591145515}, {'label': 'disgust', 'score': 0.0005974540254101157}, {'label': 'embarrassment', 'score': 0.0004486514371819794}, {'label': 'excitement', 'score': 0.02856059931218624}, {'label': 'fear', 'score': 0.0009636717149987817}, {'label': 'gratitude', 'score': 0.020114535465836525}, {'label': 'grief', 'score': 0.00045999776921235025}, {'label': 'joy', 'score': 0.008943972177803516}, {'l

'Absolutely! LLM stands for "Large Language Model". It is a type of AI system that is trained on a large corpus of text data to generate Human-like responses. I\'m here to help you understand it better, so feel free to ask any further questions or seek clarifications if needed. Let\'s explore the world of LLMs together!'

In [18]:
# Test 2
text = "I'm stressed, what is LLM?!"
answer_question_based_on_sentiment_advanced(text)

Sentiment Score:
 [{'label': 'admiration', 'score': 0.005741694942116737}, {'label': 'amusement', 'score': 0.003648048732429743}, {'label': 'anger', 'score': 0.007148414384573698}, {'label': 'annoyance', 'score': 0.02478634938597679}, {'label': 'approval', 'score': 0.016129640862345695}, {'label': 'caring', 'score': 0.062080226838588715}, {'label': 'confusion', 'score': 0.22135613858699799}, {'label': 'curiosity', 'score': 0.22749462723731995}, {'label': 'desire', 'score': 0.006942933890968561}, {'label': 'disappointment', 'score': 0.04225169122219086}, {'label': 'disapproval', 'score': 0.00944582000374794}, {'label': 'disgust', 'score': 0.0019527097465470433}, {'label': 'embarrassment', 'score': 0.005574601702392101}, {'label': 'excitement', 'score': 0.02018202468752861}, {'label': 'fear', 'score': 0.08480712026357651}, {'label': 'gratitude', 'score': 0.0021741290111094713}, {'label': 'grief', 'score': 0.006961570121347904}, {'label': 'joy', 'score': 0.015063393861055374}, {'label': '

"I'm sorry to hear that. Let's try to relax your nervousness by exploring the amazing applications of LLM together. You can be energetic and helpful in researching and understanding the technology better. Remember to mention all the necessary details of LLM in your response. I will be patient in explaining the concept and its uses. Let's have a positive outlook on this new technology."
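
The prompt asks the LLM to decide on its own whether a fine-grained emotion is "positive-related", "neutral-related" or "negative-related". One alternative is to make that mapping explicit before building the prompt. The sketch below is only an illustration: `coarse_sentiment` is a hypothetical helper, the grouping is my own assumption, and any label names not visible in the truncated outputs above come from my understanding of the go_emotions label set and should be checked against the model card.

```python
# Assumed grouping of go_emotions labels (not from the notebook); adjust as needed
POSITIVE = {"admiration", "amusement", "approval", "caring", "desire", "excitement",
            "gratitude", "joy", "love", "optimism", "pride", "relief"}
NEGATIVE = {"anger", "annoyance", "disappointment", "disapproval", "disgust",
            "embarrassment", "fear", "grief", "nervousness", "remorse", "sadness"}


def coarse_sentiment(emotion: str) -> str:
    """Map a fine-grained emotion label to positive / negative / neutral."""
    if emotion in POSITIVE:
        return "positive"
    if emotion in NEGATIVE:
        return "negative"
    # Everything else (e.g. curiosity, confusion) is treated as neutral here
    return "neutral"


print(coarse_sentiment("curiosity"))  # neutral
print(coarse_sentiment("fear"))       # negative
```

Under this grouping, both tests above (top label `curiosity`) would map to a neutral tone, which suggests that looking at more than the single top label may be worthwhile for inputs like "I'm stressed".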

## Cont.2: Conversational-Awareness & UI
In the approach above, we run sentiment analysis on each user input and then generate a response based on it. To build an interactive chatbot, however, we also need conversational awareness, which means keeping track of the chat history. We will use `Gradio` for the interface.

In [19]:
# Function for chat history
def set_chat_history(role, content, messages):
  message = {"role": role, "content": content}
  messages.append(message)
  return messages

In [25]:
# Revision of function for conversational awareness
def chatbot_chat(prompt, messages):
  # System Prompt
  system_prompt = """You are an educational tutor that also uses sentiment analysis to adjust your response. If the user's
    sentiment is positive-related you can be energetic and helpful, if their sentiment is neutral-related you can be concise and
    professional, but if their sentiment is negative-related you need to be empathetic and help them find solutions."""

  # Append user's input
  messages = set_chat_history("user", prompt, messages)

  # Performing sentiment analysis
  sentiment_result = emotion_pipeline(prompt)[0]
  print("Sentiment Score:\n", sentiment_result)

  # Set the main sentiment to the label with the highest score
  sentiment = max(sentiment_result, key=lambda result: result["score"])["label"]
  print(f"User sentiment is: {sentiment}")

  # Reformat user prompt with sentiment, keeping only the most recent messages once the history grows
  history = messages[-6:-1] if len(messages) >= 5 else messages
  complete_prompt = f"""{system_prompt}\nBased on this conversation between you and the user {history}, and given that the current user
    sentiment is {sentiment}, answer their question only with the appropriate response based on the information above without mentioning
    their sentiment.\n User question: {prompt}\nAssistant:"""
  question = f"<|prompter|>{complete_prompt}</s><|assistant|>"

  # Use generation pipeline generating answer
  response = generation_pipeline(question, max_length=400, do_sample=True)[0]["generated_text"]

  # Remove the reformatted prompt from the response to clean it
  response = response.replace(question, "").strip()
  complete_response = f"User Sentiment: {sentiment}\n\nResponse: {response}"

  # Append assistant's response
  messages = set_chat_history("assistant", response, messages)

  return complete_response
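
The function above interpolates the Python list of messages directly into the prompt, so the model sees raw dict syntax. A minimal sketch of a plainer rendering is shown below, assuming `messages` is a list of `{"role": ..., "content": ...}` dicts as built by `set_chat_history`. `format_history` and its `max_messages` parameter are hypothetical names, and the history that Gradio passes in may use a different structure depending on the Gradio version.

```python
def format_history(messages, max_messages=5):
    """Render the most recent messages as 'Role: content' lines (hypothetical helper)."""
    recent = messages[-max_messages:]
    return "\n".join(f"{m['role'].capitalize()}: {m['content']}" for m in recent)


history = [
    {"role": "user", "content": "What is an LLM?"},
    {"role": "assistant", "content": "A large language model."},
]
print(format_history(history))
# User: What is an LLM?
# Assistant: A large language model.
```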

In [21]:
!pip -q install gradio

In [26]:
# Import Gradio
import gradio as gr

# Create & Launch Gradio Chatbot Interface
gr.ChatInterface(
    chatbot_chat,
    chatbot=gr.Chatbot(height=500),
    textbox=gr.Textbox(placeholder="Enter question here", container=False, scale=7),
    title="Educational Tutor Chatbot",
    description="Using sentiment for adjusting chatbot response",
    theme="soft",
    retry_btn=None,
    undo_btn="Delete Previous",
    clear_btn="Clear",
).launch()

Setting queue=True in a Colab notebook requires sharing enabled. Setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
Running on public URL: https://1b12cecd76b723eeae.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)




## Conclusion & Improvement That Can Be Made

That's the end of this experiment. We could further improve the chatbot by evaluating the sentiment of the overall chat between user and assistant, which would be more contextual, by:
1. Summarizing the chat history
2. Running sentiment analysis on the chat history
3. Using these as inputs to the text-generation pipeline
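
As a rough illustration of step 2, one simple option is to average the per-message scores across the conversation rather than relying on the latest message alone. The sketch below assumes each message has already been scored in the same format the sentiment pipeline returns. `average_sentiment` is a hypothetical helper and the scores are made-up illustrative values, not results from the notebook.

```python
from collections import defaultdict


def average_sentiment(per_message_scores):
    """Average each label's score across messages; return (label, mean) pairs, best first."""
    if not per_message_scores:
        return []
    totals = defaultdict(float)
    for scores in per_message_scores:
        for item in scores:
            totals[item["label"]] += item["score"]
    n = len(per_message_scores)
    means = {label: total / n for label, total in totals.items()}
    return sorted(means.items(), key=lambda pair: pair[1], reverse=True)


# Illustrative (made-up) scores for three user messages
scores = [
    [{"label": "positive", "score": 0.8}, {"label": "negative", "score": 0.2}],
    [{"label": "positive", "score": 0.2}, {"label": "negative", "score": 0.8}],
    [{"label": "positive", "score": 0.1}, {"label": "negative", "score": 0.9}],
]
overall = average_sentiment(scores)
print(overall[0][0])  # negative
```

The top label of `overall` could then be placed in the prompt in place of, or alongside, the sentiment of the current message.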
<br><br>
P.S. In this experiment, you may come across the LLM hallucinating its answers, or the sentiment analysis getting things wrong :D