# Customer Support Chatbot
## Project Overview

This project builds a customer support chatbot that combines sentiment analysis with question answering. It uses a DistilBERT model for sentiment classification and an extractive question-answering pipeline from the Hugging Face Transformers library (the `deepset/roberta-large-squad2` RoBERTa checkpoint). The chatbot draws on a knowledge base of frequently asked questions (FAQs) stored in a CSV file, matches incoming questions against it using sentence embeddings, and falls back to the question-answering model when no close match exists. It also analyzes user sentiment and adjusts the framing of its responses accordingly.


## Setting Up the Development Environment
We set the Colab runtime to use a T4 GPU, install the required libraries, and log in to Hugging Face.

In [1]:
!pip install -q accelerate protobuf sentencepiece torch git+https://github.com/huggingface/transformers huggingface_hub langchain langchain_community


In [2]:
import pandas as pd
import os
from transformers import pipeline, RobertaTokenizer, RobertaForQuestionAnswering
from huggingface_hub import login
import torch
from langchain.chains import LLMChain
from langchain.prompts import ChatPromptTemplate
from langchain.llms import HuggingFacePipeline
from langchain.memory import ConversationBufferMemory

In [3]:
# Read the access token from the environment instead of hardcoding it in the notebook
login(token=os.environ.get("HF_TOKEN"))

The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: read).
Your token has been saved to /root/.cache/huggingface/token
Login successful


## Creating and Managing the QA Dataset
We create a basic QA dataset stored in a CSV file. This file will serve as a knowledge base that the chat agent will reference when answering questions. If the file already exists, the code will load the existing data instead of creating a new file. This approach allows the agent to build on previously acquired knowledge.

In [4]:
# Load or create a Q&A dataset for the chatbot to reference
csv_file = 'customer_support_qa.csv'

In [5]:
# Check if the CSV file exists; if not, create it with customer support data
if not os.path.exists(csv_file):
    customer_support_qa = {
        'question': [
            "How can I reset my password?",
            "What is the status of my order?",
            "How can I contact customer service?",
            "What are your business hours?",
            "Do you offer international shipping?",
            "What is your return policy?",
            "How can I track my order?",
            "Can I change my order after placing it?",
            "What payment methods do you accept?",
            "How do I leave feedback?"
        ],
        'answer': [
            "To reset your password, go to the login page and click on 'Forgot Password.' You will receive a link to reset it via email.",
            "You can check the status of your order by logging into your account and visiting the 'Orders' section. There you can see tracking details.",
            "You can reach customer service via our support email at support@company.com or call us at (123) 456-7890.",
            "Our business hours are Monday to Friday from 9:00 AM to 6:00 PM PST.",
            "Yes, we offer international shipping. Please check our shipping policy for more details.",
            "You can return items within 30 days of purchase. Make sure they are unused and in their original packaging.",
            "You can track your order by logging into your account and going to the 'Orders' section.",
            "If you need to change your order, please contact our customer service as soon as possible.",
            "We accept various payment methods including credit cards, PayPal, and bank transfers.",
            "We value your feedback! You can leave feedback on our website or directly via email."
        ]
    }
    # Build the DataFrame under the same name used by the rest of the notebook
    qa_df = pd.DataFrame(customer_support_qa)
    qa_df.to_csv(csv_file, index=False)
else:
    qa_df = pd.read_csv(csv_file)

In [6]:
# Verify the CSV content
qa_df = pd.read_csv(csv_file)
print(qa_df)


                                  question  \
0             How can I reset my password?   
1          What is the status of my order?   
2      How can I contact customer service?   
3            What are your business hours?   
4     Do you offer international shipping?   
5              What is your return policy?   
6                How can I track my order?   
7  Can I change my order after placing it?   
8      What payment methods do you accept?   
9                 How do I leave feedback?   

                                              answer  
0  To reset your password, go to the login page a...  
1  You can check the status of your order by logg...  
2  You can reach customer service via our support...  
3  Our business hours are Monday to Friday from 9...  
4  Yes, we offer international shipping. Please c...  
5  You can return items within 30 days of purchas...  
6  You can track your order by logging into your ...  
7  If you need to change your order, please conta... 

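## Initializing the Question-Answering Model and Tokenizer
We load a RoBERTa model fine-tuned for extractive question answering (`deepset/roberta-large-squad2`) and its tokenizer using the Hugging Face Transformers library, then wrap them in a question-answering pipeline.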

In [7]:
tokenizer = RobertaTokenizer.from_pretrained("deepset/roberta-large-squad2")
model = RobertaForQuestionAnswering.from_pretrained("deepset/roberta-large-squad2")
qa_pipeline = pipeline(task="question-answering", model=model, tokenizer=tokenizer)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.
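
The warning above indicates that the pipeline was built without a `device` argument, so the model runs on the CPU even though a GPU is available. The following is a minimal sketch of one way to choose a device index, assuming a single CUDA GPU at index 0. `pick_device` is a hypothetical helper, not part of the original notebook.

```python
def pick_device() -> int:
    """Return 0 (first CUDA GPU) when PyTorch can see one, otherwise -1 (CPU)."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed: fall back to the CPU index.
        return -1
    return 0 if torch.cuda.is_available() else -1


device = pick_device()
print(f"Selected device index: {device}")

# The index can then be passed when building the pipeline, for example:
# qa_pipeline = pipeline(task="question-answering", model=model,
#                        tokenizer=tokenizer, device=device)
```

The same `device` value could be passed to the sentiment pipeline, which reports the same warning.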


In [8]:
# HuggingFace pipeline integrated into LangChain
llm = HuggingFacePipeline(pipeline=qa_pipeline)


## Initializing the Sentiment Analyzer


In [9]:
# Initialize the sentiment analysis pipeline using DistilBERT
sentiment_analyzer = pipeline("sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english")


Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.


In [10]:
# example to test the sentiment
sample_text = "I'm really happy with the service!"
sentiment = sentiment_analyzer(sample_text)
print(sentiment)

[{'label': 'POSITIVE', 'score': 0.9998730421066284}]


In [11]:
# Example usage to check sentiment
sample_text = "I'm really not sure i am happy with the service!"
sentiment = sentiment_analyzer(sample_text)
print(sentiment)

[{'label': 'NEGATIVE', 'score': 0.999747097492218}]


In [12]:
# Function to analyze sentiment and adjust chatbot responses
def analyze_sentiment(user_input):
    sentiment = sentiment_analyzer(user_input)[0]
    return sentiment['label'], sentiment['score']

In [13]:
!pip install sentence_transformers

Collecting sentence_transformers
  Downloading sentence_transformers-3.1.1-py3-none-any.whl.metadata (10 kB)
Downloading sentence_transformers-3.1.1-py3-none-any.whl (245 kB)
Installing collected packages: sentence_transformers
Successfully installed sentence_transformers-3.1.1


## Implementing the Cosine Similarity Search
We define a function that embeds each stored question and uses those embeddings to support the question-answering agent. The agent compares the user's input with the stored questions and, if the best match exceeds a similarity threshold, returns the corresponding answer. If no stored question is similar enough, the function returns `None` and the question is handed over to the question-answering model, which answers from the company description and the recent conversation context.
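
To make the thresholding idea concrete, here is a small sketch in plain Python. It uses toy two-dimensional vectors in place of real sentence embeddings, and the names `cosine` and `best_match` are hypothetical and chosen only for illustration. The notebook itself relies on `sentence_transformers` and scikit-learn for this step.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def best_match(query_vec, stored_vecs, answers, threshold=0.8):
    """Return the answer whose vector is most similar to query_vec, or None."""
    scores = [cosine(query_vec, vec) for vec in stored_vecs]
    best_idx = max(range(len(scores)), key=scores.__getitem__)
    return answers[best_idx] if scores[best_idx] > threshold else None


# Toy vectors standing in for sentence embeddings (illustrative values only)
stored = [[1.0, 0.0], [0.0, 1.0]]
answers = ["answer about passwords", "answer about shipping"]

close_query = [0.9, 0.1]  # nearly parallel to the first stored vector
far_query = [1.0, 1.0]    # 45 degrees away from both stored vectors

print(best_match(close_query, stored, answers))  # similarity ~0.99, above 0.8
print(best_match(far_query, stored, answers))    # similarity ~0.71, below 0.8
```

A higher threshold makes the agent fall back to the question-answering model more often, while a lower one risks returning an answer to a question the user did not ask.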

In [14]:
from sklearn.metrics.pairwise import cosine_similarity
from sentence_transformers import SentenceTransformer

# Load the Sentence Transformer model
sentence_model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')

# Generate embeddings for each question
qa_df['embedding'] = qa_df['question'].apply(lambda x: sentence_model.encode(x, convert_to_tensor=True))

# Convert embeddings to a list of numpy arrays (for use with cosine_similarity)
qa_df['embedding'] = qa_df['embedding'].apply(lambda x: x.cpu().numpy())

def find_similar_question(question, threshold=0.8):
    global qa_df
    # Generate the embedding for the input question
    question_embedding = sentence_model.encode([question], convert_to_tensor=True)

    # Extract embeddings from the DataFrame and convert them to numpy arrays
    embeddings = [emb for emb in qa_df['embedding']]

    # Calculate cosine similarities
    similarities = cosine_similarity([question_embedding[0].cpu().numpy()], embeddings)[0]

    # Find the best match with a similarity score above the threshold
    best_match_idx = similarities.argmax()
    if similarities[best_match_idx] > threshold:
        return qa_df.iloc[best_match_idx]['answer']
    else:
        return None


In [15]:
content_description = (
    "We are SmartTech Solutions, a leading company specializing in advanced technology solutions and IT services, committed to delivering high-quality products and exceptional customer service. "
    "Our headquarters are located in San Francisco, California, with additional offices in New York and Austin. Our mission is to drive innovation and deliver reliable solutions that meet the evolving needs of our clients. "
    "Our product lineup includes the SmartTech Pro, a high-performance laptop designed for professionals, and the SmartTech Lite, a sleek and affordable option for students and casual users. "
    "In addition to our products, we offer comprehensive services including 24/7 tech support, custom software development, IT consulting, and cybersecurity solutions tailored to businesses of all sizes. "
    "Our core values are Customer Focus, Integrity, Innovation, and Excellence. We are dedicated to ensuring that each client receives personalized attention and cutting-edge solutions. "
    "We also offer a range of educational webinars and workshops to help our clients stay ahead in the technology landscape. Our annual Tech Innovations Conference brings together industry leaders and experts to discuss the latest trends and advancements. "
    "For customer support, you can reset your password by clicking 'Forgot Password' on the login page and following the instructions sent to your email. To check your order status, log into your account and visit the 'Orders' section, where you can view real-time tracking information. "
    "If you have any inquiries or need assistance, contact us via email at support@smarttech.com or call our customer service line at (555) 123-4567. Our support team is available Monday to Friday from 9:00 AM to 6:00 PM PST, and Saturday from 10:00 AM to 4:00 PM PST. "
    "For more details about our products, services, and latest updates, visit our website at www.smarttech.com. Follow us on social media for the latest news and promotions. "
    "We are also committed to sustainability and actively engage in eco-friendly practices, including energy-efficient operations and recycling programs. Our team is dedicated to making a positive impact on both technology and the environment. "
)

In [16]:

# Initialize conversation history with a static context defining the model's role
conversation_history = [
    {"role": "system", "content": content_description}
]
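
Because the company description is stored as the first history entry, a plain slice of the most recent entries (such as the `[-5:]` slice used in `answer_question`) stops including it once the conversation grows past five entries. The sketch below shows one possible way to always keep the system entry in the context. `build_context` and `max_turns` are hypothetical names, and the history contents are placeholder strings.

```python
def build_context(history, max_turns=4):
    """Join every system entry with the most recent non-system entries."""
    system_parts = [e["content"] for e in history if e["role"] == "system"]
    recent_parts = [e["content"] for e in history if e["role"] != "system"]
    return " ".join(system_parts + recent_parts[-max_turns:])


# Placeholder history: one system entry followed by four question/answer turns
history = [{"role": "system", "content": "COMPANY FACTS"}]
for i in range(1, 5):
    history.append({"role": "user", "content": f"q{i}"})
    history.append({"role": "assistant", "content": f"a{i}"})

plain_slice = " ".join(e["content"] for e in history[-5:])
print(plain_slice)             # the system entry is no longer included
print(build_context(history))  # the system entry is kept at the front
```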

## Implementing the Question-Answering Function
We implement the core function of the chat agent, which is to answer questions.

In [17]:

def answer_question(question):
    global qa_df, conversation_history

    # Analyze sentiment of the user's question
    sentiment_label, sentiment_score = analyze_sentiment(question)

    # Update conversation history with the user's input
    conversation_history.append({"role": "user", "content": question})

    # Check if the question is in the QA dataset
    answer = find_similar_question(question.lower())

    # If an answer is found in the QA dataset, use it
    if answer is not None:
        response = answer
    else:
        # Generate an answer using the QA pipeline
        # Trim context to the most relevant recent parts
        context = " ".join([entry["content"] for entry in conversation_history[-5:]])

        # Use the QA model with the current context
        qa_input = {
            'question': question,
            'context': context
        }
        response = qa_pipeline(qa_input)['answer']

        # Store the new question-answer pair if not already present
        if not any(qa_df['question'].str.lower() == question.lower()):
            new_row = pd.DataFrame({'question': [question], 'answer': [response]})
            new_embedding = sentence_model.encode(question, convert_to_tensor=True).cpu().numpy()
            new_row['embedding'] = [new_embedding]
            qa_df = pd.concat([qa_df, new_row], ignore_index=True)
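            # Persist only the text columns; embeddings are recomputed when the CSV is loaded
            qa_df.drop(columns=['embedding']).to_csv(csv_file, index=False)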

    # Add the assistant's response to the conversation history
    conversation_history.append({"role": "assistant", "content": response})

    # Adjust the response based on sentiment
    if sentiment_label == 'NEGATIVE' and sentiment_score > 0.99:
        final_response = f"I'm sorry for the inconvenience caused. {response} Is there anything else I can assist you with?"
    elif sentiment_label == 'POSITIVE' and sentiment_score > 0.99:
        final_response = f"I'm glad you're satisfied! {response} Let me know if you have more questions."
    else:
        final_response = response  # Default response for neutral sentiment

    return final_response
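
The sentiment adjustment at the end of `answer_question` can also be read in isolation. The sketch below factors it into a pure function so the threshold is easy to test and tune. `adjust_for_sentiment` and its `threshold` parameter are hypothetical names, while the message templates and the 0.99 cutoff are copied from the function above.

```python
def adjust_for_sentiment(response, label, score, threshold=0.99):
    """Wrap the response in an apology or acknowledgement for strong sentiment."""
    if label == "NEGATIVE" and score > threshold:
        return (f"I'm sorry for the inconvenience caused. {response} "
                "Is there anything else I can assist you with?")
    if label == "POSITIVE" and score > threshold:
        return (f"I'm glad you're satisfied! {response} "
                "Let me know if you have more questions.")
    # Scores at or below the threshold are treated as neutral
    return response


base = "Our business hours are Monday to Friday."
print(adjust_for_sentiment(base, "NEGATIVE", 0.995))
print(adjust_for_sentiment(base, "POSITIVE", 0.95))
```

Since the classifier only produces `POSITIVE` or `NEGATIVE` labels, the threshold is the only mechanism here for treating an input as neutral, so factual questions that the model scores confidently may still receive the apology or acknowledgement wrapper.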


In [18]:
user_question = "what is your company address"
answer_question(user_question)


'San Francisco, California,'

In [19]:
# Example Usage
user_question = "How can I reset my password?"
answer_question(user_question)

"I'm sorry for the inconvenience caused. To reset your password, go to the login page and click on 'Forgot Password.' You will receive a link to reset it via email. Is there anything else I can assist you with?"

In [20]:
user_question = "What is the status of my order?"
answer_question(user_question)

"You can check the status of your order by logging into your account and visiting the 'Orders' section. There you can see tracking details."

In [21]:
user_question = "How can I contact customer service?"
answer_question(user_question)

"I'm sorry for the inconvenience caused. You can reach customer service via our support email at support@company.com or call us at (123) 456-7890. Is there anything else I can assist you with?"

In [22]:
print(qa_df)

                                   question  \
0              How can I reset my password?   
1           What is the status of my order?   
2       How can I contact customer service?   
3             What are your business hours?   
4      Do you offer international shipping?   
5               What is your return policy?   
6                 How can I track my order?   
7   Can I change my order after placing it?   
8       What payment methods do you accept?   
9                  How do I leave feedback?   
10             what is your company address   

                                               answer  \
0   To reset your password, go to the login page a...   
1   You can check the status of your order by logg...   
2   You can reach customer service via our support...   
3   Our business hours are Monday to Friday from 9...   
4   Yes, we offer international shipping. Please c...   
5   You can return items within 30 days of purchas...   
6   You can track your order by logg

## Implementing the Gradio UI
We install the Gradio library, which lets us build a web-based interface where users can enter questions and receive answers, making the chatbot more accessible and user-friendly.

In [23]:
!pip -q install gradio

In [24]:
import gradio as gr

# Create a Gradio Interface
interface = gr.Interface(
    fn=answer_question,
    inputs="text",
    outputs="text",
    title="Customer Support Chatbot with QA and Sentiment Pipeline",
    description="Ask a question and the chatbot will respond using a pre-defined QA dataset, or the RoBERTa question-answering model if the answer is not in the dataset.",
)
interface.launch()

Setting queue=True in a Colab notebook requires sharing enabled. Setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
Running on public URL: https://be3fdd2c54000b0abd.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)


