# Emotional Sentiment Analysis and Adaptive Response System

In this project, we will build a chatbot capable of identifying and understanding emotional states from text input. The system will analyze user input to detect emotions such as fear, anger, and joy. Based on the detected emotion, the chatbot will generate culturally sensitive and empathetic responses.

Steps Involved:

1. **Data Collection & Preprocessing**: We will load a dataset containing text data labeled with emotions and preprocess it for model training.

2. **Model Training**: Using a machine learning model (e.g., Logistic Regression), we will classify the emotional state of the input text.

3. **Response Generation**: After detecting the emotion, the chatbot will generate relevant responses tailored to the emotional context.

4. **Integration & Testing**: Finally, we will integrate the sentiment analysis model with the response generation system and test the chatbot’s performance.

The goal is to create an empathetic chatbot that understands emotional cues and responds appropriately, providing emotional support.

In [1]:
!pip install transformers torch pandas nltk spacy openai



EXPLANATION


transformers: A popular library for working with pre-trained models like GPT, BERT, and others for tasks like text generation, sentiment analysis, and more.

torch: The core package for PyTorch, a deep learning framework used for building and training neural networks.

pandas: A powerful library for data manipulation and analysis, especially useful for handling tabular data (like CSVs).

nltk: The Natural Language Toolkit, which provides tools for text processing and analysis, such as tokenization, stopword removal, and more.

spacy: Another NLP library that provides advanced text processing tools like tokenization, named entity recognition, and dependency parsing.

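openai: This library gives you access to OpenAI's models, such as GPT, for various NLP tasks like generating text, answering questions, etc.

Note that the cells below use only pandas from this list, together with scikit-learn, which is not part of the install command and is assumed to be available in the Colab runtime already.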

In [3]:
from google.colab import files
uploaded = files.upload()  # This will open a dialog to upload the file from your local machine


Saving Emotion_classify_Data.csv to Emotion_classify_Data.csv


EXPLANATION

`from google.colab import files` This imports the `files` module from Google Colab, which allows you to upload files.

`uploaded = files.upload()` This triggers the file upload dialog, allowing you to select files from your local system to upload. Once uploaded, the files are stored in the Colab environment and can be accessed for further processing.
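`files.upload()` returns a dictionary that maps each uploaded filename to the file's raw bytes, so the CSV can also be read directly from memory. The sketch below illustrates this with a made-up stand-in dictionary, so that it runs outside Colab; the helper name `load_uploaded_csv` and the sample row are assumptions, not part of the original notebook.

```python
import io

import pandas as pd


def load_uploaded_csv(uploaded, filename):
    """Read one uploaded file into a DataFrame from its in-memory bytes."""
    return pd.read_csv(io.BytesIO(uploaded[filename]))


# Stand-in for the dict returned by files.upload(); the row is invented.
fake_upload = {"Emotion_classify_Data.csv": b"Comment,Emotion\ni feel fine,joy\n"}

demo_df = load_uploaded_csv(fake_upload, "Emotion_classify_Data.csv")
print(demo_df)
```

In Colab, the same helper could be called with the real `uploaded` object instead of `fake_upload`.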

In [4]:
import os
print(os.listdir())


['.config', 'archive.zip', 'Emotion_classify_Data.csv', 'sample_data']


In [5]:
import pandas as pd

# Load the dataset
df = pd.read_csv('Emotion_classify_Data.csv')

# Display the first few rows to inspect the data
print(df.head())


                                             Comment Emotion
0  i seriously hate one subject to death but now ...    fear
1                 im so full of life i feel appalled   anger
2  i sit here to write i start to dig out my feel...    fear
3  ive been really angry with r and i feel like a...     joy
4  i feel suspicious if there is no one outside l...    fear


EXPLANATION


`import pandas as pd` This imports the pandas library under the alias `pd`, the conventional short name for it in Python code.

`df = pd.read_csv('Emotion_classify_Data.csv')` This reads the CSV file named 'Emotion_classify_Data.csv' into a pandas DataFrame (stored in the variable df). The DataFrame allows you to work with the dataset in a structured way, where rows represent individual data points and columns represent the features of the data.

`print(df.head())` This prints the first few rows of the DataFrame to help you inspect the data. By default, `head()` shows the first 5 rows, giving you a quick preview of your dataset's structure.
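Besides previewing rows, it is often useful to check how many examples each emotion has before training. The following is a minimal sketch on a few invented rows that reuse the `Comment` and `Emotion` column names; the same two calls should work on the real `df`.

```python
import pandas as pd

# Invented rows standing in for the real dataset
sample = pd.DataFrame({
    "Comment": ["i feel great", "i am scared", "this makes me furious", "so happy today"],
    "Emotion": ["joy", "fear", "anger", "joy"],
})

# Number of rows and columns
print(sample.shape)

# Number of examples per emotion label
class_counts = sample["Emotion"].value_counts()
print(class_counts)
```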

In [6]:
# Check for missing values
print(df.isnull().sum())

# Remove rows with missing values (if any)
df = df.dropna()


Comment    0
Emotion    0
dtype: int64


EXPLANATION


`df.isnull().sum()`

1. This checks if there are any missing (null) values in the DataFrame `df`. The `isnull()` method returns a DataFrame of the same shape as `df` with True for missing values and False for non-missing values.

2. The `sum()` function then adds up the True values (which are counted as 1), giving you the total number of missing values in each column.

`df = df.dropna()`

1. This removes all rows from the DataFrame `df` that contain any missing (NaN) values.

2. The `dropna()` method returns a new DataFrame with the rows containing NaN values dropped, and the result is assigned back to `df` (replacing the original DataFrame).
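Because the real dataset has no missing values, the effect of these two calls is not visible in the output above. A small sketch on an invented DataFrame with gaps shows what they do:

```python
import pandas as pd

# Invented rows: one comment and one label are missing
toy = pd.DataFrame({
    "Comment": ["i feel calm", None, "i am worried"],
    "Emotion": ["joy", "fear", None],
})

# Count missing values per column
missing_per_column = toy.isnull().sum()
print(missing_per_column)

# Keep only rows where every column has a value
cleaned = toy.dropna()
print(cleaned)
```

Only the first row survives, since each of the other two rows has a missing value in one column.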

In [9]:
import re

# Function to clean text
def clean_text(text):
    # Remove punctuation, numbers, and convert text to lowercase
    text = re.sub(r'[^\w\s]', '', text)  # Remove punctuation
    text = re.sub(r'\d+', '', text)      # Remove digits
    text = text.lower()                  # Convert to lowercase
    return text

# Apply the clean_text function to the 'Comment' column
df['cleaned_comment'] = df['Comment'].apply(clean_text)

# Display the cleaned data
print(df[['Comment', 'cleaned_comment']].head())


                                             Comment  \
0  i seriously hate one subject to death but now ...   
1                 im so full of life i feel appalled   
2  i sit here to write i start to dig out my feel...   
3  ive been really angry with r and i feel like a...   
4  i feel suspicious if there is no one outside l...   

                                     cleaned_comment  
0  i seriously hate one subject to death but now ...  
1                 im so full of life i feel appalled  
2  i sit here to write i start to dig out my feel...  
3  ive been really angry with r and i feel like a...  
4  i feel suspicious if there is no one outside l...  


EXPLANATION


1. clean_text: This function removes punctuation and digits and converts the text to lowercase.

2. apply: Applies the clean_text function to each comment in the DataFrame.

3. print: Displays the original and cleaned text for comparison.
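The sample rows above are already lowercase and free of punctuation, so the original and cleaned columns look identical. To see the function's effect, the sketch below repeats `clean_text` and runs it on two invented inputs. Two details are worth noting: `\w` also matches the underscore, so underscores are kept, and whitespace is not collapsed, so a trailing space can remain where digits were removed.

```python
import re


def clean_text(text):
    text = re.sub(r'[^\w\s]', '', text)  # Remove punctuation (underscores are kept)
    text = re.sub(r'\d+', '', text)      # Remove digits
    text = text.lower()                  # Convert to lowercase
    return text


# Invented examples; repr() makes leftover whitespace visible
print(repr(clean_text("I'm feeling REALLY anxious, 24/7!!")))
print(repr(clean_text("so_so")))
```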

In [10]:
from sklearn.preprocessing import LabelEncoder

# Initialize the label encoder
label_encoder = LabelEncoder()

# Encode the emotion labels
df['emotion_label'] = label_encoder.fit_transform(df['Emotion'])

# Display the encoded labels
print(df[['Emotion', 'emotion_label']].head())


  Emotion  emotion_label
0    fear              1
1   anger              0
2    fear              1
3     joy              2
4    fear              1


EXPLANATION


1. LabelEncoder(): This initializes the label encoder, which converts categorical text labels (e.g., emotions) into numeric values.

2. fit_transform(): This method encodes the 'Emotion' column into numeric labels and stores them in a new column 'emotion_label'.

3. print: Displays the original emotion labels alongside their encoded numeric labels for reference.
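`LabelEncoder` assigns codes in sorted order of the label strings, which is why anger is 0, fear is 1, and joy is 2 in the output above. The sketch below, using a short invented list of labels, shows how the full mapping can be read from the fitted encoder's `classes_` attribute and how `inverse_transform` reverses it.

```python
from sklearn.preprocessing import LabelEncoder

encoder = LabelEncoder()
codes = encoder.fit_transform(["fear", "anger", "fear", "joy"])
print(codes)

# classes_ holds the labels in sorted order; a label's index is its code
mapping = {label: code for code, label in enumerate(encoder.classes_)}
print(mapping)

# Convert a numeric code back to its text label
print(encoder.inverse_transform([2]))
```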

In [11]:
from sklearn.model_selection import train_test_split

# Split the data into features (X) and target (y)
X = df['cleaned_comment']  # Features: The cleaned comments (text)
y = df['emotion_label']    # Target: The corresponding emotions

# Split the data into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Check the size of training and test sets
print(len(X_train), len(X_test))


4749 1188


EXPLANATION




* train_test_split: Splits the dataset into training (80%) and testing (20%) sets.
* X: The feature (cleaned text comments).
* y: The target (emotion labels).
* print: Displays the size of the training and testing sets.
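The split above is purely random. One optional variation, not used in the original notebook, is to pass `stratify=y` so that each emotion keeps the same proportion in the training and test sets. A minimal sketch on invented, evenly balanced data:

```python
from collections import Counter

from sklearn.model_selection import train_test_split

# Invented data: 30 samples, 10 per class
texts = [f"sample {i}" for i in range(30)]
labels = [i % 3 for i in range(30)]

# stratify=labels keeps the class proportions equal in both splits
X_tr, X_te, y_tr, y_te = train_test_split(
    texts, labels, test_size=0.2, random_state=42, stratify=labels
)

print(Counter(y_tr))
print(Counter(y_te))
```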



In [12]:
from sklearn.feature_extraction.text import TfidfVectorizer

# Initialize the TF-IDF Vectorizer
tfidf_vectorizer = TfidfVectorizer(max_features=5000)

# Fit and transform the training data
X_train_tfidf = tfidf_vectorizer.fit_transform(X_train)

# Transform the test data (without fitting again)
X_test_tfidf = tfidf_vectorizer.transform(X_test)

# Check the shape of the transformed data
print(X_train_tfidf.shape, X_test_tfidf.shape)


(4749, 5000) (1188, 5000)


EXPLANATION



1. TfidfVectorizer: Converts the text data into numerical vectors based on term frequency-inverse document frequency (TF-IDF), which helps capture important words in the text.

2. fit_transform(X_train): Fits the vectorizer to the training data and transforms it into numerical features.

3. transform(X_test): Transforms the test data into numerical features using the already fitted vectorizer.

4. print: Displays the shape of the transformed data, showing the number of samples and features.
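The reason for fitting only on the training data is that the vocabulary is fixed at fit time: words that appear only in the test data are ignored by `transform`. The sketch below illustrates this on three invented sentences. With the default settings, single-character tokens such as "i" are also dropped from the vocabulary.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Invented documents
train_docs = ["i feel happy", "i feel scared"]
test_docs = ["i feel furious"]  # "furious" never appears in training

vectorizer = TfidfVectorizer()
train_matrix = vectorizer.fit_transform(train_docs)  # learns the vocabulary
test_matrix = vectorizer.transform(test_docs)        # reuses that vocabulary

print(sorted(vectorizer.vocabulary_))
print(train_matrix.shape, test_matrix.shape)

# Number of non-zero entries in the test row: only "feel" is a known word
print(test_matrix.nnz)
```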

In [13]:
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

# Initialize and train the Logistic Regression model
model = LogisticRegression()
model.fit(X_train_tfidf, y_train)

# Predict on the test set
y_pred = model.predict(X_test_tfidf)

# Evaluate the model's performance
print(classification_report(y_test, y_pred))


              precision    recall  f1-score   support

           0       0.91      0.92      0.92       392
           1       0.95      0.90      0.93       416
           2       0.90      0.94      0.92       380

    accuracy                           0.92      1188
   macro avg       0.92      0.92      0.92      1188
weighted avg       0.92      0.92      0.92      1188



EXPLANATION


1. LogisticRegression(): Initializes a logistic regression model, which is used for classification tasks.

2. fit(X_train_tfidf, y_train): Trains the model on the transformed training data (X_train_tfidf) and the corresponding labels (y_train).

3. predict(X_test_tfidf): Uses the trained model to predict the emotions for the test set (X_test_tfidf).

4. classification_report: Evaluates the model's performance, providing metrics like precision, recall, and F1-score for each emotion label.
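The report above lists the classes as 0, 1, and 2, which are the encoded labels. `classification_report` accepts a `target_names` argument to show the emotion names instead. The sketch below uses invented true and predicted labels. In the notebook, `label_encoder.classes_` could presumably be passed as `target_names`.

```python
from sklearn.metrics import classification_report

# Invented labels: 0 = anger, 1 = fear, 2 = joy
y_true = [0, 1, 2, 2, 1, 0]
y_hat = [0, 1, 2, 1, 1, 0]
names = ["anger", "fear", "joy"]

# Human-readable report with emotion names as row labels
print(classification_report(y_true, y_hat, target_names=names))

# The same metrics as a nested dict, convenient for programmatic checks
report = classification_report(y_true, y_hat, target_names=names, output_dict=True)
print(report["joy"]["recall"])
```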

In [14]:
# Define responses for each emotion
emotion_responses = {
    'fear': 'I’m here for you. It’s okay to feel scared sometimes.',
    'anger': 'I understand your frustration. Let’s try to talk it out.',
    'joy': 'That’s wonderful to hear! Keep the positivity flowing!',
    'sadness': 'I’m really sorry you’re feeling this way. I’m here to help.',
    # Add more emotions and responses as needed
}

# Function to generate a response based on emotion
def generate_response(emotion):
    return emotion_responses.get(emotion, "I’m here to listen.")

# Predict emotion and generate response for the first test sample
predicted_emotion = label_encoder.inverse_transform([y_pred[0]])  # Get the predicted emotion
response = generate_response(predicted_emotion[0])

# Print the response
print("Predicted Emotion:", predicted_emotion[0])
print("Response:", response)


Predicted Emotion: anger
Response: I understand your frustration. Let’s try to talk it out.


EXPLANATION


1. emotion_responses: A dictionary mapping emotions to predefined empathetic responses.

2. generate_response(emotion): A function that returns an appropriate response based on the predicted emotion.

3. inverse_transform([y_pred[0]]): Converts the predicted numeric emotion back to the original text label.

4. print: Displays the predicted emotion and the corresponding response.
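The second argument to `dict.get` is what makes `generate_response` safe for labels that have no entry in the dictionary. The sketch below shows the lookup and its fallback in isolation; the name `pick_response` and the label "surprise" are illustrative assumptions. Since the classification report above lists only three classes, the 'sadness' entry in `emotion_responses` would likely never be reached by this model's predictions.

```python
# Shortened copy of the response table, for illustration only
responses = {
    "fear": "I'm here for you. It's okay to feel scared sometimes.",
    "joy": "That's wonderful to hear! Keep the positivity flowing!",
}

FALLBACK = "I'm here to listen."


def pick_response(emotion):
    # dict.get returns FALLBACK when the emotion has no entry
    return responses.get(emotion, FALLBACK)


print(pick_response("joy"))
print(pick_response("surprise"))  # no entry, so the fallback is returned
```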

In [15]:
def chat_with_bot(user_input):
    # Clean the user input
    cleaned_input = clean_text(user_input)

    # Transform the input using TF-IDF
    user_input_tfidf = tfidf_vectorizer.transform([cleaned_input])

    # Predict the emotion
    predicted_label = model.predict(user_input_tfidf)
    predicted_emotion = label_encoder.inverse_transform(predicted_label)

    # Generate the response
    response = generate_response(predicted_emotion[0])

    return response

# Example usage:
user_input = "I'm feeling really anxious today"
print("Bot:", chat_with_bot(user_input))


Bot: I’m here for you. It’s okay to feel scared sometimes.
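`chat_with_bot` always returns the response for the most likely emotion, even when the model is unsure. One possible refinement, which is not part of the original notebook, is to fall back to a neutral reply when the highest class probability from `predict_proba` is below a threshold. The sketch below is self-contained and trains on a handful of invented sentences; the function name `reply_with_threshold`, the threshold value, and the training data are all assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Tiny invented training set, for illustration only
train_texts = [
    "i am so scared of the dark", "this noise terrifies me",
    "i am furious about this", "that makes me so angry",
    "i feel wonderful today", "what a happy surprise",
]
train_labels = ["fear", "fear", "anger", "anger", "joy", "joy"]

vectorizer = TfidfVectorizer()
classifier = LogisticRegression(max_iter=1000)
classifier.fit(vectorizer.fit_transform(train_texts), train_labels)

RESPONSES = {
    "fear": "I'm here for you. It's okay to feel scared sometimes.",
    "anger": "I understand your frustration. Let's try to talk it out.",
    "joy": "That's wonderful to hear! Keep the positivity flowing!",
}
FALLBACK = "I'm here to listen."


def reply_with_threshold(text, threshold=0.5):
    """Return an emotion-specific reply only if the model is confident enough."""
    probabilities = classifier.predict_proba(vectorizer.transform([text]))[0]
    best = probabilities.argmax()
    if probabilities[best] < threshold:
        return FALLBACK
    return RESPONSES[classifier.classes_[best]]


print(reply_with_threshold("i am scared"))
```

A suitable threshold depends on the data and would need to be tuned on held-out examples.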


# SUMMARY


This project builds a chatbot that detects a user's emotional state from text and responds empathetically. It involves:

1. Data Preprocessing: Cleaning and encoding text data labeled with emotions like fear, anger, and joy.

2. Feature Extraction: Transforming text into numerical features with TF-IDF.

3. Model Training: Using a Logistic Regression model to classify emotions based on those features.

4. Response Generation: Mapping the predicted emotion to a predefined empathetic response.

The goal is to create a chatbot that understands emotional cues and offers supportive, context-aware replies.






