

<h1><center><font size=10>Introduction to LLMs and GenAI</font></center></h1>
<h1><center>Mini Project 2: Word2vec and GloVe</center></h1>

## Problem Statement

### Business Context

In today's fast-paced e-commerce landscape, customer reviews significantly influence product perception and buying decisions. Businesses must actively monitor customer sentiment to extract insights and maintain a competitive edge. Ignoring negative feedback can lead to serious issues, such as:

- **Customer Churn**: Unresolved complaints drive loyal customers away, reducing retention and future revenue.

- **Reputation Damage**: Persistent negative sentiment can erode brand trust and deter new buyers.

- **Financial Loss**: Declining sales and shifting customer preference toward competitors directly impact profitability.

Actively tracking and addressing customer sentiment is essential for sustained growth and brand strength.

### Problem Definition

A growing e-commerce platform specializing in electronic gadgets collects customer feedback from product reviews, surveys, and social media. With a 200% increase in their customer base over three years and a recent 25% spike in feedback volume, their manual review process is no longer sustainable.

To address this, the company aims to implement an AI-driven solution to automatically classify customer sentiments (positive, negative, or neutral).

As a Data Scientist, your task is to analyze the provided customer reviews, along with their labeled sentiments, and build a predictive model for sentiment classification.

=================================================================================================================

### Data Dictionary

- **Product ID**: An exclusive identification number for each product

- **Product Review**: Insights and opinions shared by customers about the product

- **Sentiment**: Sentiment associated with the product review, indicating whether the review expresses a positive, negative, or neutral sentiment

## Importing the necessary libraries

In [1]:
!pip install gensim

Collecting gensim
  Using cached gensim-4.3.3.tar.gz (23.3 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting numpy<2.0,>=1.18.5 (from gensim)
  Using cached numpy-1.26.4.tar.gz (15.8 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
Collecting scipy<1.14.0,>=1.7.0 (from gensim)
  Using cached scipy-1.13.1.tar.gz (57.2 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Preparing metadata (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─

In [2]:
# to read and manipulate the data
import pandas as pd
import numpy as np
pd.set_option('max_colwidth', None)    # setting column to the maximum column width as per the data

# to visualise data
import matplotlib.pyplot as plt
import seaborn as sns

# to use regular expressions for manipulating text data
import re

# to load the natural language toolkit
import nltk
nltk.download('stopwords')    # loading the stopwords
nltk.download('wordnet')    # loading the WordNet corpus that is used in lemmatization

# to remove common stop words
from nltk.corpus import stopwords

# to perform stemming
from nltk.stem.porter import PorterStemmer

# to create Bag of Words
from sklearn.feature_extraction.text import CountVectorizer

# to import Word2Vec
from gensim.models import Word2Vec

# to split data into train and test sets
from sklearn.model_selection import train_test_split

# to build a Random Forest model
from sklearn.ensemble import RandomForestClassifier

# to compute metrics to evaluate the model
from sklearn import metrics
from sklearn.metrics import accuracy_score, classification_report
from sklearn.metrics import confusion_matrix

# To tune different models
from sklearn.model_selection import GridSearchCV

[nltk_data] Downloading package stopwords to
[nltk_data]     /Users/abhinavroyce/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package wordnet to
[nltk_data]     /Users/abhinavroyce/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


ModuleNotFoundError: No module named 'gensim'

## Loading the dataset

In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
# loading data into a pandas dataframe
reviews = pd.read_csv("/content/drive/MyDrive/0- July-Dec 2025/5th sem Intro to LLM and GenAI/Classroom Mini Projects/Part-2/Product_Reviews.csv")

In [None]:
# creating a copy of the data
data = reviews.copy()

## Data Overview

### Checking the first five rows of the data

In [None]:
data.head(5)

### Checking the shape of the dataset

In [None]:
data.shape

* The dataset has 1007 rows and 3 columns.

### Checking for Missing Values

In [None]:
data.isnull().sum()

* There are no missing values in the data

### Checking for duplicate values

In [None]:
# checking for duplicate values
data.duplicated().sum()

* There are 2 duplicate rows in the dataset.
* We'll drop them.

In [None]:
# dropping duplicate values
data = data.drop_duplicates()

data.duplicated().sum()

## Exploratory Data Analysis (EDA)


In [None]:
sns.countplot(data=data, x="Sentiment");

In [None]:
data['Sentiment'].value_counts(normalize=True)

- The majority of the reviews are positive (\~85%), followed by neutral reviews (\~8%), and then negative reviews (\~7%).
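
With one class holding roughly 85% of the reviews, plain accuracy can look strong for a model that has learned nothing, which is why a class-averaged metric such as macro F1 (used later in this notebook) is more informative. The sketch below is a dependency-free illustration under stated assumptions: the labels are made-up counts that only mirror the proportions above (85 positive, 8 neutral, 7 negative), and `macro_f1` is a hypothetical helper written for this example, not a library function.

```python
from collections import Counter

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative helper)."""
    scores = []
    for cls in sorted(set(y_true)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores.append(f1)
    return sum(scores) / len(scores)

# Assumed toy labels that mirror the class proportions reported above
y_true = ["positive"] * 85 + ["neutral"] * 8 + ["negative"] * 7

# Baseline: always predict the most frequent class
majority = Counter(y_true).most_common(1)[0][0]
y_pred = [majority] * len(y_true)

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
baseline_macro_f1 = macro_f1(y_true, y_pred)
print(f"accuracy = {accuracy:.2f}, macro F1 = {baseline_macro_f1:.2f}")
```

The baseline scores high on accuracy but low on macro F1, because the two minority classes each contribute an F1 of zero to the average.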

=================================================================================================================

## Text Preprocessing

### Removing special characters from the text

In [None]:
# defining a function to remove special characters
def remove_special_characters(text):
    # Defining the regex pattern to match runs of non-alphanumeric characters
    pattern = '[^A-Za-z0-9]+'

    # Replacing every run of non-alphanumeric characters with a single space
    new_text = re.sub(pattern, ' ', text)

    return new_text

In [None]:
# Applying the function to remove special characters
data['cleaned_text'] = data['Product Review'].apply(remove_special_characters)

In [None]:
# checking a couple of instances of cleaned data
data.loc[0:3, ['Product Review','cleaned_text']]

- We can observe that the function removed the special characters and retained the letters and digits.

### Lowercasing

In [None]:
# changing the case of the text data to lower case
data['cleaned_text'] = data['cleaned_text'].str.lower()

In [None]:
# checking a couple of instances of cleaned data
data.loc[0:3, ['Product Review','cleaned_text']]

- We can observe that all the text has now successfully been converted to lower case.

### Removing extra whitespace

In [None]:
# removing leading and trailing whitespace from the text
data['cleaned_text'] = data['cleaned_text'].str.strip()

In [None]:
# checking a couple of instances of cleaned data
data.loc[0:3, ['Product Review','cleaned_text']]
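
To see the three cleaning steps working together, here is a minimal sketch on a single made-up review. It is an illustration only: `clean_review` is a hypothetical helper that chains the same operations used above (regex substitution, lowercasing, stripping), and the sample string is invented.

```python
import re

def clean_review(text):
    """Apply the three cleaning steps in order (illustrative helper)."""
    text = re.sub(r'[^A-Za-z0-9]+', ' ', text)  # runs of non-alphanumerics -> one space
    text = text.lower()                          # lowercase everything
    return text.strip()                          # drop leading/trailing spaces

sample = "  Great phone!!! Battery lasts 2 days :)"
cleaned = clean_review(sample)
print(repr(cleaned))
```

Because the regex pattern ends in `+`, each run of punctuation or spaces collapses to one space, so `strip()` only has to deal with the two ends of the string.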

### Removing stopwords

* The idea with stop word removal is to **exclude words that appear frequently throughout** all the documents in the corpus.
* Pronouns and articles are typically categorized as stop words.
* The `NLTK` library ships with a built-in list of stop words, which we can use to remove the stop words from a dataset.

In [None]:
# building the stop word set once so that membership checks are fast
stop_words = set(stopwords.words('english'))

# defining a function to remove stop words using the NLTK library
def remove_stopwords(text):
    # Split text into separate words
    words = text.split()

    # Removing English language stopwords
    new_text = ' '.join([word for word in words if word not in stop_words])

    return new_text

In [None]:
# Applying the function to remove stop words using the NLTK library
data['cleaned_text_without_stopwords'] = data['cleaned_text'].apply(remove_stopwords)

In [None]:
# checking a couple of instances of cleaned data
data.loc[0:3,['cleaned_text','cleaned_text_without_stopwords']]

* We observe that all the stopwords have been removed.
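
One caveat for sentiment tasks: NLTK's English stop word list includes negations such as "not" and "no", so removing every stop word can hide the polarity of a review. The sketch below illustrates the effect with a small hand-written stop list (an assumption, so the example runs without any NLTK downloads); `KEEP` and `remove_stopwords_keep_negation` are hypothetical names, and whether keeping negations helps on this dataset would need to be tested.

```python
# Tiny illustrative stop list; the real NLTK list is much longer
STOP_WORDS = {"the", "is", "a", "this", "it", "and", "not", "no", "was"}

# Negations we choose to keep because they carry sentiment
KEEP = {"not", "no"}
FILTERED_STOP_WORDS = STOP_WORDS - KEEP

def remove_stopwords_keep_negation(text):
    """Drop stop words but retain negation words."""
    return " ".join(w for w in text.split() if w not in FILTERED_STOP_WORDS)

review = "the battery is not good"
plain = " ".join(w for w in review.split() if w not in STOP_WORDS)
kept = remove_stopwords_keep_negation(review)
print(plain)  # negation removed along with the other stop words
print(kept)   # negation retained
```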

### Stemming/Lemmatization
We will use lemmatization because we got better results using that on our dataset.

In [None]:
# Function to apply lemmatization
from nltk.stem import WordNetLemmatizer
import nltk

# Make sure to download the WordNet resources if not already done
nltk.download('wordnet')  # WordNet is a dictionary-like lexical database in which words are grouped into sets of synonyms
nltk.download('omw-1.4')  # Open Multilingual WordNet package, which adds language translations, richer word forms, and improved morphological data to WordNet

lemmatizer = WordNetLemmatizer()

# defining a function to perform lemmatization
def apply_lemmatizer(text):
    # Split text into separate words
    words = text.split()

    # Applying the WordNet lemmatizer on every word of a review and joining the lemmatized words back into a single string
    # (lemmatize() treats each word as a noun unless a part-of-speech tag is passed)
    new_text = ' '.join([lemmatizer.lemmatize(word) for word in words])

    return new_text

In [None]:
# Applying the function to perform lemmatization
data['final_cleaned_text'] = data['cleaned_text_without_stopwords'].apply(apply_lemmatizer)

In [None]:
# checking a couple of instances of cleaned data
data.loc[0:2,['cleaned_text_without_stopwords','final_cleaned_text']]

=================================================================================================================

## Text Vectorization

### 1. Count Vectorizer

- We'll use the [`CountVectorizer`](https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.CountVectorizer.html) class of sklearn to vectorize the data using Bag of Words (BoW).

- We first create the document-term matrix, where each value in the matrix stores the count of a term in a document.

- We then consider only the top *n* terms by frequency
    - *n* is a hyperparameter that one can change and experiment with
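
The idea behind the document-term matrix can be sketched without any libraries. This is a simplified illustration, not a reimplementation of `CountVectorizer` (which also handles tokenization rules, tie-breaking, and sparse storage); `bag_of_words` is a hypothetical helper and the three documents are made up.

```python
from collections import Counter

def bag_of_words(docs, max_features):
    """Build a document-term count matrix over the top `max_features` terms."""
    tokenized = [doc.split() for doc in docs]
    corpus_counts = Counter(tok for doc in tokenized for tok in doc)

    # Keep the most frequent terms, then order the vocabulary alphabetically
    vocab = sorted(term for term, _ in corpus_counts.most_common(max_features))

    # One row per document, one column per vocabulary term
    matrix = []
    for doc in tokenized:
        counts = Counter(doc)
        matrix.append([counts[term] for term in vocab])
    return vocab, matrix

docs = ["good phone good battery", "bad battery phone", "good screen"]
vocab, matrix = bag_of_words(docs, max_features=3)
print(vocab)
for row in matrix:
    print(row)
```

Terms outside the top *n* ("bad" and "screen" here) are simply dropped, which is the trade-off that the `max_features` hyperparameter controls.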

In [None]:
# Initializing CountVectorizer with top 1000 words
bow_vec = CountVectorizer(max_features = 1000)

# Applying CountVectorizer on data
data_features_BOW = bow_vec.fit_transform(data['final_cleaned_text'])

# Convert the data features to array
data_features_BOW = data_features_BOW.toarray()


# Shape of the feature vector
print("Shape of the feature vector",data_features_BOW.shape)

# Getting the 1000 words considered by the BoW model
words = bow_vec.get_feature_names_out()

print("first 10 words",words[:10])
print("last 10 words",words[-10:])

# Creating a DataFrame from the data features
df_BOW = pd.DataFrame(data_features_BOW, columns=bow_vec.get_feature_names_out())
df_BOW.head()


- From the above dataframe, we can observe that the word *yet* is present only once in the third document, and the word *would* is present twice in the fourth document.

### 2. Word2Vec


Word2Vec is a popular technique to convert words into numerical vectors (i.e., embeddings) so that similar words end up having similar vector representations. It helps machines understand the semantic meaning of words based on their context in sentences.

* REAL BENEFIT

  * After training on a large corpus, Word2Vec embeddings capture interesting relationships:

    * vector("king") - vector("man") + vector("woman") ≈ vector("queen")


#### HOW IT WORKS:
Word2Vec has two main models:
1. CBOW (Continuous Bag of Words): predicts the target word from the surrounding context words.
2. Skip-gram: predicts the surrounding context words from a target word.



#### Summary:
* Word2Vec turns words into vectors based on their context.

* It helps models understand semantic relationships.

* It works best when trained on large text corpora (like Wikipedia, Google News, etc.).
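
To make "similar vectors" and the king/queen analogy concrete, here is a minimal sketch using cosine similarity on hand-made 2-dimensional vectors. The numbers are invented for illustration (one axis loosely standing for "royalty", the other for "gender"); real Word2Vec vectors are learned from data and have many more dimensions.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1 = same direction, 0 = orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hand-made toy vectors (assumed values, not learned embeddings)
toy = {
    "king":  [0.9,  0.8],
    "queen": [0.9, -0.8],
    "man":   [0.1,  0.8],
    "woman": [0.1, -0.8],
    "apple": [-0.5, 0.1],
}

# king - man + woman
query = [k - m + w for k, m, w in zip(toy["king"], toy["man"], toy["woman"])]

# Rank the remaining words by cosine similarity to the query vector
candidates = {w: v for w, v in toy.items() if w not in {"king", "man", "woman"}}
best = max(candidates, key=lambda w: cosine_similarity(query, candidates[w]))
print(best)
```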


### 2.1 CBOW

* Example

"The cat sat on the mat"

For readability, the table below shows only the nearest word on each side of the target word (a context window of 1).
* CBOW works in the opposite direction to Skip-gram:

Instead of predicting the context from a word, it predicts the word from its context.

| Context (Input Words) | Target (Predicted Word) |
| --------------------- | ----------------------- |
| \["The", "sat"]       | "cat"                   |
| \["cat", "on"]        | "sat"                   |
| \["sat", "the"]       | "on"                    |
| \["on", "mat"]        | "the"                   |

* Note: With a window size of 2, as used in the code below, up to 2 words on either side are included when available.

So, the CBOW model is trained to learn that the center word ("cat") is likely when "The" and "sat" are around it.
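
The (context, target) examples for any window size can be generated mechanically. The following is a minimal sketch of that bookkeeping only; `cbow_pairs` is a hypothetical helper, and library implementations such as gensim add further details (for example, subsampling of frequent words) that are not shown here.

```python
def cbow_pairs(tokens, window):
    """Return (context_words, target_word) training examples."""
    pairs = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]      # up to `window` words before
        right = tokens[i + 1:i + 1 + window]     # up to `window` words after
        pairs.append((left + right, target))
    return pairs

tokens = ["the", "cat", "sat", "on", "the", "mat"]
for context, target in cbow_pairs(tokens, window=2):
    print(context, "->", target)
```

Words near the start or end of the sentence get fewer context words, since the window is truncated at the sentence boundary.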

In [None]:
# Example CBOW
# Note:
  # sg=0 → model is trained to predict target word from context (CBOW)
  # sg=1 → model is trained to predict context words from target (Skip-gram)


from gensim.models import Word2Vec

# Define corpus
sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
    ["cats", "and", "dogs", "are", "friends"],
    ["the", "puppy", "played", "with", "the", "ball"],
    ["the", "kitten", "played", "with", "the", "yarn"]
]



# CBOW model (sg=0 for CBOW, sg=1 for skip-gram)
cbow_model = Word2Vec(sentences, vector_size=10, window=2, min_count=1, sg=0)
"""
PARAMETERS:
1. vector_size=10
What it means: Number of dimensions in the word vector.
Example: "cat" → [0.12, -0.56, 0.91, ...] (10 numbers)
Tip: Bigger vectors can store more meaning but need more data & computation.

2. window=2
What it means: Maximum number of words before & after the target word that are considered context.
Example: In "The cat sat on the mat",
if target = "sat", window=2 → context = "The", "cat", "on", "the".
Tip:
Small window → local grammar relationships
Large window → broader semantic relationships

3. min_count=1
What it means: Minimum word frequency to be included in the vocabulary.
Example:
min_count=1 → keep all words (good for small datasets)
min_count=5 → ignore words that appear fewer than 5 times (good for large datasets).
Tip: Helps remove rare, noisy words in big corpora.

4. sg=0 or sg=1
What it means: Chooses the training algorithm.
sg=0 → CBOW (predict target word from context)
sg=1 → Skip-gram (predict context words from target)
Example:
CBOW: "cat", "on" → "sat"
Skip-gram: "sat" → "cat", "on"
Tip:
CBOW is faster & works well with frequent words.
Skip-gram is slower but works better with rare words.

5. workers
What it means: Number of CPU threads to use in training.
Word2Vec can process multiple parts of the training data in parallel to speed things up.
Example:
workers=1 → use only 1 CPU thread (slower, but required for reproducible runs, along with a fixed seed)
workers=4 → use 4 CPU threads (faster)
Tip:
On your personal machine, you can set it to the number of cores you have.
On Colab / Jupyter with small datasets, it won't matter much, but for huge corpora it makes training much faster."""

# Vector for a word
print("Vector for 'cat':")
print(cbow_model.wv['cat'])

# Similar words to 'cat'
print("\nWords similar to 'cat' and their cosine similarities:")
print(cbow_model.wv.most_similar('cat'))


### 2.2 Skip-gram

* EXAMPLE:

Let's say we have this simple sentence as our training corpus:

"The cat sat on the mat"

Suppose we want to train a Skip-gram model. For readability, the pairs below use a context window of 1 (the nearest word on each side); with a window of 2, up to two words on each side would be included.
We'll break the sentence into word pairs where the target word predicts context words.

Skip-gram pairs (target → context):

target: "cat" → context: "The", "sat"

target: "sat" → context: "cat", "on"

target: "on" → context: "sat", "the"

target: "the" → context: "on", "mat"

* The model learns vector representations (say, 100-dimensional) for each word, so that:

  * Words that appear in similar contexts (like "cat" and "dog" if seen in a bigger dataset) will have similar vectors.

  * The cosine similarity between similar words will be high (close to 1), i.e., the cosine distance between them will be small.
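
Unlike CBOW, Skip-gram treats every (target, context word) combination as its own training pair. A minimal sketch of how those pairs can be enumerated is shown below; `skipgram_pairs` is a hypothetical helper for illustration and leaves out what real implementations add on top, such as negative sampling.

```python
def skipgram_pairs(tokens, window):
    """Return (target_word, context_word) pairs, one per context word."""
    pairs = []
    for i, target in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:  # a word is not its own context
                pairs.append((target, tokens[j]))
    return pairs

tokens = ["the", "cat", "sat", "on", "the", "mat"]
pairs = skipgram_pairs(tokens, window=1)
print(pairs[:3])
print(len(pairs), "pairs with window=1;",
      len(skipgram_pairs(tokens, window=2)), "pairs with window=2")
```

A larger window produces more pairs per sentence, which is one reason Skip-gram training is slower than CBOW on the same corpus.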

In [None]:
# Example on skip gram
from gensim.models import Word2Vec

# Simple corpus
sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
    ["cats", "and", "dogs", "are", "friends"],
    ["the", "puppy", "played", "with", "the", "ball"],
    ["the", "kitten", "played", "with", "the", "yarn"]
]


# Train Word2Vec model
model = Word2Vec(sentences, vector_size=10, window=2, min_count=1, sg=1)

# Get vector for 'cat'
print(model.wv['cat'])

# Find similar words
print(model.wv.most_similar('cat'))


### CBOW vs Skip-gram Summary
| Feature        | CBOW                      | Skip-gram             |
| -------------- | ------------------------- | --------------------- |
| Input          | Surrounding context words | Target word           |
| Output         | Predict center word       | Predict context words |
| Training speed | Faster                    | Slower                |
| Well suited to | Large datasets            | Small datasets        |
| Better for     | Frequent words            | Rare words            |


### 2.3 Applying Word2Vec to our dataset

In [None]:
import pandas as pd
from gensim.models import Word2Vec
import numpy as np


# Step 1: Tokenize the text (one list of tokens per review)
sentences = data['final_cleaned_text'].apply(lambda x: x.split()).tolist()  # assuming text is already cleaned

# Step 2: Train CBOW Model (sg=0)
cbow_model = Word2Vec(
    sentences,
    vector_size=100,  # length of each word vector
    window=3,         # context window size
    min_count=5,      # ignore words that appear fewer than 5 times
    sg=0,             # CBOW
    workers=4         # CPU threads to use
)

# Step 3: Train Skip-gram Model (sg=1)
skipgram_model = Word2Vec(
    sentences,
    vector_size=100,
    window=3,
    min_count=5,
    sg=1,             # Skip-gram
    workers=4
)

# Step 4: Function to get sentence vectors by averaging the word vectors
def get_sentence_vector(model, tokens):
    # words dropped by min_count are not in model.wv and are skipped
    word_vecs = [model.wv[word] for word in tokens if word in model.wv]
    if len(word_vecs) == 0:
        return np.zeros(model.vector_size)  # handle sentences with no in-vocabulary words
    return np.mean(word_vecs, axis=0)

# Step 5: Apply to dataset

# CBOW Vectors
data_cbow_vectors = np.array([get_sentence_vector(cbow_model, tokens) for tokens in sentences])
# Skip-gram Vectors
data_skipgram_vectors = np.array([get_sentence_vector(skipgram_model, tokens) for tokens in sentences])

# Step 6: Convert to DataFrames (optional)
df_cbow = pd.DataFrame(data_cbow_vectors)
df_skipgram = pd.DataFrame(data_skipgram_vectors)
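
The averaging step in `get_sentence_vector` can be illustrated on toy data without gensim or NumPy. This is a simplified sketch with invented 2-dimensional vectors; `mean_pool` is a hypothetical stand-in that follows the same logic (skip unknown words, fall back to zeros).

```python
def mean_pool(tokens, vectors, dim):
    """Average the vectors of in-vocabulary tokens; zeros if none are known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim
    return [sum(component) / len(known) for component in zip(*known)]

# Assumed toy embeddings, not learned values
toy_vectors = {"good": [1.0, 0.0], "battery": [0.0, 1.0]}

pooled = mean_pool(["good", "battery", "zzz"], toy_vectors, dim=2)  # "zzz" is skipped
empty = mean_pool(["zzz"], toy_vectors, dim=2)                      # no known words
print(pooled, empty)
```

Averaging discards word order, so "good battery" and "battery good" map to the same vector. That is a known limitation of this kind of sentence representation.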




In [None]:
# Checking top 5 similar words to the word 'book'
similar = cbow_model.wv.similar_by_word('book', topn=5)
print(similar)

In [None]:
# Checking top 5 similar words to the word 'review'
similar = cbow_model.wv.similar_by_word('review', topn=5)  # swap in skipgram_model to compare the two models
print(similar)

### 3. GloVe

In [None]:
from gensim.models import KeyedVectors
# load the Stanford GloVe model
filename = '/content/drive/MyDrive/0- July-Dec 2025/5th sem Intro to LLM and GenAI/Classroom Mini Projects/Part-2/glove.6B.100d.txt.word2vec'
model = KeyedVectors.load_word2vec_format(filename, binary=False)

In [None]:
# Checking the word embedding of a random word
word = "book"
model[word]

In [None]:
#Returning the top 5 similar words.
result = model.most_similar("book", topn=5)
print(result)

In [None]:
#Returning the top 5 similar words.
result = model.most_similar("review", topn=5)
print(result)

In [None]:
#List of words in the vocabulary
words = model.index_to_key

#Dictionary with key as the word and the value as the corresponding embedding vector.
word_vector_dict = dict(zip(model.index_to_key,list(model.vectors)))

#Defining the dimension of the embedded vector.
vec_size=100

def average_vectorizer_GloVe(doc):
    # Initializing a feature vector for the sentence
    feature_vector = np.zeros((vec_size,), dtype="float64")

    # Creating a list of words in the sentence that are present in the model vocabulary
    # (a dictionary lookup is fast; searching the `words` list would scan the whole vocabulary for every token)
    words_in_vocab = [word for word in doc.split() if word in word_vector_dict]

    # adding the vector representations of the words
    for word in words_in_vocab:
        feature_vector += np.array(word_vector_dict[word])

    # Dividing by the number of words to get the average vector
    if len(words_in_vocab) != 0:
        feature_vector /= len(words_in_vocab)

    return feature_vector

# creating a dataframe of the vectorized documents
df_glove = pd.DataFrame(data['final_cleaned_text'].apply(average_vectorizer_GloVe).tolist(), columns=['Feature '+str(i) for i in range(vec_size)])
df_glove

In [None]:
import pandas as pd
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier, AdaBoostClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix, f1_score
import matplotlib.pyplot as plt
import seaborn as sns

# Create a list of datasets and their labels
vectorized_datasets = [
    ("BoW", df_BOW),
    ("GloVe", df_glove),
    ("word2Vec_cbow", df_cbow),
    ("skipgram", df_skipgram)
]

# Your target variable
y = data['Sentiment']

# Store results
results = []

# Loop over each dataset and train the three classifiers
for name, X in vectorized_datasets:
    # Split data (80/20); the same random_state gives the same split for every dataset
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=100)

    # Random Forest
    rf_model = RandomForestClassifier(random_state=100)
    rf_model.fit(X_train, y_train)
    rf_preds = rf_model.predict(X_test)
    rf_f1 = f1_score(y_test, rf_preds, average='macro')
    results.append((f"RandomForest - {name}", rf_f1, rf_model, X_test, y_test, rf_preds))

    # Multinomial Naive Bayes (disabled; note that MultinomialNB requires non-negative
    # features, so it cannot be fitted on embedding vectors that contain negative values)
    # nb_model = MultinomialNB()
    # nb_model.fit(X_train, y_train)
    # nb_preds = nb_model.predict(X_test)
    # nb_f1 = f1_score(y_test, nb_preds, average='macro')
    # results.append((f"NaiveBayes - {name}", nb_f1, nb_model, X_test, y_test, nb_preds))

    # Gradient Boosting
    gboost = GradientBoostingClassifier(random_state=100)
    gboost.fit(X_train, y_train)
    gb_preds = gboost.predict(X_test)
    gb_f1 = f1_score(y_test, gb_preds, average='macro')
    results.append((f"Gradient Boost - {name}", gb_f1, gboost, X_test, y_test, gb_preds))

    # Ada Boosting
    ada = AdaBoostClassifier()
    ada.fit(X_train, y_train)
    ada_preds = ada.predict(X_test)
    ada_f1 = f1_score(y_test, ada_preds, average='macro')
    results.append((f"Adaptive Boost - {name}", ada_f1, ada, X_test, y_test, ada_preds))

# Sort results by F1 score (descending)
results.sort(key=lambda x: x[1], reverse=True)

# Print all F1 scores
print("\n📊 Model Performance (Macro F1-scores):\n")
for label, f1_score_val, _, _, _, _ in results:
    print(f"{label:30s}: Macro F1 = {f1_score_val:.4f}")




Results for the run labelled `100, 3` (presumably Word2Vec `vector_size=100`, `window=3`):

📊 Model Performance (Macro F1-scores):

Gradient Boost - BoW          : Macro F1 = 0.6442
Gradient Boost - skipgram     : Macro F1 = 0.5362
RandomForest - skipgram       : Macro F1 = 0.5354
RandomForest - GloVe          : Macro F1 = 0.5138
Gradient Boost - word2Vec_cbow: Macro F1 = 0.5096
RandomForest - BoW            : Macro F1 = 0.4818
RandomForest - word2Vec_cbow  : Macro F1 = 0.4818
Gradient Boost - GloVe        : Macro F1 = 0.4776
Adaptive Boost - GloVe        : Macro F1 = 0.3715
Adaptive Boost - skipgram     : Macro F1 = 0.3436
Adaptive Boost - word2Vec_cbow: Macro F1 = 0.3414
Adaptive Boost - BoW          : Macro F1 = 0.3025

Results for the run labelled `1000, 2` (presumably Word2Vec `vector_size=1000`, `window=2`):

📊 Model Performance (Macro F1-scores):

Gradient Boost - BoW          : Macro F1 = 0.6442
RandomForest - GloVe          : Macro F1 = 0.5138
Gradient Boost - skipgram     : Macro F1 = 0.5111
RandomForest - BoW            : Macro F1 = 0.4818
RandomForest - word2Vec_cbow  : Macro F1 = 0.4818
Gradient Boost - GloVe        : Macro F1 = 0.4776
RandomForest - skipgram       : Macro F1 = 0.4724
Gradient Boost - word2Vec_cbow: Macro F1 = 0.4346
Adaptive Boost - GloVe        : Macro F1 = 0.3715
Adaptive Boost - skipgram     : Macro F1 = 0.3700
Adaptive Boost - BoW          : Macro F1 = 0.3025
Adaptive Boost - word2Vec_cbow: Macro F1 = 0.3005

In [None]:
# Best model
best_model_label, best_f1, best_model, X_test_best, y_test_best, y_pred_best = results[0]

print(f"\n✅ Best Model: {best_model_label} (Macro F1 = {best_f1:.4f})\n")
print("Classification Report:\n")
print(classification_report(y_test_best, y_pred_best))

# Plot Confusion Matrix
cm = confusion_matrix(y_test_best, y_pred_best, labels=best_model.classes_)
plt.figure(figsize=(6, 5))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', xticklabels=best_model.classes_, yticklabels=best_model.classes_)
plt.title(f"Confusion Matrix: {best_model_label}")
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.tight_layout()
plt.show()
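
When reading the confusion matrix, it helps to turn the raw counts into per-class recall (along each row, "Actual") and precision (down each column, "Predicted"). The sketch below shows the arithmetic on hypothetical counts that are made up for illustration and are not the results of this notebook.

```python
labels = ["negative", "neutral", "positive"]

# Hypothetical counts: rows = actual class, columns = predicted class
cm = [
    [3, 1, 10],
    [1, 4, 11],
    [2, 3, 165],
]

recall = {}
precision = {}
for i, label in enumerate(labels):
    row_total = sum(cm[i])                  # all reviews that truly belong to this class
    col_total = sum(row[i] for row in cm)   # all reviews predicted as this class
    recall[label] = cm[i][i] / row_total
    precision[label] = cm[i][i] / col_total
    print(f"{label:9s} recall = {recall[label]:.2f}, precision = {precision[label]:.2f}")
```

In an imbalanced setting like this one, a low recall on the negative row is the number to watch, since those are the complaints the business most wants to catch.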

## Conclusion

- We analyzed the distribution of sentiments of the customers.

- We used different text processing techniques to clean the raw text data.

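- We then vectorized the text with Bag of Words, Word2Vec (CBOW and Skip-gram), and GloVe, and trained Random Forest, Gradient Boosting, and AdaBoost classifiers on each representation.

- The best combination in the runs above was Gradient Boosting on Bag of Words features, with a macro F1-score of 0.6442 on the test dataset.
    - The model can be tuned further or a different model can be trained to model the data better.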

- By pinpointing areas of improvement or concerns raised by customers based on the predictions of the model, the organization can take swift and targeted actions to address issues, minimizing the risk of revenue loss and bolstering customer satisfaction.

- The organization can leverage sentiment categorizations to tailor marketing strategies.
    - Highlighting positive sentiments in promotional material can contribute to a positive brand image.
    - They can use neutral and negative sentiments to make informed decisions around inventory.

<font size=6 color='blue'>Thanks.....</font>
___