# Build a model to analyze sentiment in tweets about Apple and Google products

##  OVERVIEW AND DATA UNDERSTANDING 
#### **Business Understanding**

##### **What is sentiment analysis?**

**Sentiment analysis**, also known as **sentiment classification**, uses natural language processing to identify the emotional tone behind text, such as customer feedback, and categorize it as positive, negative, or neutral.

It can also be described as a text classification task: given a phrase or a list of phrases, a classifier decides whether the sentiment behind it is:

- positive
- negative 
- neutral. 

In some cases, the neutral class is dropped so that the task becomes a binary classification problem.

In this project we carry out a sentiment classification task: we analyze tweets, the emotion they express, and whether that emotion is directed at a brand or product, specifically Apple and Google products.



##### **How is sentiment analysis important to an organization?**

For an organization, sentiment analysis is valuable in the following ways:

- It helps the organization understand customer opinions, improve experiences, and address concerns proactively.
- It helps predict customer behavior toward a particular product.
- It can help test the adaptability of a product.
- It automates the production of customer preference reports.

These are only a few of the benefits. More generally, sentiment analysis helps business stakeholders define the business problems surrounding their products.


##### **An Overview of Apple Products**
Apple is one of the most recognized brands in the world, valued at over $2 trillion in 2021.

It is known for its innovative consumer electronics, including the iPhone, iPad, MacBook, and other devices.

Apple's next products may include a virtual reality headset and a self-driving car. The past few product launches have been smaller in scope, like the HomePod and AirPods, and Apple fans are clamouring for the next iPhone.

Given this wide range of products, below are some statistics on their performance in 2023, as of September:

- 231 million iPhones, 49 million iPads and 22 million Mac and MacBook units were sold in 2023
- Apple’s home and wearables division declined by 6.5% in 2023
- It sold 75 million AirPods and 38 million Apple Watches in 2023
- Apple Music has 93 million subscribers, Apple TV+ has 47 million


To sustain high revenue, Apple needs to continuously analyze sentiment around its brand. Users have strong emotional reactions to it, which frequently show up as a mix of positive and negative sentiment in the data.

Tweets are a helpful source of such data. Tweets about Apple products can cover, among other things, customer service experiences, software upgrades, and the introduction of new products.


##### **An Overview of Google Products**

Google offers diverse products designed to enhance productivity, connectivity, and innovation.

Key offerings include:

- Google Search
- Gmail
- Google Drive
- Google Workspace for organizing and collaborating
- YouTube
- Google Photos
- Google Play for entertainment
- Google Maps, Waze, and Google Earth for navigation
- Google Ads, Google Analytics, and Google Cloud Platform for businesses
- Firebase and BigQuery for developers

Additionally, smart devices like Pixel phones, Nest home products, and Chromecast provide cutting-edge hardware solutions. Overall, Google's products aim to simplify daily life, empower businesses, and connect the world.

Statistics on Google's performance in 2021 showed revenue of $278.1 billion.

As with Apple, sentiment analysis is crucial for Google to keep thriving, for example by tracking customer satisfaction.

#### **Why Analyze Tweets?**
- **Social Media Influence**: Platforms like Twitter have become primary channels where customers share their feedback, both positive and negative, about brands and products.
- **Volume of Data**: The massive and real-time nature of tweets makes manual analysis impractical, necessitating automated solutions.
- **Business Impact**: Sentiment analysis of tweets can provide actionable insights to enhance customer experience, refine marketing strategies, and maintain a competitive edge.


##### **Stakeholders**

Sentiment analysis is important for various participants such as:

- **Business Managers**: Understand customer satisfaction and drive decision-making.
- **Marketing Teams**: Create sentiment-driven marketing campaigns.
- **Customer Service Teams**: Prioritize resolving issues flagged in negative reviews.

#### **Challenges in Sentiment Analysis**
- **Unstructured Data**: Tweets are often informal, with abbreviations, slang, and emojis, making preprocessing essential.
- **Ambiguity**: Some texts may have mixed sentiments or implicit emotions that are challenging to classify.
- **Scalability**: Handling and processing large datasets efficiently is a significant challenge.

#### **Proposed Solutions**
Proof of Concept Approach: 

The execution plan for the sentiment analysis is as follows:
- Begin with simple approaches like bag-of-words or TF-IDF vectorization.
- Proceed to more complex methods (e.g., word embeddings or transformers).

Pre-trained Tools: 

Several NLP libraries (e.g., spaCy, NLTK, Hugging Face Transformers) provide pre-trained models and utilities for quick text processing.

For example, use:
- TF-IDF + Logistic Regression for a baseline.
- Pre-trained embeddings (e.g., Word2Vec, GloVe) for better results.
- Fine-tuned BERT if there is access to good hardware.
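
The TF-IDF + Logistic Regression baseline from the list above could look roughly like the sketch below. It is a minimal illustration using scikit-learn on a handful of made-up sentences, not the project's actual model or data; the toy texts, their labels, and the `baseline` name are assumptions for demonstration only.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Made-up training examples (assumption: not taken from the tweet dataset)
train_texts = [
    "i love my new ipad, great screen",
    "great app, love the design",
    "love the new iphone, great camera",
    "i hate this phone, terrible battery",
    "terrible update, hate the crashes",
    "hate the new android app, terrible design",
]
train_labels = ["positive"] * 3 + ["negative"] * 3

# TF-IDF turns each text into a weighted bag-of-words vector;
# logistic regression then learns one weight per word
baseline = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
baseline.fit(train_texts, train_labels)

predictions = baseline.predict(["great, love it", "terrible, hate it"])
print(list(predictions))
```

Wrapping both steps in one pipeline keeps the vectorizer fitted on the training texts only, which matters once the data is split into training and test sets.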

#### Projected Conclusion





## **Problem statement**

#### Business Problem:
- In today’s digital world, customer feedback plays a critical role in shaping business decisions. Companies receive large volumes of unstructured textual data in the form of reviews, surveys, and social media posts. Analyzing this data manually is time-consuming and error-prone.

The goal of this project is to build a sentiment analysis model that classifies customer feedback as positive, negative, or neutral. 

This will enable businesses to:
- Identify key areas for improvement.
- Tailor marketing strategies based on customer sentiment.
- Monitor brand reputation over time.


## **Objectives**

1. **Primary Objective**:
   - Build a machine learning-based sentiment classification model that categorizes tweets as **positive**, **negative**, or **neutral** towards a brand or product.
   
2. **Secondary Objectives**:
   - Identify whether a tweet contains an emotion directed at a specific brand or product.
   - Preprocess and clean the tweet text to remove noise (e.g., hashtags, mentions, and URLs).
   - Extract key textual features that indicate sentiment and brand-related emotions.
   - Provide actionable insights to help businesses improve customer satisfaction and marketing strategies.
   
   
### **Key Questions to Address**
1. How can we preprocess and clean textual data effectively to extract meaningful insights?
2. What are the best features to use (e.g., word embeddings, TF-IDF, or sentiment lexicons) for classifying tweet sentiment?
3. Which supervised learning models (e.g., Logistic Regression, Random Forest, or BERT) perform best for this task?
4. What level of accuracy, precision, and recall can we achieve for sentiment classification?



## **Metrics of Success**

To evaluate the performance of the sentiment classification model, we will use accuracy, precision, recall (sensitivity), F1-score, and the confusion matrix, with the following targets:

1. **Accuracy**:
   - The percentage of tweets whose sentiment is classified correctly, out of all tweets evaluated.
   - Target: **85% or higher**.

2. **Precision**:
   - The percentage of predictions for a given sentiment that are actually correct, i.e. how often the model is right when it predicts, say, a positive sentiment. Together with recall, it captures the trade-off between false positives and false negatives.
   - Target: **80% or higher** for each class (positive, negative, neutral).

3. **Recall**:
   - Measure the model’s ability to correctly identify all relevant examples of a specific sentiment.
   - Target: **75% or higher** for each class.

4. **F1-Score**:
   - Provide a balanced metric that considers both precision and recall.
   - Target: **80% or higher** overall.

5. **Business Impact**:
   - Improved customer satisfaction through the identification of key negative sentiments.
   - Better marketing strategies based on trends in positive feedback.
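
To make the definitions above concrete, here is a small sketch that computes accuracy and per-class precision, recall, and F1 from two lists of labels. The function name `sentiment_metrics` and the ten example labels are made up for illustration. In practice, `classification_report` from `sklearn.metrics` reports the same quantities.

```python
from collections import Counter

def sentiment_metrics(y_true, y_pred, labels):
    """Return overall accuracy and precision/recall/F1 for each label."""
    # Count how often each (true label, predicted label) pair occurs
    pairs = Counter(zip(y_true, y_pred))
    accuracy = sum(pairs[(label, label)] for label in labels) / len(y_true)
    report = {}
    for label in labels:
        tp = pairs[(label, label)]
        # Predicted as `label` but truly something else
        fp = sum(pairs[(other, label)] for other in labels if other != label)
        # Truly `label` but predicted as something else
        fn = sum(pairs[(label, other)] for other in labels if other != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return accuracy, report

# Made-up labels for ten tweets
labels = ["positive", "negative", "neutral"]
y_true = ["positive", "positive", "positive", "negative", "negative",
          "neutral", "neutral", "neutral", "neutral", "neutral"]
y_pred = ["positive", "positive", "neutral", "negative", "neutral",
          "neutral", "neutral", "neutral", "positive", "neutral"]

accuracy, report = sentiment_metrics(y_true, y_pred, labels)
print(f"accuracy: {accuracy:.2f}")
for label, scores in report.items():
    print(label, {name: round(value, 2) for name, value in scores.items()})
```

Reporting the scores per class matters here because the neutral class is much larger than the other two, so a high overall accuracy can hide poor recall on negative tweets.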


### 2. DATA UNDERSTANDING

In [166]:
# Importing necessary libraries
import pandas as pd
import numpy as np
import re
import matplotlib.pyplot as plt
import seaborn as sns
from wordcloud import WordCloud
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import classification_report, confusion_matrix
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from nltk.stem import WordNetLemmatizer

# Download NLTK resources
nltk.download('stopwords')
nltk.download('punkt')
nltk.download('wordnet')

[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\user\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\user\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package wordnet to
[nltk_data]     C:\Users\user\AppData\Roaming\nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


True

In [167]:
# Update this to your file's path
df = pd.read_csv('judge-1377884607_tweet_product_company.csv', encoding='latin1')
# Display the first few rows of the dataset
print("Dataset Head:")
df.head(10)

Dataset Head:


Unnamed: 0,tweet_text,emotion_in_tweet_is_directed_at,is_there_an_emotion_directed_at_a_brand_or_product
0,.@wesley83 I have a 3G iPhone. After 3 hrs twe...,iPhone,Negative emotion
1,@jessedee Know about @fludapp ? Awesome iPad/i...,iPad or iPhone App,Positive emotion
2,@swonderlin Can not wait for #iPad 2 also. The...,iPad,Positive emotion
3,@sxsw I hope this year's festival isn't as cra...,iPad or iPhone App,Negative emotion
4,@sxtxstate great stuff on Fri #SXSW: Marissa M...,Google,Positive emotion
5,@teachntech00 New iPad Apps For #SpeechTherapy...,,No emotion toward brand or product
6,,,No emotion toward brand or product
7,"#SXSW is just starting, #CTIA is around the co...",Android,Positive emotion
8,Beautifully smart and simple idea RT @madebyma...,iPad or iPhone App,Positive emotion
9,Counting down the days to #sxsw plus strong Ca...,Apple,Positive emotion


In [168]:
df['emotion_in_tweet_is_directed_at'].value_counts().keys()

Index(['iPad', 'Apple', 'iPad or iPhone App', 'Google', 'iPhone',
       'Other Google product or service', 'Android App', 'Android',
       'Other Apple product or service'],
      dtype='object')

In [169]:
df = df.dropna(subset = ['tweet_text'])

In [170]:
sentense = df['tweet_text'][9092]

### Dealing with the text column
1. Basic cleaning: converting to lower case and removing special characters such as `?`, `;`, `.` and `,`
2. Tokenizing the text column
3. Creating a new column with the joined words
4. Removing the stop words

In [171]:
# Regular expressions to extract Twitter usernames (@user) and hashtags (#tag)
pattern = r"@\w+"
pattern_2 = r'#\w+'

# Extract usernames and hashtags from the 'tweet_text' column
df['Usernames'] = df['tweet_text'].apply(lambda x: re.findall(pattern, x))
df['Tagged_Names'] = df['tweet_text'].apply(lambda x: re.findall(pattern_2, x))

df.head(10)

Unnamed: 0,tweet_text,emotion_in_tweet_is_directed_at,is_there_an_emotion_directed_at_a_brand_or_product,Usernames,Tagged_Names
0,.@wesley83 I have a 3G iPhone. After 3 hrs twe...,iPhone,Negative emotion,[@wesley83],"[#RISE_Austin, #SXSW]"
1,@jessedee Know about @fludapp ? Awesome iPad/i...,iPad or iPhone App,Positive emotion,"[@jessedee, @fludapp]",[#SXSW]
2,@swonderlin Can not wait for #iPad 2 also. The...,iPad,Positive emotion,[@swonderlin],"[#iPad, #SXSW]"
3,@sxsw I hope this year's festival isn't as cra...,iPad or iPhone App,Negative emotion,[@sxsw],[#sxsw]
4,@sxtxstate great stuff on Fri #SXSW: Marissa M...,Google,Positive emotion,[@sxtxstate],[#SXSW]
5,@teachntech00 New iPad Apps For #SpeechTherapy...,,No emotion toward brand or product,[@teachntech00],"[#SpeechTherapy, #SXSW, #iear, #edchat, #asd]"
7,"#SXSW is just starting, #CTIA is around the co...",Android,Positive emotion,[],"[#SXSW, #CTIA, #googleio, #android]"
8,Beautifully smart and simple idea RT @madebyma...,iPad or iPhone App,Positive emotion,"[@madebymany, @thenextweb]","[#hollergram, #sxsw]"
9,Counting down the days to #sxsw plus strong Ca...,Apple,Positive emotion,[],[#sxsw]
10,Excited to meet the @samsungmobileus at #sxsw ...,Android,Positive emotion,[@samsungmobileus],"[#sxsw, #fail]"


In [172]:
sentence = df['tweet_text'][10]
sentence

'Excited to meet the @samsungmobileus at #sxsw so I can show them my Sprint Galaxy S still running Android 2.1.   #fail'

##### converting to lower case
Apply to the whole dataframe.

In [173]:
# Transform the whole dataset (df[tweet_text]) to lowercase
df["tweet_text"] = df["tweet_text"].str.lower()

In [174]:
# Display full text
#df.style.set_properties(**{'text-align': 'left'})

In [175]:
sentence = df['tweet_text'][1]
sentence

"@jessedee know about @fludapp ? awesome ipad/iphone app that you'll likely appreciate for its design. also, they're giving free ts at #sxsw"

In [176]:

# Remove every token that contains "@" (mentions) and every hashtag ("#" followed by word characters)
def remove_words_with_at(text):
    cleaned_text = re.sub(r'\S*@\S*|#\w+', '', text)
    return cleaned_text

In [177]:
remove_words_with_at(sentence)

" know about  ? awesome ipad/iphone app that you'll likely appreciate for its design. also, they're giving free ts at "

In [178]:
# Apply the function to the whole dataset and store the result
# in a new column holding the text without @mentions and #hashtags
df["clean_tweet_text"] = df["tweet_text"].apply(remove_words_with_at)
# Display full text
#df.style.set_properties(**{'text-align': 'left'})
df.head()

Unnamed: 0,tweet_text,emotion_in_tweet_is_directed_at,is_there_an_emotion_directed_at_a_brand_or_product,Usernames,Tagged_Names,clean_tweet_text
0,.@wesley83 i have a 3g iphone. after 3 hrs twe...,iPhone,Negative emotion,[@wesley83],"[#RISE_Austin, #SXSW]","i have a 3g iphone. after 3 hrs tweeting at ,..."
1,@jessedee know about @fludapp ? awesome ipad/i...,iPad or iPhone App,Positive emotion,"[@jessedee, @fludapp]",[#SXSW],know about ? awesome ipad/iphone app that yo...
2,@swonderlin can not wait for #ipad 2 also. the...,iPad,Positive emotion,[@swonderlin],"[#iPad, #SXSW]",can not wait for 2 also. they should sale th...
3,@sxsw i hope this year's festival isn't as cra...,iPad or iPhone App,Negative emotion,[@sxsw],[#sxsw],i hope this year's festival isn't as crashy a...
4,@sxtxstate great stuff on fri #sxsw: marissa m...,Google,Positive emotion,[@sxtxstate],[#SXSW],"great stuff on fri : marissa mayer (google), ..."


In [199]:
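# Assumption: `categories` is not defined in the cells shown here; the output
# below is consistent with it being the product labels listed in In [168]
categories = df['emotion_in_tweet_is_directed_at'].value_counts().keys()

def extract_category_words(tweet, categories):
    # Collect every category whose name appears in the tweet
    # (case-insensitive substring match)
    extracted_words = []
    for category in categories:
        if category.lower() in tweet.lower():
            extracted_words.append(category)
    return " ".join(extracted_words)

df['category_words'] = df['clean_tweet_text'].apply(lambda x: extract_category_words(x, categories))
df['category_words'].value_counts()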

Google                        2169
                              1784
iPad                          1714
Apple                         1193
iPhone                        1041
iPad Apple                     569
Android                        241
iPhone Android                 118
iPad iPhone                    103
Android App Android             30
Apple Google                    23
Apple iPhone                    23
iPad Android                    23
Google Android                  15
iPhone Android App Android      10
iPad Google                     10
iPad iPhone Android              8
Apple Android                    7
iPad Apple iPhone                4
Google iPhone                    3
iPad Google iPhone               2
Google iPhone Android            2
Name: category_words, dtype: int64
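
The counts above include combinations such as `Android App Android`, because a plain substring test matches `Android` inside `Android App`. One possible refinement, sketched below, matches whole words only and tries longer names first. The helper name `extract_brand_mentions` and the example sentences are hypothetical, and this is an alternative to consider rather than the approach used in this notebook.

```python
import re

def extract_brand_mentions(text, categories):
    """Return the categories mentioned in `text`, matching whole words only."""
    # Longer names go first so "Android App" is consumed before "Android"
    ordered = sorted(categories, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(name) for name in ordered) + r")\b",
        flags=re.IGNORECASE,
    )
    # Map lower-cased matches back to the original spelling of the category
    lookup = {name.lower(): name for name in categories}
    found = []
    for match in pattern.finditer(text):
        name = lookup[match.group(1).lower()]
        if name not in found:
            found.append(name)
    return found

categories = ["iPad", "Apple", "Google", "iPhone", "Android App", "Android"]
print(extract_brand_mentions("new android app for the ipad", categories))
print(extract_brand_mentions("pineapple juice at the venue", categories))
```

The trade-off is that whole-word matching misses plurals such as "ipads", so the choice depends on whether false matches or missed matches are more costly for the analysis.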

##### Dealing with the blanks


In [180]:

df

Unnamed: 0,tweet_text,emotion_in_tweet_is_directed_at,is_there_an_emotion_directed_at_a_brand_or_product,Usernames,Tagged_Names,clean_tweet_text,category_words
0,.@wesley83 i have a 3g iphone. after 3 hrs twe...,iPhone,Negative emotion,[@wesley83],"[#RISE_Austin, #SXSW]","i have a 3g iphone. after 3 hrs tweeting at ,...",iPhone
1,@jessedee know about @fludapp ? awesome ipad/i...,iPad or iPhone App,Positive emotion,"[@jessedee, @fludapp]",[#SXSW],know about ? awesome ipad/iphone app that yo...,iPad iPhone
2,@swonderlin can not wait for #ipad 2 also. the...,iPad,Positive emotion,[@swonderlin],"[#iPad, #SXSW]",can not wait for 2 also. they should sale th...,
3,@sxsw i hope this year's festival isn't as cra...,iPad or iPhone App,Negative emotion,[@sxsw],[#sxsw],i hope this year's festival isn't as crashy a...,iPhone
4,@sxtxstate great stuff on fri #sxsw: marissa m...,Google,Positive emotion,[@sxtxstate],[#SXSW],"great stuff on fri : marissa mayer (google), ...",Google
...,...,...,...,...,...,...,...
9088,ipad everywhere. #sxsw {link},iPad,Positive emotion,[],[#SXSW],ipad everywhere. {link},iPad
9089,"wave, buzz... rt @mention we interrupt your re...",,No emotion toward brand or product,[@mention],"[#sxsw, #google, #circles]","wave, buzz... rt we interrupt your regularly ...",
9090,"google's zeiger, a physician never reported po...",,No emotion toward brand or product,[],"[#sxsw, #health2dev]","google's zeiger, a physician never reported po...",Google
9091,some verizon iphone customers complained their...,,No emotion toward brand or product,[],[#SXSW],some verizon iphone customers complained their...,iPhone


In [181]:
# Verify our cleaned tweet text
sentence = df['clean_tweet_text'][2347]
sentence

' check out 2011 south by southwest interactive iphone/ipad must-have apps, sites and tools by  {link} '

#### Remove numerical values, and stop words

In [182]:
# Function to remove numerical values using regex
def remove_numbers(text):
    return re.sub(r'\d+','', text)
# Check if the function works with our sentence
clean_sentence = remove_numbers(sentence)
clean_sentence

' check out  south by southwest interactive iphone/ipad must-have apps, sites and tools by  {link} '

In [183]:
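# Apply the function to the column
df['clean_tweet_text'] = df['clean_tweet_text'].apply(remove_numbers)
# Confirm it worked by reading the row back from the DataFrame;
# `sentence` is a separate string and still contains the digits
df['clean_tweet_text'][2347]

' check out  south by southwest interactive iphone/ipad must-have apps, sites and tools by  {link} '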

#### Tokenizing

In [184]:
# Import the regexptokenizer
from nltk.tokenize import RegexpTokenizer

basic_token_pattern = r"(?u)\b\w\w+\b"

tokenizer = RegexpTokenizer(basic_token_pattern)
tokenizer.tokenize(sentence)

['check',
 'out',
 '2011',
 'south',
 'by',
 'southwest',
 'interactive',
 'iphone',
 'ipad',
 'must',
 'have',
 'apps',
 'sites',
 'and',
 'tools',
 'by',
 'link']

In [185]:
# Create new column with tokenized data
df["text_tokenized"] = df["clean_tweet_text"].apply(tokenizer.tokenize)
# Display full text
#df.style.set_properties(**{'text-align': 'left'})
df.head()

Unnamed: 0,tweet_text,emotion_in_tweet_is_directed_at,is_there_an_emotion_directed_at_a_brand_or_product,Usernames,Tagged_Names,clean_tweet_text,category_words,text_tokenized
0,.@wesley83 i have a 3g iphone. after 3 hrs twe...,iPhone,Negative emotion,[@wesley83],"[#RISE_Austin, #SXSW]","i have a g iphone. after hrs tweeting at , i...",iPhone,"[have, iphone, after, hrs, tweeting, at, it, w..."
1,@jessedee know about @fludapp ? awesome ipad/i...,iPad or iPhone App,Positive emotion,"[@jessedee, @fludapp]",[#SXSW],know about ? awesome ipad/iphone app that yo...,iPad iPhone,"[know, about, awesome, ipad, iphone, app, that..."
2,@swonderlin can not wait for #ipad 2 also. the...,iPad,Positive emotion,[@swonderlin],"[#iPad, #SXSW]",can not wait for also. they should sale the...,,"[can, not, wait, for, also, they, should, sale..."
3,@sxsw i hope this year's festival isn't as cra...,iPad or iPhone App,Negative emotion,[@sxsw],[#sxsw],i hope this year's festival isn't as crashy a...,iPhone,"[hope, this, year, festival, isn, as, crashy, ..."
4,@sxtxstate great stuff on fri #sxsw: marissa m...,Google,Positive emotion,[@sxtxstate],[#SXSW],"great stuff on fri : marissa mayer (google), ...",Google,"[great, stuff, on, fri, marissa, mayer, google..."
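
The cleaning plan also lists stop-word removal, which can be applied to token lists like the ones in `text_tokenized`. The sketch below uses a tiny hand-written stop-word set as a stand-in (an assumption made to keep the example self-contained); the full English list is available through `stopwords.words('english')` from `nltk.corpus`, which this notebook already imports.

```python
# Tiny stand-in stop-word set (assumption: replace with the full NLTK list in practice)
STOP_WORDS = {"a", "an", "the", "and", "or", "by", "to", "of", "at",
              "is", "it", "for", "out", "have", "this", "that"}

def remove_stopwords(tokens, stop_words=STOP_WORDS):
    """Keep only the tokens that are not in the stop-word set."""
    return [token for token in tokens if token.lower() not in stop_words]

# Tokens produced by the RegexpTokenizer for the sample sentence above
tokens = ['check', 'out', '2011', 'south', 'by', 'southwest', 'interactive',
          'iphone', 'ipad', 'must', 'have', 'apps', 'sites', 'and', 'tools',
          'by', 'link']

filtered = remove_stopwords(tokens)
print(filtered)
```

Applied to the DataFrame, this would be something like `df["text_tokenized"].apply(remove_stopwords)`. Dataset-specific tokens such as `rt` and `link` could be added to the set if they turn out to carry no sentiment.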


In [186]:
# Function to Load the data
def Load_dataset(data):
    df = pd.read_csv(data, encoding='latin1')
    return df

#show the shape of the dataset
def data_shape(df):
    print(df.shape)
    
#Check for the dataset information   
def check_Info(df):
    print(df.info())

#Show the columns
def Columns(df):
    print(df.columns)
    
# Check the columns data_types
def data_types(df):
    print(df.dtypes)
    
# Check for missing values
def check_for_missing_values(df):
    print(df.isnull().sum())
    
# Check for the duplicates
def Duplicates(df):
    print(df.duplicated().sum())
    
# Descriptive statistics of the data
def Describe_data(df):
    return df.describe()


In [187]:
data_shape(df)

(9092, 8)


In [188]:
check_Info(df)

<class 'pandas.core.frame.DataFrame'>
Int64Index: 9092 entries, 0 to 9092
Data columns (total 8 columns):
 #   Column                                              Non-Null Count  Dtype 
---  ------                                              --------------  ----- 
 0   tweet_text                                          9092 non-null   object
 1   emotion_in_tweet_is_directed_at                     3291 non-null   object
 2   is_there_an_emotion_directed_at_a_brand_or_product  9092 non-null   object
 3   Usernames                                           9092 non-null   object
 4   Tagged_Names                                        9092 non-null   object
 5   clean_tweet_text                                    9092 non-null   object
 6   category_words                                      9092 non-null   object
 7   text_tokenized                                      9092 non-null   object
dtypes: object(8)
memory usage: 959.3+ KB
None


In [189]:
Columns(df)

Index(['tweet_text', 'emotion_in_tweet_is_directed_at',
       'is_there_an_emotion_directed_at_a_brand_or_product', 'Usernames',
       'Tagged_Names', 'clean_tweet_text', 'category_words', 'text_tokenized'],
      dtype='object')


In [195]:
check_for_missing_values(df)

tweet_text                                               0
emotion_in_tweet_is_directed_at                       5801
is_there_an_emotion_directed_at_a_brand_or_product       0
Usernames                                                0
Tagged_Names                                             0
clean_tweet_text                                         0
category_words                                           0
text_tokenized                                           0
tweet_tokenized                                          0
dtype: int64


In [191]:
Describe_data(df)

Unnamed: 0,tweet_text,emotion_in_tweet_is_directed_at,is_there_an_emotion_directed_at_a_brand_or_product,Usernames,Tagged_Names,clean_tweet_text,category_words,text_tokenized
count,9092,3291,9092,9092,9092,9092,9092,9092
unique,9047,9,4,56,2059,8990,22,8657
top,rt @mention marissa mayer: google will connect...,iPad,No emotion toward brand or product,[],[#sxsw],rt marissa mayer: google will connect the dig...,Google,"[rt, google, to, launch, major, new, social, n..."
freq,9,946,5388,4173,2488,9,2169,25
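
The summary above shows 9092 rows but only 9047 unique `tweet_text` values, so some tweets appear more than once. A minimal sketch of counting and dropping such repeats with pandas follows; the three-row `sample` frame is made up for illustration, and whether duplicates should be dropped at all is a modelling decision this notebook has not made yet.

```python
import pandas as pd

# Made-up miniature of the dataset with one repeated tweet
sample = pd.DataFrame({
    "tweet_text": ["rt google to launch", "ipad everywhere", "rt google to launch"],
    "label": ["neutral", "positive", "neutral"],
})

# Rows whose text already appeared earlier in the frame
n_duplicates = sample.duplicated(subset=["tweet_text"]).sum()

# Keep the first occurrence of each tweet and renumber the rows
deduped = sample.drop_duplicates(subset=["tweet_text"], keep="first").reset_index(drop=True)

print(n_duplicates)
print(deduped)
```

Restricting the check to `tweet_text` also sidesteps the list-valued columns (`Usernames`, `Tagged_Names`, `text_tokenized`), which pandas cannot hash when looking for fully duplicated rows.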


In [192]:
token = word_tokenize(sentence)
token

['check',
 'out',
 '2011',
 'south',
 'by',
 'southwest',
 'interactive',
 'iphone/ipad',
 'must-have',
 'apps',
 ',',
 'sites',
 'and',
 'tools',
 'by',
 '{',
 'link',
 '}']

In [193]:
df['tweet_tokenized'] = df['tweet_text'].apply(word_tokenize)


In [194]:

# Clean text (remove unwanted characters and convert to lowercase)
def clean_text(text):
    if not isinstance(text, str):
        return ''
    text = re.sub(r'@[A-Za-z0-9_]+', '', text)  # Remove mentions (@user)
    text = re.sub(r'http\S+', '', text)  # Remove URLs
    text = re.sub(r'[^a-zA-Z\s]', '', text)  # Remove non-alphabetic characters
    text = text.lower()  # Convert text to lowercase
    return text

#df['cleaned_tweet'] = df['tweet_text'].apply(clean_text)
cleaned_sentense = clean_text(sentense)
cleaned_sentense

'rt  google tests checkin offers at sxsw link'

In [15]:
# Assumption: the original cell left the index empty (`df[]`), which is a
# syntax error; tokenizing the cleaned sample sentence is one plausible intent
tokens = nltk.word_tokenize(cleaned_sentense)

In [None]:
df['User_name'] = []

In [None]:
df['emotion_in_tweet_is_directed_at'].isna().value_counts(normalize = True)

In [None]:
# Display dataset information
print("\nDataset Information:")
df.info()

In [None]:
# Display value counts for the target variable
print("\nClass Distribution:")
print(df['is_there_an_emotion_directed_at_a_brand_or_product'].value_counts())
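
The raw labels in this column are long strings, so it may help to map them to the short class names used in the objectives. The sketch below covers only the three labels visible in the rows shown earlier. The summary table reports four distinct values in this column, so any other label is mapped to `None` here rather than guessed. `LABEL_MAP`, `simplify_label`, and `keep_binary` are hypothetical names.

```python
# Raw dataset label -> short class name (only the labels visible in the rows above)
LABEL_MAP = {
    "Positive emotion": "positive",
    "Negative emotion": "negative",
    "No emotion toward brand or product": "neutral",
}

def simplify_label(raw_label):
    """Map a raw label to positive/negative/neutral, or None if unrecognized."""
    return LABEL_MAP.get(raw_label)

def keep_binary(labels):
    """Drop neutral and unrecognized labels for a binary positive/negative task."""
    return [label for label in labels if label in ("positive", "negative")]

raw_labels = [
    "Negative emotion",
    "Positive emotion",
    "No emotion toward brand or product",
    "Positive emotion",
]
simplified = [simplify_label(label) for label in raw_labels]
binary = keep_binary(simplified)
print(simplified)
print(binary)
```

On the DataFrame itself, `df['is_there_an_emotion_directed_at_a_brand_or_product'].map(LABEL_MAP)` performs the same mapping and leaves unmapped labels as missing values.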
