# Natural Language Processing (NLP) Project: Sentiment Analysis on Tweets about Apple and Google Products


## Overview
This project aims to analyze Twitter sentiment about Apple and Google products using Natural Language Processing (NLP). The dataset contains tweets labeled as positive, negative, or neutral. By building a sentiment analysis model, we aim to categorize the sentiment of tweets accurately and gain insights into public perception of these tech giants' products.

## Business Understanding
### Business Problem
Understanding customer sentiment is critical for businesses to gauge public opinion and improve products or services. For Apple and Google, analyzing Twitter sentiment can provide actionable insights to enhance user satisfaction and market strategies.

### Stakeholders
1. `Marketing Teams`:
   - Use insights to create targeted campaigns focusing on products with positive sentiment.
   - Address negative feedback to improve brand perception.
2. `Product Teams`:
   - Identify areas of improvement for specific products (e.g., iPhone or Pixel).
3. `Executives`:
   - Make data-driven decisions for product launches, pricing strategies, and market positioning.

### Objectives
1. Build a model to classify the sentiment of tweets into positive, negative, or neutral categories.
2. Evaluate model performance using suitable metrics.
3. Provide insights and recommendations based on the analysis results.

## Data Understanding
This section explores the dataset to understand its structure, content, and quality before any modeling.

In [1]:
# Import necessary libraries
import pandas as pd

In [2]:
# Load the data
df = pd.read_csv('sentiment.csv', encoding='unicode_escape')

In [3]:
# View the first five rows of the dataset to see if loading has been done correctly
df.head()

|   | tweet_text | emotion_in_tweet_is_directed_at | is_there_an_emotion_directed_at_a_brand_or_product |
|---|---|---|---|
| 0 | .@wesley83 I have a 3G iPhone. After 3 hrs twe... | iPhone | Negative emotion |
| 1 | @jessedee Know about @fludapp ? Awesome iPad/i... | iPad or iPhone App | Positive emotion |
| 2 | @swonderlin Can not wait for #iPad 2 also. The... | iPad | Positive emotion |
| 3 | @sxsw I hope this year's festival isn't as cra... | iPad or iPhone App | Negative emotion |
| 4 | @sxtxstate great stuff on Fri #SXSW: Marissa M... | Google | Positive emotion |


In [4]:
# View the overall information of each column
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 9093 entries, 0 to 9092
Data columns (total 3 columns):
 #   Column                                              Non-Null Count  Dtype 
---  ------                                              --------------  ----- 
 0   tweet_text                                          9092 non-null   object
 1   emotion_in_tweet_is_directed_at                     3291 non-null   object
 2   is_there_an_emotion_directed_at_a_brand_or_product  9093 non-null   object
dtypes: object(3)
memory usage: 213.2+ KB


In [5]:
# Summary statistics of columns
df.describe()

|   | tweet_text | emotion_in_tweet_is_directed_at | is_there_an_emotion_directed_at_a_brand_or_product |
|---|---|---|---|
| count | 9092 | 3291 | 9093 |
| unique | 9065 | 9 | 4 |
| top | RT @mention Marissa Mayer: Google Will Connect... | iPad | No emotion toward brand or product |
| freq | 5 | 946 | 5389 |


In [6]:
# List the distinct sentiment labels
df['is_there_an_emotion_directed_at_a_brand_or_product'].unique()

array(['Negative emotion', 'Positive emotion',
       'No emotion toward brand or product', "I can't tell"], dtype=object)
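
The four raw labels above do not line up one-to-one with the three target classes (positive, negative, neutral) named in the objectives. The sketch below shows one possible way to collapse them. The mapping itself is an assumption, not a choice confirmed by this project: `LABEL_MAP` and `simplify_labels` are hypothetical names, and leaving "I can't tell" unmapped (so it becomes missing) is only one option.

```python
import pandas as pd

# Raw label values, copied from the unique() output above.
RAW_LABELS = [
    "Negative emotion",
    "Positive emotion",
    "No emotion toward brand or product",
    "I can't tell",
]

# Hypothetical mapping (an assumption for illustration).
# "I can't tell" is deliberately left out, so Series.map turns it into NaN.
LABEL_MAP = {
    "Negative emotion": "negative",
    "Positive emotion": "positive",
    "No emotion toward brand or product": "neutral",
}


def simplify_labels(labels: pd.Series) -> pd.Series:
    """Map raw labels to three classes; unmapped labels become missing (NaN)."""
    return labels.map(LABEL_MAP)


# Tiny illustrative sample, not rows from sentiment.csv.
sample = pd.Series(RAW_LABELS + ["Positive emotion"])
simplified = simplify_labels(sample)

# Count each class, including the rows that were left unmapped.
print(simplified.value_counts(dropna=False))
```

On the real data, the same function could be applied to `df['is_there_an_emotion_directed_at_a_brand_or_product']`, after which the ambiguous rows could be dropped or handled separately.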

### Data Overview
The dataset contains 9,093 tweets, each carrying one of four raw sentiment labels: positive, negative, no emotion, or "I can't tell".
- **Rows**: 9,093.
- **Columns**: 3.
- **Column Names**: 
     - `tweet_text` - Content of the tweet.
     - `emotion_in_tweet_is_directed_at` - Brand or product the emotion in the tweet is directed at.
     - `is_there_an_emotion_directed_at_a_brand_or_product` - Sentiment label.
- **Data Types**:
     - `Object (string) Columns`: All 3 columns. `tweet_text` holds free text; the other two hold categorical values.
- **Missing Values**:
     - `tweet_text`: 1 missing value.
     - `emotion_in_tweet_is_directed_at`: 5,802 missing values.
     - `is_there_an_emotion_directed_at_a_brand_or_product`: No missing values.
- **Unique Values**:
     - `tweet_text`: 9,065 unique tweets.
     - `emotion_in_tweet_is_directed_at`: 9 unique products.
     - `is_there_an_emotion_directed_at_a_brand_or_product`: 4 unique sentiment labels ('Negative emotion', 'Positive emotion', 'No emotion toward brand or product', "I can't tell").
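
The missing-value and unique-value figures listed above can be derived directly with pandas rather than read off `df.info()` and `df.describe()`. The following is a minimal sketch under stated assumptions: `summarize_columns` is a hypothetical helper, and the toy frame uses made-up rows so the block runs without `sentiment.csv`.

```python
import pandas as pd


def summarize_columns(frame: pd.DataFrame) -> pd.DataFrame:
    """Return the dtype, missing count, and unique non-null count per column."""
    return pd.DataFrame({
        "dtype": frame.dtypes.astype(str),
        "missing": frame.isna().sum(),
        "unique": frame.nunique(),  # nunique() ignores missing values by default
    })


# Illustrative toy frame (made-up rows, not taken from sentiment.csv).
toy = pd.DataFrame({
    "tweet_text": ["great phone", None, "great phone"],
    "emotion_in_tweet_is_directed_at": ["iPhone", None, None],
    "is_there_an_emotion_directed_at_a_brand_or_product": [
        "Positive emotion",
        "No emotion toward brand or product",
        "Positive emotion",
    ],
})

summary = summarize_columns(toy)
print(summary)
```

Calling `summarize_columns(df)` on the loaded dataset should reproduce the counts in the overview, for example 5,802 missing values (9,093 rows minus 3,291 non-null) in `emotion_in_tweet_is_directed_at`.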