# Building a Twitter Sentiment Analysis Tool (Because who doesn't love a good Apple vs. Google debate?)

![Twitter illustration](Twitter_pic.avif)

### Table of Contents

1. Introduction
   - 1.1 Background
   - 1.2 Objectives
   - 1.3 Scope of the Project
   - 1.4 Significance of the Study

2. Business Understanding
   - 2.1 Stakeholders
   - 2.2 Problem Statement
   - 2.3 Objectives
   - 2.4 Metrics for Success

3. Data Preparation
   - 3.1 Data Understanding
   - 3.2 Data Preprocessing
   - 3.3 Data Cleaning
   - 3.4 Data Integration

4. Exploratory Data Analysis (EDA)
   - 4.1 Descriptive Statistics

7. Machine Learning Models
   - 7.1 Model Selection
   - 7.2 Feature Engineering
   - 7.3 Model Training
   - 7.4 Model Evaluation

8. Visualization and Reporting
   - 8.1 Visualization of Key Insights
   - 8.2 Statistical Results
   - 8.3 Implications for the Coding Community
   - 8.4 Recommendations for Stakeholders

9. Conclusion
   - 9.1 Summary of Findings
   - 9.2 Recommendations
   - 9.3 Future Research Directions



## 1. Introduction: Unveiling the Voices of Twitter

**1.1 Background**

Have you ever wondered what people are really saying about Apple and Google products on Twitter?  Sure, we see the flashy ads and press releases, but what about the tweets from everyday users?  This project dives into the world of Social Media Sentiment Analysis to understand how people feel about these tech giants.

**1.2 Objectives**

Let's build a basic sentiment analysis tool! This **proof-of-concept** model will categorize tweets about Apple and Google products by sentiment: positive or negative for now, with neutral as a possible later addition.
We'll then evaluate its performance to see how well it can actually decipher people's opinions.

**1.3 Scope of the Project**

For now, we'll focus on building a model that can distinguish between positive and negative tweets.  In the future, we might even be able to include neutral tweets to get a more complete picture.

**1.4 Significance of the Study**

This project is a stepping stone towards **real-time social media listening**. Imagine being able to track how people feel about your latest product launch **as it happens**. This can help us make data-driven decisions, improve brand reputation, and stay ahead of the curve.

## 2. Business Understanding

**2.1 Stakeholders**

- **Marketing Ninjas:** This tool helps you understand brand reputation and customer sentiment on social media.

- **Product Development Heroes:** This one's for you too! See what users are saying about Apple and Google products to guide future development.

**2.2 Problem Statement**

Traditionally, gauging customer sentiment involved surveys and focus groups. But these methods are slow and don't provide real-time insights. We need a faster, more scalable way to analyze public opinion on social media, and that's where this project comes in!

**2.3 Objectives**

- Develop an NLP model that automatically sorts tweets about tech products into positive or negative piles.
- Give stakeholders a tool to track and analyze real-time sentiment shifts on Twitter.
- Identify areas for product improvement based on what users are tweeting.

**2.4 Metrics for Success**

- How accurate is our model at classifying Tweets? Can it tell the difference between a happy Apple user and a frustrated Google customer?
- Is the model good at finding relevant Tweets about these brands in the vast Twitterverse?
- (For a future stage) How easy and user-friendly is the sentiment analysis tool for stakeholders?
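
To make the first two questions concrete, here is a minimal sketch of how accuracy, precision, and recall could be computed for a binary classifier. Reading the second bullet as precision and recall is an assumption, and the labels below are hypothetical examples rather than results from this project. The helper `binary_metrics` is also made up for illustration; scikit-learn's `sklearn.metrics` module provides `accuracy_score`, `precision_score`, and `recall_score` for the same purpose.

```python
# Hypothetical true and predicted labels, for illustration only.
y_true = ["positive", "negative", "positive", "positive", "negative", "negative"]
y_pred = ["positive", "negative", "negative", "positive", "positive", "negative"]


def binary_metrics(y_true, y_pred, positive_label="positive"):
    """Return accuracy, precision, and recall, treating one class as 'positive'."""
    pairs = list(zip(y_true, y_pred))
    # True positives, false positives, and false negatives for the chosen class.
    tp = sum(t == positive_label and p == positive_label for t, p in pairs)
    fp = sum(t != positive_label and p == positive_label for t, p in pairs)
    fn = sum(t == positive_label and p != positive_label for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    return {
        "accuracy": correct / len(pairs) if pairs else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Accuracy alone can look good when one class dominates the data, which is why precision and recall are worth tracking alongside it.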

## 3. Data Preparation

**3.1 Data Understanding**

We'll be using a dataset from CrowdFlower via `data.world`. This treasure trove contains over 9,000 tweets labeled as positive, negative, or neutral regarding Apple and Google products. But before we jump in and build our model, let's get to know this data a little better.

| Feature | Description |
|---------|-------------|
| **tweet_text** | The text content of the tweet. |
| **emotion_in_tweet_is_directed_at** | The product or brand the tweet's emotion is directed at (for example, iPhone, iPad, or Google). |
| **is_there_an_emotion_directed_at_a_brand_or_product** | The sentiment label for the tweet (for example, "Positive emotion" or "Negative emotion"). |

Understanding the data thoroughly lets us pre-process and clean it effectively before building the sentiment analysis model. This initial exploration helps ensure the model is trained on high-quality data, which should improve how well it classifies tweet sentiment.
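
As a rough sketch of where this pre-processing could lead, the snippet below renames the two long columns and keeps only positive and negative rows, matching the project scope. The rows are hypothetical stand-ins, the short column names `product` and `sentiment` are assumptions, and so is the `target` encoding. The label strings "Positive emotion" and "Negative emotion" are the ones that appear in the dataset preview.

```python
import pandas as pd

# Hypothetical stand-in rows; the real data comes from the CSV file.
df = pd.DataFrame({
    "tweet_text": ["Love my new iPad", "Maps keeps crashing", "Heading to the keynote"],
    "emotion_in_tweet_is_directed_at": ["iPad", "Google", None],
    "is_there_an_emotion_directed_at_a_brand_or_product": [
        "Positive emotion",
        "Negative emotion",
        "Some other label (hypothetical)",
    ],
})

# Shorter column names are an assumption made for readability.
df = df.rename(columns={
    "emotion_in_tweet_is_directed_at": "product",
    "is_there_an_emotion_directed_at_a_brand_or_product": "sentiment",
})

# Keep only the two classes in scope and encode them as 1 / 0.
binary_labels = {"Positive emotion": 1, "Negative emotion": 0}
binary_df = df[df["sentiment"].isin(list(binary_labels))].copy()
binary_df["target"] = binary_df["sentiment"].map(binary_labels)

print(binary_df[["tweet_text", "product", "target"]])
```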


**3.2 Importing Libraries**

```python
import pandas as pd
```

Next, we explore the data to get a glimpse of:

- The data itself
- Its info (column types and non-null counts)
- Its shape
- A statistical summary
- Missing values
- Duplicates

```python
# Load the tweets; the file is read with Latin-1 encoding
df = pd.read_csv('tweet_product_company.csv', encoding='latin-1')
# Preview the first five rows
df.head()
```

|   | tweet_text | emotion_in_tweet_is_directed_at | is_there_an_emotion_directed_at_a_brand_or_product |
|---|------------|---------------------------------|----------------------------------------------------|
| 0 | .@wesley83 I have a 3G iPhone. After 3 hrs twe... | iPhone | Negative emotion |
| 1 | @jessedee Know about @fludapp ? Awesome iPad/i... | iPad or iPhone App | Positive emotion |
| 2 | @swonderlin Can not wait for #iPad 2 also. The... | iPad | Positive emotion |
| 3 | @sxsw I hope this year's festival isn't as cra... | iPad or iPhone App | Negative emotion |
| 4 | @sxtxstate great stuff on Fri #SXSW: Marissa M... | Google | Positive emotion |
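
The remaining checks from the exploration list (info, shape, statistical summary, missing values, and duplicates) could be sketched as follows. To keep the snippet runnable without the CSV, it uses a tiny stand-in DataFrame with hypothetical rows; on the real data, the same calls would be made on the `df` loaded above.

```python
import pandas as pd

# Tiny stand-in DataFrame with hypothetical rows (one duplicate, one gap).
df = pd.DataFrame({
    "tweet_text": ["Love my new iPad", "Love my new iPad", None, "Maps keeps crashing"],
    "emotion_in_tweet_is_directed_at": ["iPad", "iPad", None, "Google"],
    "is_there_an_emotion_directed_at_a_brand_or_product": [
        "Positive emotion",
        "Positive emotion",
        "Negative emotion",
        "Negative emotion",
    ],
})

# Info: column dtypes and non-null counts (printed to stdout).
df.info()

# Shape: number of rows and columns.
n_rows, n_cols = df.shape

# Statistical summary; include="all" also covers the text columns.
summary = df.describe(include="all")

# Missing values per column.
missing_per_column = df.isnull().sum()

# Number of fully duplicated rows (first occurrence is not counted).
n_duplicates = int(df.duplicated().sum())

print(n_rows, n_cols)
print(summary)
print(missing_per_column)
print(n_duplicates)
```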
