# Twitter Sentiment Analysis (Notebook First)

This notebook now supports **two modes**:
1. **Demo mode (default):** runs with local sample tweets and shows insights without any credentials.
2. **Twitter API mode (optional):** fetches live tweets if you provide your own API credentials.


## Why this change?
Demo mode lets anyone viewing this repository open the notebook and see outputs and insights immediately, while the optional API mode keeps a path to live Twitter data.


## Simple note for viewers
This notebook is **fully standalone** and can be understood on its own. You do **not** need to read `src/` or `tests/` to follow the analysis here.


In [1]:
# Core imports
import os
import re
from pathlib import Path

import pandas as pd
from textblob import TextBlob
import matplotlib.pyplot as plt
from wordcloud import WordCloud, STOPWORDS

plt.style.use('fivethirtyeight')


In [2]:
# Configuration
USE_TWITTER_API = False  # Set True only if you want live fetch
TWITTER_USERNAME = 'BillGates'
SAMPLE_CSV_PATH = Path('data/sample_tweets.csv')

print('Mode selected: DEMO (local sample CSV, no credentials required)' if not USE_TWITTER_API else 'Mode selected: TWITTER API')


Mode selected: DEMO (local sample CSV, no credentials required)


In [3]:
# Load tweets (demo mode by default, optional API mode)
if USE_TWITTER_API:
    import tweepy

    bearer_token = os.getenv('TWITTER_BEARER_TOKEN')
    if not bearer_token:
        raise ValueError('Set TWITTER_BEARER_TOKEN environment variable for API mode.')

    client = tweepy.Client(bearer_token=bearer_token, wait_on_rate_limit=True)
    user = client.get_user(username=TWITTER_USERNAME)
    if user.data is None:
        raise ValueError(f'User not found: {TWITTER_USERNAME}')

    response = client.get_users_tweets(user.data.id, max_results=100, tweet_fields=['created_at', 'text'])
    tweets = [t.text for t in (response.data or [])]
    df = pd.DataFrame({'Tweets': tweets})
    print(f'Loaded tweets from Twitter user: {TWITTER_USERNAME}')
else:
    df = pd.read_csv(SAMPLE_CSV_PATH).rename(columns={'text': 'Tweets'})
    print(f'Loaded tweets from local file: {SAMPLE_CSV_PATH}')

print(f'Total tweets: {len(df)}')


Loaded tweets from local file: data/sample_tweets.csv
Total tweets: 15
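
The demo branch above assumes the sample CSV has a `text` column. `rename` does nothing when that column is missing, so the problem would only surface later at `df['Tweets']`. Below is a minimal sketch of an explicit check, written with the standard library so it runs without pandas. The helper name `read_tweet_texts` and the in-memory sample rows are hypothetical and only for illustration:

```python
import csv
import io


def read_tweet_texts(handle, column="text"):
    """Return the values of `column` from a CSV file object (hypothetical helper)."""
    reader = csv.DictReader(handle)
    # Fail early with a clear message if the expected column is absent
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise ValueError(f"Expected a {column!r} column, found: {reader.fieldnames}")
    return [row[column] for row in reader]


# In-memory stand-in for data/sample_tweets.csv (made-up rows, not the real file)
sample = io.StringIO("text\nGreat update!\nThe app feels slower now.\n")
tweets = read_tweet_texts(sample)
print(len(tweets))
```

The same idea could be applied in the notebook by checking `'text' in df.columns` before renaming.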


In [4]:
# Preview data
df.head()


                                              Tweets
0  I absolutely love the new AI features in this ...
1  The service was okay, nothing special but not ...
2  Support team resolved my issue quickly, great ...
3  This update made the app slower and more confu...
4  Really happy with the clean interface and fast...

In [5]:
# Clean text
def clean_text(text):
    text = re.sub(r'@[A-Za-z0-9_]+', '', str(text))
    text = re.sub(r'#', '', text)
    text = re.sub(r'\bRT\s+', '', text)  # \b keeps 'RT' inside words (e.g. 'START now') intact
    text = re.sub(r'https?://\S+', '', text)
    text = re.sub(r'\s+', ' ', text)
    return text.strip()

def get_subjectivity(text):
    return TextBlob(text).sentiment.subjectivity

def get_polarity(text):
    return TextBlob(text).sentiment.polarity

def get_analysis(score):
    if score > 0:
        return 'Positive'
    elif score < 0:
        return 'Negative'
    return 'Neutral'

df['Clean_Tweets'] = df['Tweets'].apply(clean_text)
df['Subjectivity'] = df['Clean_Tweets'].apply(get_subjectivity)
df['Polarity'] = df['Clean_Tweets'].apply(get_polarity)
df['Analysis'] = df['Polarity'].apply(get_analysis)
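
To make the cleaning steps concrete, here is a small standalone sketch that applies the same sequence of substitutions to a made-up tweet. The input strings are invented for illustration, and the `RT` pattern here uses a word boundary (`\b`) so that text such as `START now` is left intact:

```python
import re


def clean_text(text):
    text = re.sub(r'@[A-Za-z0-9_]+', '', str(text))  # drop @mentions
    text = re.sub(r'#', '', text)                    # keep hashtag words, drop '#'
    text = re.sub(r'\bRT\s+', '', text)              # drop the retweet marker
    text = re.sub(r'https?://\S+', '', text)         # drop URLs
    text = re.sub(r'\s+', ' ', text)                 # collapse whitespace
    return text.strip()


# Made-up examples, not taken from the sample CSV
retweet = clean_text("RT @BillGates: Loving the new #AI tools!   https://t.co/abc123")
plain = clean_text("START now")
print(repr(retweet))
print(repr(plain))
```

Note that the colon that followed the mention survives as a leading `: ` in the first result, since none of the patterns target punctuation.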


In [6]:
# Quick sentiment distribution
df['Analysis'].value_counts()


Positive    7
Neutral     4
Negative    4
Name: Analysis, dtype: int64
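
The raw counts above are easier to compare as shares of the total. A minimal sketch using only the standard library, with the counts copied from the output above (the variable names are illustrative):

```python
counts = {"Positive": 7, "Neutral": 4, "Negative": 4}  # copied from the output above
total = sum(counts.values())

# Share of each label as a percentage, rounded to one decimal place
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}
for label, pct in shares.items():
    print(f"{label}: {pct}%")
```

In the notebook itself, `df['Analysis'].value_counts(normalize=True)` returns the same proportions as fractions.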

In [7]:
# Plot sentiment counts
plt.figure(figsize=(6, 4))
df['Analysis'].value_counts().reindex(['Positive', 'Neutral', 'Negative'], fill_value=0).plot(kind='bar')
plt.title('Sentiment Analysis Summary')
plt.xlabel('Sentiment')
plt.ylabel('Count')
plt.tight_layout()
plt.show()


In [8]:
# Word cloud
all_words = ' '.join(df['Clean_Tweets'])
stopwords = set(STOPWORDS)
stopwords.update(['co', 'amp', 'https'])
wordcloud = WordCloud(stopwords=stopwords, background_color='white', width=900, height=450).generate(all_words)

plt.figure(figsize=(10, 5))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis('off')
plt.tight_layout()
plt.show()
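
A word cloud sizes each word by how often it occurs once stopwords are removed. As a rough sketch of that counting step (not WordCloud's actual tokenizer), the snippet below uses the standard library and a tiny hand-picked stopword set in place of `STOPWORDS`. The sample sentences are made up:

```python
import re
from collections import Counter

# Made-up cleaned tweets standing in for df['Clean_Tweets']
clean_tweets = [
    "Great update, the app is fast",
    "The app is slower after the update",
    "Fast support and a great app",
]
# Tiny illustrative stopword list; the notebook uses wordcloud.STOPWORDS plus 'co', 'amp', 'https'
stopwords = {"the", "is", "and", "a", "after", "co", "amp", "https"}

# Lowercase, split into words, and count everything that is not a stopword
words = re.findall(r"[a-z']+", " ".join(clean_tweets).lower())
frequencies = Counter(w for w in words if w not in stopwords)
print(frequencies.most_common(3))
```

Printing the top terms this way is a quick check that the cloud is not dominated by leftover noise tokens.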


## Final note
- For a quick demo or repository browsing, keep `USE_TWITTER_API = False`.
- To fetch live tweets, set `USE_TWITTER_API = True` and set the `TWITTER_BEARER_TOKEN` environment variable.


## Preview visuals (already visible on GitHub)
These are static preview visuals from the demo sample so visitors can immediately see expected output patterns without executing code.

![Sentiment chart preview](data/demo_sentiment_bar.svg)

![Word cloud preview](data/demo_wordcloud.svg)
