# 🐦 Twitter Sentiment Analysis
## 📌 Project Overview
- Build a sentiment analysis tool for tweets.
- Classify tweets as **Positive**, **Negative**, or **Neutral**.
- Preprocess the text with **NLTK** and convert it to **TF-IDF** features.
- Train a **Multinomial Naive Bayes** model to predict sentiment.

## 📥 Step 1: Import Libraries
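Import the libraries for data handling (pandas), text preprocessing (`re`, NLTK), and feature extraction, modeling, and evaluation (scikit-learn).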

In [None]:
import pandas as pd
import numpy as np
import re
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score, classification_report

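# Tokenizer data and the stopword list used in Step 3.
nltk.download('punkt')
nltk.download('punkt_tab')  # word_tokenize looks for this resource in recent NLTK releases
nltk.download('stopwords')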

## 📂 Step 2: Load Dataset
We'll build a small in-memory dataset of seven example tweets.

In [None]:
data = {
    'Tweet': [
        "I love the new design of this app!",
        "This service is terrible, I hate it!",
        "It's okay, nothing special.",
        "Absolutely amazing performance!",
        "Worst experience ever, very bad.",
        "Happy to use this product every day!",
        "Not good, disappointed with the update."
    ]
}

df = pd.DataFrame(data)
df.head()

## 🧹 Step 3: Clean and Preprocess Tweets
- Remove links, then every character that is not a letter or whitespace (punctuation, digits, emoji)
- Lowercase
- Tokenize and remove English stopwords

In [None]:
# Build the stopword set once instead of on every call.
STOP_WORDS = set(stopwords.words('english'))

def clean_text(text):
    text = re.sub(r"http\S+|www\S+", '', text)  # drop URLs
    text = re.sub(r'[^A-Za-z\s]', '', text)     # keep letters and whitespace only
    text = text.lower()
    tokens = word_tokenize(text)
    filtered = [word for word in tokens if word not in STOP_WORDS]
    return ' '.join(filtered)

df['Cleaned_Tweet'] = df['Tweet'].apply(clean_text)
df.head()
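
To see what each cleaning stage contributes, here is a minimal standard-library sketch that traces one tweet through the same sequence of steps. It is an illustration only: `TINY_STOP_WORDS` is a hypothetical stand-in for NLTK's English stopword list, `str.split` replaces `word_tokenize`, and the sample tweet is made up, so results can differ from `clean_text` on other inputs.

```python
import re

# Hypothetical, deliberately tiny stand-in for stopwords.words('english').
TINY_STOP_WORDS = {"the", "of", "this", "is", "a", "at", "out"}

def trace_cleaning(text):
    """Return the intermediate result of every cleaning stage."""
    no_links = re.sub(r"http\S+|www\S+", "", text)        # drop URLs
    letters_only = re.sub(r"[^A-Za-z\s]", "", no_links)   # drop punctuation and digits
    lowered = letters_only.lower()
    tokens = lowered.split()                              # whitespace tokenization
    kept = [t for t in tokens if t not in TINY_STOP_WORDS]
    return {
        "no_links": no_links,
        "letters_only": letters_only,
        "lowered": lowered,
        "cleaned": " ".join(kept),
    }

stages = trace_cleaning("Check out https://example.com, this app is GREAT!!!")
for name, value in stages.items():
    print(f"{name:>12}: {value!r}")
```

Note that the URL pattern also swallows punctuation attached to the link, since `\S+` runs to the next whitespace.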

## 🏷️ Step 4: Assign Sentiment Labels
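For this demo, labels come from a simple keyword rule rather than human annotation, so the model trained later can at best learn to reproduce that rule.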

In [None]:
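POSITIVE_WORDS = {"love", "happy", "amazing", "good"}
NEGATIVE_WORDS = {"hate", "bad", "terrible", "worst", "disappointed"}

def get_sentiment(text):
    # Match whole words, not substrings, so "bad" does not fire on "badge".
    words = set(text.split())
    # Negative cues win: stopword removal drops "not", so "not good" arrives as "good".
    if words & NEGATIVE_WORDS:
        return "Negative"
    elif words & POSITIVE_WORDS:
        return "Positive"
    else:
        return "Neutral"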

df['Sentiment'] = df['Cleaned_Tweet'].apply(get_sentiment)
df.head()
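
Keyword rules applied to `Cleaned_Tweet` cannot see negation, because NLTK's English stopword list contains "not" and Step 3 removes it. As a hedged sketch (not the labeling used in this notebook), the rule below works on the raw tweet instead and flips a keyword's polarity when it directly follows a negator. The word lists and the name `label_with_negation` are illustrative assumptions.

```python
import re

# Illustrative word lists; extend them for real data.
POSITIVE = {"love", "happy", "amazing", "good"}
NEGATIVE = {"hate", "bad", "terrible", "worst"}
NEGATORS = {"not", "no", "never"}

def label_with_negation(raw_text):
    """Score whole-word keyword hits, flipping a hit preceded by a negator."""
    words = re.findall(r"[a-z']+", raw_text.lower())
    score = 0
    for i, word in enumerate(words):
        negated = i > 0 and words[i - 1] in NEGATORS
        if word in POSITIVE:
            score += -1 if negated else 1
        elif word in NEGATIVE:
            score += 1 if negated else -1
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"

for tweet in ["Not good, disappointed with the update.",
              "I love the new design of this app!",
              "It's okay, nothing special."]:
    print(label_with_negation(tweet), "<-", tweet)
```

This only handles a negator immediately before the keyword; phrases such as "not very good" would need a wider window.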

## 🔡 Step 5: TF-IDF Vectorization
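Convert each cleaned tweet into a numeric vector of TF-IDF weights, where a word scores high if it is frequent in that tweet but rare across the dataset.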

In [None]:
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(df['Cleaned_Tweet'])
y = df['Sentiment']
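
For intuition, the weighting can be reproduced by hand. The sketch below assumes scikit-learn's documented defaults for `TfidfVectorizer` (raw term counts, smoothed IDF `ln((1 + n) / (1 + df)) + 1`, then L2 normalization of each row). The helper name `tfidf_rows` and the three-document corpus are made up for illustration, and tokenization is a plain whitespace split.

```python
import math
from collections import Counter

def tfidf_rows(docs):
    """Compute L2-normalized TF-IDF rows with smoothed IDF."""
    tokenized = [doc.split() for doc in docs]
    vocab = sorted({token for tokens in tokenized for token in tokens})
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(token for tokens in tokenized for token in set(tokens))
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1 for t in vocab}
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        raw = [counts[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(v * v for v in raw))
        rows.append([v / norm if norm else 0.0 for v in raw])
    return vocab, rows

vocab, rows = tfidf_rows(["love app", "hate app", "love love design"])
print(vocab)
for row in rows:
    print([round(v, 3) for v in row])
```

In the second document, "hate" appears in only one document while "app" appears in two, so "hate" receives the larger weight.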

## 🧠 Step 6: Train/Test Split & Train Model

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

model = MultinomialNB()
model.fit(X_train, y_train)
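
In the cells above, the vectorizer is fitted on every tweet before the split, so the IDF weights are computed with the test tweets included. One common alternative, sketched here under the assumption that scikit-learn's `make_pipeline` is acceptable for this project, is to split the raw text first and let a pipeline fit the vectorizer on the training portion only. The texts and labels are made-up placeholders.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Placeholder data standing in for df['Cleaned_Tweet'] and df['Sentiment'].
texts = ["love new design", "service terrible hate", "okay nothing special",
         "amazing performance", "worst experience bad", "happy product day",
         "fine average app", "hate update bad"]
labels = ["Positive", "Negative", "Neutral", "Positive",
          "Negative", "Positive", "Neutral", "Negative"]

# Split the raw text first, so the test tweets stay unseen.
train_texts, test_texts, train_labels, test_labels = train_test_split(
    texts, labels, test_size=0.25, random_state=42)

# The pipeline fits TF-IDF on the training texts only, then trains the classifier.
pipeline = make_pipeline(TfidfVectorizer(), MultinomialNB())
pipeline.fit(train_texts, train_labels)

predictions = pipeline.predict(test_texts)
print(list(zip(test_texts, predictions)))
```

A side benefit is that the fitted pipeline accepts raw strings directly, so vectorization cannot be forgotten at prediction time.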

## 📊 Step 7: Evaluate the Model

In [None]:
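y_pred = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))
# zero_division=0 avoids warnings when a class has no true or predicted samples,
# which is likely with a test set this small.
print("\nClassification Report:\n", classification_report(y_test, y_pred, zero_division=0))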
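
To make the report easier to read, the sketch below computes accuracy and per-class precision and recall by hand. The helper `per_class_metrics` and the label lists are hypothetical examples, not results from this notebook; undefined ratios are reported as 0.0, which mirrors passing `zero_division=0` to `classification_report`.

```python
def per_class_metrics(y_true, y_pred):
    """Return overall accuracy and {label: (precision, recall)}."""
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        true_pos = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)  # times the model said `label`
        actual = sum(t == label for t in y_true)     # times `label` was correct
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / actual if actual else 0.0
        report[label] = (precision, recall)
    return accuracy, report

# Made-up labels for illustration only.
example_true = ["Positive", "Negative", "Neutral", "Positive"]
example_pred = ["Positive", "Positive", "Neutral", "Positive"]

accuracy, report = per_class_metrics(example_true, example_pred)
print("accuracy:", accuracy)
for label, (precision, recall) in report.items():
    print(f"{label}: precision={precision:.2f} recall={recall:.2f}")
```

Here the model never predicts "Negative", so that class gets precision and recall of 0.0 even though overall accuracy is 0.75. With only a handful of test tweets, a single misclassification moves these numbers a lot.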

## 🗣️ Step 8: Predict New Tweet Sentiment

In [None]:
def predict_sentiment(text):
    cleaned = clean_text(text)
    vectorized = vectorizer.transform([cleaned])
    prediction = model.predict(vectorized)
    return prediction[0]

print(predict_sentiment("I am so happy with this service!"))
print(predict_sentiment("This is the worst update ever."))
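
`predict` returns only the most likely class. `MultinomialNB` also provides `predict_proba`, which can show how confident the model is. The sketch below is self-contained, so it trains its own tiny model on made-up, already-cleaned texts; the helper name `predict_with_confidence` is an assumption for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Made-up training data standing in for the notebook's cleaned tweets.
train_texts = ["love app", "amazing performance", "hate service",
               "worst experience", "okay nothing special"]
train_labels = ["Positive", "Positive", "Negative", "Negative", "Neutral"]

demo_vectorizer = TfidfVectorizer()
demo_model = MultinomialNB()
demo_model.fit(demo_vectorizer.fit_transform(train_texts), train_labels)

def predict_with_confidence(cleaned_text):
    """Return a {class label: probability} mapping for one cleaned tweet."""
    features = demo_vectorizer.transform([cleaned_text])
    probabilities = demo_model.predict_proba(features)[0]
    # classes_ gives the label order of the probability columns.
    return dict(zip(demo_model.classes_, probabilities))

scores = predict_with_confidence("love amazing app")
for label, probability in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{label}: {probability:.3f}")
```

When none of a tweet's words were seen during training, the feature vector is all zeros and the probabilities fall back to the class priors, which is worth keeping in mind with a dataset this small.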

## 💾 Step 9: Save the Model

In [None]:
import joblib

# Save both artifacts: the model only works with the vectorizer it was trained on.
joblib.dump(model, 'twitter_sentiment_model.pkl')
joblib.dump(vectorizer, 'tfidf_vectorizer.pkl')
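
Loading works the same way in reverse with `joblib.load`. The self-contained sketch below trains a tiny stand-in model on made-up texts, writes both artifacts to a temporary directory (an assumption made so the example leaves no files behind), reloads them, and checks that the restored pair gives the same prediction.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Tiny stand-in for the trained notebook objects.
texts = ["love app", "hate service", "okay nothing special"]
labels = ["Positive", "Negative", "Neutral"]
vectorizer = TfidfVectorizer()
model = MultinomialNB()
model.fit(vectorizer.fit_transform(texts), labels)

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "twitter_sentiment_model.pkl")
    vectorizer_path = os.path.join(tmp_dir, "tfidf_vectorizer.pkl")
    joblib.dump(model, model_path)
    joblib.dump(vectorizer, vectorizer_path)

    # Reload both artifacts, as a separate script or service would.
    loaded_model = joblib.load(model_path)
    loaded_vectorizer = joblib.load(vectorizer_path)

sample = "love this app"
original_prediction = model.predict(vectorizer.transform([sample]))[0]
restored_prediction = loaded_model.predict(loaded_vectorizer.transform([sample]))[0]
print(original_prediction, restored_prediction)
```

In this notebook, new text should also pass through `clean_text` before `transform`, since the vectorizer was fitted on cleaned tweets. Pickle files should only be loaded from trusted sources, and ideally with the same scikit-learn version that wrote them.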