# üöÄ Sentiment Analysis of Social Media Posts Using BERTweet
## Real-Time Sentiment Detection Using NLP & Deep Learning

**Project Components:**
- ‚úÖ Data Preprocessing Pipeline
- ‚úÖ BERTweet Transformer Model
- ‚úÖ Flask REST API
- ‚úÖ ngrok Public URL Integration
- ‚úÖ Beautiful Web UI

**Model:** `finiteautomata/bertweet-base-sentiment-analysis`

---

## üì¶ STEP 1: Install Required Libraries

In [11]:
# Install all required packages
!pip install transformers torch flask flask-cors pyngrok emoji -q

print("‚úÖ All packages installed successfully!")

‚úÖ All packages installed successfully!


## üîß STEP 2: Import Libraries

In [12]:
# Standard libraries
import re
import emoji
import json
from datetime import datetime

# Deep Learning & NLP
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# Flask & API
from flask import Flask, request, jsonify, render_template_string
from flask_cors import CORS

# ngrok for public URL
from pyngrok import ngrok

# Threading for Flask
import threading

print("‚úÖ All libraries imported successfully!")
print(f"üî• PyTorch version: {torch.__version__}")
print(f"ü§ñ CUDA available: {torch.cuda.is_available()}")

‚úÖ All libraries imported successfully!
üî• PyTorch version: 2.9.0+cu126
ü§ñ CUDA available: True


## üßπ STEP 3: Data Preprocessing Pipeline

In [13]:
class TextPreprocessor:
    """
    Comprehensive text preprocessing for social media content
    Handles URLs, mentions, hashtags, emojis, and text normalization
    """

    def __init__(self,
                 remove_urls=True,
                 remove_mentions=True,
                 remove_hashtags=False,
                 handle_emojis=True,
                 lowercase=False):  # BERTweet works better with original case

        self.remove_urls = remove_urls
        self.remove_mentions = remove_mentions
        self.remove_hashtags = remove_hashtags
        self.handle_emojis = handle_emojis
        self.lowercase = lowercase

    def clean_url(self, text):
        """Remove URLs from text"""
        url_pattern = r'http\S+|www\.\S+'
        return re.sub(url_pattern, '', text)

    def clean_mentions(self, text):
        """Remove @mentions from text"""
        mention_pattern = r'@\w+'
        return re.sub(mention_pattern, '', text)

    def clean_hashtags(self, text):
        """Remove # but keep the word"""
        return text.replace('#', '')

    def process_emojis(self, text):
        """Keep emojis as BERTweet handles them well"""
        # BERTweet is trained on tweets with emojis, so we keep them
        return text

    def clean_extra_spaces(self, text):
        """Remove extra whitespaces"""
        return ' '.join(text.split())

    def preprocess(self, text):
        """
        Apply all preprocessing steps to text

        Args:
            text (str): Raw input text

        Returns:
            str: Cleaned and preprocessed text
        """
        if not isinstance(text, str):
            return ""

        # Apply preprocessing steps
        if self.remove_urls:
            text = self.clean_url(text)

        if self.remove_mentions:
            text = self.clean_mentions(text)

        if self.remove_hashtags:
            text = self.clean_hashtags(text)

        if self.handle_emojis:
            text = self.process_emojis(text)

        # Convert to lowercase if needed
        if self.lowercase:
            text = text.lower()

        # Clean extra spaces
        text = self.clean_extra_spaces(text)

        return text.strip()


# Initialize preprocessor
preprocessor = TextPreprocessor()

# Test preprocessing
print("\n" + "="*80)
print("üßπ TEXT PREPROCESSING DEMONSTRATION")
print("="*80)

test_texts = [
    "I love this product! üòç #awesome @company https://example.com",
    "Terrible service üò† @support please fix this ASAP!",
    "Just okay... nothing special üòê Check out www.example.com",
]

for i, text in enumerate(test_texts, 1):
    cleaned = preprocessor.preprocess(text)
    print(f"\n{i}. ORIGINAL: {text}")
    print(f"   CLEANED:  {cleaned}")

print("\n" + "="*80)
print("‚úÖ Preprocessing pipeline ready!")


üßπ TEXT PREPROCESSING DEMONSTRATION

1. ORIGINAL: I love this product! üòç #awesome @company https://example.com
   CLEANED:  I love this product! üòç #awesome

2. ORIGINAL: Terrible service üò† @support please fix this ASAP!
   CLEANED:  Terrible service üò† please fix this ASAP!

3. ORIGINAL: Just okay... nothing special üòê Check out www.example.com
   CLEANED:  Just okay... nothing special üòê Check out

‚úÖ Preprocessing pipeline ready!


## ü§ñ STEP 4: Load BERTweet Model

In [14]:
# Load BERTweet model and tokenizer
MODEL_NAME = "finiteautomata/bertweet-base-sentiment-analysis"

print(f"üì• Loading BERTweet model: {MODEL_NAME}")
print("This may take a few moments...\n")

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)

# Create sentiment analysis pipeline
sentiment_pipeline = pipeline(
    "sentiment-analysis",
    model=model,
    tokenizer=tokenizer,
    device=0 if torch.cuda.is_available() else -1  # Use GPU if available
)

print("‚úÖ BERTweet model loaded successfully!")
print(f"üéØ Device: {'GPU' if torch.cuda.is_available() else 'CPU'}")

# Test the model
print("\n" + "="*80)
print("üß™ MODEL TESTING")
print("="*80)

test_samples = [
    "I absolutely love this! Best day ever! üòç",
    "This is terrible. Worst experience. üò°",
    "It's okay, nothing special."
]

for sample in test_samples:
    result = sentiment_pipeline(sample)[0]
    print(f"\nText: {sample}")
    print(f"Sentiment: {result['label']} (Confidence: {result['score']:.4f})")

print("\n" + "="*80)
print("‚úÖ Model testing complete!")

üì• Loading BERTweet model: finiteautomata/bertweet-base-sentiment-analysis
This may take a few moments...



Device set to use cuda:0


‚úÖ BERTweet model loaded successfully!
üéØ Device: GPU

üß™ MODEL TESTING

Text: I absolutely love this! Best day ever! üòç
Sentiment: POS (Confidence: 0.9919)

Text: This is terrible. Worst experience. üò°
Sentiment: NEG (Confidence: 0.9830)

Text: It's okay, nothing special.
Sentiment: NEU (Confidence: 0.5241)

‚úÖ Model testing complete!


## üé® STEP 5: Create Beautiful Web UI (HTML/CSS/JS)

In [15]:
# Beautiful Web UI with modern design
HTML_TEMPLATE = """
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Social Media Sentiment Analyzer</title>
    <style>
        * {
            margin: 0;
            padding: 0;
            box-sizing: border-box;
        }

        body {
            font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
            background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
            min-height: 100vh;
            display: flex;
            justify-content: center;
            align-items: center;
            padding: 20px;
        }

        .container {
            background: white;
            border-radius: 20px;
            box-shadow: 0 20px 60px rgba(0, 0, 0, 0.3);
            max-width: 600px;
            width: 100%;
            padding: 40px;
            animation: slideIn 0.5s ease-out;
        }

        @keyframes slideIn {
            from {
                opacity: 0;
                transform: translateY(-30px);
            }
            to {
                opacity: 1;
                transform: translateY(0);
            }
        }

        h1 {
            color: #667eea;
            text-align: center;
            margin-bottom: 10px;
            font-size: 2em;
        }

        .subtitle {
            text-align: center;
            color: #666;
            margin-bottom: 30px;
            font-size: 0.9em;
        }

        .input-group {
            margin-bottom: 20px;
        }

        label {
            display: block;
            margin-bottom: 10px;
            color: #333;
            font-weight: 600;
        }

        textarea {
            width: 100%;
            padding: 15px;
            border: 2px solid #e0e0e0;
            border-radius: 10px;
            font-size: 16px;
            font-family: inherit;
            resize: vertical;
            min-height: 120px;
            transition: border-color 0.3s;
        }

        textarea:focus {
            outline: none;
            border-color: #667eea;
        }

        button {
            width: 100%;
            padding: 15px;
            background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
            color: white;
            border: none;
            border-radius: 10px;
            font-size: 16px;
            font-weight: 600;
            cursor: pointer;
            transition: transform 0.2s, box-shadow 0.2s;
        }

        button:hover {
            transform: translateY(-2px);
            box-shadow: 0 5px 20px rgba(102, 126, 234, 0.4);
        }

        button:active {
            transform: translateY(0);
        }

        button:disabled {
            opacity: 0.6;
            cursor: not-allowed;
        }

        .result {
            margin-top: 30px;
            padding: 25px;
            border-radius: 10px;
            display: none;
            animation: fadeIn 0.5s ease-out;
        }

        @keyframes fadeIn {
            from { opacity: 0; }
            to { opacity: 1; }
        }

        .result.positive {
            background: #d4edda;
            border: 2px solid #28a745;
        }

        .result.negative {
            background: #f8d7da;
            border: 2px solid #dc3545;
        }

        .result.neutral {
            background: #fff3cd;
            border: 2px solid #ffc107;
        }

        .sentiment-label {
            font-size: 1.5em;
            font-weight: bold;
            margin-bottom: 10px;
        }

        .confidence {
            font-size: 1.1em;
            margin-bottom: 15px;
        }

        .confidence-bar {
            width: 100%;
            height: 10px;
            background: rgba(0, 0, 0, 0.1);
            border-radius: 5px;
            overflow: hidden;
        }

        .confidence-fill {
            height: 100%;
            border-radius: 5px;
            transition: width 0.5s ease-out;
        }

        .loading {
            display: none;
            text-align: center;
            margin-top: 20px;
        }

        .spinner {
            border: 4px solid #f3f3f3;
            border-top: 4px solid #667eea;
            border-radius: 50%;
            width: 40px;
            height: 40px;
            animation: spin 1s linear infinite;
            margin: 0 auto;
        }

        @keyframes spin {
            0% { transform: rotate(0deg); }
            100% { transform: rotate(360deg); }
        }

        .examples {
            margin-top: 20px;
            padding: 15px;
            background: #f8f9fa;
            border-radius: 10px;
        }

        .examples h3 {
            color: #667eea;
            margin-bottom: 10px;
            font-size: 1em;
        }

        .example-item {
            padding: 8px;
            margin: 5px 0;
            background: white;
            border-radius: 5px;
            cursor: pointer;
            transition: background 0.2s;
            font-size: 0.9em;
        }

        .example-item:hover {
            background: #e9ecef;
        }
    </style>
</head>
<body>
    <div class="container">
        <h1>üé≠ Sentiment Analyzer</h1>
        <p class="subtitle">Powered by BERTweet Transformer Model</p>

        <div class="input-group">
            <label for="textInput">Enter your text or social media post:</label>
            <textarea id="textInput" placeholder="Type your message here... (e.g., 'I love this product! üòç')"></textarea>
        </div>

        <button onclick="analyzeSentiment()">üöÄ Analyze Sentiment</button>

        <div class="loading" id="loading">
            <div class="spinner"></div>
            <p style="margin-top: 10px; color: #667eea;">Analyzing...</p>
        </div>

        <div class="result" id="result">
            <div class="sentiment-label" id="sentimentLabel"></div>
            <div class="confidence" id="confidence"></div>
            <div class="confidence-bar">
                <div class="confidence-fill" id="confidenceFill"></div>
            </div>
        </div>

        <div class="examples">
            <h3>üí° Try these examples:</h3>
            <div class="example-item" onclick="setExample('I absolutely love this product! Best purchase ever! üòç')">"I absolutely love this product! Best purchase ever! üòç"</div>
            <div class="example-item" onclick="setExample('This is the worst service I have ever experienced üò°')">"This is the worst service I have ever experienced üò°"</div>
            <div class="example-item" onclick="setExample('It\'s okay, nothing special really')">"It's okay, nothing special really"</div>
        </div>
    </div>

    <script>
        function setExample(text) {
            document.getElementById('textInput').value = text;
        }

        async function analyzeSentiment() {
            const text = document.getElementById('textInput').value.trim();

            if (!text) {
                alert('Please enter some text to analyze!');
                return;
            }

            // Show loading
            document.getElementById('loading').style.display = 'block';
            document.getElementById('result').style.display = 'none';

            try {
                const response = await fetch('/predict', {
                    method: 'POST',
                    headers: {
                        'Content-Type': 'application/json',
                    },
                    body: JSON.stringify({ text: text })
                });

                const data = await response.json();

                // Hide loading
                document.getElementById('loading').style.display = 'none';

                if (data.error) {
                    alert('Error: ' + data.error);
                    return;
                }

                // Display result
                const resultDiv = document.getElementById('result');
                const sentiment = data.sentiment.toLowerCase();
                const confidence = (data.confidence * 100).toFixed(2);

                resultDiv.className = 'result ' + sentiment;
                resultDiv.style.display = 'block';

                const emoji = sentiment === 'positive' ? 'üòä' : sentiment === 'negative' ? 'üò¢' : 'üòê';
                document.getElementById('sentimentLabel').innerHTML = emoji + ' ' + data.sentiment.toUpperCase();
                document.getElementById('confidence').textContent = 'Confidence: ' + confidence + '%';

                const fillColor = sentiment === 'positive' ? '#28a745' : sentiment === 'negative' ? '#dc3545' : '#ffc107';
                const fillDiv = document.getElementById('confidenceFill');
                fillDiv.style.width = confidence + '%';
                fillDiv.style.background = fillColor;

            } catch (error) {
                document.getElementById('loading').style.display = 'none';
                alert('Error analyzing sentiment: ' + error.message);
            }
        }

        // Allow Enter key to submit
        document.getElementById('textInput').addEventListener('keypress', function(e) {
            if (e.key === 'Enter' && !e.shiftKey) {
                e.preventDefault();
                analyzeSentiment();
            }
        });
    </script>
</body>
</html>
"""

print("‚úÖ Beautiful Web UI template created!")

‚úÖ Beautiful Web UI template created!


## üåê STEP 6: Create Flask API with Endpoints

In [16]:
# Create Flask application
app = Flask(__name__)
CORS(app)  # Enable CORS for all routes

# Statistics tracking
stats = {
    'total_requests': 0,
    'positive': 0,
    'negative': 0,
    'neutral': 0
}

@app.route('/')
def home():
    """Serve the web UI"""
    return render_template_string(HTML_TEMPLATE)

@app.route('/predict', methods=['POST'])
def predict():
    """
    Main prediction endpoint

    Request JSON format:
    {
        "text": "Your text here"
    }

    Response JSON format:
    {
        "original_text": "...",
        "preprocessed_text": "...",
        "sentiment": "POS/NEG/NEU",
        "confidence": 0.95,
        "timestamp": "2024-01-01 12:00:00"
    }
    """
    try:
        # Get JSON data
        data = request.get_json()

        if not data or 'text' not in data:
            return jsonify({
                'error': 'No text provided. Please send JSON with "text" field.'
            }), 400

        original_text = data['text']

        if not original_text.strip():
            return jsonify({
                'error': 'Empty text provided.'
            }), 400

        # Preprocess text
        preprocessed_text = preprocessor.preprocess(original_text)

        # Get prediction
        result = sentiment_pipeline(preprocessed_text)[0]

        sentiment = result['label']
        confidence = result['score']

        # Update statistics
        stats['total_requests'] += 1
        if sentiment == 'POS':
            stats['positive'] += 1
            sentiment_full = 'Positive'
        elif sentiment == 'NEG':
            stats['negative'] += 1
            sentiment_full = 'Negative'
        else:
            stats['neutral'] += 1
            sentiment_full = 'Neutral'

        # Prepare response
        response = {
            'original_text': original_text,
            'preprocessed_text': preprocessed_text,
            'sentiment': sentiment_full,
            'sentiment_code': sentiment,
            'confidence': confidence,
            'timestamp': datetime.now().strftime('%Y-%m-%d %H:%M:%S')
        }

        return jsonify(response), 200

    except Exception as e:
        return jsonify({
            'error': f'Prediction failed: {str(e)}'
        }), 500

@app.route('/batch_predict', methods=['POST'])
def batch_predict():
    """
    Batch prediction endpoint for multiple texts

    Request JSON format:
    {
        "texts": ["text1", "text2", "text3"]
    }
    """
    try:
        data = request.get_json()

        if not data or 'texts' not in data:
            return jsonify({
                'error': 'No texts provided. Please send JSON with "texts" array.'
            }), 400

        texts = data['texts']

        if not isinstance(texts, list):
            return jsonify({
                'error': 'texts must be an array.'
            }), 400

        results = []

        for text in texts:
            preprocessed = preprocessor.preprocess(text)
            prediction = sentiment_pipeline(preprocessed)[0]

            results.append({
                'text': text,
                'sentiment': prediction['label'],
                'confidence': prediction['score']
            })

        return jsonify({
            'results': results,
            'total_processed': len(results)
        }), 200

    except Exception as e:
        return jsonify({
            'error': f'Batch prediction failed: {str(e)}'
        }), 500

@app.route('/stats', methods=['GET'])
def get_stats():
    """Get API usage statistics"""
    return jsonify(stats), 200

@app.route('/health', methods=['GET'])
def health():
    """Health check endpoint"""
    return jsonify({
        'status': 'healthy',
        'model': MODEL_NAME,
        'timestamp': datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    }), 200

print("‚úÖ Flask API created with following endpoints:")
print("   üìç GET  /           - Web UI")
print("   üìç POST /predict    - Single prediction")
print("   üìç POST /batch_predict - Batch predictions")
print("   üìç GET  /stats      - Usage statistics")
print("   üìç GET  /health     - Health check")

‚úÖ Flask API created with following endpoints:
   üìç GET  /           - Web UI
   üìç POST /predict    - Single prediction
   üìç POST /batch_predict - Batch predictions
   üìç GET  /stats      - Usage statistics
   üìç GET  /health     - Health check


## üîë STEP 7: Setup ngrok (Get Your Token First!)

### üìù How to get ngrok token:
1. Go to https://ngrok.com/
2. Sign up for free account
3. Go to https://dashboard.ngrok.com/get-started/your-authtoken
4. Copy your authtoken
5. Paste it below

In [17]:
# IMPORTANT: Replace 'YOUR_NGROK_TOKEN' with your actual ngrok authtoken
NGROK_TOKEN = "38wLUKPFM1Cz5YBsIRQKty9Hya0_3ZjyzxViXjnrkVGqLBzh8"  # <-- PASTE YOUR TOKEN HERE

# Authenticate ngrok
ngrok.set_auth_token(NGROK_TOKEN)

print("‚úÖ ngrok authenticated successfully!")
print("üîß Ready to create public URL...")

‚úÖ ngrok authenticated successfully!
üîß Ready to create public URL...


## üöÄ STEP 8: Start Flask Server with ngrok Public URL

In [19]:
# Kill any existing ngrok tunnels
ngrok.kill()

# Start Flask in a separate thread
def run_flask():
    app.run(port=5000, debug=False, use_reloader=False)

flask_thread = threading.Thread(target=run_flask, daemon=True)
flask_thread.start()

# Wait for Flask to start
import time
time.sleep(3)

# Create ngrok tunnel
public_url = ngrok.connect(5000)

print("\n" + "="*80)
print("üéâ SUCCESS! Your Sentiment Analysis API is LIVE!")
print("="*80)
print(f"\nüåê Public URL: {public_url}")
print(f"\nüì± Open this URL in your browser to use the Web UI")
print(f"\nüîó API Endpoints:")
print(f"   ‚Ä¢ Web UI:        {public_url}")
print(f"   ‚Ä¢ Predict:       {public_url}/predict")
print(f"   ‚Ä¢ Batch Predict: {public_url}/batch_predict")
print(f"   ‚Ä¢ Statistics:    {public_url}/stats")
print(f"   ‚Ä¢ Health:        {public_url}/health")
print(f"\nüí° Keep this cell running to keep the server alive!")
print("="*80)

# Keep the server running
print("\n‚è≥ Server is running... Press 'Stop' button to terminate.")

# This keeps the cell running
try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    print("\nüõë Server stopped.")
    ngrok.kill()

 * Serving Flask app '__main__'
 * Debug mode: off


Address already in use
Port 5000 is in use by another program. Either identify and stop that program, or start the server with a different port.



üéâ SUCCESS! Your Sentiment Analysis API is LIVE!

üåê Public URL: NgrokTunnel: "https://667506c622ec.ngrok-free.app" -> "http://localhost:5000"

üì± Open this URL in your browser to use the Web UI

üîó API Endpoints:
   ‚Ä¢ Web UI:        NgrokTunnel: "https://667506c622ec.ngrok-free.app" -> "http://localhost:5000"
   ‚Ä¢ Predict:       NgrokTunnel: "https://667506c622ec.ngrok-free.app" -> "http://localhost:5000"/predict
   ‚Ä¢ Batch Predict: NgrokTunnel: "https://667506c622ec.ngrok-free.app" -> "http://localhost:5000"/batch_predict
   ‚Ä¢ Statistics:    NgrokTunnel: "https://667506c622ec.ngrok-free.app" -> "http://localhost:5000"/stats
   ‚Ä¢ Health:        NgrokTunnel: "https://667506c622ec.ngrok-free.app" -> "http://localhost:5000"/health

üí° Keep this cell running to keep the server alive!

‚è≥ Server is running... Press 'Stop' button to terminate.


INFO:werkzeug:127.0.0.1 - - [01/Feb/2026 08:22:41] "GET / HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [01/Feb/2026 08:22:41] "[33mGET /favicon.ico HTTP/1.1[0m" 404 -
INFO:werkzeug:127.0.0.1 - - [01/Feb/2026 08:22:46] "POST /predict HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [01/Feb/2026 08:22:51] "POST /predict HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [01/Feb/2026 08:23:02] "POST /predict HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [01/Feb/2026 08:25:11] "GET / HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [01/Feb/2026 08:25:31] "POST /predict HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [01/Feb/2026 08:25:58] "POST /predict HTTP/1.1" 200 -



üõë Server stopped.
