# Sentiment Analysis API Demo

This notebook demonstrates how to interact with the sentiment analysis API deployed in Docker.

The API uses a DistilBERT model fine-tuned on the SST-2 dataset for sentiment analysis.

**Prerequisites:**
- Docker and docker-compose installed
- API running (start with: `docker-compose up -d`)
- API accessible at `http://localhost` (or your server address)
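
Containers can take a while to load the model after `docker-compose up -d`, so it may help to wait for the API before running the cells below. The following is a minimal sketch using only the standard library (so it can run before the install cell). The helper names `api_is_healthy` and `wait_until_healthy` are hypothetical, and the check assumes the `/health` endpoint returns JSON with `"status": "healthy"`, as in the Health Check output later in this notebook.

```python
import json
import time
import urllib.error
import urllib.request

# Assumption: same base address as listed in the prerequisites above.
HEALTH_URL = "http://localhost/health"


def api_is_healthy(url=HEALTH_URL, timeout=5):
    """Return True if the health endpoint answers with JSON status "healthy"."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or a non-JSON body.
        return False
    return isinstance(body, dict) and body.get("status") == "healthy"


def wait_until_healthy(check=api_is_healthy, attempts=10, delay=2.0):
    """Poll `check` up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(1, attempts + 1):
        if check():
            return True
        if attempt < attempts:
            time.sleep(delay)
    return False
```

Calling `wait_until_healthy()` returns `True` as soon as the API reports healthy, or `False` after the last attempt.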


In [17]:
# Install required packages if not already installed
import sys
import subprocess

def install_if_missing(package):
    try:
        __import__(package)
    except ImportError:
        print(f"Installing {package}...")
        subprocess.check_call([sys.executable, "-m", "pip", "install", package, "-q"])
        print(f"✅ {package} installed successfully")

# Install packages
install_if_missing("requests")
install_if_missing("pandas")

# Now import everything
import requests
import json
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from datetime import datetime
import pandas as pd

# API endpoint (adjust if running on different host/port)
API_BASE_URL = "http://localhost"
PREDICT_ENDPOINT = f"{API_BASE_URL}/predict"
HEALTH_ENDPOINT = f"{API_BASE_URL}/health"

print("✅ All packages loaded successfully!")
print(f"\nAPI Base URL: {API_BASE_URL}")
print(f"Predict Endpoint: {PREDICT_ENDPOINT}")
print(f"Health Endpoint: {HEALTH_ENDPOINT}")


Installing requests...
✅ requests installed successfully
Installing pandas...
✅ pandas installed successfully
✅ All packages loaded successfully!

API Base URL: http://localhost
Predict Endpoint: http://localhost/predict
Health Endpoint: http://localhost/health


## 1. Health Check

First, let's verify the API is running and healthy.


In [18]:
# Check API health
try:
    response = requests.get(HEALTH_ENDPOINT, timeout=5)
    print(f"Status Code: {response.status_code}")
    print(f"Response: {json.dumps(response.json(), indent=2)}")
except requests.exceptions.RequestException as e:
    print(f"Error connecting to API: {e}")
    print("Make sure the API is running with: docker-compose up -d")


Status Code: 200
Response: {
  "device": "cpu",
  "model_loaded": true,
  "status": "healthy"
}


## 2. Single Request Example

Let's make a simple prediction request.


In [19]:
# Example 1: Positive sentiment
text1 = "I absolutely love this product! It's amazing and works perfectly."
payload = {"text": text1}

print(f"Input text: {text1}\n")
response = requests.post(PREDICT_ENDPOINT, json=payload)
result = response.json()

print(f"Status Code: {response.status_code}")
print(f"Response:\n{json.dumps(result, indent=2)}")


Input text: I absolutely love this product! It's amazing and works perfectly.

Status Code: 200
Response:
{
  "negative_score": 0.0001,
  "positive_score": 0.9999,
  "score": 0.9999,
  "sentiment": "POSITIVE",
  "text": "I absolutely love this product! It's amazing and works perfectly."
}
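
The response is a flat JSON object, so it can be convenient to turn it into a typed record before using it elsewhere. The sketch below is one possible way to do that; the `Prediction` class and `parse_prediction` helper are hypothetical names, and the field list is taken only from the response printed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    text: str
    sentiment: str
    score: float
    positive_score: float
    negative_score: float


def parse_prediction(result):
    """Build a Prediction from a decoded /predict response, or raise ValueError."""
    required = ("text", "sentiment", "score", "positive_score", "negative_score")
    missing = [key for key in required if key not in result]
    if missing:
        raise ValueError(f"response is missing fields: {', '.join(missing)}")
    return Prediction(
        text=str(result["text"]),
        sentiment=str(result["sentiment"]),
        score=float(result["score"]),
        positive_score=float(result["positive_score"]),
        negative_score=float(result["negative_score"]),
    )


# The response printed above, used as sample input.
sample = {
    "negative_score": 0.0001,
    "positive_score": 0.9999,
    "score": 0.9999,
    "sentiment": "POSITIVE",
    "text": "I absolutely love this product! It's amazing and works perfectly.",
}
prediction = parse_prediction(sample)
print(prediction.sentiment, prediction.score)
```

An error body such as `{"error": "..."}` lacks these fields, so `parse_prediction` raises `ValueError` instead of returning a half-filled record.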


In [None]:
# Example 2: Negative sentiment
text2 = "This is terrible. I'm very disappointed with the quality and service."
payload = {"text": text2}

print(f"Input text: {text2}\n")
response = requests.post(PREDICT_ENDPOINT, json=payload)
result = response.json()

print(f"Status Code: {response.status_code}")
print(f"Response:\n{json.dumps(result, indent=2)}")


## 3. Multiple Sequential Requests

Let's test multiple requests sequentially to see the API's response time.


In [20]:
# Test texts with various sentiments
test_texts = [
    "This movie is fantastic! I highly recommend it.",
    "The service was okay, nothing special.",
    "I hate waiting in long lines. Very frustrating experience.",
    "The food was delicious and the atmosphere was perfect.",
    "Not impressed at all. Poor quality and overpriced.",
    "Amazing customer service! They went above and beyond.",
    "The product broke after one day. Complete waste of money.",
    "Great value for money. Will definitely buy again!"
]

print("Sequential Requests:\n")
start_time = time.time()
results = []

for i, text in enumerate(test_texts, 1):
    payload = {"text": text}
    request_start = time.time()
    response = requests.post(PREDICT_ENDPOINT, json=payload)
    request_time = time.time() - request_start
    
    result = response.json()
    results.append({
        "text": text[:50] + "..." if len(text) > 50 else text,
        "sentiment": result.get("sentiment", "N/A"),
        "score": result.get("score", 0),
        "response_time_ms": round(request_time * 1000, 2)
    })
    print(f"{i}. {result.get('sentiment', 'N/A')} ({result.get('score', 0):.4f}) - {request_time*1000:.2f}ms")

total_time = time.time() - start_time
print(f"\nTotal time: {total_time:.2f}s")
print(f"Average time per request: {total_time/len(test_texts)*1000:.2f}ms")


Sequential Requests:

1. POSITIVE (0.9999) - 130.99ms
2. NEGATIVE (0.9862) - 437.75ms
3. NEGATIVE (0.9968) - 110.52ms
4. POSITIVE (0.9999) - 134.67ms
5. NEGATIVE (0.9998) - 122.51ms
6. POSITIVE (0.9999) - 113.55ms
7. NEGATIVE (0.9998) - 125.60ms
8. POSITIVE (0.9998) - 117.03ms

Total time: 1.29s
Average time per request: 161.77ms
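
One request in this run (437.75 ms) took far longer than the rest, which pulls the average well above the typical latency. A short sketch comparing the mean with the median, using the timings printed above as literal input:

```python
import statistics

# Per-request timings (ms) copied from the sequential run printed above.
timings_ms = [130.99, 437.75, 110.52, 134.67, 122.51, 113.55, 125.60, 117.03]

mean_ms = statistics.mean(timings_ms)
median_ms = statistics.median(timings_ms)
slowest_ms = max(timings_ms)

print(f"Mean:    {mean_ms:.2f}ms")
print(f"Median:  {median_ms:.2f}ms")
print(f"Slowest: {slowest_ms:.2f}ms (request {timings_ms.index(slowest_ms) + 1})")
```

The mean comes out near 162 ms while the median is near 124 ms, so the median is probably the better summary here. The notebook's own average (161.77 ms) is slightly higher than the mean of the individual timings because it divides the total wall time, including loop overhead, by the number of requests.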


## 4. Parallel Requests Demonstration

Next, we send 16 requests concurrently from a thread pool. This exercises the NGINX + Gunicorn stack's ability to serve several requests at once.
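
One detail of the cell below: it collects results with `as_completed`, which yields futures as they finish, so the numbered output does not follow the order of `parallel_texts`. If input order matters, `executor.map` is an alternative. Here is a minimal sketch with a stand-in function; `fake_request` and its delays are made up for illustration, and no API call is made.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fake_request(item):
    """Stand-in for an API call: sleep for the given delay, then return the index."""
    index, delay = item
    time.sleep(delay)
    return index


# Hypothetical workload: item 0 is the slowest, item 3 the fastest.
items = [(0, 0.20), (1, 0.15), (2, 0.10), (3, 0.05)]

# as_completed yields futures in the order they finish.
with ThreadPoolExecutor(max_workers=4) as executor:
    futures = [executor.submit(fake_request, item) for item in items]
    completion_order = [future.result() for future in as_completed(futures)]

# executor.map returns results in the order the inputs were given.
with ThreadPoolExecutor(max_workers=4) as executor:
    submission_order = list(executor.map(fake_request, items))

print("as_completed:", completion_order)
print("executor.map:", submission_order)
```

Both approaches run the calls concurrently; they differ only in the order in which results are handed back.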


In [21]:
def make_prediction(text):
    """Helper function to make a prediction request."""
    payload = {"text": text}
    start_time = time.time()
    try:
        response = requests.post(PREDICT_ENDPOINT, json=payload, timeout=30)
        elapsed = time.time() - start_time
        result = response.json()
        return {
            "text": text[:40] + "..." if len(text) > 40 else text,
            "sentiment": result.get("sentiment", "ERROR"),
            "score": result.get("score", 0),
            "response_time_ms": round(elapsed * 1000, 2),
            "status_code": response.status_code,
            "success": response.status_code == 200
        }
    except Exception as e:
        elapsed = time.time() - start_time
        return {
            "text": text[:40] + "..." if len(text) > 40 else text,
            "sentiment": "ERROR",
            "score": 0,
            "response_time_ms": round(elapsed * 1000, 2),
            "status_code": 0,
            "success": False,
            "error": str(e)
        }

# Test with parallel requests
parallel_texts = [
    "I'm so happy with my purchase!",
    "This is the worst experience ever.",
    "The quality is excellent and delivery was fast.",
    "Not worth the price. Very disappointed.",
    "Outstanding service and great product quality.",
    "Terrible customer support. Will not recommend.",
    "Love it! Exactly what I was looking for.",
    "Poor quality materials. Broke immediately.",
    "Fantastic value! Highly satisfied.",
    "Waste of money. Complete garbage.",
    "Amazing features and easy to use.",
    "Horrible experience from start to finish.",
    "Great product, fast shipping, excellent service.",
    "Very poor quality. Do not buy.",
    "Perfect! Exceeded my expectations.",
    "Awful product. Returned immediately."
]

print(f"Making {len(parallel_texts)} parallel requests...\n")
start_time = time.time()

# Use ThreadPoolExecutor for parallel requests
with ThreadPoolExecutor(max_workers=16) as executor:
    futures = [executor.submit(make_prediction, text) for text in parallel_texts]
    parallel_results = [future.result() for future in as_completed(futures)]

total_time = time.time() - start_time

# Display results
print("Parallel Request Results:\n")
for i, result in enumerate(parallel_results, 1):
    status = "✓" if result["success"] else "✗"
    print(f"{status} {i:2d}. {result['sentiment']:8s} ({result['score']:.4f}) - {result['response_time_ms']:6.2f}ms - {result['text']}")

print(f"\n{'='*70}")
print(f"Total time for {len(parallel_texts)} parallel requests: {total_time:.2f}s")
print(f"Average time per request: {total_time/len(parallel_texts)*1000:.2f}ms")
print(f"Throughput: {len(parallel_texts)/total_time:.2f} requests/second")

# Count successes
successful = sum(1 for r in parallel_results if r["success"])
print(f"Successful requests: {successful}/{len(parallel_texts)}")


Making 16 parallel requests...

Parallel Request Results:

✓  1. POSITIVE (0.9999) - 549.77ms - I'm so happy with my purchase!
✓  2. POSITIVE (0.9999) - 541.65ms - Outstanding service and great product qu...
✓  3. NEGATIVE (0.9998) - 543.77ms - This is the worst experience ever.
✓  4. NEGATIVE (0.9997) - 559.96ms - Terrible customer support. Will not reco...
✓  5. POSITIVE (0.9999) - 995.01ms - Great product, fast shipping, excellent ...
✓  6. NEGATIVE (0.9997) - 1002.01ms - Horrible experience from start to finish...
✓  7. NEGATIVE (0.9998) - 1001.54ms - Very poor quality. Do not buy.
✓  8. NEGATIVE (0.9998) - 1061.86ms - Not worth the price. Very disappointed.
✓  9. POSITIVE (0.9999) - 1056.46ms - Fantastic value! Highly satisfied.
✓ 10. POSITIVE (0.9999) - 1278.98ms - Perfect! Exceeded my expectations.
✓ 11. POSITIVE (0.9998) - 1304.66ms - The quality is excellent and delivery wa...
✓ 12. POSITIVE (0.9999) - 1305.26ms - Love it! Exactly what I was looking for

## 5. Performance Comparison: Sequential vs Parallel

Let's compare the performance difference between sequential and parallel request handling.


In [22]:
# Performance comparison
comparison_texts = parallel_texts[:8]  # Use first 8 texts for comparison

# Sequential requests
print("Sequential Requests:")
seq_start = time.time()
seq_results = []
for text in comparison_texts:
    result = make_prediction(text)
    seq_results.append(result)
seq_time = time.time() - seq_start

print(f"  Total time: {seq_time:.2f}s")
print(f"  Average: {seq_time/len(comparison_texts)*1000:.2f}ms per request")
print(f"  Throughput: {len(comparison_texts)/seq_time:.2f} req/s\n")

# Parallel requests
print("Parallel Requests:")
par_start = time.time()
with ThreadPoolExecutor(max_workers=8) as executor:
    futures = [executor.submit(make_prediction, text) for text in comparison_texts]
    par_results = [future.result() for future in as_completed(futures)]
par_time = time.time() - par_start

print(f"  Total time: {par_time:.2f}s")
print(f"  Average: {par_time/len(comparison_texts)*1000:.2f}ms per request")
print(f"  Throughput: {len(comparison_texts)/par_time:.2f} req/s\n")

# Comparison
speedup = seq_time / par_time if par_time > 0 else 0
print(f"{'='*70}")
print(f"Speedup: {speedup:.2f}x faster with parallel requests")
print(f"Time saved: {seq_time - par_time:.2f}s ({((seq_time - par_time)/seq_time*100):.1f}% reduction)")


Sequential Requests:
  Total time: 0.89s
  Average: 111.41ms per request
  Throughput: 8.98 req/s

Parallel Requests:
  Total time: 0.40s
  Average: 50.42ms per request
  Throughput: 19.83 req/s

Speedup: 2.21x faster with parallel requests
Time saved: 0.49s (54.7% reduction)
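
The speedup and reduction figures follow directly from the two totals. As a worked check, the sketch below recomputes them from the rounded times printed above; `compare_runs` is a hypothetical helper, not part of the notebook.

```python
def compare_runs(seq_time, par_time):
    """Return (speedup, percent_reduction) for two wall-clock times in seconds."""
    if seq_time <= 0 or par_time <= 0:
        raise ValueError("both times must be positive")
    speedup = seq_time / par_time
    percent_reduction = (seq_time - par_time) / seq_time * 100
    return speedup, percent_reduction


# Rounded totals from the output above.
speedup, reduction = compare_runs(0.89, 0.40)
print(f"Speedup: {speedup:.2f}x, reduction: {reduction:.1f}%")
```

With the rounded inputs this gives roughly 2.2x and 55%, close to the printed 2.21x and 54.7%, which were computed from the unrounded timings.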


## 6. Results Summary

Let's load the parallel results into a DataFrame and summarize them.


In [23]:
# Create DataFrame from parallel results
df = pd.DataFrame(parallel_results)

print("Results Summary:")
print("="*70)
print(f"\nTotal Requests: {len(df)}")
print(f"Successful: {df['success'].sum()}")
print(f"Failed: {(~df['success']).sum()}")

print(f"\nSentiment Distribution:")
print(df['sentiment'].value_counts())

print(f"\nResponse Time Statistics:")
print(f"  Mean: {df['response_time_ms'].mean():.2f}ms")
print(f"  Median: {df['response_time_ms'].median():.2f}ms")
print(f"  Min: {df['response_time_ms'].min():.2f}ms")
print(f"  Max: {df['response_time_ms'].max():.2f}ms")
print(f"  Std Dev: {df['response_time_ms'].std():.2f}ms")

print(f"\nConfidence Score Statistics:")
print(f"  Mean: {df['score'].mean():.4f}")
print(f"  Median: {df['score'].median():.4f}")
print(f"  Min: {df['score'].min():.4f}")
print(f"  Max: {df['score'].max():.4f}")

# Display full results table
print(f"\n{'='*70}")
print("Full Results Table:")
print(df[['text', 'sentiment', 'score', 'response_time_ms']].to_string(index=False))


Results Summary:

Total Requests: 16
Successful: 16
Failed: 0

Sentiment Distribution:
sentiment
POSITIVE    8
NEGATIVE    8
Name: count, dtype: int64

Response Time Statistics:
  Mean: 1057.53ms
  Median: 1059.16ms
  Min: 541.65ms
  Max: 1498.71ms
  Std Dev: 346.00ms

Confidence Score Statistics:
  Mean: 0.9998
  Median: 0.9998
  Min: 0.9991
  Max: 0.9999

Full Results Table:
                                       text sentiment  score  response_time_ms
             I'm so happy with my purchase!  POSITIVE 0.9999            549.77
Outstanding service and great product qu...  POSITIVE 0.9999            541.65
         This is the worst experience ever.  NEGATIVE 0.9998            543.77
Terrible customer support. Will not reco...  NEGATIVE 0.9997            559.96
Great product, fast shipping, excellent ...  POSITIVE 0.9999            995.01
Horrible experience from start to finish...  NEGATIVE 0.9997           1002.01
             Very poor quality. Do not buy.  NEGATIVE 0.9998       

## 7. Error Handling Examples

Let's test how the API handles edge cases and errors.


In [24]:
# Test error cases
print("Testing Error Handling:\n")

# Test 1: Empty text
print("1. Empty text:")
response = requests.post(PREDICT_ENDPOINT, json={"text": ""})
print(f"   Status: {response.status_code}")
print(f"   Response: {json.dumps(response.json(), indent=2)}\n")

# Test 2: Missing text field
print("2. Missing 'text' field:")
response = requests.post(PREDICT_ENDPOINT, json={})
print(f"   Status: {response.status_code}")
print(f"   Response: {json.dumps(response.json(), indent=2)}\n")

# Test 3: Invalid JSON
print("3. Invalid JSON:")
try:
    response = requests.post(PREDICT_ENDPOINT, data="not json")
    # Trailing newline keeps the spacing consistent with the other tests
    print(f"   Status: {response.status_code}\n")
except Exception as e:
    print(f"   Error: {e}\n")

# Test 4: Very long text (should be truncated)
print("4. Very long text (truncation test):")
long_text = "This is a test. " * 1000
response = requests.post(PREDICT_ENDPOINT, json={"text": long_text})
print(f"   Status: {response.status_code}")
result = response.json()
print(f"   Sentiment: {result.get('sentiment')}")
print(f"   Score: {result.get('score')}\n")

# Test 5: Special characters and emojis
print("5. Text with special characters and emojis:")
special_text = "I love this! 😍 It's amazing!!! 🎉🎊"
response = requests.post(PREDICT_ENDPOINT, json={"text": special_text})
print(f"   Status: {response.status_code}")
result = response.json()
print(f"   Sentiment: {result.get('sentiment')}")
print(f"   Score: {result.get('score')}")


Testing Error Handling:

1. Empty text:
   Status: 400
   Response: {
  "error": "Text field is required and cannot be empty"
}

2. Missing 'text' field:
   Status: 400
   Response: {
  "error": "No JSON data provided"
}

3. Invalid JSON:
   Status: 500

4. Very long text (truncation test):
   Status: 200
   Sentiment: NEGATIVE
   Score: 0.9974

5. Text with special characters and emojis:
   Status: 200
   Sentiment: POSITIVE
   Score: 0.9999
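
The invalid-JSON test prints only the status code, likely because the body of a 500 response is not guaranteed to be JSON. A small helper can report either kind of body without raising. This is a sketch under that assumption; `summarize_response` is a hypothetical name, and it takes the status code and raw body text (for a `requests` response, `response.status_code` and `response.text`).

```python
import json


def summarize_response(status_code, body_text):
    """Return a one-line summary for a response, whether or not the body is JSON."""
    try:
        body = json.loads(body_text)
    except ValueError:
        # Not JSON (for example an HTML error page): show a short excerpt instead.
        excerpt = body_text.strip()[:80]
        return f"{status_code}: non-JSON body: {excerpt!r}"
    if isinstance(body, dict) and "error" in body:
        return f"{status_code}: error: {body['error']}"
    return f"{status_code}: JSON body"


# The first body is taken from the output above; the second is a made-up example.
print(summarize_response(400, '{"error": "No JSON data provided"}'))
print(summarize_response(500, "<html>Internal Server Error</html>"))
```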


## 8. API Architecture Demonstration

This API demonstrates a production-ready architecture with:

- **NGINX**: Reverse proxy and load balancer (handles incoming connections)
- **Gunicorn**: WSGI server with multiple workers (processes Python requests)
- **Flask**: Web framework (handles application logic)
- **DistilBERT**: ML model for sentiment analysis

The parallel request handling shows this layering at work: NGINX proxies incoming requests to Gunicorn, and Gunicorn's worker processes handle them concurrently, so several predictions can be processed at the same time.
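
As a rough illustration of the Gunicorn layer, a configuration file might look like the sketch below. The deployment's actual settings are not shown in this notebook, so the file name, port, and every value here are assumptions.

```python
# gunicorn.conf.py (hypothetical; the deployment's real settings are not shown here)
import multiprocessing

# Address Gunicorn listens on; NGINX would proxy requests to it.
bind = "0.0.0.0:8000"

# Number of worker processes. (2 x cores) + 1 is the starting point suggested in
# the Gunicorn documentation. Each worker may hold its own copy of the model in
# memory, so a smaller number can be more appropriate for a large model.
workers = multiprocessing.cpu_count() * 2 + 1

# Seconds a worker may stay silent before Gunicorn restarts it.
timeout = 60
```

With such a file, Gunicorn could be started as `gunicorn -c gunicorn.conf.py app:app`, where `app:app` is a placeholder for the Flask module and application object.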

### Key Features Demonstrated:

1. ✅ Health check endpoint for monitoring
2. ✅ JSON-based API that returns 400 errors for empty or missing input
3. ✅ Support for parallel/concurrent requests
4. ✅ Fast CPU inference with DistilBERT (roughly 110 to 135 ms for most sequential requests in this run)
5. ✅ Production-ready architecture with proper layering

---

**Note:** Make sure the Docker containers are running before executing this notebook:
```bash
docker-compose up -d
```

To stop the containers:
```bash
docker-compose down
```
