# 😊 Sentiment Analysis and Emotion Analysis

**Author:** Praut s.r.o. - AI Integration & Business Automation

## What you will learn:
- Basic and advanced sentiment analysis
- Emotion detection in text
- Analysis of customer reviews
- Social media monitoring

In [None]:
!pip install -q transformers accelerate torch pandas matplotlib seaborn

In [None]:
from transformers import pipeline
import torch
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

device = 0 if torch.cuda.is_available() else -1
print(f"🖥️ Device: {'GPU' if device == 0 else 'CPU'}")

## 1. Basic sentiment analysis

In [None]:
# Basic sentiment classifier (binary: POSITIVE / NEGATIVE)
sentiment = pipeline("sentiment-analysis", device=device)

texty = [
    "This product exceeded all my expectations!",
    "Terrible experience, complete waste of money.",
    "It's okay, nothing special but does the job.",
    "I'm absolutely in love with this purchase!",
    "Never buying from this company again."
]

print("📊 Basic sentiment analysis:\n")
for text in texty:
    result = sentiment(text)[0]
    emoji = "😊" if result['label'] == 'POSITIVE' else "😞"
    print(f"{emoji} {result['label']} ({result['score']:.1%}): {text}")
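
The default pipeline returns only POSITIVE or NEGATIVE, so a lukewarm sentence such as "It's okay, nothing special but does the job." is forced into one of the two classes. One possible workaround, sketched below, treats low-confidence predictions as neutral. The helper `to_three_way` and the 0.75 threshold are illustrative assumptions, not part of transformers, and the inputs are hand-written rather than real model output.

```python
def to_three_way(result, threshold=0.75):
    """Map a binary pipeline result to POSITIVE / NEGATIVE / NEUTRAL.

    `result` is a dict shaped like the pipeline output,
    e.g. {"label": "POSITIVE", "score": 0.98}.
    The threshold is an illustrative assumption, not a tuned value.
    """
    if result["score"] < threshold:
        return "NEUTRAL"
    return result["label"]


# Hand-written example inputs (not real model output)
examples = [
    {"label": "POSITIVE", "score": 0.99},
    {"label": "NEGATIVE", "score": 0.62},
]
for ex in examples:
    print(to_three_way(ex))
```

In the loop above, `to_three_way(sentiment(text)[0])` would give the three-way label. A suitable threshold depends on the model and the data and should be checked on labeled examples.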

## 2. Advanced multi-class sentiment analysis

In [None]:
# 5-star sentiment (like a review rating)
sentiment_5star = pipeline("sentiment-analysis", 
                           model="nlptown/bert-base-multilingual-uncased-sentiment",
                           device=device)

recenze = [
    "Absolutely amazing product, best purchase ever!",
    "Pretty good, minor issues but overall satisfied.",
    "Average product, nothing to complain about.",
    "Below expectations, not worth the price.",
    "Complete disaster, do not buy this!"
]

print("⭐ 5-star analysis:\n")
for text in recenze:
    result = sentiment_5star(text)[0]
    stars = int(result['label'].split()[0])
    print(f"{'⭐' * stars}{'☆' * (5-stars)} ({result['score']:.1%}): {text[:50]}...")
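
The loop above keeps only the most likely star class. When the scores for all five classes are available (for example by requesting them with `top_k=None`, as the emotion pipeline in section 3 does), a probability-weighted rating gives a finer-grained number. This is a minimal sketch with a hypothetical helper `expected_stars`, run on a hand-written distribution rather than real model output.

```python
def expected_stars(scores):
    """Probability-weighted star rating from a full class distribution.

    `scores` is a list of dicts like {"label": "4 stars", "score": 0.5},
    one entry per star class.
    """
    total = sum(s["score"] for s in scores)
    weighted = sum(int(s["label"].split()[0]) * s["score"] for s in scores)
    return weighted / total


# Hand-written distribution for illustration (not real model output)
scores = [
    {"label": "1 star", "score": 0.0},
    {"label": "2 stars", "score": 0.0},
    {"label": "3 stars", "score": 0.25},
    {"label": "4 stars", "score": 0.5},
    {"label": "5 stars", "score": 0.25},
]
print(f"Expected rating: {expected_stars(scores):.2f}")
```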

## 3. Emotion analysis

In [None]:
# Emotion detection model (top_k=None returns scores for all emotions)
emotion = pipeline("text-classification", 
                   model="j-hartmann/emotion-english-distilroberta-base",
                   top_k=None,
                   device=device)

texty = [
    "I just got promoted! This is the best day of my life!",
    "I can't believe they cancelled the project after all our hard work.",
    "The presentation is tomorrow and I'm not ready at all.",
    "This constant noise from neighbors is driving me crazy!",
    "Watching old photos brings back so many memories."
]

emoji_map = {
    'joy': '😊', 'sadness': '😢', 'anger': '😠', 
    'fear': '😰', 'surprise': '😲', 'disgust': '🤢', 'neutral': '😐'
}

print("🎭 Emotion analysis:\n")
for text in texty:
    result = emotion(text)[0]  # all emotions for this text, highest score first
    top_emotion = result[0]
    emoji = emoji_map.get(top_emotion['label'], '❓')
    print(f"{emoji} {top_emotion['label'].upper()} ({top_emotion['score']:.1%})")
    print(f"   \"{text}\"")
    print(f"   Next: {result[1]['label']} ({result[1]['score']:.1%}), {result[2]['label']} ({result[2]['score']:.1%})\n")
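
Because the model scores every emotion, the gap between the top two scores can hint at mixed or uncertain emotions. The following is a minimal sketch under that assumption. The helper `is_ambiguous` and the 0.2 margin are hypothetical, and the inputs are hand-written, not real model output.

```python
def is_ambiguous(scores, min_margin=0.2):
    """Return True when the two highest-scoring emotions are close together.

    `scores` is a list of {"label", "score"} dicts in any order.
    min_margin is an illustrative assumption, not a tuned value.
    """
    ranked = sorted(scores, key=lambda s: s["score"], reverse=True)
    if len(ranked) < 2:
        return False
    return ranked[0]["score"] - ranked[1]["score"] < min_margin


# Hand-written score lists for illustration (not real model output)
clear = [{"label": "joy", "score": 0.9}, {"label": "surprise", "score": 0.05}]
mixed = [{"label": "sadness", "score": 0.45}, {"label": "anger", "score": 0.40}]
print(is_ambiguous(clear), is_ambiguous(mixed))
```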

## 4. Multilingual sentiment analysis (Czech)

In [None]:
# Multilingual model
multilang_sentiment = pipeline("sentiment-analysis",
                               model="nlptown/bert-base-multilingual-uncased-sentiment",
                               device=device)

ceske_recenze = [
    "Skvělý produkt, naprostá spokojenost!",   # Great product, completely satisfied!
    "Docela dobré, ale čekal jsem víc.",       # Pretty good, but I expected more.
    "Průměrné, nic extra.",                    # Average, nothing special.
    "Zklamání, nekvalitní zpracování.",        # Disappointing, poor workmanship.
    "Naprosto hrozné, vyhozené peníze!"        # Absolutely terrible, money wasted!
]

print("🇨🇿 Czech reviews:\n")
for text in ceske_recenze:
    result = multilang_sentiment(text)[0]
    stars = int(result['label'].split()[0])
    print(f"{'⭐' * stars}{'☆' * (5-stars)} {text}")

## 5. Practical automation: product review analysis

In [None]:
# Simulated e-shop data
recenze_df = pd.DataFrame({
    "produkt": ["Notebook Pro"] * 5 + ["Wireless Mouse"] * 5 + ["USB Hub"] * 5,
    "text": [
        "Amazing laptop, super fast!", "Good but battery life could be better.",
        "Perfect for work and gaming.", "Overpriced for what it offers.",
        "Best laptop I've owned!",
        
        "Works great, comfortable grip.", "Stopped working after 2 weeks.",
        "Good value for money.", "Connection issues with Bluetooth.",
        "Perfect size and weight.",
        
        "Does what it's supposed to do.", "Cheap plastic, feels fragile.",
        "All ports work perfectly.", "Gets hot with heavy use.",
        "Great for the price."
    ],
    "datum": pd.date_range("2024-01-01", periods=15, freq="D")
})

# Analyze all reviews in one batch
results = sentiment_5star(recenze_df["text"].tolist())

recenze_df["hvezdicky"] = [int(r['label'].split()[0]) for r in results]
recenze_df["sentiment_score"] = [r['score'] for r in results]

print("📊 Review overview:")
print(recenze_df[["produkt", "hvezdicky", "text"]].to_string(index=False))
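
The list comprehension above assumes every label starts with a star count, such as "4 stars". If a different model with other label names is plugged in, `int(...)` raises an error. A more defensive parser could look like the sketch below, where `label_to_stars` is a hypothetical helper rather than a library function.

```python
import re


def label_to_stars(label):
    """Extract the star count from a label such as "4 stars".

    Returns None when the label does not start with a single digit 1-5,
    e.g. for labels like "POSITIVE" from a different model.
    """
    match = re.match(r"\s*([1-5])\b", label)
    return int(match.group(1)) if match else None


for label in ["1 star", "4 stars", "POSITIVE"]:
    print(label, "->", label_to_stars(label))
```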

In [None]:
# Aggregated statistics
stats = recenze_df.groupby("produkt").agg({
    "hvezdicky": ["mean", "std", "count"],
    "sentiment_score": "mean"
}).round(2)

stats.columns = ["Mean ⭐", "Std", "Count", "Confidence"]
print("\n📈 Statistics by product:")
print(stats)
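
A mean rating can hide a cluster of very bad reviews, so the share of low-star reviews per product is a useful companion metric. The sketch below uses only the standard library. The helper `negative_share` and the cut-off of 2 stars are assumptions chosen to match the "negative" threshold used later in this notebook, and the ratings are hand-written.

```python
from collections import defaultdict


def negative_share(rows, max_stars=2):
    """Fraction of reviews per product rated `max_stars` or lower.

    `rows` is an iterable of (product, stars) pairs.
    """
    totals = defaultdict(int)
    negatives = defaultdict(int)
    for product, stars in rows:
        totals[product] += 1
        if stars <= max_stars:
            negatives[product] += 1
    return {product: negatives[product] / totals[product] for product in totals}


# Hand-written ratings for illustration (not real model output)
rows = [("USB Hub", 4), ("USB Hub", 2), ("USB Hub", 1), ("USB Hub", 5),
        ("Wireless Mouse", 5), ("Wireless Mouse", 1)]
print(negative_share(rows))
```

With the DataFrame from this section, `zip(recenze_df["produkt"], recenze_df["hvezdicky"])` would supply the `rows` argument.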

In [None]:
# Visualization
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Chart 1: average rating (red below 3, yellow below 4, green otherwise)
produkty = recenze_df.groupby("produkt")["hvezdicky"].mean().sort_values()
colors = ['#ff6b6b' if x < 3 else '#ffd93d' if x < 4 else '#6bcb77' for x in produkty]
axes[0].barh(produkty.index, produkty.values, color=colors)
axes[0].set_xlabel("Average rating ⭐")
axes[0].set_title("Sentiment by product")
axes[0].set_xlim(0, 5)

# Chart 2: rating distribution
recenze_df.groupby(["produkt", "hvezdicky"]).size().unstack().plot(
    kind="bar", ax=axes[1], colormap="RdYlGn"
)
axes[1].set_xlabel("Product")
axes[1].set_ylabel("Number of reviews")
axes[1].set_title("Rating distribution")
axes[1].legend(title="Stars")

plt.tight_layout()
plt.show()

## 6. Detecting problematic reviews

In [None]:
import re

ESCALATION_FLAG = "⚠️ POSSIBLE ESCALATION"

def analyzuj_recenzi(text):
    """Comprehensive review analysis: stars, dominant emotion, flags, priority."""
    
    # Sentiment
    sent_result = sentiment_5star(text)[0]
    hvezdicky = int(sent_result['label'].split()[0])
    
    # Emotion
    emo_result = emotion(text)[0]
    hlavni_emoce = emo_result[0]['label']
    
    # Flags for problematic reviews
    flags = []
    if hvezdicky <= 2:
        flags.append("🔴 NEGATIVE")
    if hlavni_emoce == 'anger':
        flags.append("😠 ANGRY CUSTOMER")
    if hlavni_emoce == 'disgust':
        flags.append("🤢 DISGUSTED")
    
    # Escalation keywords, matched at the start of a word so that
    # "sue" does not fire on "issue" while "report" still matches "reporting"
    eskalace_slova = ['lawyer', 'sue', 'refund', 'scam', 'fraud', 'report']
    vzor = r"\b(?:" + "|".join(eskalace_slova) + r")"
    if re.search(vzor, text.lower()):
        flags.append(ESCALATION_FLAG)
    
    return {
        'hvezdicky': hvezdicky,
        'emoce': hlavni_emoce,
        'flags': flags,
        # Two or more flags: HIGH, exactly one: MEDIUM, none: LOW
        'priorita': 'HIGH' if len(flags) > 1 else 'MEDIUM' if flags else 'LOW'
    }

# Test on problematic reviews
problematicke = [
    "This is a scam! I want my money back immediately or I'll contact my lawyer!",
    "Product works fine, just as described.",
    "Absolutely disgusting quality, worst purchase ever!",
    "Delivery was slow but product is okay."
]

print("🚨 Analysis of problematic reviews:\n")
for text in problematicke:
    vysledek = analyzuj_recenzi(text)
    print(f"📝 \"{text[:60]}...\"")
    print(f"   ⭐ {vysledek['hvezdicky']}/5 | 🎭 {vysledek['emoce']} | Priority: {vysledek['priorita']}")
    if vysledek['flags']:
        print(f"   Flags: {', '.join(vysledek['flags'])}")
    print()

## 7. Real-time monitoring (simulation)

In [None]:
import time

# Simulated incoming reviews
stream_recenze = [
    "Just received my order, looks great!",
    "WHERE IS MY PACKAGE?! I've been waiting for 3 weeks!",
    "Good quality, fast shipping.",
    "This company is a complete fraud! Reporting to authorities!",
    "Nice product, would recommend."
]

print("📡 Real-time review monitoring (simulation):\n")

statistiky = {'pozitivni': 0, 'negativni': 0, 'eskalace': 0}
status_map = {'HIGH': "🔴 ALERT", 'MEDIUM': "🟡", 'LOW': "🟢"}

for i, recenze in enumerate(stream_recenze):
    vysledek = analyzuj_recenzi(recenze)
    
    # Update statistics (3-star reviews count as neither positive nor negative)
    if vysledek['hvezdicky'] >= 4:
        statistiky['pozitivni'] += 1
    elif vysledek['hvezdicky'] <= 2:
        statistiky['negativni'] += 1
    if ESCALATION_FLAG in vysledek.get('flags', []):
        statistiky['eskalace'] += 1
    
    # Output
    status = status_map[vysledek['priorita']]
    print(f"[{i+1}] {status} {recenze[:50]}...")
    print(f"    → {vysledek['hvezdicky']}⭐ | {vysledek['emoce']} | {vysledek['priorita']}")
    
    time.sleep(0.5)  # Simulated delay between incoming reviews

print(f"\n📊 Summary: ✅ {statistiky['pozitivni']} positive | ❌ {statistiky['negativni']} negative | ⚠️ {statistiky['eskalace']} escalations")
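
The counters above accumulate over the whole stream, which can mask a sudden burst of complaints. One possible extension is a sliding window over the most recent ratings. The sketch below is an assumption-laden illustration: the class `NegativeRateMonitor`, the window size, and the alert rate are hypothetical, and the star ratings are hand-written rather than model output.

```python
from collections import deque


class NegativeRateMonitor:
    """Track the share of negative reviews over the last `window` items."""

    def __init__(self, window=5, alert_rate=0.5):
        # A deque with maxlen drops the oldest entry automatically
        self.recent = deque(maxlen=window)
        self.alert_rate = alert_rate

    def add(self, stars):
        """Record one rating; return True once a full window reaches the alert rate."""
        self.recent.append(stars <= 2)  # same "negative" cut-off as in the loop above
        full = len(self.recent) == self.recent.maxlen
        rate = sum(self.recent) / len(self.recent)
        return full and rate >= self.alert_rate


monitor = NegativeRateMonitor(window=4, alert_rate=0.5)
# Hand-written star ratings for illustration (not real model output)
for stars in [5, 4, 1, 2, 5, 1]:
    print(stars, monitor.add(stars))
```

Inside the monitoring loop, `monitor.add(vysledek['hvezdicky'])` would feed each new result into the window.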

---
## 🏁 Summary

- ✅ Basic binary sentiment (positive/negative)
- ✅ Multi-level sentiment (1-5 stars)
- ✅ Emotion detection (joy, sadness, anger, fear...)
- ✅ Automatic review prioritization
- ✅ Real-time monitoring

**Next notebook:** Summarization and text generation