# 📘 Sentiment Analysis with RoBERTa and MongoDB

This notebook loads texts from MongoDB, classifies their sentiment with a RoBERTa model, and writes the results back to the database.

Once everything works, the whole pipeline can be moved into a standalone Python script.

## 1. 🔧 Setup & Imports

In [None]:
# Install the dependencies first if needed: pip install transformers torch pymongo
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
from pymongo import MongoClient


## 2. 🧱 Connecting to MongoDB

In [None]:
# Insert your MongoDB URI here
MONGO_URI = "mongodb+srv://<user>:<pass>@<cluster>.mongodb.net"
DB_NAME = "your_database"
COLLECTION_NAME = "your_collection"

client = MongoClient(MONGO_URI)
collection = client[DB_NAME][COLLECTION_NAME]
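
A URI with credentials typed into a notebook cell is easy to leak when the notebook is shared. As a minimal sketch, assuming the settings are supplied through environment variables instead, the values could be read like this. The variable names `MONGO_URI`, `MONGO_DB`, and `MONGO_COLLECTION` and the helper `load_mongo_settings` are hypothetical and not part of the original notebook.

```python
import os


def load_mongo_settings(env=os.environ):
    """Read MongoDB settings from environment variables (hypothetical names)."""
    uri = env.get("MONGO_URI")
    if not uri:
        # Fail early instead of connecting with a placeholder URI
        raise RuntimeError("MONGO_URI is not set")
    return {
        "uri": uri,
        "db_name": env.get("MONGO_DB", "your_database"),
        "collection_name": env.get("MONGO_COLLECTION", "your_collection"),
    }
```

The returned values can then replace the constants above, for example `MongoClient(settings["uri"])[settings["db_name"]][settings["collection_name"]]`.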

## 3. 🤗 Loading the RoBERTa Model

In [None]:
model_name = "cardiffnlp/twitter-roberta-base-sentiment"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
# Index order of the model's output classes: 0 = negative, 1 = neutral, 2 = positive
labels = ['negative', 'neutral', 'positive']

## 4. 🔍 Example Sentiment Inference

In [None]:
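def get_sentiment(text):
    """Return the sentiment label ('negative', 'neutral' or 'positive') for one text."""
    # max_length is set explicitly so truncation still applies if the tokenizer config defines no limit
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512, padding=True)
    with torch.no_grad():  # inference only, no gradients needed
        logits = model(**inputs).logits
    # argmax over the class dimension; [0] selects the single input text
    predicted_class_id = logits.argmax(dim=-1)[0].item()
    return labels[predicted_class_id]

# Try an example
get_sentiment("I love this product!")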
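
`get_sentiment` returns only the label with the largest logit and discards how confident the model was. The sketch below shows, in plain Python, how three logits map to a label and a probability via softmax. It assumes the logits are ordered like `labels`; the helper `label_with_confidence` and the sample logits are made up for illustration and are not real model output.

```python
import math

LABELS = ["negative", "neutral", "positive"]


def label_with_confidence(logits, labels=LABELS):
    """Apply a numerically stable softmax and return (label, probability)."""
    peak = max(logits)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)  # index of the largest probability
    return labels[best], probs[best]


# Made-up logits, for illustration only
label, confidence = label_with_confidence([-1.2, 0.3, 2.5])
print(label, round(confidence, 3))
```

Inside the notebook the same probabilities could be obtained with `torch.softmax(logits, dim=-1)`. A low top probability is a hint that the label should be treated with caution.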

## 5. 🧪 Applying It to MongoDB Data

In [None]:
for doc in collection.find().limit(5):
    text = doc.get("text", "")
    sentiment = get_sentiment(text)
    print(f"Text: {text}\n→ Sentiment: {sentiment}\n")

## 6. 💾 Saving the Sentiment to MongoDB

In [None]:
# Optional: write the sentiment directly to the database
for doc in collection.find():
    text = doc.get("text", "")
    sentiment = get_sentiment(text)
    collection.update_one({"_id": doc["_id"]}, {"$set": {"sentiment": sentiment}})
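
The loop above sends one `update_one` request per document and classifies every document again on each run, including those with empty text. A possible batched variant is sketched below, assuming the text lives in the `text` field and that documents which already have a `sentiment` field can be skipped. `chunked`, `build_updates`, and `write_in_batches` are hypothetical helpers; `UpdateOne` and `bulk_write` are pymongo's bulk-write API.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield lists of at most `size` items from `iterable`."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def build_updates(docs, classify):
    """Return (filter, update) pairs for documents that still need a label."""
    updates = []
    for doc in docs:
        text = doc.get("text")
        if not isinstance(text, str) or not text.strip() or "sentiment" in doc:
            continue  # nothing to classify, or already labelled
        updates.append(({"_id": doc["_id"]}, {"$set": {"sentiment": classify(text)}}))
    return updates


def write_in_batches(collection, docs, classify, batch_size=500):
    """Classify `docs` and write the labels with one bulk_write per batch."""
    from pymongo import UpdateOne  # imported lazily; the helpers above need no driver

    written = 0
    for batch in chunked(docs, batch_size):
        requests = [UpdateOne(flt, upd) for flt, upd in build_updates(batch, classify)]
        if requests:  # bulk_write raises on an empty request list
            collection.bulk_write(requests, ordered=False)
            written += len(requests)
    return written
```

In the notebook this could be called as `write_in_batches(collection, collection.find(), get_sentiment)`. The batch size of 500 is an arbitrary starting value, not a tuned one.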