# Session 6: Domain Safari
[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/buildLittleWorlds/level-2-course-material/blob/main/session-06/notebook.ipynb)

Same models, different worlds of text. Watch them struggle.

In [None]:
# Setup — run this cell first!
!pip install -q transformers torch

from transformers import pipeline

print("Loading 3 sentiment models...")
movie_model = pipeline("sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english")
twitter_model = pipeline("sentiment-analysis",
    model="cardiffnlp/twitter-roberta-base-sentiment-latest")
review_model = pipeline("sentiment-analysis",
    model="nlptown/bert-base-multilingual-uncased-sentiment")
print("All 3 models loaded!")

## What We Built Tonight

No new Space tonight — we reused the **Sentiment Showdown** from Session 4.

We tested these 3 models on text they were **never trained on** and watched them struggle.

Check out the live Space: [Sentiment Showdown on Hugging Face](https://huggingface.co/spaces/profplate/sentiment-showdown)

In [None]:
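# Helper function: run text through all 3 models
def compare_models(text):
    """Print each model's top label and confidence for the same text."""
    # Slice to 512 characters to keep long inputs short. The models' real
    # limit is counted in tokens, so truncation=True is passed as a backstop.
    snippet = text[:512]
    results = {
        "Movie Model": movie_model(snippet, truncation=True)[0],
        "Twitter Model": twitter_model(snippet, truncation=True)[0],
        "Review Model": review_model(snippet, truncation=True)[0],
    }
    for name, r in results.items():
        print(f"  {name}: {r['label']} ({r['score']:.1%})")
    print()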
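
The three models report sentiment on different label sets, so their printed lines are hard to compare side by side. Here is a minimal sketch of one way to map them onto a shared negative/neutral/positive scale. The name `normalize_label` is hypothetical and not part of the course code. The label strings are assumed from each model's card, and treating 3 stars as neutral is a judgment call.

```python
def normalize_label(label):
    """Map a raw model label onto 'negative', 'neutral', or 'positive'.

    Assumed label formats (check each model card to confirm):
      - Movie model:   'POSITIVE' / 'NEGATIVE'
      - Twitter model: 'negative' / 'neutral' / 'positive'
      - Review model:  '1 star' ... '5 stars'
    """
    text = label.strip().lower()
    if text in ("negative", "neutral", "positive"):
        return text
    if "star" in text:
        stars = int(text.split()[0])  # '4 stars' -> 4
        if stars <= 2:
            return "negative"
        if stars == 3:
            return "neutral"  # assumption: the middle rating counts as neutral
        return "positive"
    return "unknown"  # any label format this sketch does not recognize


for raw in ["POSITIVE", "neutral", "2 stars", "5 stars"]:
    print(f"{raw!r} -> {normalize_label(raw)}")
```

Putting every prediction on one scale makes it easier to see when the models truly disagree and when they only use different words for the same answer.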

## Domain 1: News Articles

News text is mostly neutral and factual. The movie model still has to pick POSITIVE or NEGATIVE because it has no neutral option.
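
A model with only two labels cannot say "neutral", but a low confidence score can hint that it is guessing. The sketch below shows one rough way to flag those cases. The name `flag_uncertain` and the 0.75 cutoff are hypothetical choices, and the example scores are made up for illustration. A model can also be confidently wrong on neutral text, so treat this as a heuristic only.

```python
def flag_uncertain(result, threshold=0.75):
    """Return the label, or 'UNCERTAIN' when the score is below the threshold.

    result:    a dict shaped like pipeline output, {'label': str, 'score': float}
    threshold: hypothetical cutoff; tune it for your own experiments
    """
    if result["score"] < threshold:
        return "UNCERTAIN"
    return result["label"]


# Made-up scores for illustration only, not real model output.
print(flag_uncertain({"label": "POSITIVE", "score": 0.58}))  # prints UNCERTAIN
print(flag_uncertain({"label": "NEGATIVE", "score": 0.97}))  # prints NEGATIVE
```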

In [None]:
news = "The Federal Reserve announced a quarter-point interest rate cut on Wednesday, signaling confidence that inflation is moving sustainably toward its 2 percent target."
print("NEWS:")
compare_models(news)

## Domain 2: Tweets

Which model handles slang best? Has the product review model ever seen this kind of language?

In [None]:
tweet = "ngl this new update is mid at best. they really thought they did something"
print("TWEET:")
compare_models(tweet)

## Domain 3: Song Lyrics

This lyric sounds upbeat (dancing, laughing, celebrating), but it's about destruction. Can the models tell?

In [None]:
lyrics = "Dancing on the ceiling, burning down the walls, laughing at the wreckage as the empire falls. We'll celebrate the ending with confetti made of ash."
print("SONG LYRICS:")
compare_models(lyrics)

## Domain 4: Student Essay

Academic writing is carefully balanced — not really positive or negative.

In [None]:
essay = "In conclusion, while both authors present compelling arguments, Smith's analysis is more thoroughly supported by evidence. However, Jones raises important counterpoints that cannot be ignored."
print("STUDENT ESSAY:")
compare_models(essay)

## Domain 5: Text Messages

Is "lol ok sure whatever u say" positive, negative, or sarcastic? Even a human would need context to tell.

In [None]:
texts = [
    "lol ok sure whatever u say",
    "omg YES that's literally the best thing ever im so happy rn",
]
for t in texts:
    print(f"TEXT MESSAGE: {t}")
    compare_models(t)

## Domain 6: Legal Text

Legal text has no sentiment — it's purely functional. But the models HAVE to output something.

In [None]:
legal = "The party of the first part shall indemnify and hold harmless the party of the second part against any and all claims, damages, losses, costs, and expenses arising out of or relating to any breach of this agreement."
print("LEGAL TEXT:")
compare_models(legal)

## Experiments

### Experiment 1: Test Your Own Domain

Pick a type of text and run it through all 3 models. Record what happens.

In [None]:
# Experiment 1: Paste your own text here
my_text = ""  # <-- Put your text inside the quotes

if my_text:
    print("MY TEXT:")
    compare_models(my_text)
else:
    print("Paste some text between the quotes above and run this cell again!")

### Experiment 2: Find the Domain That Breaks All Models

Try to find text where ALL 3 models get confused or give wrong answers.

Ideas: mixing languages, heavy sarcasm, recipes, math problems, meme text, code comments.
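
To decide whether a text "breaks" a model, you need your own judgment of the right answer to compare against. A small sketch of that check follows, assuming you have already written each prediction on the same negative/neutral/positive scale. The name `models_that_missed` is hypothetical, and the example predictions are placeholders rather than real model output.

```python
def models_that_missed(expected, predictions):
    """Return the names of models whose label differs from your own judgment.

    expected:    the label you think is right, e.g. 'negative'
    predictions: dict of model name -> label, all on the same scale
    """
    return [name for name, label in predictions.items()
            if label.lower() != expected.lower()]


# Placeholder predictions for illustration only, not real model output.
example = {
    "Movie Model": "positive",
    "Twitter Model": "neutral",
    "Review Model": "positive",
}
missed = models_that_missed("negative", example)
print(f"{len(missed)} of {len(example)} models missed: {missed}")
if len(missed) == len(example):
    print("This text broke all three models!")
```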

In [None]:
# Experiment 2: Try to break all 3 models
breaking_text = ""  # <-- Put your text here

if breaking_text:
    print("BREAKING TEXT:")
    compare_models(breaking_text)
else:
    print("Paste some text that you think will confuse the models!")

### Experiment 3: Which Model Is Most Consistent?

Run 3 different domains through all models. Which model gives the most reasonable answers across all of them?

Record your observations here:

| Domain | Movie Model | Twitter Model | Review Model | Which was best? |
|--------|------------|---------------|--------------|----------------|
| | | | | |
| | | | | |
| | | | | |
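
If you would rather not type each result by hand, a helper can build the table rows for you. This sketch assumes you have collected each model's output in a dict shaped like the pipeline results. The name `table_row` is hypothetical, and the scores in the example are placeholders, not real model output.

```python
def table_row(domain, results):
    """Build one Markdown row: domain, one cell per model, blank last cell.

    results: dict of model name -> {'label': str, 'score': float},
             in the same order as the table columns
    """
    cells = [f"{r['label']} ({r['score']:.1%})" for r in results.values()]
    # The trailing empty string leaves "Which was best?" for you to fill in.
    return "| " + " | ".join([domain, *cells, ""]) + " |"


# Placeholder scores for illustration only, not real model output.
row = table_row("Poetry", {
    "Movie Model": {"label": "NEGATIVE", "score": 0.91},
    "Twitter Model": {"label": "negative", "score": 0.64},
    "Review Model": {"label": "2 stars", "score": 0.40},
})
print(row)
```

Paste the printed row into the table above, then fill in the last column yourself.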

In [None]:
# Experiment 3: Test 3 domains in a row
domains = {
    "Poetry": "I wandered lonely through the ash of things that used to gleam. The world had shed its golden mask and left a hollow dream.",
    "Code comment": "// HACK: This is a terrible workaround for the race condition. TODO: fix this properly before it breaks production.",
    "Meme": "Nobody: Absolutely nobody: My cat at 3am: knocks everything off the counter",
}

for domain, text in domains.items():
    print(f"{domain.upper()}:")
    compare_models(text)

## My Observations

Write your observations in this cell (double-click to edit):

- Which model worked best across the most domains?
- Were there domains where ALL models struggled?
- What makes a domain "hard" for these models?

*Your answers here...*

## Challenge

Find a domain that **breaks all three models**. Write down:
1. The text you used
2. What each model predicted
3. Why you think it confused them

**GitHub:** Write a README.md for your `my-ai-portfolio` repo this week!
1. Go to your repo on github.com
2. Click the pencil icon on README.md (or create one if it's missing)
3. Write a few sentences about what you're learning in this course
4. Add links to any notebooks you've uploaded
5. Commit the changes

## Vocabulary

| Term | Meaning |
|------|---------|
| **Domain** | A category of text (tweets, legal documents, poetry, product reviews, etc.) |
| **Overfitting** | When a model fits its training data so closely that it performs poorly on new data |
| **Domain shift** | When the real-world data is different from the training data |
| **Generalization** | A model's ability to work well on data it hasn't seen before |
| **Distribution** | The typical patterns of a body of text (vocabulary, style, length, topics); a model learns the distribution of its training data |