In [None]:
# DS776 Environment Setup & Package Update
# Configures storage paths for proper cleanup/sync, then updates introdl if needed
# If this cell fails, see Lessons/Course_Tools/AUTO_UPDATE_SYSTEM.md for help
%run ../../Lessons/Course_Tools/auto_update_introdl.py

# Homework 7: Exploring Hugging Face Pipelines and LLM Prompting

In this assignment, you will explore different NLP tasks using Hugging Face's transformers pipelines and LLM-based prompting with `llm_generate()`. You will experiment with different models, zero-shot prompting, and compare results across approaches.

**Total Points: 40**
- Part 1 (Sentiment Analysis): 4 points
- Part 2 (Named Entity Recognition): 6 points  
- Part 3 (Text Generation): 6 points
- Part 4 (Translation): 6 points
- Part 5 (Summarization): 8 points
- Part 6 (Sarcasm Detection): 8 points
- Part 7 (Reflection): 2 points

In [None]:
# YOUR IMPORTS HERE
# Add any additional imports you need below this line

import os
import torch
from transformers import pipeline
from introdl import (
    get_device,
    wrap_print_text,
    config_paths_keys,
    llm_generate,
    clear_pipeline,
    print_pipeline_info,
    display_markdown, 
    show_session_spending 
)
# Wrap print to format text nicely at 120 characters
print = wrap_print_text(print, width=120)

device = get_device()

# Configure paths
paths = config_paths_keys()
DATA_PATH = paths['DATA_PATH']
MODELS_PATH = paths['MODELS_PATH']

## Notes About Using LLMs Programmatically

**Using `llm_generate()` for all LLM tasks:**
- Use `llm_generate()` with `'gemini-flash-lite'` as your default model (fast and cost-effective)
- For each task, also try at least **one other model** to compare results (e.g., `'gpt-4o-mini'`, `'mistral-medium'`, `'llama-3.3-70b'`)
- You can pass `temperature=0` to get more deterministic (reproducible) responses
- Use `mode='json'` when you need structured JSON output

**System and User Prompts:**
- Use the **system prompt** to set the overall behavior (e.g., "You are a sentiment analysis expert")
- Use the **user prompt** for specific instructions and the text to analyze
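
One way to keep the two roles separate is to hold the system prompt fixed and build one user prompt per text. The sketch below only constructs the strings. `build_user_prompt` and the prompt wording are hypothetical, and the actual call to `llm_generate()` should follow the signature shown in the lesson notebooks.

```python
# The system prompt sets overall behavior; the user prompt carries the task and the text.
SYSTEM_PROMPT = "You are a sentiment analysis expert."


def build_user_prompt(text: str) -> str:
    """Hypothetical helper: combine task instructions with the text to analyze."""
    return (
        "Classify the sentiment of the following text as positive, negative, or neutral. "
        "Respond with a single word.\n\n"
        f"Text: {text}"
    )


example_texts = [
    "Marie Curie was a physicist and chemist who conducted research on radioactivity.",
]

# One user prompt per text; the system prompt stays the same for every call.
user_prompts = [build_user_prompt(t) for t in example_texts]
print(user_prompts[0])
```

Keeping the instructions in one helper makes it easy to reuse the same prompts when you switch models for comparison.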

## Storage Guidance

**Always use the path variables** (`MODELS_PATH`, `DATA_PATH`, `CACHE_PATH`) instead of hardcoded paths. The actual locations depend on your environment:

| Variable | CoCalc Home Server | Compute Server |
|----------|-------------------|----------------|
| `MODELS_PATH` | `Homework_07_Models/` | `Homework_07_Models/` *(synced)* |
| `DATA_PATH` | `~/home_workspace/data/` | `~/cs_workspace/data/` *(local)* |
| `CACHE_PATH` | `~/home_workspace/downloads/` | `~/cs_workspace/downloads/` *(local)* |

**Why this matters:**
- On **Compute Servers**: Only `MODELS_PATH` syncs back to CoCalc (~10GB limit). Data and cache stay local (~50GB).
- On **CoCalc Home**: Everything syncs and counts against the ~10GB limit.
- **Storage_Cleanup.ipynb** (in this folder) helps free synced space when needed.

**Tip:** Always write `MODELS_PATH / 'model.pt'`; never hardcode paths like `'Homework_07_Models/model.pt'`.
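
As a small illustration of the tip, the sketch below joins a file name onto a path variable with `pathlib`. The `MODELS_PATH` value here is a stand-in so the snippet runs by itself. In the notebook it comes from `config_paths_keys()`, which is assumed to return `pathlib.Path` objects.

```python
from pathlib import Path

# Stand-in value for illustration only; in the notebook use paths['MODELS_PATH'].
MODELS_PATH = Path("example_models_dir")

# The / operator joins path components portably across operating systems.
checkpoint_path = MODELS_PATH / "model.pt"

print(checkpoint_path.name)    # model.pt
print(checkpoint_path.parent)  # example_models_dir
```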

In [None]:
# Provided Texts for Tasks 1 and 2

texts = [
    "The new AI technology developed by OpenAI is revolutionizing various industries, from healthcare to finance.",
    "Marie Curie was a physicist and chemist who conducted research on radioactivity.",
    "In 2023, NASA successfully landed another rover on Mars, aiming to explore signs of ancient life.",
    "The recent advancements in quantum computing by IBM have the potential to solve complex problems that are currently unsolvable with classical computers.",
    "Despite the company's efforts, the new product launch by XYZ Corp was a complete failure, leading to significant financial losses and a drop in stock prices.",
]

## Part 1 - Sentiment Analysis (4 pts)

**1a.** Use the default sentiment analysis pipeline from HuggingFace to determine the sentiment of each text. Use `clear_pipeline()` to free memory when done.

In [None]:
# YOUR CODE HERE

**1b.** Now use `llm_generate()` with `'gemini-flash-lite'` to perform sentiment analysis. Create appropriate system and user prompts.

In [None]:
# YOUR CODE HERE

**1c.** Try a different model with `llm_generate()` for the same task.

In [None]:
# YOUR CODE HERE

**1d.** Which approach (HuggingFace pipeline or which LLM model) best captures the sentiments? Explain your reasoning.

📝 **YOUR ANSWER HERE:**

## Part 2 - Named Entity Recognition (6 pts)

**2a.** Apply the default HuggingFace NER pipeline to each of the texts. Display results in a pandas DataFrame.

In [None]:
# YOUR CODE HERE

**2b.** Use `llm_generate()` with `mode='json'` to get structured NER output. Use `'gemini-flash-lite'` and craft prompts to return JSON with entity information. Display results as a DataFrame.
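
When you describe the JSON structure in the prompt, it helps to decide up front how the response will become table rows. The sketch below parses a hand-written stand-in for a model response (not real model output) and assumes a hypothetical schema of `{"entities": [{"text": ..., "type": ...}]}`. Whether `llm_generate()` returns a string or an already-parsed object in JSON mode is not shown here, so check the lesson notebooks.

```python
import json

# Hand-written stand-in for a JSON-mode response; the schema is a hypothetical example.
raw_response = '{"entities": [{"text": "Marie Curie", "type": "PERSON"}]}'

parsed = json.loads(raw_response)

# Flatten into one row per entity, tagging each row with the source sentence index.
rows = [
    {"sentence_id": 1, "entity": ent["text"], "type": ent["type"]}
    for ent in parsed.get("entities", [])
]
print(rows)
```

A list of dictionaries like `rows` can be passed directly to `pd.DataFrame(rows)` for display.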

In [None]:
# YOUR CODE HERE
# Hint: Use mode='json' and describe the JSON structure you want in the prompt

**2c.** Try a different LLM model with JSON mode and compare the results.

In [None]:
# YOUR CODE HERE

## Part 3 - Text Generation (6 pts)

Think of a short creative task (e.g., writing an advertisement, lyrics for a jingle, a product description, etc.).

**3a.** Use the default HuggingFace text generation pipeline for your task.

In [None]:
# YOUR CODE HERE

**3b.** Use `llm_generate()` with two different models to perform the same task.

In [None]:
# Model 1: gemini-flash-lite
# YOUR CODE HERE

In [None]:
# Model 2: (your choice)
# YOUR CODE HERE

**3c.** Which approach produced the best result? Explain why.

📝 **YOUR ANSWER HERE:**

## Part 4 - Translation (6 pts)

Pick your own short text (at least 3 sentences) and translate it to another language (not Spanish) and back, then compare.
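
For the comparison step, a rough numeric similarity between the original and the back-translated text can complement reading them side by side. The sketch below uses `difflib.SequenceMatcher` from the standard library on two hand-written stand-in strings (not real translation output). `word_similarity` is a hypothetical helper, and the word-level ratio is only a crude surface measure that says nothing about meaning.

```python
from difflib import SequenceMatcher


def word_similarity(a: str, b: str) -> float:
    """Return a 0-1 similarity ratio computed over lowercase word sequences."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


original = "The weather is lovely today. We walked to the market."
round_trip = "The weather is beautiful today. We went to the market."  # stand-in text

print(f"similarity: {word_similarity(original, round_trip):.2f}")
```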

**4a.** Use HuggingFace translation pipelines. Search for an appropriate model on HuggingFace Hub for your chosen language.

In [None]:
# YOUR CODE HERE
# Translate to target language
# Translate back to English
# Compare

**4b.** Use `llm_generate()` with two different models to translate to your chosen language and back.

In [None]:
# YOUR CODE HERE

**4c.** Which works better - the specialized HuggingFace model or the LLMs? Explain.

📝 **YOUR ANSWER HERE:**

## Part 5 - Summarization (8 pts)

For this task you'll generate summaries of ["The Bitter Lesson"](http://www.incompleteideas.net/IncIdeas/BitterLesson.html) by Rich Sutton. This is a short, influential essay on AI research that you should read!

In [None]:
# Get the text from "The Bitter Lesson"
# You may need to: !pip install beautifulsoup4

import requests
from bs4 import BeautifulSoup

url = "http://www.incompleteideas.net/IncIdeas/BitterLesson.html"
response = requests.get(url, timeout=30)
response.raise_for_status()  # stop early if the page could not be fetched
soup = BeautifulSoup(response.content, 'html.parser')

# Extract the text from the webpage
text = soup.get_text()

print(text)

**5a.** Use the default HuggingFace summarization pipeline. Note: "The Bitter Lesson" is too long for the default model. Split the text roughly in half and summarize each half separately.
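
Splitting "roughly in half" works best at a sentence boundary so that neither chunk starts mid-sentence. The sketch below is one possible approach: a hypothetical helper, `split_near_middle`, cuts at the first period followed by a space at or after the midpoint. It runs on a short stand-in string rather than the article.

```python
def split_near_middle(text: str) -> tuple[str, str]:
    """Split text in two at the first sentence end at or after the midpoint."""
    midpoint = len(text) // 2
    cut = text.find(". ", midpoint)
    if cut == -1:
        # No sentence boundary after the midpoint: fall back to a hard split.
        cut = midpoint
    else:
        cut += 1  # keep the period with the first half
    return text[:cut].strip(), text[cut:].strip()


sample = "First sentence here. Second sentence follows. Third one ends it. Fourth is last."
first_half, second_half = split_near_middle(sample)
print(first_half)
print(second_half)
```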

In [None]:
# YOUR CODE HERE

**5b.** Use `llm_generate()` with `'gemini-flash-lite'` to summarize the entire article (no need to split).

In [None]:
# YOUR CODE HERE

**5c.** Try a different LLM model for summarization.

In [None]:
# YOUR CODE HERE

**5d.** Compare all three summaries. Which one seems best and why?

📝 **YOUR ANSWER HERE:**

## Part 6 - Sarcasm Detection (8 pts)

The Sarcasm News Headlines dataset contains headlines labeled as sarcastic (1) or not sarcastic (0).

In [None]:
# Load Sarcasm News Headlines Dataset
from datasets import load_dataset
import pandas as pd

dataset = load_dataset("raquiba/Sarcasm_News_Headline")

# Convert to Pandas DataFrame
df = pd.DataFrame(dataset['train'])
df = df.rename(columns={'is_sarcastic': 'label'})
df.head(10)

**6a.** Use `llm_generate()` with `'gemini-flash-lite'` to classify the first 10 headlines as sarcastic or not. Build the prompts programmatically, as shown in the lesson notebooks, and compare the predictions to the actual labels.
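
Comparing to the actual labels is easier if each free-text response is first mapped to 0 or 1. The sketch below uses hand-written stand-in responses and labels (not real model output or dataset rows). `parse_label` is a hypothetical helper that assumes the prompt asked for a short answer of "sarcastic" or "not sarcastic".

```python
def parse_label(response: str) -> int:
    """Map a short text response to 1 (sarcastic) or 0 (not sarcastic)."""
    cleaned = response.strip().lower()
    # Check the negative form first, since "not sarcastic" contains "sarcastic".
    if cleaned.startswith("not"):
        return 0
    return 1 if "sarcastic" in cleaned else 0


responses = ["Sarcastic", "not sarcastic", "Not sarcastic.", "sarcastic"]  # stand-ins
true_labels = [1, 0, 1, 1]  # stand-ins

predictions = [parse_label(r) for r in responses]

# Fraction of predictions that match the true labels.
accuracy = sum(p == t for p, t in zip(predictions, true_labels)) / len(true_labels)
print(predictions, accuracy)
```

Normalizing the responses this way also makes it straightforward to reuse the same comparison when you switch models or prompts.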

In [None]:
# YOUR CODE HERE

**6b.** Try a different model for sarcasm detection and compare results.

In [None]:
# YOUR CODE HERE

**6c.** Arguably, many of the examples in the dataset are ironic rather than sarcastic. (Irony pertains to situations, while sarcasm is a form of expression.) Try prompting `'gemini-flash-lite'` to act as an irony detector and see whether it performs better.

In [None]:
# YOUR CODE HERE (optional)

**6d.** Which model and/or prompting approach performs best? Give a brief explanation.

📝 **YOUR ANSWER HERE:**

## Part 7 - Reflection (2 pts)

1. What, if anything, did you find difficult to understand in this lesson? Why?

📝 **YOUR ANSWER HERE:**

2. What resources did you find supported your learning most and least for this lesson? (Be honest - I use your input to shape the course.)

📝 **YOUR ANSWER HERE:**

### Export Notebook to HTML for Canvas Upload

Uncomment the two lines below and run the cell to export the current notebook to HTML.

In [2]:
# from introdl import export_this_to_html
# export_this_to_html()