### Update the Course Package

* run the cell below
* then change to `force_update=False` to avoid unnecessary reinstalls
* restart the kernel

In [0]:
# run this cell to ensure course package is installed
import sys
from pathlib import Path

course_tools_path = Path('../../Lessons/Course_Tools/').resolve() # change this to the local path of the course package
sys.path.append(str(course_tools_path))

from install_introdl import ensure_introdl_installed
ensure_introdl_installed(force_update=True, local_path_pkg=course_tools_path / 'introdl')

# Homework 7:  Exploring Hugging Face Pipelines and LLM Prompting

In this assignment, you will explore different NLP tasks using Hugging Face's transformers and LLM-based prompting. You will also experiment with different models, zero-shot prompting, and few-shot prompting.


In [5]:
import os
import torch
from transformers import pipeline
from introdl.utils import get_device, wrap_print_text, config_paths_keys
from introdl.nlp import llm_configure, llm_generate, clear_pipeline, print_pipeline_info, display_markdown

# overload print to wrap text
print = wrap_print_text(print)

device = get_device()

paths = config_paths_keys()

mistral_config = llm_configure('mistral-7B')
gemini_config = llm_configure('gemini-flash-lite')


MODELS_PATH=/home/user/cs_workspace/models
DATA_PATH=/home/user/cs_workspace/data
TORCH_HOME=/home/user/cs_workspace/downloads
HF_HOME=/home/user/cs_workspace/downloads


### Some notes about LLMs and Prompting

You can get deterministic responses by setting `search_strategy="deterministic"`.  This helps when you need repeatable results from an LLM.  We'll learn how LLMs generate text in Lesson 11.

In [2]:
system_prompt = "You are an AI assistant who gives helpful, concise answers."
user_prompt = "Tell me about the number pi."

mistral_output = llm_generate(mistral_config, user_prompt, system_prompt=system_prompt, search_strategy='deterministic')
print(mistral_output)

In [3]:
print(mistral_output)

Pi (symbolized by the Greek letter π) is a mathematical constant that represents
the ratio of a circle's circumference to its diameter. It is approximately equal
to 3.14159 and has been calculated to trillions of decimal places beyond its
common representation. The value of pi is an irrational number, meaning it
cannot be expressed as a simple fraction and its decimal representation goes on
indefinitely without repeating.
Pi appears frequently in many areas of mathematics, particularly when dealing
with circles or trigonometry. Some notable formulas involving pi include:
- Area of a circle: A = πr²
- Circumference of a circle: C = 2πr
- Volume of a sphere: V = (4/3)πr³
In addition to its mathematical significance, pi also holds cultural importance
due to its ubiquity across various fields such as physics, engineering, computer
science, and


#### System and User Prompts

* Use the system prompt to set the overall behavior of the LLM.  e.g. 'You are a Named Entity Recognition expert.'
* The main prompt is often called the user prompt, especially in the context of chats.
* Use the user prompt to give detailed instructions, possibly examples, and the particular text you want analyzed.  e.g. 'Input:{text}, Entities:'.

These are guidelines for prompts, not rules.  API-based LLMs may use or weight the system and user prompts differently, but local LLMs generally just concatenate them.
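As a rough illustration of these guidelines, the sketch below fills a user-prompt template and joins it to a system prompt with plain string concatenation. The helper names `build_user_prompt` and `concatenate_prompts` are hypothetical (they are not part of `introdl` or `transformers`), and real chat models usually apply their own chat template, so treat this only as a simplified picture of what "concatenate" means.

```python
# Hypothetical helpers for illustration; not part of introdl or transformers.

SYSTEM_PROMPT = "You are a Named Entity Recognition expert."
USER_TEMPLATE = "Input: {text}\nEntities:"


def build_user_prompt(text: str) -> str:
    """Fill the user-prompt template with the text to analyze."""
    return USER_TEMPLATE.format(text=text)


def concatenate_prompts(system_prompt: str, user_prompt: str) -> str:
    """Join the two prompts as one string: system prompt first, then user prompt."""
    return f"{system_prompt}\n\n{user_prompt}"


user_prompt = build_user_prompt("Marie Curie was a physicist and chemist.")
full_prompt = concatenate_prompts(SYSTEM_PROMPT, user_prompt)
print(full_prompt)
```

When using `llm_generate`, you would pass the system and user prompts separately rather than joining them yourself.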

In [24]:
# Provided Texts for Tasks 1 and 2

texts = [
    "The new AI technology developed by OpenAI is revolutionizing various industries, from healthcare to finance.",
    "Marie Curie was a physicist and chemist who conducted research on radioactivity.",
    "In 2023, NASA successfully landed another rover on Mars, aiming to explore signs of ancient life.",
    "The recent advancements in quantum computing by IBM have the potential to solve complex problems that are currently unsolvable with classical computers.",
    "Despite the company's efforts, the new product launch by XYZ Corp was a complete failure, leading to significant financial losses and a drop in stock prices.",
]



### Task 1: Sentiment Analysis (6 pts)

1.  Use the default sentiment analysis pipeline from HuggingFace to determine the sentiment of each of the texts. Note that some HuggingFace pipelines can handle multiple inputs passed as a list, or you can write a loop to iterate over the texts. Use `clear_pipeline()` to free the memory after you're done.

2.  Try a different HuggingFace model: pass `model="cardiffnlp/twitter-roberta-base-sentiment-latest"` to `pipeline()` to instantiate a different classifier.

3.  Explore HuggingFace to find a different model for sentiment analysis and apply it to each of the texts.  You may have to try more than one model to find one that produces relevant classifications.

4.  Which of the three models seems to best capture the sentiments of the texts?  EXPLAIN.

### Task 2: Named Entity Recognition (6 pts)

1.  Apply the default HuggingFace NER pipeline to each of the texts.

2.  Now use an LLM (local or API-based) with a zero-shot prompt (no examples) to get the LLM to provide a list of entities in JSON format like this:  [{"entity": "entity_name", "type": "entity_type"}].  You'll have to experiment with the system and user prompts to get this to work.  Use `llm_generate` from the introdl package.  You can pass `search_strategy="deterministic"` to get reproducible results.
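LLMs often wrap the requested JSON in extra text, so it can help to pull the list out of the response before using it. The sketch below is one possible approach using only the standard library; `extract_entities` is a hypothetical helper, and the sample response is invented rather than real model output.

```python
import json
import re


def extract_entities(response: str) -> list:
    """Parse the span from the first '[' to the last ']' as JSON; return [] on failure."""
    match = re.search(r"\[.*\]", response, flags=re.DOTALL)
    if match is None:
        return []
    try:
        parsed = json.loads(match.group(0))
    except json.JSONDecodeError:
        return []
    if not isinstance(parsed, list):
        return []
    # Keep only well-formed items that have both expected keys.
    return [item for item in parsed
            if isinstance(item, dict) and {"entity", "type"} <= item.keys()]


# Made-up response text for illustration, not real model output.
sample_response = (
    'Here are the entities:\n'
    '[{"entity": "Marie Curie", "type": "PERSON"}]\n'
    'Hope this helps!'
)
entities = extract_entities(sample_response)
print(entities)
```

If the model returns something that cannot be parsed, the helper returns an empty list, which is a useful signal that the prompt needs more work.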



### Task 3 - Text Generation (6 pts)

Think of a short creative task like writing text for an advertisement, lyrics for a jingle, etc.  Create a prompt for the task.  Use the default HuggingFace pipeline to generate text for the task.  Then use `llm_generate` with two different models to do the same task.  Compare the results.  Which model or pipeline produced the best result?

### Task 4: Translation (6 pts)

Pick your own short text of at least 3 sentences, translate it to another language (not Spanish) and back, and compare the back-translated result to the original text (or, if you're fluent in the other language, comment directly on the translation).

1.  Do this with a HuggingFace pipeline (search HuggingFace for an appropriate model).
2.  Do this by using an LLM and `llm_generate`.

Which works better: the specialized model or the LLM?
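One rough way to supplement a side-by-side reading is a word-level similarity score between the original and the back-translated text. The sketch below uses `difflib.SequenceMatcher` from the standard library; the `similarity` helper is hypothetical and the two strings are invented, not real translation output. A surface score like this does not measure meaning, so it should not replace your own judgment.

```python
from difflib import SequenceMatcher


def similarity(original: str, back_translated: str) -> float:
    """Word-level similarity ratio in [0, 1] between two texts, ignoring case."""
    a = original.lower().split()
    b = back_translated.lower().split()
    return SequenceMatcher(None, a, b).ratio()


# Made-up strings for illustration, not real translation output.
original = "The weather is nice today. We will go to the park."
back_translated = "The weather is beautiful today. We will go to the park."
score = similarity(original, back_translated)
print(f"Similarity: {score:.2f}")
```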

### Task 5: Summarization (8 pts)

For this task you're going to generate summaries of ["The Bitter Lesson"](http://www.incompleteideas.net/IncIdeas/BitterLesson.html) by Rich Sutton.  If you haven't read it already, you should, since it's directly related to deep learning.

The next code cell grabs the text of "The Bitter Lesson".  You may need to `!pip install bs4` to install BeautifulSoup - a webscraping library.

In [0]:
import requests
from bs4 import BeautifulSoup

url = "http://www.incompleteideas.net/IncIdeas/BitterLesson.html"
response = requests.get(url, timeout=30)
response.raise_for_status()  # stop early if the download failed
soup = BeautifulSoup(response.content, 'html.parser')

# Extract the text from the webpage
text = soup.get_text()

print(text)

Now generate three different summaries of the text and comment on the differences.  For the three summaries:
1.  Use the default HuggingFace pipeline for summarization.  Note that "The Bitter Lesson" is too long for the default summarization model.  Split the input text at a paragraph break about halfway through and create a summary for each half.
2.  Find another model on HuggingFace and use it.  Try to find one with a long enough context length to handle the whole article, or split again.
3.  Use an LLM with `llm_generate`.  No need to split here.
4.  Compare the summaries.  Which one seems best and why?
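Splitting "about halfway between paragraphs" can be sketched as follows. This assumes paragraphs are separated by blank lines, which may not match what `soup.get_text()` returns for this page, so check the text and adjust the separator if needed. `split_near_middle` is a hypothetical helper, not part of any library.

```python
def split_near_middle(text: str) -> tuple:
    """Split text into two parts at the paragraph break closest to the midpoint."""
    # Assumption: paragraphs are separated by blank lines.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    if len(paragraphs) < 2:
        return text.strip(), ""
    total = sum(len(p) for p in paragraphs)
    best_index, best_gap, running = 1, None, 0
    # Try each break between paragraphs and keep the one nearest the halfway point.
    for i, paragraph in enumerate(paragraphs[:-1], start=1):
        running += len(paragraph)
        gap = abs(running - total / 2)
        if best_gap is None or gap < best_gap:
            best_index, best_gap = i, gap
    first = "\n\n".join(paragraphs[:best_index])
    second = "\n\n".join(paragraphs[best_index:])
    return first, second


sample = "aaaa\n\nbbbb\n\ncccc\n\ndddd"
first_half, second_half = split_near_middle(sample)
print(repr(first_half))
print(repr(second_half))
```

Each half can then be passed to the summarization pipeline separately.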

### Task 6:  Sarcasm Detection with an LLM (8 pts)

The code in the next cell loads the Sarcasm News Headlines dataset from HuggingFace (we'll use this again next week) and puts the results in a dataframe.  A label of 1 means the headline is sarcastic and a label of 0 means it is not.

In [6]:
# Load the Sarcasm News Headlines dataset
from datasets import load_dataset
import pandas as pd
dataset = load_dataset("raquiba/Sarcasm_News_Headline")

# Convert to Pandas DataFrame
df = pd.DataFrame(dataset['train'])
df = df.rename(columns={'is_sarcastic': 'label'})
df.head(10)


| | label | headline | article_link |
|---|---|---|---|
| 0 | 1 | thirtysomething scientists unveil doomsday clo... | https://www.theonion.com/thirtysomething-scien... |
| 1 | 0 | dem rep. totally nails why congress is falling... | https://www.huffingtonpost.com/entry/donna-edw... |
| 2 | 0 | eat your veggies: 9 deliciously different recipes | https://www.huffingtonpost.com/entry/eat-your-... |
| 3 | 1 | inclement weather prevents liar from getting t... | https://local.theonion.com/inclement-weather-p... |
| 4 | 1 | mother comes pretty close to using word 'strea... | https://www.theonion.com/mother-comes-pretty-c... |
| 5 | 0 | my white inheritance | https://www.huffingtonpost.com/entry/my-white-... |
| 6 | 0 | 5 ways to file your taxes with less stress | https://www.huffingtonpost.com/entry/5-ways-to... |
| 7 | 1 | richard branson's global-warming donation near... | https://www.theonion.com/richard-bransons-glob... |
| 8 | 1 | shadow government getting too large to meet in... | https://politics.theonion.com/shadow-governmen... |
| 9 | 0 | lots of parents know this scenario | https://www.huffingtonpost.comhttp://pubx.co/6... |


For this task, use an LLM to classify the first 10 headlines in the dataset as sarcastic or not sarcastic.  Use a local LLM or an API-based LLM (or both).  If you want to take this a step further, you can experiment with few-shot prompting by providing some examples in your user prompt to the LLM.  Compare your results to the actual labels.
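The sketch below shows one possible shape for a few-shot prompt and for scoring predictions against the labels. All helper names are hypothetical, the example headlines in `FEW_SHOT_EXAMPLES` are invented, and the `responses` list stands in for model output; in practice you would get each response from `llm_generate`.

```python
# Hypothetical helpers for illustration; the example headlines below are invented.

FEW_SHOT_EXAMPLES = [
    ("area man wins argument with himself", 1),
    ("city council approves new budget", 0),
]


def build_few_shot_prompt(headline: str) -> str:
    """Build a user prompt with labeled examples followed by the headline to classify."""
    lines = ["Classify each headline as 1 (sarcastic) or 0 (not sarcastic).", ""]
    for example, label in FEW_SHOT_EXAMPLES:
        lines.append(f"Headline: {example}")
        lines.append(f"Label: {label}")
        lines.append("")
    lines.append(f"Headline: {headline}")
    lines.append("Label:")
    return "\n".join(lines)


def parse_label(response: str) -> int:
    """Return the first 0 or 1 character in the response, or -1 if neither appears."""
    for char in response:
        if char in "01":
            return int(char)
    return -1


def accuracy(predictions: list, labels: list) -> float:
    """Fraction of predictions that match the true labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


prompt = build_few_shot_prompt("5 ways to file your taxes with less stress")
print(prompt)

# Made-up responses standing in for model output.
responses = ["1", " 0", "Label: 1", "unsure"]
predictions = [parse_label(r) for r in responses]
true_labels = [1, 0, 0, 1]
print(accuracy(predictions, true_labels))
```

Returning -1 for unparseable responses keeps them from being silently counted as either class, so they show up as errors in the accuracy.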