# Part 1: Question Answering with HuggingFace Transformers

In this notebook, we demonstrate how to use HuggingFace's pre-trained question-answering models to extract answers from text. Question-answering (QA) is a task where the model reads a passage of text (the **context**) and answers questions about it by identifying the relevant span of text that contains the answer.

## Key Concepts:

- **Extractive QA**: The model finds and extracts the answer directly from the provided context (it doesn't generate new text).
- **Context**: The passage of text that contains the information needed to answer the question.
- **Confidence Score**: A value between 0 and 1 indicating how confident the model is in its answer. Higher scores mean greater confidence.
- **Start/End Positions**: Character positions in the context where the answer begins and ends.

## How It Works:

1. **Initialize the pipeline**: This creates a question-answering model (by default, one trained on the SQuAD dataset).
2. **Provide context and question**: The model searches the context for the answer to your question.
3. **Get the answer**: The pipeline returns the answer along with a confidence score and the position where it was found.

**Important Notes:**
- The model can only answer questions based on information present in the context.
- If the answer isn't in the context, the model will still return some span of text, usually (but not always) with a low confidence score.
- This is useful for: customer support bots, document search, reading comprehension, information extraction, and more.
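
To make the "span" idea concrete, here is a minimal sketch of how an extractive QA model's per-token start and end scores might be turned into an answer and a confidence score. The tokens and logits are invented for illustration, and `best_span` is a hypothetical helper; the real pipeline's post-processing differs in its details (sub-word tokens, long contexts, and so on).

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def best_span(start_logits, end_logits, max_answer_len=5):
    """Return (start, end, score) for the most probable valid span."""
    start_probs = softmax(start_logits)
    end_probs = softmax(end_logits)
    best = (0, 0, 0.0)
    for i, p_start in enumerate(start_probs):
        # Only consider ends at or after the start, within the length limit
        for j in range(i, min(i + max_answer_len, len(end_probs))):
            score = p_start * end_probs[j]
            if score > best[2]:
                best = (i, j, score)
    return best

# Hypothetical tokens and scores (not produced by a real model)
tokens = ["return", "items", "within", "30", "days", "of", "purchase"]
start_logits = [0.1, 0.0, 0.5, 4.0, 0.3, 0.0, 0.2]
end_logits = [0.0, 0.1, 0.2, 1.0, 4.5, 0.1, 0.6]

start, end, score = best_span(start_logits, end_logits)
answer = " ".join(tokens[start:end + 1])
print(answer, round(score, 4))
```

The score here is the product of two probabilities, which is why it always falls between 0 and 1.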

In [18]:
# Import the pipeline function from the transformers library
from transformers import pipeline

# Initialize the question-answering pipeline:
# This loads a pre-trained model (default: distilbert-base-cased-distilled-squad)
question_answerer = pipeline(task="question-answering")

No model was supplied, defaulted to distilbert/distilbert-base-cased-distilled-squad and revision 564e9b5 (https://huggingface.co/distilbert/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.


Device set to use cpu


## Example 1: Customer Support Automation

This example shows how QA can be used to automatically answer customer service questions by searching through policy documents.

In [33]:
context = """
Our return policy allows customers to return items within 30 days of purchase
for a full refund. Items must be unused and in original packaging. To initiate
a return, contact customer service at support@example.com or call 1-800-RETURNS.
Shipping costs are refunded for defective items only.
"""

# The question_answerer takes two inputs:
# - question: The question you want answered
# - context: The text that contains the answer
preds = question_answerer(
    question="How long do I have to return an item?",
    context=context
)

# The output includes:
# - answer: The extracted text span
# - score: Confidence level (0 to 1)
# - start/end: Character positions in the context
print(f"Answer: {preds['answer']} (start/end: {preds['start']}/{preds['end']}, confidence: {round(preds['score'], 4)})")

Answer: 30 days (start/end: 59/66, confidence: 0.5664)
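
The `start` and `end` values are character offsets into `context`, with `end` exclusive, so slicing the string recovers the answer. A quick check using the offsets printed above (the context is repeated so the snippet runs on its own):

```python
context = """
Our return policy allows customers to return items within 30 days of purchase
for a full refund. Items must be unused and in original packaging. To initiate
a return, contact customer service at support@example.com or call 1-800-RETURNS.
Shipping costs are refunded for defective items only.
"""

start, end = 59, 66  # offsets reported in the output above
span = context[start:end]
print(repr(span))  # '30 days'
```

Note that the leading newline inside the triple-quoted string counts as character 0.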


## Example 2: Educational Content - Reading Comprehension
QA models are excellent for testing reading comprehension or creating study aids from textbooks.

In [40]:
context = """
Photosynthesis is the process by which plants convert light energy into chemical
energy. During this process, plants absorb carbon dioxide from the air and water
from the soil. Using chlorophyll in their leaves, they capture sunlight and
convert these materials into glucose and oxygen. The glucose provides energy
for the plant, while oxygen is released into the atmosphere.
"""

preds = question_answerer(
    question="What gas do plants release during photosynthesis?",
    context=context
)

print(f"Answer: {preds['answer']} (confidence: {round(preds['score'], 4)})")

Answer: oxygen (confidence: 0.8627)


## Example 3: Legal Document Analysis
QA can help extract specific information from contracts, policies, and legal documents.

In [39]:
context = """
Section 4.2: The Employee agrees to maintain confidentiality of all proprietary
information for a period of five years following termination of employment.
Proprietary information includes but is not limited to: trade secrets, client
lists, financial data, and unreleased product specifications.
"""

preds = question_answerer(
    question="How long must confidentiality be maintained after leaving the company?",
    context=context
)
print(f"Answer: {preds['answer']} (confidence: {round(preds['score'], 4)})")

Answer: five years (confidence: 0.864)


## Example 4: Text Analysis
QA can be applied to any text corpus, including religious texts, literature, historical documents, etc.

In [38]:
context = """
In the beginning, God created the heavens and the earth. The earth was without
form and void, and darkness was over the face of the deep. And the Spirit of
God was hovering over the face of the waters. And God said, "Let there be light,"
and there was light. (Genesis 1:1-3)
"""

preds = question_answerer(
    question="What did God create first?",
    context=context
)
print(f"Answer: {preds['answer']} (confidence: {round(preds['score'], 4)})")

Answer: the heavens and the earth (confidence: 0.584)


## Question Answering in Practice

Now let's apply this to a real document: our course syllabus! This demonstrates how you could build a simple FAQ bot for any document.

**Setup:** Make sure `LPP_syllabus.txt` is uploaded to your Colab environment.

In [26]:
# Read the syllabus as UTF-8 so characters such as en dashes load correctly
with open('LPP_syllabus.txt', 'r', encoding='utf-8') as f:
    course_info = f.read()

### Single Question Example

Let's ask one question about the syllabus:

In [41]:
question = "When is the final project due?"

# Pass the question and the entire syllabus as context:
answer = question_answerer(question=question, context=course_info)

print(f"Q: {question}")
print(f"A: {answer['answer']}")
print(f"Confidence: {round(answer['score'], 2)}")

Q: When is the final project due?
A: Weeks 14–15
Confidence: 0.81


### Multiple Questions

**Mini-Challenge:**
1. Add your own questions to the `questions` list below
2. Try questions about: grading policies, office hours, assignment deadlines, course topics, etc.
3. Run your questions through a for-loop.
4. Observe the confidence scores: what kinds of questions does the model struggle with?

**Things to Explore:**
- What happens when you ask a question whose answer isn't in the text?
- How does the confidence score change for ambiguous questions?
- Can you find questions where the model gives incorrect answers?
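
One simple way to handle unanswerable questions is to treat low-confidence predictions as "no answer". The sketch below assumes predictions shaped like the pipeline's output dictionaries. The `answer_or_fallback` helper, the `0.3` threshold, and the second sample prediction are all made up for illustration; a real threshold would need tuning on your own questions.

```python
def answer_or_fallback(pred, threshold=0.3,
                       fallback="I couldn't find that in the document."):
    """Return the extracted answer only if its confidence clears the threshold."""
    if pred["score"] >= threshold:
        return pred["answer"]
    return fallback

# Predictions in the same shape the QA pipeline returns
sample_preds = [
    {"answer": "30 days", "score": 0.5664, "start": 59, "end": 66},
    {"answer": "original packaging", "score": 0.04, "start": 0, "end": 0},  # hypothetical
]

replies = [answer_or_fallback(p) for p in sample_preds]
for reply in replies:
    print(reply)
```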

In [None]:
# TODO: Students should modify these questions to explore the syllabus!
questions = [
    "What is the instructor's email?",
    "What percentage is Mini-Project 1 worth?",
    "What Python libraries will we use?",
]

# Loop through each question and get an answer:
for q in questions:
    result = question_answerer(question=q, context=course_info)
    print(f"\nQ: {q}")
    print(f"A: {result['answer']} (confidence: {round(result['score'], 2)})")

# Part 2: Text Summarization in HuggingFace

In this section, we demonstrate how to use HuggingFace's pre-trained summarization models to automatically generate concise summaries of longer texts. Text summarization is the task of condensing a document while preserving its key information and meaning.

## Key Concepts:

- **Abstractive Summarization**: The model generates new sentences to summarize the text (doesn't just copy sentences from the original).
- **Extractive Summarization**: The model selects important sentences directly from the original text.
- **Length Control**: You can specify minimum and maximum lengths for the generated summary.
- **Use Cases**: Article summaries, meeting notes, document previews, email digests, research paper abstracts.
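
To contrast the two approaches, here is a deliberately naive extractive baseline: it scores each sentence by how frequent its words are in the whole text and returns the top-scoring sentences unchanged. This is only a sketch of the idea (the `extractive_summary` function and its scoring rule are illustrative assumptions), not how the HuggingFace models used below work.

```python
import re
from collections import Counter

def extractive_summary(text, num_sentences=1):
    """Pick the sentences whose words are most frequent in the text overall."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]
    word_counts = Counter(re.findall(r"\w+", text.lower()))

    def score(sentence):
        words = re.findall(r"\w+", sentence.lower())
        return sum(word_counts[w] for w in words) / max(len(words), 1)

    top = sorted(sentences, key=score, reverse=True)[:num_sentences]
    # Keep the chosen sentences in their original order
    return " ".join(s for s in sentences if s in top)

sample = "Plants need light. Plants need water and light to grow. Rocks are hard."
picked = extractive_summary(sample)
print(picked)
```

Every sentence in the result appears verbatim in the input, which is the defining property of extractive summarization.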

## How It Works:

1. **Initialize the pipeline**: Loads a pre-trained summarization model (default: a BART or T5-based model).
2. **Provide input text**: Feed in the document or passage you want summarized.
3. **Returns a summary**: The model generates a condensed version capturing the main points.

**Important Notes:**
- Most models work best with texts between 50 and 1,024 tokens (roughly 40 to 800 words).
- **The `max_length` parameter should be significantly shorter than your input** to create a true summary.
- Very short texts may not need summarization; very long texts may need to be chunked.
- The quality depends on how well the input text is structured.
- Used out of the box, these models often truncate the text instead of genuinely summarizing it. For that reason, the last half of the semester will focus on building our own summarization and question-answering systems.
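
Chunking can be as simple as splitting on words with a small overlap, so that content cut at a boundary still appears in the next chunk. The sketch below is one possible approach: it counts words rather than model tokens, and `chunk_words`, `chunk_size`, and `overlap` are hypothetical names with arbitrary placeholder values.

```python
def chunk_words(text, chunk_size=400, overlap=50):
    """Split text into word-based chunks that overlap by `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks

# Synthetic 1000-word document for demonstration
long_text = " ".join(f"word{i}" for i in range(1000))
chunks = chunk_words(long_text)
print(len(chunks), [len(c.split()) for c in chunks])
```

Each chunk could then be summarized separately and the partial summaries joined or summarized again.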

## Text Summarization in HuggingFace

In [51]:
# Import the pipeline function from the transformers library
from transformers import pipeline
# Initialize the summarization pipeline:
summarizer = pipeline(task="summarization", model="facebook/bart-large-cnn")

Device set to use cpu


## Example 1: Academic Paper Abstract

This example summarizes the conclusion of a research paper (the famous Transformer paper).

In [52]:
text = """
In this work, we presented the Transformer, the first sequence transduction model
based entirely on attention, replacing the recurrent layers most commonly used in
encoder-decoder architectures with multi-headed self-attention. For translation tasks,
the Transformer can be trained significantly faster than architectures based on
recurrent or convolutional layers. On both WMT 2014 English-to-German and WMT 2014
English-to-French translation tasks, we achieve a new state of the art. In the former
task our best model outperforms even all previously reported ensembles.
"""

# Generate a summary:
summary = summarizer(text, max_length = 50, min_length = 20)

# The output is a list with one dictionary containing 'summary_text'
print(f"Original length: {len(text.split())} words")
print(f"Summary length: {len(summary[0]['summary_text'].split())} words")
print(f"Summary: {summary[0]['summary_text']}")

Original length: 79 words
Summary length: 34 words
Summary: In this work, we presented the Transformer, the first sequence transduction model based entirely on attention. For translation tasks, the Transformer can be trained significantly faster than architectures based on recurrent or convolutional layers


## Example 2: News Article
Summarization works best with longer, well-structured text like news articles.

In [54]:
text = """
Scientists at the European Space Agency announced today a major breakthrough in
the search for habitable exoplanets. The James Webb Space Telescope has detected
potential biosignatures in the atmosphere of an Earth-sized planet located 120
light-years away in the constellation Lyra. The planet, designated K2-18b, orbits
within its star's habitable zone where liquid water could exist. Researchers found
evidence of dimethyl sulfide, a compound on Earth primarily produced by marine
phytoplankton. While not definitive proof of life, this marks the first time such
a biosignature has been detected on a potentially habitable world. The discovery
was made possible by Webb's advanced infrared capabilities, which can analyze the
chemical composition of distant atmospheres. The team plans further observations
over the next year to confirm these findings and rule out alternative explanations.
If confirmed, this would represent one of the most significant discoveries in the
history of astronomy and could reshape our understanding of life in the universe.
"""

summary = summarizer(text, max_length=60, min_length=30)

# The output is a list with one dictionary containing 'summary_text'
print(f"Original length: {len(text.split())} words")
print(f"Summary length: {len(summary[0]['summary_text'].split())} words")
print(f"Summary: {summary[0]['summary_text']}")

Original length: 156 words
Summary length: 41 words
Summary: Scientists at the European Space Agency announced today a major breakthrough in the search for habitable exoplanets. The James Webb Space Telescope has detected potential biosignatures in the atmosphere of an Earth-sized planet located 120 light-years away. The planet, designated K2-18b,
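
Both summaries above reuse the opening sentences of the input almost word for word. A rough way to quantify this is the share of summary n-grams that also occur in the source. The function below is an illustrative sketch, not a standard metric implementation; `copy_rate` is a hypothetical name, the trigram size is an arbitrary choice, and the sample sentences are made up.

```python
import re

def ngrams(words, n):
    """Return the set of n-word tuples found in a list of words."""
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def copy_rate(source, summary, n=3):
    """Fraction of the summary's distinct n-grams that appear verbatim in the source."""
    source_ngrams = ngrams(re.findall(r"\w+", source.lower()), n)
    summary_ngrams = ngrams(re.findall(r"\w+", summary.lower()), n)
    if not summary_ngrams:
        return 0.0
    return len(summary_ngrams & source_ngrams) / len(summary_ngrams)

source = "The James Webb Space Telescope has detected potential biosignatures on a distant planet."
copied = "The James Webb Space Telescope has detected potential biosignatures."
rewritten = "Webb may have found signs of life on a faraway world."

print(copy_rate(source, copied), copy_rate(source, rewritten))
```

A value near 1 suggests the model is mostly copying or truncating; a lower value suggests more rephrasing.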


# Part 3: Using the OpenAI API in Google Colab

In this notebook, we demonstrate how to use the OpenAI API within a Google Colab environment to generate text with the Responses API. The code does the following:

1. **Imports the required libraries:**
   - `openai`: Provides access to the OpenAI API.
   - `google.colab.userdata`: Helps retrieve user-specific data, such as API keys, in Colab.

2. **Initializes the OpenAI client:**
   - Retrieves the API key stored in the Colab environment.
   - Creates an instance of the OpenAI client using the API key.

3. **Creates a response:**
   - Sends a prompt asking for a one-sentence bedtime story about a unicorn.
   - Uses the model `"gpt-5-nano"` for this task (note that this is an example model name).

4. **Prints the generated response:**
   - The result, which is the story returned by the API, is printed to the output.

**Important:**  
- Make sure your API key is properly stored in the Colab environment.
- Verify that the model `"gpt-5-nano"` is available or adjust the model name as needed.
- I've made `gpt-5-nano`, `gpt-5-mini`, and `gpt-4o-mini` available with your API keys.
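
`google.colab.userdata` is only available inside Colab. If you also want to run this notebook elsewhere, one option is to fall back to an environment variable. The helper below is a hypothetical convenience function (`get_api_key` is not part of either library), shown as a sketch:

```python
import os

def get_api_key(name="OPENAI_API_KEY"):
    """Read a secret from Colab's user data, or from the environment outside Colab."""
    try:
        from google.colab import userdata  # available only in Colab
    except ImportError:
        key = os.environ.get(name)
    else:
        key = userdata.get(name)
    if not key:
        raise RuntimeError(f"{name} is not set; add it to Colab secrets or your environment.")
    return key
```

With this in place, the client could be created as `OpenAI(api_key=get_api_key())`.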


In [9]:
# Import the necessary libraries:
from openai import OpenAI  # OpenAI library to interact with the API
from google.colab import userdata  # Colab module to access stored user data (like API keys)

# Retrieve the API key from Colab's user data and initialize the OpenAI client:
client = OpenAI(api_key=userdata.get('OPENAI_API_KEY'))
# The userdata.get() function fetches the 'OPENAI_API_KEY' stored in your Colab environment.
# This key is used to authenticate your API requests.

# Create a response by sending a request to the OpenAI API:
response = client.responses.create(
    model="gpt-5-nano",  # Specify the model to use (ensure this model exists and is accessible)
    input="Write a one-sentence bedtime story about a unicorn."  # Define your prompt input
)

# Print the generated response:
print(response.output_text)

Under a silver moon, a gentle unicorn trotted through the sleepy meadow, whispering lullabies to the stars and tucking every dream into a soft bed of clouds.


# Part 4: Building LLM Chains with LangChain

In this section, we introduce **LangChain**, a powerful framework for building applications with Large Language Models (LLMs). LangChain allows you to create sophisticated workflows by chaining together prompts, models, and output processors.

## Key Concepts:

- **Prompt Templates**: Reusable templates with placeholders that can be filled with dynamic values.
- **Chains**: A sequence of operations that process data through multiple steps (prompt → model → parser).
- **Output Parsers**: Tools that format or structure the raw output from the LLM.
- **LCEL (LangChain Expression Language)**: The `|` (pipe) operator chains components together in a readable way.

## Why Use LangChain?

While you could directly call the OpenAI API (as we did earlier), LangChain provides:
1. **Modularity**: Easily swap components (different models, prompts, parsers)
2. **Reusability**: Create prompt templates once, use them many times with different inputs
3. **Composability**: Build complex workflows by chaining simple components
4. **Consistency**: Standard interfaces across different LLM providers

## How It Works:

1. **Define a Prompt Template**: Create a template with placeholders (variables) that will be filled in later.
2. **Initialize the LLM**: Set up your language model with desired parameters.
3. **Create a Chain**: Connect the prompt → model → parser using the `|` operator.
4. **Invoke the Chain**: Run the chain with specific values to generate output.
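
To see what the `|` operator is doing conceptually, here is a toy re-creation in plain Python. This is not LangChain's implementation: the `Step` class and the fake model are invented for illustration and only mimic the prompt → model → parser flow.

```python
class Step:
    """A tiny stand-in for a chain component: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # a | b builds a new step that runs a, then feeds its output to b
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)

template = "Write a short story about a {adjective} {noun}."

# Fills the placeholders in the template
prompt = Step(lambda variables: template.format(**variables))
# Pretend model: wraps the prompt in a message-like dictionary
fake_llm = Step(lambda text: {"content": f"[model reply to: {text}]"})
# Extracts the plain text from the message
parser = Step(lambda message: message["content"])

chain = prompt | fake_llm | parser
result = chain.invoke({"adjective": "mysterious", "noun": "wanderer"})
print(result)
```

Because every step exposes the same `invoke` interface, any one of them can be swapped out without touching the others, which is the modularity described above.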

In [10]:
# Install or upgrade the LangChain packages (only needed the first time you run this notebook)
!pip install -U langchain langchain-openai langchain_community



In [None]:
# Import necessary modules from LangChain:
from langchain_core.prompts import PromptTemplate  # For creating prompt templates
from langchain_core.output_parsers import StrOutputParser  # Helps with the final output processing
from langchain_openai import ChatOpenAI  # OpenAI wrapper for the language model
from google.colab import userdata  # Colab module to access stored API keys (also imported in Part 3)

# Define a prompt template with variables for dynamic text generation.
prompt_template = "Write a short story about a {adjective} {noun} who discovers a secret in an enchanted forest."
prompt = PromptTemplate(input_variables = ["adjective", "noun"],
                        template = prompt_template)  # The template string containing placeholders

# Initialize the OpenAI model (LLM). Ensure your OPENAI_API_KEY is set in your environment.
llm = ChatOpenAI(
    model = "gpt-5-mini",
    temperature = 0.7, # Optional sampling temperature (higher values give more varied output)
    api_key = userdata.get('OPENAI_API_KEY')) # userdata.get('OPENAI_API_KEY') retrieves your API key from Colab's user data storage.

# Create a "chain" to combine the prompt template with the language model.
chain = prompt | llm | StrOutputParser()

# Run the chain with specific values for the placeholders:
result = chain.invoke({"adjective" : "mysterious", "noun" : "wanderer"})

# Print the generated story:
print("Generated Story:\n", result)