# Deploying AI
## Assignment 1: Evaluating Summaries

A key application of LLMs is to summarize documents. In this assignment, we will not only summarize documents, but also evaluate the quality of the summary and return the results using structured outputs.

**Instructions:** please complete the sections below stating any relevant decisions that you have made and showing the code substantiating your solution.

## Select a Document

Please select one out of the following articles:

+ [Managing Oneself, by Peter Drucker](https://www.thecompleteleader.org/sites/default/files/imce/Managing%20Oneself_Drucker_HBR.pdf) (PDF)
+ [The GenAI Divide: State of AI in Business 2025](https://www.artificialintelligence-news.com/wp-content/uploads/2025/08/ai_report_2025.pdf) (PDF)
+ [What is Noise?, by Alex Ross](https://www.newyorker.com/magazine/2024/04/22/what-is-noise) (Web)


For this assignment, I chose to work with the web article **'What is Noise?'** by music critic Alex Ross. This piece examines the concept of noise in music and culture. Understanding the distinction between signal and noise is fundamental for AI professionals because it parallels the challenge of distinguishing useful information from irrelevant data in machine-learning systems. By summarizing this article, we practice condensing nuanced content while reflecting on how notions of noise relate to data science, signal processing, and AI ethics.

# Load Secrets

In [55]:
%reload_ext dotenv
%dotenv ../05_src/.secrets

In [56]:
%load_ext dotenv
%dotenv ../../05_src/.env
%dotenv ../../05_src/.secrets

The dotenv extension is already loaded. To reload it, use:
  %reload_ext dotenv
cannot find .env file
cannot find .env file


In [58]:
from dotenv import load_dotenv
load_dotenv('../../05_src/.secrets')

False
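The output above shows that the files were not found at the relative paths tried, and `load_dotenv` returned `False`. A minimal sketch, using only the standard library, of checking which candidate path actually exists before loading it; the helper name `first_existing` and the list of candidates are assumptions for illustration, not part of the course material:

```python
from pathlib import Path
from typing import Iterable, Optional


def first_existing(candidates: Iterable[str], base: Optional[Path] = None) -> Optional[Path]:
    """Return the first candidate path that exists as a file, or None."""
    base = base or Path.cwd()
    for candidate in candidates:
        path = (base / candidate).resolve()
        if path.is_file():
            return path
    return None


# Hypothetical candidates: the relative locations tried in the cells above.
secrets_path = first_existing(["../05_src/.secrets", "../../05_src/.secrets"])
if secrets_path is None:
    print("No secrets file found; check the notebook's working directory.")
else:
    print(f"Found secrets file at {secrets_path}")
```

If a path is found, it could then be passed to `load_dotenv(secrets_path)`, which should return `True` once at least one variable has been set.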

In [59]:
import sys
!{sys.executable} -m pip install -U langchain-community langchain-openai --break-system-packages



[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.3.1[0m[39;49m -> [0m[32;49m26.0.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpython3.11 -m pip install --upgrade pip[0m


## Load Document

Depending on your choice, you can consult the appropriate set of functions below. Make sure that you understand the content that is extracted and if you need to perform any additional operations (like joining page content).

### PDF

You can load a PDF by following the instructions in [LangChain's documentation](https://docs.langchain.com/oss/python/langchain/knowledge-base#loading-documents). Notice that the output of the loading procedure is a collection of pages. You can join the pages by using the code below.

```python
document_text = ""
for page in docs:
    document_text += page.page_content + "\n"
```
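An equivalent way to join the pages is `str.join`, which avoids repeated string concatenation. The sketch below is only illustrative: the `Page` class is a stand-in for the loader's page objects (only the `page_content` attribute is assumed), and unlike the loop above it skips empty pages and adds no trailing newline.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Page:
    """Stand-in for a loaded document page; only `page_content` is used."""
    page_content: str


def join_pages(pages: Iterable[Page], separator: str = "\n") -> str:
    """Concatenate the text of every page, skipping pages with no text."""
    return separator.join(
        p.page_content.strip() for p in pages if p.page_content.strip()
    )


# Hypothetical pages, for illustration only.
docs = [Page("First page text. "), Page(""), Page("Second page text.")]
document_text = join_pages(docs)
print(document_text)
```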

### Web

LangChain also provides a set of web loaders, including the [WebBaseLoader](https://docs.langchain.com/oss/python/integrations/document_loaders/web_base). You can use this function to load web pages.

In [10]:
from langchain_community.document_loaders import WebBaseLoader


USER_AGENT environment variable not set, consider setting it to identify your requests.


In [63]:
from langchain_community.document_loaders import WebBaseLoader

url = "https://www.newyorker.com/magazine/2024/04/22/what-is-noise"
loader = WebBaseLoader(url)
docs = loader.load()


# Join the individual page documents into a single context string
document_text = ""
for page in docs:
    document_text += page.page_content + "\n"


# Display a short preview of the context
print(document_text[:1000])
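Text scraped from a web page often carries navigation residue and long runs of blank lines, which add input tokens without adding content. A minimal sketch of one possible clean-up step before the text is sent to the model; the helper name `normalize_whitespace` is hypothetical and the rules are a judgment call, not something the loader requires:

```python
import re


def normalize_whitespace(text: str) -> str:
    """Collapse runs of spaces/tabs and limit consecutive blank lines to one."""
    text = text.replace("\r\n", "\n")
    text = re.sub(r"[ \t]+", " ", text)     # collapse horizontal whitespace
    text = re.sub(r" ?\n ?", "\n", text)    # trim spaces around newlines
    text = re.sub(r"\n{3,}", "\n\n", text)  # at most one blank line in a row
    return text.strip()


# Made-up sample that mimics scraped page text.
sample = "Title\n\n\n\n  Menu \t item\n\n\nBody   text.\n"
print(normalize_whitespace(sample))
```

Applied to the notebook, this would be `document_text = normalize_whitespace(document_text)` after the pages have been joined.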




## Generation Task

Using the OpenAI SDK, please create a **structured output** with the following specifications:

+ Use a model that is NOT in the GPT-5 family.
+ Output should be a Pydantic BaseModel object. The fields of the object should be:

    - Author
    - Title
    - Relevance: a statement, no longer than one paragraph, that explains why this article is relevant to an AI professional's professional development.
    - Summary: a concise and succinct summary no longer than 1000 tokens.
    - Tone: the tone used to produce the summary (see below).
    - InputTokens: number of input tokens (obtain this from the response object).
    - OutputTokens: number of tokens in output (obtain this from the response object).
       
+ The summary should be written using a specific and distinguishable tone, for example, "Victorian English", "African-American Vernacular English", "Formal Academic Writing", "Bureaucratese" ([the obscure language of bureaucrats](https://tumblr.austinkleon.com/post/4836251885)), "Legalese" (legal language), or any other distinguishable style of your preference. Make sure that the style is something you can identify.
+ In your implementation please make sure to use the following:

    - Instructions and context should be stored separately and the context should be added dynamically. Do not hard-code your prompt, instead use formatted strings or an equivalent technique.
    - Use the developer (instructions) prompt and the user prompt.


In [60]:

import os
os.getenv("API_GATEWAY_KEY")

'<redacted>'

In [64]:
# --- Generation Task (OpenAI SDK + Pydantic structured output) ---

import os, json
from pydantic import BaseModel, Field, ValidationError
from openai import OpenAI

# 1) OpenAI client (same pattern as the Lab: API Gateway + x-api-key header)
client = OpenAI(
    base_url="https://k7uffyg03f.execute-api.us-east-1.amazonaws.com/prod/openai/v1",
    api_key="any value",
    default_headers={"x-api-key": os.getenv("API_GATEWAY_KEY")}
)

# 2) Pydantic model for structured output
class ArticleSummary(BaseModel):
    Author: str = Field(..., description="Author of the article")
    Title: str = Field(..., description="Title of the article")
    Relevance: str = Field(..., description="<= 1 paragraph: why relevant to an AI professional")
    Summary: str = Field(..., description="Concise summary <= 1000 tokens")
    Tone: str = Field(..., description="The chosen writing tone/style")
    InputTokens: int = Field(..., description="Number of input tokens from response.usage.input_tokens")
    OutputTokens: int = Field(..., description="Number of output tokens from response.usage.output_tokens")

# 3) Keep instructions (developer prompt) separate from context (user prompt)
developer_instructions = """
You are a meticulous technical editor.
You must return ONLY valid JSON (no markdown, no code fences, no commentary).
The JSON must match exactly this schema:

{
  "Author": string,
  "Title": string,
  "Relevance": string,   // <= 1 paragraph
  "Summary": string,     // concise, <= 1000 tokens
  "Tone": string
}

Rules:
- Do not invent citations or quotes.
- Do not include extra keys.
- Ensure the tone is clearly distinguishable and consistent.
"""

# NOTE: context is dynamically injected (do not hard-code it).
# `document_text` must already hold the joined article text from the loading cell above.
tone = "Victorian English"
user_prompt_template = """
TASK:
Summarize the article below in the tone: "{tone}".

OUTPUT:
Return ONLY JSON matching the required schema (Author, Title, Relevance, Summary, Tone).

ARTICLE TEXT:
{context}
"""

# Inject context dynamically
user_prompt = user_prompt_template.format(tone=tone, context=document_text)

# 4) Call Responses API with a NON-GPT-5 model
# (Examples: "gpt-4o-mini", "gpt-4.1", "gpt-4o"; pick what your course allows)
response = client.responses.create(
    model="gpt-4o-mini",
    input=[
        {"role": "developer", "content": developer_instructions},
        {"role": "user", "content": user_prompt},
    ],
)

# 5) Parse model output JSON, then attach token usage from response.usage
raw_text = response.output_text  # aggregated text output convenience property
data = json.loads(raw_text)

# Add token usage fields from the response object (required by assignment)
data["InputTokens"] = response.usage.input_tokens
data["OutputTokens"] = response.usage.output_tokens

# 6) Validate + create a Pydantic object
try:
    structured_summary = ArticleSummary(**data)
except ValidationError as e:
    print("Validation error. Model output didn't match schema.")
    print(e)
    raise

structured_summary

PermissionDeniedError: Error code: 403 - {'message': 'Forbidden'}
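The `403` above comes from the gateway rejecting the request, so the parsing step in the cell was never reached. Once a response is available, calling `json.loads` directly on the model's text can still fail if the model wraps its answer in a code fence despite the instructions. A minimal sketch of a more tolerant parsing step, using only the standard library; `extract_json` and `attach_usage` are hypothetical helper names, and the sample output and token counts are made up for illustration:

```python
import json
import re

FENCE = "`" * 3  # built programmatically so the example stays readable in Markdown
FENCED_JSON = re.compile(rf"^{FENCE}(?:json)?\s*(.*?)\s*{FENCE}$", re.DOTALL)


def extract_json(raw_text: str) -> dict:
    """Parse a JSON object from model text, tolerating a surrounding code fence."""
    text = raw_text.strip()
    fenced = FENCED_JSON.match(text)
    if fenced:
        text = fenced.group(1)
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object at the top level")
    return data


def attach_usage(data: dict, input_tokens: int, output_tokens: int) -> dict:
    """Return a copy of `data` with token counts under the assignment's field names."""
    return {**data, "InputTokens": input_tokens, "OutputTokens": output_tokens}


# Hypothetical model output and token counts, for illustration only.
raw = (
    FENCE
    + 'json\n{"Author": "A", "Title": "T", "Relevance": "R", '
    + '"Summary": "S", "Tone": "Victorian English"}\n'
    + FENCE
)
record = attach_usage(extract_json(raw), input_tokens=1200, output_tokens=300)
print(record)
```

In the notebook, the resulting dictionary could be passed to `ArticleSummary(**record)` for validation, with the token counts taken from `response.usage` rather than typed in by hand.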

# Evaluate the Summary

Use the DeepEval library to evaluate the **summary** as follows:

+ Summarization Metric:

    - Use the [Summarization metric](https://deepeval.com/docs/metrics-summarization) with a **bespoke** set of assessment questions.
    - Please use, at least, five assessment questions.

+ G-Eval metrics:

    - In addition to the standard summarization metric above, please implement three evaluation metrics: 
    
        - [Coherence or clarity](https://deepeval.com/docs/metrics-llm-evals#coherence)
        - [Tonality](https://deepeval.com/docs/metrics-llm-evals#tonality)
        - [Safety](https://deepeval.com/docs/metrics-llm-evals#safety)

    - For each one of the metrics above, implement five assessment questions.

+ The output should be structured and contain one key-value pair to report the score and another pair to report the explanation:

    - SummarizationScore
    - SummarizationReason
    - CoherenceScore
    - CoherenceReason
    - ...
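One way to produce the key-value pairs listed above is to flatten the measured metrics into a single dictionary. The sketch below assumes each metric object exposes `score` and `reason` attributes once it has been measured, as DeepEval metrics do after `measure(...)` is called; the `SimpleNamespace` objects are stand-ins, and their scores and reasons are made up for illustration.

```python
from types import SimpleNamespace
from typing import Any, Mapping


def collect_results(metrics: Mapping[str, Any]) -> dict:
    """Flatten measured metrics into `<Name>Score` / `<Name>Reason` pairs."""
    results = {}
    for name, metric in metrics.items():
        results[f"{name}Score"] = metric.score
        results[f"{name}Reason"] = metric.reason
    return results


# Stand-ins for already-measured metrics; values are illustrative only.
measured = {
    "Summarization": SimpleNamespace(score=0.8, reason="Covers the main points."),
    "Coherence": SimpleNamespace(score=0.9, reason="Ideas follow a clear order."),
}
evaluation = collect_results(measured)
print(evaluation)
```

Keeping the metric name as the dictionary key means adding Tonality and Safety only requires two more entries in `measured`.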

# Enhancement

Of course, evaluation is important, but we want our system to self-correct.  

+ Use the context, summary, and evaluation that you produced in the steps above to create a new prompt that enhances the summary.
+ Evaluate the new summary using the same function.
+ Report your results. Did you get a better output? Why? Do you think these controls are enough?

Please, do not forget to add your comments.
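A minimal sketch of how the context, the first summary, and the evaluation could be combined into a new prompt. The template wording, the function name `build_enhancement_prompt`, and the `<Name>Score` / `<Name>Reason` key convention are assumptions for illustration; the sample values are made up.

```python
ENHANCEMENT_TEMPLATE = """\
TASK:
Revise the summary below so that it addresses the evaluation feedback.
Keep the tone: "{tone}".

EVALUATION FEEDBACK:
{feedback}

CURRENT SUMMARY:
{summary}

ARTICLE TEXT:
{context}
"""


def build_enhancement_prompt(context: str, summary: str, evaluation: dict, tone: str) -> str:
    """Fill the template, pairing each `<Name>Score` with its `<Name>Reason`."""
    lines = []
    for key, score in evaluation.items():
        if not key.endswith("Score"):
            continue
        name = key[: -len("Score")]
        reason = evaluation.get(f"{name}Reason", "no explanation given")
        lines.append(f"- {name} (score {score}): {reason}")
    return ENHANCEMENT_TEMPLATE.format(
        tone=tone, feedback="\n".join(lines), summary=summary, context=context
    )


# Hypothetical inputs, for illustration only.
prompt = build_enhancement_prompt(
    context="ARTICLE",
    summary="FIRST DRAFT",
    evaluation={"CoherenceScore": 0.5, "CoherenceReason": "Too terse."},
    tone="Victorian English",
)
print(prompt)
```

The instructions stay in the developer prompt as before; this string would replace only the user prompt for the second call.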


# Submission Information

🚨 **Please review our [Assignment Submission Guide](https://github.com/UofT-DSI/onboarding/blob/main/onboarding_documents/submissions.md)** 🚨 for detailed instructions on how to format, branch, and submit your work. Following these guidelines is crucial for your submissions to be evaluated correctly.

## Submission Parameters

- The Submission Due Date is indicated in the [readme](../README.md#schedule) file.
- The branch name for your repo should be: assignment-1
- What to submit for this assignment:
    + This Jupyter Notebook (assignment_1.ipynb) should be populated and should be the only change in your pull request.
- What the pull request link should look like for this assignment: `https://github.com/<your_github_username>/production/pull/<pr_id>`
    + Open a private window in your browser. Copy and paste the link to your pull request into the address bar. Make sure you can see your pull request properly. This helps the technical facilitator and learning support staff review your submission easily.

## Checklist

+ Created a branch with the correct naming convention.
+ Ensured that the repository is public.
+ Reviewed the PR description guidelines and adhered to them.
+ Verified that the link is accessible in a private browser window.

If you encounter any difficulties or have questions, please don't hesitate to reach out to our team via our Slack. Our Technical Facilitators and Learning Support staff are here to help you navigate any challenges.
