# Welcome to the Day 2 Lab!


<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/resources.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#f71;">Just before we get started --</h2>
            <span style="color:#f71;">I thought I'd take a second to point you at this page of useful resources for the course. This includes links to all the slides.<br/>
            <a href="https://edwarddonner.com/2024/11/13/llm-engineering-resources/">https://edwarddonner.com/2024/11/13/llm-engineering-resources/</a><br/>
            Please keep this bookmarked, and I'll continue to add more useful links there over time.
            </span>
        </td>
    </tr>
</table>

## First - let's talk about the Chat Completions API

1. The simplest way to call an LLM
2. It's called Chat Completions because it's saying: "here is a conversation, please predict what should come next"
3. The Chat Completions API was invented by OpenAI, but it's so popular that everybody uses it!
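
To make point 2 concrete, here is a minimal sketch of what "a conversation" looks like in this API: a list of messages, each with a `role` and a `content`. The `add_turn` helper and the example replies are made up for illustration; only the role/content message shape comes from the API itself.

```python
# A conversation is just a list of messages, each with a role and some content.
conversation = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Tell me a fun fact"},
]


def add_turn(messages, role, content):
    """Return a new conversation with one more message appended (hypothetical helper)."""
    return messages + [{"role": role, "content": content}]


# The model's reply is stored as an "assistant" message,
# and the next user message simply follows it in the list.
conversation = add_turn(conversation, "assistant", "Wombat poop is cube-shaped.")
conversation = add_turn(conversation, "user", "Tell me another one")

for message in conversation:
    print(f"{message['role']:>9}: {message['content']}")
```

Each call to the API sends the whole list, and the model predicts the next assistant message.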

### We will start by calling OpenAI again - but don't worry non-OpenAI people, your time is coming!


In [1]:
import os
from dotenv import load_dotenv

load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

if not api_key:
    print("No API key was found - please head over to the troubleshooting notebook in this folder to identify & fix!")
elif not api_key.startswith("sk-proj-"):
    print("An API key was found, but it doesn't start sk-proj-; please check you're using the right key - see troubleshooting notebook")
else:
    print("API key found and looks good so far!")


API key found and looks good so far!


## Do you know what an Endpoint is?

If not, please review the Technical Foundations guide in the guides folder

And, here is an endpoint that might interest you...

In [2]:
import requests

headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}

payload = {
    "model": "gpt-5-nano",
    "messages": [
        {"role": "user", "content": "Tell me a fun fact"}]
}

payload

{'model': 'gpt-5-nano',
 'messages': [{'role': 'user', 'content': 'Tell me a fun fact'}]}

In [3]:
response = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers=headers,
    json=payload
)

response.json()

{'id': 'chatcmpl-CSqtD3K8X76drCo8zSOidRyJ1qZGu',
 'object': 'chat.completion',
 'created': 1760992527,
 'model': 'gpt-5-nano-2025-08-07',
 'choices': [{'index': 0,
   'message': {'role': 'assistant',
    'content': 'Fun fact: Wombat poop is cube-shaped. The square pellets help it stack up without rolling away, which makes it easier for wombats to mark their territory. Want another fun fact?',
    'refusal': None,
    'annotations': []},
   'finish_reason': 'stop'}],
 'usage': {'prompt_tokens': 11,
  'completion_tokens': 688,
  'total_tokens': 699,
  'prompt_tokens_details': {'cached_tokens': 0, 'audio_tokens': 0},
  'completion_tokens_details': {'reasoning_tokens': 640,
   'audio_tokens': 0,
   'accepted_prediction_tokens': 0,
   'rejected_prediction_tokens': 0}},
 'service_tier': 'default',
 'system_fingerprint': None}
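
The `usage` block in that response is worth a second look. As a rough sketch (using only the numbers printed above), you can work out how many of the completion tokens were hidden reasoning and how many made it into the visible answer:

```python
# Numbers copied from the response shown above.
usage = {
    "prompt_tokens": 11,
    "completion_tokens": 688,
    "total_tokens": 699,
    "completion_tokens_details": {"reasoning_tokens": 640},
}

reasoning_tokens = usage["completion_tokens_details"]["reasoning_tokens"]

# Reasoning tokens are counted inside completion_tokens,
# so the visible answer is whatever is left over.
visible_tokens = usage["completion_tokens"] - reasoning_tokens

print(visible_tokens)  # 48
print(usage["prompt_tokens"] + usage["completion_tokens"])  # 699, matches total_tokens
```

So most of the output tokens in this call went on reasoning rather than on the short fun fact itself.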

In [4]:
response.json()["choices"][0]["message"]["content"]

'Fun fact: Wombat poop is cube-shaped. The square pellets help it stack up without rolling away, which makes it easier for wombats to mark their territory. Want another fun fact?'

# What is the openai package?

It's known as a Python Client Library.

It's nothing more than a wrapper that makes this exact call to the HTTP endpoint.

It just allows you to work with nice Python code instead of messing around with janky JSON objects.

But that's it. It's open-source and lightweight. Some people think it contains OpenAI model code - it doesn't!
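
To illustrate the "wrapper" idea, here is a minimal sketch of the request a client library has to assemble for you. `build_chat_request` is a hypothetical helper written for this lab, not a function from the `openai` package, and the key is a placeholder.

```python
OPENAI_BASE_URL = "https://api.openai.com/v1"


def build_chat_request(api_key, model, messages, base_url=OPENAI_BASE_URL):
    """Assemble the URL, headers and JSON payload for a Chat Completions call."""
    url = base_url.rstrip("/") + "/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    payload = {"model": model, "messages": messages}
    return url, headers, payload


url, headers, payload = build_chat_request(
    api_key="sk-proj-placeholder",  # placeholder, not a real key
    model="gpt-5-nano",
    messages=[{"role": "user", "content": "Tell me a fun fact"}],
)

print(url)
print(payload)
```

Passing these three values to `requests.post(url, headers=headers, json=payload)` gives the same call as the cell above; the real library adds conveniences such as typed response objects on top.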


In [5]:
# Create OpenAI client

from openai import OpenAI
openai = OpenAI()

response = openai.chat.completions.create(model="gpt-5-nano", messages=[{"role": "user", "content": "Tell me a fun fact"}])

response.choices[0].message.content

'Fun fact: Bananas are technically berries, but strawberries aren’t. Botanically, a berry is a fruit from a single ovary with seeds inside the flesh, which fits bananas but not strawberries.'

## And then this great thing happened:

OpenAI's Chat Completions API was so popular that the other model providers created endpoints that are identical.

They are known as the "OpenAI Compatible Endpoints".

For example, Google made one here: https://generativelanguage.googleapis.com/v1beta/openai/

And OpenAI decided to be kind: they said, hey, you can just use the same client library that we made for GPT. We'll allow you to specify a different endpoint URL and a different key, to use another provider.

So you can use:

```python
gemini = OpenAI(base_url="https://generativelanguage.googleapis.com/v1beta/openai/", api_key="AIz....")
gemini.chat.completions.create(...)
```

And to be clear - even though OpenAI is in the code, we're only using this lightweight Python client library to call the endpoint - there's no OpenAI model involved here.
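
One way to picture what changes when you swap providers: only the base URL (and the key) differs, while the `/chat/completions` path stays the same. The sketch below is illustrative; `BASE_URLS` and `chat_completions_url` are names invented here, not part of any library.

```python
# Base URLs for two providers that expose the same Chat Completions API.
BASE_URLS = {
    "openai": "https://api.openai.com/v1",
    "gemini": "https://generativelanguage.googleapis.com/v1beta/openai/",
}


def chat_completions_url(provider):
    """Return the full endpoint that a chat completion request is posted to."""
    return BASE_URLS[provider].rstrip("/") + "/chat/completions"


for provider in BASE_URLS:
    print(f"{provider}: {chat_completions_url(provider)}")
```

The request body and the shape of the response are the same in both cases, which is why one client library can talk to either.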

If you're confused, please review Guide 9 in the Guides folder!

And now let's try it!

In [6]:
GEMINI_BASE_URL = "https://generativelanguage.googleapis.com/v1beta/openai/"

google_api_key = os.getenv("GOOGLE_API_KEY")

if not google_api_key:
    print("No API key was found - please head over to the troubleshooting notebook in this folder to identify & fix!")
elif not google_api_key.startswith("AIz"):
    print("An API key was found, but it doesn't start AIz")
else:
    print("API key found and looks good so far!")



API key found and looks good so far!


In [7]:
gemini = OpenAI(base_url=GEMINI_BASE_URL, api_key=google_api_key)

response = gemini.chat.completions.create(model="gemini-2.5-pro", messages=[{"role": "user", "content": "Tell me a fun fact"}])

response.choices[0].message.content

'Wombats are the only animals in the world that produce cube-shaped poop.\n\nThey stack their cubic droppings to mark their territory, and the flat sides prevent them from rolling away.'

## And Ollama also gives an OpenAI compatible endpoint

...and it's on your local machine!

If the next cell doesn't print "Ollama is running" then please open a terminal and run `ollama serve`
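
If you'd rather get a friendly message than a stack trace when Ollama isn't up, here is a small sketch of a check using only the standard library. The function names and the timeout are assumptions made for this example.

```python
import http.client
import urllib.request


def looks_like_ollama(body: bytes) -> bool:
    """True if the response body contains Ollama's status message."""
    return b"Ollama is running" in body


def is_ollama_running(url="http://localhost:11434", timeout=2.0) -> bool:
    """Return True if an Ollama server answers at `url`, False if nothing usable is there."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return looks_like_ollama(response.read())
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, DNS failure or a malformed reply.
        return False


if is_ollama_running():
    print("Ollama is running")
else:
    print("Ollama is not reachable - open a terminal and run: ollama serve")
```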

In [9]:
requests.get("http://localhost:11434").content

b'Ollama is running'

### Download llama3.2 from Meta

The cell below pulls `llama3.2:1b`, the smaller variant, which suits computers with less memory. Pull `llama3.2` instead if your machine can handle it.

Don't use llama3.3 or llama4! They are too big for your computer.

In [10]:
!ollama pull llama3.2:1b

pulling manifest
pulling 74701a8c35f6:   0% ▕                  ▏ 646 KB/1.3 GB

In [11]:
OLLAMA_BASE_URL = "http://localhost:11434/v1"

ollama = OpenAI(base_url=OLLAMA_BASE_URL, api_key='ollama')


In [12]:
# Get a fun fact

response = ollama.chat.completions.create(model="llama3.2:1b", messages=[{"role": "user", "content": "Tell me a fun fact"}])

response.choices[0].message.content

'Here\'s a fun fact: There is a type of jellyfish that is immortal. The Turritopsis dohrnii, also known as the "immortal jellyfish," is a species of jellyfish that can transform its body into a younger state through a process called transdifferentiation. This means that it can essentially revert back to its polyp stage, which is the juvenile form of a jellyfish, and then grow back into an adult again. This process can be repeated indefinitely, making the Turritopsis dohrnii theoretically immortal!'

In [13]:
# Now let's try deepseek-r1:1.5b - this is DeepSeek "distilled" into Qwen from Alibaba Cloud

!ollama pull deepseek-r1:1.5b

pulling manifest
pulling aabd4debf0c8:   0% ▕                  ▏ 4.5 MB/1.1 GB

In [14]:
response = ollama.chat.completions.create(model="deepseek-r1:1.5b", messages=[{"role": "user", "content": "Tell me a fun fact"}])

response.choices[0].message.content

'The fun fact about elephants is that they are known to live for thousands of years. This is surprising unless you have encountered elephants or learn about them through documentaries and books.\n\n**Answer:** Elephants can live for hundreds of thousands of years before passing away, making this an intriguing piece of animal knowledge.'

# HOMEWORK EXERCISE ASSIGNMENT

Upgrade the Day 1 webpage summarization project to use an open-source model running locally via Ollama rather than OpenAI.

You'll be able to use this technique for all subsequent projects if you'd prefer not to use paid APIs.

**Benefits:**
1. No API charges - the models are open-source
2. Your data doesn't leave your box

**Disadvantages:**
1. Significantly less powerful than a frontier model

## Recap on installation of Ollama

Simply visit [ollama.com](https://ollama.com) and install!

Once complete, the Ollama server should already be running locally.  
If you visit:  
[http://localhost:11434/](http://localhost:11434/)

You should see the message `Ollama is running`.  

If not, bring up a new Terminal (Mac) or PowerShell (Windows) and enter `ollama serve`  
And in another Terminal (Mac) or PowerShell (Windows), enter `ollama pull llama3.2`  
Then try [http://localhost:11434/](http://localhost:11434/) again.

If Ollama is slow on your machine, try using `llama3.2:1b` as an alternative. Run `ollama pull llama3.2:1b` from a Terminal or PowerShell, and change the code from `MODEL = "llama3.2"` to `MODEL = "llama3.2:1b"`
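
As a starting point for the homework, here is one possible shape for the summarization call, sketched with only the standard library. The prompt wording, the function names and the `max_chars` cut-off are assumptions for illustration, not the course's reference solution, and `summarize_with_ollama` needs a local Ollama server with the model already pulled.

```python
import json
import urllib.request

OLLAMA_BASE_URL = "http://localhost:11434/v1"
MODEL = "llama3.2"


def build_summary_messages(page_text, max_chars=4000):
    """Build the conversation that asks the model to summarize a page."""
    return [
        {"role": "system", "content": "You summarize website text in short markdown."},
        {"role": "user", "content": "Summarize this website:\n\n" + page_text[:max_chars]},
    ]


def summarize_with_ollama(page_text, model=MODEL):
    """POST to Ollama's OpenAI-compatible endpoint and return the reply text."""
    payload = {"model": model, "messages": build_summary_messages(page_text)}
    request = urllib.request.Request(
        OLLAMA_BASE_URL + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Building the messages needs no server, so you can inspect them first.
messages = build_summary_messages("Example page text. " * 3)
print(messages[1]["content"])
```

Once Ollama is running, `summarize_with_ollama(text)` returns the model's summary as a string. Truncating the page text keeps small local models from being overwhelmed by long pages.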

In [4]:
# Test the enhanced scraper with summarization using DeepSeek-R1:1.5b

from scraper import summarize_website
from IPython.display import Markdown, display

# Simple one-liner to fetch and summarize!
summary = summarize_website("https://edwarddonner.com")
display(Markdown(summary))

✓ Fetched 1526 characters using requests


**Landing Page Navigation Structure**

The landing page hosted at `edwarddonner.com` features a standard navigation menu, including sections for Home, Connect Four, Outsmart, an AI arena, and Agentic AI Engineering Courses. Each section likely offers unique content related to different tech areas, such as game development, innovation, and AI. The header presents bios (Ed from Nebula.io) and contact information, while quotes highlight their co-founder's leadership in the AI ecosystem. Timeline shows focus on AI projects in 2025. The newsletter subscription invites users to engage via email channels.

In [5]:
# Try with a different model (optional - if you have other models installed)
summary = summarize_website("https://edwarddonner.com", model="llama3.2:1b")
display(Markdown(summary))

✓ Fetched 1526 characters using requests


# Edward Donner's Website Summary

## Overview of the Website

*   Home: A platform that hosts various articles and posts from a personal blog by Ed. The website seems to be focused on AI, technology, and entrepreneurship.
*   About: Ed's profile and background as co-founder and CTO of Nebula.io, an AI startup.

## Upcoming News and Announcements

*   **September 15, 2025:** "AI in Production: Gen AI and Agentic AI on AWS at scale" - This seems to be a press release announcing the successful deployment of AI models on AWS.
*   **May 28, 2025:** Connecting your courses (possibly educational resources) and "Become an LLM expert and leader" - Ed is promoting online courses through their website.
*   **May 18, 2025:** "2025 AI Executive Briefing" - This may be a workshop or conference organized by Ed related to the field of AI.

In [8]:
# Test with a JavaScript-heavy site: even with use_selenium=False, the scraper falls back to Selenium when requests returns almost no text
summary = summarize_website("https://japyh.vercel.app/", use_selenium=False)
display(Markdown(summary))

⚠️ Minimal content (0 chars) - using Selenium to render JavaScript...
✓ Fetched 545 characters using Selenium


Sure! Here's your summary formatted in markdown:

---

**Welcome to Japyh | Your Guide to Modern Development**

At **Japyh**, we're diving into the world of **Programming, Data Science & AI**. From **Flights & Tilts** to **Math & Logic**, discover how Python and its libraries are transforming development. 🚀✨  

Inside our blog, we’re exploring the journey of 🧵🎨 **derya umut kuulari** — a passionate developer who’s embracing Python’s power with innovative projects! 💡✨  

# News  
- A heartwarming introduction to our journey as developer & architect in Python and TensorFlow.  
- Highlights from Japyh's landing page ready for your exploration.  

Let us know what you're building with Python and data science — we’d love to hear about it! 🚀

---

This summary focuses on relevant content and avoids navigation-related keywords while keeping it concise and informative.
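
The log lines at the top of that output suggest the scraper tries a plain request first and only renders with Selenium when it gets back almost no text. The `scraper` module isn't shown in this notebook, so the following is only a guess at that decision logic: `fetch_text`, the injected fetch functions and the `min_chars` threshold are all hypothetical.

```python
def fetch_text(url, fetch_static, fetch_rendered, min_chars=50):
    """Try the cheap static fetch first; fall back to a rendered fetch if it's nearly empty.

    `fetch_static` and `fetch_rendered` are callables taking a URL and returning text,
    for example a requests-based fetch and a Selenium-based fetch.
    """
    text = fetch_static(url)
    if len(text.strip()) >= min_chars:
        return text, "requests"
    # Too little content usually means the page is built by JavaScript.
    return fetch_rendered(url), "Selenium"


# Stand-in fetchers so the sketch runs without a network or a browser.
def empty_static_fetch(url):
    return ""


def fake_rendered_fetch(url):
    return "Rendered page content " * 5


text, method = fetch_text("https://example.com", empty_static_fetch, fake_rendered_fetch)
print(f"Fetched {len(text)} characters using {method}")
```

Passing the fetchers in as arguments keeps the fallback rule easy to test on its own, separate from the networking code.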

In [7]:
# Alternative: Using the class directly
from scraper import WebsiteWithSelenium

website = WebsiteWithSelenium("https://cnn.com")
summary = website.summarize()
display(Markdown(summary))

✓ Fetched 12523 characters using requests


**Breaking News: Latest News and Videos |CNN**

This website from *Cable News Network* (CNN) provides a comprehensive look at breaking news across various domains, including sports, health, weather, politics, international relations, and real-time updates. It features images of videos, photos, and curated content to keep you updated on the top stories.

**Key Sections:**

1. **Breaking News**
   - *National Safety Tips*  
     Learn about safety measures in your neighborhood, especially during critical times.
   - *Sports Coverage*  
     Watch highlights from legendary athletes like Johnnie Ball or Sammy Sosa while watching highlights from World杯决赛.
   
2. **International Relations**  
   - Real-time videos of U.N.-meetings on climate change — a significant event is being addressed by Mark Zuckerberg, who will discuss a summit with Climate Change Day leader Johnnie Ball this week.

3. **Sports News**  
   - Keep updated on thrilling Olympic performances and real-time highlights of basketball stars like LeBron James.

4. **World Events**
   - Stay on the pulse with real-time updates about World War II nostalgia or climate change concerns from late October onwards.

5. **Health & Wellness**  
   - Watch trending health issues, including a recent recall of vaccines because of adverse effects related to children's sleep apnea.

6. **Weather & Climate**  
   - Learn about current weather patterns and climate-related news, such as the U.S.-China trade deal focusing on trade relations with countries in both regions.

7. **Real-Time Videos**
   - The website features high-quality video clips from various sources, including sports events like the FIFA World Cup semis being live-streamed to watch.

**Note:** For comprehensive access and tips, like real-time videos are a great way to stay updated on hot topics, CNN often curates top news images. Fact-checking is included based on the content provided.

This structure helps readers quickly digest important updates while engaging with visual evidence of the press — such as video clips highlighting breaking news in areas like sports and weather.