# Welcome to your first assignment!

Instructions are below. Please give this a try, and look in the solutions folder if you get stuck (or feel free to ask me!)

# HOMEWORK EXERCISE ASSIGNMENT

Upgrade the day 1 webpage summarizer so that it uses an open-source model running locally via Ollama rather than OpenAI.

You'll be able to use this technique for all subsequent projects if you'd prefer not to use paid APIs.

**Benefits:**
1. No API charges - open-source
2. Data doesn't leave your box

**Disadvantages:**
1. Significantly less powerful than a Frontier Model


In [1]:
import requests
from bs4 import BeautifulSoup
from IPython.display import Markdown, display

In [2]:
# Constants

OLLAMA_API = "http://localhost:11434/api/chat"
HEADERS = {"Content-Type": "application/json"}
MODEL = "llama3.2"

In [3]:


messages = [
    {"role": "user", "content": "Describe some of the business applications of Generative AI"}
]

In [4]:
payload = {
    "model": MODEL,
    "messages": messages,
    "stream": False
}

In [5]:
response = requests.post(OLLAMA_API, json=payload, headers=HEADERS)
print(response.json()['message']['content'])

Generative AI has numerous business applications across various industries, including:

1. **Content Creation**: AI-powered tools can generate high-quality content such as articles, social media posts, product descriptions, and even entire books. This can help businesses save time and resources while maintaining consistency in their content.
2. **Marketing and Advertising**: Generative AI can create personalized ads, product recommendations, and marketing campaigns that are tailored to individual customers' preferences. It can also analyze customer data to predict buying behavior and optimize marketing strategies.
3. **Product Design and Development**: AI-powered generative design tools can create 2D and 3D designs for products such as furniture, electronics, and automotive components. This can help businesses reduce design time and costs while improving product quality.
4. **Customer Service and Support**: Chatbots powered by Generative AI can provide personalized customer support, an
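The call above assumes the request succeeded. A minimal sketch of a more defensive version follows. `extract_content` is a hypothetical helper name, and the assumption that Ollama reports failures (for example, a model that has not been pulled yet) as a JSON object with an `error` key should be checked against your Ollama version.

```python
def extract_content(data):
    """Return the assistant's reply from a parsed /api/chat response.

    Assumption: failures arrive as a JSON object of the form {"error": "..."}.
    """
    if "error" in data:
        raise RuntimeError(f"Ollama returned an error: {data['error']}")
    return data["message"]["content"]


# Usage with the constants defined above (needs a running Ollama server):
# response = requests.post(OLLAMA_API, json=payload, headers=HEADERS)
# print(extract_content(response.json()))
```

This way a missing model produces a readable message rather than a `KeyError` on `'message'`.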

# Introducing the ollama package

And now we'll do the same thing, but using the elegant ollama Python package instead of a direct HTTP call.

Under the hood, it's making the same call as above to the Ollama server running at localhost:11434

In [6]:
import ollama

response = ollama.chat(model=MODEL, messages=messages)
print(response['message']['content'])

Generative AI has numerous business applications across various industries, including:

1. **Content Generation**: AI can generate high-quality content such as articles, social media posts, product descriptions, and even entire books. This helps reduce content creation costs and increases productivity.
2. **Image and Video Creation**: Generative AI can create realistic images and videos for advertising, marketing, and entertainment purposes. This technology is being used to generate custom product visuals, avatars, and even synthetic celebrities.
3. **Chatbots and Virtual Assistants**: AI-powered chatbots can help businesses provide 24/7 customer support, answer frequently asked questions, and offer personalized recommendations.
4. **Product Design and Development**: Generative AI can assist in the design of products such as cars, planes, and buildings by generating concept designs, optimizing structures, and reducing material usage.
5. **Music and Audio Creation**: AI can generate mus

## Alternative approach - using OpenAI python library to connect to Ollama

In [7]:
# There's actually an alternative approach that some people might prefer
# You can use the OpenAI client python library to call Ollama:

from openai import OpenAI
ollama_via_openai = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')

response = ollama_via_openai.chat.completions.create(
    model=MODEL,
    messages=messages
)

print(response.choices[0].message.content)

Generative Artificial Intelligence (AI) has numerous business applications across various industries. Here are some examples:

1. **Content Generation**: Generative AI can create high-quality content, such as articles, social media posts, and product descriptions, at scale and speed. This reduces the reliance on human writers and allows businesses to produce more content without increasing costs.
2. **Product Design**: Generative AI algorithms can design products quickly and efficiently, using computer-aided design (CAD) software and other tools. This enables businesses to develop new products or modify existing ones without significant investment in hardware and infrastructure.
3. **Marketing Campaigns**: Generative AI can analyze customer data, identify trends, and create personalized marketing campaigns. It can also generate ads, social media posts, and promotional materials, making it easier for businesses to tailor their marketing efforts.
4. **Image and Video Generation**: Genera

## Are you confused about why that works?

It seems strange, right? We just used OpenAI code to call Ollama?? What's going on?!

Here's the scoop:

The Python class `OpenAI` is simply code written by OpenAI engineers that makes calls over the internet to an endpoint.

When you call `openai.chat.completions.create()`, this Python code just makes a web request to the following URL: "https://api.openai.com/v1/chat/completions"

Code like this is known as a "client library" - it's just wrapper code that runs on your machine to make web requests. The actual power of GPT is running on OpenAI's cloud behind this API, not on your computer!

OpenAI's API became so popular that lots of other AI providers built compatible web endpoints, so you could use the same approach with them.

So Ollama has an endpoint running on your local box at http://localhost:11434/v1/chat/completions  
And in week 2 we'll discover that lots of other providers do this too, including Gemini and DeepSeek.

And then the team at OpenAI had a great idea: they could extend their client library so you can specify a different 'base url', and use their library to call any compatible API.

That's it!

So when you say: `ollama_via_openai = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')`  
Then this will make the same endpoint calls, but to Ollama instead of OpenAI.
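To make this concrete, here is a rough sketch of the kind of request the client library sends on your behalf. The helper names `build_chat_request` and `send_chat_request` are hypothetical, and the real library does more (retries, extra headers, response objects), so treat this as an illustration of the idea, not the library's implementation. It assumes Ollama ignores the value of the API key, which is why a placeholder like `'ollama'` works.

```python
def build_chat_request(base_url, api_key, model, messages):
    """Assemble the URL, headers and JSON body for a chat completion call."""
    url = base_url.rstrip("/") + "/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",  # placeholder key when talking to Ollama
        "Content-Type": "application/json",
    }
    body = {"model": model, "messages": messages}
    return url, headers, body


def send_chat_request(url, headers, body):
    """POST the request and return the reply text (needs a running server)."""
    import requests  # imported here so build_chat_request works without it

    response = requests.post(url, json=body, headers=headers)
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]


request_url, request_headers, request_body = build_chat_request(
    base_url="http://localhost:11434/v1",
    api_key="ollama",
    model="llama3.2",
    messages=[{"role": "user", "content": "Hello"}],
)
print(request_url)  # http://localhost:11434/v1/chat/completions
```

Swapping `base_url` for `https://api.openai.com/v1` (with a real key) would produce the OpenAI URL quoted earlier, which is the whole trick.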

## Also trying the amazing reasoning model DeepSeek

Here we use the version of DeepSeek-reasoner that's been distilled to 1.5B.  
This is actually a 1.5B variant of Qwen that has been fine-tuned using synthetic data generated by DeepSeek R1.

Other sizes of DeepSeek are listed [here](https://ollama.com/library/deepseek-r1), all the way up to the full 671B parameter version, which would use up 404GB of your drive and is far too large for most machines!
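As a rough sanity check on those numbers, dividing the download size by the parameter count gives the average storage per parameter. The sketch below assumes 404GB means 404 × 10⁹ bytes. The result, a little under 5 bits per parameter, suggests the weights are stored in a compressed (quantized) format rather than as 16-bit floats; that is an inference from the arithmetic, not something stated above.

```python
PARAMS = 671e9      # 671B parameters
SIZE_BYTES = 404e9  # 404GB, assuming decimal gigabytes

bytes_per_param = SIZE_BYTES / PARAMS
bits_per_param = bytes_per_param * 8
print(f"{bytes_per_param:.2f} bytes (~{bits_per_param:.1f} bits) per parameter")

# For comparison: the same parameter count at 16 bits (2 bytes) each
fp16_size_gb = PARAMS * 2 / 1e9
print(f"At 16 bits per parameter the download would be about {fp16_size_gb:.0f}GB")
```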

In [8]:
# This may take a few minutes to run! You should then see a fascinating "thinking" trace inside <think> tags, followed by some decent definitions

response = ollama_via_openai.chat.completions.create(
    model="deepseek-r1:1.5b",
    messages=[{"role": "user", "content": "Please give definitions of some core concepts behind LLMs: a neural network, attention and the transformer"}]
)

print(response.choices[0].message.content)

<think>
Okay, I need to understand a few key concepts about Large Language Models (LLMs) including neural networks, attention, and transformers. Let me start by breaking down each term step by step.

First, neural networks. From what I remember, neural networks are a subset of machine learning models inspired by the structure and function of the human brain's nervous system. They consist of layers of interconnected nodes or "neurons" that process information through connections between them. Each node has weights that adjust as part of the learning process. They're used for tasks like image recognition, speech processing, and many others.

Now, LLMs build on these neural networks. So a key point is their role in NLP (natural language processing). The models are trained on vast amounts of text data to learn patterns and contexts that allow them to predict or generate new languages.

Next up is the attention mechanism. I think this has something to do with how models focus on specific pa
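If you only want the final answer, the thinking trace can be separated from the rest of the reply. This is a minimal sketch: `split_thinking` is a hypothetical helper, and it assumes the reply contains at most one `<think>...</think>` block, as in the output above.

```python
import re


def split_thinking(text):
    """Split a reply into (thinking, answer).

    Assumes at most one <think>...</think> block in the text.
    """
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        # No trace present: everything is the answer
        return "", text.strip()
    thinking = match.group(1).strip()
    answer = (text[:match.start()] + text[match.end():]).strip()
    return thinking, answer


sample = "<think>\nLet me break this down.\n</think>\n\nA neural network is a layered model."
thinking, answer = split_thinking(sample)
print(answer)  # A neural network is a layered model.
```

With the cell above, you would call `split_thinking(response.choices[0].message.content)`.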

# NOW the exercise for you

Take the code from day1 and incorporate it here, to build a website summarizer that uses Llama 3.2 running locally instead of OpenAI; use either of the above approaches.

In [9]:
headers = {
 "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

class Website:

    def __init__(self, url):
        """
        Create this Website object from the given url using the BeautifulSoup library
        """
        self.url = url
        response = requests.get(url, headers=headers)
        soup = BeautifulSoup(response.content, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
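        # Some pages have no <body>; fall back to empty text instead of raising AttributeError
        if soup.body:
            # Remove elements that carry no readable text
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            self.text = ""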

system_prompt = "You are an assistant that analyzes the contents of a website \
and provides a short summary, ignoring text that might be navigation related. \
Respond in markdown."


def user_prompt_for(website):
    user_prompt = f"You are looking at a website titled {website.title}"
    user_prompt += "\nThe contents of this website are as follows; \
please provide a short summary of this website in markdown. \
If it includes news or announcements, then summarize these too.\n\n"
    user_prompt += website.text
    return user_prompt



def messages_for(website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_for(website)}
    ]

## Versions

In [10]:
OLLAMA_API = "http://localhost:11434/api/chat"
HEADERS = {"Content-Type": "application/json"}
MODEL = "llama3.2"

In [11]:
def summarize_1(url):
    website = Website(url)
    payload = {
        "model": MODEL,
        "messages": messages_for(website),
        "stream": False
    }

    response = requests.post(OLLAMA_API, json=payload, headers=HEADERS)
    return response.json()['message']['content']

In [12]:
summarize_1("https://edwarddonner.com")

'# Website Summary\n### Home - Edward Donner\n\nThe website appears to be a personal blog or portfolio of Edward Donner, the co-founder and CTO of Nebula.io. It covers his interests in writing code, experimenting with Large Language Models (LLMs), DJing, and amateur electronic music production.\n\n### News/Announcements\n\n*   **Courses:** Edward Donner is offering courses on LLMs and Agentic AI Engineering, including "Connecting my courses" and "The Complete Agentic AI Engineering Course".\n*   **Newsletters and Subscriptions:** A newsletter subscription form is available for visitors to receive updates from Edward.\n*   **Recent Posts:**\n    *   May 28, 2025 - Connecting my courses\n    *   May 18, 2025 - 2025 AI Executive Briefing\n    *   April 21, 2025 - The Complete Agentic AI Engineering Course\n    *   January 23, 2025 - LLM Workshop – Hands-on with Agents'

In [13]:
def summarize_2(url):
    website = Website(url)
    response = ollama.chat(model=MODEL, messages=messages_for(website))
    return response['message']['content']

In [14]:
summarize_2("https://edwarddonner.com")

'# Website Summary\n## Overview\n\nThe website is a personal blog written by Edward Donner, the co-founder and CTO of Nebula.io. It appears to be focused on writing, AI, and technology.\n\n### Content Highlights\n\n*   **LLM Arena**: An arena where Large Language Models (LLMs) compete against each other in diplomacy and strategy games.\n*   **Talent Acquisition**: Edward discusses his work at Nebula.io, applying AI to help people discover their potential and pursue their goals. The company uses proprietary LLMs for talent matching and has a patented matching model.\n\n### Recent News and Announcements\n\n*   **Course Announcement**: A new course titled "Connecting my courses" is announced.\n*   **2025 AI Executive Briefing**: An upcoming briefing on AI in 2025.\n*   **New Course Release**: Another course, "The Complete Agentic AI Engineering Course," is released.\n*   **LLM Workshop**: A hands-on workshop for learning about agents and LLMs.\n\n### Contact Information\n\nEdward Donner c

In [15]:
def summarize_3(url):
    website = Website(url)
    ollama_via_openai = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')
    response = ollama_via_openai.chat.completions.create(
        model=MODEL,
        messages=messages_for(website)
    )
    return response.choices[0].message.content

In [16]:
summarize_3("https://edwarddonner.com")

'# Website Summary\n\nThis website belongs to Edward Donner and appears to be a personal homepage that combines elements of programming, artificial intelligence (AI), and entrepreneurship.\n\n## What He Does\n\n* Co-founder and CTO of [Nebula.io](https://www.nebulai.com/), applying AI to help people discover their potential.\n* Previously founded and CEO of AI startup [untapt](https://untapt.artificial-intelligence), which was acquired in 2021.\n\n## Announcements and News\n\n* **Connecting my courses – become an LLM expert and leader** (May 28, 2025)\n* **2025 AI Executive Briefing** (April 21, 2025)\n* **The Complete Agentic AI Engineering Course** (January 23, 2025)\n* **LLM Workshop – Hands-on with Agents – resources** (date not specified)\n* **New courses and announcements will be posted as they become available.**'