# Welcome to your first assignment!

Instructions are below. Please give this a try, and look in the solutions folder if you get stuck (or feel free to ask me!)

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../resources.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#f71;">Just before we get to the assignment --</h2>
            <span style="color:#f71;">I thought I'd take a second to point you at this page of useful resources for the course. This includes links to all the slides.<br/>
            <a href="https://edwarddonner.com/2024/11/13/llm-engineering-resources/">https://edwarddonner.com/2024/11/13/llm-engineering-resources/</a><br/>
            Please keep this bookmarked, and I'll continue to add more useful links there over time.
            </span>
        </td>
    </tr>
</table>

# HOMEWORK EXERCISE ASSIGNMENT

Upgrade the day 1 webpage summarizer project so that it uses an open-source model running locally via Ollama rather than OpenAI.

You'll be able to use this technique for all subsequent projects if you'd prefer not to use paid APIs.

**Benefits:**
1. No API charges - open-source
2. Data doesn't leave your box

**Disadvantages:**
1. Significantly less powerful than a frontier model

## Recap on installation of Ollama

Simply visit [ollama.com](https://ollama.com) and install!

Once complete, the ollama server should already be running locally.  
If you visit:  
[http://localhost:11434/](http://localhost:11434/)

You should see the message `Ollama is running`.  

If not, bring up a new Terminal (Mac) or PowerShell (Windows) and enter `ollama serve`.  
Then, in another Terminal (Mac) or PowerShell (Windows), enter `ollama pull llama3.2`.  
Then try [http://localhost:11434/](http://localhost:11434/) again.
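
If you'd rather run the check from Python than from a browser, here is a minimal sketch. The helper name `ollama_is_running` and its `opener` parameter are made up for illustration; it only assumes the server answers at the URL above with the `Ollama is running` message.

```python
from urllib.request import urlopen


def ollama_is_running(base_url="http://localhost:11434/", timeout=2.0, opener=urlopen):
    """Return True if the Ollama server replies with its usual banner.

    `opener` is a hypothetical hook so the check can be exercised without
    a real server; by default it is urllib's urlopen.
    """
    try:
        with opener(base_url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
    except OSError:
        # Covers connection refused, timeouts and urllib's URLError
        return False
    return "Ollama is running" in body
```

Calling `ollama_is_running()` should return `True` once the server is up, and `False` if nothing is listening on that port.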

If Ollama is slow on your machine, try using `llama3.2:1b` as an alternative. Run `ollama pull llama3.2:1b` from a Terminal or PowerShell, and change the code below from `MODEL = "llama3.2"` to `MODEL = "llama3.2:1b"`.

In [1]:
# imports

import requests
from bs4 import BeautifulSoup
from IPython.display import Markdown, display

In [2]:
# Constants

OLLAMA_API = "http://localhost:11434/api/chat"
HEADERS = {"Content-Type": "application/json"}
MODEL = "llama3.2"

In [3]:
# Create a messages list using the same format that we used for OpenAI

messages = [
    {"role": "user", "content": "Describe some of the business applications of Generative AI"}
]

In [4]:
payload = {
        "model": MODEL,
        "messages": messages,
        "stream": False
    }
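
The payload above sets `"stream": False`, so the whole reply comes back as one JSON object. As a rough sketch of the alternative: with `"stream": True`, the `/api/chat` endpoint sends one JSON object per line, each carrying a piece of the message, and the last one has `"done": true`. The helper name `join_stream` is made up for illustration.

```python
import json


def join_stream(lines):
    """Concatenate the content pieces of a streamed /api/chat reply.

    `lines` is any iterable of JSON lines (str or bytes), one chunk per line.
    """
    parts = []
    for line in lines:
        if not line.strip():
            continue  # skip keep-alive blank lines
        chunk = json.loads(line)
        parts.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):
            break  # the final chunk marks the end of the reply
    return "".join(parts)
```

With `requests`, the lines could come from something like `requests.post(OLLAMA_API, json={**payload, "stream": True}, stream=True).iter_lines()`; treat that as an assumption to try rather than the way this lab does it.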

In [5]:
# Let's just make sure the model is loaded

!ollama pull llama3.2

pulling manifest
pulling dde5aa3fc5ff... 100% ▕████████████████▏ 2.0 GB
pulling 966de95ca8a6... 100% ▕████████████████▏ 1.4 KB
pulling fcc5a6bec9da... 100% ▕████████████████▏ 7.7 KB
pulling a70ff7e570d9... 100% ▕████████████████▏ 6.0 KB
pulling 56bb8bd477a5... 100% ▕████████████████▏   96 B
pullin

In [6]:
# If this doesn't work for any reason, try the 2 versions in the following cells
# And double check the instructions in the 'Recap on installation of Ollama' at the top of this lab
# And if none of that works - contact me!

response = requests.post(OLLAMA_API, json=payload, headers=HEADERS)
print(response.json()['message']['content'])

Generative AI has numerous business applications across various industries. Here are some examples:

1. **Content Generation**: Generative AI can be used to generate high-quality content such as articles, blog posts, social media posts, and product descriptions. This can help businesses save time and resources on content creation.
2. **Image and Video Generation**: Generative AI can be used to create custom images and videos for marketing materials, product launches, or branding purposes. For example, a company can use generative AI to create a unique image of their new product.
3. **Chatbots and Virtual Assistants**: Generative AI can be used to power chatbots and virtual assistants that can provide customer support, answer frequently asked questions, and offer personalized recommendations.
4. **Predictive Analytics**: Generative AI can be used to generate predictive models that can forecast sales, customer behavior, and market trends. This can help businesses make data-driven decisio
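
If the direct HTTP call fails instead (for example because the model hasn't been pulled), indexing `['message']['content']` raises a bare `KeyError`. A small sketch of a friendlier way to read the reply, assuming the server reports problems in an `error` field of the JSON; the helper name `extract_reply` is made up for illustration.

```python
def extract_reply(data):
    """Return the assistant's text from a non-streamed /api/chat reply.

    Assumes a failed request carries an "error" field rather than "message".
    """
    if "error" in data:
        raise RuntimeError(f"Ollama returned an error: {data['error']}")
    return data["message"]["content"]
```

You could then write `print(extract_reply(response.json()))` in place of the direct indexing.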

# Introducing the ollama package

And now we'll do the same thing, but using the elegant ollama python package instead of a direct HTTP call.

Under the hood, it makes the same call as above to the Ollama server running at localhost:11434.

In [7]:
import ollama

response = ollama.chat(model=MODEL, messages=messages)
print(response['message']['content'])

Generative AI has numerous business applications across various industries, including:

1. **Content Generation**: Generative AI can be used to generate high-quality content such as articles, social media posts, product descriptions, and more, at scale and speed.
2. **Product Design**: AI-generated 3D models and designs can be used for product development, prototyping, and manufacturing, reducing the need for manual design and improving product efficiency.
3. **Image and Video Editing**: Generative AI can be used to create realistic images and videos for advertising, marketing, and entertainment purposes, or for generating high-quality product images and videos.
4. **Chatbots and Virtual Assistants**: Generative AI-powered chatbots and virtual assistants can provide 24/7 customer support, answer frequently asked questions, and offer personalized recommendations.
5. **Recommendation Systems**: Generative AI-based recommendation systems can analyze user behavior and preferences to sugges

## Alternative approach - using OpenAI python library to connect to Ollama

In [8]:
# There's actually an alternative approach that some people might prefer
# You can use the OpenAI client python library to call Ollama:

from openai import OpenAI
ollama_via_openai = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')

response = ollama_via_openai.chat.completions.create(
    model=MODEL,
    messages=messages
)

print(response.choices[0].message.content)

Generative Artificial Intelligence (AI) has numerous business applications, revolutionizing various industries with its capabilities. Here are some significant examples:

1. **Content Creation**: Generative AI toolkits like DALL-E and Midjourney enable businesses to create high-quality content, such as:
	* Artistic images and videos
	* Customized product designs (e.g., logos, packaging)
	* Written text (e.g., social media posts, blog articles)
2. **Product Design and Development**: Generative AI can aid in:
	* Architecture design (e.g., homes, buildings)
	* Product conceptualization (e.g., appliances, electronics)
	* Fashion design (e.g., clothing, accessories)
3. **Marketing and Advertising**:
	* Personalized ad targeting using generative images or text
	* Automated ad copywriting and optimization
	* Social media content generation (e.g., social media posts, stories)
4. **Data Analysis and Insights**: Generative AI can help analyze complex data sets to identify patterns, trends, and i

## Also trying the amazing reasoning model DeepSeek

Here we use the version of DeepSeek-reasoner that's been distilled to 1.5B.  
This is actually a 1.5B variant of Qwen that has been fine-tuned using synthetic data generated by DeepSeek R1.

Other sizes of DeepSeek are [here](https://ollama.com/library/deepseek-r1) all the way up to the full 671B parameter version, which would use up 404GB of your drive and is far too large for most!

In [9]:
!ollama pull deepseek-r1:1.5b

pulling manifest

In [10]:
# This may take a few minutes to run! You should then see a fascinating "thinking" trace inside <think> tags, followed by some decent definitions

response = ollama_via_openai.chat.completions.create(
    model="deepseek-r1:1.5b",
    messages=[{"role": "user", "content": "Please give definitions of some core concepts behind LLMs: a neural network, attention and the transformer"}]
)

print(response.choices[0].message.content)

<think>
Okay, I'm trying to understand the core concepts behind language models (LLMs) specifically focusing on neural networks, attention, and the Transformer. Hmm, let me start by breaking this down step by step.

First, I know that a language model is essentially a system that can predict the next word in a sequence of text. Examples include things like ChatGPT or OpenAI's Stable Textmodels. These models are built on some underlying technology, so neural networks must be involved here.

Neural networks, if I recall correctly, are part of deep learning. They're composed of layers like input, hidden, and output layers. Each node in these layers processes information passed through them by applying weights and biases to the features they receive. The idea is that a complex pattern recognition or classification can be done with many nonlinear (and usually non-intuitive) units working together.

So, why would a neural network be used for something like predicting the next word? Well, per
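
Since the reply arrives as one string with the "thinking" trace wrapped in `<think>` tags, you may want to separate the trace from the final answer. Here is a minimal sketch, assuming at most one `<think>...</think>` block per reply; the helper name `split_thinking` is made up for illustration.

```python
import re

# Non-greedy match so only the first <think>...</think> block is captured
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(text):
    """Return (thinking, answer) for a reply that may contain a <think> block."""
    match = THINK_BLOCK.search(text)
    if match is None:
        return "", text.strip()
    thinking = match.group(1).strip()
    # Everything outside the block is treated as the answer
    answer = (text[:match.start()] + text[match.end():]).strip()
    return thinking, answer
```

For example, `thinking, answer = split_thinking(response.choices[0].message.content)` would let you print only `answer`.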

# NOW the exercise for you

Take the code from day1 and incorporate it here, to build a website summarizer that uses Llama 3.2 running locally instead of OpenAI; use either of the above approaches.
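
As a starting point for the direct HTTP approach, here is a sketch of how the summarizer's request body could be assembled in the same shape as the `payload` used earlier. The function name `build_summary_payload` and the `max_chars` cut-off are assumptions for illustration, not part of the day 1 code; the cut-off is just one guess at keeping long pages manageable for a small local model.

```python
def build_summary_payload(title, text, model="llama3.2", max_chars=8000):
    """Build a non-streamed /api/chat payload asking for a page summary.

    `max_chars` is a hypothetical limit on how much page text is sent.
    """
    system_prompt = (
        "You are an assistant that analyzes the contents of a website "
        "and provides a short summary, ignoring text that might be "
        "navigation related. Respond in markdown."
    )
    user_prompt = (
        f"You are looking at a website titled {title}.\n"
        "The contents of this website are as follows; "
        "please provide a short summary of this website in markdown.\n\n"
        + text[:max_chars]
    )
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        "stream": False,
    }
```

The result can be posted the same way as before, e.g. `requests.post(OLLAMA_API, json=build_summary_payload(title, text), headers=HEADERS)`.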

In [25]:
# A class to represent a Webpage
# If you're not familiar with Classes, check out the "Intermediate Python" notebook

# Some websites need you to use proper headers when fetching them:
headers = {
 "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

class Website:

    def __init__(self, url):
        """Fetch the page at `url` and keep its title and visible text."""
        self.url = url
        response = requests.get(url, headers=headers)
        soup = BeautifulSoup(response.content, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
        if soup.body:
            # Drop elements that carry no readable text
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            # Some responses have no <body>; fall back to empty text
            self.text = ""

system_prompt = "You are an assistant that analyzes the contents of a website \
and provides a short summary, ignoring text that might be navigation related. \
Respond in markdown. Please use English."

# Named user_prompt_for so the local string no longer shadows the function
def user_prompt_for(website):
    prompt = f"You are looking at a website titled {website.title}"
    prompt += "\nThe contents of this website are as follows; \
please provide a short summary of this website in markdown. \
If it includes news or announcements, then summarize these too.\n\n"
    prompt += website.text
    return prompt

def message_for(website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_for(website)}
    ]

def summarize(url):
    website = Website(url)
    response = ollama_via_openai.chat.completions.create(
        model='deepseek-r1:1.5b',
        # model=MODEL,  # swap in to use Llama 3.2 instead
        messages=message_for(website)
    )
    # print() returns None, so print the reply rather than returning it
    print(response.choices[0].message.content)

summarize("https://github.com/dhaneswaramandrasa")

<think>
Okay, so I'm trying to write a short summary for this GitHub repository about dhaneswaramandrasa. From what I know, the person is someone with a credentials like an email address and 4 followers. They've contributed 43 repositories, 0 packages, and 27 stars. It seems active.

The project they're involved in is a Jupyter notebook for predicting Starbucks member behavior using Offer View and prediction models. That's pretty specific to some finance or data science work. But I'm not exactly sure why these prediction techniques are being used. Maybe it's predicting how people will spend money on drinks, which could be smart because more beverage sales mean more profit.

Also, there was a project about the NLP pipeline for classifying messages in disasters. That probably means managing security or helping events where critical data is involved. Dog breed classifiers using CNNs. Interesting maybe for something tech that needs accurate predictions. TV script generation with RNN/LSTM m

In [26]:
# Note: these assignments replace the system_prompt and user_prompt defined in the previous cell
system_prompt = "You are an assistant who is skilled in frontend development. Please use English."

user_prompt = "Please make me a simple website that contains my portfolio and skills, with no extensive coding needed."

def message_for():
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt}
    ]

# Renamed from summarize(): this cell generates a website rather than summarizing one
def generate_website():
    response = ollama_via_openai.chat.completions.create(
        model='deepseek-r1:1.5b',
        # model=MODEL,  # swap in to use Llama 3.2 instead
        messages=message_for()
    )
    print(response.choices[0].message.content)

generate_website()

<think>
Alright, the user wants a simple website for their portfolio and skills without extensive coding. So, I need to keep it lightweight but still effective.

First, I'll use HTML since it's straightforward and can handle most basic requirements. A single page site is ideal here.

Next, including social media links would make the site look more modern. I remember that using anchor tags like https://linkedin.com/in/ and https://twitter.design/ adds nice icons which can be interactive with libraries like Facebook's GraphAPI. But since the user didn't mention needing social interactions, maybe just keeping it static is fine.

Then, the profile image section should have a comment box where they comment on their skills. This keeps things simple without getting too complex.

Styling-wise, I'll use basic CSS to make it look clean and mobile-friendly. Maybe some padding and borders can be added later for more depth if needed.

I need to mention that while this is straightforward, if they wa