# Welcome to your first assignment!

Instructions are below. Please give this a try, and look in the solutions folder if you get stuck (or feel free to ask me!)

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../resources.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#f71;">Just before we get to the assignment --</h2>
            <span style="color:#f71;">I thought I'd take a second to point you at this page of useful resources for the course. This includes links to all the slides.<br/>
            <a href="https://edwarddonner.com/2024/11/13/llm-engineering-resources/">https://edwarddonner.com/2024/11/13/llm-engineering-resources/</a><br/>
            Please keep this bookmarked, and I'll continue to add more useful links there over time.
            </span>
        </td>
    </tr>
</table>

# HOMEWORK EXERCISE ASSIGNMENT

Upgrade the day 1 webpage summarizer project to use an open-source model running locally via Ollama instead of OpenAI.

You'll be able to use this technique for all subsequent projects if you'd prefer not to use paid APIs.

**Benefits:**
1. No API charges - open-source
2. Data doesn't leave your box

**Disadvantages:**
1. Significantly less powerful than a frontier model

## Recap on installation of Ollama

Simply visit [ollama.com](https://ollama.com) and install!

Once complete, the ollama server should already be running locally.  
If you visit:  
[http://localhost:11434/](http://localhost:11434/)

You should see the message `Ollama is running`.  

If not, bring up a new Terminal (Mac) or PowerShell (Windows) and enter `ollama serve`  
And in another Terminal (Mac) or PowerShell (Windows), enter `ollama pull llama3.2`  
Then try [http://localhost:11434/](http://localhost:11434/) again.

If Ollama is slow on your machine, try using `llama3.2:1b` as an alternative. Run `ollama pull llama3.2:1b` from a Terminal or PowerShell, and set `MODEL = "llama3.2:1b"` in the code below.
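The browser check described above can also be done from Python. The following is a minimal sketch using only the standard library; the function name `is_ollama_running` and the two-second default timeout are assumptions made for illustration and are not part of Ollama itself.

```python
import http.client
import urllib.request


def is_ollama_running(base_url="http://localhost:11434", timeout=2.0):
    """Return True if the server at base_url replies with 'Ollama is running'."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
    except (OSError, http.client.HTTPException):
        # Refused connections, timeouts and HTTP errors all end up here
        return False
    return "Ollama is running" in body
```

Calling `is_ollama_running()` before the cells below gives an early, readable signal that `ollama serve` still needs to be started.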

In [13]:
# imports

import requests
from bs4 import BeautifulSoup
from IPython.display import Markdown, display

In [19]:
# Constants

OLLAMA_API = "http://localhost:11434/api/chat"
HEADERS = {"Content-Type": "application/json"}
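MODEL = "gemma"  # this run uses Gemma (fetch it with `ollama pull gemma`); set "llama3.2" to follow the instructions above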

In [20]:
# Create a messages list using the same format that we used for OpenAI

messages = [
    {"role": "user", "content": "Describe some of the business applications of Generative AI"}
]

In [21]:
payload = {
        "model": MODEL,
        "messages": messages,
        "stream": False
    }

In [22]:
response = requests.post(OLLAMA_API, json=payload, headers=HEADERS)
print(response.json()['message']['content'])

## Generative AI applications in business:

**1. Content creation and marketing:**

* Writing of marketing materials, website copy, and social media posts
* Generating unique product descriptions and reviews
* Creating realistic customer testimonials and product demos


**2. Customer service and support:**

* Automating customer service responses and FAQs
* Transcribing and summarizing customer calls for analysis
* Generating personalized responses based on customer sentiment


**3. Product development and design:**

* Brainstorming new product ideas and generating concepts
* Designing user interfaces and product mockups
* Optimizing product packaging for improved user experience


**4. Research and analysis:**

* Summarizing research papers and industry reports
* Identifying market trends and competitive analysis
* Generating insights from large datasets to predict future outcomes


**5. Operations and efficiency:**

* Automating repetitive tasks and workflows
* Transcribing meeting n

In [23]:
response

<Response [200]>
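`<Response [200]>` confirms that the HTTP call succeeded. When it does not (for example, if the model has not been pulled), indexing `['message']['content']` directly fails with an unhelpful `KeyError`. Here is a small sketch of a more defensive extraction. The helper name `content_from_chat_response` is hypothetical, and the presence of an `error` field in failed replies is an assumption about how the server reports problems.

```python
def content_from_chat_response(status_code, body):
    """Pull the assistant text out of a non-streamed /api/chat reply.

    status_code: the HTTP status, e.g. response.status_code
    body: the parsed JSON dict, e.g. response.json()
    """
    if status_code != 200:
        # Assumption: failed replies carry a human-readable "error" field
        detail = body.get("error", "no error detail in response")
        raise RuntimeError(f"Ollama request failed ({status_code}): {detail}")
    return body["message"]["content"]
```

With the cell above, this would be used as `content_from_chat_response(response.status_code, response.json())`.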

# Introducing the ollama package

And now we'll do the same thing, but using the elegant ollama Python package instead of a direct HTTP call.

Under the hood, it's making the same call as above to the Ollama server running at localhost:11434.

In [24]:
import ollama

response = ollama.chat(model=MODEL, messages=messages)
print(response['message']['content'])

## Generative AI applications in business:

**1. Content Creation & Marketing:**

* Automated generation of high-quality marketing materials like website copy, ad copy, email campaigns, and product descriptions.
* Creation of unique and engaging customer stories and brand narratives.
* Sentiment analysis and automatic generation of targeted content based on audience preferences.


**2. Customer Service & Support:**

* Automated response generation to customer inquiries and complaints.
* Personalized product recommendations and support content based on user preferences and purchase history.
* Automated chatbots for streamlined customer service experiences.


**3. Research & Development:**

* Generating hypotheses and research papers based on existing data sets.
* Summarizing complex scientific documents and generating actionable insights.
* Developing innovative product concepts and market strategies.


**4. Product Development & Design:**

* Generating product ideas and brainstorming p

## Alternative approach - using the OpenAI Python library to connect to Ollama

In [25]:
# There's actually an alternative approach that some people might prefer
# You can use the OpenAI client python library to call Ollama:

from openai import OpenAI
ollama_via_openai = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')

response = ollama_via_openai.chat.completions.create(
    model=MODEL,
    messages=messages
)

print(response.choices[0].message.content)

## Business Applications of Generative AI:

**1. Content Creation & Editing:**

* Automated document and website content generation
* Summarization and translation of large datasets
* Personalized content recommendations for users
* Automated document review and editing for compliance 


**2. Customer Engagement & Research:**

* Generation of unique and engaging customer service chatbots and dialogue models
* Analysis of customer feedback and market research data
* Creation of personalized customer offers and recommendations


**3. Marketing & Sales:**

* Automatic generation of targeted ad copy and social media content
* Simulation of market trends and product testing
* Market research and competitive analysis


**4. Product & Service Development:**

* Design and optimization of new products and services based on user preferences
* Generation of innovative ideas and concepts through brainstorming


**5. Research & Development:**

* Analysis and interpretation of scientific research da
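Because the OpenAI client only needs a different `base_url` to talk to Ollama, the calling code can stay the same for both backends. A minimal sketch of that idea follows; the helper name `ask` is hypothetical, and it assumes only that the client exposes `chat.completions.create` as used in the cell above.

```python
def ask(client, model, messages):
    """Send messages through any OpenAI-style client and return the reply text."""
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content
```

With the objects defined earlier, `ask(ollama_via_openai, MODEL, messages)` would return the same kind of text that the cell above prints, and swapping in a client pointed at OpenAI requires no change to the helper.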

# NOW the exercise for you

Take the code from day 1 and incorporate it here to build a website summarizer that uses Llama 3.2 running locally instead of OpenAI; use any of the above approaches.

In [56]:
# Only the imports this local (Ollama) version needs
import requests
from bs4 import BeautifulSoup
from IPython.display import Markdown, display


In [57]:
headers = {
 "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

class Website:

    def __init__(self, url):
        """
        Create this Website object from the given url using the BeautifulSoup library
        """
        self.url = url
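        response = requests.get(url, headers=headers)
        soup = BeautifulSoup(response.content, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
        if soup.body:
            # Drop elements that carry no readable text before extracting it
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            # Pages without a <body> would otherwise raise a TypeError here
            self.text = ""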

attention_paper_html = Website(url='https://arxiv.org/html/1706.03762v7')
attention_paper_html.title

'Attention Is All You Need'

In [58]:
system_prompt = "You are a scientist helper which reads papers published on the web \
and provides a short summary, ignoring text that might be navigation related. \
Respond in markdown."

In [59]:
def user_prompt_for(website):
    user_prompt = f"You are looking at a website titled {website.title}"
    user_prompt += "\nThe contents of this website is as follows; \
please provide a short summary of this website in markdown. \n\n"
    user_prompt += website.text
    return user_prompt

In [60]:
def messages_for(website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_for(website)}
    ]

In [61]:
payload = {
        "model": MODEL,
        "messages": messages_for(attention_paper_html),
        "stream": False
    }
payload

{'model': 'gemma',
 'messages': [{'role': 'system',
   'content': 'You are a scientist helper which reads papers published on the web and provides a short summary, ignoring text that might be navigation related. Respond in markdown.'},
  {'role': 'user',
   'content': 'You are looking at a website titled Attention Is All You Need\nThe contents of this website is as follows; please provide a short summary of this website in markdown. \n\n1\nIntroduction\n2\nBackground\n3\nModel Architecture\n3.1\nEncoder and Decoder Stacks\nEncoder:\nDecoder:\n3.2\nAttention\n3.2.1\nScaled Dot-Product Attention\n3.2.2\nMulti-Head Attention\n3.2.3\nApplications of Attention in our Model\n3.3\nPosition-wise Feed-Forward Networks\n3.4\nEmbeddings and Softmax\n3.5\nPositional Encoding\n4\nWhy Self-Attention\n5\nTraining\n5.1\nTraining Data and Batching\n5.2\nHardware and Schedule\n5.3\nOptimizer\n5.4\nRegularization\nResidual Dropout\nLabel Smoothing\n6\nResults\n6.1\nMachine Translation\n6.2\nModel Variati

In [62]:
response = requests.post(OLLAMA_API, json=payload, headers=HEADERS)
print(response.json()['message']['content'])

**Summary:**

The provided text discusses the attention mechanism in deep learning models, specifically focusing on its ability to capture long-distance dependencies and perform tasks such as anaphora resolution and sentence structure analysis.

**Key Concepts:**

* **Attention mechanism:** A technique that allows neural networks to focus on specific parts of the input data, improving their performance on tasks like sequence-to-sequence learning and machine translation.
* **Long-distance dependencies:** Relationships between words or phrases that are far apart in the sentence.
* **Anaphora resolution:** The process of identifying and interpreting pronouns and other referents in a sentence.
* **Sentence structure analysis:** Understanding the hierarchical structure of a sentence, including its constituent parts and relationships.

**Findings:**

* The attention mechanism can effectively capture long-distance dependencies in the encoder self-attention.
* Attention heads can be involved i
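The summary above concentrates on material from the end of the paper (the attention visualizations) rather than the paper as a whole. One possible explanation, which is an assumption and not something verified in this notebook, is that the page text is longer than the local model's context window, so only part of it is used. A simple mitigation is to cap the text before building the prompt. The helper name `clip_text` and the 8000-character default are arbitrary choices for illustration.

```python
def clip_text(text, max_chars=8000):
    """Keep the start of text, cutting at a line break when one is available."""
    if len(text) <= max_chars:
        return text
    clipped = text[:max_chars]
    last_break = clipped.rfind("\n")
    if last_break > 0:
        # Avoid ending in the middle of a line
        clipped = clipped[:last_break]
    return clipped + "\n[text truncated]"
```

In `user_prompt_for`, this would replace `website.text` with `clip_text(website.text)`. Keeping the start of the page favors the abstract and introduction, which tends to suit a short summary.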

In [63]:
# The following lines are for debugging the response when you ask for a streamed answer
# (set "stream": True in the payload and re-run the request before running them).

In [54]:
import json

In [55]:
for chunk in response.iter_lines():
    if chunk:
        # Decode chunk and parse as JSON
        data = json.loads(chunk.decode('utf-8'))
        print(data)

{'model': 'gemma', 'created_at': '2024-12-22T22:04:32.172502456Z', 'message': {'role': 'assistant', 'content': '##'}, 'done': False}
{'model': 'gemma', 'created_at': '2024-12-22T22:04:33.889062232Z', 'message': {'role': 'assistant', 'content': ' Summary'}, 'done': False}
{'model': 'gemma', 'created_at': '2024-12-22T22:04:34.40955523Z', 'message': {'role': 'assistant', 'content': ' of'}, 'done': False}
{'model': 'gemma', 'created_at': '2024-12-22T22:04:34.959623569Z', 'message': {'role': 'assistant', 'content': ' the'}, 'done': False}
{'model': 'gemma', 'created_at': '2024-12-22T22:04:35.658893466Z', 'message': {'role': 'assistant', 'content': ' Attention'}, 'done': False}
{'model': 'gemma', 'created_at': '2024-12-22T22:04:36.237897497Z', 'message': {'role': 'assistant', 'content': ' Visual'}, 'done': False}
{'model': 'gemma', 'created_at': '2024-12-22T22:04:36.806951449Z', 'message': {'role': 'assistant', 'content': 'izations'}, 'done': False}
{'model': 'gemma', 'created_at': '2024-12-
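Each chunk printed above carries a fragment of the answer in `message.content` and a `done` flag. The sketch below stitches those fragments back into one string. The helper name `join_stream_chunks` is hypothetical, and it accepts any iterable of raw lines, such as `response.iter_lines()`.

```python
import json


def join_stream_chunks(lines):
    """Concatenate message fragments from newline-delimited JSON chunks."""
    parts = []
    for line in lines:
        if not line:
            continue  # skip empty lines between chunks
        if isinstance(line, bytes):
            line = line.decode("utf-8")
        data = json.loads(line)
        # Some chunks may lack a message; treat that as an empty fragment
        parts.append(data.get("message", {}).get("content", ""))
        if data.get("done"):
            break
    return "".join(parts)
```

The result can then be rendered in the notebook with `display(Markdown(join_stream_chunks(response.iter_lines())))`.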