# Lab2: Local Open Source on My PC Project
## Summarize All Websites without Selenium Using Open Source Models
This builds on my app from yesterday, which uses Jina Reader (https://jina.ai/reader) to turn a website into markdown before an LLM summarizes it. Here it uses Ollama to run open source LLMs locally on my PC. Note that Jina is not local, so to be totally local you might need to go back to Selenium for JavaScript sites.




In [None]:
# imports

import requests
from IPython.display import Markdown, display
from openai import OpenAI


In [None]:
# Setup access to the Ollama models

OLLAMA_BASE_URL = "http://localhost:11434/v1"

ollama = OpenAI(base_url=OLLAMA_BASE_URL, api_key='ollama')
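
Before calling a model, it can help to confirm that it has actually been pulled. The following is a minimal sketch, assuming the Ollama server is running on its default port and that its native `/api/tags` endpoint is used to list local models. The helper names `local_model_names`, `is_model_available`, and `check_model` are hypothetical and not part of the original notebook; only the standard library is used.

```python
import json
from urllib.request import urlopen

OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"  # assumed default Ollama address


def local_model_names(tags_payload):
    """Extract model names from the JSON returned by Ollama's /api/tags."""
    return [m.get("name", "") for m in tags_payload.get("models", [])]


def is_model_available(model, names):
    """True if `model` matches a local name, with or without an explicit tag."""
    wanted = model if ":" in model else f"{model}:latest"
    return wanted in names


def check_model(model):
    """Ask the local Ollama server whether `model` has been pulled."""
    with urlopen(OLLAMA_TAGS_URL, timeout=10) as response:
        payload = json.load(response)
    return is_model_available(model, local_model_names(payload))
```

If `check_model("llama3.2")` returns `False`, running `ollama pull llama3.2` in a terminal should download the model.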


In [None]:
# Step 1-a: Define the user prompt

user_prompt_prefix = """
Here are the contents of a website.
Provide a short summary of this website.
If it includes news or announcements, then summarize these too.
Make recommendations for improvement.
"""

In [None]:
# Step 1-b: Define the system prompt

system_prompt = """You are to act like a smart McKinsey consultant specializing in website analysis.
1) You should provide a short, clear summary, ignoring text that might be navigation related.
2) Follow the summary by making recommendations for improving the website so it is better at serving its purpose.
3) Follow industry frameworks for responses. Always give simple answers and stick to the point.
4) If possible, try to group your recommendations, for example Grammar and Style, Clarity, Functional, etc.
5) Give confidence scores with every recommendation.
6) Always provide a summary of the website, explaining what it is.
7) If you do not understand the website's purpose or have no improvement recommendations, give an error message along the lines of "more data required for analysis" or ask a follow-up question.
8) Respond in markdown. Do not wrap the markdown in a code block - respond just with the markdown."""


In [None]:
# Add the website content to the user prompt

def messages_for(website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_prefix + website}
    ]
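
To make the structure that `messages_for` produces concrete, here is a self-contained sketch. The two prompt strings are shortened placeholders rather than the full prompts used in this notebook, and the `demo_` names are hypothetical so they do not overwrite the real variables.

```python
# Shortened stand-ins for the notebook's prompts (placeholders for illustration only)
demo_system_prompt = "You are a consultant specializing in website analysis."
demo_user_prompt_prefix = "Here are the contents of a website.\n"


def demo_messages_for(website):
    """Build the two-message chat payload: system instructions, then user content."""
    return [
        {"role": "system", "content": demo_system_prompt},
        {"role": "user", "content": demo_user_prompt_prefix + website},
    ]


sample = demo_messages_for("# Example Site\nWelcome to the example site.")
for message in sample:
    # Show the role and the first part of each message
    print(message["role"], "->", repr(message["content"][:40]))
```

The website markdown is simply appended to the user prompt, so the model sees the instructions first and the page content after them.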

In [None]:
# Step 5: Change the content utility to use Jina Reader

def fetch_url_content(url):
    """Fetch a page as markdown via Jina Reader; return None on failure."""
    jina_reader_url = f"https://r.jina.ai/{url}"
    try:
        response = requests.get(jina_reader_url, timeout=30)  # Do not hang forever on a slow site
        response.raise_for_status()                     # Raise an exception for HTTP errors
        return response.text
    except requests.exceptions.RequestException as e:
        print(f"Error fetching URL: {e}")
        return None


In [None]:
# Step 3: Call Ollama model & Step 4: print the result

def summarize(url):
    website = fetch_url_content(url)
    if website is None:                     # Fetch failed; the error was already printed
        return
    response = ollama.chat.completions.create(
        model=omodel,                       # Global, set in each cell below before calling
        messages=messages_for(website)
    )
    summary = response.choices[0].message.content
    display(Markdown(summary))
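
Reasoning models such as `deepseek-r1` may include their working inside `<think>...</think>` tags in the reply, which clutters the rendered summary. Below is a minimal sketch of stripping those sections before display, assuming the tags appear literally in the message content. The helper `strip_think_blocks` is hypothetical and not part of the original notebook.

```python
import re

# Non-greedy match so multiple <think> sections are removed independently
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def strip_think_blocks(text):
    """Remove <think>...</think> sections and trim surrounding whitespace."""
    return THINK_BLOCK.sub("", text).strip()


reply = "<think>\nThe page looks like a portfolio.\n</think>\n\n## Summary\nA personal site."
print(strip_think_blocks(reply))
```

In `summarize`, this could be applied to `summary` just before it is passed to `Markdown`. Replies without the tags pass through unchanged apart from whitespace trimming.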


In [None]:
omodel = "llama3.2"
summarize("https://edwarddonner.com")

In [None]:
omodel = "deepseek-r1:1.5b"
summarize("https://edwarddonner.com")

In [None]:
omodel = "llama3.2"
summarize("https://cnn.com")

In [None]:
omodel = "deepseek-r1:1.5b"
summarize("https://cnn.com")

In [None]:
omodel = "llama3.2"
summarize("https://openai.com")

In [None]:
omodel = "deepseek-r1:1.5b"
summarize("https://openai.com")
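
The six cells above repeat the same two lines with different values. One possible way to express the same comparison is a loop. The sketch below only builds and prints the (url, model) pairs in the same order as the cells; the commented-out lines assume `summarize` and the global `omodel` from this notebook. The names `MODELS`, `URLS`, and `comparison_runs` are hypothetical.

```python
from itertools import product

MODELS = ["llama3.2", "deepseek-r1:1.5b"]
URLS = ["https://edwarddonner.com", "https://cnn.com", "https://openai.com"]


def comparison_runs(urls, models):
    """Return (url, model) pairs grouped by site, so outputs for one site sit together."""
    return list(product(urls, models))


for url, model in comparison_runs(URLS, MODELS):
    print(f"{model:>18}  {url}")
    # In the notebook, the actual call would go here (assumes the cells above were run):
    # omodel = model
    # summarize(url)
```

Grouping by site keeps the two models' summaries of the same page next to each other, which makes them easier to compare.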