# My First Lab = My 1st Frontier LLM Project
## Summarize All Websites without Selenium
This simple "app" uses Jina Reader (https://jina.ai/reader) to turn any website into markdown before an LLM summarizes it. As their website says: "Convert a URL to LLM-friendly input, by simply adding r.jina.ai in front". Jina has other tools that look useful too.




In [None]:
# imports

import os
import requests                                 # added for jina
from dotenv import load_dotenv
# from scraper import fetch_website_contents    # not needed for jina
from IPython.display import Markdown, display
from openai import OpenAI


In [None]:
# Load environment variables from a file called .env

load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

# Check the key

if not api_key:
    print("No API key was found - please head over to the troubleshooting notebook in this folder to identify & fix!")
elif not api_key.startswith("sk-proj-"):
    print("An API key was found, but it doesn't start sk-proj-; please check you're using the right key - see troubleshooting notebook")
elif api_key.strip() != api_key:
    print("An API key was found, but it looks like it might have space or tab characters at the start or end - please remove them - see troubleshooting notebook")
else:
    print("API key found and looks good so far!")

# Setup access to the frontier model

openai = OpenAI()

In [None]:
# Step 1-a: Define the user prompt

user_prompt_prefix = """
Here are the contents of a website.
Provide a short summary of this website.
If it includes news or announcements, then summarize these too.
"""

In [None]:
# Step 1-b: Define the system prompt

system_prompt = """
You are a smart assistant that analyzes the contents of a website,
and provides a short, clear, summary, ignoring text that might be navigation related.
Respond in markdown. Do not wrap the markdown in a code block - respond just with the markdown.
"""

In [None]:
# Add the website content to the user prompt

def messages_for(website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_prefix + website}
    ]

In [None]:
# Step 5: Change the content utility to use jina

def fetch_url_content(url):
    jina_reader_url = f"https://r.jina.ai/{url}"
    try:
        response = requests.get(jina_reader_url, timeout=30)   # Give up after 30 seconds instead of hanging
        response.raise_for_status()                     # Raise an exception for HTTP errors
        return response.text
    except requests.exceptions.RequestException as e:
        print(f"Error fetching URL: {e}")
        return None


In [None]:
# Step 3: Call OpenAI & Step 4: print the result

def summarize(url):
    website = fetch_url_content(url)
    if website is None:                                 # Fetch failed; the error was already printed
        return
    response = openai.chat.completions.create(
        model = "gpt-5-nano",
        messages = messages_for(website)
    )
    summary = response.choices[0].message.content
    display(Markdown(summary))


In [None]:
summarize("https://edwarddonner.com")

In [None]:
summarize("https://cnn.com")

In [None]:
summarize("https://openai.com")

## Content Summary vs Technical Summary

In my work, a technical summary of a website, or group of websites, would be useful too. For example: is it rendered on the server (HTML) or in the browser (JavaScript), which content management system (CMS) was used, and how many pages, outbound links, and inbound links does it have? Doing this exercise, I realized that LLMs can help with analyzing content, but I may need other tools to count pages, links, and other technical details.
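
As a rough illustration of the "other tools" idea, here is a minimal sketch that counts internal and outbound links using only the Python standard library. It assumes you already have the raw HTML of a page (for example from `requests.get(url).text`, not the markdown that Jina returns). The names `LinkCounter` and `count_links` and the sample HTML are made up for this example. Inbound links cannot be counted this way, since they live on other sites.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCounter(HTMLParser):
    """Collects href targets from <a> tags and sorts them by host (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.base_host = urlparse(base_url).netloc.lower()
        self.internal = set()   # unique links pointing at the same host
        self.outbound = set()   # unique links pointing at any other host

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        absolute = urljoin(self.base_url, href)     # resolve relative links
        parsed = urlparse(absolute)
        if parsed.scheme not in ("http", "https"):
            return                                  # skip mailto:, tel:, javascript:, etc.
        if parsed.netloc.lower() == self.base_host:
            self.internal.add(absolute)
        else:
            self.outbound.add(absolute)


def count_links(html, base_url):
    """Return counts of unique internal and outbound links found in the HTML."""
    parser = LinkCounter(base_url)
    parser.feed(html)
    return {"internal": len(parser.internal), "outbound": len(parser.outbound)}


# Made-up sample page, so the sketch runs without network access
sample_html = """
<a href="/about">About</a>
<a href="https://example.com/blog">Blog</a>
<a href="https://other.example.org/">Partner</a>
<a href="mailto:hi@example.com">Email</a>
"""
link_counts = count_links(sample_html, "https://example.com/")
print(link_counts)
```

The counts could then be added to the user prompt alongside the page content, so the LLM reports numbers it was given instead of guessing them.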

A shout-out to whoever put "Market_Research_Agent.ipynb" in Community-Contributions. It is a great example of using an LLM as a management consultant. I think Jina might help with this use case by offering web search results through an API that you can feed to your LLM. Here is the system prompt from that notebook; I plan to use this format often.

system_prompt = """You are to act like a McKinsey consultant specializing in market research.
1) You are to follow legal guidelines and never give immoral advice.
2) Your job is to maximise profits for your clients by analysing their company's initiatives and giving out recommendations for newer initiatives.
3) Follow industry frameworks for responses; always give simple answers and stick to the point.
4) If possible, try to see what competitors exist and what market gap your client's company can exploit.
5) Furthermore, use SWOT and Porter's Five Forces to summarize your recommendations. Give a confidence score with every recommendation.
6) Try to give unique solutions by seeing what the market gap is; if the market gap is ambiguous, skip this step.
7) Add an estimate of the rate at which the company's revenue will increase provided they follow the guidelines; give conservative estimates that take non-ideal conditions into account.
8) If the website isn't that of a company or data isn't available, give out an error message along the lines of "more data required for analysis"."""