In [3]:
!ollama list

NAME                    ID              SIZE      MODIFIED       
llama3.2:latest         a80c4f17acd5    2.0 GB    47 seconds ago    
deepseek-r1:7b          0a8c26691023    4.7 GB    2 months ago      
codellama:latest        8fdf8f752f6e    3.8 GB    4 months ago      
qwen2.5-coder:latest    2b0496514337    4.7 GB    4 months ago      


In [23]:
# imports

import requests
from bs4 import BeautifulSoup
from IPython.display import Markdown, display

In [24]:
OLLAMA_API = "http://localhost:11434/api/chat"
HEADERS = {"Content-Type": "application/json"}
MODEL = "llama3.2"

In [25]:
# A class to represent a Webpage
# If you're not familiar with Classes, check out the "Intermediate Python" notebook

# Some websites need you to use proper headers when fetching them:
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

class Website:

    def __init__(self, url):
        """
        Create this Website object from the given url using the BeautifulSoup library
        """
        self.url = url
        response = requests.get(url, headers=headers)
        soup = BeautifulSoup(response.content, 'html.parser')
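        self.title = soup.title.string if soup.title else "No title found"
        # Some responses have no <body>; fall back to empty text instead of raising
        if soup.body:
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            self.text = ""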

In [26]:

system_prompt = "You are an assistant that analyzes the contents of a website \
and provides a short summary, ignoring text that might be navigation related. \
Respond in markdown."

In [27]:
def user_prompt_for(website):
    user_prompt = f"You are looking at a website titled {website.title}"
    user_prompt += "\nThe contents of this website is as follows; \
please provide a short summary of this website in markdown. \
If it includes news or announcements, then summarize these too.\n\n"
    user_prompt += website.text
    return user_prompt

In [29]:
# payload = {
#     "model": MODEL,
#     "messages": messages,
#     "stream": False
# }
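
The commented-out payload above has the shape that Ollama's `/api/chat` endpoint accepts over plain HTTP. The sketch below shows that route as an alternative to the `ollama` package, assuming an Ollama server is running locally on the default port. `build_payload` and `chat_via_http` are hypothetical helper names that are not part of this notebook, and the request goes through the standard library's `urllib` to keep the example dependency-free; `requests.post(OLLAMA_API, json=payload, headers=HEADERS)` would be the equivalent call with the imports already used here.

```python
import json
import urllib.request

OLLAMA_API = "http://localhost:11434/api/chat"
HEADERS = {"Content-Type": "application/json"}
MODEL = "llama3.2"

def build_payload(messages, model=MODEL):
    """Assemble the JSON body for a non-streaming chat request."""
    return {"model": model, "messages": messages, "stream": False}

def chat_via_http(messages, model=MODEL):
    """POST the payload to the local Ollama server and return the reply text."""
    body = json.dumps(build_payload(messages, model)).encode("utf-8")
    request = urllib.request.Request(OLLAMA_API, data=body, headers=HEADERS, method="POST")
    with urllib.request.urlopen(request) as response:
        reply = json.loads(response.read().decode("utf-8"))
    # With "stream": False the server answers with a single JSON object
    # whose "message" field holds the assistant reply.
    return reply["message"]["content"]
```

Only `build_payload` runs without a server; `chat_via_http` needs Ollama to be up and the model to be pulled.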

In [37]:
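import ollama

def messages_for(website):
    # Message structure matches the messages_for(ed) output shown in this notebook
    return [{"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt_for(website)}]

def summarise(url):
    website = Website(url)
    response = ollama.chat(model=MODEL, messages=messages_for(website))
    return response['message']['content']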

In [38]:
summarise("https://edwarddonner.com")

'# Website Summary\n### Home Page\n\n* The website belongs to Edward Donner, a co-founder and CTO of Nebula.io.\n* He shares his passion for writing code, experimenting with LLMs, and DJing.\n* Nebula.io applies AI to help people discover their potential, using proprietary LLMs in talent management.\n\n### News and Announcements\n\n* **LLM Workshop – Hands-on with Agents** (January 23, 2025): A workshop where attendees can engage with agents.\n* **Welcome, SuperDataScientists!** (November 13, 2024): An announcement for the community.\n* **Mastering AI and LLM Engineering – Resources** (October 16, 2024): Resources for learning about mastering AI and LLM engineering.\n* **From Software Engineer to AI Data Scientist – resources** (December 21, 2024): Resources to help professionals transition to AI data science.\n\n### General Information\n\n* Connect Four: A section that appears to be a game or puzzle, possibly related to LLMs.\n* Outsmart: An arena where LLMs engage in a battle of dipl

In [39]:
def display_summary(url):
    summary = summarise(url)
    display(Markdown(summary))

In [40]:
display_summary("https://edwarddonner.com")

### Summary of Edward Donner's Website

#### About the Founder
Ed is a writer, DJ, and amateur electronic music producer. He is the co-founder and CTO of Nebula.io, an AI startup that applies machine learning to help people discover their potential.

#### Company Overview
Nebula.io develops proprietary LLMs for talent matching and has patented its matching model. The platform has received award-winning recognition and press coverage.

#### News and Announcements

* **December 21, 2024**: Announcement of an upcoming LLM Workshop – Hands-on with Agents – resources.
* **November 13, 2024**: Welcome message for SuperDataScientists!
* **October 16, 2024**: Resources available on mastering AI and LLM Engineering.
* **January 23, 2025**: Upcoming event announcement (no details provided).

#### Contact Information
Reach out to Ed at [ed@edwarddonner.com](mailto:ed@edwarddonner.com) or follow him on social media:

* LinkedIn
* Twitter
* Facebook

In [None]:
import ollama

response = ollama.chat(model=MODEL, messages=messages)
print(response['message']['content'])

In [16]:
messages_for(ed)

[{'role': 'system',
  'content': 'You are an assistant that analyzes the contents of a website and provides a short summary, ignoring text that might be navigation related. Respond in markdown.'},
 {'role': 'user',
  'content': 'You are looking at a website titled Home - Edward Donner\nThe contents of this website is as follows; please provide a short summary of this website in markdown. If it includes news or announcements, then summarize these too.\n\nHome\nConnect Four\nOutsmart\nAn arena that pits LLMs against each other in a battle of diplomacy and deviousness\nAbout\nPosts\nWell, hi there.\nI’m Ed. I like writing code and experimenting with LLMs, and hopefully you’re here because you do too. I also enjoy DJing (but I’m badly out of practice), amateur electronic music production (\nvery\namateur) and losing myself in\nHacker News\n, nodding my head sagely to things I only half understand.\nI’m the co-founder and CTO of\nNebula.io\n. We’re applying AI to a field where it can make a

In [25]:
import json  # needed for json.dumps below

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import undetected_chromedriver as uc  # You'll need to pip install this

class Website:
    def __init__(self, url):
        """
        Create this Website object using undetected-chromedriver to handle Cloudflare
        """
        self.url = url
        self.title = "..."
        self.text = ""
        
        # Configure Chrome options
        options = uc.ChromeOptions()
        options.add_argument('--headless')
        options.add_argument('--no-sandbox')
        options.add_argument('--disable-dev-shm-usage')
        # Add more browser-like headers
        options.add_argument('--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36')
        
        try:
            # Use undetected-chromedriver instead of regular selenium
            self.driver = uc.Chrome(options=options)
            self.driver.get(self.url)
            
            # Wait longer for Cloudflare to clear
            WebDriverWait(self.driver, 20).until(
                EC.presence_of_element_located((By.CLASS_NAME, "calendar__table"))
            )
            
            # Extract calendar specific data
            calendar_table = self.driver.find_element(By.CLASS_NAME, "calendar__table")
            today_events = calendar_table.find_elements(By.CSS_SELECTOR, "tr.calendar__row.calendar_row")
            
            # Process the events
            events_data = []
            for event in today_events:
                try:
                    time = event.find_element(By.CLASS_NAME, "calendar__time").text
                    currency = event.find_element(By.CLASS_NAME, "calendar__currency").text
                    impact = event.find_element(By.CLASS_NAME, "calendar__impact").get_attribute("title")
                    event_name = event.find_element(By.CLASS_NAME, "calendar__event").text
                    
                    events_data.append({
                        "time": time,
                        "currency": currency,
                        "impact": impact,
                        "event": event_name
                    })
                except Exception:
                    # Skip rows that are missing any of the expected cells
                    continue
            
            # Store the formatted data
            self.title = self.driver.title
            self.events = events_data
            self.text = json.dumps(events_data, indent=2)  # Store as formatted JSON
            
        except Exception as e:
            print(f"Error loading page: {str(e)}")
            self.title = "Error loading page"
            self.text = ""
            self.events = []
            
        finally:
            # The driver only exists if uc.Chrome() succeeded
            if hasattr(self, 'driver'):
                self.driver.quit()

    def to_dict(self):
        """Convert Website object to JSON-serializable dictionary"""
        return {
            "url": self.url,
            "title": self.title,
            "events": self.events
        }

    def __str__(self):
        """Human-readable representation"""
        return f"Website(url={self.url}, events_count={len(self.events)})"

In [19]:
import json
import openai  # used by summarize(); assumes OPENAI_API_KEY is set in the environment

class WebsiteEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, Website):
            return obj.to_dict()
        return super().default(obj)

def summarize(website_obj):
    # Convert to JSON-serializable format first
    website_data = json.dumps(website_obj, cls=WebsiteEncoder)
    
    messages = [
        {"role": "system", "content": "Summarize this website:"},
        {"role": "user", "content": website_data}
    ]
    
    response = openai.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages
    )
    return response.choices[0].message.content
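
To see what `WebsiteEncoder` contributes, it helps to watch `json.dumps` with and without an object that needs it. The sketch below uses a stand-in `Page` class and a matching `PageEncoder` (both hypothetical, so that nothing touches the network or a browser); the mechanism is the same as in the cell above: `default` is only consulted for objects the standard encoder cannot serialize.

```python
import json

class Page:
    """Hypothetical stand-in for Website: holds data, does no network access."""
    def __init__(self, url, title, events):
        self.url = url
        self.title = title
        self.events = events

    def to_dict(self):
        return {"url": self.url, "title": self.title, "events": self.events}

class PageEncoder(json.JSONEncoder):
    def default(self, obj):
        # Called only for objects json cannot serialize natively
        if isinstance(obj, Page):
            return obj.to_dict()
        return super().default(obj)

page = Page("https://www.forexfactory.com/calendar", "ForexFactory Calendar", [])
encoded = json.dumps(page, cls=PageEncoder)
decoded = json.loads(encoded)

# A plain string is already serializable, so default() is never consulted
encoded_url = json.dumps("https://www.forexfactory.com/calendar", cls=PageEncoder)
```

The second call matters for `summarize`: if it receives a URL string rather than a `Website` object, the model is sent only the quoted URL and none of the page content.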


In [27]:
fx = Website("https://www.forexfactory.com/calendar")
print(f"Found {len(fx.events)} events")
print(fx.text)

Found 0 events
[]


In [21]:
# A function to display this nicely in the Jupyter output, using markdown

def display_summary(url):
    # summarize() expects a Website object, so build one from the URL first
    summary = summarize(Website(url))
    display(Markdown(summary))

In [24]:
display_summary("https://www.forexfactory.com/calendar")

The website Forex Factory provides a financial calendar that displays important economic events and indicators that can impact the forex market. Users can view scheduled announcements, including dates, times, and potential market impacts. The calendar is categorized by event types such as central bank meetings, economic reports, and other significant financial events that traders should monitor to make informed trading decisions. Users can customize their view and filter events based on their relevance or specific currencies.

In [30]:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from webdriver_manager.chrome import ChromeDriverManager
import json
from datetime import datetime

class ForexFactoryWebsite:
    def __init__(self, url="https://www.forexfactory.com/calendar"):
        """
        Create a specialized Website object for scraping ForexFactory calendar
        """
        self.url = url
        self.title = "ForexFactory Calendar"
        self.events = []
        
        # Configure Chrome options
        options = webdriver.ChromeOptions()
        options.add_argument('--headless')
        options.add_argument('--no-sandbox')
        options.add_argument('--disable-dev-shm-usage')
        options.add_argument('--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36')
        
        try:
            # Setup the driver with ChromeDriverManager for automatic webdriver management
            self.driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)
            self.driver.get(self.url)
            
            # Wait for the calendar to load - looking for the table with calendar data
            WebDriverWait(self.driver, 20).until(
                EC.presence_of_element_located((By.CSS_SELECTOR, "table.calendar__table"))
            )
            
            print("Page loaded successfully, extracting events...")
            
            # Extract all event rows - based on the screenshot, these are tr elements in the calendar table
            event_rows = self.driver.find_elements(By.CSS_SELECTOR, "table.calendar__table tr[class*='calendar__row']")
            
            current_date = None
            
            for row in event_rows:
                try:
                    # Check if this is a date header row
                    date_cells = row.find_elements(By.CSS_SELECTOR, "td.calendar__date")
                    if date_cells:
                        date_text = date_cells[0].text.strip()
                        if date_text:
                            current_date = date_text
                            print(f"Processing events for: {current_date}")
                        continue
                    
                    # Skip rows without proper event data
                    time_cells = row.find_elements(By.CSS_SELECTOR, "td.calendar__time")
                    if not time_cells:
                        continue
                    
                    # Extract event details
                    time = time_cells[0].text.strip() if time_cells else "N/A"
                    
                    # Extract currency
                    currency_cells = row.find_elements(By.CSS_SELECTOR, "td.calendar__currency")
                    currency = currency_cells[0].text.strip() if currency_cells else "N/A"
                    
                    # Extract impact - look for the colored cells (red, orange, yellow)
                    impact_cells = row.find_elements(By.CSS_SELECTOR, "td.calendar__impact")
                    impact = "Low"
                    if impact_cells:
                        impact_cell = impact_cells[0]
                        impact_spans = impact_cell.find_elements(By.TAG_NAME, "span")
                        if impact_spans:
                            # get_attribute returns None when the attribute is absent
                            impact_class = impact_spans[0].get_attribute("class") or ""
                            if "high" in impact_class or "calendar__impact--high" in impact_class:
                                impact = "High"
                            elif "medium" in impact_class or "calendar__impact--med" in impact_class:
                                impact = "Medium"
                    
                    # Extract event name
                    event_cells = row.find_elements(By.CSS_SELECTOR, "td.calendar__event")
                    event_name = event_cells[0].text.strip() if event_cells else "N/A"
                    
                    # Extract actual value
                    actual_cells = row.find_elements(By.CSS_SELECTOR, "td.calendar__actual")
                    actual = actual_cells[0].text.strip() if actual_cells and actual_cells[0].text.strip() else "N/A"
                    
                    # Extract forecast value
                    forecast_cells = row.find_elements(By.CSS_SELECTOR, "td.calendar__forecast")
                    forecast = forecast_cells[0].text.strip() if forecast_cells and forecast_cells[0].text.strip() else "N/A"
                    
                    # Extract previous value
                    previous_cells = row.find_elements(By.CSS_SELECTOR, "td.calendar__previous")
                    previous = previous_cells[0].text.strip() if previous_cells and previous_cells[0].text.strip() else "N/A"
                    
                    # Add event to our list
                    if event_name != "N/A" and time != "N/A":
                        self.events.append({
                            "date": current_date,
                            "time": time,
                            "currency": currency,
                            "impact": impact,
                            "event": event_name,
                            "actual": actual,
                            "forecast": forecast,
                            "previous": previous
                        })
                    
                except Exception as e:
                    print(f"Error processing row: {str(e)}")
                    continue
            
            # Create a text representation of the events
            self.text = self._format_events_text()
            
            print(f"Successfully extracted {len(self.events)} events")
            
        except Exception as e:
            print(f"Error loading ForexFactory calendar: {str(e)}")
            self.text = f"Error: {str(e)}"
        
        finally:
            if hasattr(self, 'driver'):
                self.driver.quit()
    
    def _format_events_text(self):
        """Format the events into a readable text format"""
        if not self.events:
            return "No financial events found for today."
        
        text = "Financial Calendar Events:\n\n"
        
        current_date = None
        for event in self.events:
            # Add date header if it's a new date
            if current_date != event['date']:
                current_date = event['date']
                text += f"\n=== {current_date} ===\n\n"
            
            text += f"Time: {event['time']}\n"
            text += f"Currency: {event['currency']}\n"
            text += f"Impact: {event['impact']}\n"
            text += f"Event: {event['event']}\n"
            
            if event['actual'] != "N/A":
                text += f"Actual: {event['actual']}\n"
            if event['forecast'] != "N/A":
                text += f"Forecast: {event['forecast']}\n"
            if event['previous'] != "N/A":
                text += f"Previous: {event['previous']}\n"
            
            text += "-" * 40 + "\n"
        
        return text
    
    def to_dict(self):
        """Convert to a dictionary for JSON serialization"""
        return {
            "url": self.url,
            "title": self.title,
            "events": self.events
        }
    
    def __str__(self):
        return f"ForexFactory Calendar with {len(self.events)} events"
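
The impact detection inside `__init__` is easier to check when pulled out into a pure function. The following is a minimal sketch of the same substring test. `impact_from_class` is a hypothetical helper name, and whether these class names match the live site's markup is an assumption carried over from the class above.

```python
def impact_from_class(class_attr):
    """Map a span's class attribute to an impact label, defaulting to "Low"."""
    class_attr = class_attr or ""  # a missing attribute arrives as None
    if "high" in class_attr:
        return "High"
    if "medium" in class_attr or "calendar__impact--med" in class_attr:
        return "Medium"
    return "Low"

# Class names taken from the checks in ForexFactoryWebsite.__init__
labels = [
    impact_from_class("calendar__impact--high"),
    impact_from_class("calendar__impact--med"),
    impact_from_class(None),
]
```

Because the function takes a plain string, it can be exercised without starting a browser, which makes it simpler to adjust if the page's class names turn out to differ.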

In [31]:
# Create an instance of the ForexFactoryWebsite class
forex_calendar = ForexFactoryWebsite()

# Print the formatted events
print(forex_calendar.text)

# Access the structured event data
for event in forex_calendar.events:
    print(f"{event['time']} - {event['currency']} - {event['event']} (Impact: {event['impact']})")

Page loaded successfully, extracting events...
Processing events for: Sun
Mar 30
Processing events for: Mon
Mar 31
Processing events for: Tue
Apr 1
Processing events for: Wed
Apr 2
Processing events for: Thu
Apr 3
Processing events for: Fri
Apr 4
Processing events for: Sat
Apr 5
Successfully extracted 53 events
Financial Calendar Events:


=== Sun
Mar 30 ===

Time: 
Currency: EUR
Impact: Low
Event: Daylight Saving Time Shift
----------------------------------------
Time: 
Currency: GBP
Impact: Low
Event: Daylight Saving Time Shift
----------------------------------------

=== Mon
Mar 31 ===

Time: 
Currency: JPY
Impact: Low
Event: Retail Sales y/y
Actual: 1.4%
Forecast: 2.4%
Previous: 4.4%
----------------------------------------
Time: 1:00am
Currency: AUD
Impact: Low
Event: MI Inflation Gauge m/m
Actual: 0.7%
Previous: -0.2%
----------------------------------------
Time: 
Currency: NZD
Impact: Low
Event: ANZ Business Confidence
Actual: 57.5
Previous: 58.4
-----------------------------
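
The structured `events` list can be post-processed before it reaches the model. The sketch below takes three events copied from the output above, in the dict shape built by `ForexFactoryWebsite`, joins the weekday and date (which the scraper returns on separate lines), and groups event names by currency. `clean_date` and `group_by_currency` are hypothetical helpers, not part of the notebook.

```python
from collections import defaultdict

# Three events copied from the printed output, in the scraper's dict shape
events = [
    {"date": "Mon\nMar 31", "time": "", "currency": "JPY", "impact": "Low",
     "event": "Retail Sales y/y", "actual": "1.4%", "forecast": "2.4%", "previous": "4.4%"},
    {"date": "Mon\nMar 31", "time": "1:00am", "currency": "AUD", "impact": "Low",
     "event": "MI Inflation Gauge m/m", "actual": "0.7%", "forecast": "N/A", "previous": "-0.2%"},
    {"date": "Mon\nMar 31", "time": "", "currency": "NZD", "impact": "Low",
     "event": "ANZ Business Confidence", "actual": "57.5", "forecast": "N/A", "previous": "58.4"},
]

def clean_date(date_text):
    """Collapse the line break between weekday and date into a single space."""
    return " ".join(date_text.split())

def group_by_currency(events):
    """Return a dict mapping each currency code to its list of event names."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event["currency"]].append(event["event"])
    return dict(grouped)

by_currency = group_by_currency(events)
first_date = clean_date(events[0]["date"])
```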

In [32]:
def summarize_forex_calendar(forex_calendar):
    """
    Summarize the forex calendar events using OpenAI
    
    Args:
        forex_calendar: An instance of ForexFactoryWebsite
    
    Returns:
        A string containing the summary
    """
    # Create a prompt that explains the data and asks for a summary
    system_prompt = """You are a financial analyst assistant that specializes in forex markets.
    Analyze the provided economic calendar events and create a concise summary.
    Focus on high-impact events, group by currency, and explain potential market implications.
    Respond in markdown format with appropriate headings and bullet points."""
    
    # Create a user prompt with the structured event data
    user_prompt = f"""Please analyze and summarize these forex calendar events:
    
    {forex_calendar.text}
    
    In your summary:
    1. Highlight the most important (high-impact) events
    2. Group events by currency where appropriate
    3. Mention any notable forecasts or actual values
    4. Explain briefly what these events might mean for traders
    """
    
    # Create the messages for the API call
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt}
    ]
    
    # Call the OpenAI API
    response = openai.chat.completions.create(
        model="gpt-4o-mini",  # or your preferred model
        messages=messages
    )
    
    return response.choices[0].message.content

# Function to display the summary in markdown
def display_forex_summary():
    # Create an instance of the ForexFactoryWebsite class
    forex_calendar = ForexFactoryWebsite()
    
    # Get the summary
    summary = summarize_forex_calendar(forex_calendar)
    
    # Display as markdown
    display(Markdown(summary))
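
One detail of the triple-quoted prompts in `summarize_forex_calendar`: every line after the first carries the function's indentation into the text sent to the model. That is usually harmless, but it can be avoided. A minimal sketch, assuming the same wording as the user prompt above: `build_user_prompt` is a hypothetical helper that dedents a template first and substitutes the calendar text afterwards, so the multi-line calendar text cannot interfere with `textwrap.dedent`.

```python
import textwrap

def build_user_prompt(calendar_text):
    """Build the user prompt without leaking source indentation into it."""
    template = textwrap.dedent("""\
        Please analyze and summarize these forex calendar events:

        {calendar_text}

        In your summary:
        1. Highlight the most important (high-impact) events
        2. Group events by currency where appropriate
        3. Mention any notable forecasts or actual values
        4. Explain briefly what these events might mean for traders
        """)
    # Substitute after dedenting: the inserted text keeps its own layout
    return template.format(calendar_text=calendar_text)

prompt = build_user_prompt("Time: 1:00am\nCurrency: AUD")
```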

In [34]:
display_forex_summary()

Page loaded successfully, extracting events...
Processing events for: Sun
Mar 30
Processing events for: Mon
Mar 31
Processing events for: Tue
Apr 1
Processing events for: Wed
Apr 2
Processing events for: Thu
Apr 3
Processing events for: Fri
Apr 4
Processing events for: Sat
Apr 5
Successfully extracted 53 events


# Forex Economic Calendar Summary

This summary highlights the significant economic events from the provided forex calendar, focusing on their potential implications on the currencies involved.

## EUR (Euro)

- **German Retail Sales m/m**
  - **Actual:** 0.8%
  - **Forecast:** 0.0%
  - **Previous:** 0.7%
  - **Implication:** A stronger-than-expected retail sales figure may boost the Euro, suggesting healthier consumer demand in Germany.

- **Italian Prelim CPI m/m**
  - **Actual:** 0.4%
  - **Forecast:** 0.0%
  - **Previous:** 0.2%
  - **Implication:** This positive inflation surprise could lead to increased confidence in the Eurozone's economic stability.

- **Core CPI Flash Estimate y/y**
  - **Forecast:** 2.5%
  - **Previous:** 2.6%
  - **Implication:** If the actual figures fall below expectations, it could diminish ECB tightening chances, negatively impacting the Euro.

## GBP (British Pound)

- **M4 Money Supply m/m**
  - **Actual:** 0.2%
  - **Forecast:** 1.1%
  - **Previous:** 1.4%
  - **Implication:** A weaker money supply growth indicates potential economic slowdown, which may lead to a bearish sentiment for the GBP.

- **Mortgage Approvals**
  - **Actual:** 65K
  - **Forecast:** 66K
  - **Previous:** 66K
  - **Implication:** A drop in approvals could signal a slowdown in the housing market and possibly consumer spending, further harming GBP.

## JPY (Japanese Yen)

- **Retail Sales y/y**
  - **Actual:** 1.4%
  - **Forecast:** 2.4%
  - **Previous:** 4.4%
  - **Implication:** The significant miss against expectations is concerning and may lead traders to reassess growth prospects for Japan.

- **Housing Starts y/y**
  - **Actual:** 2.4%
  - **Forecast:** -2.3%
  - **Previous:** -4.6%
  - **Implication:** A surprising rise in housing starts could provide a slight boost to the JPY, reflecting economic resilience.

## AUD (Australian Dollar)

- **Retail Sales m/m**
  - **Forecast:** 0.3%
  - **Previous:** 0.3%
  - **Implication:** If actual figures meet expectations, it can signal consumer stability and support the AUD.

- **Cash Rate**
  - **Forecast:** 4.10%
  - **Previous:** 4.10%
  - **Implication:** No change is expected, but any commentary could lead to volatility if the RBA indicates future policy shifts.

## CNY (Chinese Yuan)

- **Manufacturing PMI**
  - **Actual:** 50.5
  - **Forecast:** 50.4
  - **Previous:** 50.2
  - **Implication:** An improvement backs expectations for stability in the manufacturing sector, which may positively affect market sentiment towards the CNY.

- **Non-Manufacturing PMI**
  - **Actual:** 50.8
  - **Forecast:** 50.5
  - **Implication:** Maintaining expansion indicates good service sector reliability, uplifting the overall outlook for the Yuan.

## USD (U.S. Dollar)

- **Chicago PMI**
  - **Actual:** 47.6
  - **Forecast:** 45.5
  - **Previous:** 45.5
  - **Implication:** Increased activity in the manufacturing sector could strengthen the USD as economic health appears resilient.

- **ISM Manufacturing PMI**
  - **Forecast:** 49.5
  - **Previous:** 50.3
  - **Implication:** A drop in this index could raise concerns about the contraction in manufacturing, potentially leading to a bearish outlook on the USD.

## Summary

Overall, the data suggests that while many currencies are experiencing some pressure, sectors such as the Eurozone’s retail figures and Japan's housing starts offer signs of strength. The upcoming events, particularly those related to USD indicators, will likely influence trader sentiments significantly. Caution is advised, particularly in reaction to surprising PMI figures.