# Shipping Forecast Bot Prototype

## Step 1: Fetch the Shipping Forecast

In [1]:
! pip install requests


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m23.2.1[0m[39;49m -> [0m[32;49m25.0.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


In [1]:
import requests

# URL of the web page with forecasts from the German Weather Service
url = "https://www.dwd.de/EN/ourservices/seewetternordostseeen/seewetternordostsee.html"

# Send a GET request to fetch the page content (the timeout keeps the cell from hanging)
response = requests.get(url, timeout=10)

# Check if the request was successful
if response.status_code == 200:
    page_content = response.text
    print("Page downloaded successfully. Here are the first 500 characters:")
    print(page_content[:500])  # Print the first 500 characters for preview
else:
    print("Failed to download the page. Status code:", response.status_code)



Page downloaded successfully. Here are the first 500 characters:
<!DOCTYPE html>
<html xml:lang="en" lang="en" class="no-js">
  <head>
   <meta http-equiv="X-UA-Compatible" content="IE=edge" /> 
   <base href="https://www.dwd.de/"/>
   <meta charset="utf-8">
   <meta name="viewport" content="width=320, minimum-scale=1.0, maximum-scale=1.0" />
    <title>Wetter und Klima - Deutscher Wetterdienst   -  Our services - Marine weather forecast North and Baltic Sea</title>
    <meta name="generator" content="Government Site Builder"/>

    <script type="text/javascr
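
The download step can also be wrapped in a small helper so that failures surface as exceptions rather than printed messages. The following is a minimal sketch, not part of the original notebook: `fetch_page` and its `http_get` parameter are hypothetical names introduced here, and the 10-second default timeout is an arbitrary assumption. It relies only on `requests.get(url, timeout=...)` and `Response.raise_for_status()`.

```python
def fetch_page(url, timeout=10.0, http_get=None):
    """Return the body of `url` as text, raising on HTTP errors.

    `http_get` is a hypothetical injection point (assumption): any callable
    shaped like `requests.get`, which allows the helper to be tested offline.
    """
    if http_get is None:
        import requests  # imported lazily so the helper can be exercised offline

        http_get = requests.get
    response = http_get(url, timeout=timeout)
    # With requests, this raises HTTPError for 4xx/5xx status codes
    response.raise_for_status()
    return response.text
```

With this helper, `page_content = fetch_page(url)` would stand in for the manual status-code check, at the cost of needing a `try`/`except` wherever a failed download should be tolerated.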


## Step 2: Extract the Forecast Data

We need to extract the following information from the page:
- The forecast date and time
- General synoptic situation
- Warnings
- The current forecast for each area
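
The four items listed above could be kept together in one container rather than in separate variables. This is only a sketch under assumptions: the `Bulletin` class and its field names are hypothetical (chosen to mirror the variable names used in the extraction code), and the sample values are copied from this notebook's output.

```python
from dataclasses import dataclass, field


@dataclass
class Bulletin:
    """Hypothetical container for the four extracted items (assumption)."""

    publication_time: str = "Not found"
    synoptic_info: str = ""
    # Each warning is a dict with 'warning_type' (str) and 'areas' (list of str)
    warnings: list = field(default_factory=list)
    # Maps a forecast area name to its forecast text
    forecast_details: dict = field(default_factory=dict)


bulletin = Bulletin(publication_time="11.03.2025, 04.36 UTC")
bulletin.forecast_details["Fisher"] = (
    "northeasterly winds about 4, in some areas increasing 6, misty, sea 3 meter."
)
print(bulletin.publication_time)
print(sorted(bulletin.forecast_details))
```

Using `field(default_factory=...)` gives every instance its own list and dict, so two bulletins never share warnings by accident.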

In [4]:
! pip3 install beautifulsoup4


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m23.2.1[0m[39;49m -> [0m[32;49m25.0.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


In [2]:
from bs4 import BeautifulSoup, NavigableString
import re

# Get the HTML content
html_content = response.text

# Parse the full HTML document
soup = BeautifulSoup(html_content, "html.parser")

# Find the <pre> tag which contains the bulletin text
pre_tag = soup.find("pre")
if not pre_tag:
    raise ValueError("No <pre> tag found in the HTML.")

# Get the plain text (for publication time, synoptic info, and warnings)
pre_text = pre_tag.get_text(separator="\n")

# Also keep the HTML of the pre tag to leverage the bold (<B>) tags for forecast areas.
pre_html = str(pre_tag)
pre_soup = BeautifulSoup(pre_html, "html.parser")

# --- 1. Extract Publication Time ---
# Look for a date/time pattern like "10.03.2025, 15.36 UTC"
pub_time_match = re.search(r"(\d{2}\.\d{2}\.\d{4},\s*\d{2}\.\d{2}\s*UTC)", pre_text)
publication_time = pub_time_match.group(1) if pub_time_match else "Not found"

# --- 2. Extract General Synoptic Information ---
# We look for the line after the bold header "General synoptic situation"
synoptic_info_lines = []
lines = pre_text.splitlines()
synoptic_flag = False
for line in lines:
    if "general synoptic situation" in line.lower():
        synoptic_flag = True
        continue
    if synoptic_flag:
        # Stop at a line that begins a new section (the forecast header or a warning)
        if line.strip().lower().startswith("forecast valid") or line.strip().lower().startswith("until"):
            break
        synoptic_info_lines.append(line.strip())
synoptic_info = " ".join(synoptic_info_lines)

# --- 3. Extract Warnings Information (e.g. gales, strong winds) ---
# The warnings are given in lines that start with "until ... in the following forecast areas ... are expected:"
warnings = []
i = 0
while i < len(lines):
    line = lines[i].strip()
    # Check for a warning header line using a case-insensitive match
    if line.lower().startswith("until"):
        # Persist the valid period and the warning type
        warning_type = line
        # The header may wrap onto a second line that ends with "expected:"
        if i + 1 < len(lines) and lines[i + 1].strip().lower().endswith("expected:"):
            i += 1
            line = lines[i].strip()
            warning_type += " " + line
        # Always step past the header so the loop cannot stall on an "until" line
        i += 1
        if line.lower().endswith("expected:"):
            # Collect subsequent lines as warning areas until a blank line or another section starts
            warning_areas = []
            while i < len(lines):
                next_line = lines[i].strip()
                if next_line == "" or next_line.lower().startswith("until") or next_line.startswith("<B>"):
                    break
                warning_areas.append(next_line)
                i += 1

            warnings.append({
                "warning_type": warning_type,
                "areas": warning_areas
            })
    else:
        i += 1

# --- 4. Extract Forecast Details for Each Region ---
# We only consider forecast areas that are marked with bold (<B>) tags,
# and skip any sections related to the outlook forecast.
forecast_header = pre_soup.find(lambda tag: tag.name == "b" and "forecast valid until" in tag.get_text().lower())
forecast_details = {}

if forecast_header:
    # Iterate over all <b> tags that come after the forecast header.
    for bold_tag in forecast_header.find_all_next("b"):
        bold_text = bold_tag.get_text(strip=True)
        # Stop at the outlook section; everything after it is outside the current forecast
        if "outlook" in bold_text.lower():
            break
        # Process only forecast areas: they should end with a colon (e.g., "German Bight:")
        if not bold_text.endswith(":"):
            continue
        region = bold_text[:-1].strip()  # Remove the trailing colon

        # To avoid duplicates, skip if the region is already present.
        if region in forecast_details:
            continue

        # Collect all following text (from sibling nodes) until the next bold tag is encountered.
        forecast_info = ""
        for sibling in bold_tag.next_siblings:
            # Stop at the next bold tag, which indicates the start of the next forecast area.
            if getattr(sibling, "name", None) == "b":
                break
            if isinstance(sibling, NavigableString):
                forecast_info += sibling.strip() + " "
            else:
                forecast_info += sibling.get_text(" ", strip=True) + " "
        forecast_details[region] = forecast_info.strip()

# --- Print Extracted Information ---
print("Publication Time:")
print(publication_time)
print("\nGeneral Synoptic Information:")
print(synoptic_info)
print("\nWarnings Information:")
for w in warnings:
    print(f"\nWarning Type: {w['warning_type']}")
    print("Areas:")
    for area in w['areas']:
        print(" -", area)
print("\nForecast Details per Region (only areas marked in bold):")
for region, forecast in forecast_details.items():
    print(f"\n{region}:")
    print(forecast)


Publication Time:
11.03.2025, 04.36 UTC

General Synoptic Information:
 A low 995 southeastern Baltic Sea slowly moves eastwards and weakens. A high 1010 east of the Haltenbank over Norway moves to southern Finland. A low 1005 Fischer slowly moves to the German Bight and deepens. A high 1028 Irminger See expands with a ridge to the western Bay of Biscay.   


Areas:
 - fisher
 - dogger
 - forties
 - viking
 - utsire
 - skagerrak
 - central baltic
 - english channel eastern part

Forecast Details per Region (only areas marked in bold):

German Bight:
variable directions 2 to 4, western part at times 
northwest 5, first fog, later shower squalls, sea eastern 
part 1 meter, western part 2,5 meter.

Southwestern North Sea:
northerly winds 4 to 5, for a time increasing a little, 
at times misty, sea increasing 3 meter.

Fisher:
northeasterly winds about 4, in some areas increasing 6, 
misty, sea 3 meter.

Dogger:
northerly winds 5 to 6, shower squalls, later thundery 
gusts, sea 4 meter.

F
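
The publication time above is kept as a plain string. If it ever needs to be compared or sorted, it could be converted to a timezone-aware `datetime`. The sketch below is an assumption-laden illustration: `parse_publication_time` is a hypothetical helper, and it assumes the bulletin keeps the `DD.MM.YYYY, HH.MM UTC` layout seen in the output.

```python
import re
from datetime import datetime, timezone

# Same layout as the notebook's pattern, with one capture group per field
PUB_TIME_RE = re.compile(r"(\d{2})\.(\d{2})\.(\d{4}),\s*(\d{2})\.(\d{2})\s*UTC")


def parse_publication_time(text):
    """Return a UTC datetime for the first timestamp in `text`, or None."""
    match = PUB_TIME_RE.search(text)
    if match is None:
        return None
    day, month, year, hour, minute = (int(group) for group in match.groups())
    return datetime(year, month, day, hour, minute, tzinfo=timezone.utc)


print(parse_publication_time("11.03.2025, 04.36 UTC"))  # 2025-03-11 04:36:00+00:00
print(parse_publication_time("Not found"))  # None
```

Returning `None` instead of the `"Not found"` placeholder lets callers tell a missing timestamp apart from a parsed one with a simple `is None` check.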

## Step 3: Generate a Report

In [3]:
def generate_report(user_areas, publication_time, synoptic_info, warnings, forecast_details):
    """
    Generates a weather forecast report for the user's subscribed areas.

    Parameters:
    - user_areas: List[str]
        A list of area names the user is subscribed to (e.g. ["German Bight", "Dogger"]).
    - publication_time: str
        Forecast publication date and time.
    - synoptic_info: str
        General synoptic information.
    - warnings: List[dict]
        List of warning dictionaries, each with keys 'warning_type' (str) and 'areas' (a list of strings).
    - forecast_details: dict
        Dictionary mapping forecast region (string) to its forecast details (string).

    Returns:
    - report: str
        A formatted report containing the publication time, general synoptic info, any relevant warnings,
        and forecasts for the subscribed areas.
    """
    report_lines = []

    # Forecast publication time
    report_lines.append(f"Forecast Publication Time: {publication_time}\n")

    # General Synoptic Information
    report_lines.append(f"General Synoptic Information: {synoptic_info}\n")

    # Warnings: include warnings only if any of the affected areas contain one of the user's subscribed areas.
    relevant_warnings = []
    for warning in warnings:
        for warning_area in warning.get("areas", []):
            for area in user_areas:
                if area.lower() in warning_area.lower():
                    relevant_warnings.append(warning)
                    break
            else:
                continue
            break

    if relevant_warnings:
        report_lines.append("Warnings:")
        for warning in relevant_warnings:
            report_lines.append(f"  Warning Type: {warning.get('warning_type', 'N/A')}")
            report_lines.append("  Affected Areas:")
            for w_area in warning.get("areas", []):
                report_lines.append(f"    - {w_area}")
            report_lines.append("")  # blank line for readability
    else:
        report_lines.append("No warnings for your subscribed areas.\n")

    # Forecasts for each of the user's areas
    report_lines.append("Forecasts for your subscribed areas:")
    for user_area in user_areas:
        found = False
        for region, forecast in forecast_details.items():
            # Check if the user's area is present in the forecast region name (case-insensitive)
            if user_area.lower() in region.lower():
                report_lines.append(f"{region}:")
                report_lines.append(forecast)
                report_lines.append("")  # add a blank line between regions
                found = True
        if not found:
            report_lines.append(f"{user_area}: Forecast not found.\n")

    return "\n".join(report_lines)
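
`generate_report` matches areas by substring, so a subscription such as "Baltic" selects every region whose name contains that word. Where that is too broad, an exact comparison is one alternative. The sketch below is illustrative only: `match_regions` and its `exact` flag are hypothetical, and the "Central Baltic" forecast text is a placeholder rather than real bulletin data.

```python
def match_regions(user_areas, forecast_details, exact=True):
    """Return the region names in `forecast_details` selected by `user_areas`.

    With exact=True, names must be equal ignoring case and surrounding spaces;
    with exact=False, the substring rule used by generate_report applies.
    """
    wanted = [area.strip().lower() for area in user_areas]
    matched = []
    for region in forecast_details:
        name = region.strip().lower()
        if exact:
            hit = name in wanted
        else:
            hit = any(area in name for area in wanted)
        if hit:
            matched.append(region)
    return matched


sample = {
    "Western Baltic": "first north 2 to 3, veering southerly winds, increasing 4, "
                      "first fog, otherwise misty, sea 0,5 meter.",
    "Central Baltic": "(placeholder forecast text)",
}
print(match_regions(["Baltic"], sample, exact=False))  # ['Western Baltic', 'Central Baltic']
print(match_regions(["western baltic"], sample))       # ['Western Baltic']
```

Substring matching is more forgiving of partial names, while exact matching avoids sending a subscriber forecasts for areas they did not ask for; which one fits depends on how area names are collected from users.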


In [4]:
print(generate_report(["Western Baltic"], publication_time, synoptic_info, warnings, forecast_details))

Forecast Publication Time: 11.03.2025, 04.36 UTC

General Synoptic Information:  A low 995 southeastern Baltic Sea slowly moves eastwards and weakens. A high 1010 east of the Haltenbank over Norway moves to southern Finland. A low 1005 Fischer slowly moves to the German Bight and deepens. A high 1028 Irminger See expands with a ridge to the western Bay of Biscay.   


Forecasts for your subscribed areas:
Western Baltic:
first north 2 to 3, veering southerly winds, increasing 
4, first fog, otherwise misty, sea 0,5 meter.

