# Reading data

In this exercise we will cover how to use httpx and polars to read data from external data sources.

There are two primary data sources we will use:

1. Ferry data: https://wsdot.wa.gov/ (API documentation: https://www.wsdot.wa.gov/Ferries/API/Vessels/rest/help)
2. Weather data: https://open-meteo.com/

Specifically, there are 4 data sets we will focus on:

**Vessel History**: <https://www.wsdot.wa.gov/Ferries/API/Vessels/rest/help/operations/GetVesselHistory>

```
https://www.wsdot.wa.gov/Ferries/API/Vessels/rest/vesselhistory?apiaccesscode={APIACCESSCODE}
```

**Terminal Locations**: <https://www.wsdot.wa.gov/Ferries/API/terminals/rest/help/operations/GetAllTerminalLocations>

```
https://www.wsdot.wa.gov/Ferries/API/Terminals/rest/terminallocations?apiaccesscode={APIACCESSCODE}
```

**Vessel Verbose**: <https://www.wsdot.wa.gov/ferries/api/vessels/rest/help/operations/GetAllVesselVerboseDetails>

```
https://www.wsdot.wa.gov/Ferries/API/Vessels/rest/vesselverbose?apiaccesscode={APIACCESSCODE}
```

**Weather data**: <https://open-meteo.com/en/docs/historical-weather-api#start_date=2022-12-01&end_date=2022-12-31&hourly=temperature_2m,precipitation,weather_code,wind_speed_10m,wind_direction_10m,wind_gusts_10m&timezone=America%2FLos_Angeles>

```
https://archive-api.open-meteo.com/v1/archive?latitude=47.623651&longitude=-122.360291&start_date=2022-12-01&end_date=2022-12-31&hourly=temperature_2m,precipitation,weather_code,wind_speed_10m,wind_direction_10m,wind_gusts_10m&timezone=America%2FLos_Angeles
```

## Task 0 - Virtual Environments

### 🔄 Task

Before we start reading the data, let's first spend a few minutes understanding and setting up virtual environments.

### 🧑‍💻 Code

See [../06-bonus-stuff/virtual-environments-and-uv/README.md](../06-bonus-stuff/virtual-environments-and-uv/README.md)

## Task 1 - Read the Vessel Verbose Data

### 🔄 Task

- Download the **Vessel Verbose** data
- Convert the data into a polars dataframe

### 🧑‍💻 Code

The State of Washington data portal makes data available over an API. The API has lots of features; you can read more about how to use it here: <https://wsdot.wa.gov/traffic/api/>.

To download the data, many people's first instinct is to use one of the following:

- clicking through your web browser, or
- the curl command in the terminal.

```bash
WSDOT_ACCESS_CODE='xxxx-xxxx-xxxx-xxxx-xxxx'
curl "https://www.wsdot.wa.gov/Ferries/API/Vessels/rest/vesselverbose?apiaccesscode=${WSDOT_ACCESS_CODE}"
```

There is a better way though! Using httpx we can download the data as JSON and then parse it into Python objects (here, a list of dictionaries). Then we use polars to create a DataFrame directly from that list. First, let's download the data using httpx.

In [1]:
import os
from pathlib import Path

import httpx
from dotenv import load_dotenv

In [5]:
# Get the API key from an environment variable.
if Path(".env").exists():
    print("Loading .env")
    load_dotenv(override=True)

ws_dot_access_code = os.environ["WSDOT_ACCESS_CODE"]

Loading .env


In [6]:
base_url = "https://www.wsdot.wa.gov/Ferries/API/Vessels/rest"
base_url

'https://www.wsdot.wa.gov/Ferries/API/Vessels/rest'

In [7]:
path = "vesselverbose"
path

'vesselverbose'

In [8]:
# Define our params in a dictionary.
params = {"apiaccesscode": ws_dot_access_code}
params

{'apiaccesscode': '7fc01121-6cdc-40a5-b344-5bd1d5f8038f'}

In [9]:
with httpx.Client(base_url=base_url, params=params) as client:
    response = client.get(path)

response

<Response [200 OK]>

The `Response` object from httpx has several methods and attributes we can use to get more information about the request and the response.

In [10]:
# The URL that was used to make the request.
response.url

URL('https://www.wsdot.wa.gov/Ferries/API/Vessels/rest/vesselverbose?apiaccesscode=7fc01121-6cdc-40a5-b344-5bd1d5f8038f')

In [11]:
# The status of the response
response.status_code

200

In [12]:
# Parse the JSON response into Python objects (a list of dictionaries).
response.json()

[{'VesselID': 1,
  'VesselSubjectID': 1,
  'VesselName': 'Cathlamet',
  'VesselAbbrev': 'CAT',
  'Class': {'ClassID': 10,
   'ClassSubjectID': 310,
   'ClassName': 'Issaquah 130',
   'SortSeq': 40,
   'DrawingImg': 'https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/4-issaquah130.gif',
   'SilhouetteImg': 'https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/4-issaquah130-sillouette_sml.gif',
   'PublicDisplayName': 'Issaquah'},
  'Status': 1,
  'OwnedByWSF': True,
  'CarDeckRestroom': True,
  'CarDeckShelter': False,
  'Elevator': True,
  'ADAAccessible': True,
  'MainCabinGalley': True,
  'MainCabinRestroom': True,
  'PublicWifi': False,
  'ADAInfo': 'The MV Cathlamet has elevator access from the auto deck to the passenger deck. Notify a ticket seller if you are traveling by car and need to park near an elevator. The vessel has accessible restrooms located on both the main passenger deck and the auto deck. The main passenger deck also has vending and newspaper machine

In [13]:
# Check how many records are in the response.
len(response.json())

21

In [14]:
# Use the pprint function from rich for nicer formatting of the dictionary data.
from rich.pretty import pprint

In [15]:
pprint(response.json()[0])

Lastly, we can use polars to convert the list of dictionaries into a DataFrame.


In [16]:
import polars as pl

In [17]:
vessel_verbose_raw = pl.DataFrame(response.json())
vessel_verbose_raw

VesselID,VesselSubjectID,VesselName,VesselAbbrev,Class,Status,OwnedByWSF,CarDeckRestroom,CarDeckShelter,Elevator,ADAAccessible,MainCabinGalley,MainCabinRestroom,PublicWifi,ADAInfo,AdditionalInfo,VesselNameDesc,VesselHistory,Beam,CityBuilt,SpeedInKnots,Draft,EngineCount,Horsepower,Length,MaxPassengerCount,PassengerOnly,FastFerry,PropulsionInfo,TallDeckClearance,RegDeckSpace,TallDeckSpace,Tonnage,Displacement,YearBuilt,YearRebuilt,VesselDrawingImg,SolasCertified,MaxPassengerCountForInternational
i64,i64,str,str,struct[7],i64,bool,bool,bool,bool,bool,bool,bool,bool,str,str,str,str,str,str,i64,str,i64,i64,str,i64,bool,bool,str,i64,i64,i64,i64,i64,i64,i64,null,bool,i64
1,1,"""Cathlamet""","""CAT""","{10,310,""Issaquah 130"",40,""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/4-issaquah130.gif"",""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/4-issaquah130-sillouette_sml.gif"",""Issaquah""}",1,true,true,false,true,true,true,true,false,"""The MV Cathlamet has elevator …",""" ""","""From the Kathlamet tribe, the …",""" ""","""78' 8""""","""Seattle, WA""",16,"""16' 6""""",2,5000,"""328'""",1200,false,false,"""DIESEL""",186,124,26,2477,3310,1981,1993,,false,
2,2,"""Chelan""","""CHE""","{10,310,""Issaquah 130"",40,""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/4-issaquah130.gif"",""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/4-issaquah130-sillouette_sml.gif"",""Issaquah""}",1,true,true,false,true,true,true,true,false,"""The MV Chelan has elevator acc…",""" ""","""From the Chelan language: Tsi…",""" ""","""78' 8""""","""Seattle, WA""",16,"""16' 9""""",2,5000,"""328'""",1200,false,false,"""DIESEL""",188,124,30,2477,3405,1981,2005,,true,1090
65,428,"""Chetzemoka""","""CHZ""","{162,427,""Kwa-di Tabil"",75,""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/13-kwaditabil.gif"",""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/13-kwaditabil-silouette_sml.gif"",""Kwa-di Tabil""}",1,true,false,false,true,true,true,true,false,"""MV Chetzemoka has elevator acc…",,"""The name honors a friendly Nat…",,"""64'""","""Seattle""",15,"""11'""",2,6000,"""273' 8""""",748,false,false,"""DIESEL""",192,64,9,4623,2415,2010,,,false,
74,487,"""Chimacum""","""CHM""","{100,319,""Olympic"",35,""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/8-olympic-2014.gif"",""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/3-issaquah-sillouette_sml.gif"",""Olympic""}",1,true,true,true,true,true,true,true,false,"""The vessel has two ADA complia…",,"""“The Chimacum People who spoke…","""Chimacum is the third of the 1…","""83' 2""""","""Seattle, WA""",17,"""18'""",2,6000,"""362' 3""""",1500,false,false,"""DIESEL""",192,144,34,3525,4384,2017,,,false,
15,15,"""Issaquah""","""ISS""","{10,310,""Issaquah 130"",40,""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/4-issaquah130.gif"",""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/4-issaquah130-sillouette_sml.gif"",""Issaquah""}",1,true,true,false,true,true,true,true,false,"""The MV Issaquah has elevator a…",""" ""","""""Snake."" Native Americans who …",""" ""","""78' 8""""","""Seattle, WA""",16,"""16' 6""""",2,5000,"""328'""",1200,false,false,"""DIESEL""",188,124,26,2475,3310,1979,1989,,false,
…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…
33,33,"""Tillikum""","""TIL""","{20,311,""Evergreen State"",60,""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/5-evergreenstate.gif"",""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/4-issaquah130-sillouette_sml.gif"",""Evergreen State""}",1,true,false,false,true,true,true,true,false,"""The MV Tillikum has an elevato…",""" ""","""Chinook Jargon: ""friends; rela…",""" ""","""73' 2""""","""Seattle, WA""",13,"""15 6""""",2,2500,"""310' 2""""",1061,false,false,"""DIESEL-ELECTRIC (AC)""",162,87,30,2070,2413,1959,1994,,false,
68,462,"""Tokitae""","""TOK""","{100,319,""Olympic"",35,""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/8-olympic-2014.gif"",""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/3-issaquah-sillouette_sml.gif"",""Olympic""}",1,true,false,true,true,true,true,true,false,"""The vessel has two ADA complia…",,"""Tokitae means ""nice day, prett…","""Tokitae is the first of the 14…","""83' 2""""","""Seattle, WA""",17,"""18'""",2,6000,"""362' 3""""",1500,false,false,"""DIESEL""",192,144,34,3525,4384,2014,,,false,
36,36,"""Walla Walla""","""WAL""","{70,316,""Jumbo"",20,""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/1-jumbo.gif"",""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/1-jumbo-sillouette_sml.gif"",""Jumbo""}",1,true,false,false,true,true,true,true,false,"""The MV Walla Walla has elevato…",""" ""","""Nez Perce for ""place of many w…",""" ""","""87'""","""Seattle""",18,"""18'""",4,11500,"""440'""",2000,false,false,"""DIESEL-ELECTRIC (DC)""",186,188,60,3246,4860,1973,2003,,false,
37,37,"""Wenatchee""","""WEN""","{90,318,""Jumbo Mark II"",10,""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/0-mark2.gif"",""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/0-mark2-sillouette_sml.gif"",""Jumbo Mark II""}",1,true,false,false,true,true,true,true,false,"""The MV Wenatchee has elevator …","""27 May 1998 Todd Pacific Ship…","""From the Yakama language comes…","""Todd Shipyard Delivery Date: 2…","""90'""","""Seattle, WA""",18,"""17' 3""""",4,16000,"""460' 2""""",2499,false,false,"""DIESEL-ELECTRIC (AC)""",184,202,60,4938,6184,1998,,,false,


## Task 2 - Write Data to Database

### 🔄 Task

- Save `vessel_verbose_raw` to the database.
- Ideally we want to do most of our data tidying in "Step 2", but this dataset has a struct that won't save to the database. So we will need to do some tidying at this phase.
- This way, we do not need to hit the API every time we need to interact with the raw data.

### 🧑‍💻 Code

The column `Class` is a struct. Each row contains a dictionary of key-value pairs.

In [18]:
vessel_verbose_raw.get_column("Class")

Class
struct[7]
"{10,310,""Issaquah 130"",40,""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/4-issaquah130.gif"",""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/4-issaquah130-sillouette_sml.gif"",""Issaquah""}"
"{10,310,""Issaquah 130"",40,""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/4-issaquah130.gif"",""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/4-issaquah130-sillouette_sml.gif"",""Issaquah""}"
"{162,427,""Kwa-di Tabil"",75,""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/13-kwaditabil.gif"",""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/13-kwaditabil-silouette_sml.gif"",""Kwa-di Tabil""}"
"{100,319,""Olympic"",35,""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/8-olympic-2014.gif"",""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/3-issaquah-sillouette_sml.gif"",""Olympic""}"
"{10,310,""Issaquah 130"",40,""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/4-issaquah130.gif"",""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/4-issaquah130-sillouette_sml.gif"",""Issaquah""}"
…
"{20,311,""Evergreen State"",60,""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/5-evergreenstate.gif"",""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/4-issaquah130-sillouette_sml.gif"",""Evergreen State""}"
"{100,319,""Olympic"",35,""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/8-olympic-2014.gif"",""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/3-issaquah-sillouette_sml.gif"",""Olympic""}"
"{70,316,""Jumbo"",20,""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/1-jumbo.gif"",""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/1-jumbo-sillouette_sml.gif"",""Jumbo""}"
"{90,318,""Jumbo Mark II"",10,""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/0-mark2.gif"",""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/0-mark2-sillouette_sml.gif"",""Jumbo Mark II""}"


In [19]:
vessel_verbose_raw.get_column("Class").to_list()[0]

{'ClassID': 10,
 'ClassSubjectID': 310,
 'ClassName': 'Issaquah 130',
 'SortSeq': 40,
 'DrawingImg': 'https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/4-issaquah130.gif',
 'SilhouetteImg': 'https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/4-issaquah130-sillouette_sml.gif',
 'PublicDisplayName': 'Issaquah'}

This data would be easier to work with if it were in a tabular format rather than a nested dictionary. To do this, unnest the `Class` struct so that each field gets its own column.

In [20]:
vessel_verbose_raw.select("VesselName", "Class").head(2)

VesselName,Class
str,struct[7]
"""Cathlamet""","{10,310,""Issaquah 130"",40,""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/4-issaquah130.gif"",""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/4-issaquah130-sillouette_sml.gif"",""Issaquah""}"
"""Chelan""","{10,310,""Issaquah 130"",40,""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/4-issaquah130.gif"",""https://www.wsdot.wa.gov/ferries/images/pages/boat_drawings/4-issaquah130-sillouette_sml.gif"",""Issaquah""}"


In [21]:
vessel_verbose_raw = vessel_verbose_raw.unnest("Class")

In [22]:
vessel_verbose_raw.select("VesselName", "DrawingImg", "PublicDisplayName").head(2)

VesselName,DrawingImg,PublicDisplayName
str,str,str
"""Cathlamet""","""https://www.wsdot.wa.gov/ferri…","""Issaquah"""
"""Chelan""","""https://www.wsdot.wa.gov/ferri…","""Issaquah"""


`VesselDrawingImg` has only null values, so we should drop it.

In [23]:
vessel_verbose_raw.get_column("VesselDrawingImg").value_counts()

VesselDrawingImg,count
null,u32
,21


In [24]:
vessel_verbose_raw = vessel_verbose_raw.drop("VesselDrawingImg")

Now we can write the data to the database.

In [28]:
# Get the database credentials
if Path(".env").exists():
    print("Loading .env")
    load_dotenv(override=True)

uri = os.environ["DATABASE_URI_PYTHON"]

Loading .env


In [34]:
# Get your username
from posit.connect import Client

with Client() as client:
    username = client.me.username

username = "nateniemann"

In [35]:
# Write to the database
vessel_verbose_raw.write_database(
    table_name=f"{username}_vessel_verbose_raw",
    connection=uri,
    engine="adbc",
    if_table_exists='replace'
)

21

To reuse this data in future code we can use `pl.read_database_uri`.

In [37]:
# Test that you can read the data
pl.read_database_uri(
    query=f"SELECT * FROM {username}_vessel_verbose_raw LIMIT 5;",
    uri=uri,
    engine="adbc"
)

VesselID,VesselSubjectID,VesselName,VesselAbbrev,ClassID,ClassSubjectID,ClassName,SortSeq,DrawingImg,SilhouetteImg,PublicDisplayName,Status,OwnedByWSF,CarDeckRestroom,CarDeckShelter,Elevator,ADAAccessible,MainCabinGalley,MainCabinRestroom,PublicWifi,ADAInfo,AdditionalInfo,VesselNameDesc,VesselHistory,Beam,CityBuilt,SpeedInKnots,Draft,EngineCount,Horsepower,Length,MaxPassengerCount,PassengerOnly,FastFerry,PropulsionInfo,TallDeckClearance,RegDeckSpace,TallDeckSpace,Tonnage,Displacement,YearBuilt,YearRebuilt,SolasCertified,MaxPassengerCountForInternational
i64,i64,str,str,i64,i64,str,i64,str,str,str,i64,bool,bool,bool,bool,bool,bool,bool,bool,str,str,str,str,str,str,i64,str,i64,i64,str,i64,bool,bool,str,i64,i64,i64,i64,i64,i64,i64,bool,i64
1,1,"""Cathlamet""","""CAT""",10,310,"""Issaquah 130""",40,"""https://www.wsdot.wa.gov/ferri…","""https://www.wsdot.wa.gov/ferri…","""Issaquah""",1,True,True,False,True,True,True,True,False,"""The MV Cathlamet has elevator …",""" ""","""From the Kathlamet tribe, the …",""" ""","""78' 8""""","""Seattle, WA""",16,"""16' 6""""",2,5000,"""328'""",1200,False,False,"""DIESEL""",186,124,26,2477,3310,1981,1993.0,False,
2,2,"""Chelan""","""CHE""",10,310,"""Issaquah 130""",40,"""https://www.wsdot.wa.gov/ferri…","""https://www.wsdot.wa.gov/ferri…","""Issaquah""",1,True,True,False,True,True,True,True,False,"""The MV Chelan has elevator acc…",""" ""","""From the Chelan language: Tsi…",""" ""","""78' 8""""","""Seattle, WA""",16,"""16' 9""""",2,5000,"""328'""",1200,False,False,"""DIESEL""",188,124,30,2477,3405,1981,2005.0,True,1090.0
65,428,"""Chetzemoka""","""CHZ""",162,427,"""Kwa-di Tabil""",75,"""https://www.wsdot.wa.gov/ferri…","""https://www.wsdot.wa.gov/ferri…","""Kwa-di Tabil""",1,True,False,False,True,True,True,True,False,"""MV Chetzemoka has elevator acc…",,"""The name honors a friendly Nat…",,"""64'""","""Seattle""",15,"""11'""",2,6000,"""273' 8""""",748,False,False,"""DIESEL""",192,64,9,4623,2415,2010,,False,
74,487,"""Chimacum""","""CHM""",100,319,"""Olympic""",35,"""https://www.wsdot.wa.gov/ferri…","""https://www.wsdot.wa.gov/ferri…","""Olympic""",1,True,True,True,True,True,True,True,False,"""The vessel has two ADA complia…",,"""“The Chimacum People who spoke…","""Chimacum is the third of the 1…","""83' 2""""","""Seattle, WA""",17,"""18'""",2,6000,"""362' 3""""",1500,False,False,"""DIESEL""",192,144,34,3525,4384,2017,,False,
15,15,"""Issaquah""","""ISS""",10,310,"""Issaquah 130""",40,"""https://www.wsdot.wa.gov/ferri…","""https://www.wsdot.wa.gov/ferri…","""Issaquah""",1,True,True,False,True,True,True,True,False,"""The MV Issaquah has elevator a…",""" ""","""""Snake."" Native Americans who …",""" ""","""78' 8""""","""Seattle, WA""",16,"""16' 6""""",2,5000,"""328'""",1200,False,False,"""DIESEL""",188,124,26,2475,3310,1979,1989.0,False,


## Task 3 - Get Other Data Sets

### 🔄 Task

Get the following additional data sets:

- **Vessel History**: the `https://www.wsdot.wa.gov/Ferries/API/Vessels/rest/vesselhistory` endpoint contains historical data about sailings.
- **Terminal locations**: the `https://www.wsdot.wa.gov/Ferries/API/terminals/rest/terminallocations` endpoint contains information about ferry terminal locations.
- **Weather data**: the historical weather API from <https://open-meteo.com/en/docs/historical-weather-api> can be used to get weather data for all of the terminal locations.

### 🧑‍💻 Code

#### Vessel History

In [38]:
# Get all of the vessel names
base_url = "https://www.wsdot.wa.gov/Ferries/API/Vessels/rest"
params = {"apiaccesscode": os.environ["WSDOT_ACCESS_CODE"]}

with httpx.Client(base_url=base_url, params=params) as client:
    response = client.get("/vesselverbose")

vessel_names = [i["VesselName"] for i in response.json()]
vessel_names

['Cathlamet',
 'Chelan',
 'Chetzemoka',
 'Chimacum',
 'Issaquah',
 'Kaleetan',
 'Kennewick',
 'Kitsap',
 'Kittitas',
 'Puyallup',
 'Salish',
 'Samish',
 'Sealth',
 'Spokane',
 'Suquamish',
 'Tacoma',
 'Tillikum',
 'Tokitae',
 'Walla Walla',
 'Wenatchee',
 'Yakima']

In [39]:
# For each vessel, get all of the history from the desired date range. Define
# the start date and end date.
import datetime

In [40]:
# To speed things up, we will only download a subset of the data.
start_date = datetime.date(2024, 3, 1)
start_date

datetime.date(2024, 3, 1)

In [41]:
# Subtract 1 week from today because the weather API has a 5-day delay.
end_date = datetime.date.today() - datetime.timedelta(weeks=1)
end_date

datetime.date(2024, 8, 5)

The vessel history data set is large. Instead of using httpx on its own, we will pair it with hishel, which adds easy built-in caching. This is really useful during development because it keeps you from hitting the API too many times.

In [42]:
import hishel

storage = hishel.FileStorage(ttl=60 * 60 * 8)
controller = hishel.Controller(allow_heuristics=True)

cache_transport = hishel.CacheTransport(
    transport=httpx.HTTPTransport(),
    controller=controller,
    storage=storage
)

In [44]:
%%time
# Get the vessel history for each vessel.
vessel_history_json = []

for vessel_name in vessel_names:

    print(f"Getting vessel history for {vessel_name}...")

    with httpx.Client(base_url=base_url, params=params, transport=cache_transport) as client:

        response = client.get(
            f"/vesselhistory/{vessel_name}/{start_date}/{end_date}",
            timeout=30,
            extensions={"force_cache": True}
        )

        print(f"\t{len(response.json()):,} records retrieved for {vessel_name}.")
        print(f"\tCache used: {response.extensions['from_cache']}")

    vessel_history_json += response.json()

Getting vessel history for Cathlamet...
	3,437 records retrieved for Cathlamet.
	Cache used: True
Getting vessel history for Chelan...
	1,807 records retrieved for Chelan.
	Cache used: True
Getting vessel history for Chetzemoka...
	5,099 records retrieved for Chetzemoka.
	Cache used: True
Getting vessel history for Chimacum...
	3,336 records retrieved for Chimacum.
	Cache used: True
Getting vessel history for Issaquah...
	2,201 records retrieved for Issaquah.
	Cache used: True
Getting vessel history for Kaleetan...
	2,181 records retrieved for Kaleetan.
	Cache used: True
Getting vessel history for Kennewick...
	3,078 records retrieved for Kennewick.
	Cache used: True
Getting vessel history for Kitsap...
	3,426 records retrieved for Kitsap.
	Cache used: True
Getting vessel history for Kittitas...
	6,368 records retrieved for Kittitas.
	Cache used: True
Getting vessel history for Puyallup...
	1,883 records retrieved for Puyallup.
	Cache used: True
Getting vessel history for Salish...
	89

Try running the above code again. The second time you run the cell block it will be much faster because all of the results are cached!

In [45]:
# Check how many records were returned.
f"{len(vessel_history_json):,}"

'57,561'

In [46]:
# Preview the first two records.
vessel_history_json[0:2]

[{'VesselId': 31,
  'Vessel': 'Cathlamet',
  'Departing': 'Vashon',
  'Arriving': 'Southworth',
  'ScheduledDepart': '/Date(1709280900000-0800)/',
  'ActualDepart': '/Date(1709280969000-0800)/',
  'EstArrival': '/Date(1709281999000-0800)/',
  'Date': '/Date(1709280900000-0800)/'},
 {'VesselId': 31,
  'Vessel': 'Cathlamet',
  'Departing': 'Southworth',
  'Arriving': 'Fauntleroy',
  'ScheduledDepart': '/Date(1709282100000-0800)/',
  'ActualDepart': '/Date(1709282171000-0800)/',
  'EstArrival': '/Date(1709283349000-0800)/',
  'Date': '/Date(1709282100000-0800)/'}]
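
The timestamps arrive as strings such as `/Date(1709280900000-0800)/`. The sketch below shows one way to decode them, assuming they follow the common convention of milliseconds since the Unix epoch followed by a UTC offset; that interpretation and the `parse_wsdot_date` helper are assumptions, so check them against the API documentation before relying on the result.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed layout: /Date(<milliseconds since epoch><+ or -><HHMM offset>)/
_DATE_PATTERN = re.compile(r"/Date\((\d+)([+-])(\d{2})(\d{2})\)/")


def parse_wsdot_date(value: str) -> datetime:
    # Hypothetical helper: convert the string into a timezone-aware datetime.
    match = _DATE_PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"Unrecognized date format: {value!r}")

    millis, sign, hours, minutes = match.groups()
    offset = timedelta(hours=int(hours), minutes=int(minutes))
    if sign == "-":
        offset = -offset

    return datetime.fromtimestamp(int(millis) / 1000, tz=timezone(offset))


scheduled_depart = parse_wsdot_date("/Date(1709280900000-0800)/")
```

Under these assumptions, the first `ScheduledDepart` value in the preview corresponds to 00:15 on 2024-03-01 at a UTC offset of -08:00.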

In [47]:
# Convert the data from JSON to a polars DataFrame
vessel_history_raw = pl.DataFrame(vessel_history_json)
vessel_history_raw

VesselId,Vessel,Departing,Arriving,ScheduledDepart,ActualDepart,EstArrival,Date
i64,str,str,str,str,str,str,str
31,"""Cathlamet""","""Vashon""","""Southworth""","""/Date(1709280900000-0800)/""","""/Date(1709280969000-0800)/""","""/Date(1709281999000-0800)/""","""/Date(1709280900000-0800)/"""
31,"""Cathlamet""","""Southworth""","""Fauntleroy""","""/Date(1709282100000-0800)/""","""/Date(1709282171000-0800)/""","""/Date(1709283349000-0800)/""","""/Date(1709282100000-0800)/"""
31,"""Cathlamet""","""Fauntleroy""","""Vashon""","""/Date(1709283900000-0800)/""","""/Date(1709284050000-0800)/""","""/Date(1709284950000-0800)/""","""/Date(1709283900000-0800)/"""
31,"""Cathlamet""","""Vashon""","""Southworth""","""/Date(1709285400000-0800)/""","""/Date(1709285430000-0800)/""","""/Date(1709286180000-0800)/""","""/Date(1709285400000-0800)/"""
32,"""Cathlamet""","""Vashon""","""Fauntleroy""","""/Date(1709294700000-0800)/""","""/Date(1709294795000-0800)/""","""/Date(1709295629000-0800)/""","""/Date(1709294700000-0800)/"""
…,…,…,…,…,…,…,…
25,"""Yakima""","""Lopez""","""Anacortes""","""/Date(1722907200000-0700)/""","""/Date(1722913410000-0700)/""","""/Date(1722916143000-0700)/""","""/Date(1722907200000-0700)/"""
27,"""Yakima""","""Anacortes""","""Lopez""","""/Date(1722908700000-0700)/""","""/Date(1722909353000-0700)/""","""/Date(1722911966000-0700)/""","""/Date(1722908700000-0700)/"""
25,"""Yakima""","""Anacortes""","""Shaw""","""/Date(1722911100000-0700)/""","""/Date(1722916920000-0700)/""","""/Date(1722919911000-0700)/""","""/Date(1722911100000-0700)/"""
25,"""Yakima""","""Shaw""","""Orcas""","""/Date(1722914400000-0700)/""","""/Date(1722920125000-0700)/""","""/Date(1722920695000-0700)/""","""/Date(1722914400000-0700)/"""


In [48]:
# Write to the database
vessel_history_raw.write_database(
    table_name=f"{username}_vessel_history_raw",
    connection=uri,
    engine="adbc",
    if_table_exists='replace'
)

57561

#### Terminal Locations

In [49]:
# Get all of the terminal location data
base_url = "https://www.wsdot.wa.gov/Ferries/API/terminals/rest"
params = {"apiaccesscode": os.environ["WSDOT_ACCESS_CODE"]}

with httpx.Client(base_url=base_url, params=params) as client:
    response = client.get("/terminallocations")

In [50]:
# Check how many records were returned.
f"{len(response.json()):,}"

'20'

In [51]:
# Preview the first two records.
response.json()[0:2]

[{'TerminalID': 1,
  'TerminalSubjectID': 111,
  'RegionID': 1,
  'TerminalName': 'Anacortes',
  'TerminalAbbrev': 'ANA',
  'SortSeq': 10,
  'Latitude': 48.507351,
  'Longitude': -122.677,
  'AddressLineOne': '2100 Ferry Terminal Road',
  'AddressLineTwo': None,
  'City': 'Anacortes',
  'State': 'WA',
  'ZipCode': '98221',
  'Country': 'USA',
  'MapLink': 'https://www.google.com/maps/place/Anacortes+Ferry+Terminal,+Anacortes,+WA+98221/@48.5060112,-122.6776819,15z/data=!4m2!3m1!1s0x5485790ea2748ed5:0x5c9a071494b5411f</p>',
  'Directions': 'From Interstate 5 take exit 230 and follow SR 20 westbound to Anacortes. After arriving in Anacortes continue north on Commercial Ave. Turn left on 12th St, which becomes Oakes Ave, and then continue to the ferry terminal.<p>\r\n<b>Dropping off or picking up?</b><p>\r\nWhen approaching the ferry terminal to pick up or drop off and not disabled, follow Ferry Terminal Road to the left of the auto toll booths to the parking lot near the terminal. There s

In [52]:
# List all of the terminal names
{terminal["TerminalName"]: terminal["TerminalAbbrev"] for terminal in response.json()}

{'Anacortes': 'ANA',
 'Bainbridge Island': 'BBI',
 'Bremerton': 'BRE',
 'Clinton': 'CLI',
 'Coupeville ': 'COU',
 'Edmonds': 'EDM',
 'Fauntleroy': 'FAU',
 'Friday Harbor': 'FRH',
 'Kingston': 'KIN',
 'Lopez Island': 'LOP',
 'Mukilteo': 'MUK',
 'Orcas Island': 'ORI',
 'Point Defiance': 'PTD',
 'Port Townsend': 'POT',
 'Seattle': 'P52',
 'Shaw Island': 'SHI',
 'Sidney B.C.': 'SID',
 'Southworth': 'SOU',
 'Tahlequah': 'TAH',
 'Vashon Island': 'VAI'}

In [53]:
terminal_locations_raw = pl.DataFrame(response.json())
terminal_locations_raw

TerminalID,TerminalSubjectID,RegionID,TerminalName,TerminalAbbrev,SortSeq,Latitude,Longitude,AddressLineOne,AddressLineTwo,City,State,ZipCode,Country,MapLink,Directions,DispGISZoomLoc
i64,i64,i64,str,str,i64,f64,f64,str,str,str,str,str,str,str,str,list[struct[3]]
1,111,1,"""Anacortes""","""ANA""",10,48.507351,-122.677,"""2100 Ferry Terminal Road""",,"""Anacortes""","""WA""","""98221""","""USA""","""https://www.google.com/maps/pl…","""From Interstate 5 take exit 23…","[{0,48.507351,-122.677}, {1,48.507351,-122.677}, … {17,48.506612,-122.678006}]"
3,103,4,"""Bainbridge Island""","""BBI""",40,47.622339,-122.509617,"""270 Olympic Drive SE""",,"""Bainbridge Island""","""WA""","""98110""","""USA""","""http://maps.google.com/maps?f=…","""Northbound on Highway 3: Take …","[{0,47.622339,-122.509617}, {1,47.622339,-122.509617}, … {17,47.622682,-122.510387}]"
4,102,4,"""Bremerton""","""BRE""",30,47.561847,-122.624089,"""211 1st Street""",,"""Bremerton""","""WA""","""98337""","""USA""","""https://www.google.com/maps/pl…","""Northbound on Highway 3: Exit…","[{0,47.561847,-122.624089}, {1,47.561847,-122.624089}, … {17,47.562207,-122.624843}]"
5,112,2,"""Clinton""","""CLI""",20,47.9754,-122.349581,"""64 South Ferrydock Road""",,"""Clinton""","""WA""","""98236""","""USA""","""http://maps.yahoo.com/#mvt=m&l…","""Highway 20 on Whidbey Island t…","[{0,47.9754,-122.349581}, {1,47.9754,-122.349581}, … {17,47.975027,-122.351335}]"
11,116,2,"""Coupeville ""","""COU""",40,48.159008,-122.672603,"""1400 South State Route 20""",,"""Coupeville""","""WA""","""98239""","""USA""","""https://maps.google.com/maps?q…","""Northbound/from Clinton ferry …","[{0,48.159008,-122.672603}, {1,48.159008,-122.672603}, … {17,48.159206,-122.672671}]"
…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…
18,118,1,"""Shaw Island""","""SHI""",30,48.584792,-122.92965,"""PO Box 455 (Mailing)""","""12 Blind Bay Rd.""","""Shaw Island""","""WA""","""98286""","""USA""","""http://maps.yahoo.com/py/maps.…","""From Shaw Island County Park o…","[{0,48.584792,-122.92965}, {1,48.584792,-122.92965}, … {17,48.58448,-122.929844}]"
19,120,1,"""Sidney B.C.""","""SID""",60,48.643114,-123.396739,"""PO Box 2248 Mailing""","""2499 Ocean Avenue (Physical)""","""Sidney""","""BC""","""V8L1T3""","""Canada""",""" http://maps.google.com/maps?…","""<b>Sidney/Anacortes ferry Term…","[{0,48.643114,-123.396739}, {1,48.643114,-123.396739}, … {17,48.643608,-123.397385}]"
20,105,5,"""Southworth""","""SOU""",35,47.513064,-122.495742,"""11700 SE SEDGWICK RD""",,"""Southworth""","""WA""","""98386""","""USA""","""http://maps.google.com/maps?q=…","""From I-5: Take exit 132 and pr…","[{0,47.513064,-122.495742}, {1,47.513064,-122.495742}, … {17,47.512954,-122.495893}]"
21,121,5,"""Tahlequah""","""TAH""",55,47.331961,-122.507786,"""Vashon Hwy SW and SW Tahlequah…",,"""Vashon""","""WA""","""98070""","""USA""","""http://maps.google.com/maps?q=…","""The Tahlequah terminal is loca…","[{0,47.331961,-122.507786}, {1,47.331961,-122.507786}, … {17,47.332705,-122.507328}]"


Before saving to the database, drop the `DispGISZoomLoc` column, which we will not need and which is not in a format supported by the database.

In [54]:
# Write to the database
terminal_locations_raw.drop("DispGISZoomLoc").write_database(
    table_name=f"{username}_terminal_locations_raw",
    connection=uri,
    engine="adbc",
    if_table_exists='replace'
)

20

#### Terminal Weather

Get the weather data from <https://open-meteo.com/en/docs>. Here is an example URL:

`https://api.open-meteo.com/v1/forecast?latitude=52.52&longitude=13.41&hourly=temperature_2m,precipitation,cloud_cover,visibility,wind_speed_10m`

First, generate a list of date ranges. If we request the entire date range in a single API call, the call will take a long time and is more likely to time out. Instead, we break the request up into many smaller chunks.

In [55]:
# Get a list of date ranges, starting from start_date.
date_ranges = []
_start_date = start_date

# Each chunk spans at most 4 weeks and never runs past end_date.
while _start_date <= end_date:
    _end_date = min(_start_date + datetime.timedelta(weeks=4), end_date)
    date_ranges.append((_start_date, _end_date))
    _start_date = _end_date + datetime.timedelta(days=1)

date_ranges

[(datetime.date(2024, 3, 1), datetime.date(2024, 3, 29)),
 (datetime.date(2024, 3, 30), datetime.date(2024, 4, 27)),
 (datetime.date(2024, 4, 28), datetime.date(2024, 5, 26)),
 (datetime.date(2024, 5, 27), datetime.date(2024, 6, 24)),
 (datetime.date(2024, 6, 25), datetime.date(2024, 7, 23)),
 (datetime.date(2024, 7, 24), datetime.date(2024, 8, 5))]
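
Because each chunk becomes a separate API request, it is worth confirming that the chunks neither overlap nor leave gaps. The following is a minimal sketch of such a check; the helper name `find_gaps_or_overlaps` is hypothetical, and the list is copied from the output above so the example runs on its own.

```python
import datetime

# The date ranges printed above, copied here so the check is self-contained.
date_ranges = [
    (datetime.date(2024, 3, 1), datetime.date(2024, 3, 29)),
    (datetime.date(2024, 3, 30), datetime.date(2024, 4, 27)),
    (datetime.date(2024, 4, 28), datetime.date(2024, 5, 26)),
    (datetime.date(2024, 5, 27), datetime.date(2024, 6, 24)),
    (datetime.date(2024, 6, 25), datetime.date(2024, 7, 23)),
    (datetime.date(2024, 7, 24), datetime.date(2024, 8, 5)),
]


def find_gaps_or_overlaps(ranges):
    """Return neighbouring pairs where the next range does not start
    exactly one day after the previous range ends."""
    one_day = datetime.timedelta(days=1)
    return [
        (prev, nxt)
        for prev, nxt in zip(ranges, ranges[1:])
        if nxt[0] - prev[1] != one_day
    ]


problems = find_gaps_or_overlaps(date_ranges)

# Both endpoints of each range are included, hence the + 1.
total_days = sum((end - start).days + 1 for start, end in date_ranges)

print(problems)    # → []
print(total_days)  # → 158
```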

Iterate over every combination of terminal location and date range, requesting the weather data for each. All of the responses are collected in the `json_data` list.

In [57]:
%%time

import time
from typing import TypedDict
from itertools import product

base_url = "https://archive-api.open-meteo.com/v1/"


class WeatherParams(TypedDict):
    hourly: list[str]
    latitude: float
    longitude: float
    start_date: datetime.date
    end_date: datetime.date


json_data = []

with httpx.Client(base_url=base_url, transport=cache_transport) as client:

    for terminal, date_range in product(
        terminal_locations_raw.select("Latitude", "Longitude", "TerminalName").to_dicts(),
        date_ranges
    ):

        params: WeatherParams = {
            "hourly": [
                "weather_code",
                "temperature_2m",
                "precipitation",
                "cloud_cover",
                "wind_speed_10m",
                "wind_direction_10m",
                "wind_gusts_10m",
            ],
            "start_date": date_range[0],
            "end_date": date_range[1],
            "latitude": round(terminal["Latitude"], 2),
            "longitude": round(terminal["Longitude"], 2),
        }

        print(" ".join([
            f'Getting records for: {terminal["TerminalName"]} <>',
            f'{params["latitude"]}, {params["longitude"]} <>',
            f'{params["start_date"]} to {params["end_date"]}...'
        ]))

        response = client.get("/archive", params=params, extensions={"force_cache": True})

        try:
            print(f"\t{response}")
            print(f"\tfound {len(response.json()):,} records")
            print(f"\tFrom cache: {response.extensions['from_cache']}")
            response.raise_for_status()
            _json_data = response.json()
            _json_data["terminal_name"] = terminal["TerminalName"]
            json_data.append(_json_data)

        except httpx.HTTPStatusError as exc:
            if response.status_code == 429:
                print("\tRate limit exceeded. Waiting 60 seconds...")
                time.sleep(60)
                # Retry the same endpoint that was rate limited.
                response = client.get("/archive", params=params)
                print(f"\t{response}")
                print(f"\tfound {len(response.json()):,} records")
                print(f"\tFrom cache: {response.extensions['from_cache']}")
                response.raise_for_status()
                _json_data = response.json()
                _json_data["terminal_name"] = terminal["TerminalName"]
                json_data.append(_json_data)
            else:
                raise exc

Getting records for: Anacortes <> 48.51, -122.68 <> 2024-03-01 to 2024-03-29...
	<Response [200 OK]>
	found 9 records
	From cache: True
Getting records for: Anacortes <> 48.51, -122.68 <> 2024-03-30 to 2024-04-27...
	<Response [200 OK]>
	found 9 records
	From cache: True
Getting records for: Anacortes <> 48.51, -122.68 <> 2024-04-28 to 2024-05-26...
	<Response [200 OK]>
	found 9 records
	From cache: True
Getting records for: Anacortes <> 48.51, -122.68 <> 2024-05-27 to 2024-06-24...
	<Response [200 OK]>
	found 9 records
	From cache: True
Getting records for: Anacortes <> 48.51, -122.68 <> 2024-06-25 to 2024-07-23...
	<Response [200 OK]>
	found 9 records
	From cache: True
Getting records for: Anacortes <> 48.51, -122.68 <> 2024-07-24 to 2024-08-05...
	<Response [200 OK]>
	found 9 records
	From cache: True
Getting records for: Bainbridge Island <> 47.62, -122.51 <> 2024-03-01 to 2024-03-29...
	<Response [200 OK]>
	found 9 records
	From cache: True
Getting records for: Bainbridge Island <
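
The `except` branch in the cell above waits once and retries once. If the API is still rate limiting after that wait, a small retry loop may be more robust. The following is a sketch of that pattern using only the standard library; `RateLimited`, `call_with_retry`, and `fake_request` are hypothetical names, and the fake request stands in for a real HTTP call.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical stand-in for an HTTP 429 response."""


def call_with_retry(
    func: Callable[[], T],
    max_attempts: int = 3,
    wait_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, waiting and retrying whenever it raises RateLimited."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except RateLimited:
            # Give up after the final attempt and let the caller see the error.
            if attempt == max_attempts:
                raise
            sleep(wait_seconds)
    raise ValueError("max_attempts must be at least 1")


# Demo: a fake request that is rate limited twice before succeeding.
calls = {"count": 0}


def fake_request() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise RateLimited
    return "ok"


# Record the waits in a list instead of actually sleeping.
waits: list[float] = []
result = call_with_retry(fake_request, sleep=waits.append)

print(result, calls["count"], waits)  # → ok 3 [60.0, 60.0]
```

In the notebook, `func` could wrap the `client.get(...)` call and raise `RateLimited` when the status code is 429; that wiring is an assumption and is not shown here.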

Hishel was used for caching again. Re-run the above code chunk and note how much faster it executes. Then, convert the JSON data into a polars DataFrame.

In [58]:
pl.DataFrame(json_data).head(2)

latitude,longitude,generationtime_ms,utc_offset_seconds,timezone,timezone_abbreviation,elevation,hourly_units,hourly,terminal_name
f64,f64,f64,i64,str,str,f64,struct[8],struct[8],str
48.541298,-122.727264,0.223994,0,"""GMT""","""GMT""",0.0,"{""iso8601"",""wmo code"",""°C"",""mm"",""%"",""km/h"",""°"",""km/h""}","{[""2024-03-01T00:00"", ""2024-03-01T01:00"", … ""2024-03-29T23:00""],[3, 51, … 0],[6.2, 5.9, … 9.4],[0.0, 0.1, … 0.0],[80, 94, … 5],[37.4, 36.7, … 7.0],[154, 159, … 12],[47.9, 48.6, … 10.1]}","""Anacortes"""
48.541298,-122.727264,0.215054,0,"""GMT""","""GMT""",0.0,"{""iso8601"",""wmo code"",""°C"",""mm"",""%"",""km/h"",""°"",""km/h""}","{[""2024-03-30T00:00"", ""2024-03-30T01:00"", … ""2024-04-27T23:00""],[0, 0, … 51],[9.9, 9.8, … 9.6],[0.0, 0.0, … 0.1],[9, 3, … 100],[16.1, 19.4, … 33.1],[336, 342, … 147],[17.3, 23.0, … 42.8]}","""Anacortes"""


In [59]:
terminal_weather = (
    pl.DataFrame(json_data)
    .unnest("hourly")
)

terminal_weather.head()

latitude,longitude,generationtime_ms,utc_offset_seconds,timezone,timezone_abbreviation,elevation,hourly_units,time,weather_code,temperature_2m,precipitation,cloud_cover,wind_speed_10m,wind_direction_10m,wind_gusts_10m,terminal_name
f64,f64,f64,i64,str,str,f64,struct[8],list[str],list[i64],list[f64],list[f64],list[i64],list[f64],list[i64],list[f64],str
48.541298,-122.727264,0.223994,0,"""GMT""","""GMT""",0.0,"{""iso8601"",""wmo code"",""°C"",""mm"",""%"",""km/h"",""°"",""km/h""}","[""2024-03-01T00:00"", ""2024-03-01T01:00"", … ""2024-03-29T23:00""]","[3, 51, … 0]","[6.2, 5.9, … 9.4]","[0.0, 0.1, … 0.0]","[80, 94, … 5]","[37.4, 36.7, … 7.0]","[154, 159, … 12]","[47.9, 48.6, … 10.1]","""Anacortes"""
48.541298,-122.727264,0.215054,0,"""GMT""","""GMT""",0.0,"{""iso8601"",""wmo code"",""°C"",""mm"",""%"",""km/h"",""°"",""km/h""}","[""2024-03-30T00:00"", ""2024-03-30T01:00"", … ""2024-04-27T23:00""]","[0, 0, … 51]","[9.9, 9.8, … 9.6]","[0.0, 0.0, … 0.1]","[9, 3, … 100]","[16.1, 19.4, … 33.1]","[336, 342, … 147]","[17.3, 23.0, … 42.8]","""Anacortes"""
48.541298,-122.727264,0.30005,0,"""GMT""","""GMT""",0.0,"{""iso8601"",""wmo code"",""°C"",""mm"",""%"",""km/h"",""°"",""km/h""}","[""2024-04-28T00:00"", ""2024-04-28T01:00"", … ""2024-05-26T23:00""]","[3, 3, … 3]","[9.9, 9.9, … 11.6]","[0.0, 0.0, … 0.0]","[100, 100, … 100]","[32.9, 35.2, … 18.2]","[146, 146, … 120]","[42.8, 45.7, … 21.6]","""Anacortes"""
48.541298,-122.727264,0.210047,0,"""GMT""","""GMT""",0.0,"{""iso8601"",""wmo code"",""°C"",""mm"",""%"",""km/h"",""°"",""km/h""}","[""2024-05-27T00:00"", ""2024-05-27T01:00"", … ""2024-06-24T23:00""]","[51, 3, … 0]","[11.6, 11.9, … 12.6]","[0.1, 0.0, … 0.0]","[100, 100, … 5]","[17.9, 16.6, … 16.3]","[118, 129, … 229]","[22.0, 20.9, … 21.2]","""Anacortes"""
48.541298,-122.727264,0.205994,0,"""GMT""","""GMT""",0.0,"{""iso8601"",""wmo code"",""°C"",""mm"",""%"",""km/h"",""°"",""km/h""}","[""2024-06-25T00:00"", ""2024-06-25T01:00"", … ""2024-07-23T23:00""]","[0, 0, … 0]","[13.1, 12.7, … 13.2]","[0.0, 0.0, … 0.0]","[7, 4, … 0]","[16.5, 17.7, … 14.8]","[234, 241, … 225]","[19.4, 22.0, … 16.2]","""Anacortes"""


In [60]:
terminal_weather = (
    terminal_weather
    .explode(
        "time",
        "weather_code",
        "temperature_2m",
        "precipitation",
        "cloud_cover",
        "wind_speed_10m",
        "wind_direction_10m",
        "wind_gusts_10m",
    )
)

terminal_weather

latitude,longitude,generationtime_ms,utc_offset_seconds,timezone,timezone_abbreviation,elevation,hourly_units,time,weather_code,temperature_2m,precipitation,cloud_cover,wind_speed_10m,wind_direction_10m,wind_gusts_10m,terminal_name
f64,f64,f64,i64,str,str,f64,struct[8],str,i64,f64,f64,i64,f64,i64,f64,str
48.541298,-122.727264,0.223994,0,"""GMT""","""GMT""",0.0,"{""iso8601"",""wmo code"",""°C"",""mm"",""%"",""km/h"",""°"",""km/h""}","""2024-03-01T00:00""",3,6.2,0.0,80,37.4,154,47.9,"""Anacortes"""
48.541298,-122.727264,0.223994,0,"""GMT""","""GMT""",0.0,"{""iso8601"",""wmo code"",""°C"",""mm"",""%"",""km/h"",""°"",""km/h""}","""2024-03-01T01:00""",51,5.9,0.1,94,36.7,159,48.6,"""Anacortes"""
48.541298,-122.727264,0.223994,0,"""GMT""","""GMT""",0.0,"{""iso8601"",""wmo code"",""°C"",""mm"",""%"",""km/h"",""°"",""km/h""}","""2024-03-01T02:00""",3,5.7,0.0,83,36.6,156,49.0,"""Anacortes"""
48.541298,-122.727264,0.223994,0,"""GMT""","""GMT""",0.0,"{""iso8601"",""wmo code"",""°C"",""mm"",""%"",""km/h"",""°"",""km/h""}","""2024-03-01T03:00""",1,5.1,0.0,40,40.9,152,54.7,"""Anacortes"""
48.541298,-122.727264,0.223994,0,"""GMT""","""GMT""",0.0,"{""iso8601"",""wmo code"",""°C"",""mm"",""%"",""km/h"",""°"",""km/h""}","""2024-03-01T04:00""",2,4.6,0.0,66,40.7,144,54.7,"""Anacortes"""
…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…
47.486816,-122.512314,0.174046,0,"""GMT""","""GMT""",0.0,"{""iso8601"",""wmo code"",""°C"",""mm"",""%"",""km/h"",""°"",""km/h""}","""2024-08-05T19:00""",0,24.5,0.0,0,8.0,190,25.9,"""Vashon Island"""
47.486816,-122.512314,0.174046,0,"""GMT""","""GMT""",0.0,"{""iso8601"",""wmo code"",""°C"",""mm"",""%"",""km/h"",""°"",""km/h""}","""2024-08-05T20:00""",0,26.7,0.0,0,7.2,198,25.9,"""Vashon Island"""
47.486816,-122.512314,0.174046,0,"""GMT""","""GMT""",0.0,"{""iso8601"",""wmo code"",""°C"",""mm"",""%"",""km/h"",""°"",""km/h""}","""2024-08-05T21:00""",0,28.6,0.0,0,6.6,193,25.2,"""Vashon Island"""
47.486816,-122.512314,0.174046,0,"""GMT""","""GMT""",0.0,"{""iso8601"",""wmo code"",""°C"",""mm"",""%"",""km/h"",""°"",""km/h""}","""2024-08-05T22:00""",0,29.7,0.0,0,5.4,188,24.5,"""Vashon Island"""
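
To see what the `unnest` and `explode` steps do, it can help to perform the same reshaping in plain Python. The following sketch uses a shortened copy of the first Anacortes record, with two timestamps and one measurement; the function name `to_rows` is hypothetical, and this is an illustration of the idea rather than how polars implements it.

```python
# Shortened payload in the same column-oriented layout as the API response.
payload = {
    "latitude": 48.541298,
    "terminal_name": "Anacortes",
    "hourly": {
        "time": ["2024-03-01T00:00", "2024-03-01T01:00"],
        "temperature_2m": [6.2, 5.9],
    },
}


def to_rows(record: dict) -> list[dict]:
    """Turn one column-oriented record into one row per timestamp."""
    # "Unnest": lift the hourly fields up next to the top-level fields.
    top_level = {k: v for k, v in record.items() if k != "hourly"}
    hourly = record["hourly"]
    names = list(hourly)
    # "Explode": zip the parallel lists so each position becomes its own row.
    return [
        {**top_level, **dict(zip(names, values))}
        for values in zip(*hourly.values())
    ]


rows = to_rows(payload)

print(len(rows))  # → 2
print(rows[1])
```

Each row repeats the top-level fields, such as `latitude` and `terminal_name`, which is why those values are duplicated on every line of the exploded DataFrame above.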


Drop the `hourly_units` field: it may not write to the database correctly, and we do not need it.

In [61]:
terminal_weather = terminal_weather.select(
    pl.col("*").exclude("hourly_units")
)

terminal_weather.head()

latitude,longitude,generationtime_ms,utc_offset_seconds,timezone,timezone_abbreviation,elevation,time,weather_code,temperature_2m,precipitation,cloud_cover,wind_speed_10m,wind_direction_10m,wind_gusts_10m,terminal_name
f64,f64,f64,i64,str,str,f64,str,i64,f64,f64,i64,f64,i64,f64,str
48.541298,-122.727264,0.223994,0,"""GMT""","""GMT""",0.0,"""2024-03-01T00:00""",3,6.2,0.0,80,37.4,154,47.9,"""Anacortes"""
48.541298,-122.727264,0.223994,0,"""GMT""","""GMT""",0.0,"""2024-03-01T01:00""",51,5.9,0.1,94,36.7,159,48.6,"""Anacortes"""
48.541298,-122.727264,0.223994,0,"""GMT""","""GMT""",0.0,"""2024-03-01T02:00""",3,5.7,0.0,83,36.6,156,49.0,"""Anacortes"""
48.541298,-122.727264,0.223994,0,"""GMT""","""GMT""",0.0,"""2024-03-01T03:00""",1,5.1,0.0,40,40.9,152,54.7,"""Anacortes"""
48.541298,-122.727264,0.223994,0,"""GMT""","""GMT""",0.0,"""2024-03-01T04:00""",2,4.6,0.0,66,40.7,144,54.7,"""Anacortes"""


In [62]:
# Write to the database
terminal_weather.write_database(
    table_name=f"{username}_terminal_weather_raw",
    connection=uri,
    engine="adbc",
    if_table_exists='replace'
)

75840
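
The number of rows written can be sanity checked with a little arithmetic. The sketch below assumes that every terminal returned a full 24 hourly readings for every day, that there are 20 terminals (the row count reported when writing the terminal locations table), and that the overall period matches the first and last dates printed in `date_ranges`.

```python
import datetime

start_date = datetime.date(2024, 3, 1)
end_date = datetime.date(2024, 8, 5)

n_terminals = 20  # rows written to the terminal locations table
n_days = (end_date - start_date).days + 1  # both endpoints are included
expected_rows = n_terminals * n_days * 24  # one row per terminal per hour

print(n_days)         # → 158
print(expected_rows)  # → 75840
```

If the two numbers disagree after a re-run, a likely cause is a failed or partial request for one of the terminal and date range combinations.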

## Task 4 - Publish the solution notebook to Connect

### 🔄 Task

- Publish the solution notebook to Posit Connect.
- Share the notebook with the rest of the workshop.
- Schedule the notebook to run once every week.

### 🧑‍💻 Code

Run the following to deploy the notebook to Connect:

```bash
source .env

# Check if your Connect environment variables are set
echo $CONNECT_SERVER
echo $CONNECT_API_KEY

# Check that you have the required environment variables set
echo $DATABASE_URI_PYTHON
echo $WSDOT_ACCESS_CODE

# Publish the notebook
rsconnect deploy notebook --title "Seattle Ferries #1 - Raw data" -E WSDOT_ACCESS_CODE notebook.ipynb
```
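
The `echo` checks above can also be done from Python, which makes it easier to see every missing variable at once. This is a minimal sketch, assuming the variables have been exported to the process environment; the helper name `missing_env_vars` is hypothetical.

```python
import os
from collections.abc import Mapping

# The variables checked with echo in the shell commands above.
REQUIRED = [
    "CONNECT_SERVER",
    "CONNECT_API_KEY",
    "DATABASE_URI_PYTHON",
    "WSDOT_ACCESS_CODE",
]


def missing_env_vars(
    names: list[str],
    environ: Mapping[str, str] = os.environ,
) -> list[str]:
    """Return the names that are unset or empty in environ."""
    return [name for name in names if not environ.get(name)]


missing = missing_env_vars(REQUIRED)
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```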

After the deployment is successful:

- Share the notebook with the person beside you.
- Schedule the notebook to run once every week.

In [63]:
print("Notebook complete ✅")

Notebook complete ✅
