In [1]:
import duckdb

# Load SQL extension, configure display limit
%load_ext sql
%config SqlMagic.displaylimit = 0

# Initialize 🦆 DuckDB connection
conn = duckdb.connect()

# Import database
%sql conn --alias duckdb
%sql IMPORT DATABASE '../../data/nps';



Count
40


Now we get to have some fun! **Window** functions work by breaking a relation into _independent_ partitions, ordering the rows within each partition, and then computing a new column value for each row. If this sounds complex, it might be at first, but we'll break it down for you.

If this sounds computationally intensive, it is, so we'll have to be careful when windowing over large datasets.

When we build a window function, we're computing a value for each row _relative_ to the other rows in its partition. This is a powerful pattern and one that makes SQL stand out as an excellent language for querying relational data.
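
To make the partition, order, compute steps concrete, here is a minimal plain-Python sketch of what a window function does conceptually. It is not how DuckDB executes windows internally, and the toy rows and the helper name `max_and_first_over_partition` are made up for illustration rather than taken from the NPS data.

```python
from itertools import groupby
from operator import itemgetter

# Made-up toy rows: (park, campground, sites). Not from the NPS dataset.
rows = [
    ("Park A", "North Camp", 10),
    ("Park A", "South Camp", 40),
    ("Park B", "Lake Camp", 25),
    ("Park B", "Ridge Camp", 5),
    ("Park B", "Creek Camp", 15),
]


def max_and_first_over_partition(rows):
    """Mimic MAX(sites) OVER (PARTITION BY park) and
    FIRST(campground) OVER (PARTITION BY park ORDER BY sites DESC)."""
    result = []
    # 1. Partition: group rows by park (groupby needs sorted input).
    by_park = sorted(rows, key=itemgetter(0))
    for park, partition in groupby(by_park, key=itemgetter(0)):
        # 2. Order: sort each partition on its own, largest first.
        ordered = sorted(partition, key=itemgetter(2), reverse=True)
        max_sites = ordered[0][2]
        biggest = ordered[0][1]
        # 3. Compute: every input row gets the new columns.
        #    Unlike GROUP BY, no rows are collapsed.
        for _, campground, sites in ordered:
            result.append((park, campground, sites, max_sites, biggest))
    return result


windowed = max_and_first_over_partition(rows)
for row in windowed:
    print(row)
```

Five rows go in and five rows come out: each row keeps its own values and gains two columns describing its partition.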


Here are some sample questions well suited to windows:
- Which park has the most campsites (we'll show you why this is easy with windows)?
- What is the last event in January?

In [2]:
%%sql
SELECT
    DISTINCT p.fullname as park_name,

    -- For each park, what is the largest number of sites at any single campground?
    MAX(c.numberofsitesfirstcomefirstserve) OVER (PARTITION BY park_name) as max_num_fcfc,
    MAX(c.numberofsitesreservable) OVER (PARTITION BY park_name) as max_num_reserve,
    MAX(c.numberofsitesfirstcomefirstserve + c.numberofsitesreservable) OVER (PARTITION BY park_name) as max_num_campsites,

    -- For each park, which _campground_ has the maximum number of sites?
    FIRST(c.name) OVER (PARTITION BY park_name ORDER BY c.numberofsitesfirstcomefirstserve DESC) as max_num_fcfs_site,
    FIRST(c.name) OVER (PARTITION BY park_name ORDER BY c.numberofsitesreservable DESC) as max_num_reserve_site,
    FIRST(c.name) OVER (PARTITION BY park_name ORDER BY c.numberofsitesfirstcomefirstserve + c.numberofsitesreservable DESC) as max_num_campsites_site

FROM nps_public_data.campgrounds c
INNER JOIN nps_public_data.parks p
    ON c.parkcode = p.parkcode
    AND p.designation = 'National Park'
ORDER BY max_num_campsites DESC
LIMIT 10;

park_name,max_num_fcfc,max_num_reserve,max_num_campsites,max_num_fcfs_site,max_num_reserve_site,max_num_campsites_site
Mesa Verde National Park,267,267,534,Morefield Campground,Morefield Campground,Morefield Campground
Yellowstone National Park,0,432,432,Fishing Bridge RV Park,Bridge Bay Campground,Bridge Bay Campground
Grand Teton National Park,10,347,347,Jenny Lake Campground,Colter Bay Campground,Colter Bay Campground
Grand Canyon National Park,15,300,315,Mather Campground - South Rim,Mather Campground - South Rim,Mather Campground - South Rim
Yosemite National Park,156,235,304,Tuolumne Meadows Campground,Upper Pines Campground,Tuolumne Meadows Campground
Acadia National Park,0,281,281,Duck Harbor Campground,Blackwoods Campground,Blackwoods Campground
Rocky Mountain National Park,27,239,239,Aspenglen Campground,Moraine Park Campground,Moraine Park Campground
Death Valley National Park,230,136,230,Sunset Campground,Furnace Creek Campground,Sunset Campground
Shenandoah National Park,50,220,220,Loft Mountain Campground,Big Meadows Campground,Big Meadows Campground
Everglades National Park,108,159,216,Long Pine Key Campground,Flamingo Campground,Long Pine Key Campground
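
A window function returns one value per input row, so the query above relies on `DISTINCT` to collapse the repeated per-park values into a single row per park. The sketch below reproduces that pattern on a tiny table that is easy to check by eye. It makes several assumptions: the rows are made up, the table and column names are hypothetical, and it uses Python's built-in `sqlite3` module (SQLite 3.25 or later) only so that it runs without the NPS database. SQLite spells the function `FIRST_VALUE`, which DuckDB also accepts.

```python
import sqlite3

# In-memory toy table; rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE campgrounds (park TEXT, name TEXT, sites INTEGER)")
conn.executemany(
    "INSERT INTO campgrounds VALUES (?, ?, ?)",
    [
        ("Park A", "North Camp", 10),
        ("Park A", "South Camp", 40),
        ("Park B", "Lake Camp", 25),
        ("Park B", "Ridge Camp", 5),
        ("Park B", "Creek Camp", 15),
    ],
)

query = """
SELECT DISTINCT
    park,
    -- Largest campground size within each park
    MAX(sites) OVER (PARTITION BY park) AS max_sites,
    -- Name of the campground that sorts first when ordered by size
    FIRST_VALUE(name) OVER (PARTITION BY park ORDER BY sites DESC) AS biggest_campground
FROM campgrounds
ORDER BY max_sites DESC
"""

summary = conn.execute(query).fetchall()
print(summary)
# [('Park A', 40, 'South Camp'), ('Park B', 25, 'Lake Camp')]
```

Without `DISTINCT`, the same query would return five rows, one per campground, each repeating its park's values. Note also that when several campgrounds tie on the ordering column, which name comes first is not guaranteed.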
