ISD —— Integrated Surface Data


Download, Parse, Visualize Integrated Surface Dataset

Covers 30,000 meteorological stations with sub-hourly observation records from 1900 to 2021.

Quick Start

Running make all will set up everything.

Internet access (GitHub & NOAA) is required.

Make the Baseline Work

Running make baseline creates a minimal usable setup via:

make sql        # load isd database schema into postgres (via PGURL env)
make ui         # setup grafana dashboards
make download   # download meta data (dict) & parsers
make load-meta  # load meta-data into database
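
Since make sql reads the connection string from the PGURL environment variable, it can help to check that the variable is set before running the baseline. A minimal sketch, assuming a POSIX shell; check_pgurl is a hypothetical helper and not part of the repository:

```sh
# Sketch: refuse to continue unless PGURL is set.
# check_pgurl is a hypothetical helper, not shipped with this project.
check_pgurl() {
    if [ -z "${PGURL:-}" ]; then
        echo "PGURL is not set; export it before running make baseline" >&2
        return 1
    fi
    echo "using PGURL=${PGURL}"
}

# Usage (commented out so the sketch has no side effects):
# export PGURL=service=meta
# check_pgurl && make baseline
```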

Get This Year's Daily Summary

Get the latest observation summaries (daily, monthly, yearly).

NOTICE: Data is downloaded directly from NOAA (check your proxy if it is too slow). Expect about 60MB per year; the full archive is around 3~4 GB as original zipped files and about 20 GB in the database.

Running make reload loads a minimal dataset (this year so far) into the database.

make get-daily   # get latest observation daily summary (of latest year e.g 2021)
make load-daily  # load latest daily data into database (of latest year e.g 2021)
make refresh     # refresh monthly & yearly data based on daily data 

The ISD Daily and ISD Hourly datasets are updated on a rolling basis every day. Run these commands to pick up the daily update.
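
One way to automate the daily update is a small wrapper that runs the three targets in order and stops at the first failure. This is only a sketch: daily_update, the MAKE override and the crontab line are assumptions for illustration, not something the repository provides.

```sh
# Sketch of a daily update wrapper.
# MAKE can be overridden (for example MAKE=echo) for a dry run.
MAKE="${MAKE:-make}"

daily_update() {
    # Run the three targets in order; stop at the first failure.
    for target in get-daily load-daily refresh; do
        "$MAKE" "$target" || return 1
    done
}

# Example crontab entry (path, script name and schedule are placeholders):
# 0 6 * * * cd /path/to/isd && ./daily-update.sh
```

Stopping at the first failure keeps make refresh from recomputing monthly and yearly data when the download or load step did not complete.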

Get This Year's Hourly Raw Data

Get the latest hourly raw observation data (not recommended).

WARNING: Hourly raw data is a large and very noisy dataset: around 5GB per year, about 100 GB of original zipped files in total, and 1TB in the database.

Running make reload-hourly loads minimal raw data (this year so far) into the database.

make get-hourly   # get latest hourly raw observation data (of latest year e.g 2021)
make load-hourly  # load latest hourly raw data into database (of latest year e.g 2021)

Load more historic data

You can download hourly & daily data for a specific year.

# bin/ <year> will get the given year's daily observation summary (1929-2021)
bin/ 2020     # get 2020 data

# bin/ <year> will get the given year's hourly raw observation data (1900-2021)
bin/ 2020

Then load them into the database with the parsers:

# bin/ <PGURL> <year> will load <year>'s daily summary into PGURL database 
bin/ service=meta 2020     # note: some dirty data may violate constraints

# bin/ <PGURL> <year> will load <year>'s raw hourly data into PGURL database
bin/ service=meta 2020
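
To backfill several years, the download and load steps can be wrapped in a loop. The sketch below assumes two placeholder variables, GET_CMD and LOAD_CMD, that you point at the matching download and load scripts under bin/; backfill itself is a hypothetical helper. A failing year is reported and skipped, since dirty data may violate constraints.

```sh
# GET_CMD and LOAD_CMD are placeholders (assumptions): set them to the
# download and load scripts under bin/ for the dataset you want.
GET_CMD="${GET_CMD:-}"
LOAD_CMD="${LOAD_CMD:-}"

# backfill PGURL FIRST_YEAR LAST_YEAR
backfill() {
    pgurl=$1
    year=$2
    last=$3
    while [ "$year" -le "$last" ]; do
        # Download first, then load; report the year either way.
        if "$GET_CMD" "$year" && "$LOAD_CMD" "$pgurl" "$year"; then
            echo "loaded $year"
        else
            echo "failed $year, continuing" >&2
        fi
        year=$((year + 1))
    done
}

# Usage: backfill service=meta 2018 2020
```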



| Dataset | Sample | Document | Comments |
|---------|--------|----------|----------|
| ISD Hourly | isd-hourly-sample.csv | isd-hourly-document.pdf | (Sub-)hourly observation records |
| ISD Daily | isd-daily-sample.csv | isd-daily-format.txt | Daily summary |
| ISD Monthly | N/A | isd-gsom-document.pdf | Not used, generated from daily |
| ISD Yearly | N/A | isd-gsoy-document.pdf | Not used, generated from monthly |

Hourly Data: Original tarball size 105GB, table size 1TB (+600GB indexes).

Daily Data: Original tarball size 3.2GB, table size 24 GB.

It is recommended to have 2TB storage for a full installation, and at least 40GB for daily data only installation.
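
Before a large load it may be worth checking free disk space against these recommendations. A rough sketch, assuming a POSIX shell with df and awk available; required_gb and has_space are hypothetical helpers, and the data directory is whatever filesystem holds your PostgreSQL data:

```sh
# Recommended free space in GB, taken from the figures above.
required_gb() {
    case "$1" in
        full)  echo 2048 ;;   # 2TB for a full (hourly + daily) installation
        daily) echo 40 ;;     # daily-only installation
        *)     echo "unknown mode: $1" >&2; return 1 ;;
    esac
}

# has_space DIR MODE: succeed if DIR's filesystem has the recommended space.
has_space() {
    gb=$(required_gb "$2") || return 1
    need_kb=$((gb * 1024 * 1024))
    # df -Pk prints one line per filesystem; column 4 is available KB.
    avail_kb=$(df -Pk "$1" | awk 'NR==2 { print $4 }')
    [ "$avail_kb" -ge "$need_kb" ]
}

# Usage: has_space /path/to/pgdata daily || echo "not enough disk space"
```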


Data schema definition


CREATE TABLE isd.station (
    station    VARCHAR(12) PRIMARY KEY,
    usaf       VARCHAR(6) GENERATED ALWAYS AS (substring(station, 1, 6)) STORED,
    wban       VARCHAR(5) GENERATED ALWAYS AS (substring(station, 7, 5)) STORED,
    name       VARCHAR(32),
    country    VARCHAR(2),
    province   VARCHAR(2),
    icao       VARCHAR(4),
    location   GEOMETRY(POINT),
    longitude  NUMERIC GENERATED ALWAYS AS (Round(ST_X(location)::NUMERIC, 6)) STORED,
    latitude   NUMERIC GENERATED ALWAYS AS (Round(ST_Y(location)::NUMERIC, 6)) STORED,
    elevation  NUMERIC,
    period     daterange,
    begin_date DATE GENERATED ALWAYS AS (lower(period)) STORED,
    end_date   DATE GENERATED ALWAYS AS (upper(period)) STORED
);
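
The usaf and wban generated columns are fixed-position slices of the station identifier: characters 1 to 6 and characters 7 to 11. The same split can be sketched in shell, for example to inspect identifiers before loading. split_station and the sample identifier are illustrative only and not part of the repository:

```sh
# Mirror substring(station, 1, 6) and substring(station, 7, 5) from the schema.
# split_station is a hypothetical helper for illustration.
split_station() {
    usaf=$(printf '%s' "$1" | cut -c1-6)
    wban=$(printf '%s' "$1" | cut -c7-11)
    echo "$usaf $wban"
}

# Usage: split_station 72530094846
```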

Hourly Data

CREATE TABLE isd.hourly (
    station    VARCHAR(11) NOT NULL,
    ts         TIMESTAMP   NOT NULL,
    temp       NUMERIC(3, 1),
    dewp       NUMERIC(3, 1),
    slp        NUMERIC(5, 1),
    stp        NUMERIC(5, 1),
    vis        NUMERIC(6),
    wd_angle   NUMERIC(3),
    wd_speed   NUMERIC(4, 1),
    wd_gust    NUMERIC(4, 1),
    wd_code    VARCHAR(1),
    cld_height NUMERIC(5),
    cld_code   VARCHAR(2),
    sndp       NUMERIC(5, 1),
    prcp       NUMERIC(5, 1),
    prcp_hour  NUMERIC(2),
    prcp_code  VARCHAR(1),
    mw_code    VARCHAR(2),
    aw_code    VARCHAR(2),
    pw_code    VARCHAR(1),
    pw_hour    NUMERIC(2),
    data       JSONB
);

Daily Data

CREATE TABLE isd.daily (
   station     VARCHAR(12) NOT NULL,
   ts          DATE        NOT NULL,
   temp_mean   NUMERIC(3, 1),
   temp_min    NUMERIC(3, 1),
   temp_max    NUMERIC(3, 1),
   dewp_mean   NUMERIC(3, 1),
   slp_mean    NUMERIC(5, 1),
   stp_mean    NUMERIC(5, 1),
   vis_mean    NUMERIC(6),
   wdsp_mean   NUMERIC(4, 1),
   wdsp_max    NUMERIC(4, 1),
   gust        NUMERIC(4, 1),
   prcp_mean   NUMERIC(5, 1),
   prcp        NUMERIC(5, 1),
   sndp        NUMERIC(5, 1),
   is_foggy    BOOLEAN,
   is_rainy    BOOLEAN,
   is_snowy    BOOLEAN,
   is_hail     BOOLEAN,
   is_thunder  BOOLEAN,
   is_tornado  BOOLEAN,
   temp_count  SMALLINT,
   dewp_count  SMALLINT,
   slp_count   SMALLINT,
   stp_count   SMALLINT,
   wdsp_count  SMALLINT,
   visib_count SMALLINT,
   temp_min_f  BOOLEAN,
   temp_max_f  BOOLEAN,
   prcp_flag   CHAR,
   PRIMARY KEY (ts, station)
);


There are two parsers, isdd and isdh. Each takes an original NOAA yearly tarball as input and generates CSV as output, which can be consumed directly by the PostgreSQL COPY command.

	isdh -- Integrated Surface Dataset Hourly Parser

	isdh [-i <input|stdin>] [-o <output|stdout>] -p -d -c -v

	The isdh program takes an ISD hourly file (yearly tarball) as input
	and generates CSV as output.

	-i	<input>		input file, stdin by default
	-o	<output>	output file, stdout by default
	-p	<profpath>	pprof file path (disable by default)	
	-v                verbose progress report
	-d                de-duplicate rows (raw, ts-first, hour-first)
	-c                add comma separated extra columns
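
Because isdh writes CSV to stdout by default, its output can be piped straight into COPY. The following is a sketch under assumptions, not the project's own loader: load_hourly is a hypothetical helper, the tarball path is whatever you downloaded, and it assumes the parser's column order matches isd.hourly. ISDH and PSQL can be overridden to point at specific binaries.

```sh
# Sketch: stream parser output into PostgreSQL without an intermediate file.
ISDH="${ISDH:-isdh}"
PSQL="${PSQL:-psql}"

# load_hourly TARBALL: parse one yearly tarball and COPY the rows in.
# Assumes PGURL is set and the CSV columns line up with isd.hourly.
load_hourly() {
    "$ISDH" -i "$1" \
        | "$PSQL" "$PGURL" -c "COPY isd.hourly FROM STDIN WITH (FORMAT csv)"
}

# Usage: PGURL=service=meta load_hourly /path/to/yearly-tarball
```

With psql -c, COPY ... FROM STDIN reads rows from psql's standard input, so nothing is staged on disk between the parser and the database.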


ISD Overview

Dashboard definition

ISD Station

Dashboard definition

ISD Monthly

Dashboard definition