obsio

obsio is a Python package that provides a consistent, generic interface for accessing weather and climate observations from multiple data providers. All station and observation data are returned as pandas or xarray data structures.

Installation

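obsio has the following dependencies, all installed by the commands below: lxml, netCDF4, numpy, pandas, pycurl, pytz, scipy, shapely, xarray, suds-py3, and tzwhere.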

The easiest method to install the required dependencies is with a combination of conda and pip:

conda create -n obsio_env python=3 lxml netCDF4 numpy pandas pycurl pytz scipy shapely xarray
conda activate obsio_env
pip install suds-py3
pip install tzwhere

And then install obsio from source:

git clone https://github.com/jaredwo/obsio.git
pip install obsio/

Available Data Providers

obsio currently has full or partial support for the climate and weather data providers listed below. Only daily and monthly elements are supported at this time; support for hourly and sub-hourly elements could be added.

| Provider Name | Currently Supported Elements | Requires Local Storage |
|---------------|------------------------------|------------------------|
| ACIS | tmin, tmax, prcp, tobs_tmin, tobs_tmax, tobs_prcp | No |
| GHCN-D | tmin, tmax, prcp, tobs_tmin, tobs_tmax, tobs_prcp | Optional |
| ISDLite | tmin, tmax, tdew, tdewmin, tdewmax, vpd, vpdmin, vpdmax, rh, rhmin, rhmax, prcp | No |
| MADIS | tmin, tmax, prcp, tdew, tdewmin, tdewmax, vpd, vpdmin, vpdmax, rh, rhmin, rhmax, srad, wspd | Yes |
| NRCS | tmin, tmax, prcp, snwd, swe | No |
| USHCN | *_mth_raw, *_mth_tob, *_mth_fls | Yes |
| WRCC | tmin, tmax, tdew, tdewmin, tdewmax, vpd, vpdmin, vpdmax, rh, rhmin, rhmax, prcp, srad, wspd | No |

Element definitions:

  • tmin : daily minimum temperature (C)
  • tmax : daily maximum temperature (C)
  • tdew : daily average dewpoint (C)
  • tdewmin : daily minimum dewpoint (C)
  • tdewmax : daily maximum dewpoint (C)
  • vpd : daily average vapor pressure deficit (Pa)
  • vpdmin : daily minimum vapor pressure deficit (Pa)
  • vpdmax : daily maximum vapor pressure deficit (Pa)
  • rh : daily average relative humidity (%)
  • rhmin : daily minimum relative humidity (%)
  • rhmax : daily maximum relative humidity (%)
  • prcp : daily total precipitation (mm)
  • srad : daily 24-hr average incoming solar radiation (W m-2)
  • wspd : daily average windspeed (m s-1)
  • snwd : snow depth (mm)
  • swe : snow water equivalent (mm)
  • tobs_tmin : time-of-observation for daily tmin (local hr)
  • tobs_tmax : time-of-observation for daily tmax (local hr)
  • tobs_prcp : time-of-observation for daily prcp (local hr)
  • *_mth_raw : USHCN-specific elements. Original, raw monthly elements:
    • tmin_mth_raw (C)
    • tmax_mth_raw (C)
    • tavg_mth_raw (C)
    • prcp_mth_raw (mm)
  • *_mth_tob : USHCN-specific elements. Time-of-observation adjusted elements:
    • tmin_mth_tob (C)
    • tmax_mth_tob (C)
    • tavg_mth_tob (C)
  • *_mth_fls : USHCN-specific elements. Homogenized and infilled elements:
    • tmin_mth_fls (C)
    • tmax_mth_fls (C)
    • tavg_mth_fls (C)
    • prcp_mth_fls (mm)
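
When deciding which provider to query, it can help to encode the provider table and element definitions above as a lookup. The following is a minimal sketch, not part of obsio itself: `PROVIDER_ELEMS` and `providers_supporting` are hypothetical names, and the USHCN wildcards are expanded by hand from the element list above.

```python
# Hypothetical lookup built from the provider table above; not an obsio API.
_TEMP_PRCP = {'tmin', 'tmax', 'prcp'}
_TOBS = {'tobs_tmin', 'tobs_tmax', 'tobs_prcp'}
_HUMIDITY = {'tdew', 'tdewmin', 'tdewmax', 'vpd', 'vpdmin', 'vpdmax',
             'rh', 'rhmin', 'rhmax'}

# USHCN wildcards expanded by hand; note that *_mth_tob has no prcp element.
_USHCN = ({v + '_mth_raw' for v in ('tmin', 'tmax', 'tavg', 'prcp')} |
          {v + '_mth_tob' for v in ('tmin', 'tmax', 'tavg')} |
          {v + '_mth_fls' for v in ('tmin', 'tmax', 'tavg', 'prcp')})

PROVIDER_ELEMS = {
    'ACIS': _TEMP_PRCP | _TOBS,
    'GHCN-D': _TEMP_PRCP | _TOBS,
    'ISDLite': _TEMP_PRCP | _HUMIDITY,
    'MADIS': _TEMP_PRCP | _HUMIDITY | {'srad', 'wspd'},
    'NRCS': _TEMP_PRCP | {'snwd', 'swe'},
    'USHCN': _USHCN,
    'WRCC': _TEMP_PRCP | _HUMIDITY | {'srad', 'wspd'},
}


def providers_supporting(elems):
    """Return the sorted names of providers that list every requested element."""
    wanted = set(elems)
    return sorted(name for name, supported in PROVIDER_ELEMS.items()
                  if wanted <= supported)


print(providers_supporting(['swe']))
print(providers_supporting(['tmin', 'tmax', 'srad']))
```

With the table as given, the first call should report only NRCS and the second MADIS and WRCC.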

Usage

The main entry point for using obsio is ObsIoFactory. ObsIoFactory is used to build ObsIO objects for accessing station metadata and observations from specific providers.

# Example code for accessing NRCS SNOTEL/SCAN observations in the Pacific
# Northwest for January 2015

import obsio
import pandas as pd

# List of elements to obtain
elems = ['tmin', 'tmax', 'swe']

# Lat/Lon bounding box for the Pacific Northwest
bbox = obsio.BBox(west_lon=-126, south_lat=42, east_lon=-111, north_lat=50)

# Start, end dates as pandas Timestamp objects
start_date = pd.Timestamp('2015-01-01')
end_date = pd.Timestamp('2015-01-31')

# Initialize factory with specified parameters
obsiof = obsio.ObsIoFactory(elems, bbox, start_date, end_date)

# Create ObsIO object for accessing daily NRCS observations
nrcs_io = obsiof.create_obsio_dly_nrcs()

# All ObsIO objects contain a stns attribute. This is a pandas DataFrame
# containing metadata for all stations that met the specified parameters.
print(nrcs_io.stns)

# Access observations using read_obs() method. By default, read_obs() will
# return observations for all stations in the stns attribute
obs = nrcs_io.read_obs()

# Observations are provided in a pandas DataFrame. Observation values are
# indexed by a 3 level multi-index: station_id, elem, time
print(obs)

# To access observations for only a few specific stations, send in a list
# of station ids to read_obs()
obs = nrcs_io.read_obs(['11E07S', '11E31S'])
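
Because the observations are indexed by station_id, elem, and time, standard pandas multi-index operations apply to them. The sketch below uses a small synthetic DataFrame in place of a real read_obs() result so that it runs without network access; the `obs_value` column name and the values are assumptions made for illustration only.

```python
import pandas as pd

# Synthetic stand-in for a read_obs() result. The 'obs_value' column name
# and the numbers are illustrative assumptions, not real observations.
idx = pd.MultiIndex.from_product(
    [['11E07S', '11E31S'],
     ['swe', 'tmax', 'tmin'],
     pd.date_range('2015-01-01', periods=3)],
    names=['station_id', 'elem', 'time'])
obs = pd.DataFrame({'obs_value': [float(i) for i in range(len(idx))]},
                   index=idx)

# Select the tmin time series for a single station.
tmin_one = obs.xs(('11E07S', 'tmin'), level=('station_id', 'elem'))

# Pivot to a wide table: one row per (station_id, time), one column per element.
wide = obs['obs_value'].unstack('elem')

print(tmin_one)
print(wide)
```

The wide layout is often more convenient for computing derived quantities such as a daily temperature range from tmin and tmax.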

In contrast to the NRCS SNOTEL/SCAN example, some ObsIO provider objects require all observation data to be downloaded and stored locally before it can be parsed (see the provider table above). The data directory for local storage can be set in advance through an 'OBSIO_DATA' environment variable or passed as a parameter when creating the ObsIO object. If no directory is specified, obsio defaults to a standard temporary directory. Example:

# Example code for accessing GHCN-D observations in the Pacific
# Northwest for January 2015. GHCN-D is a data provider that
# has an option to download and store observations locally for more
# efficient bulk parsing and access.

import obsio
import pandas as pd

# List of elements to obtain
elems = ['tmin', 'tmax']

# Lat/Lon bounding box for the Pacific Northwest
bbox = obsio.BBox(west_lon=-126, south_lat=42, east_lon=-111, north_lat=50)

# Start, end dates as pandas Timestamp objects
start_date = pd.Timestamp('2015-01-01')
end_date = pd.Timestamp('2015-01-31')

# Initialize factory with specified parameters
obsiof = obsio.ObsIoFactory(elems, bbox, start_date, end_date)

# Create ObsIO object for accessing GHCN-D observations in bulk mode.
# A local data path can be specified in the create_obsio_dly_ghcnd() call.
# If not specified, the 'OBSIO_DATA' environment variable will be checked.
# If 'OBSIO_DATA' doesn't exist, a default temporary directory will be used.
ghcnd_io = obsiof.create_obsio_dly_ghcnd(bulk=True)

# Access observations for first 10 stations using the read_obs() method.
# First call to read_obs() will take several minutes due to initial data
# download.
obs = ghcnd_io.read_obs(ghcnd_io.stns.station_id.iloc[0:10])
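
The lookup order for the local data directory described above (explicit parameter, then 'OBSIO_DATA', then a temporary directory) can be sketched as follows. This is an illustration of the documented behavior, not obsio's actual implementation: `resolve_data_path` and its `local_data_path` parameter are hypothetical names, and the use of `tempfile.gettempdir()` for the fallback is an assumption.

```python
import os
import tempfile


def resolve_data_path(local_data_path=None):
    """Hypothetical helper mirroring the documented lookup order."""
    # 1. An explicitly passed path takes precedence.
    if local_data_path is not None:
        return local_data_path
    # 2. Otherwise fall back to the OBSIO_DATA environment variable, if set.
    env_path = os.environ.get('OBSIO_DATA')
    if env_path:
        return env_path
    # 3. Otherwise use the system temporary directory (assumed fallback).
    return tempfile.gettempdir()


print(resolve_data_path())
```

Setting 'OBSIO_DATA' once in a shell profile avoids repeating the initial bulk download, since a temporary directory may be cleared between sessions.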
