# GFS Analysis Simulation Data Access with the Unidata siphon Package

This notebook explores different ways to access GFS Analysis simulations.  The period of record available online for these data sets is 2007-01-01 to present.  The simulations are run 4 times per day; this notebook refers to these runs as model states.  Following each model state, 2 forecast runs are made at 003 and 006 hours; this notebook refers to these as forecasts.  The variables available for the model state and the forecast are slightly different.  The sections below look at which model-state and forecast variables might be used for NHM calibration.

Using <https://unidata.github.io/python-gallery/examples/MSLP_temp_winds.html> as a resource

In [2]:
from datetime import datetime
import cartopy.crs as ccrs
import cartopy.feature as cfeature
import matplotlib.pyplot as plt
from metpy.units import units
from netCDF4 import num2date
import numpy as np
import scipy.ndimage as ndimage
from siphon.ncss import NCSS

# Open Model-state Simulation

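Model states are run 4 times per day at 00, 06, 12, and 18 hours (UTC I believe).  In the file names the run time appears as _0000, _0600, _1200, or _1800, and the trailing _000 marks the model state.  Below we load the first daily simulation, _0000_000.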

The available variables are printed and the likely variables to be used in the NHM are queried.  Another view of this can be found by going to the netCDFSubset catalog at <https://www.ncei.noaa.gov/thredds/catalog/model-gfs-g4-anl-files/catalog.html> and then picking a date.
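The file naming used in this notebook can be captured in a small helper. The sketch below is only an illustration: `gfsanl_url` and `daily_cycle_urls` are hypothetical names, and it assumes the files for the 06, 12, and 18 hour runs follow the same pattern as the 00 hour file opened in this notebook.

```python
from datetime import datetime, timedelta

BASE_URL = 'https://www.ncei.noaa.gov/thredds/ncss/model-gfs-g4-anl-files/'


def gfsanl_url(cycle, forecast_hour=0, base_url=BASE_URL):
    """Build the NCSS URL for one GFS analysis file (hypothetical helper).

    `cycle` is the model-state time and `forecast_hour` is the file suffix:
    0 for the model state, 3 or 6 for the forecasts.
    """
    return ('{base}{dt:%Y%m}/{dt:%Y%m%d}/gfsanl_4_{dt:%Y%m%d}'
            '_{dt:%H}00_{fh:03d}.grb2'.format(base=base_url, dt=cycle,
                                              fh=forecast_hour))


def daily_cycle_urls(day, forecast_hour=0):
    """Return the URLs for the four daily runs, assuming 6-hourly cycles."""
    start = datetime(day.year, day.month, day.day)
    return [gfsanl_url(start + timedelta(hours=h), forecast_hour)
            for h in (0, 6, 12, 18)]


for url in daily_cycle_urls(datetime(2020, 5, 6)):
    print(url)
```

Each URL can be passed to `NCSS(...)` in the same way as the single URL built in the cell below.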

In [3]:
base_url = 'https://www.ncei.noaa.gov/thredds/ncss/model-gfs-g4-anl-files/'
dt = datetime(2020, 5, 6, 0)
# Build the file URL once so the printed URL always matches the one opened
url = ('{}{dt:%Y%m}/{dt:%Y%m%d}/gfsanl_4_{dt:%Y%m%d}'
       '_{dt:%H}00_000.grb2'.format(base_url, dt=dt))
ncss = NCSS(url)
print(url)

# Start a query for the model-state time
query = ncss.query().time(dt)
# Lat/lon bounding box; this is the same boundary I use for the daymet data
query.lonlat_box(north=54, south=20, east=-65, west=-126)
query.accept('netcdf')
# List every variable available in this file
print(ncss.variables)
# Request the model "surface" variables
query.variables('Temperature_surface',
                'Relative_humidity_sigma',
                'Precipitation_rate_surface',
                'u-component_of_wind_sigma',
                'v-component_of_wind_sigma')

# data = ncss.get_data(query)

https://www.ncei.noaa.gov/thredds/ncss/model-gfs-g4-anl-files/202005/20200506/gfsanl_4_20200506_0000_000.grb2
{'Pressure_potential_vorticity_surface', 'Best_4_layer_Lifted_Index_surface', 'v-component_of_wind_planetary_boundary', 'u-component_of_wind_altitude_above_msl', 'Pressure_maximum_wind', 'Temperature_altitude_above_msl', 'Categorical_Ice_Pellets_surface', 'v-component_of_wind_isobaric', 'u-component_of_wind_sigma', 'Geopotential_height_potential_vorticity_surface', 'Volumetric_Soil_Moisture_Content_depth_below_surface_layer', 'Ventilation_Rate_planetary_boundary', 'Sunshine_Duration_surface', 'u-component_of_wind_maximum_wind', 'Categorical_Freezing_Rain_surface', 'v-component_of_wind_height_above_ground', 'Temperature_maximum_wind', 'Ice_cover_surface', 'Pressure_of_level_from_which_parcel_was_lifted_pressure_difference_layer', 'Planetary_Boundary_Layer_Height_surface', 'Land_cover_0__sea_1__land_surface', 'Pressure_height_above_ground', 'Total_cloud_cover_isobaric', 'u-compon

var=v-component_of_wind_sigma&var=Precipitation_rate_surface&var=Temperature_surface&var=u-component_of_wind_sigma&var=Relative_humidity_sigma&time=2020-05-06T00%3A00%3A00&west=-126&east=-65&south=20&north=54&accept=netcdf
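The last line of output above is the query string that gets appended to the file URL. As a rough illustration of how those parameters fit together, the sketch below rebuilds an equivalent string with the standard library only. `ncss_query_string` is a hypothetical helper, not part of siphon, and it does not reproduce siphon's internals; the order of the `var` entries differs from the printed output, which should not matter for the request.

```python
from datetime import datetime
from urllib.parse import urlencode


def ncss_query_string(variables, time, north, south, east, west,
                      accept='netcdf'):
    """Assemble NCSS-style request parameters (hypothetical helper)."""
    # One 'var' entry per requested variable
    params = [('var', name) for name in variables]
    # Time, bounding box, and return format, in the order printed above
    params += [('time', time.isoformat()),
               ('west', west), ('east', east),
               ('south', south), ('north', north),
               ('accept', accept)]
    return urlencode(params)


qs = ncss_query_string(['Temperature_surface',
                        'Relative_humidity_sigma',
                        'Precipitation_rate_surface',
                        'u-component_of_wind_sigma',
                        'v-component_of_wind_sigma'],
                       datetime(2020, 5, 6, 0),
                       north=54, south=20, east=-65, west=-126)
print(qs)
```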

# Open Model-forecast Simulation

Model states are run 4 times per day (UTC I believe).  For each model state, 2 forecast simulations are run, with file suffixes _003 and _006.  These forecasts have variables not available in the model-state simulations that could be used to get daily tmax, tmin, and precip for NHM runs.

The available variables are printed and the likely variables to be used in the NHM are queried.  Another view of this can be found by going to the netCDFSubset catalog at <https://www.ncei.noaa.gov/thredds/catalog/model-gfs-g4-anl-files/catalog.html> and then picking a date and forecast result.  Note the _003.grb2 in the URL below.
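One possible way to turn per-window forecast fields into the daily tmax, tmin, and precip mentioned above is sketched here with synthetic arrays. This is an assumption-laden illustration, not a tested NHM workflow: `daily_summary` is a hypothetical helper, the inputs are assumed to be plain NumPy grids on a common lat/lon grid, and the windows are assumed to be non-overlapping and to cover the whole day. Whether the _006 files hold 3-hour or longer windows should be checked against `ncss.variables` before combining them.

```python
import numpy as np


def daily_summary(tmax_windows, tmin_windows, precip_windows):
    """Collapse per-window 2-D grids into daily grids (hypothetical helper).

    Each argument is a sequence of (lat, lon) arrays, one per
    non-overlapping window within the day.
    """
    tmax = np.max(np.stack(tmax_windows), axis=0)      # warmest window maximum
    tmin = np.min(np.stack(tmin_windows), axis=0)      # coldest window minimum
    precip = np.sum(np.stack(precip_windows), axis=0)  # total accumulation
    return tmax, tmin, precip


# Synthetic 2x2 grids standing in for two windows of real data
tmax_w = [np.array([[280.0, 285.0], [290.0, 295.0]]),
          np.array([[282.0, 284.0], [291.0, 293.0]])]
tmin_w = [np.array([[270.0, 275.0], [272.0, 278.0]]),
          np.array([[268.0, 276.0], [273.0, 277.0]])]
precip_w = [np.array([[0.0, 1.0], [2.0, 0.0]]),
            np.array([[1.0, 0.5], [0.0, 0.0]])]

tmax, tmin, precip = daily_summary(tmax_w, tmin_w, precip_w)
print(tmax)
print(tmin)
print(precip)
```

Arrays read through netCDF4 may be masked arrays, in which case `np.ma.stack` would be the safer choice for keeping the mask.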

In [4]:
base_url = 'https://www.ncei.noaa.gov/thredds/ncss/model-gfs-g4-anl-files/'
dt = datetime(2020, 5, 6, 0)
# Build the file URL once so the printed URL always matches the one opened
url = ('{}{dt:%Y%m}/{dt:%Y%m%d}/gfsanl_4_{dt:%Y%m%d}'
       '_{dt:%H}00_003.grb2'.format(base_url, dt=dt))
ncss = NCSS(url)
print(url)

# Select the forecast time (3 hours after the model state)
# Both of the time selections below work
query = ncss.query().time_range(datetime(2020, 5, 6, 3), datetime(2020, 5, 6, 3))
# query = ncss.query().time(datetime(2020, 5, 6, 3))
# Lat/lon bounding box for the area of interest
query.lonlat_box(north=54, south=20, east=-65, west=-126)
query.accept('netcdf')
# List every variable available in this file
print(ncss.variables)
# Request the forecast variables of interest
query.variables('Maximum_temperature_height_above_ground_3_Hour_Maximum',
                'Minimum_temperature_height_above_ground_3_Hour_Minimum',
                'Relative_humidity_sigma',
                'Total_precipitation_surface_3_Hour_Accumulation',
                'u-component_of_wind_sigma',
                'v-component_of_wind_sigma')

data = ncss.get_data(query)

https://www.ncei.noaa.gov/thredds/ncss/model-gfs-g4-anl-files/202005/20200506/gfsanl_4_20200506_0000_003.grb2
{'Minimum_temperature_height_above_ground_3_Hour_Minimum', 'Meridional_Flux_of_Gravity_Wave_Stress_surface_3_Hour_Average', 'v-component_of_wind_planetary_boundary', 'u-component_of_wind_altitude_above_msl', 'Temperature_altitude_above_msl', 'Categorical_Ice_Pellets_surface', 'u-component_of_wind_sigma', 'Volumetric_Soil_Moisture_Content_depth_below_surface_layer', 'Sunshine_Duration_surface', 'Categorical_Freezing_Rain_surface_3_Hour_Average', 'Categorical_Freezing_Rain_surface', 'v-component_of_wind_height_above_ground', 'Precipitation_rate_surface_3_Hour_Average', 'Ice_cover_surface', 'Pressure_middle_cloud_bottom_3_Hour_Average', 'Upward_Short-Wave_Radiation_Flux_atmosphere_top_3_Hour_Average', 'Total_cloud_cover_isobaric', 'v-component_of_wind_potential_vorticity_surface', 'Relative_humidity_sigma', 'Sensible_heat_net_flux_surface_3_Hour_Average', 'Pressure_surface', 'Conv

In [5]:
data


<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF3_CLASSIC data model, file format NETCDF3):
    Originating_or_generating_Center: US National Weather Service, National Centres for Environmental Prediction (NCEP)
    Originating_or_generating_Subcenter: 0
    GRIB_table_version: 2,1
    Type_of_generating_process: Forecast
    Analysis_or_forecast_generating_process_identifier_defined_by_originating_centre: Global Forecast System Model T1534 - Forecast hours 00-384 T574 - Forecast hours 00-192 T190 - Forecast hours 204-384
    Conventions: CF-1.6
    history: Read using CDM IOSP GribCollection v3
    featureType: GRID
    History: Translated to CF-1.0 Conventions by Netcdf-Java CDM (CFGridWriter2)
Original Dataset = /san5302/nexus/gfsanl/202005/20200506/gfsanl_4_20200506_0000_003.grb2; Translation Date = 2020-05-19T14:03:06.918Z
    geospatial_lat_min: 20.0
    geospatial_lat_max: 54.0
    geospatial_lon_min: -126.0
    geospatial_lon_max: -65.0
    dimensions(sizes): time(1), heig