**Brian Blaylock**


# Importing and using the `HRRR_archive.py` module.

Before you can import a function from `HRRR_archive.py`, Python needs to know where the module is located. If the file is in your current working directory, you can import the functions directly with `from HRRR_archive import download_HRRR`. If it is somewhere else, add its directory to Python's module search path with `sys.path.append`.

    import sys
    sys.path.append('../') # tell Python to look for the HRRR_archive module back one directory

    from HRRR_archive import download_HRRR, get_HRRR
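
A relative entry such as `'../'` is resolved against whatever the current working directory happens to be, so it can break if the notebook is started from a different folder. A minimal sketch of a more defensive variant follows. It assumes the module sits one directory above the working directory, and `add_to_path` is a hypothetical helper for illustration, not part of `HRRR_archive.py`.

```python
import sys
from pathlib import Path


def add_to_path(directory):
    """Append an absolute version of `directory` to sys.path, only once.

    Hypothetical helper for illustration; returns the resolved path string.
    """
    resolved = str(Path(directory).resolve())
    if resolved not in sys.path:
        sys.path.append(resolved)
    return resolved


# Assumption: HRRR_archive.py lives one directory above the working directory.
module_dir = add_to_path('../')
# After this, `from HRRR_archive import download_HRRR, get_HRRR` can find the module.
```

Storing the absolute path also avoids piling up duplicate entries when a notebook cell is re-run.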


# Example #1
## Download full HRRR files

- for every hour between two dates
- only get analyses (F00)
- get the surface, 'sfc', fields (not the pressure, 'prs', fields, which are much larger files).


In [1]:
from datetime import datetime
from pandas import date_range

import sys
sys.path.append('../') # tell python to look back one directory for HRRR_archive

from HRRR_archive import download_HRRR

DATES = date_range('2019-7-4 00:00', '2019-7-4 03:00', freq='1H')

download_HRRR(DATES, fxx=[0], field='sfc')

💡 Info: Downloading 4 GRIB2 files

✅ Success! Downloaded https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t00z.wrfsfcf00.grib2 as ./20190704_hrrr.t00z.wrfsfcf00.grib2
✅ Success! Downloaded https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t01z.wrfsfcf00.grib2 as ./20190704_hrrr.t01z.wrfsfcf00.grib2
✅ Success! Downloaded https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t02z.wrfsfcf00.grib2 as ./20190704_hrrr.t02z.wrfsfcf00.grib2
✅ Success! Downloaded https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t03z.wrfsfcf00.grib2 as ./20190704_hrrr.t03z.wrfsfcf00.grib2

Finished 🍦


[['./20190704_hrrr.t00z.wrfsfcf00.grib2',
  './20190704_hrrr.t01z.wrfsfcf00.grib2',
  './20190704_hrrr.t02z.wrfsfcf00.grib2',
  './20190704_hrrr.t03z.wrfsfcf00.grib2'],
 ['https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t00z.wrfsfcf00.grib2',
  'https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t01z.wrfsfcf00.grib2',
  'https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t02z.wrfsfcf00.grib2',
  'https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t03z.wrfsfcf00.grib2']]
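
The URLs and local file names in the output above follow a fixed pattern built from the run date, run hour, field, and forecast hour. The sketch below rebuilds that pattern for one run. `grib_name`, `pando_url`, and `local_name` are hypothetical helpers written only to mirror the strings shown above, they are not functions in `HRRR_archive.py`, and only the `'sfc'` pattern is confirmed by this output.

```python
from datetime import datetime

# Base URL as it appears in the output above.
PANDO = 'https://pando-rgw01.chpc.utah.edu/hrrr'


def grib_name(run, fxx, field='sfc'):
    """File name for one model run hour and forecast lead time."""
    return f'hrrr.t{run:%H}z.wrf{field}f{fxx:02d}.grib2'


def pando_url(run, fxx, field='sfc'):
    """Remote location of the file, following the pattern in the output."""
    return f'{PANDO}/{field}/{run:%Y%m%d}/{grib_name(run, fxx, field)}'


def local_name(run, fxx, field='sfc'):
    """Local name of a full-file download: the run date is prefixed."""
    return f'./{run:%Y%m%d}_{grib_name(run, fxx, field)}'


run = datetime(2019, 7, 4, 3)
print(pando_url(run, 0))
print(local_name(run, 0))
```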

### Use the `dryrun=True` option to see what will happen without actually downloading anything

In [2]:
from datetime import datetime
from pandas import date_range

import sys
sys.path.append('../') # tell python to look back one directory for HRRR_archive

from HRRR_archive import download_HRRR

DATES = date_range('2019-7-4 00:00', '2019-7-4 03:00', freq='1H')

download_HRRR(DATES, fxx=[0], field='sfc', dryrun=True)

🌵 Info: Dry Run 4 GRIB2 files

🌵 Dry Run Success! Would have downloaded https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t00z.wrfsfcf00.grib2 as ./20190704_hrrr.t00z.wrfsfcf00.grib2
🌵 Dry Run Success! Would have downloaded https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t01z.wrfsfcf00.grib2 as ./20190704_hrrr.t01z.wrfsfcf00.grib2
🌵 Dry Run Success! Would have downloaded https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t02z.wrfsfcf00.grib2 as ./20190704_hrrr.t02z.wrfsfcf00.grib2
🌵 Dry Run Success! Would have downloaded https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t03z.wrfsfcf00.grib2 as ./20190704_hrrr.t03z.wrfsfcf00.grib2

Finished 🍦


[[None, None, None, None],
 ['https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t00z.wrfsfcf00.grib2',
  'https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t01z.wrfsfcf00.grib2',
  'https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t02z.wrfsfcf00.grib2',
  'https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t03z.wrfsfcf00.grib2']]
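
Judging from the output above, the return value is a pair of equal-length lists: local file names first (all `None` in a dry run, because nothing was written) and source URLs second. A small sketch of how that structure could be unpacked, using the first two entries copied from the output rather than a live call:

```python
# Return value copied from the dry run above (first two entries only).
files, urls = [
    [None, None],
    ['https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t00z.wrfsfcf00.grib2',
     'https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t01z.wrfsfcf00.grib2'],
]

# With dryrun=True every local entry is None.
nothing_written = all(f is None for f in files)

# Pair each URL with its local file (or a placeholder when there is none).
for url, f in zip(urls, files):
    status = 'not downloaded' if f is None else f
    print(url.rsplit('/', 1)[-1], '->', status)
```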

### Set `verbose=False` to print less output to the screen

In [3]:
from datetime import datetime
from pandas import date_range

import sys
sys.path.append('../') # tell python to look back one directory for HRRR_archive

from HRRR_archive import download_HRRR

DATES = date_range('2019-7-4 00:00', '2019-7-4 03:00', freq='1H')

download_HRRR(DATES, fxx=[0], field='sfc', verbose=False)

💡 Info: Downloading 4 GRIB2 files

 Download Progress: 100.00% of 104.4 MB/pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t03z.wrfsfcf00.grib2

[['./20190704_hrrr.t00z.wrfsfcf00.grib2',
  './20190704_hrrr.t01z.wrfsfcf00.grib2',
  './20190704_hrrr.t02z.wrfsfcf00.grib2',
  './20190704_hrrr.t03z.wrfsfcf00.grib2'],
 ['https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t00z.wrfsfcf00.grib2',
  'https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t01z.wrfsfcf00.grib2',
  'https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t02z.wrfsfcf00.grib2',
  'https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t03z.wrfsfcf00.grib2']]

# Example #2
## Download subset of fields from files
- same as above
- except only get the 10-meter u and v wind components

In [4]:
from datetime import datetime
from pandas import date_range

import sys
sys.path.append('../') # tell python to look back one directory for HRRR_archive

from HRRR_archive import download_HRRR

DATES = date_range('2019-7-4 00:00', '2019-7-4 03:00', freq='1H')
download_HRRR(DATES, searchString=':(U|V)GRD:10 m',
              fxx=[0], field='sfc')

💡 Info: Downloading 4 GRIB2 files

Download subset from [pando]:
  Downloading GRIB line [ 71]: variable=UGRD, level=10 m above ground, forecast=anl
  Downloading GRIB line [ 72]: variable=VGRD, level=10 m above ground, forecast=anl
✅ Success! Searched for [:(U|V)GRD:10 m] and got [2] GRIB fields and saved as ./subset_20190704_hrrr.t00z.wrfsfcf00.grib2
Download subset from [pando]:
  Downloading GRIB line [ 71]: variable=UGRD, level=10 m above ground, forecast=anl
  Downloading GRIB line [ 72]: variable=VGRD, level=10 m above ground, forecast=anl
✅ Success! Searched for [:(U|V)GRD:10 m] and got [2] GRIB fields and saved as ./subset_20190704_hrrr.t01z.wrfsfcf00.grib2
Download subset from [pando]:
  Downloading GRIB line [ 71]: variable=UGRD, level=10 m above ground, forecast=anl
  Downloading GRIB line [ 72]: variable=VGRD, level=10 m above ground, forecast=anl
✅ Success! Searched for [:(U|V)GRD:10 m] and got [2] GRIB fields and saved as ./subset_20190704_hrrr.t02z.wrfsfcf00.grib2
Downl

[['./subset_20190704_hrrr.t00z.wrfsfcf00.grib2',
  './subset_20190704_hrrr.t01z.wrfsfcf00.grib2',
  './subset_20190704_hrrr.t02z.wrfsfcf00.grib2',
  './subset_20190704_hrrr.t03z.wrfsfcf00.grib2'],
 ['https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t00z.wrfsfcf00.grib2',
  'https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t01z.wrfsfcf00.grib2',
  'https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t02z.wrfsfcf00.grib2',
  'https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t03z.wrfsfcf00.grib2']]
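
The `searchString` reads like a regular expression, so it helps to see what a pattern would select before downloading. The sketch below assumes the pattern is searched against index lines in the wgrib2 inventory style; that is an assumption about how the module works, not something this output confirms. Variable names, levels, and line numbers are taken from the log output in this notebook, while the byte offsets in the sample lines are made up.

```python
import re

# Sample index lines. Byte offsets (second column) are invented for illustration.
SAMPLE_INDEX = """\
66:1000:d=2019070400:TMP:2 m above ground:anl:
69:2000:d=2019070400:DPT:2 m above ground:anl:
70:2500:d=2019070400:RH:2 m above ground:anl:
71:3000:d=2019070400:UGRD:10 m above ground:anl:
72:4000:d=2019070400:VGRD:10 m above ground:anl:
"""


def matching_lines(search_string, index_text=SAMPLE_INDEX):
    """Return the GRIB line numbers whose index entry matches the regex."""
    pattern = re.compile(search_string)
    return [int(line.split(':', 1)[0])
            for line in index_text.splitlines()
            if pattern.search(line)]


print(matching_lines(':(U|V)GRD:10 m'))  # 10 m wind components
print(matching_lines(':2 m'))            # everything at 2 m above ground
print(matching_lines(':(TMP|DPT):2 m'))  # 2 m temperature and dew point only
```

The leading colon anchors the match to the start of a field, so `':2 m'` does not accidentally match a level that merely ends in "2 m".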

### With the `verbose=False` and `dryrun=True` options

In [10]:
from datetime import datetime
from pandas import date_range

import sys
sys.path.append('../') # tell python to look back one directory for HRRR_archive

from HRRR_archive import download_HRRR

DATES = date_range('2019-7-4 00:00', '2019-7-4 03:00', freq='1H')
download_HRRR(DATES, searchString=':(U|V)GRD:10 m',
              fxx=[0], field='sfc', verbose=False, dryrun=True)

🌵 Info: Dry Run 4 GRIB2 files

 Download Progress: (4/4) files https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t03z.wrfsfcf00.grib2

[[None, None, None, None],
 ['https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t00z.wrfsfcf00.grib2',
  'https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t01z.wrfsfcf00.grib2',
  'https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t02z.wrfsfcf00.grib2',
  'https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20190704/hrrr.t03z.wrfsfcf00.grib2']]

# Example #3
## Download and read fields for 2 m above the ground as an xarray Dataset

In [5]:
from datetime import datetime

import sys
sys.path.append('../') # tell python to look back one directory for HRRR_archive

from HRRR_archive import get_HRRR

DATE = datetime(2019, 7, 4)
get_HRRR(DATE, searchString=':2 m', fxx=0, field='sfc', remove_grib2=True)


💡 Info: Downloading 1 GRIB2 files

Download subset from [pando]:
  Downloading GRIB line [ 66]: variable=TMP, level=2 m above ground, forecast=anl
  Downloading GRIB line [ 67]: variable=POT, level=2 m above ground, forecast=anl
  Downloading GRIB line [ 68]: variable=SPFH, level=2 m above ground, forecast=anl
  Downloading GRIB line [ 69]: variable=DPT, level=2 m above ground, forecast=anl
  Downloading GRIB line [ 70]: variable=RH, level=2 m above ground, forecast=anl
✅ Success! Searched for [:2 m] and got [5] GRIB fields and saved as ./subset_20190704_hrrr.t00z.wrfsfcf00.grib2

Finished 🍦


# Example #4
## Files download from NOMADS instead of Pando if the requested datetime is yesterday or today
HRRR files for runs from today and yesterday are available on the NOMADS server, so files for those dates are downloaded from NOMADS instead of Pando.

In [6]:
from datetime import datetime, timedelta
from pandas import date_range

import sys
sys.path.append('../') # tell python to look back one directory for HRRR_archive

from HRRR_archive import download_HRRR, get_HRRR

In [7]:
today = datetime.utcnow()
oneDayAgo = datetime(today.year, today.month, today.day) - timedelta(days=1)
twoDaysAgo = datetime(today.year, today.month, today.day) - timedelta(days=2)

sDATE = datetime(twoDaysAgo.year, twoDaysAgo.month, twoDaysAgo.day, 22)
eDATE = datetime(oneDayAgo.year, oneDayAgo.month, oneDayAgo.day, 2)

# This range of dates will get files from Pando and NOMADS
DATES = date_range(sDATE, eDATE, freq='1H')
DATES

DatetimeIndex(['2020-06-27 22:00:00', '2020-06-27 23:00:00',
               '2020-06-28 00:00:00', '2020-06-28 01:00:00',
               '2020-06-28 02:00:00'],
              dtype='datetime64[ns]', freq='H')

In [8]:
download_HRRR(DATES, searchString='(TMP|DPT):500 mb')

💡 Info: Downloading 5 GRIB2 files

Download subset from [nomads]:
  Downloading GRIB line [ 14]: variable=TMP, level=500 mb, forecast=anl
  Downloading GRIB line [ 15]: variable=DPT, level=500 mb, forecast=anl
✅ Success! Searched for [(TMP|DPT):500 mb] and got [2] GRIB fields and saved as ./subset_20200627_hrrr.t22z.wrfsfcf00.grib2
Download subset from [nomads]:
  Downloading GRIB line [ 14]: variable=TMP, level=500 mb, forecast=anl
  Downloading GRIB line [ 15]: variable=DPT, level=500 mb, forecast=anl
✅ Success! Searched for [(TMP|DPT):500 mb] and got [2] GRIB fields and saved as ./subset_20200627_hrrr.t23z.wrfsfcf00.grib2
Download subset from [nomads]:
  Downloading GRIB line [ 14]: variable=TMP, level=500 mb, forecast=anl
  Downloading GRIB line [ 15]: variable=DPT, level=500 mb, forecast=anl
✅ Success! Searched for [(TMP|DPT):500 mb] and got [2] GRIB fields and saved as ./subset_20200628_hrrr.t00z.wrfsfcf00.grib2
Download subset from [nomads]:
  Downloading GRIB line [ 14]: variab

[['./subset_20200627_hrrr.t22z.wrfsfcf00.grib2',
  './subset_20200627_hrrr.t23z.wrfsfcf00.grib2',
  './subset_20200628_hrrr.t00z.wrfsfcf00.grib2',
  './subset_20200628_hrrr.t01z.wrfsfcf00.grib2',
  './subset_20200628_hrrr.t02z.wrfsfcf00.grib2'],
 ['https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20200627/hrrr.t22z.wrfsfcf00.grib2',
  'https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20200627/hrrr.t23z.wrfsfcf00.grib2',
  'https://nomads.ncep.noaa.gov/pub/data/nccf/com/hrrr/prod/hrrr.20200628/conus/hrrr.t00z.wrfsfcf00.grib2',
  'https://nomads.ncep.noaa.gov/pub/data/nccf/com/hrrr/prod/hrrr.20200628/conus/hrrr.t01z.wrfsfcf00.grib2',
  'https://nomads.ncep.noaa.gov/pub/data/nccf/com/hrrr/prod/hrrr.20200628/conus/hrrr.t02z.wrfsfcf00.grib2']]
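
The returned URLs above switch from Pando to NOMADS at the date boundary. A minimal sketch of that selection rule, as described in this example, is shown below. `pick_source` and `source_url` are hypothetical names, the URL patterns are copied from the output above, and the fixed "current" time of 2020-06-29 is an assumption inferred from the dates in that output.

```python
from datetime import datetime, timedelta

PANDO = 'https://pando-rgw01.chpc.utah.edu/hrrr'
NOMADS = 'https://nomads.ncep.noaa.gov/pub/data/nccf/com/hrrr/prod'


def pick_source(run, now):
    """Return 'nomads' for runs dated today or yesterday, otherwise 'pando'."""
    yesterday = (now - timedelta(days=1)).date()
    return 'nomads' if run.date() >= yesterday else 'pando'


def source_url(run, now, fxx=0, field='sfc'):
    """Build the URL for one run, following the two patterns in the output."""
    name = f'hrrr.t{run:%H}z.wrf{field}f{fxx:02d}.grib2'
    if pick_source(run, now) == 'nomads':
        return f'{NOMADS}/hrrr.{run:%Y%m%d}/conus/{name}'
    return f'{PANDO}/{field}/{run:%Y%m%d}/{name}'


# Fixed "current" time so the example is repeatable (assumed, see above).
now = datetime(2020, 6, 29, 12)
for run in (datetime(2020, 6, 27, 23), datetime(2020, 6, 28, 0)):
    print(pick_source(run, now), source_url(run, now))
```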

---
Similarly, if we open a file from yesterday, it is automatically downloaded from NOMADS.

In [9]:
get_HRRR(eDATE, 'TMP:2 m')

💡 Info: Downloading 1 GRIB2 files

Download subset from [nomads]:
  Downloading GRIB line [ 66]: variable=TMP, level=2 m above ground, forecast=anl
✅ Success! Searched for [TMP:2 m] and got [1] GRIB fields and saved as ./subset_20200628_hrrr.t02z.wrfsfcf00.grib2

Finished 🍦


# Example #5
## Specify certain datetimes
I really like the pandas `date_range` function for making lists of datetimes. Below are some examples to customize what you want.

You can give a start and end time either as a string or as a standard datetime object.

    from pandas import date_range
    from datetime import datetime
    
    date_range('2020-01-01 00:00', '2020-02-01 00:00')
    
    date_range(datetime(2020, 1, 1), datetime(2020, 2, 1))

In [2]:
from pandas import date_range

In [5]:
# I want a whole month of data...every hourly HRRR run in a month.
date_range('2020-01-01', '2020-02-01', freq='1H')

DatetimeIndex(['2020-01-01 00:00:00', '2020-01-01 01:00:00',
               '2020-01-01 02:00:00', '2020-01-01 03:00:00',
               '2020-01-01 04:00:00', '2020-01-01 05:00:00',
               '2020-01-01 06:00:00', '2020-01-01 07:00:00',
               '2020-01-01 08:00:00', '2020-01-01 09:00:00',
               ...
               '2020-01-31 15:00:00', '2020-01-31 16:00:00',
               '2020-01-31 17:00:00', '2020-01-31 18:00:00',
               '2020-01-31 19:00:00', '2020-01-31 20:00:00',
               '2020-01-31 21:00:00', '2020-01-31 22:00:00',
               '2020-01-31 23:00:00', '2020-02-01 00:00:00'],
              dtype='datetime64[ns]', length=745, freq='H')

In [6]:
# I only want the 1800 UTC run for a month of data
date_range('2020-01-01 18:00', '2020-02-01 18:00', freq='24H')

DatetimeIndex(['2020-01-01 18:00:00', '2020-01-02 18:00:00',
               '2020-01-03 18:00:00', '2020-01-04 18:00:00',
               '2020-01-05 18:00:00', '2020-01-06 18:00:00',
               '2020-01-07 18:00:00', '2020-01-08 18:00:00',
               '2020-01-09 18:00:00', '2020-01-10 18:00:00',
               '2020-01-11 18:00:00', '2020-01-12 18:00:00',
               '2020-01-13 18:00:00', '2020-01-14 18:00:00',
               '2020-01-15 18:00:00', '2020-01-16 18:00:00',
               '2020-01-17 18:00:00', '2020-01-18 18:00:00',
               '2020-01-19 18:00:00', '2020-01-20 18:00:00',
               '2020-01-21 18:00:00', '2020-01-22 18:00:00',
               '2020-01-23 18:00:00', '2020-01-24 18:00:00',
               '2020-01-25 18:00:00', '2020-01-26 18:00:00',
               '2020-01-27 18:00:00', '2020-01-28 18:00:00',
               '2020-01-29 18:00:00', '2020-01-30 18:00:00',
               '2020-01-31 18:00:00', '2020-02-01 18:00:00'],
              dtype='da

In [9]:
# I want the model runs every 12 hours for a week...
date_range('2020-01-01 00:00', '2020-01-07 00:00', freq='12H')

DatetimeIndex(['2020-01-01 00:00:00', '2020-01-01 12:00:00',
               '2020-01-02 00:00:00', '2020-01-02 12:00:00',
               '2020-01-03 00:00:00', '2020-01-03 12:00:00',
               '2020-01-04 00:00:00', '2020-01-04 12:00:00',
               '2020-01-05 00:00:00', '2020-01-05 12:00:00',
               '2020-01-06 00:00:00', '2020-01-06 12:00:00',
               '2020-01-07 00:00:00'],
              dtype='datetime64[ns]', freq='12H')
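
To make explicit what `date_range` generates in the three cells above (a fixed step with both endpoints included), here is a sketch using only the standard library. `datetime_range` is a hypothetical stand-in for illustration and is not meant to replace pandas.

```python
from datetime import datetime, timedelta


def datetime_range(start, end, step_hours):
    """List datetimes from start to end (both inclusive) every step_hours hours."""
    step = timedelta(hours=step_hours)
    count = (end - start) // step + 1
    return [start + i * step for i in range(count)]


hourly = datetime_range(datetime(2020, 1, 1), datetime(2020, 2, 1), 1)
daily_18z = datetime_range(datetime(2020, 1, 1, 18), datetime(2020, 2, 1, 18), 24)
twice_daily = datetime_range(datetime(2020, 1, 1), datetime(2020, 1, 7), 12)

print(len(hourly), len(daily_18z), len(twice_daily))  # 745 32 13
```

The hourly count of 745 matches the `length=745` reported by pandas above. Because the end time is included, a month-long request also fetches the first run of the following month.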

### `runDATE` versus `validDATE`
Sometimes you want to get the data in terms of when the model is *valid* rather than when it is *initialized*. In this case, you need to offset the datetime by the forecast hour, using a loop when there is more than one valid time or lead time.

For example, if you want the 6-hour forecast valid at 18:00 UTC, you need to get the F06 from the 12:00 UTC run.

In [22]:
from datetime import datetime, timedelta
import sys
sys.path.append('../') # tell python to look back one directory for HRRR_archive
from HRRR_archive import download_HRRR


validDATE = datetime(2020, 1, 1, 18)
lead_time = 6

runDATE = validDATE - timedelta(hours=lead_time)

print('Valid Date:', validDATE)
print('  Run Date:', runDATE, f'  Lead Time: F{lead_time:02d}')

download_HRRR(runDATE, fxx=lead_time, dryrun=True)

Valid Date: 2020-01-01 18:00:00
  Run Date: 2020-01-01 12:00:00   Lead Time: F06
🌵 Info: Dry Run 1 GRIB2 files

🌵 Dry Run Success! Would have downloaded https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20200101/hrrr.t12z.wrfsfcf06.grib2 as ./20200101_hrrr.t12z.wrfsfcf06.grib2

Finished 🍦


(None,
 'https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20200101/hrrr.t12z.wrfsfcf06.grib2')
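
The same subtraction works for several lead times at once, and it can cross a date boundary. The sketch below lists the run and lead-time pairs that verify at one valid time and rebuilds the local file name from the run time. `runs_valid_at` and `local_name` are hypothetical helpers, and the file name pattern is copied from the dry-run message above.

```python
from datetime import datetime, timedelta


def runs_valid_at(valid, lead_times):
    """Return (run datetime, lead time) pairs whose forecast is valid at `valid`."""
    return [(valid - timedelta(hours=fxx), fxx) for fxx in lead_times]


def local_name(run, fxx, field='sfc'):
    """Local file name, which carries the run (initialization) time."""
    return f'./{run:%Y%m%d}_hrrr.t{run:%H}z.wrf{field}f{fxx:02d}.grib2'


valid = datetime(2020, 1, 1, 18)
for run, fxx in runs_valid_at(valid, [6, 12, 18]):
    print(f'F{fxx:02d} from run {run} -> {local_name(run, fxx)}')

# A valid time at 00:00 pulls from runs on the previous day.
print(runs_valid_at(datetime(2020, 1, 1, 0), [12]))
```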

##### Looping
If you want all the forecasts in a day, based on the valid times instead of the initialization times, you can use a loop.
> Note: the datetime in the filename of the GRIB2 file downloaded will be in terms of the model's initialization time. 

In [26]:
import sys
sys.path.append('../')
from HRRR_archive import download_HRRR
from datetime import timedelta
from pandas import date_range

validDATES = date_range('2020-01-01', '2020-01-02', freq='6H')
fxx = [12, 18]
searchString = '((U|V)GRD:10 m|TMP:2 m|APCP)'

for D in validDATES:
    for F in fxx:
        runDATE = D - timedelta(hours=F)
        download_HRRR(runDATE, searchString=searchString, fxx=F, dryrun=True)

🌵 Info: Dry Run 1 GRIB2 files

Download subset from [pando]:
    🐫 Dry Run: Found GRIB line [ 66]: variable=TMP, level=2 m above ground, forecast=12 hour fcst
    🐫 Dry Run: Found GRIB line [ 71]: variable=UGRD, level=10 m above ground, forecast=12 hour fcst
    🐫 Dry Run: Found GRIB line [ 72]: variable=VGRD, level=10 m above ground, forecast=12 hour fcst
    🐫 Dry Run: Found GRIB line [ 78]: variable=APCP, level=surface, forecast=0-12 hour acc fcst
    🐫 Dry Run: Found GRIB line [ 84]: variable=APCP, level=surface, forecast=11-12 hour acc fcst
🌵 Dry Run: Success! Searched for [((U|V)GRD:10 m|TMP:2 m|APCP)] and found [5] GRIB fields. Would save as ./subset_20191231_hrrr.t12z.wrfsfcf12.grib2

Finished 🍦
🌵 Info: Dry Run 1 GRIB2 files

Download subset from [pando]:
    🐫 Dry Run: Found GRIB line [ 66]: variable=TMP, level=2 m above ground, forecast=18 hour fcst
    🐫 Dry Run: Found GRIB line [ 71]: variable=UGRD, level=10 m above ground, forecast=18 hour fcst
    🐫 Dry Run: Found GRIB li