# National Oceanic and Atmospheric Administration (NOAA)
This Jupyter notebook is meant to be used with the North American Mesoscale Forecast System (NAM) dataset.  
This dataset can be found online under "Data Access > Model > Datasets > NAM" on the NOAA website.  
Once the data has been requested, confirmed, and processed using NOAA's Order Data feature,  
the requester is given 5 days to download the data via an emailed link.  

# NAM 2017
More specifically, this notebook targets the entire year 2017 (1800 UTC only), which is roughly 100 GB.  
The files behind the emailed link from NOAA end in the extension ".tar". These are tar archives that bundle several files together. BEWARE: the extracted files take up about as much space as the archives themselves, so unpacking everything at once grows the 100 GB to roughly 200 GB until the archives are deleted.  
It is recommended to make a main folder (moddata) with one subfolder per month (01, 02, 03, ..., 12).  
Then put each ".tar" file in its corresponding month folder. The idea is to unpack an entire month and then delete the ".tar" files for that month. That way you don't have to unpack 100 GB into 200 GB and then delete 100 GB. It will be more like unpacking 6 GB to 12 GB and then deleting the old 6 GB.
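
The folder layout described above can be set up with a short script. This is a minimal sketch, assuming all the ".tar" files were downloaded into a single folder; the function name `sort_tars_by_month` and both folder arguments are hypothetical and not part of the original workflow.

```python
import os
import re
import shutil

# Matches names such as namanl_218_2017010118.g2.tar (yyyymmddhh).
TAR_NAME = re.compile(r"^namanl_218_(\d{4})(\d{2})(\d{2})(\d{2})\.g2\.tar$")

def sort_tars_by_month(download_dir, moddata_dir):
    """Move each NAM ".tar" file into moddata_dir/<mm>/ and return the number moved."""
    moved = 0
    for name in sorted(os.listdir(download_dir)):
        match = TAR_NAME.match(name)
        if match is None:
            continue  # leave unrelated files where they are
        month = match.group(2)  # "01" ... "12"
        month_dir = os.path.join(moddata_dir, month)
        if not os.path.isdir(month_dir):
            os.makedirs(month_dir)
        shutil.move(os.path.join(download_dir, name),
                    os.path.join(month_dir, name))
        moved += 1
    return moved
```

Calling `sort_tars_by_month("downloads", "moddata")` would create only the month folders that are actually needed.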

# Unpacking ".tar" Files
The ".tar" file names should look like the following:

namanl_218_2017010118.g2.tar

The format is very simple: namanl_218_yyyymmddhh.g2.tar  
Where yyyy = year, mm = month, dd = day, hh = hour (UTC)  
The file above therefore holds the data for January 1, 2017 at 18 UTC.
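
The naming scheme can also be decoded programmatically. The sketch below is illustrative only; `parse_tar_name` is a hypothetical helper, and it assumes every file follows the namanl_218_yyyymmddhh.g2.tar pattern exactly.

```python
from datetime import datetime

def parse_tar_name(filename):
    """Return the UTC datetime encoded in a name like namanl_218_2017010118.g2.tar."""
    prefix, suffix = "namanl_218_", ".g2.tar"
    if not (filename.startswith(prefix) and filename.endswith(suffix)):
        raise ValueError("unexpected file name: %s" % filename)
    stamp = filename[len(prefix):-len(suffix)]  # the "yyyymmddhh" part
    return datetime.strptime(stamp, "%Y%m%d%H")

print(parse_tar_name("namanl_218_2017010118.g2.tar"))  # 2017-01-01 18:00:00
```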

Since this data spans the entire North American continent, 18 UTC was chosen. In New York, 18 UTC corresponds to 1 P.M. on standard time (2 P.M. during daylight saving time). This makes any "real images" from the dataset more visually appealing, since that part of the Earth is facing the sun, which is better than "real images" of North America taken at night. Using a single timestamp also saves a lot of space: having 2 timestamps per day instead of 1 would double the size of the data. NOAA offers 4 timestamps per day (0 UTC, 6 UTC, 12 UTC, and 18 UTC).  
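
The local time that 18 UTC corresponds to depends on the season, because New York is 5 hours behind UTC on standard time and 4 hours behind during daylight saving time. The arithmetic can be sketched as follows; the fixed offsets are hard-coded here as an assumption rather than looked up in a time zone database, and `new_york_hour` is a hypothetical helper.

```python
from datetime import datetime, timedelta

# Fixed offsets from UTC for New York (hard-coded for illustration).
EST = timedelta(hours=-5)  # standard time (winter)
EDT = timedelta(hours=-4)  # daylight saving time (summer)

def new_york_hour(utc_time, offset):
    """Return the local hour in New York for a UTC datetime and a fixed offset."""
    return (utc_time + offset).hour

winter = datetime(2017, 1, 1, 18)
summer = datetime(2017, 7, 1, 18)
print(new_york_hour(winter, EST))  # 13, i.e. 1 P.M.
print(new_york_hour(summer, EDT))  # 14, i.e. 2 P.M.
```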

Once you're inside the month directory containing all the ".tar" files for that month, use the following command to unpack them:

// assuming /moddata/01 is the path to all of January's ".tar" files  
// and that you are currently inside the /01 directory, type the following into a terminal:  
for f in *.tar; do tar -xvf "$f"; done

This command should run for about 20 seconds and you should see all the files being unpacked individually.  
For every ".tar" file unpacked, there should be 5 ".grb2" files

// now type the following command to delete all of the ".tar" files in that directory  
// (the -r flag is only needed for directories, so it is left out here):  
rm *.tar

This is done for all 12 months until every subdirectory of /moddata contains only ".grb2" files.  
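
The same month-by-month routine can be scripted with Python's standard `tarfile` module instead of repeating the shell commands twelve times. This is only a sketch under the folder layout described earlier (moddata/01 ... moddata/12); the function names `unpack_month` and `unpack_all` are hypothetical.

```python
import glob
import os
import tarfile

def unpack_month(month_dir):
    """Extract every ".tar" file in month_dir, deleting each archive once it is unpacked."""
    unpacked = 0
    for tar_path in sorted(glob.glob(os.path.join(month_dir, "*.tar"))):
        with tarfile.open(tar_path) as archive:
            archive.extractall(month_dir)
        os.remove(tar_path)  # only reached if extraction did not raise
        unpacked += 1
    return unpacked

def unpack_all(moddata_dir):
    """Run unpack_month over the subfolders 01 ... 12 of moddata_dir."""
    total = 0
    for month in range(1, 13):
        month_dir = os.path.join(moddata_dir, "%02d" % month)
        if os.path.isdir(month_dir):
            total += unpack_month(month_dir)
    return total
```

Because each archive is removed right after it is extracted, the extra disk space needed at any moment stays close to the size of a single archive rather than a whole month.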

# Congratulations!
Once all of the ".grb2" files for every month are neatly organized in their own folder, the Exploratory Data Analysis can begin!  
These files can be viewed using a Python package called "pygrib".  
Currently, pygrib is not available on Windows (or is at least too hard to install), so the following commands and code were run on a virtual machine running Ubuntu 16.04 LTS with Anaconda2 (Python 2.7) installed. Details on how to create your own virtual machine and install a Linux distribution can be found online. Once it's set up and working, open a web browser on the virtual machine, search for "Anaconda Python", and follow the instructions to install Anaconda2 (Python 2.7) for Linux.  

// to ensure everything installed correctly, open a terminal and type
python -V

This should print "Python 2.7.14 :: Anaconda, Inc" or a newer version.  
Next, install the pygrib package using conda. Type the following into a terminal:

conda install -c conda-forge pygrib

That's about it for installations.
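
Before moving on, it can be worth confirming that the package is importable from the same Python that runs the notebook. The check below is a small sketch; `installed_version` is a hypothetical helper, and it falls back to "unknown" rather than assuming the package exposes a version string.

```python
import importlib

def installed_version(package_name):
    """Return the package's __version__, "unknown" if it has none, or None if the import fails."""
    try:
        module = importlib.import_module(package_name)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")

version = installed_version("pygrib")
if version is None:
    print("pygrib is not importable; re-run the conda install step")
else:
    print("pygrib version: %s" % version)
```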

In [None]:
import pygrib
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

In [None]:
def print_grbs(grbs):
    """
    Print every message in a file opened with pygrib.open(filename).
    """
    for grb in grbs:
        print(grb)
    # iterating leaves the file positioned at the end, so go back to the start
    grbs.rewind()

In [None]:
satdata_2017_01_01_00_0 = "mod/moddata/nam_218_20170101_0000_000.grb2"
grbs = pygrib.open(satdata_2017_01_01_00_0)
print(grbs)

In [None]:
print_grbs(grbs)
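
The full listing can be long, so a per-name count is sometimes easier to scan. The sketch below assumes each message exposes a `name` attribute, as pygrib messages do; `count_by_name` and the `FakeMessage` stand-in records are hypothetical and exist only so the example runs without a GRIB file.

```python
from collections import Counter, namedtuple

def count_by_name(messages):
    """Count how many messages share each value of the "name" attribute."""
    return Counter(msg.name for msg in messages)

# Stand-in records so the sketch runs on its own; with real data,
# pass the object returned by pygrib.open(...) instead.
FakeMessage = namedtuple("FakeMessage", ["name", "level"])
sample = [
    FakeMessage("Temperature", 1000),
    FakeMessage("Temperature", 850),
    FakeMessage("Relative humidity", 850),
]
for name, count in sorted(count_by_name(sample).items()):
    print("%3d  %s" % (count, name))
```

If the file has already been iterated over, call `grbs.rewind()` before passing `grbs` to the function, otherwise there is nothing left to count.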

In [None]:
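# select() returns a list of every message named "Temperature"
# (typically one per vertical level); [0] takes the first match
temp = grbs.select(name="Temperature")[0].values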
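
Taking the first match gives whichever temperature field happens to come first in the file. To pick a specific level, `select` accepts several keys at once. The helper below is a sketch: `select_values` is a hypothetical name, and the 850 hPa level in the commented usage line is an assumption about what the file contains.

```python
def select_values(grbs, name, type_of_level, level):
    """Return the data array of the single message matching name, level type, and level."""
    matches = grbs.select(name=name, typeOfLevel=type_of_level, level=level)
    if len(matches) != 1:
        raise ValueError("expected 1 match, found %d" % len(matches))
    return matches[0].values

# Hypothetical usage, assuming the file contains an 850 hPa temperature field:
# temp_850 = select_values(grbs, "Temperature", "isobaricInhPa", 850)
```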

In [None]:
print(temp)

In [None]:
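plt.figure()
# cbar expects a boolean; the string 'true' only worked because non-empty strings are truthy
ax = sns.heatmap(temp, cbar=True)
# heatmap draws row 0 at the top by default, so flip the y axis to put it at the bottom
ax.invert_yaxis()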