To set up the tutorial to work with your files, modify the WRF_DIRECTORY and WRF_FILES variables to point to your WRF files.

**IMPORTANT**:  If for some reason your workbook crashes, you need to run this cell again before running the later examples.

In [None]:
from __future__ import print_function

# This jupyter notebook command inserts matplotlib graphics 
# into the workbook
%matplotlib inline

# Modify these to point to your own files
WRF_DIRECTORY = "./"
WRF_FILES = ["wrfout_d01_2005-08-28_00:00:00",
             "wrfout_d01_2005-08-28_12:00:00"]


# Do not modify the code below this line
#------------------------------------------------------
import os

_WRF_FILES = [os.path.abspath(os.path.expanduser(
    os.path.join(WRF_DIRECTORY, f))) for f in WRF_FILES]

# Make sure the environment is good
from netCDF4 import Dataset
from xarray import DataArray
from wrf import (getvar, interplevel, vertcross, 
                 vinterp, ALL_TIMES)

# Check that the WRF files exist
for f in _WRF_FILES:
    if not os.path.exists(f):
        raise ValueError("{} does not exist. "
            "Did you type it in correctly?".format(f))

# Create functions so that the WRF files only need to be specified in 
# one place
def single_wrf_file():
    global _WRF_FILES
    return _WRF_FILES[0]

def multiple_wrf_files():
    global _WRF_FILES
    return _WRF_FILES

print ("All tests passed!")
    


# Overview of WRF Output Data

The first rule of data processing:

**"ALWAYS LOOK AT YOUR DATA"**

\- D. Shea

WRF can be configured in various ways and can have variables turned on and off.  If you run into problems, it could be due to a missing variable.  If your plot doesn't look right, there could be a map projection issue.

Let's look at some WRF data...how do we do that?

There are numerous tools available to examine NetCDF data, from both outside and inside of Python.

- **ncdump** (used for this example)
- ncl_filedump
- netcdf4-python 
- xarray


## ncdump

ncdump is a program included with the NetCDF libraries that can be used to examine NetCDF data.

By supplying the '-h' option, only the data descriptions are returned.  Otherwise, you'll get all of the data values, which can span miles.

To run:

```
$ ncdump -h wrfout_d01_2005-08-28_00:00:00
```

<div id="nc_dims" style="font-size:75%;">
``` 
netcdf wrfout_d01_2005-08-28_00\:00\:00 {
dimensions:
    Time = UNLIMITED ; // (4 currently)
    
    DateStrLen = 19 ;
    
    west_east = 90 ;
    
    south_north = 73 ;
    
    bottom_top = 29 ;
    
    bottom_top_stag = 30 ;
    
    soil_layers_stag = 4 ;
    
    west_east_stag = 91 ;
    
    south_north_stag = 74 ;
    
```
</div>



<div id="nc_vars" style="font-size:75%;">
```
variables:
    char Times(Time, DateStrLen) ;
    float XLAT(Time, south_north, west_east) ;
        XLAT:FieldType = 104 ;
        XLAT:MemoryOrder = "XY " ;
        XLAT:description = "LATITUDE, SOUTH IS NEGATIVE" ;
        XLAT:units = "degree_north" ;
        XLAT:stagger = "" ;
        XLAT:coordinates = "XLONG XLAT" ;
    float XLONG(Time, south_north, west_east) ;
        XLONG:FieldType = 104 ;
        XLONG:MemoryOrder = "XY " ;
        XLONG:description = "LONGITUDE, WEST IS NEGATIVE" ;
        XLONG:units = "degree_east" ;
        XLONG:stagger = "" ;
        XLONG:coordinates = "XLONG XLAT" ;
        .
        .
        .
    float SST_INPUT(Time, south_north, west_east) ;
        SST_INPUT:FieldType = 104 ;
        SST_INPUT:MemoryOrder = "XY " ;
        SST_INPUT:description = "SEA SURFACE TEMPERATURE 
            FROM WRFLOWINPUT FILE" ;
        SST_INPUT:units = "K" ;
        SST_INPUT:stagger = "" ;
        SST_INPUT:coordinates = "XLONG XLAT XTIME" ;
```
</div>

<div id="nc_attrs" style="font-size:75%;">
```
// global attributes:
        :TITLE = " OUTPUT FROM WRF V3.7 MODEL" ;
        :START_DATE = "2005-08-28_00:00:00" ;
        :SIMULATION_START_DATE = "2005-08-28_00:00:00" ;
        :WEST-EAST_GRID_DIMENSION = 91 ;
        :SOUTH-NORTH_GRID_DIMENSION = 74 ;
        :BOTTOM-TOP_GRID_DIMENSION = 30 ;
        :DX = 30000.f ;
        :DY = 30000.f ;
        .
        .
        .
        :CEN_LAT = 28.00002f ;
        :CEN_LON = -89.f ;
        :TRUELAT1 = 30.f ;
        :TRUELAT2 = 60.f ;
        :MOAD_CEN_LAT = 28.00002f ;
        :STAND_LON = -89.f ;
        :POLE_LAT = 90.f ;
        :POLE_LON = 0.f ;
        :GMT = 0.f ;
        :JULYR = 2005 ;
        :JULDAY = 240 ;
        :MAP_PROJ = 1 ;
        :MAP_PROJ_CHAR = "Lambert Conformal" ;
        .
        .
        .
}
```
</div>

## Dimensions

WRF uses an Arakawa C-grid staggered grid ([taken from the MMM website][1]).

- Mass-related quantities (pressure, temperature, etc.) are computed at the center of a grid cell. 
- The u-component of the horizontal wind is calculated at the left and right edges of a grid cell.  It has one more 
  point in the x direction than the mass grid.
- The v-component of the horizontal wind is calculated at the bottom and top edges of a grid cell.  It has one more point in the y direction than the mass grid. 
- The corners of each grid box are known as the 'staggered' grid, which has one additional point in both the x and y directions.

[1]: http://www2.mmm.ucar.edu/rt/amps/information/configuration/wrf_grid_structure.html

![alt](images/wrf_stagger.png)

<div id="nc_dims" style="font-size:75%;">
``` 
netcdf wrfout_d01_2005-08-28_00\:00\:00 {
dimensions:
    Time = UNLIMITED ; // (4 currently)
    
    DateStrLen = 19 ;
    
    west_east = 90 ;
    
    south_north = 73 ;
    
    bottom_top = 29 ;
    
    bottom_top_stag = 30 ; <-- Extra grid point
    
    soil_layers_stag = 4 ;
    
    west_east_stag = 91 ; <-- Extra grid point
    
    south_north_stag = 74 ; <-- Extra grid point
    
```
</div>
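
Each staggered dimension carries one more point than its mass-grid counterpart. One common way to bring a staggered field back onto the mass grid is to average each pair of neighbouring points along the staggered dimension. The sketch below illustrates the idea with a small NumPy array standing in for a field on `west_east_stag`. The helper name `destagger_sketch` and the array values are made up for illustration and are not read from a WRF file.

```python
import numpy as np


def destagger_sketch(field, axis):
    """Average neighbouring points along `axis`, removing one point."""
    n = field.shape[axis]
    left = np.take(field, np.arange(0, n - 1), axis=axis)
    right = np.take(field, np.arange(1, n), axis=axis)
    return 0.5 * (left + right)


# Hypothetical u-wind on a tiny grid: 3 rows (south_north) by
# 5 columns (west_east_stag), so the mass grid has 4 columns.
u_stag = np.arange(15, dtype=float).reshape(3, 5)

u_mass = destagger_sketch(u_stag, axis=-1)

print(u_stag.shape)  # (3, 5)
print(u_mass.shape)  # (3, 4)
```

The same averaging applies along `south_north_stag` or `bottom_top_stag` by passing a different `axis`.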

## Variables

- Each variable is made up of dimensions, attributes, and data values. 
- Pay special attention to the units and coordinates attributes.
  - The coordinates attribute specifies the variables that contain the latitude and longitude 
    information for each grid box (XLONG, XLAT).
  - If the domain is from a moving nest, then a time coordinate is also used (XTIME).
  - The coordinates are named in Fortran ordering, so they'll be listed in reverse.




<div id="var_example" style="font-size:75%;">
```
float P(Time, bottom_top, south_north, west_east) ; <- Dims
    P:FieldType = 104 ;                       <-  Attribute
    P:MemoryOrder = "XYZ" ;                   <-  Attribute
    P:description = "perturbation pressure" ; <-  Attribute
    P:units = "Pa" ;                          <-  Attribute
    P:stagger = "" ;                          <-  Attribute
    P:coordinates = "XLONG XLAT XTIME" ;      <-  Attribute
    
data:

 P =
  339.8281, 340.3281, 340.25, 341.4531, ... 
    355.8672, 356.9531, 361.2578, 365.7188,  ...

```
</div>
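
Because the `coordinates` attribute is a single space-separated string, it can be split into names and flipped into the reverse order. The sketch below uses the string from the P example above. The function name `parse_coordinates` is hypothetical and is not part of any library.

```python
def parse_coordinates(coord_attr):
    """Split a 'coordinates' attribute string and reverse the names.

    The attribute lists the names in Fortran order, so the reversed
    list has the time coordinate (if any) first.
    """
    names = coord_attr.split()
    return list(reversed(names))


# Value copied from the P variable shown above
p_coordinates = "XLONG XLAT XTIME"

coord_names = parse_coordinates(p_coordinates)
print(coord_names)  # ['XTIME', 'XLAT', 'XLONG']

# Check whether this variable carries a time coordinate
has_time_coord = "XTIME" in coord_names
print(has_time_coord)  # True
```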

## Global Attributes

- Provide a description of how the model was set up (resolution, map projection, microphysics, etc.).
- For plotting, the map projection parameters will be the most important.
- wrf-python uses this information to build the mapping object in your plotting system of choice - basemap, cartopy, PyNGL.




<div id="proj_stuff" style="font-size:75%;">
```
.
.
.
:CEN_LAT = 28.00002f ;
:CEN_LON = -89.f ;
:TRUELAT1 = 30.f ;
:TRUELAT2 = 60.f ;
:MOAD_CEN_LAT = 28.00002f ;
:STAND_LON = -89.f ;
:POLE_LAT = 90.f ;
:POLE_LON = 0.f ;
.
.
:MAP_PROJ = 1 ;
:MAP_PROJ_CHAR = "Lambert Conformal" ;
.
.
.
```
</div>
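
To make the projection parameters easier to inspect, they can be pulled out of the full set of global attributes. The sketch below is only an illustration: the dictionary is typed in by hand from the ncdump output above (in a real session it would come from the file itself), and the names `PROJ_KEYS` and `projection_params` are hypothetical.

```python
# Global attributes copied by hand from the ncdump output above
global_attrs = {
    "DX": 30000.0,
    "DY": 30000.0,
    "CEN_LAT": 28.00002,
    "CEN_LON": -89.0,
    "TRUELAT1": 30.0,
    "TRUELAT2": 60.0,
    "MOAD_CEN_LAT": 28.00002,
    "STAND_LON": -89.0,
    "POLE_LAT": 90.0,
    "POLE_LON": 0.0,
    "MAP_PROJ": 1,
    "MAP_PROJ_CHAR": "Lambert Conformal",
}

# Hypothetical selection of attributes that describe the projection
PROJ_KEYS = ("MAP_PROJ", "MAP_PROJ_CHAR", "TRUELAT1", "TRUELAT2",
             "STAND_LON", "CEN_LAT", "CEN_LON", "POLE_LAT", "POLE_LON")


def projection_params(attrs):
    """Return only the projection-related entries that are present."""
    return {key: attrs[key] for key in PROJ_KEYS if key in attrs}


proj = projection_params(global_attrs)
for name in sorted(proj):
    print("{:14s} {}".format(name, proj[name]))
```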

## Your Turn!

In [None]:
from subprocess import Popen, PIPE, STDOUT

file_path = single_wrf_file()

# This simply executes 'ncdump -h {wrf_file}' 
# from Python
p = Popen(["ncdump", "-h", "{}".format(file_path)], 
          stdout=PIPE, stderr=STDOUT)
output, _ = p.communicate()

# communicate() returns bytes, so decode before printing
print (output.decode("utf-8"))


## Reading a WRF File in Python

You have several options to read a WRF NetCDF file in Python. 

- **netcdf4-python**
- PyNIO (currently Python 2.x only)
- xarray (Dataset not natively supported yet in wrf-python)


## netcdf4-python Example

``` python
from netCDF4 import Dataset

file_path = "./wrfout_d01_2005-08-28_00:00:00"

wrf_file = Dataset(file_path)

```

## Your Turn!

In [None]:
from netCDF4 import Dataset

file_path = single_wrf_file()

wrf_file = Dataset(file_path)

print(wrf_file)


## Getting Variables and Attributes

netcdf4-python uses an old API that was originally created for an old package called Scientific.IO.NetCDF.  PyNIO also uses this API.  Some of it may look a little dated.




### Getting global attributes

To get the full dictionary of global attributes, use the \_\_dict\_\_ attribute.  To work with one attribute at a time, you can use the getncattr and setncattr methods. 

``` python
global_attrs = wrf_file.__dict__

# To get the value for MAP_PROJ, you can do:
map_proj = wrf_file.__dict__["MAP_PROJ"]

# Or more cleanly
map_proj = wrf_file.getncattr("MAP_PROJ")

```

### Getting variables, variable attributes, and variable data

All variables are stored in a dictionary attribute called *variables*.  

Let's get the perturbation pressure ("P") variable.

``` python

# This will return a netCDF4.Variable object
p = wrf_file.variables["P"]

```

To get the variable attributes, you can use the \_\_dict\_\_ attribute to get a dictionary of all attributes, or the *getncattr* method if you already know the attribute name.

``` python
# Return a dictionary of all of P's 
# attributes
p_attrs = p.__dict__

# Let's just get the 'coordinates' attribute
p_coords = p.getncattr("coordinates")

```

To get the variable's data as a numpy array, you need to use Python's [ ] API (\_\_getitem\_\_ for those that are more familiar with Python's data model).  

``` python

# Get a numpy array for all times
p_all_data = p[:,:,:,:]

# In numpy, there is implicit expansion of ':'
# across all dimensions. So, this is the 
# same as p[:,:,:,:]
p_all_data = p[:]

# You can also request specific values 
# by supplying indexes.  This will 
# extract the numpy array for time 
# index 0.

p_t0_data = p[0,:]


```
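
The effect of each indexing form is easiest to see from the resulting shapes. The sketch below uses a plain NumPy array of zeros as a stand-in for the P variable, so it runs without a WRF file. The dimension sizes are taken from the ncdump output shown earlier, and the variable names are chosen for illustration only.

```python
import numpy as np

# Stand-in for P with dimensions
# (Time, bottom_top, south_north, west_east)
p = np.zeros((4, 29, 73, 90), dtype="float32")

p_all_explicit = p[:, :, :, :]
p_all_short = p[:]
p_t0 = p[0, :]

print(p_all_explicit.shape)  # (4, 29, 73, 90)
print(p_all_short.shape)     # (4, 29, 73, 90)
print(p_t0.shape)            # (29, 73, 90)

# Time index 0 and the first vertical level only
p_t0_k0 = p[0, 0, :, :]
print(p_t0_k0.shape)         # (73, 90)
```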

## Your Turn!

In [None]:
from netCDF4 import Dataset

file_path = single_wrf_file()

# Create the netCDF4.Dataset object
wrf_file = Dataset(file_path)

# Get the global attribute dict
global_attrs = wrf_file.__dict__
print ("Global attributes for the file")
print(global_attrs)
print ("\n")

# Just get the 'MAP_PROJ' attribute
map_proj = wrf_file.getncattr("MAP_PROJ")
print ("The MAP_PROJ attribute:")
print (map_proj)
print("\n")

# Get the perturbation pressure variable
p = wrf_file.variables["P"]
print ("The P variable: ")
print(p)
print ("\n")

# Get the P attributes
p_attrs = p.__dict__
print ("The attribute dict for P")
print (p_attrs)
print ("\n")

# Get the 'coordinates' attribute for P
coords = p.getncattr("coordinates")
print ("Coordinates for P:")
print (coords)
print ("\n")

# Get the P numpy array for all times
p_all_data = p[:]
print ("The P numpy array: ")
print (p_all_data)
print ("\n")

# Get the P numpy array for time 0
p_t0_data = p[0,:]
print ("P array at time 0:")
print (p_t0_data)
print ("\n")

## Pop Quiz

What is the first rule of data processing?

    A) YOU DO NOT TALK ABOUT DATA PROCESSING
    B) 60% OF THE TIME, IT WORKS EVERY TIME
    C) ALWAYS LOOK AT YOUR DATA

# wrf-python

wrf-python provides functionality similar to what is found in the NCL-WRF package:

- over 30 diagnostic calculations
- several interpolation routines (horizontal level, vertical cross section, horizontal "surface")
- plot helper utilities for cartopy, basemap, and PyNGL


The most commonly used functions:

- **getvar**: Extracts variables and diagnostic variables
- **interplevel**: Interpolates a variable to a horizontal plane at a specified level
- **vertcross**: Interpolates a 3D variable to a vertical cross section
- **vinterp**: Interpolates a variable to a new surface (e.g. theta-e)

## The *getvar* function

The *getvar* function can be used to:

- Extract NetCDF variables from a file, similar to netcdf4-python or PyNIO.
- Compute diagnostic variables
- Concatenate a variable (either NetCDF or diagnostic) across multiple files. 



### Simple *getvar* Example for HGT

``` python

from netCDF4 import Dataset
from wrf import getvar

file_path = "./wrfout_d01_2005-08-28_00:00:00"

wrf_file = Dataset(file_path)

hgt = getvar(wrf_file, "HGT", timeidx=0)

```

## Your Turn!

In [None]:
from netCDF4 import Dataset
from wrf import getvar

file_path = single_wrf_file()

wrf_file = Dataset(file_path)

hgt = getvar(wrf_file, "HGT", timeidx=0)

print(hgt)

## Computing a Diagnostic Variable with *getvar*

A list of available diagnostics is available at: http://wrf-python.readthedocs.io/en/latest/diagnostics.html

In this example, we're going to compute sea level pressure.  Note the 'units' keyword argument.  Some diagnostics support several choices for units.  However, unit support is still relatively primitive.  

``` python

from netCDF4 import Dataset
from wrf import getvar

file_path = "./wrfout_d01_2005-08-28_00:00:00"

wrf_file = Dataset(file_path)

slp = getvar(wrf_file, "slp", 
             timeidx=0, units="hPa")


```
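
To get a feel for what the different unit choices mean, the sketch below converts a single pressure value by hand. It is only an illustration using standard conversion factors. It is not how wrf-python performs the conversion internally, and the names `PA_PER_UNIT` and `convert_pressure` are hypothetical.

```python
# Pascals per unit. These are standard conversion factors written
# out for illustration; they are not read from wrf-python.
PA_PER_UNIT = {
    "Pa": 1.0,
    "hPa": 100.0,
    "atm": 101325.0,
    "mmhg": 133.322387415,
}


def convert_pressure(value_pa, units):
    """Convert a pressure given in Pa to the requested units."""
    return value_pa / PA_PER_UNIT[units]


slp_pa = 101325.0  # one standard atmosphere, in Pa

for units in ("Pa", "hPa", "atm", "mmhg"):
    print(units, convert_pressure(slp_pa, units))
```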

## Your Turn!

Also try changing the units by specifying one of the following values:  'hPa', 'Pa', 'atm', 'mmhg'

In [None]:
from netCDF4 import Dataset
from wrf import getvar

file_path = single_wrf_file()

wrf_file = Dataset(file_path)

slp = getvar(wrf_file, "slp", timeidx=0, units="mmhg")

print (slp)

## Combining Across Multiple Files

wrf-python has two methods for combining a variable across multiple files:

- **cat** - combines the variable along the Time dimension (Note: you must put the files in the order you want them)
- **join** - creates a new left-most dimension for each file


To extract all times into a single array, set *timeidx* to wrf.ALL_TIMES (an alias for None).  

In this example, we're using the 'cat' method, which is the most common.

``` python
from netCDF4 import Dataset
from wrf import getvar, ALL_TIMES

file_paths = [
     "./wrfout_d01_2005-08-28_00:00:00",
     "./wrfout_d01_2005-08-28_12:00:00"
     ]

wrf_files = [Dataset(file_paths[0]),
             Dataset(file_paths[1])]

slp = getvar(wrf_files, "slp", 
             timeidx=ALL_TIMES,
             method="cat")

```
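
The difference between the two methods shows up in the shape of the result. The sketch below is an analogy only: it uses plain NumPy arrays as stand-ins for the data in two files and does not reproduce the metadata handling in *getvar*. The array names are hypothetical, and the sizes (4 times on a 73 x 90 grid) are taken from the ncdump output earlier in this tutorial.

```python
import numpy as np

# Two stand-in "files", each holding 4 times on a 73 x 90 grid
slp_file1 = np.zeros((4, 73, 90))
slp_file2 = np.ones((4, 73, 90))

# 'cat'-like: extend the existing Time dimension
slp_cat = np.concatenate([slp_file1, slp_file2], axis=0)

# 'join'-like: add a new left-most dimension, one entry per file
slp_join = np.stack([slp_file1, slp_file2], axis=0)

print(slp_cat.shape)   # (8, 73, 90)
print(slp_join.shape)  # (2, 4, 73, 90)
```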

## Your Turn!

### Example using the 'cat' method

In [None]:
from netCDF4 import Dataset
from wrf import getvar, ALL_TIMES

file_paths = multiple_wrf_files()

wrf_files = [Dataset(file_paths[0]),
             Dataset(file_paths[1])]

slp = getvar(wrf_files, "slp", timeidx=ALL_TIMES, method="cat")

print (slp)

### Example using the 'join' method

In [None]:
from netCDF4 import Dataset
from wrf import getvar, ALL_TIMES

file_paths = multiple_wrf_files()

wrf_files = [Dataset(file_paths[0]),
             Dataset(file_paths[1])]

slp = getvar(wrf_files, "slp", timeidx=ALL_TIMES, method="join")

print (slp)

## Interpolation Routines

- **interplevel** - linear interpolation to a horizontal plane at a specified height or pressure level
- **vertcross** - vertical cross section interpolation to a vertical plane through two specified points 
  (or a pivot point and angle)
- **vinterp** - interpolates to a "surface", which could be pressure levels or temperature levels like theta-e

### interplevel Example
 
Let's get the 500 hPa heights in decameters
 
``` python
 
from netCDF4 import Dataset
from wrf import getvar, interplevel

file_path = "./wrfout_d01_2005-08-28_00:00:00"

wrf_file = Dataset(file_path)

pres = getvar(wrf_file, "pressure", timeidx=0)
ht = getvar(wrf_file, "z", timeidx=0, units="dm")

ht_500 = interplevel(ht, pres, 500.0)
 
```
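
Conceptually, the interpolation looks at one vertical column at a time, finds the two model levels that bracket the requested level, and interpolates linearly between them. The sketch below shows that idea for a single column in plain Python. It is a simplified illustration and not wrf-python's implementation; the function name `interp_column_to_level` and the column values are made up.

```python
def interp_column_to_level(field, vert, level):
    """Linearly interpolate one column of `field` to `level`.

    `field` and `vert` are equal-length sequences ordered from the
    bottom of the model upward. Returns None when `level` is not
    bracketed by the column.
    """
    for k in range(len(vert) - 1):
        lo, hi = vert[k], vert[k + 1]
        if lo != hi and min(lo, hi) <= level <= max(lo, hi):
            weight = (level - lo) / (hi - lo)
            return field[k] + weight * (field[k + 1] - field[k])
    return None


# Made-up column: pressure (hPa) falls as height (dm) rises
pres_col = [1000.0, 850.0, 700.0, 600.0, 400.0, 300.0]
ht_col = [10.0, 150.0, 310.0, 440.0, 740.0, 940.0]

ht_500 = interp_column_to_level(ht_col, pres_col, 500.0)
print(ht_500)  # 590.0
```

Here 500 hPa lies halfway between the 600 hPa and 400 hPa levels, so the result is halfway between 440 dm and 740 dm.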

## Your Turn!

In [None]:
from netCDF4 import Dataset
from wrf import getvar, interplevel

file_path = single_wrf_file()

wrf_file = Dataset(file_path)

pres = getvar(wrf_file, "pressure", timeidx=0)
ht = getvar(wrf_file, "z", timeidx=0, units="dm")

ht_500 = interplevel(ht, pres, 500.0)

print (ht_500)

### vertcross Example

Vertical cross sections can be confusing to people.  The idea is to draw a horizontal line at the surface, and the cross section is defined as a vertical plane up from this line.

- The new 'x-axis' in the cross section is the points along the line you made.  The line can be defined by:
  1. defining a start point and an end point by using (x,y) grid coordinates or (latitude, longitude) coordinates.
  2. defining a pivot point and an angle, which is useful for cross sections that will span most of the domain.
- The new 'y-axis' is a set of vertical levels.  You can let the routine choose them or supply your own.

Note also that the horizontal line is discretized into integer points, so the final point on the line may be slightly different from the end_point you have chosen.
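
To see why the last point on the line can differ from the requested end point, consider one simple way of discretizing a line: step along it in whole grid units starting from the start point. The sketch below does exactly that. It is an assumption made for illustration and is not necessarily the algorithm wrf-python uses; the function name `line_points` is hypothetical.

```python
import math


def line_points(start, end):
    """Return points spaced one grid unit apart from start toward end.

    `start` and `end` are (x, y) grid coordinates. The last point
    returned may fall short of `end`.
    """
    dx = end[0] - start[0]
    dy = end[1] - start[1]
    distance = math.hypot(dx, dy)
    if distance == 0:
        return [start]

    n_steps = int(distance)  # whole grid units that fit on the line
    ux, uy = dx / distance, dy / distance
    return [(start[0] + i * ux, start[1] + i * uy)
            for i in range(n_steps + 1)]


points = line_points((0, 0), (3, 4))   # line length is exactly 5
print(len(points))                     # 6

short = line_points((0, 0), (2, 2))    # line length is about 2.83
print(len(short))                      # 3
print(short[-1])                       # stops before (2, 2)
```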





This example introduces the *CoordPair* class.  A *CoordPair* is simply used to store (x,y) coordinates, or (lat,lon) coordinates.  It is also possible to have (x, y, lat, lon), but that's rarely used.  

The *CoordPair* will be used to define your cross section line.

``` python
from wrf import CoordPair

# Creating an x,y pair
x_y_pair = CoordPair(x=10, y=20)

# Creating a lat,lon pair
lat_lon_pair = CoordPair(lat=30.0, lon=-120.0)

```

In this example, we're going to define the cross section using a start point and an end point.  

We're going to let the algorithm pick the levels, which are at ~1% increments.

``` python
 
from netCDF4 import Dataset
from wrf import getvar, vertcross, CoordPair

file_path = "./wrfout_d01_2005-08-28_00:00:00"
wrf_file = Dataset(file_path)

# Making a diagonal cross section line from 
# bottom left to top right.
bottom_left = CoordPair(x=0, y=0)

top_right = CoordPair(x=-1, y=-1)

# Let's get wind speed in kts
wspd_wdir = getvar(wrf_file, "wspd_wdir", 
                   timeidx=0, units="kt")
                   
wspd = wspd_wdir[0,...]

# Get the height levels
ht = getvar(wrf_file, "z", timeidx=0)

wspd_cross = vertcross(wspd, ht, 
               start_point=bottom_left, 
               end_point=top_right)
               
```

## Your Turn!

In [None]:
from netCDF4 import Dataset
from wrf import getvar, vertcross, CoordPair

file_path = single_wrf_file()
wrf_file = Dataset(file_path)

# Making a diagonal cross section line from 
# bottom left to top right.
bottom_left = CoordPair(x=0, y=0)

top_right = CoordPair(x=-1, y=-1)

# Let's get wind speed in kts
wspd_wdir = getvar(wrf_file, "wspd_wdir", 
                   timeidx=0, units="kt")
                   
wspd = wspd_wdir[0,...]

# Get the height levels
ht = getvar(wrf_file, "z", timeidx=0)

wspd_cross = vertcross(wspd, ht, 
               start_point=bottom_left, 
               end_point=top_right)

print (wspd_cross)
