# Drilling down into 4-degree scale statistics of the GEOS-5 Nature Run

The G5NR experiment is described at http://g5nr.nccs.nasa.gov/<br>
This software is from https://github.com/brianmapes/CHAD_G5NR <br>
Questions? Brian Mapes ([mapes@miami.edu](mailto:mapes@miami.edu))<br>

Original author was Matthew Niznik (a postdoc 2015-2016), with the more general package:<br>
https://github.com/matthewniznik/ClickHist/wiki<br>
http://matthewniznik.com/research-projects/clickhist<br>

### First, pick a tag that will be prepended to the filenames of this session's *Case Notebooks*.

In [9]:
sessionName = 'MyFirstSession'

### What space-time subset are you interested in exploring? 

Longitude: 0 through 360 (Degrees East)<br>
Latitude: -90 through 90 (Degrees North)<br>
Time: indices into the roughly 18200 hourly time steps of this dataset (the data start on the nominal date "2005-05-16", but recall that this is just a simulation).

In [10]:
lonLow = 360.-160.
lonHigh = 360.-150.
latLow = -25.0
latHigh = -15.0

# Time indices in the range [0, 18200 or so]; limiting the range speeds up data reading
timelimit1 = 18000
timelimit2 = 18010
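
The `360.-160.` form above turns 160°W into degrees east, and the time limits are plain hourly indices. The sketch below shows both conversions. It assumes hourly steps and a first timestamp of 2005-05-16 00:30; the half-hour offset is only inferred from the sample click output at the end of this notebook, and the authoritative value is `housekeeping_G5NR.startDatetime`. The helper names `west_to_east` and `index_to_datetime` are hypothetical and not part of the package.

```python
import datetime

# Assumed first timestamp of the hourly series (an inference, not read from the dataset).
ASSUMED_START = datetime.datetime(2005, 5, 16, 0, 30)

def west_to_east(lon_west):
    """Convert degrees west to degrees east in the range [0, 360)."""
    return (360.0 - lon_west) % 360.0

def index_to_datetime(time_index, start=ASSUMED_START):
    """Nominal datetime of an hourly time index, counted from `start`."""
    return start + datetime.timedelta(hours=time_index)

lon_low_example = west_to_east(160.0)        # same value as 360.-160.
first_time = index_to_datetime(18000)        # nominal time of timelimit1
last_time = index_to_datetime(18010 - 1)     # the slice end (timelimit2) is exclusive
```

Under these assumptions, index 18008 lands on 2007-06-05 08:30, which matches the timestamp in the sample output.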

### What two 4-degree variables do you want to scatterplot, and then pick cases from?

*The units and ranges are hardwired into the module `housekeeping_G5NR.py`, so you have to pick from the list of options set there (names are case sensitive). The 4-degree dataset has many more quantities; any of them could be used once its scatterplot range is set in that module. See the dataset's header for the names of the available variables. Here, TEEF is Total Eddy Enthalpy Flux at 500 mb, good for finding deep convection, especially in dry 500 mb environments. SKEDot is the change in column-integrated shear kinetic energy by eddy momentum flux. HMV is horizontal momentum variance, good for finding tropical storm centers.* <br>
Precip, W500, wPuP, TEEF, SKEDot, HMV
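
Because the names are case sensitive, a typo such as `SKEdot` only fails later, at the dictionary lookups in `housekeeping_G5NR`. A small check up front can catch that earlier. This is a sketch only: `check_var_name` is a hypothetical helper, and the list is copied from this notebook rather than read from the module, which remains the authoritative source.

```python
# Names listed in this notebook; the authoritative list lives in housekeeping_G5NR.py.
LISTED_OPTIONS = ['Precip', 'W500', 'wPuP', 'TEEF', 'SKEDot', 'HMV']

def check_var_name(name, options=LISTED_OPTIONS):
    """Return `name` if it is a listed option; otherwise raise ValueError with a hint."""
    if name in options:
        return name
    # Look for an option that differs only in letter case.
    near = [opt for opt in options if opt.lower() == name.lower()]
    hint = ' Did you mean %r?' % near[0] if near else ''
    raise ValueError('%r is not one of %s.%s' % (name, options, hint))

checked_name = check_var_name('SKEDot')   # passes; 'SKEdot' would raise ValueError
```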

In [11]:
var1Name = 'Precip'
var2Name = 'SKEDot'

### Which quicklooks from the [online G5NR repository](http://g5nr.nccs.nasa.gov/images/) would you like your notebooks to contain?

**Options:** 'cloudsir', 'cloudsvis', 'cyclones', 'storms', 'temperature', 'tropical', 'water', 'winds'
<br>*N.B. Must be a list. If image saving time is prohibitively long, shorten the list of variables.*

In [12]:
imageVar = ['cloudsir','tropical']

--------
### Only advanced users will adjust the things below. Hit shift-return again and again to get on with the show!

#### This sets the URL of Brian's preprocessed dataset on a 4 degree hourly mesh 
These data were derived and rebinned from NASA's 0.5-degree dataset. For faster response times, you can download the 4-degree statistics file and then use its local pathname as `urlToLoad`. 
Here is the 6GB file if you want to download it: http://weather.rsmas.miami.edu/repository/entry/show?entryid=synth%3Aeab82de2-d682-4dc0-ba8b-2fac7746d269%3AL2FsbFZhcnNfcjkweDQ1XzIwMTZfRGVjX25vWlNLRURPVC5uYzQ%3D

In [13]:
urlToLoad = ('http://weather.rsmas.miami.edu/repository/opendap/'+
             'synth:eab82de2-d682-4dc0-ba8b-2fac7746d269:'+
             'L2FsbFZhcnNfcjkweDQ1XzIwMTZfRGVjX25vWlNLRURPVC5uYzQ=/entry.das')  

#### This is a list of pre-made 'template' IDV bundles (a collection of datasets and displays), which ClickHist will modify to focus on the time and location relevant to scatter points you select. It's nice to have a fast small 'simple' one, and a rich deep 'full' one. 
**Note:** The *first* bundle here will be the one referenced in a script output that allows you to do a batch process that reads in the data and pre-generates images, movies, and a local .zidv file for fast interactive loading. It should probably be the "full" bundle with all the variables you want to study.

In [14]:
bundleInFilenames = ['G5NR_template_full', 'G5NR_template_simple_cloudsir']
bundleOutTags = ['full', 'simple']

#### Set how large you want your IDV bundle of detailed data to be in space and time
Each of these is calculated as a distance from the center, so `lonOffset = 1.0` gives a box spanning 2.0° of longitude. `dtFromCenter` must be in seconds.

In [15]:
lonOffset = 1.0
latOffset = 1.0
dtFromCenter = 3*3600. # in seconds
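
To make the "distance from center" convention concrete, the sketch below computes the box implied by these settings around one clicked point. `case_bounds` is a hypothetical helper written for illustration; how `ClickHistDo` applies the offsets internally is not shown in this notebook.

```python
import datetime

def case_bounds(lon, lat, when, lonOffset=1.0, latOffset=1.0, dtFromCenter=3 * 3600.0):
    """Return the lon/lat/time box centered on a clicked point (offsets are half-widths)."""
    half_window = datetime.timedelta(seconds=dtFromCenter)
    return {
        'lon': (lon - lonOffset, lon + lonOffset),
        'lat': (lat - latOffset, lat + latOffset),
        'time': (when - half_window, when + half_window),
    }

# Center taken from the sample click output in this notebook: 208 E, -20 N, 2007-06-05 08:30.
bounds = case_bounds(208.0, -20.0, datetime.datetime(2007, 6, 5, 8, 30))
```

With the defaults above, the box is 2° by 2° in space and 6 hours long in time.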

#### Would you like specific quantile lines indicated in the scatterplot? 
**If so, specify them here.**

In [16]:
quantiles = [0.01,0.1,1,5,95,99,99.9,99.99]
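
Note that these values are percentages (0.01 means the 0.01th percentile), not fractions. A minimal sketch of what such a list selects, using `np.percentile` on synthetic data; whether ClickHist computes its quantile lines this way is an assumption, and `sample` is just a stand-in for `var1Values`.

```python
import numpy as np

quantiles = [0.01, 0.1, 1, 5, 95, 99, 99.9, 99.99]

# Synthetic stand-in for var1Values: evenly spaced values from 0 to 100.
sample = np.linspace(0.0, 100.0, 10001)

# np.percentile expects percentages in [0, 100], the same convention as the list above.
thresholds = np.percentile(sample, quantiles)
```

For this evenly spaced sample, each threshold equals its percentile, which makes the percent convention easy to verify.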

--------
## Now, the action. Hit shift-return again and again to get on with the show!

### Import the necessary modules

*Currently supported graphics backends are Qt4Agg ('qt4') and TkAgg ('tk')*

In [17]:
#%matplotlib tk
%matplotlib qt4
import matplotlib
#matplotlib.use('TkAgg')
#matplotlib.use('Qt4Agg')

from IPython.display import clear_output
import netCDF4 # If this gives an error, in a terminal type: conda install netCDF4 (or: pip install netCDF4)
import sys

import ClickHist_G5NR as ClickHist
import ClickHistDo_G5NR as ClickHistDo
import housekeeping_G5NR
import numpy as np

#### The following scatterplot features (axis labels, joint histogram bin labels, etc.) are set to values that are hidden in the module `housekeeping_G5NR`

(You can change them in the module if desired, or here directly. Yes, this is inelegant.)

In [18]:
lonValueName = housekeeping_G5NR.lonValueName
latValueName = housekeeping_G5NR.latValueName
timeValueName = housekeeping_G5NR.timeValueName
startDatetime = housekeeping_G5NR.startDatetime

var1Edges = housekeeping_G5NR.binOptions[var1Name]
var2Edges = housekeeping_G5NR.binOptions[var2Name]

var1FmtStr = housekeeping_G5NR.fmtStrOptions[var1Name]
var2FmtStr = housekeeping_G5NR.fmtStrOptions[var2Name]

var1ValueName = housekeeping_G5NR.valueNameOptions[var1Name]
var2ValueName = housekeeping_G5NR.valueNameOptions[var2Name]

var1Units = housekeeping_G5NR.varUnitOptions[var1Name]
var2Units = housekeeping_G5NR.varUnitOptions[var2Name]
metadata_UD = (var1Name+' vs '+var2Name+': '+
               str(lonLow)+' to '+str(lonHigh)+' E, '+
               str(latLow)+' to '+str(latHigh)+' N')

var1ValueMult = housekeeping_G5NR.varMultOptions[var1Name]
var2ValueMult = housekeeping_G5NR.varMultOptions[var2Name]

## Load the Dataset and grab the coordinate variables

In [19]:
cdfIn = netCDF4.Dataset(urlToLoad,'r')

lonValues = cdfIn.variables[lonValueName][:]
latValues = cdfIn.variables[latValueName][:]
timeValues = cdfIn.variables[timeValueName][timelimit1:timelimit2]*housekeeping_G5NR.timeValueMult

#### By finding the needed index ranges here, we will load only the desired subset of the data.

In [21]:
lowLonInt,highLonInt = housekeeping_G5NR.getIntEdges(lonValues,lonLow,lonHigh)
lowLatInt,highLatInt = housekeeping_G5NR.getIntEdges(latValues,latLow,latHigh)

# How many values are we asking for? (lat x lon x time; this form runs under Python 2 and 3)
print('{} x {} x {}'.format(highLatInt+1-lowLatInt, highLonInt+1-lowLonInt, np.size(timeValues)))

3 x 3 x 10
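
For readers wondering what an index lookup like `getIntEdges` does, here is a minimal sketch of one way to find the first and last grid indices inside a coordinate range. It is not the implementation in `housekeeping_G5NR`; `get_int_edges_sketch` is a hypothetical name, and the regular 4-degree grids (90 longitudes, 45 latitudes) are assumed for illustration, while the real coordinates come from the file.

```python
import numpy as np

def get_int_edges_sketch(values, low, high):
    """First and last index of `values` lying in [low, high]; assumes ascending order."""
    inside = np.where((values >= low) & (values <= high))[0]
    if inside.size == 0:
        raise ValueError('no grid points between %s and %s' % (low, high))
    return int(inside[0]), int(inside[-1])

# Assumed 4-degree grids; the real arrays are read from the dataset.
lon_grid = np.arange(0.0, 360.0, 4.0)
lat_grid = np.arange(-88.0, 89.0, 4.0)

lo_lon, hi_lon = get_int_edges_sketch(lon_grid, 200.0, 210.0)
lo_lat, hi_lat = get_int_edges_sketch(lat_grid, -25.0, -15.0)
n_lon = hi_lon + 1 - lo_lon
n_lat = hi_lat + 1 - lo_lat
```

On these assumed grids the selection is 3 by 3 points, consistent with the count printed above. Because the returned upper index is inclusive, the slices later in this notebook use `high+1`.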


#### *Based on the above, the following variable load command may take some time.*

In [22]:
var1Values = cdfIn.variables[var1ValueName][timelimit1:timelimit2,
                                            lowLatInt:highLatInt+1,
                                            lowLonInt:highLonInt+1]*\
                                            var1ValueMult
var2Values = cdfIn.variables[var2ValueName][timelimit1:timelimit2,
                                            lowLatInt:highLatInt+1,
                                            lowLonInt:highLonInt+1]*\
                                            var2ValueMult

np.shape(var1Values)

(10, 3, 3)

In [23]:
cdfIn.close()

*(We now subset the longitude and latitude coordinate arrays, since the previous call to getIntEdges needed the full arrays. They were not big, just 1D arrays.)*

In [24]:
lonValues = lonValues[lowLonInt:highLonInt+1]
latValues = latValues[lowLatInt:highLatInt+1]

## Create the interactive scatterplot
ClickHistDo defines everything that happens when a point in the scatterplot is clicked. It needs a setup call. 

*This bug workaround call is necessary to make sure the output displays properly. (If interested in the details, see: http://bit.ly/1SsishU)*

In [25]:
oldsysstdout = sys.stdout
sys.stdout = housekeeping_G5NR.flushfile(sys.stdout)
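
The general idea behind this kind of workaround is a thin wrapper that flushes the stream after every write, so text appears immediately instead of sitting in a buffer. The sketch below illustrates that idea only; `FlushingStream` is a hypothetical class, not the code of `housekeeping_G5NR.flushfile`, and it is demonstrated on an in-memory buffer rather than on `sys.stdout`.

```python
import io

class FlushingStream(object):
    """Wrap a file-like object and flush it after every write."""

    def __init__(self, stream):
        self.stream = stream

    def write(self, text):
        self.stream.write(text)
        self.stream.flush()   # push the text out right away

    def flush(self):
        self.stream.flush()

    def __getattr__(self, name):
        # Delegate everything else (encoding, isatty, ...) to the wrapped stream.
        return getattr(self.stream, name)

buffer = io.StringIO()
wrapped = FlushingStream(buffer)
wrapped.write(u'Saving IDV bundle(s)...\n')
```

Keeping a reference to the original stream, as `oldsysstdout` does above, makes it possible to restore it later with `sys.stdout = oldsysstdout`.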

### Initialize 'ClickHistDo'

In [26]:
ClickHistDo1 = ClickHistDo.ClickHistDo(lonValues,latValues,
                                       timeValues,startDatetime,
                                       bundleInFilenames,
                                       bundleOutTags,
                                       sessionName,
                                       xVarName=var1Name,
                                       yVarName=var2Name,
                                       lonOffset=lonOffset,
                                       latOffset=latOffset,
                                       dtFromCenter=dtFromCenter,
                                       imageVar=imageVar,
                                       openTab=False)

# Launch the clickable histogram, and let the sampling begin! 
## One click-response must finish before the next can begin. 

If you want the output of CHAD to be in a separate window, uncomment `%qtconsole`. If it stays commented out, the text output showing the responses to your clicks will appear below the last cell.

In [27]:
#%qtconsole
ClickHist1 = ClickHist.ClickHist(var1Edges,var2Edges,
                                 var1Values,var2Values,
                                 xVarName=var1Name,yVarName=var2Name,
                                 xUnits=var1Units,yUnits=var2Units,
                                 xFmtStr=var1FmtStr,
                                 yFmtStr=var2FmtStr,
                                 maxPlottedInBin=housekeeping_G5NR.maxPlottedInBin_UD,
                                 quantiles=quantiles,
                                 metadata=metadata_UD)
ClickHist1.setDo(ClickHistDo1)
ClickHist1.showPlot()

Saving IDV bundle(s)...
2007-06-05 08:30:00
208 E -20 N
X= 13 mm day-1 Y=0.281 W m-2
x%: 98.889 y%: 96.667

Link to cloudsir image: http://g5nr.nccs.nasa.gov/static/naturerun/fimages/CLOUDSIR/Y2007/M06/D05/cloudsir_globe_c1440_NR_BETA9-SNAP_20070605_0830z.png

Link to tropical image: http://g5nr.nccs.nasa.gov/static/naturerun/fimages/TROPICAL/Y2007/M06/D05/tropical_globe_c1440_NR_BETA9-SNAP_20070605_0830z.png

Bundle 'full' Saved!
Bundle 'simple' Saved!

Creating Case Notebook (MyFirstSession_Precip_quantile_98.889_SKEDot_quantile_96.667_lat_-20_lon_208_time_20070605_0830.ipynb)
Copy the template
Adding quicklook images to the notebook
Now looping over  ['cloudsir', 'tropical'] ...
 trying  cloudsir ...
appending 0
 trying  tropical ...
appending 1 and adding URL
Case Notebook created!
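
The two quicklook links in the output above follow a regular pattern. The sketch below rebuilds such a link from an image name and a time. The pattern is inferred only from those two sample links, so it may not hold for every option or date, and `quicklook_url` is a hypothetical helper rather than a ClickHist function.

```python
import datetime

# Base path copied from the sample links in the output above.
BASE = 'http://g5nr.nccs.nasa.gov/static/naturerun/fimages/'

def quicklook_url(image_var, when):
    """Build a quicklook image URL, following the pattern of the sample links."""
    return (BASE + image_var.upper() +
            when.strftime('/Y%Y/M%m/D%d/') +
            image_var + '_globe_c1440_NR_BETA9-SNAP_' +
            when.strftime('%Y%m%d_%H%M') + 'z.png')

case_time = datetime.datetime(2007, 6, 5, 8, 30)
urls = [quicklook_url(name, case_time) for name in ['cloudsir', 'tropical']]
```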
