# Drilling down into GEOS-5 Nature Run 4-degree stats

This software is from https://github.com/brianmapes/CHAD_G5NR <br>
Questions? Brian Mapes ([mapes@miami.edu](mailto:mapes@miami.edu))<br>

The original author was Matthew Niznik (a postdoc, 2015-2016), who wrote the more general ClickHist package:<br>
https://github.com/matthewniznik/ClickHist/wiki<br>
http://matthewniznik.com/research-projects/clickhist<br>

## Now, pick a Tag for this session's *Case Notebooks*.
#### A Case Notebook is generated separately from this one and contains snapshots of ClickHists and other images related to each case you select.
This way, you can remember what you were working on without much extra effort.

In [2]:
caseNotebookFilename = 'Hamburg_Dec2016'

## (2) Set the variables, data sources, and other necessary information.
### What geographic subset are you interested in exploring?

Longitude: 0 through 360 (Degrees East)<br>
Latitude: -90 through 90 (Degrees North)<br>

In [3]:
lonLow = 360.-160.
lonHigh = 360.-150.
latLow = -25.0
latHigh = -15.0

# Time indices (not dates): a crude limit on the time range to speed up testing
timelimit1 = 18000
timelimit2 = 18010
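
The bounds above are written as `360.-160.` because longitudes here run from 0 through 360 degrees East, so 160 W becomes 200 E. A minimal sketch of that conversion follows; `to_degrees_east` is a hypothetical helper for illustration and is not part of CHAD.

```python
def to_degrees_east(lon):
    """Map a longitude in degrees (e.g. -160 for 160 W) onto [0, 360)."""
    # Hypothetical helper: Python's % returns a non-negative result
    # for a positive modulus, so negative (west) longitudes wrap around.
    return lon % 360.0

lonLow = to_degrees_east(-160.0)   # 160 W
lonHigh = to_degrees_east(-150.0)  # 150 W
print('%s to %s E' % (lonLow, lonHigh))
```

Writing the bounds this way keeps the west-longitude values visible while still matching the dataset's convention.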

### Three URLs of preprocessed GEOS-5 data at 4 degree coarse grid scale are below. Select the time interval you prefer.

**Every Hour:**

In [4]:
urlToLoad = ('http://weather.rsmas.miami.edu/repository/opendap/'+
             'synth:eab82de2-d682-4dc0-ba8b-2fac7746d269:'+
             'L2FsbFZhcnNfcjkweDQ1XzIwMTZfRGVjX25vWlNLRURPVC5uYzQ=/entry.das')  
# http://weather.rsmas.miami.edu/repository/entry/show?entryid=synth%3Aeab82de2-d682-4dc0-ba8b-2fac7746d269%3AL2FsbFZhcnNfcjkweDQ1XzIwMTZfRGVjX25vWlNLRURPVC5uYzQ%3D

### Now let's get some information on the variables you want

**For this data, we've preprogrammed all of the units and data into a module so you just have to pick from a list of options (case sensitive):**<br>
Precip, W500, wPuP, TEEF, SKEDot, HMV

In [5]:
var1Name = 'Precip'
var2Name = 'SKEDot'

### What kind of snapshot from the [online G5NR repository](http://g5nr.nccs.nasa.gov/images/) of pre-made images would you like?

**Options:** 'cloudsir', 'cloudsvis', 'cyclones', 'storms', 'temperature', 'tropical', 'water', 'winds'
<br>*N.B. Must be a list. If image saving time is prohibitively long, shorten the list of variables.*

In [6]:
imageVar = ['cloudsir','tropical']

## (1) Setting Input/Output Files
## First, you need to choose the *template* IDV bundle.
### This is an IDV bundle with your desired data and displays that ClickHist will alter to focus on the time and location relevant to scatter points you select.
**Note:** The first bundle in both lists here will be the one referenced in the later script that generates images, movies, and a .zidv file. It should probably be the "full" bundle with the variables you want to study.

In [1]:
bundleInFilenames = ['G5NR_template_full', 'G5NR_template_simple_cloudsir']
bundleOutTags = ['full', 'simple']

### Set how large you want the IDV bundle to be in space and time
#### Each of these is calculated as distance from center, so `lonOffset = 1.0` means 2.0° of longitude.
#### `dtFromCenter` needs to be in seconds

In [7]:
lonOffset = 1.0
latOffset = 1.0
dtFromCenter = 3*3600.
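
To make the "distance from center" convention concrete, here is a small sketch of the window these three settings imply around a selected point. The function `case_window` is hypothetical, and the assumption that the window is symmetric about the point is mine; how `ClickHistDo` applies the offsets internally is not shown in this notebook.

```python
from datetime import datetime, timedelta

def case_window(lon, lat, when, lonOffset=1.0, latOffset=1.0,
                dtFromCenter=3 * 3600.):
    """Return ((west, east), (south, north), (start, end)) around a point.

    Hypothetical helper: assumes a window symmetric about the center.
    """
    dt = timedelta(seconds=dtFromCenter)
    return ((lon - lonOffset, lon + lonOffset),
            (lat - latOffset, lat + latOffset),
            (when - dt, when + dt))

lons, lats, times = case_window(208.0, -20.0, datetime(2007, 6, 5, 6, 30))
print('lon %s, lat %s' % (lons, lats))
print('time span: %s' % (times[1] - times[0]))
```

With the defaults above, the window is 2 degrees wide in each direction and 6 hours long.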

### Would you like specific quantiles indicated in X and Y?
**If so, specify them here.** The values are given in percent (0 to 100), so `99.9` marks the 99.9th percentile.

In [8]:
quantiles = [0.01,0.1,1,5,95,99,99.9,99.99]
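
For readers who want to see what such a list corresponds to numerically, the sketch below computes thresholds with `np.percentile` on synthetic data. It assumes the list is in percent (suggested by values such as 99.99); the synthetic sample and the way the thresholds are printed are illustrative only and do not reproduce ClickHist's own code.

```python
import numpy as np

quantiles = [0.01, 0.1, 1, 5, 95, 99, 99.9, 99.99]

# Synthetic, skewed sample standing in for a variable such as precipitation.
rng = np.random.RandomState(0)
sample = rng.gamma(shape=0.5, scale=10.0, size=10000)

# np.percentile takes percentages in the range 0 to 100.
thresholds = np.percentile(sample, quantiles)
for q, t in zip(quantiles, thresholds):
    print('%6.2f%% -> %.3f' % (q, t))
```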

### Import the modules CHAD needs

*Currently supported graphics backends are Qt4Agg ('qt4') and TK ('tk')*

In [9]:
#%matplotlib tk
%matplotlib qt4
import matplotlib
#matplotlib.use('TkAgg')
#matplotlib.use('Qt4Agg')

from IPython.display import clear_output
import netCDF4 # If this gives an error, run "conda install netCDF4" or "pip install netCDF4" in a terminal
import sys

import ClickHist_G5NR as ClickHist
import ClickHistDo_G5NR as ClickHistDo
import housekeeping_G5NR
import numpy as np

#### The following (less often edited) items are set to default values in the module `housekeeping_G5NR`

(You can change them in the module if desired, but they are left out here to save space. For "advanced" users.)

In [10]:
lonValueName = housekeeping_G5NR.lonValueName
latValueName = housekeeping_G5NR.latValueName
timeValueName = housekeeping_G5NR.timeValueName
startDatetime = housekeeping_G5NR.startDatetime

var1Edges = housekeeping_G5NR.binOptions[var1Name]
var2Edges = housekeeping_G5NR.binOptions[var2Name]

var1FmtStr = housekeeping_G5NR.fmtStrOptions[var1Name]
var2FmtStr = housekeeping_G5NR.fmtStrOptions[var2Name]

var1ValueName = housekeeping_G5NR.valueNameOptions[var1Name]
var2ValueName = housekeeping_G5NR.valueNameOptions[var2Name]

var1Units = housekeeping_G5NR.varUnitOptions[var1Name]
var2Units = housekeeping_G5NR.varUnitOptions[var2Name]
metadata_UD = (var1Name+' vs '+var2Name+': '+
               str(lonLow)+' to '+str(lonHigh)+' E, '+
               str(latLow)+' to '+str(latHigh)+' N')

var1ValueMult = housekeeping_G5NR.varMultOptions[var1Name]
var2ValueMult = housekeeping_G5NR.varMultOptions[var2Name]

## (3) Load the Data

**N.B.** CHAD currently expects the 3-D variables to be in the Python format `variable[times,latitudes,longitudes]`.<br>
If this is not the default, you will have to either permute the data below or preprocess the data so that it matches this format.
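
If a dataset stores its dimensions in another order, `np.transpose` can permute them. The sketch below uses a synthetic array whose axis order (`[longitudes, latitudes, times]`) is an assumed example, not a property of the G5NR files.

```python
import numpy as np

# Synthetic stand-in for a variable stored as [longitudes, latitudes, times].
raw = np.zeros((3, 4, 10))

# New axis order: old axis 2 (time), then old axis 1 (lat), then old axis 0 (lon).
values = np.transpose(raw, (2, 1, 0))
print(values.shape)
```

The tuple passed to `np.transpose` lists, for each new axis, which old axis it comes from.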

In [11]:
cdfIn = netCDF4.Dataset(urlToLoad,'r')

In [12]:
#cdfIn.variables

In [13]:
lonValues = cdfIn.variables[lonValueName][:]
latValues = cdfIn.variables[latValueName][:]
timeValues = cdfIn.variables[timeValueName][timelimit1:timelimit2]*housekeeping_G5NR.timeValueMult

#### By finding the needed index ranges here, we can load a subset of the data over the web instead of loading it all and subsetting later.

In [14]:
lowLonInt,highLonInt = housekeeping_G5NR.getIntEdges(lonValues,lonLow,lonHigh)
lowLatInt,highLatInt = housekeeping_G5NR.getIntEdges(latValues,latLow,latHigh)
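
The source of `getIntEdges` is not shown in this notebook. As a rough illustration of the idea, the sketch below finds the first and last indices of grid points that fall inside a range. `find_index_edges` is a hypothetical stand-in, and the 4-degree grids are assumed values chosen for the example; the real function may treat the edges differently.

```python
import numpy as np

def find_index_edges(coords, low, high):
    """Return (first, last) indices of ascending coords within [low, high].

    Hypothetical stand-in for housekeeping_G5NR.getIntEdges.
    """
    inside = np.where((coords >= low) & (coords <= high))[0]
    if inside.size == 0:
        raise ValueError('no grid points between %s and %s' % (low, high))
    return int(inside[0]), int(inside[-1])

# Assumed 4-degree grids, for illustration only.
lons = np.arange(0.0, 360.0, 4.0)
lats = np.arange(-88.0, 89.0, 4.0)

lon_edges = find_index_edges(lons, 200.0, 210.0)
lat_edges = find_index_edges(lats, -25.0, -15.0)
print('lon indices %s, lat indices %s' % (lon_edges, lat_edges))
```

Because both indices are inclusive, a slice needs `high + 1` as its upper bound, which is why the load cells below use `lowLatInt:highLatInt+1`.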

In [15]:
# How many values are we asking for?
print highLatInt+1-lowLatInt, 'x', highLonInt+1-lowLonInt, 'x', np.size(timeValues)

3 x 3 x 10


*Based on the above, the following variable load command may take some time.*

In [16]:
var1Values = cdfIn.variables[var1ValueName][timelimit1:timelimit2,
                                            lowLatInt:highLatInt+1,
                                            lowLonInt:highLonInt+1]*\
                                            var1ValueMult
var2Values = cdfIn.variables[var2ValueName][timelimit1:timelimit2,
                                            lowLatInt:highLatInt+1,
                                            lowLonInt:highLonInt+1]*\
                                            var2ValueMult

np.shape(var1Values)

(10, 3, 3)

*(We now subset the longitudes and latitudes since the previous call to getIntEdges needed the full lon/lat.)*

In [17]:
lonValues = lonValues[lowLonInt:highLonInt+1]
latValues = latValues[lowLatInt:highLatInt+1]

In [18]:
cdfIn.close()

In [19]:
timeValues

array([ 64800000.,  64803600.,  64807200.,  64810800.,  64814400.,
        64818000.,  64821600.,  64825200.,  64828800.,  64832400.])
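
These values are large because they are offsets in seconds (after multiplying by `timeValueMult`) from `startDatetime`. The sketch below shows how one such offset can be turned into a calendar time. The start time used here is a placeholder chosen for illustration only; the real value is `housekeeping_G5NR.startDatetime`, which this notebook does not print.

```python
from datetime import datetime, timedelta

# PLACEHOLDER start time, for illustration only.
# The real value is housekeeping_G5NR.startDatetime.
assumed_start = datetime(2005, 5, 16, 0, 30)

def to_datetime(seconds, start=assumed_start):
    """Convert an offset in seconds from `start` into a datetime."""
    return start + timedelta(seconds=float(seconds))

# Consecutive values in the array above differ by 3600 s, i.e. one hour.
first = to_datetime(64800000.)
second = to_datetime(64803600.)
print(second - first)
```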

## (4) Create ClickHist and ClickHistDo Instances

#### This call is necessary to make sure the output displays properly

(If interested in the details, see: http://bit.ly/1SsishU)

In [20]:
oldsysstdout = sys.stdout
sys.stdout = housekeeping_G5NR.flushfile(sys.stdout)
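
The implementation of `housekeeping_G5NR.flushfile` is not reproduced here. A common pattern for this kind of wrapper, shown below as an assumption about what it does, is a small class that flushes the underlying stream after every write so text appears immediately. `FlushFile` is a hypothetical name.

```python
import sys

class FlushFile(object):
    """Wrap a file-like object so every write is flushed immediately.

    Hypothetical sketch; not the code in housekeeping_G5NR.
    """
    def __init__(self, stream):
        self.stream = stream

    def write(self, text):
        self.stream.write(text)
        self.stream.flush()

    def flush(self):
        self.stream.flush()

old_stdout = sys.stdout
sys.stdout = FlushFile(sys.stdout)
print('this line is flushed as soon as it is written')
sys.stdout = old_stdout  # restore the original stream when done
```

Keeping a reference to the original stream, as `oldsysstdout` does above, makes it possible to undo the wrapping later.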

### Initialize 'ClickHistDo'

In [21]:
ClickHistDo1 = ClickHistDo.ClickHistDo(lonValues,latValues,
                                       timeValues,startDatetime,
                                       bundleInFilenames,
                                       bundleOutTags,
                                       caseNotebookFilename,
                                       xVarName=var1Name,
                                       yVarName=var2Name,
                                       lonOffset=lonOffset,
                                       latOffset=latOffset,
                                       dtFromCenter=dtFromCenter,
                                       imageVar=imageVar,
                                       openTab=False)

### Initialize 'ClickHist' and launch!

If you want the output of CHAD to appear in a separate window, make sure `%qtconsole` below is not commented out. Otherwise, the text output will appear below the last cell.

In [22]:
#%qtconsole
ClickHist1 = ClickHist.ClickHist(var1Edges,var2Edges,
                                 var1Values,var2Values,
                                 xVarName=var1Name,yVarName=var2Name,
                                 xUnits=var1Units,yUnits=var2Units,
                                 xFmtStr=var1FmtStr,
                                 yFmtStr=var2FmtStr,
                                 maxPlottedInBin=housekeeping_G5NR.maxPlottedInBin_UD,
                                 quantiles=quantiles,
                                 metadata=metadata_UD)
ClickHist1.setDo(ClickHistDo1)
ClickHist1.showPlot()

Saving IDV bundle(s)...
2007-06-05 06:30:00
208 E -20 N
X= 11 mm day-1 Y=-0.417 W m-2
x%: 96.667 y%: 1.111

Link to cloudsir image: http://g5nr.nccs.nasa.gov/static/naturerun/fimages/CLOUDSIR/Y2007/M06/D05/cloudsir_globe_c1440_NR_BETA9-SNAP_20070605_0630z.png

Link to tropical image: http://g5nr.nccs.nasa.gov/static/naturerun/fimages/TROPICAL/Y2007/M06/D05/tropical_globe_c1440_NR_BETA9-SNAP_20070605_0630z.png

Bundle 'full' Saved!
Bundle 'simple' Saved!

Creating Case Notebook (Hamburg_Dec2016_Precip_quantile_96.667_SKEDot_quantile_1.111_lat_-20_lon_208_time_20070605_0630.ipynb)
Copy the template
Adding quicklook images to the notebook
Now looping over  ['cloudsir', 'tropical'] ...
 trying  cloudsir ...
appending 0
 trying  tropical ...
appending 1 and adding URL
Case Notebook created!
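
The two image links printed above follow a regular pattern. The sketch below rebuilds them from an image type and a time. The pattern is inferred only from these two links, so it is an assumption that it holds for the other `imageVar` options; `quicklook_url` is a hypothetical helper, not a ClickHistDo function.

```python
from datetime import datetime

BASE = 'http://g5nr.nccs.nasa.gov/static/naturerun/fimages'

def quicklook_url(image_var, when):
    """Build a G5NR quicklook image URL (pattern inferred from sample output)."""
    return ('%s/%s/Y%04d/M%02d/D%02d/%s_globe_c1440_NR_BETA9-SNAP_%s.png'
            % (BASE, image_var.upper(), when.year, when.month, when.day,
               image_var, when.strftime('%Y%m%d_%H%Mz')))

case_time = datetime(2007, 6, 5, 6, 30)
for name in ['cloudsir', 'tropical']:
    print(quicklook_url(name, case_time))
```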
