<p style="float:right">
<img src="images/cu.png" style="display:inline" />
<img src="images/cires.png" style="display:inline" />
<img src="images/nasa.png" style="display:inline" />
</p>

# Python, Jupyter & pandas tutorial: Module 2

## Obtaining data and basic inspection

### Basic data access

It is, of course, possible to obtain data externally, in a separate terminal or browser window, or from within the notebook via the `%%script` magic, which saves the trouble of leaving the notebook. ("Data" is roughly construed here: we'll look at images because they're simple to view.) We can fetch an image to the local filesystem, then display it with Markdown:

In [78]:
%%script bash
wget ftp://sidads.colorado.edu/DATASETS/NOAA/G02135/Feb/N_197902_extn.png

--2016-03-10 16:51:19--  ftp://sidads.colorado.edu/DATASETS/NOAA/G02135/Feb/N_197902_extn.png
           => 'N_197902_extn.png'
Resolving sidads.colorado.edu... 128.138.135.20
Connecting to sidads.colorado.edu|128.138.135.20|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD (1) /DATASETS/NOAA/G02135/Feb ... done.
==> SIZE N_197902_extn.png ... 224912
==> PASV ... done.    ==> RETR N_197902_extn.png ... done.
Length: 224912 (220K) (unauthoritative)

     0K .......... .......... .......... .......... .......... 22% 1.54M 0s
    50K .......... .......... .......... .......... .......... 45% 45.6M 0s
   100K .......... .......... .......... .......... .......... 68% 40.5M 0s
   150K .......... .......... .......... .......... .......... 91% 44.5M 0s
   200K .......... .........                                  100%  171M=0.04s

2016-03-10 16:51:19 (6.11 MB/s) - 'N_197902_extn.png' saved [224912]



<img src='N_197902_extn.png' style='float:left'/>

Or, we can obtain an image directly from the internet and display it with Python code:

In [79]:
from IPython.display import Image
Image(url='ftp://sidads.colorado.edu/DATASETS/NOAA/G02135/Feb/N_201602_extn.png')

**Open question:** Do the "Total area" figures account for the difference in pole-hole size? Here, the hole appears to be full of ice, but what about earlier or later in the season?

### OpenDAP data access

The `netCDF4` package provides OpenDAP client capabilities. Here we use it to obtain data from an OpenDAP server at NSIDC:

In [3]:
import netCDF4
opendap_base = 'http://opendap.apps.nsidc.org:80/opendap/DATASETS'
opendap_data = 'nsidc0530_MEASURES_nhsnow_daily25/2012/nhtsd25e2_20120101_v01r01.nc'
url = '/'.join([opendap_base, opendap_data])
dataset = netCDF4.Dataset(url)

In [5]:
type(dataset)

netCDF4._netCDF4.Dataset

In [6]:
import re
list(filter(lambda x: not re.match('^_.*', x), dir(dataset)))

['Conventions',
 'DODS_EXTRA.Unlimited_Dimension',
 'Metadata_Conventions',
 'cdm_data_type',
 'close',
 'cmptypes',
 'createCompoundType',
 'createDimension',
 'createEnumType',
 'createGroup',
 'createVLType',
 'createVariable',
 'data_model',
 'date_created',
 'delncattr',
 'dimensions',
 'disk_format',
 'enumtypes',
 'file_format',
 'filepath',
 'geospatial_lat_max',
 'geospatial_lat_min',
 'geospatial_lat_units',
 'geospatial_lon_max',
 'geospatial_lon_min',
 'geospatial_lon_units',
 'get_variables_by_attributes',
 'getncattr',
 'groups',
 'id',
 'institution',
 'isopen',
 'keepweakref',
 'keywords',
 'keywords_vocabulary',
 'license',
 'metadata_link',
 'naming_authority',
 'ncattrs',
 'parent',
 'path',
 'platform',
 'product_version',
 'reference',
 'renameAttribute',
 'renameDimension',
 'renameGroup',
 'renameVariable',
 'sensor',
 'set_auto_mask',
 'set_auto_maskandscale',
 'set_auto_scale',
 'set_fill_off',
 'set_fill_on',
 'setncattr',
 'setncatts',
 'source',
 'spatial_re

In [7]:
dataset.title

'MEaSUREs Northern Hemisphere Terrestrial Snow Cover Extent Daily 25km EASE-Grid 2.0'

In [8]:
for variable in dataset.variables:
    print(variable)

time
rows
cols
coord_system
latitude
longitude
merged_snow_cover_extent
ims_snow_cover_extent
passive_microwave_gap_filled_snow_cover_extent
modis_cloud_gap_filled_snow_cover_extent


In [72]:
latitude = dataset.variables['latitude']
latitude

<class 'netCDF4._netCDF4.Variable'>
float32 latitude(rows, cols)
    _FillValue: -999.0
    long_name: latitude of cell center in EASE-Grid-2.0
    units: degrees_north
    valid_range: [-90.  90.]
    standard_name: latitude
unlimited dimensions: 
current shape = (720, 720)
filling off

In [10]:
type(latitude)

netCDF4._netCDF4.Variable

In [12]:
latitude.datatype

dtype('float32')

In [13]:
latitude.long_name

'latitude of cell center in EASE-Grid-2.0'

In [21]:
latitude.valid_range

array([-90.,  90.], dtype=float32)

In [24]:
latitude.shape

(720, 720)

In [16]:
latitude.ndim

2

In [20]:
latitude.size

518400

In [11]:
list(filter(lambda x: not re.match('^_.*', x), dir(latitude)))

['assignValue',
 'chunking',
 'datatype',
 'delncattr',
 'dimensions',
 'dtype',
 'endian',
 'filters',
 'getValue',
 'get_var_chunk_cache',
 'getncattr',
 'group',
 'long_name',
 'mask',
 'name',
 'ncattrs',
 'ndim',
 'renameAttribute',
 'scale',
 'set_auto_mask',
 'set_auto_maskandscale',
 'set_auto_scale',
 'set_var_chunk_cache',
 'setncattr',
 'setncatts',
 'shape',
 'size',
 'standard_name',
 'units',
 'valid_range']
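The regular-expression filter used in the two `dir()` cells above can also be written with `str.startswith`, which avoids the `re` import. The sketch below wraps that in a small helper; `public_names` and the `Example` class are hypothetical names introduced only for illustration, and `Example` stands in for a netCDF4 object so the snippet runs without a network connection.

```python
def public_names(obj):
    """Return the names from dir(obj) that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith('_')]


class Example:
    """Hypothetical stand-in for a netCDF4 object, used only for this demo."""
    units = 'degrees_north'
    _hidden = 'not listed'

    def ncattrs(self):
        # Mimics the idea of listing metadata attribute names only.
        return ['units']


names = public_names(Example())
print(names)  # ['ncattrs', 'units']
```

Note that the `dir()` listings mix methods (such as `getncattr`) with metadata attributes (such as `units`). The `ncattrs` method that appears in both listings should return just the metadata attribute names, which may be the more convenient view when only the metadata is of interest.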

In [30]:
len(latitude)

720

In [31]:
len(latitude[0])

720

In [85]:
time = dataset.variables['time']
print(time)
print(time[0])

<class 'netCDF4._netCDF4.Variable'>
int32 time(time)
    calendar: gregorian
    axis: T
    units: days since 1998-12-31
    long_name: time
    standard_name: time
unlimited dimensions: time
current shape = (1,)
filling off

4749
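The raw value `4749` only makes sense together with the `units` string, `days since 1998-12-31`. As a minimal sketch, the offset can be converted to a calendar date with the standard library; the two inputs below are copied by hand from the output above rather than read from the dataset.

```python
from datetime import date, timedelta

# Reference date taken from the units string "days since 1998-12-31".
epoch = date(1998, 12, 31)
# Value printed by time[0] above.
offset_days = 4749

observation_date = epoch + timedelta(days=offset_days)
print(observation_date.isoformat())  # 2012-01-01
```

The result agrees with the `20120101` in the file name. The `netCDF4` package also offers a `num2date` function that takes the values, the units string, and the calendar, which is likely the better choice for calendars other than the standard Gregorian one.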


In [70]:
longitude = dataset.variables['longitude']
longitude

<class 'netCDF4._netCDF4.Variable'>
float32 longitude(rows, cols)
    _FillValue: -999.0
    long_name: longitude of cell center in EASE-Grid-2.0
    units: degrees_east
    valid_range: [-180.  180.]
    standard_name: longitude
unlimited dimensions: 
current shape = (720, 720)
filling off

In [107]:
msce = dataset.variables['merged_snow_cover_extent']
msce

<class 'netCDF4._netCDF4.Variable'>
int16 merged_snow_cover_extent(time, rows, cols)
    flag_meanings: modis_microwave_ims_report_snow modis_microwave_report_snow modis_ims_report_snow microwave_ims_report_snow modis_only_reports_snow microwave_only_reports_snow ims_only_reports_snow snow_free_land permanent_ice ocean
    flag_values: [10 11 12 13 14 15 16 20 30 40]
    _FillValue: -99
    comment: 10: Snow cover reported by modis_cloud_gap_filled, passive_microwave, ims, 11: Snow cover reported by modis_cloud_gap_filled, passive_microwave,  12: Snow cover reported by modis_cloud_gap_filled, ims, 13: Snow cover reported by passive_microwave, ims, 14: Snow cover reported by modis_cloud_gap_filled only, 15: Snow cover reported by passive_microwave only, 16: Snow cover reported by ims only, 20: Snow free land, 30: Permanent ice covered land, 40: Ocean
    valid_range: [10 40]
    coordinates: longitude latitude time
    long_name: Merged Snow Cover Extent
    grid_mapping: coord_system
u

In [108]:
msce[0][360][360]

40
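To interpret a cell value such as `40`, the `flag_values` and `flag_meanings` attributes shown in the variable's metadata can be paired up. The sketch below assumes the usual convention that the n-th value corresponds to the n-th space-separated meaning; both inputs are copied by hand from the output above, and `flag_lookup` is a name introduced here for illustration.

```python
# Values copied from the flag_values and flag_meanings attributes shown above.
flag_values = [10, 11, 12, 13, 14, 15, 16, 20, 30, 40]
flag_meanings = (
    "modis_microwave_ims_report_snow modis_microwave_report_snow "
    "modis_ims_report_snow microwave_ims_report_snow "
    "modis_only_reports_snow microwave_only_reports_snow "
    "ims_only_reports_snow snow_free_land permanent_ice ocean"
)

# Pair the n-th flag value with the n-th space-separated meaning.
flag_lookup = dict(zip(flag_values, flag_meanings.split()))

print(flag_lookup[40])  # ocean
```

So the cell at row 360, column 360 (the center of the grid) is classified as ocean, which is consistent with the `comment` attribute.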

In [109]:
import numpy as np
msce = np.array(msce)[0, :, :]
msce.shape

(720, 720)

In [111]:
print(msce.size)
print(msce[msce != -99].size)

518400
408052
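The two sizes above differ by the number of cells holding the fill value `-99`. The same boolean-mask approach extends naturally to counting how many cells carry each flag value. The sketch below uses a tiny made-up grid in place of the real 720 x 720 array, so the counts it prints say nothing about the actual dataset; `FILL_VALUE`, `sample`, and `tally` are illustrative names.

```python
import numpy as np

FILL_VALUE = -99  # _FillValue reported for merged_snow_cover_extent

# Tiny made-up grid standing in for the real 720 x 720 array.
sample = np.array([[-99, 40, 40],
                   [20, 10, -99],
                   [30, 20, 40]], dtype=np.int16)

# Keep only cells that are not the fill value, as in the cell above.
valid = sample[sample != FILL_VALUE]

# Count the occurrences of each remaining flag value.
values, counts = np.unique(valid, return_counts=True)
tally = {int(v): int(c) for v, c in zip(values, counts)}

print(valid.size)  # 7
print(tally)       # {10: 1, 20: 2, 30: 1, 40: 3}
```

Applied to the real `msce` array, the same two lines (`np.unique` with `return_counts=True`, then the dictionary comprehension) would give a per-category cell count.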
