# CIS Introduction

CIS has its own version of the Iris Cube, designed to work with any observational data. This CIS data structure is called UngriddedData:

<img src="../images/ungridded_data.png" width="640"/>

First, create a variable that points to the example data:

In [None]:
shared_data_path = '/mnt/efs/fs1/Data_Analysis_In_Python/Data/'

In [None]:
from cis import read_data, read_data_list, get_variables

get_variables(shared_data_path+'Aeronet/920801_150530_Brussels.lev20')

In [None]:
aeronet_aot_500 = read_data(shared_data_path+"Aeronet/920801_150530_Brussels.lev20", "AOT_500")
print(aeronet_aot_500)

In [None]:
aeronet_aot_500.name()

## Some example datasets

In [None]:
%matplotlib inline

### Ungridded time series data

In [None]:
aeronet_aot_500.plot()

In [None]:
ax = aeronet_aot_500.plot(color='red')
ax.set_yscale('log')

In [None]:
aeronet_aot = read_data_list(shared_data_path+"Aeronet/920801_150530_Brussels.lev20", 
                             ['AOT_500', 'AOT_675'])
ax = aeronet_aot.plot()

In [None]:
ax.set_title('Brussels Aeronet AOT')
ax.set_xlabel('Date')

In [None]:
from datetime import datetime 
ax.set_xlim(datetime(2007,5,5), datetime(2007,8,26))    

In [None]:
aeronet_aot.plot(how='comparativescatter')
# Note that this will only work if we have two datasets in our list

### Subsetting

CIS can `subset` datasets along any of their coordinates:

In [None]:
aeronet_aot_2007 = aeronet_aot.subset(t=[datetime(2007,1,1), datetime(2007,12,31)])
aeronet_aot_2007
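
Conceptually, a time subset keeps only the points whose time coordinate falls inside the requested range. The following is a minimal plain-Python sketch of that idea, not the CIS implementation: the `subset_by_time` helper and the sample values are hypothetical, and treating both bounds as inclusive is an assumption.

```python
from datetime import datetime

# Hypothetical ungridded time series: one timestamp and one value per point.
times = [datetime(2006, 12, 30), datetime(2007, 1, 1), datetime(2007, 6, 15),
         datetime(2007, 12, 31), datetime(2008, 1, 2)]
values = [0.21, 0.18, 0.35, 0.12, 0.27]


def subset_by_time(times, values, start, end):
    """Keep only the points whose timestamp lies in [start, end] (assumed inclusive)."""
    kept = [(t, v) for t, v in zip(times, values) if start <= t <= end]
    kept_times = [t for t, _ in kept]
    kept_values = [v for _, v in kept]
    return kept_times, kept_values


times_2007, values_2007 = subset_by_time(times, values,
                                         datetime(2007, 1, 1), datetime(2007, 12, 31))
print(values_2007)
```

Because every point carries its own coordinates, the same filtering idea applies to any other coordinate, such as latitude or longitude.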

In [None]:
aeronet_aot_2007.plot(how='comparativescatter')

### Model time series

In [None]:
model_aod = read_data(shared_data_path+"od550aer.nc", "od550aer")

In [None]:
print(model_aod)

In [None]:
import iris.analysis
maod_global_mean, = model_aod.collapsed(['longitude', 'latitude'], iris.analysis.MEAN)

In [None]:
print(maod_global_mean)

In [None]:
ax = maod_global_mean.plot(itemwidth=2)

In [None]:
aeronet_aot_500.plot(ax=ax)

### Aircraft data

In [None]:
number_concentration = read_data(shared_data_path+'ARCPAC_2008',
                                 'NUMBER_CONCENTRATION')
print(number_concentration)

In [None]:
import numpy as np
# Fix the mask on the data by masking negative values
number_concentration.data = np.ma.masked_less(number_concentration.data, 0.)
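
To see what `np.ma.masked_less` does in isolation, here is a small NumPy-only sketch. The array values are made up, and the idea that negative numbers stand in for missing measurements is an assumption for illustration.

```python
import numpy as np

# Hypothetical concentrations where negative numbers stand in for missing values.
raw = np.array([120.0, -9999.0, 85.5, -1.0, 0.0, 42.0])

# masked_less masks every element strictly less than the threshold (here 0.0),
# so the zero itself is kept as valid data.
masked = np.ma.masked_less(raw, 0.0)

print(masked)
print("valid points:", masked.count())
print("mean of valid points:", masked.mean())
```

Masked elements are ignored by reductions such as `mean`, so the fill values no longer distort statistics computed from the data.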

In [None]:
ax = number_concentration.plot()

In [None]:
ax.bluemarble()


### Satellite data

In [None]:
aerosol_cci = read_data(shared_data_path+'AerosolCCI', 'AOD550')
aerosol_cci.plot()

In [None]:
aerosol_cci_one_day = read_data(shared_data_path+'AerosolCCI/20080415*.nc', 'AOD550')
ax = aerosol_cci_one_day.plot()

In [None]:
aerosol_cci_one_day.plot(projection='Orthographic')

In [None]:
ax = aerosol_cci_one_day.plot(projection='InterruptedGoodeHomolosine')
ax.bluemarble()

## Aggregation

Given a set of UngriddedData...

<img src="../images/ungridded_aggregation_1.png" width="640"/>

... we can perform an aggregation over a specified grid...

<img src="../images/ungridded_aggregation_2.png" width="640"/>

... to create a new GriddedData object (which is essentially an Iris Cube)

<img src="../images/ungridded_aggregation_3.png" width="640"/>

In [None]:
gridded_aerosol_cci_one_day = aerosol_cci_one_day.aggregate(x=[-180,180,10], y=[-90,90,5])

In [None]:
gridded_aerosol_cci_one_day[0].plot()
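
The binning idea behind aggregation can be sketched with NumPy alone. This is not how CIS implements it: the `aggregate_mean` helper and the sample points are hypothetical, the grid is built by reading `[-180, 180, 10]` and `[-90, 90, 5]` as start, end and step, and using the mean as the reduction is an assumption.

```python
import numpy as np


def aggregate_mean(lon, lat, values, lon_edges, lat_edges):
    """Average scattered values into the cells of a regular lon/lat grid."""
    bins = [lon_edges, lat_edges]
    # Sum of the values and number of points falling in each cell.
    sums, _, _ = np.histogram2d(lon, lat, bins=bins, weights=values)
    counts, _, _ = np.histogram2d(lon, lat, bins=bins)
    # Divide only where there is data; empty cells stay NaN and are then masked.
    means = np.divide(sums, counts, out=np.full_like(sums, np.nan), where=counts > 0)
    return np.ma.masked_invalid(means), counts


# Hypothetical ungridded points: one longitude, latitude and value each.
lon = np.array([-175.0, -172.0, 5.0, 8.0, 8.0])
lat = np.array([-88.0, -87.0, 2.0, 3.0, 41.0])
values = np.array([0.1, 0.3, 0.2, 0.4, 0.5])

lon_edges = np.arange(-180, 181, 10)  # 10 degree cells in longitude
lat_edges = np.arange(-90, 91, 5)     # 5 degree cells in latitude

means, counts = aggregate_mean(lon, lat, values, lon_edges, lat_edges)
print(means.shape, "cells, of which", means.count(), "contain data")
```

Keeping the per-cell counts alongside the means makes it possible to tell well-sampled cells from those built on a single observation.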

## Exercises

**1.** Read in ``AOD550`` and ``AOD670`` from the 5 days of satellite data 

**2.** Subset this data down to the region covered by the aircraft data

**3.** Try plotting ``AOD550`` against ``AOD670`` from the subsetted satellite data using a comparative scatter plot


## Collocation

<img src="../images/collocation_options.png" width="640"/>

### Model onto Aeronet

<img src="../images/model_onto_aeronet.png" width="640"/>

This is a gridded onto ungridded collocation and can be done using either linear interpolation or nearest neighbour.
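
The difference between the two options can be illustrated in one dimension with NumPy. This sketch is not the CIS code path: the grid, the values and the sample times are invented, and only the time axis is considered.

```python
import numpy as np

# Hypothetical model values on a regular time grid (hours since a reference time).
grid_times = np.array([0.0, 6.0, 12.0, 18.0, 24.0])
grid_values = np.array([0.10, 0.20, 0.40, 0.30, 0.10])

# Hypothetical observation times, most of which fall between grid points.
sample_times = np.array([2.0, 7.0, 12.0, 22.5])

# Nearest neighbour: take the grid value whose time is closest to each sample.
distances = np.abs(grid_times[None, :] - sample_times[:, None])
nearest = grid_values[distances.argmin(axis=1)]

# Linear interpolation: weight the two bracketing grid values by distance.
linear = np.interp(sample_times, grid_times, grid_values)

print("nearest:", nearest)
print("linear: ", linear)
```

Both methods agree where a sample coincides with a grid point (12.0 here) and differ in between, where nearest neighbour produces steps and linear interpolation a smooth ramp.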

This is very quick and in general CIS can even handle hybrid height coordinates: 

<img src="../images/gridded_ungridded_collocation.png" width="640"/>

In [None]:
# Let's take a closer look at the model data
print(model_aod)

In [None]:
from cis.time_util import PartialDateTime
# First subset the aeronet data:
aeronet_aot_2008 = aeronet_aot_500.subset(t=PartialDateTime(2008))

Note that this subsetting is not strictly required, but without it CIS will interpolate from the nearest values, which is not what we want in this case.

In [None]:
# Now do the collocation:
model_aod_onto_aeronet = model_aod.collocated_onto(aeronet_aot_2008)

In [None]:
print(model_aod_onto_aeronet[0])

Note the updated history

In [None]:
from cis.plotting.plot import multilayer_plot, taylor_plot
ax = multilayer_plot([model_aod_onto_aeronet[0], aeronet_aot_2008], 
                     layer_opts=[dict(label='Model'), 
                                 dict(label='Aeronet')], xaxis='time',
                    itemwidth=1)

In [None]:
taylor_plot([aeronet_aot_2008, model_aod_onto_aeronet[0]], 
            layer_opts=[dict(label='Aeronet'),dict(label='Model')])

In [None]:
# Basic maths on the data
print(model_aod_onto_aeronet[0] - aeronet_aot_2008)

### Aircraft onto satellite

<img src="../images/aircraft_onto_satellite.png" width="640"/>

As you can see, the difficulty here is the sparseness of the aircraft data, and indeed of the satellite data in this region.

This is an ungridded to ungridded collocation:

<img src="../images/ungridded_ungridded_collocation.png" width="640" />

In [None]:
# Read all of the AOD satellite variables
aerosol_cci = read_data_list(shared_data_path+'AerosolCCI', 'AOD*0')
aoerosol_cci_Alaska = aerosol_cci.subset(x=[-170,-100],y=[35,80])

In [None]:
print(aerosol_cci)

In [None]:
aoerosol_cci_Alaska[0].plot(yaxis='latitude')

In [None]:
aerosol_cci_collocated = aoerosol_cci_Alaska.collocated_onto(number_concentration, 
                                                             h_sep=10, t_sep='P1D')
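
The idea behind this kind of collocation can be sketched in plain Python: for each sample point, gather the data points that are close enough in both space and time, then reduce them to a single value. This is an illustration rather than the CIS algorithm. The helper names and all coordinates are hypothetical, `h_sep` is assumed to be in kilometres, `'P1D'` is read as an ISO 8601 duration of one day, and averaging is assumed as the reduction.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def collocate_mean(samples, data, h_sep_km, t_sep):
    """For each sample, average the data values within both separations.

    Returns one (mean or None, number of matched points) pair per sample.
    """
    results = []
    for s_lat, s_lon, s_time in samples:
        matched = [value for d_lat, d_lon, d_time, value in data
                   if haversine_km(s_lat, s_lon, d_lat, d_lon) <= h_sep_km
                   and abs(d_time - s_time) <= t_sep]
        mean = sum(matched) / len(matched) if matched else None
        results.append((mean, len(matched)))
    return results


# Hypothetical aircraft samples: (lat, lon, time).
samples = [(65.0, -150.0, datetime(2008, 4, 15, 12)),
           (70.0, -140.0, datetime(2008, 4, 15, 13))]

# Hypothetical satellite retrievals: (lat, lon, time, value).
data = [(65.02, -150.05, datetime(2008, 4, 15, 21), 0.20),  # near in space and time
        (65.05, -149.95, datetime(2008, 4, 15, 3), 0.30),   # near in space and time
        (65.01, -150.01, datetime(2008, 4, 18, 12), 0.90),  # near in space, 3 days off
        (66.50, -150.00, datetime(2008, 4, 15, 12), 0.50)]  # same time, too far away

results = collocate_mean(samples, data, h_sep_km=10, t_sep=timedelta(days=1))
print(results)
```

Returning the number of matched points with each mean shows how many observations back each collocated value, and samples with no match stay empty instead of receiving a fabricated value.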

In [None]:
aerosol_cci_collocated.append(number_concentration)
print(aerosol_cci_collocated)

In [None]:
aerosol_cci_collocated = aerosol_cci_collocated[::3]

In [None]:
aerosol_cci_collocated[:2].plot(how='comparativescatter')

## Exercises

**1.** How does the correlation change if we only include those average number concentrations which averaged more than one point?

**2.** Consider the case of comparing our model AOD with the AerosolCCI.

**a.** What strategies could you employ?
    
**b.** Perform an initial assessment of the model AOD field using the Aerosol CCI data for the few days we have data.

## CIS and Pandas

In [None]:
df = aerosol_cci_collocated.as_data_frame()
print(df)

In [None]:
df.corr()
# Then do a pretty plot of it...
# This is a nice segue into the Pandas lesson.

In [None]:
# Save the collocation output so that we can come back to it during the Pandas tutorial.
aerosol_cci_collocated.save_data('col_output.nc')