# Plot temperature time series from an aggregated product

This notebook shows how to plot the time series that the LTPS aggregation combines into a single file. Note that Aggregated-TimeSeries products are usually very large files with millions of records, so running this notebook may cause memory problems.

In [2]:
import xarray as xr
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt


In this case the file is accessed from the [AODN THREDDS](http://thredds.aodn.org.au/thredds/catalog/IMOS/catalog.html) server.

In [3]:
fname = 'http://thredds.aodn.org.au/thredds/dodsC/IMOS/ANMN/QLD/PIL050/aggregated_timeseries/IMOS_ANMN-QLD_TZ_20120221_PIL050_FV01_TEMP-aggregated-timeseries_END-20140816_C-20190819.nc'

nc = xr.open_dataset(fname)

Look at the structure of the aggregated dataset. In this case the file is an aggregation of TEMP from 16 instruments at the QLD site PIL050. The file has two dimensions: INSTRUMENT, the number of instruments aggregated, and OBSERVATION, the total number of observations combined from the 16 instruments.

In [None]:
nc
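
To make this layout concrete, here is a minimal sketch on a tiny synthetic dataset (not the real file; the sizes and values are invented) that mimics the two dimensions. Each observation carries an `instrument_index` that points into the INSTRUMENT dimension, so counting the observations contributed by each instrument is a simple `bincount`.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for an aggregated file: 2 instruments, 5 observations.
# Variable names mirror the real product, but every value here is invented.
toy = xr.Dataset(
    {
        "TEMP": ("OBSERVATION", np.array([25.1, 25.3, 25.2, 24.8, 24.9])),
        "instrument_index": ("OBSERVATION", np.array([0, 0, 0, 1, 1])),
        "NOMINAL_DEPTH": ("INSTRUMENT", np.array([10.0, 20.0])),
    }
)

# Number of observations contributed by each instrument
obs_per_instrument = np.bincount(
    toy["instrument_index"].values, minlength=toy.sizes["INSTRUMENT"]
)
print(dict(toy.sizes))
print(obs_per_instrument)
```

The same `bincount` call should work on the real dataset, assuming `instrument_index` is a non-negative integer variable as it is in this sketch.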

You can plot TEMP against its OBSERVATION dimension to get an overview of all the time series concatenated one after the other.

In [None]:
nc.TEMP.plot()

With this format, plotting the variable of interest (TEMP) directly against TIME is not recommended, because the TIME variable sequence can contain overlapping timestamps.

In [None]:
nc.TEMP.plot(x='TIME')
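
The overlap problem can be checked before plotting. The sketch below uses two invented deployments with overlapping time ranges (a stand-in, not the real data) and tests whether the concatenated TIME sequence is sorted and how many timestamps repeat.

```python
import pandas as pd

# Two hypothetical deployments whose time ranges overlap (values invented)
time = pd.to_datetime(
    ["2012-02-21", "2012-02-22", "2012-02-23",   # instrument 0
     "2012-02-22", "2012-02-23", "2012-02-24"]   # instrument 1
)
toy = pd.DataFrame({"TIME": time, "instrument_index": [0, 0, 0, 1, 1, 1]})

# TIME is ordered within each instrument but not across the whole sequence
is_sorted = toy["TIME"].is_monotonic_increasing
n_repeated = int(toy["TIME"].duplicated().sum())
print(is_sorted, n_repeated)
```

When `is_sorted` is false, a single line drawn against TIME jumps backwards wherever one deployment ends and the next begins, which is why plotting per instrument is the safer choice.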

Let's have a look at the individual deployments, identified by the variable `instrument_index`. To do that, we can convert the xarray dataset to a pandas data frame, which is more flexible and presents the results in a tabular format.

In [4]:
## convert to a data frame and close the nc connection
df = nc.to_dataframe()
nc.close()
df.columns

Index(['TEMP', 'TEMP_quality_control', 'TIME', 'DEPTH',
       'DEPTH_quality_control', 'PRES', 'PRES_quality_control', 'PRES_REL',
       'PRES_REL_quality_control', 'instrument_index', 'LONGITUDE', 'LATITUDE',
       'NOMINAL_DEPTH', 'instrument_id', 'source_file'],
      dtype='object')
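
`Dataset.to_dataframe()` indexes its result by the Cartesian product of all the dataset's dimensions, so a dataset with both OBSERVATION and INSTRUMENT dimensions yields one row per (observation, instrument) pair. If memory is tight, one option is to convert only the variables that live on the OBSERVATION dimension. The sketch below illustrates this on a synthetic dataset; `obs_vars` is a hypothetical helper name, and whether the real file behaves the same way is an assumption to verify.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in: 6 observations from 2 instruments (values invented)
toy = xr.Dataset(
    {
        "TEMP": ("OBSERVATION", np.linspace(24.0, 26.5, 6)),
        "instrument_index": ("OBSERVATION", np.array([0, 0, 0, 1, 1, 1])),
        "NOMINAL_DEPTH": ("INSTRUMENT", np.array([10.0, 20.0])),
    }
)

# Converting everything: rows = OBSERVATION x INSTRUMENT
df_full = toy.to_dataframe()

# Converting only the per-observation variables: rows = OBSERVATION
obs_vars = [
    name for name, var in toy.data_vars.items() if var.dims == ("OBSERVATION",)
]
df_obs = toy[obs_vars].to_dataframe()

print(len(df_full), len(df_obs))
```

Per-instrument metadata such as `NOMINAL_DEPTH` can still be looked up afterwards through `instrument_index`, without repeating it on every row.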

We use the powerful `groupby` method to group by instrument and produce a summary table of basic statistics for `TEMP`.

In [10]:
TIMEmin = df['TIME'].min()
TIMEmax = df['TIME'].max()


In [11]:
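# Group the records by deployment.
# Note: storing the result as an attribute of df (df.grouped) works, but pandas
# emits a UserWarning for it; assigning to a plain variable would avoid that.
df.grouped = df.groupby(['instrument_index'])
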

In [12]:
df.grouped['TEMP'].describe()

| instrument_index | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|
| 0 | 4148160.0 | 26.884838 | 2.622774 | 22.2404 | 23.729601 | 27.6136 | 28.7047 | 30.569 |
| 1 | 414816.0 | 26.041752 | 2.6902 | 21.690001 | 22.93 | 27.709999 | 28.65 | 29.469999 |
| 2 | 2074064.0 | 26.414576 | 2.575372 | 22.3549 | 23.7446 | 27.6105 | 28.715401 | 30.5777 |
| 3 | 2074064.0 | 25.895723 | 2.753637 | 21.3454 | 22.7047 | 27.6873 | 28.6325 | 29.454399 |
| 4 | 375936.0 | 24.808086 | 1.319306 | 22.84 | 23.73 | 24.629999 | 25.459999 | 29.32 |
| 5 | 375952.0 | 25.497343 | 1.965282 | 23.131701 | 23.781 | 25.118099 | 26.255899 | 33.877998 |
| 6 | 1879696.0 | 25.758133 | 2.116447 | 23.181601 | 23.806801 | 25.4478 | 27.2202 | 30.490499 |
| 7 | 1879696.0 | 24.813429 | 1.314867 | 22.8409 | 23.717899 | 24.611099 | 25.477699 | 30.0194 |
| 8 | 2041424.0 | 26.732534 | 2.35607 | 22.4697 | 24.006001 | 27.8622 | 28.9594 | 33.055302 |
| 9 | 2041408.0 | 27.520563 | 2.221495 | 23.512199 | 25.426176 | 28.951099 | 29.298599 | 32.917 |
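
`describe()` also computes quartiles, which may be more than is needed. As a possible alternative, named aggregation returns only the chosen statistics and can include the time coverage of each deployment. This is a minimal sketch on invented data; the output column names (`n_obs`, `temp_mean`, `time_start`, `time_end`) are hypothetical.

```python
import pandas as pd

# Synthetic stand-in for the aggregated data frame (values invented)
toy = pd.DataFrame(
    {
        "instrument_index": [0, 0, 0, 1, 1],
        "TIME": pd.to_datetime(
            ["2012-02-21", "2012-02-22", "2012-02-23", "2012-02-22", "2012-02-24"]
        ),
        "TEMP": [25.0, 26.0, 27.0, 24.0, 25.0],
    }
)

# One row per instrument: record count, mean temperature, and time coverage
summary = toy.groupby("instrument_index").agg(
    n_obs=("TEMP", "count"),
    temp_mean=("TEMP", "mean"),
    time_start=("TIME", "min"),
    time_end=("TIME", "max"),
)
print(summary)
```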


In [20]:
df.grouped.plot(x='TIME', y='TEMP', kind='line', subplots=True, xlim=(TIMEmin, TIMEmax), figsize=(10,4))

MemoryError: 

In [19]:
plt.close('all')
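
The grouped `plot` call above draws every record of every group at full resolution, which may be what exhausts memory. A lighter approach, sketched here on synthetic data, is to loop over the groups, thin each series, and draw them all on one set of axes. `step` is a hypothetical thinning factor, and fixed-stride thinning can hide short-lived events, so treat this as a quick-look plot rather than a replacement for the full data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the aggregated data frame (values invented)
rng = np.random.default_rng(0)
n = 1000
toy = pd.DataFrame(
    {
        "instrument_index": np.repeat([0, 1], n),
        "TIME": np.tile(pd.date_range("2012-02-21", periods=n, freq="D"), 2),
        "TEMP": 25 + rng.normal(0, 1, 2 * n),
    }
)

step = 10  # hypothetical thinning factor: keep every 10th record
fig, ax = plt.subplots(figsize=(10, 4))
for idx, group in toy.groupby("instrument_index"):
    thinned = group.iloc[::step]
    ax.plot(thinned["TIME"], thinned["TEMP"], linewidth=0.5,
            label=f"instrument {idx}")
ax.set_xlim(toy["TIME"].min(), toy["TIME"].max())
ax.set_ylabel("TEMP")
ax.legend()
plt.close(fig)  # release the figure's memory once it is no longer needed
```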