## Data Report Assignment

<span style='color:blue'> Instructions </span>:

- Rename this notebook to include your name.
- Use the `*_recovered.csv` file that lists the `*.nc` files.
- Use the outline below to complete the assignment.

### Notebook Outline:
- Plot **two** of the science variables using the [**plot_var( )**](#1) function.
  - Load your data files.
  - Choose **sci_var** from the following list:
     ['density', 'practical_salinity', 'ctdmo_seawater_temperature', 'ctdmo_seawater_conductivity']
  - Choose **ctdmo_seawater_pressure** for the **outlier_var**.
  - To run the plot_var() function, use the following syntax:
      fig, ax1, ax2 = plot_var(file_list, sci_var, outlier_var)
  - The function should return two plots.
  - Add the dependency: the reject_outliers() function.
 
- Compare data annotations and data notes to the observations in the example notebook and answer the following question:
    - Do the annotations and notes address all data issues? Explain your answer. Refer to the previous notebooks from Modules 3 and 4 if necessary.
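
The "load your data files" step can be sketched with the standard-library `csv` module. This is only a minimal sketch under stated assumptions: the CSV is assumed to have a header row with a column named `files` (inferred from the `file_list['files']` lookup in plot_var), and `load_file_list`, `example_recovered.csv`, and the `.nc` names below are hypothetical placeholders, not the course files.

```python
import csv
import os
import tempfile


def load_file_list(csv_path):
    """Read the 'files' column of a CSV into a dict usable as file_list['files']."""
    with open(csv_path, newline='') as f:
        reader = csv.DictReader(f)
        # skip rows where the 'files' cell is missing or empty
        return {'files': [row['files'] for row in reader if row.get('files')]}


# Hypothetical stand-in for the *_recovered.csv file (placeholder names only).
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, 'example_recovered.csv')
with open(csv_path, 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['files'])
    writer.writerow(['first_placeholder.nc'])
    writer.writerow(['second_placeholder.nc'])

file_list = load_file_list(csv_path)
print(file_list['files'])
```

If pandas is available, `pandas.read_csv(csv_path)` returns a DataFrame that supports the same `file_list['files']` indexing.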

<a id="1"></a>
### plot_var function:


from datetime import datetime

import matplotlib.pyplot as plt
import xarray as xr


def plot_var(file_list, sci_var, outlier_var):
    '''
    file_list: a table with a 'files' column of netCDF file names
               (e.g. a pandas DataFrame read from the .csv file).
    sci_var: name of a science variable in the netCDF file(s).
    outlier_var: name of the variable used to find outliers in the data.
    
    - to run the function, use the following syntax:
    fig, ax1, ax2 = plot_var(file_list, sci_var, outlier_var)
    
    - dependencies: reject_outliers()
    '''
    
    fig, (ax1, ax2) = plt.subplots(2,1,figsize=(14,10))
    
    for dataset in file_list['files']:

        #open dataset
        ds = xr.open_dataset(dataset)
        
        # prepare time array for plotting
        ds = ds.swap_dims({'obs': 'time'})        
        date_arr = [str(d)[:-4] for d in ds['time'].values]
        date_arr = [datetime.strptime(d, '%Y-%m-%dT%H:%M:%S.%f') for d in date_arr]
        
        # get dataset method and set yaxis label
        delivery_method = dataset.split('_')[1].split('-')[4]
        # separate the name from the units, e.g. "long name (units)"
        yaxis_label = f"{ds[sci_var].long_name} ({ds[sci_var].units})"

        # alternate colors between telemetered and recovered datasets
        if delivery_method == 'recovered':
            color_dots = 'blue'
        else:
            color_dots = 'cyan'
        
        # Get outlier indices using the function 'reject_outliers()'.
        # Notice the variable used to select outliers is seawater pressure
        ind = reject_outliers(ds[outlier_var].values, 5)
        
        # split indices into good (True) and suspect (False) observations
        res_t = [i for i, val in enumerate(ind) if val]
        res_f = [i for i, val in enumerate(ind) if not val]

        date_list_good = [date_arr[i] for i in res_t]
        temp_good = [ds[sci_var].values[i] for i in res_t]

        date_list_suspect = [date_arr[i] for i in res_f]
        temp_suspect = [ds[sci_var].values[i] for i in res_f]

        # plot all data
        ds[sci_var].plot(ax=ax1, linestyle='None', marker='.', markersize=1, color=color_dots) 
        # plot outliers
        ax1.plot(date_list_suspect, temp_suspect, linestyle='None', marker='o', markersize=10, fillstyle='none')
        # set subplot 1
        ax1.set_ylabel(yaxis_label, rotation='vertical', wrap=False)
        ax1.set(title='with outliers')
        
        # plot non-suspect data
        ax2.plot(date_list_good, temp_good, linestyle='None', marker='.', markersize=1, color=color_dots) 
        # set subplot 2
        ax2.set(title='without outliers')
        ax2.set_ylabel(yaxis_label, rotation='vertical', wrap=False)
        
    # adjust subplot spacing once, after all datasets have been plotted
    plt.tight_layout()
        
    return fig, ax1, ax2
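
plot_var() calls reject_outliers(), which is not defined in this notebook. The following is a minimal sketch of such a helper, not necessarily the version from the earlier modules. It assumes, based on how plot_var() uses the result, that the function returns a boolean mask with True for values to keep, and that the second argument is the number of standard deviations from the mean. The pressure values are synthetic and only illustrate the call.

```python
import numpy as np


def reject_outliers(data, m=3):
    """Return a boolean mask: True where a value lies within m standard
    deviations of the mean (NaNs are ignored when computing the statistics)."""
    data = np.asarray(data, dtype=float)
    stdev = np.nanstd(data)
    if stdev > 0.0:
        # NaN entries compare as False here, so they are flagged as suspect
        return np.abs(data - np.nanmean(data)) < m * stdev
    # constant data: nothing can be flagged as an outlier
    return np.ones(data.shape, dtype=bool)


# Synthetic pressure-like values with one obvious spike (made-up numbers).
pressure = np.array([10.0] * 50 + [500.0])
mask = reject_outliers(pressure, 5)
print(mask.sum(), (~mask).sum())
```

Because the mask is True for retained values, it matches the `val`/`not val` split that plot_var() uses to separate good observations from suspect ones.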