# Disclaimer
This UI is meant to demonstrate the art of the possible. Its components were created in different and possibly disjointed ways to showcase various capabilities. The goal is to help you think about what the user experience COULD look like, rather than to build a fully functional user interface.

# Overview
Welcome to the new way to load data! We have created this notebook so that you can select what type of data you want and then load that data without having to figure out how. You don't need to know where the data is coming from or how to extract it - just pick the specifications you want for the data, run the program, and you will get a dataset returned that you can interact with. 

## What to expect in each section
### Data specification exploration
In this section, you will see which data specifications you can select. For example, we'll show you some of the options for the activities, and you can simply select the activities you are interested in. These come from the Amazon Sustainability Data Initiative ([ASDI](https://sustainability.aboutamazon.com/environment/the-cloud/amazon-sustainability-data-initiative)); follow the link if you want to learn more about the data specifications.

### Getting the data set
Once you have selected your data specifications, you can simply run the cell to get the data set. Behind the scenes, the notebook determines where the data is stored and how to extract it, then retrieves it in just a few minutes. Feel free to grab a coffee while you wait for it to run.

### Data exploration and analysis
Now that you have the data, you can use this section of the notebook to run experiments on the data, plot it, create hypotheses, and eventually publish your analysis. If you decide you want to examine a different data set, you can start again at the beginning and keep all of the code you've written here to analyze the newly returned data set.

# Data specification exploration

## View and select the desired inputs for the data
The cells below will show you some of the available data, and allow you to specify variables as well as filter down lists to make it easy for you to further specify what data you want. 

In [None]:
# Display and save variable categories (e.g. temperature)
%run get_variables.ipynb

### For demo purposes
Let's take a look at the values we've stored: the desired attribute, the data type, and the start and end dates.

In [None]:
%store -r desired_attribute data_type start_date end_date
print(desired_attribute)
print(data_type)
print(start_date)
print(end_date)
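The `data_type` value printed above decides which analysis sections run later in this notebook. The sketch below mirrors the two checks used in the predictive and historical cells. The helper name `sections_to_run` and the value `"both"` are hypothetical and used only for illustration; the notebook itself only compares against `"historical"` and `"predictive"`.

```python
def sections_to_run(data_type: str) -> list:
    """Return the analysis sections that would run for a given data_type.

    Mirrors the two checks used later in this notebook:
    the predictive cell runs unless data_type is "historical", and
    the historical cell runs unless data_type is "predictive".
    """
    sections = []
    if data_type != "historical":
        sections.append("predictive")
    if data_type != "predictive":
        sections.append("historical")
    return sections


# "both" is a hypothetical value used only for illustration.
for value in ("historical", "predictive", "both"):
    print(value, "->", sections_to_run(value))
```

Under these checks, any value other than the two named ones runs both sections.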

## Set catalog inputs
The cells below will show you some of the available catalog inputs, and allow you to specify variables as well as filter down lists to make it easy for you to further specify what data you want. 

In [None]:
%run get_catalog_inputs.ipynb

### For demo purposes
Let's take a look at the catalog inputs we've stored: the activity, variable, and table IDs.

In [None]:
%store -r activity_id
%store -r variable_id
%store -r table_id
print(activity_id)
print(variable_id)
print(table_id)

## Getting the data set
Now that we have selected and stored our variables, we will run the script with some of those variables as inputs and get the data back as a result.

#### Run processes to generate data set
The cell below runs the processes that generate the data set, using the filters and specifications provided above.

In [None]:
%%time
%%capture
%run get_data.ipynb
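The cell above times the run and suppresses its output. For readers who want a similar effect outside IPython, here is a rough plain-Python analogue built from the standard library. It is only a sketch, not how the magics are implemented: it captures `stdout` only, and `run_quietly_and_time` and `noisy_step` are hypothetical names.

```python
import contextlib
import io
import time


def run_quietly_and_time(func):
    """Call func with stdout captured; return (result, captured_text, seconds)."""
    buffer = io.StringIO()
    start = time.perf_counter()
    with contextlib.redirect_stdout(buffer):
        result = func()
    elapsed = time.perf_counter() - start
    return result, buffer.getvalue(), elapsed


def noisy_step():
    # Hypothetical stand-in for a long-running, chatty data retrieval step.
    print("fetching...")
    return 42


result, captured, seconds = run_quietly_and_time(noisy_step)
print(f"result={result}, captured={captured!r}, took {seconds:.4f}s")
```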

### Cost
Now, let's look at how much this calculation cost to run.

In [None]:
%store -r cost_information
print(cost_information)

## Data exploration and analysis
We can now analyze the data set, which is available as an xarray object. The cells below calculate the time mean and a regridded version of the data set.

In [None]:
import matplotlib.pyplot as plt  # used for plt.title in the plotting cells below
import numpy as np
from dask_worker_pools import pool, propagate_pools

In [None]:
# Some parameters for regridding (target grid with 0.1-degree spacing)
regrid_lon=np.arange(0.0,360.0,0.1)
regrid_lat=np.arange(-90.0,90.0,0.1)
regrid_method='slinear'
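It can help to check what these parameters imply before running the regridding. The sketch below rebuilds the same two coordinate arrays and reports their extent and the rough memory footprint of one regridded field, assuming `float64` values. Because `np.arange` excludes its stop value, the latitude grid ends just short of 90 degrees.

```python
import numpy as np

# Same target grid as in the cell above.
regrid_lon = np.arange(0.0, 360.0, 0.1)
regrid_lat = np.arange(-90.0, 90.0, 0.1)

# Number of cells in one 2-D field and its approximate size as float64.
n_cells = regrid_lon.size * regrid_lat.size
size_mb = n_cells * np.dtype("float64").itemsize / 1e6

print(f"lon: {regrid_lon.size} points, {regrid_lon[0]:.1f} to {regrid_lon[-1]:.1f}")
print(f"lat: {regrid_lat.size} points, {regrid_lat[0]:.1f} to {regrid_lat[-1]:.1f}")
print(f"cells per field: {n_cells}, about {size_mb:.1f} MB as float64")
```

The grid comes to about 3,600 by 1,800 points, so each regridded field is on the order of tens of megabytes.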

### Predictive

#### Retrieve calculated data set

In [None]:
%%time
if data_type != "historical":
    %store -r prediction_pool_region predictive_data_set
    with pool(prediction_pool_region):
        # Convert from Kelvin to Celsius, then average over time
        predicted_tas_mean = (predictive_data_set['tas'] - 273.15).mean(dim='time')
        # Rough regridding: interpolate onto the target grid defined above
        predicted_tas_regridded = predicted_tas_mean.interp(lon=regrid_lon, lat=regrid_lat, method=regrid_method)

    with pool(prediction_pool_region):
        # Explicit compute so you can see where it happens (.plot() alone would trigger it implicitly).
        # Keep the computed result so that plotting does not recompute it.
        predicted_tas_regridded = predicted_tas_regridded.compute()
        predicted_tas_regridded.plot(figsize=(14, 7))
        plt.title(f'2090-2100 Mean {desired_attribute.replace("_", " ")}')
else:
    print("Did not run predictive calculations for historical data")

### Historical

#### Perform Calculations

In [None]:
%%time
if data_type != "predictive":
    %store -r historical_data historical_pool_region
#     %matplotlib widget
    with pool(historical_pool_region):
        # Convert from Kelvin to Celsius, then average over time
        historic_temp_in_C = historical_data[desired_attribute] - 273.15
        historic_temp_mean = historic_temp_in_C.mean(dim='time0')
        # Rough regridding: interpolate onto the target grid defined above
        historic_temp_regridded = historic_temp_mean.interp(lon=regrid_lon, lat=regrid_lat, method=regrid_method)
        # Keep the computed result so that plotting does not recompute it
        historic_temp_regridded = historic_temp_regridded.compute()
        historic_temp_regridded.plot(figsize=(14, 7))
        plt.title(f'ERA5 Mean {desired_attribute.replace("_", " ")}')
else:
    print("Did not run historical calculations for predictive data")
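The predictive and historical cells follow the same pattern: convert to Celsius, average over time, then interpolate onto a finer grid. The sketch below tries that pattern on a small synthetic array using only NumPy, so it runs without the stored data sets or worker pools. It is an illustration under stated assumptions, not the notebook's method: `mean_celsius_regrid` is a hypothetical helper, the data is made up, and `np.interp` holds edge values constant outside the source range, which may differ from how xarray's `interp` treats out-of-range points.

```python
import numpy as np

KELVIN_OFFSET = 273.15  # 0 degrees Celsius expressed in Kelvin


def mean_celsius_regrid(temps_k, lat, lon, new_lat, new_lon):
    """Time-mean in Celsius, then separable linear interpolation to a new grid.

    temps_k has shape (time, lat, lon); lat and lon must be increasing.
    Returns an array of shape (len(new_lat), len(new_lon)).
    """
    mean_c = temps_k.mean(axis=0) - KELVIN_OFFSET
    # Interpolate each latitude row along longitude.
    along_lon = np.stack([np.interp(new_lon, lon, row) for row in mean_c])
    # Interpolate each new-longitude column along latitude.
    return np.stack(
        [np.interp(new_lat, lat, col) for col in along_lon.T], axis=1
    )


# Synthetic (made-up) temperature field that varies linearly with lat and lon.
lat = np.linspace(-90.0, 90.0, 5)
lon = np.linspace(0.0, 360.0, 9)
base = KELVIN_OFFSET + 0.1 * lat[:, None] + 0.01 * lon[None, :]
temps_k = np.stack([base - 1.0, base, base + 1.0])  # shape (time, lat, lon)

new_lat = np.linspace(-90.0, 90.0, 19)
new_lon = np.linspace(0.0, 360.0, 37)
regridded = mean_celsius_regrid(temps_k, lat, lon, new_lat, new_lon)
print(regridded.shape)  # prints (19, 37)
```

Because the synthetic field is linear in latitude and longitude, linear interpolation reproduces it exactly up to floating-point error, which makes the result easy to sanity-check.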

## Publish analysis
Once you are ready to publish your analysis, click the button below to publish it as a web page for others to view.

In [None]:
%run publish_research.ipynb