# Table of Contents
* 1 User story
* 2 Dataset creation
  * 2.1 Example dataset we are working with
* 3 Queries available through the query API
  * 3.1 Selected point table view of all variables (columns)
  * 3.2 Selected point table view of selected columns filtering by date
  * 3.3 Selected point min and max year available
  * 3.4 Selected point agg statistics for selected columns
  * 3.5 Histograms: range of years
  * 3.6 Histograms: select one year

In [1]:
import requests
import json
import pandas as pd 
from pprint import pprint
from IPython.display import display, Markdown, HTML
import matplotlib.pyplot as plt
from numpy import arange

In [2]:
prod_env = 'https://api.resourcewatch.org'
staging_env = 'https://staging-api.globalforestwatch.org'
api_post = staging_env + '/v1/{endpoint}'
api_baseurl = staging_env + '/v1/{endpoint}/{dataset_id}'

def query(sql, dataset_id, geostore):
    """Run `sql` against `dataset_id` for the given geostore.

    Returns a DataFrame on success; prints the error body and returns None otherwise.
    """
    display(Markdown(">#### Sql applied:  "))
    display(Markdown(">>```sql {0}  ```".format(sql)))
    payload = {"sql": sql,
               "geostore": geostore,
               "geoagg": "max"}
    r = requests.get(api_baseurl.format(endpoint='query', dataset_id=dataset_id), params=payload)
    if r.status_code != 200:
        # Show the API error body; callers must handle the None result.
        print(r.json())
        return None
    return pd.read_json(json.dumps(r.json()['data']), orient='records')

# User story

Dataset creation:
* User should be able to create as many nexgddp datasets as needed (different time aggregation levels, scenarios and models).

Widget editor:  
* User should be able to select a point in space (latitude + longitude). In the future, selecting a bounding box will also be allowed, in which case the user needs to select the type of aggregation.
* User can filter the range of years from the available years. Available years will be provided in the `/fields` response.
* User can select the type of chart (line or bar) and display it. Years should be ordered ascending by default.
* User has the Category, Value and Filters options (no order, no limit).
* User can save the widget and embed it elsewhere.

***Once bounding box selection is implemented:*** User can select different aggregation functions over the original data: avg, min, max, std, plus st_histogram and temporal series.


*[Source document](https://docs.google.com/document/d/1psuVjlVYmgUx7ba_1B9aUw5VuNtP3zp8dERPgExV098/edit?ts=59c0e980)*

# Dataset creation:

```json
{
    "connectorType":"rest",
    "provider":"nexgddp",
    "tableName": "historical/ACCESS1_0",
    "application":[
     "rw"
    ],
    "name":"Nexgddp",
    ...
}
```
Dataset aggregated by year. The table name is `<scenario>/<model>`, all in lowercase and with `_` instead of `-`.  
For example: `historical/access1_0`
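
The naming rule above can be sketched as a small helper. The function name `table_name` is hypothetical and not part of the API; it only applies the stated convention (lowercase, `-` replaced by `_`).

```python
def table_name(scenario, model):
    """Build a table name following the `<scenario>/<model>` convention."""
    # Lowercase both parts and swap '-' for '_', as described above.
    return "{0}/{1}".format(scenario.lower(), model.lower().replace("-", "_"))


print(table_name("historical", "ACCESS1-0"))  # historical/access1_0
```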

List of scenarios
* historical
* rcp45
* rcp85

List of models available
* ACCESS1-0 
* CSIRO-MK3-6-0 
* MIROC-ESM
* BCC-CSM1-1 
* GFDL-CM3 
* MIROC-ESM-CHEM
* BNU-ESM 
* GFDL-ESM2G 
* MIROC5
* CanESM2 
* GFDL-ESM2M 
* MPI-ESM-LR
* CCSM4 
* INMCM4 
* MPI-ESM-MR
* CESM1-BGC 
* IPSL-CM5A-LR 
* MRI-CGCM3
* CNRM-CM5 
* IPSL-CM5A-MR 
* NorESM1-M


## Example dataset we are working with:

In [3]:
dataset_id='nexgddp-historical-ACCESS1_0'
geojson={"type":"FeatureCollection","features":[{"type":"Feature","properties":{},"geometry":{"type":"Point","coordinates":[-3.6749267578125,40.43858586704331]}}]}

r = requests.get(api_baseurl.format(endpoint='dataset',dataset_id=dataset_id))
pprint(r.json())    

{'data': {'attributes': {'application': ['prep'],
                         'attributesPath': None,
                         'blockchain': {},
                         'clonedHost': {},
                         'connectorType': 'rest',
                         'connectorUrl': '',
                         'dataPath': None,
                         'env': 'production',
                         'errorMessage': None,
                         'geoInfo': False,
                         'layerRelevantProps': [],
                         'legend': {'country': [], 'date': [], 'region': []},
                         'name': 'nexgddp-historical-ACCESS1_0',
                         'overwrite': False,
                         'provider': 'nexgddp',
                         'published': True,
                         'slug': 'nexgddp-historical-ACCESS1_0',
                         'status': 'saved',
                         'subtitle': None,
                         'tableName': 'historical/ACCESS1_

In [4]:
payload = {"geojson":geojson}
r = requests.post(api_post.format(endpoint='geostore'),json=payload)
geostore=r.json()['data']['id']

display(Markdown("""> #### Geojson to display:  

>> ```json 
{0} 
```""".format(geojson)))
display(Markdown("> ### Geostore creation:  "))
pprint(r.json())

> #### Geojson to display:  

>> ```json 
{'type': 'FeatureCollection', 'features': [{'type': 'Feature', 'properties': {}, 'geometry': {'type': 'Point', 'coordinates': [-3.6749267578125, 40.43858586704331]}}]} 
```

> ### Geostore creation:  

{'data': {'attributes': {'areaHa': 0,
                         'bbox': [-3.6749267578125,
                                  40.43858586704331,
                                  -3.6749267578125,
                                  40.43858586704331],
                         'geojson': {'crs': {},
                                     'features': [{'geometry': {'coordinates': [-3.6749267578125,
                                                                                40.43858586704331],
                                                                'type': 'Point'},
                                                   'type': 'Feature'}],
                                     'type': 'FeatureCollection'},
                         'hash': 'e669023a10c35364e0b1780c181bd06b',
                         'lock': False,
                         'provider': {}},
          'id': 'e669023a10c35364e0b1780c181bd06b',
          'type': 'geoStore'}}
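
Since the widget editor starts from a latitude/longitude pair, a minimal sketch of building the GeoJSON sent to the geostore endpoint may help. The helper name `point_feature_collection` is hypothetical; the only facts relied on are the payload shape used above and the GeoJSON convention that coordinates are ordered `[longitude, latitude]`.

```python
def point_feature_collection(lon, lat):
    """Wrap a single point in a GeoJSON FeatureCollection."""
    # GeoJSON positions are [longitude, latitude]; reject obviously swapped or invalid values.
    if not (-180 <= lon <= 180 and -90 <= lat <= 90):
        raise ValueError("lon must be in [-180, 180] and lat in [-90, 90]")
    return {
        "type": "FeatureCollection",
        "features": [{
            "type": "Feature",
            "properties": {},
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
        }],
    }


# Same point as the example above; this dict is what goes under the "geojson" key.
payload = {"geojson": point_feature_collection(-3.6749267578125, 40.43858586704331)}
```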


In [5]:
r = requests.get(api_baseurl.format(endpoint='fields',dataset_id=dataset_id))
display(Markdown("> ### Dataset fields metadata: "))
pprint(r.json())

> ### Dataset fields metadata: 

{'fields': {'cdd': {'type': 'Quantity', 'uom': '10^0'},
            'cwd': {'type': 'Quantity', 'uom': '10^0'},
            'frost_days': {'type': 'Quantity', 'uom': '10^0'},
            'icing_days': {'type': 'Quantity', 'uom': '10^0'},
            'pr1p': {'type': 'Quantity', 'uom': '10^0'},
            'pr2in': {'type': 'Quantity', 'uom': '10^0'},
            'pr3in': {'type': 'Quantity', 'uom': '10^0'},
            'pr4in': {'type': 'Quantity', 'uom': '10^0'},
            'pr99p': {'type': 'Quantity', 'uom': '10^0'},
            'prmaxday': {'type': 'Quantity', 'uom': '10^0'},
            'tmax100F': {'type': 'Quantity', 'uom': '10^0'},
            'tmax1p': {'type': 'Quantity', 'uom': '10^0'},
            'tmax5day': {'type': 'Quantity', 'uom': '10^0'},
            'tmax90F': {'type': 'Quantity', 'uom': '10^0'},
            'tmax95F': {'type': 'Quantity', 'uom': '10^0'},
            'tmax99p': {'type': 'Quantity', 'uom': '10^0'},
            'tmin1p': {'type': 'Quantity', 'uom': '
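
To feed a column selector in the widget editor, the `/fields` response can be reduced to a list of column names. This is a sketch under the assumption that the response keeps the `{'fields': {<name>: {'type': ..., 'uom': ...}}}` shape shown above; `quantity_columns` is a hypothetical helper and the sample dict only reuses three of the fields listed.

```python
def quantity_columns(fields_response):
    """Return the sorted names of all fields typed as 'Quantity'."""
    return sorted(
        name
        for name, meta in fields_response["fields"].items()
        if meta.get("type") == "Quantity"
    )


# Hand-written sample with the same shape as the response above.
sample = {"fields": {
    "prmaxday": {"type": "Quantity", "uom": "10^0"},
    "cdd": {"type": "Quantity", "uom": "10^0"},
    "tmax5day": {"type": "Quantity", "uom": "10^0"},
}}
print(quantity_columns(sample))  # ['cdd', 'prmaxday', 'tmax5day']
```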

In [6]:
r = requests.get(api_baseurl.format(endpoint='geostore',dataset_id=geostore))
display(Markdown("> ### Geostore"))
pprint(r.json())

> ### Geostore

{'data': {'attributes': {'areaHa': 0,
                         'bbox': [-3.6749267578125,
                                  40.43858586704331,
                                  -3.6749267578125,
                                  40.43858586704331],
                         'geojson': {'crs': {},
                                     'features': [{'geometry': {'coordinates': [-3.6749267578125,
                                                                                40.43858586704331],
                                                                'type': 'Point'},
                                                   'type': 'Feature'}],
                                     'type': 'FeatureCollection'},
                         'hash': 'e669023a10c35364e0b1780c181bd06b',
                         'lock': False,
                         'provider': {}},
          'id': 'e669023a10c35364e0b1780c181bd06b',
          'type': 'geoStore'}}


# Queries available through the query API

A geostore representing a point is mandatory (in the future a bbox will also be allowed, to evaluate a region of interest).  
To test it we are going to use the following geostore: `e669023a10c35364e0b1780c181bd06b`
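
For reference, the request that `query` sends can be written out as a plain URL. The sketch below uses only the standard library; `build_query_url` is a hypothetical helper, and it assumes the same `sql`, `geostore` and `geoagg` parameters as the `query` function defined earlier. Stripping the SQL string is a choice made here, not something the notebook's function does.

```python
from urllib.parse import urlencode

STAGING = "https://staging-api.globalforestwatch.org"


def build_query_url(dataset_id, sql, geostore, geoagg="max"):
    """Assemble the GET URL for the /v1/query endpoint."""
    params = {"sql": sql.strip(), "geostore": geostore, "geoagg": geoagg}
    # urlencode percent-encodes the SQL so it is safe inside a query string.
    return "{0}/v1/query/{1}?{2}".format(STAGING, dataset_id, urlencode(params))


url = build_query_url(
    "nexgddp-historical-ACCESS1_0",
    "select prmaxday from nexgddp-historical-ACCESS1_0 where year between 1950 and 1955",
    "e669023a10c35364e0b1780c181bd06b",
)
print(url)
```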

In [7]:
display(Markdown("> ## Selected point table view of all variables (columns)"))
sql = '''
select * from "historical/ACCESS1_0"
'''
dataset = query(sql, dataset_id, geostore)
display(dataset.head(6))
#display(Markdown("> ### Visual"))
#display(dataset.plot.bar(x='x',y='y',figsize=(10, 10)));

> ## Selected point table view of all variables (columns)

>#### Sql applied:  

>>```sql 
select * from "historical/ACCESS1_0"
  ```

{'errors': [{'detail': 'Invalid Select', 'status': 400}]}


AttributeError: 'NoneType' object has no attribute 'head'

In [20]:
display(Markdown("> ## Selected point table view of selected columns filtering by date"))
sql = '''
select prmaxday from nexgddp-historical-ACCESS1_0 where year between 1950 and 1955
'''
dataset = query(sql, dataset_id, geostore)
display(dataset.head(6))
display(Markdown("> ### Visual"))

> ## Selected point table view of selected columns filtering by date

>#### Sql applied:  

>>```sql 
select prmaxday from nexgddp-historical-ACCESS1_0 where year between 1950 and 1955
  ```

| | prmaxday | year |
|---|---|---|
| 0 | 70.5369 | 1950 |
| 1 | 44.4796 | 1951 |
| 2 | 25.75 | 1952 |
| 3 | 83.7281 | 1953 |
| 4 | 32.4476 | 1954 |
| 5 | 58.466 | 1955 |
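
The year filter from the user story maps directly onto the `where year between ...` clause. A minimal sketch of assembling that statement follows; `select_by_years` is a hypothetical helper, and whether the API accepts several columns in a single select is not confirmed here.

```python
def select_by_years(columns, table, first_year, last_year):
    """Build a select statement restricted to an inclusive range of years."""
    if first_year > last_year:
        raise ValueError("first_year must not be greater than last_year")
    return "select {0} from {1} where year between {2} and {3}".format(
        ", ".join(columns), table, int(first_year), int(last_year)
    )


sql = select_by_years(["prmaxday"], "nexgddp-historical-ACCESS1_0", 1950, 1955)
print(sql)
```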


> ### Visual

In [27]:
dataset.plot.bar(x='year',y='prmaxday',figsize=(6, 6));

[Figure: bar chart of `prmaxday` by `year` for 1950 to 1955]

In [9]:
display(Markdown("> ## Selected point min and max year available"))
sql = '''
Select max(year), min(year) from from nexgddp-historical-ACCESS1_0
'''
dataset = query(sql, dataset_id, geostore)
display(dataset.head(6))
#display(Markdown("> ### Visual"))
#display(dataset.plot.bar(x='x',y='y',figsize=(10, 10)));

> ## Selected point min and max year available

>#### Sql applied:  

>>```sql 
Select max(year), min(year) from from nexgddp-historical-ACCESS1_0
  ```

{'errors': [{'detail': 'Invalid Select', 'status': 400}]}


AttributeError: 'NoneType' object has no attribute 'head'

In [10]:
display(Markdown("> ## Selected point agg statistics for selected columns"))
display(Markdown(""">  Available aggregation functions:
* `avg(<column>)`
* `min(<column>)`
* `max(<column>)`
* `std(<column>)` 

> **`Group by <column>` operation is not currently available**
"""))
sql = '''
Select avg(prmaxday) from nexgddp-historical-ACCESS1_0 where year between 1950 and 1955
'''
dataset = query(sql, dataset_id, geostore)
display(dataset.head(6))
#display(Markdown("> ### Visual"))
#display(dataset.plot.bar(x='x',y='y',figsize=(10, 10)));

> ## Selected point agg statistics for selected columns

>  Available aggregation functions:
* `avg(<column>)`
* `min(<column>)`
* `max(<column>)`
* `std(<column>)` 

> **`Group by <column>` operation is not currently available**


>#### Sql applied:  

>>```sql 
Select avg(prmaxday) from nexgddp-historical-ACCESS1_0 where year between 1950 and 1955
  ```

{'errors': [{'detail': 'Internal Server Error', 'status': 500}]}


AttributeError: 'NoneType' object has no attribute 'head'
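
The aggregation request above returned a 500, so the numbers it was meant to produce are not available here. As a stopgap, the same four statistics can be computed client-side from the yearly `prmaxday` values returned by the 1950 to 1955 query earlier in this notebook. This is only a sketch: `summarize` is a hypothetical helper, and using the population standard deviation for `std` is an assumption, since the API's definition is not documented here.

```python
import statistics

# prmaxday values for 1950..1955, copied from the table returned earlier.
prmaxday = [70.5369, 44.4796, 25.75, 83.7281, 32.4476, 58.466]


def summarize(values):
    """Compute avg, min, max and std (population) for a list of numbers."""
    return {
        "avg": statistics.mean(values),
        "min": min(values),
        "max": max(values),
        "std": statistics.pstdev(values),  # assumption: population std
    }


stats = summarize(prmaxday)
print(stats)
```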

In [11]:
display(Markdown("> ## Histograms: range of years"))
display(Markdown("> **Disclaimer:** this only makes sense for a bbox"))

# Bounding box (polygon) used for the histogram queries
bbox_geojson = {"type":"FeatureCollection","features":[{"type":"Feature","properties":{},"geometry":{"type":"Polygon","coordinates":[[[-4.4000244140625,40.14109012528468],[-2.955322265625,40.14109012528468],[-2.955322265625,40.65563874006118],[-4.4000244140625,40.65563874006118],[-4.4000244140625,40.14109012528468]]]}}]}
payload = {"geojson": bbox_geojson}
r = requests.post(api_post.format(endpoint='geostore'),json=payload)
geostore=r.json()['data']['id']

# Display the polygon that was actually posted, not the earlier point geojson
display(Markdown("""> #### Geojson to display:  

>> ```json 
{0} 
```""".format(bbox_geojson)))
display(Markdown("> ### Geostore creation:  "))

sql = '''
Select st_histogram(prmaxday) from nexgddp-historical-ACCESS1_0 where year between 1950 and 1952
'''
dataset = query(sql, dataset_id, geostore)
display(dataset.head(6))

> ## Histograms: range of years

> **Disclaimer:** this only makes sense for a bbox

> #### Geojson to display:  

>> ```json 
{'type': 'FeatureCollection', 'features': [{'type': 'Feature', 'properties': {}, 'geometry': {'type': 'Polygon', 'coordinates': [[[-4.4000244140625, 40.14109012528468], [-2.955322265625, 40.14109012528468], [-2.955322265625, 40.65563874006118], [-4.4000244140625, 40.65563874006118], [-4.4000244140625, 40.14109012528468]]]}}]} 
```

> ### Geostore creation:  

>#### Sql applied:  

>>```sql 
Select st_histogram(prmaxday) from nexgddp-historical-ACCESS1_0 where year between 1950 and 1952
  ```

| | buckets | max | min | st_histogram | year |
|---|---|---|---|---|---|
| 0 | 256 | 116.271501 | 17.181006 | [1, 0, 1, 2, 2, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, ... | 1950 |
| 1 | 256 | 128.741944 | 9.483575 | [1, 0, 0, 3, 0, 0, 1, 2, 1, 0, 0, 0, 0, 0, 0, ... | 1951 |
| 2 | 256 | 35.366004 | 13.731727 | [1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, ... | 1952 |


In [16]:
display(Markdown("> ## Histograms: select one year"))
sql = '''
Select st_histogram(prmaxday) from nexgddp-historical-ACCESS1_0 where year=1998
'''
dataset = query(sql, dataset_id, geostore)
display(dataset.head(6))
display(Markdown("> ### Visual"))
steps = (dataset['max'][0]-dataset['min'][0])/dataset['buckets'][0]
print(dataset['min'][0], dataset['max'][0],steps)

> ## Histograms: select one year

>#### Sql applied:  

>>```sql 
Select st_histogram(prmaxday) from nexgddp-historical-ACCESS1_0 where year=1998
  ```

| | buckets | max | min | st_histogram | year |
|---|---|---|---|---|---|
| 0 | 256 | 80.373535 | 14.862143 | [2, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, ... | 1998 |


> ### Visual

14.8621432024 80.3735345167 0.255903872322


In [17]:
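# Left edge of each bucket; arange over the bucket count keeps x and y the same length
# (arange(min, max, steps) with a float step can yield one element too many).
bin_starts = dataset['min'][0] + steps * arange(dataset['buckets'][0])
plt.plot(bin_starts, dataset['st_histogram'][0])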
plt.show()

[Figure: line plot of the `prmaxday` histogram counts per bucket for 1998]
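
To read the histogram response, each count has to be mapped back to a value range. The sketch below assumes, as the `steps` computation above does, that the `buckets` bins are equal-width and span `[min, max]`; that layout is not confirmed by the API documentation. `bucket_starts` is a hypothetical helper, and the inputs are the rounded `min`, `max` and `buckets` values shown for 1998.

```python
def bucket_starts(vmin, vmax, buckets):
    """Return the left edge of every bucket and the common bucket width."""
    width = (vmax - vmin) / buckets
    # Multiply by the integer index instead of accumulating floats,
    # so the result always has exactly `buckets` entries.
    starts = [vmin + i * width for i in range(buckets)]
    return starts, width


starts, width = bucket_starts(14.862143, 80.373535, 256)
print(len(starts), starts[0], width)
```

The i-th count in `st_histogram` then covers values from `starts[i]` up to `starts[i] + width`.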