# Download hydrology results from BigQuery
---

## Introduction
This notebook shows how to download hydrology modeling results from The Nature Conservancy's Stormwater Heatmap.
Access to the `tnc-data-v1` project on Google Cloud Platform is required.

For more details and instructions see the [documentation on the stormwater heatmap website](https://www.stormwaterheatmap.org/docs/timeseries). 

---




## Variables 

### Grid ID

`grid_id` refers to the WRF precipitation grid ID for the location of interest, for example `ID5_V15`.

### HRU 
`hru` refers to the [hydrologic response unit](https://www.stormwaterheatmap.org/docs/Data%20Layers/hydrologic_response_units) of interest. 

`hru` contains a three-digit encoding, as described below:

- First digit: Hydrologic soil group (0 = A/B, 1 = C, 2 = Saturated)
- Second digit: Land cover (0 = Forest, 1 = Pasture, 2 = Lawn, 5 = Impervious)
- Third digit: Slope (0 = Flat, 1 = Mod, 2 = Steep)
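
The encoding above can be unpacked programmatically. The following is a minimal sketch; `decode_hru` and the lookup-table names are hypothetical helpers written for this notebook, not part of any Stormwater Heatmap API, and the labels are copied from the list above.

```python
# Lookup tables copied from the encoding list above.
SOIL_GROUPS = {0: "A/B", 1: "C", 2: "Saturated"}
LAND_COVERS = {0: "Forest", 1: "Pasture", 2: "Lawn", 5: "Impervious"}
SLOPES = {0: "Flat", 1: "Mod", 2: "Steep"}


def decode_hru(code):
    """Split an HRU code such as 'hru250', '250', or 250 into its three parts.

    Hypothetical helper for illustration only.
    """
    text = str(code).lower()
    if text.startswith("hru"):
        text = text[3:]
    # Integers lose leading zeros (10 means 010), so pad back to three digits.
    text = text.zfill(3)
    if len(text) != 3 or not text.isdigit():
        raise ValueError(f"expected a three-digit HRU code, got {code!r}")
    soil, cover, slope = (int(ch) for ch in text)
    try:
        return {
            "soil_group": SOIL_GROUPS[soil],
            "land_cover": LAND_COVERS[cover],
            "slope": SLOPES[slope],
        }
    except KeyError as err:
        raise ValueError(f"unknown digit in HRU code {code!r}") from err


print(decode_hru("hru250"))
```

Here `hru250` decodes to saturated soil, impervious land cover, and flat slope.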

### Flow Path 

`flow_path` refers to the HSPF flow path for which results are calculated.

Available flow paths are:
- `suro` - Surface runoff
- `ifwo` - Interflow
- `agwo` - Outflow to groundwater

### Datetime

`datetime` is the timestamp of each hourly simulation result.




## Tables 
There are two options for accessing result tables: 
1. Grid and flowpath specific tables
2. Single table for all grids 

### Grid and flowpath specific tables
To reduce query costs, use this option when querying a single grid or a single flow path. Tables are named using the pattern `tnc-data-v1.{grid_id}.{flowpath}`:
```sql
SELECT
  datetime,
  hru  -- replace with an HRU column, e.g. hru250
FROM
  `tnc-data-v1.{grid_id}.{flowpath}`
ORDER BY
  datetime
```
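
The table naming pattern can also be filled in from Python before the query is sent. This is a sketch only: `build_grid_query` is a hypothetical helper, and the grid ID `ID5_V15` and column `hru250` are taken from the query used later in this notebook.

```python
import re

# Flow paths listed in this notebook.
FLOW_PATHS = ("suro", "ifwo", "agwo")


def build_grid_query(grid_id, flow_path, hru_column, project="tnc-data-v1"):
    """Return a query for one grid/flow-path table (hypothetical helper)."""
    if flow_path not in FLOW_PATHS:
        raise ValueError(f"unknown flow path: {flow_path!r}")
    # The values are pasted into the SQL text, so accept plain identifiers only.
    for name in (grid_id, hru_column):
        if not re.fullmatch(r"\w+", name):
            raise ValueError(f"invalid identifier: {name!r}")
    table = f"{project}.{grid_id}.{flow_path}"
    return f"SELECT datetime, {hru_column} FROM `{table}` ORDER BY datetime"


grid_sql = build_grid_query("ID5_V15", "suro", "hru250")
print(grid_sql)
```

The backticks around the table path matter because the project ID contains hyphens.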

### Single table for all grids
A single table holds all results. It is found at `tnc-data-v1.hydrology.gfdl`. This table also includes useful columns such as `year`, `month`, and `simulation_day`.

You can query this table for flow path and hru results. An example query is below: 

```sql 

SELECT
  datetime,
  SUM(hru250)
FROM
  `tnc-data-v1.hydrology.gfdl`
WHERE
  -- Parentheses are required: AND binds more tightly than OR
  (comp = 'suro'
    OR comp = 'agwo')
  AND year BETWEEN 1970 AND 2000
GROUP BY
  datetime
ORDER BY
  datetime
```
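
A query of this shape can also be assembled in Python. The sketch below uses `IN` for the flow-path filter, which sidesteps the question of how `AND` and `OR` group. `build_combined_query` and the `total` alias are hypothetical names; the `comp` and `year` columns and the table path come from the text above.

```python
def build_combined_query(hru_column, comps, start_year, end_year,
                         table="tnc-data-v1.hydrology.gfdl"):
    """Return a query that sums one HRU column per timestamp (hypothetical helper)."""
    allowed = {"suro", "ifwo", "agwo"}
    unknown = set(comps) - allowed
    if unknown:
        raise ValueError(f"unknown flow paths: {sorted(unknown)}")
    # The column name is pasted into the SQL text, so accept identifiers only.
    if not hru_column.isidentifier():
        raise ValueError(f"invalid column name: {hru_column!r}")
    comp_list = ", ".join(f"'{c}'" for c in comps)
    return (
        f"SELECT datetime, SUM({hru_column}) AS total\n"
        f"FROM `{table}`\n"
        f"WHERE comp IN ({comp_list})\n"
        f"  AND year BETWEEN {int(start_year)} AND {int(end_year)}\n"
        "GROUP BY datetime\n"
        "ORDER BY datetime"
    )


combined_sql = build_combined_query("hru250", ["suro", "agwo"], 1970, 2000)
print(combined_sql)
```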

# Code

---


In [9]:
# Install client libraries if needed
#!pip install --upgrade pandas-gbq

In [4]:
# Import libraries
import pandas_gbq
import tqdm  # optional: pandas_gbq uses tqdm to show a download progress bar

# Set the project ID
project_id = 'tnc-data-v1'


Create the SQL statement 

In [5]:
sql = """
SELECT datetime, hru250 FROM `tnc-data-v1.ID5_V15.suro` ORDER BY datetime
"""

Read the table from BigQuery. If you are not already authenticated, this call opens a web browser to authenticate with Google Cloud Platform.

In [11]:
suro = pandas_gbq.read_gbq(sql, project_id=project_id)



View the data

In [8]:
# View the data; pandas prints only the first and last rows of a long DataFrame
print(suro)

                         datetime  hru250
0       1970-01-02 00:00:00+00:00     0.0
1       1970-01-02 01:00:00+00:00     0.0
2       1970-01-02 02:00:00+00:00     0.0
3       1970-01-02 03:00:00+00:00     0.0
4       1970-01-02 04:00:00+00:00     0.0
...                           ...     ...
1139520 2099-12-30 20:00:00+00:00     0.0
1139521 2099-12-30 21:00:00+00:00     0.0
1139522 2099-12-30 22:00:00+00:00     0.0
1139523 2099-12-30 23:00:00+00:00     0.0
1139524 2099-12-31 00:00:00+00:00     0.0

[1139525 rows x 2 columns]
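
With more than a million hourly rows, it is often easier to work with aggregated values. The sketch below sums hourly values to daily totals with pandas. It runs on a small synthetic frame (`example`, with made-up values) that mimics the two columns of `suro`, so it does not need BigQuery access. The same two lines should apply to `suro` itself, assuming its `datetime` column is a timezone-aware timestamp as shown in the output above.

```python
import pandas as pd

# Synthetic stand-in for `suro`: 48 hourly rows with made-up values.
example = pd.DataFrame({
    "datetime": pd.date_range("1970-01-02", periods=48,
                              freq=pd.Timedelta(hours=1), tz="UTC"),
    "hru250": [0.5] * 48,
})

# Index by timestamp, then sum the hourly values within each calendar day.
hourly = example.set_index("datetime")["hru250"]
daily = hourly.resample("D").sum()

print(daily)
```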