# TsDB Usage

* Setup
* Connect to DB
* Table structure
* Views
* Read data from TsDB 
* Import Data

## Resources

* Learning with Kaggle
  * [Kaggle - Python](https://www.kaggle.com/learn/python)
  * [Kaggle - Pandas](https://www.kaggle.com/learn/pandas)
  * [Kaggle - SQL intro](https://www.kaggle.com/learn/intro-to-sql)

* Docs
  * [Dash/Plotly](http://dash.plotly.com/)
  * [Pandas](https://pandas.pydata.org/pandas-docs/stable/user_guide/index.html#user-guide)
  * [TimescaleDB](https://docs.timescale.com/latest/introduction)
  * [SQL Tutorial](https://www.w3schools.com/sql/default.asp)
* Cheat Sheets
  * [Conda](https://docs.conda.io/projects/conda/en/latest/_downloads/843d9e0198f2a193a3484886fa28163c/conda-cheatsheet.pdf)
  * [Pandas](https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf)
  * [PostgreSQL through SQLAlchemy](https://www.compose.com/articles/using-postgresql-through-sqlalchemy/)

In [1]:
import dash
import dash_core_components as dcc
import dash_html_components as html
import plotly.graph_objs as go
import plotly.figure_factory as FF
import plotly.offline as offline
from datetime import datetime
import glob
import os.path
import pymysql
import sqlconfig # From sqlconfig.py
import pandas as pd
import sqlalchemy
import psycopg2
from tqdm import tqdm
print("Import Complete")

Import Complete


## SQL setup
Create a [sqlalchemy](https://docs.sqlalchemy.org/en/13/core/engines.html#postgresql) engine to connect to the DB,
using the SQL credentials from `sqlconfig.py`.

Host IP - 34.68.85.80

```python
passwd = "passwd"  # password for DB
user = "user"  # Username for DB
DB = 'cbas'  # name of database
```


In [7]:
passwd = sqlconfig.passwd  # From sqlconfig.py
user = sqlconfig.user  # From sqlconfig.py
DB = 'cbas'  # name of the database to connect to

In [14]:
print("User: "+user) # check user

User: ad


In [15]:
engine = sqlalchemy.create_engine('postgresql+psycopg2://'+user+':'+passwd+'@34.68.85.80/'+DB)
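Building the URL by plain string concatenation breaks when the username or password contains characters such as `@` or `/`. A minimal sketch of one way to guard against that, assuming a hypothetical helper `build_db_url` (not part of the original notebook) and made-up credentials; it uses `urllib.parse.quote_plus` from the standard library to escape the credentials before they go into the URL:

```python
from urllib.parse import quote_plus


def build_db_url(user, passwd, host, db):
    """Return a SQLAlchemy URL for PostgreSQL via psycopg2 (hypothetical helper)."""
    # Percent-encode the credentials so characters like '@' or '/' cannot be
    # mistaken for URL delimiters.
    return (
        "postgresql+psycopg2://"
        f"{quote_plus(user)}:{quote_plus(passwd)}@{host}/{db}"
    )


# Example with made-up credentials; the result can be passed to
# sqlalchemy.create_engine().
url = build_db_url("ad", "p@ss/word", "34.68.85.80", "cbas")
print(url)
```

In the notebook, `sqlconfig.user` and `sqlconfig.passwd` would take the place of the example values.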

---

### DB/table structure

Databases and tables

---

```
├─cbas - (Database)
    └─ Tables
       ├── cbasdef (data from VM ingestion, plus comfort-metric columns that are currently NULL)
       ├── values (units and names of values for charting)
       ├── newlab(*) (NewLab data)
       └── telemetry(*) (Telemetry data from CBAS)
```
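To check this structure against a live database, SQLAlchemy's inspector can list the tables an engine can see. The sketch below uses an in-memory SQLite engine as a stand-in so that it runs without the server; `list_tables` is a hypothetical helper and the two table definitions are placeholders, not the real schema. With the PostgreSQL `engine` from the setup section, the same `list_tables(engine)` call should apply.

```python
import sqlalchemy


def list_tables(engine):
    """Return the sorted table names visible through the engine."""
    return sorted(sqlalchemy.inspect(engine).get_table_names())


# Stand-in engine: in-memory SQLite instead of the PostgreSQL server.
demo_engine = sqlalchemy.create_engine("sqlite://")
with demo_engine.begin() as conn:
    # Placeholder columns; the real cbasdef/values schemas are not shown here.
    conn.execute(sqlalchemy.text('CREATE TABLE cbasdef ("timestamp" TEXT, sensor TEXT)'))
    conn.execute(sqlalchemy.text('CREATE TABLE "values" (name TEXT, unit TEXT)'))

tables = list_tables(demo_engine)
print(tables)
```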

---

### SQL VIEWS

* [Continuous Aggregates](https://docs.timescale.com/latest/api#continuous-aggregates)

* raw
  
```SQL
CREATE VIEW raw AS
SELECT timestamp, "sensor", "battery", "Air", "Tdb_BME680", "RH_BME680", "P_BME680", "Alt_BME680", "TVOC", "ECO2", "RCO2", "Tdb_scd30", "RH_scd30", "Lux", "PM1", "PM25", "PM10"
FROM cbasdef
ORDER BY timestamp DESC;
```

## Read Data

* [TimescaleDB-"Reading data"](https://docs.timescale.com/latest/using-timescaledb/reading-data)

Just going to try pulling everything to see what we have...

In [10]:
query = '''
SELECT * 
FROM cbasdef
'''

In [12]:

CBAS = pd.read_sql(query, engine, index_col=["timestamp"])
#CBAS
CBAS.head()

timestamp,battery,Tdb_BME680,RH_BME680,P_BME680,Alt_BME680,TVOC,ECO2,RCO2,Tdb_scd30,RH_scd30,...,Ta_adj_fixed_air,Cooling_effect_fixed_air,SET_fixed_air,TComf_fixed_air,TempDiff_fixed_air,TComfLower_fixed_air,TComfUpper_fixed_air,Acceptability_fixed_air,Condit_fixed_air,epoch
2020-02-27 20:57:57+00:00,4.061965,23.96,18.5,99.69,100.49,90.0,400.0,638.0,27.55,15.55,...,,,,,,,,,,NaT
2020-02-27 20:58:59+00:00,4.061965,23.97,18.49,99.69,100.66,92.0,400.0,636.0,27.57,15.6,...,,,,,,,,,,NaT
2020-02-27 21:00:01+00:00,4.059721,23.98,18.58,99.69,100.49,87.0,400.0,640.0,27.56,15.69,...,,,,,,,,,,NaT
2020-02-27 21:01:02+00:00,4.072067,23.98,18.58,99.69,100.49,88.0,400.0,644.0,27.59,15.71,...,,,,,,,,,,NaT
2020-02-27 21:02:04+00:00,4.065333,23.99,18.61,99.69,101.16,103.0,400.0,650.0,27.59,15.66,...,,,,,,,,,,NaT


In [24]:
# What sensors do we have?
CBAS['sensor'].unique()

array(['Moe'], dtype=object)

### More Queries

#### From Now() to interval
* Starting from `now()`, go back by `[interval]`:
```SQL
SELECT * 
FROM [table]
WHERE timestamp > NOW() - interval '[interval]';
```
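The template above can be filled in from Python. A minimal sketch, assuming a hypothetical helper `recent_rows_query` (not part of the original notebook); because the table name and interval are pasted directly into the SQL text, it only accepts simple identifiers and intervals such as `1 hour` or `15 minutes`:

```python
import re


def recent_rows_query(table, interval):
    """Fill the 'NOW() - interval' template for one table (hypothetical helper)."""
    # Reject anything that is not a plain identifier or a simple interval,
    # since both values end up inside the SQL string itself.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table):
        raise ValueError(f"unexpected table name: {table!r}")
    if not re.fullmatch(r"\d+ (second|minute|hour|day|week)s?", interval):
        raise ValueError(f"unexpected interval: {interval!r}")
    return (
        "SELECT *\n"
        f"FROM {table}\n"
        f"WHERE timestamp > NOW() - interval '{interval}';"
    )


query = recent_rows_query("cbasdef", "1 hour")
print(query)
```

The returned string can be passed to `pd.read_sql` in the same way as the hand-written queries in this notebook.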

In [20]:
query = '''
SELECT * 
FROM cbasdef
WHERE timestamp > NOW() - interval '1 hour';
'''

In [21]:
CBAS = pd.read_sql(query,engine,index_col=["timestamp"])
#CBAS
CBAS.head()

timestamp,battery,Tdb_BME680,RH_BME680,P_BME680,Alt_BME680,TVOC,ECO2,RCO2,Tdb_scd30,RH_scd30,...,Ta_adj_fixed_air,Cooling_effect_fixed_air,SET_fixed_air,TComf_fixed_air,TempDiff_fixed_air,TComfLower_fixed_air,TComfUpper_fixed_air,Acceptability_fixed_air,Condit_fixed_air,epoch
2020-04-01 20:30:19+00:00,4.037273,28.12,24.67,100.48,34.09,50.0,403.0,468.0,31.15,20.85,...,,,,,,,,,,1970-01-01 00:00:01.585773
2020-04-01 20:30:08+00:00,4.035028,28.13,24.65,100.49,33.75,50.0,403.0,469.0,31.17,21.07,...,,,,,,,,,,1970-01-01 00:00:01.585773
2020-04-01 20:30:06+00:00,4.125942,-99.0,-99.0,-99.0,-99.0,-99.0,-99.0,-99.0,-99.0,-99.0,...,,,,,,,,,,1970-01-01 00:00:01.585773
2020-04-01 20:29:57+00:00,4.084414,25.57,53.35,100.65,19.75,234.0,610.0,1927.0,27.31,50.89,...,,,,,,,,,,1970-01-01 00:00:01.585772
2020-04-01 20:29:57+00:00,4.035028,28.13,24.66,100.49,33.75,64.0,441.0,468.0,31.15,21.24,...,,,,,,,,,,1970-01-01 00:00:01.585772


#### SELECT sensor(s)
* Select one or more sensors:
```SQL
SELECT * FROM cbasdef
WHERE sensor IN ('Moe')
AND timestamp > NOW() - interval '1 hour';
```
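When the sensor names come from a variable, they can be passed as query parameters instead of being pasted into the SQL text. A minimal sketch, assuming a hypothetical helper `sensor_query` and psycopg2's named `%(name)s` placeholder style; it relies on psycopg2 adapting a Python list to a PostgreSQL array, so the filter is written as `= ANY(...)` rather than `IN (...)`:

```python
def sensor_query(sensors, interval="1 hour"):
    """Return (sql, params) selecting recent rows for the given sensors."""
    sql = (
        "SELECT * FROM cbasdef\n"
        "WHERE sensor = ANY(%(sensors)s)\n"
        "AND timestamp > NOW() - %(interval)s::interval;"
    )
    # The values travel separately from the SQL text and are escaped by the driver.
    params = {"sensors": list(sensors), "interval": interval}
    return sql, params


sql, params = sensor_query(["Moe"])
print(sql)
print(params)
# Intended use (not run here, and an assumption about the installed versions):
# pd.read_sql(sql, engine, params=params, index_col=["timestamp"])
```

How `params` is forwarded to the driver depends on the pandas and SQLAlchemy versions in use, so the commented call should be checked against the installed environment.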

In [22]:
query = '''
SELECT * FROM cbasdef
WHERE sensor IN ('Moe')
AND timestamp > NOW() - interval '1 hour';
'''

In [23]:
CBAS = pd.read_sql(query,engine,index_col=["timestamp"])
#CBAS
CBAS.head()

timestamp,battery,Tdb_BME680,RH_BME680,P_BME680,Alt_BME680,TVOC,ECO2,RCO2,Tdb_scd30,RH_scd30,...,Ta_adj_fixed_air,Cooling_effect_fixed_air,SET_fixed_air,TComf_fixed_air,TempDiff_fixed_air,TComfLower_fixed_air,TComfUpper_fixed_air,Acceptability_fixed_air,Condit_fixed_air,epoch
2020-04-01 20:33:13+00:00,4.107984,-99.0,-99.0,-99.0,-99.0,-99.0,-99.0,-99.0,-99.0,-99.0,...,,,,,,,,,,1970-01-01 00:00:01.585773
2020-04-01 20:32:11+00:00,4.113596,-99.0,-99.0,-99.0,-99.0,-99.0,-99.0,-99.0,-99.0,-99.0,...,,,,,,,,,,1970-01-01 00:00:01.585773
2020-04-01 20:31:08+00:00,4.125942,-99.0,-99.0,-99.0,-99.0,-99.0,-99.0,-99.0,-99.0,-99.0,...,,,,,,,,,,1970-01-01 00:00:01.585773
2020-04-01 20:30:06+00:00,4.125942,-99.0,-99.0,-99.0,-99.0,-99.0,-99.0,-99.0,-99.0,-99.0,...,,,,,,,,,,1970-01-01 00:00:01.585773
2020-04-01 20:29:03+00:00,4.133799,-99.0,-99.0,-99.0,-99.0,-99.0,-99.0,-99.0,-99.0,-99.0,...,,,,,,,,,,1970-01-01 00:00:01.585772


#### time_bucket() ([pd.resample](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.resample.html))

* [TimescaleDB-"time_bucket()"](https://docs.timescale.com/latest/api#time_bucket)

* [TimescaleDB-Blog](https://blog.timescale.com/blog/simplified-time-series-analytics-using-the-time_bucket-function/)


In [None]:
query = '''
SELECT time_bucket('5 minutes', timestamp) AS five_min,
       avg("Tdb_BME680") AS "Tdb_BME680"  -- one averaged value per bucket
FROM raw
WHERE sensor IN ('Moe')
AND timestamp > NOW() - interval '1 hour'
GROUP BY five_min
ORDER BY five_min;
'''
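
For comparison, the `pd.resample` method linked in the heading can do the same bucketing on the client side once the rows are in a DataFrame. A small sketch on made-up readings (the timestamps and values are synthetic, not taken from the database):

```python
import pandas as pd

# Ten synthetic one-minute readings standing in for "Tdb_BME680".
index = pd.date_range("2020-04-01 20:00", periods=10, freq="1min", tz="UTC")
readings = pd.DataFrame({"Tdb_BME680": [float(i) for i in range(10)]}, index=index)
readings.index.name = "timestamp"

# Average each five-minute bucket, analogous to time_bucket('5 minutes', ...)
# combined with avg() on the database side.
five_min = readings["Tdb_BME680"].resample("5min").mean()
print(five_min)
```

Bucketing in SQL keeps the transfer small because only the aggregated rows leave the database, while resampling in pandas is convenient when the raw rows are already loaded.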