# SQLite integration

If you want to process your sensor data and store it for later use, you can use the SQLite integration. Gensor's `Timeseries` and `Dataset` come with a `.to_sql()` method, which uses `pandas.Series.to_sql()` under the hood to save the data in a SQLite database.

It is a simple implementation in which each timeseries is stored in a separate database table, named following the pattern `f"{location}_{variable}_{unit}".lower()` (for example, `pb01a_pressure_cmh2o`). Duplicates are guarded against twice. First, when you create a `Dataset`, timeseries from the same location and sensor, with the same variable and unit, are merged. Second, the `Timeseries.to_sql()` method is designed to ignore conflicts, so only new records are inserted into the database if you run the same command twice.
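The conflict-ignoring behaviour can be illustrated without Gensor. The following is a minimal sketch using only Python's standard `sqlite3` module; it is not Gensor's actual implementation. The table name `demo_pressure_cmh2o`, the column names, and the helper `insert_ignoring_conflicts` are hypothetical and chosen for illustration, and the sketch assumes that the timestamp acts as the unique key.

```python
import sqlite3

# Hypothetical table and column names, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE demo_pressure_cmh2o (timestamp TEXT PRIMARY KEY, value REAL)"
)

records = [
    ("2020-07-04T04:00:00", 1013.2),
    ("2020-07-04T05:00:00", 1013.5),
]


def insert_ignoring_conflicts(conn, rows):
    """Insert rows, silently skipping timestamps that already exist."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO demo_pressure_cmh2o (timestamp, value) "
            "VALUES (?, ?)",
            rows,
        )


insert_ignoring_conflicts(conn, records)
insert_ignoring_conflicts(conn, records)  # same data again: nothing is added
insert_ignoring_conflicts(conn, records + [("2020-07-04T06:00:00", 1013.9)])

row_count = conn.execute(
    "SELECT COUNT(*) FROM demo_pressure_cmh2o"
).fetchone()[0]
print(row_count)  # 3: two original records plus one new record
```

Because the primary key rejects repeated timestamps and `INSERT OR IGNORE` skips the rejected rows instead of raising an error, re-running the same insert is harmless.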

### Load test data

In [12]:
import gensor as gs
from gensor import read_from_csv
from gensor.testdata import all_paths, pb02a_plain

pattern = r"[A-Za-z]{2}\d{2}[A-Za-z]{1}|Barodiver"

ds = read_from_csv(path=all_paths, file_format="vanessen", location_pattern=pattern)


ds2 = read_from_csv(
    path=pb02a_plain, file_format="plain", location="PB02A", sensor="AV336"
)

ds.add(ds2)

INFO: Loading file: /workspaces/gensor/gensor/testdata/Barodiver_220427183008_BY222.csv


INFO: Loading file: /workspaces/gensor/gensor/testdata/PB01A_moni_AV319_220427183019_AV319.csv
INFO: Loading file: /workspaces/gensor/gensor/testdata/PB02A_plain.csv
INFO: Skipping file /workspaces/gensor/gensor/testdata/PB02A_plain.csv due to missing metadata.
INFO: Loading file: /workspaces/gensor/gensor/testdata/PB02A_plain.csv


Dataset(6)

### Create `DatabaseConnection`

Both saving data to and loading data from SQLite require a `DatabaseConnection` object to be passed as an argument. You can instantiate it with empty parentheses to create a new database in the current working directory, or specify the path and name of the database.

If you have an existing Gensor database, you can use `DatabaseConnection.get_timeseries_metadata()` to see if there already are some tables in the database that you want to use. If no arguments are provided, all records are returned.
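If you only want to check which tables a SQLite file contains, independently of Gensor, the standard library is enough. The sketch below is an assumption-laden illustration rather than part of Gensor's API: the helper `list_tables`, the file name `example.db`, and the table it creates are all hypothetical.

```python
import sqlite3
import tempfile
from pathlib import Path


def list_tables(db_path):
    """Return the names of all tables stored in a SQLite file."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical database file, created only for this demonstration.
    db_path = Path(tmp) / "example.db"
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE pb01a_pressure_cmh2o (timestamp TEXT PRIMARY KEY, value REAL)"
    )
    conn.commit()
    conn.close()

    tables = list_tables(db_path)

print(tables)  # ['pb01a_pressure_cmh2o']
```

With Gensor itself, the same question is answered by the metadata query in the next cell.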

In [13]:
db = gs.db.DatabaseConnection()
db.get_timeseries_metadata()

| id | table_name | location | variable | unit | start | end | extra | cls |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | barodiver_pressure_cmh2o | Barodiver | pressure | cmh2o | 20200704040000 | 20220330130000 | {'sensor': 'BY222', 'sensor_alt': None} | gensor.core.timeseries.Timeseries |
| 2 | barodiver_temperature_degc | Barodiver | temperature | degc | 20200704040000 | 20220330130000 | {'sensor': 'BY222', 'sensor_alt': None} | gensor.core.timeseries.Timeseries |
| 3 | pb01a_pressure_cmh2o | PB01A | pressure | cmh2o | 20200704040000 | 20220330090000 | {'sensor': 'AV319', 'sensor_alt': None} | gensor.core.timeseries.Timeseries |
| 4 | pb01a_temperature_degc | PB01A | temperature | degc | 20200704040000 | 20220330090000 | {'sensor': 'AV319', 'sensor_alt': None} | gensor.core.timeseries.Timeseries |
| 5 | pb02a_pressure_cmh2o | PB02A | pressure | cmh2o | 20200704060000 | 20220207160000 | {'sensor': 'AV336', 'sensor_alt': None} | gensor.core.timeseries.Timeseries |
| 6 | pb02a_temperature_degc | PB02A | temperature | degc | 20200704060000 | 20220207160000 | {'sensor': 'AV336', 'sensor_alt': None} | gensor.core.timeseries.Timeseries |


Saving the dataset to the database is straightforward: call `.to_sql()` on the dataset instance, then query the metadata again to confirm that each timeseries has its own table.

In [14]:
ds.to_sql(db)
df = db.get_timeseries_metadata()

In [16]:
df

| id | table_name | location | variable | unit | start | end | extra | cls |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | barodiver_pressure_cmh2o | Barodiver | pressure | cmh2o | 20200704040000 | 20220330130000 | {'sensor': 'BY222', 'sensor_alt': None} | gensor.core.timeseries.Timeseries |
| 2 | barodiver_temperature_degc | Barodiver | temperature | degc | 20200704040000 | 20220330130000 | {'sensor': 'BY222', 'sensor_alt': None} | gensor.core.timeseries.Timeseries |
| 3 | pb01a_pressure_cmh2o | PB01A | pressure | cmh2o | 20200704040000 | 20220330090000 | {'sensor': 'AV319', 'sensor_alt': None} | gensor.core.timeseries.Timeseries |
| 4 | pb01a_temperature_degc | PB01A | temperature | degc | 20200704040000 | 20220330090000 | {'sensor': 'AV319', 'sensor_alt': None} | gensor.core.timeseries.Timeseries |
| 5 | pb02a_pressure_cmh2o | PB02A | pressure | cmh2o | 20200704060000 | 20220207160000 | {'sensor': 'AV336', 'sensor_alt': None} | gensor.core.timeseries.Timeseries |
| 6 | pb02a_temperature_degc | PB02A | temperature | degc | 20200704060000 | 20220207160000 | {'sensor': 'AV336', 'sensor_alt': None} | gensor.core.timeseries.Timeseries |


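To load the data back, pass the connection to `read_from_sql()`. In the call below the second argument is `True`, and all six timeseries stored in the database are returned as a `Dataset`.

In [15]: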
from gensor import Dataset, read_from_sql

new_ds: Dataset = read_from_sql(db, True)
new_ds

Dataset(6)