# Example notebook 03

Using the data generated by notebook `00_create_data.ipynb`, this notebook walks through some of the basic functionality of the `Connections` class:

+ [Initialise a SqliteDB connection](#Initialise-a-SqliteDB-connection)
+ [Read from cnx](#Read-from-cnx)
+ [Write to a table](#Write-to-a-table)

## Setup
<hr>

Imports and setting options

In [1]:
from datetime import datetime
import pickle

from data_etl import Connections, Checks

## Examples
<hr>

Initialise the class

In [2]:
cnxs = Connections()

### Initialise a SqliteDB connection
<hr>

Initialise the SqliteDB connection. The database file doesn't exist yet, so a warning message is output saying that the file will be created.

Setting the optional kwarg `sqlite_df_issues_create` to `True` creates a table whose structure matches the issues tables present in `DataCuration` and `Checks` objects.

In [3]:
cnxs.add_cnx(
    cnx_key='df_issues', 
    cnx_type='sqlite3',
    table_name='df_issues',
    file_path='data/00_db.db',
    sqlite_df_issues_create=True
)

The `file_path` data/00_db.db is not valid so this file will be created
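
For intuition, the behaviour above can be approximated with the standard-library `sqlite3` module, which creates the database file if it is missing. This is a minimal sketch, not the `data_etl` implementation: the helper name `create_issues_table`, the temporary path and the untyped column declarations are assumptions. Only the column names are taken from the outputs shown in this notebook.

```python
import os
import sqlite3
import tempfile

# Column names as shown in the notebook outputs; leaving them untyped is an assumption.
ISSUE_COLUMNS = [
    'key_1', 'key_2', 'key_3', 'file', 'sub_file', 'step_number',
    'category', 'issue_short_desc', 'issue_long_desc', 'column',
    'issue_count', 'issue_idx', 'grouping',
]


def create_issues_table(file_path, table_name='df_issues'):
    """Hypothetical helper: open (or create) the file and ensure the table exists."""
    if not os.path.exists(file_path):
        print(f'{file_path} does not exist, so it will be created')
    cnx = sqlite3.connect(file_path)  # creates the file if it is missing
    # Quote every name because `column` is a reserved word in SQL.
    cols = ', '.join(f'"{c}"' for c in ISSUE_COLUMNS)
    cnx.execute(f'CREATE TABLE IF NOT EXISTS "{table_name}" ({cols})')
    cnx.commit()
    return cnx


# Use a throwaway directory so the sketch does not touch data/00_db.db.
db_path = os.path.join(tempfile.mkdtemp(), 'sketch.db')
cnx = create_issues_table(db_path)
created_columns = [row[1] for row in cnx.execute('PRAGMA table_info(df_issues)')]
cnx.close()
```

Here `created_columns` holds the column names SQLite reports for the new table, which is a quick way to confirm the structure before writing to it.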


### Read from cnx
<hr>

Using `read_from_db`, you can read data from a table, or from a database on the same connection.

In [4]:
cnxs.read_from_db('df_issues', 'SELECT * FROM df_issues')

| | key_1 | key_2 | key_3 | file | sub_file | step_number | category | issue_short_desc | issue_long_desc | column | issue_count | issue_idx | grouping |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
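
Conceptually, a read like this runs a SQL query on a connection and returns the rows together with their column names. A rough equivalent using only the standard library is sketched below. The `read_query` helper, the three-column table and the in-memory database are hypothetical, and the return type of `read_from_db` itself is not assumed here.

```python
import sqlite3


def read_query(cnx, sql):
    """Hypothetical helper: run a query and return (column_names, rows)."""
    cursor = cnx.execute(sql)
    columns = [desc[0] for desc in cursor.description]
    return columns, cursor.fetchall()


cnx = sqlite3.connect(':memory:')  # throwaway in-memory database for the sketch
cnx.execute('CREATE TABLE df_issues (file, issue_short_desc, issue_count)')

# The table is empty, so no rows come back, but the column names still do.
columns, rows = read_query(cnx, 'SELECT * FROM df_issues')
cnx.close()
```

This mirrors the empty result above: a freshly created table has headers but no rows.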


### Write to a table
<hr>

We need some issues to write to the table

In [5]:
var_start_time = datetime.now()
ch_checks = Checks(var_start_time, '1')

# Load the pickled data, closing the file handle once it has been read
with open('data/df_checks_issues.pkl', 'rb') as f:
    dict_data = {'df_checks_issues.pkl': pickle.load(f)}

dict_checks = dict()
dict_checks['Number should be greater than 0'] = {
    'calc_condition': lambda df, col, **kwargs: df['number'] <= 0
}

ch_checks.apply_checks(dict_data, dictionary=dict_checks)

ch_checks.df_issues

| | key_1 | key_2 | key_3 | file | sub_file | step_number | category | issue_short_desc | issue_long_desc | column | issue_count | issue_idx | grouping |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | | | df_checks_issues.pkl | | 0 | | Number should be greater than 0 | | | 1 | 4 | 2020-05-25 20:35:03.604898 |
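
To make the `calc_condition` entry concrete: the lambda returns a boolean Series that is `True` for rows failing the check. The sketch below applies the same condition to a small hypothetical DataFrame (not the pickled data) and derives a count and the offending index labels, which appears to be what the `issue_count` and `issue_idx` columns report. How `Checks.apply_checks` computes them internally is an assumption here.

```python
import pandas as pd

# Hypothetical data: only the row with index label 4 violates "greater than 0".
df = pd.DataFrame({'number': [3, 1, 7, 2, -5]})

# Same shape as the notebook's condition: True marks a failing row.
calc_condition = lambda df, col, **kwargs: df['number'] <= 0

mask = calc_condition(df, None)
issue_count = int(mask.sum())          # number of failing rows
issue_idx = mask[mask].index.tolist()  # index labels of the failing rows
```

Note that the condition describes the failure case (`<= 0`), while the dictionary key describes the rule the data should satisfy.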


`write_to_db` first writes the data to a temporary table created in the background. If that write completes with no issues, it then moves all of the data to the main table.

In [6]:
cnxs.write_to_db('df_issues', ch_checks.df_issues)
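
The staged write described above can be sketched with the standard-library `sqlite3` module. This is an illustration of the general pattern under stated assumptions, not the `data_etl` code: the `staged_write` helper, the `_staging` suffix and the three-column table are all hypothetical.

```python
import sqlite3


def staged_write(cnx, table_name, columns, rows):
    """Hypothetical helper: write to a staging table, then copy into the main table."""
    staging = f'{table_name}_staging'  # assumed name; the real temp table name is not shown
    col_list = ', '.join(f'"{c}"' for c in columns)
    placeholders = ', '.join('?' for _ in columns)

    # Start from an empty staging table with the same columns as the target.
    cnx.execute(f'DROP TABLE IF EXISTS "{staging}"')
    cnx.execute(
        f'CREATE TEMP TABLE "{staging}" AS '
        f'SELECT {col_list} FROM "{table_name}" WHERE 0'
    )
    try:
        with cnx:  # commits on success, rolls back if any statement raises
            cnx.executemany(
                f'INSERT INTO "{staging}" ({col_list}) VALUES ({placeholders})', rows
            )
            cnx.execute(
                f'INSERT INTO "{table_name}" ({col_list}) '
                f'SELECT {col_list} FROM "{staging}"'
            )
    finally:
        # Remove the staging table whether or not the write succeeded.
        cnx.execute(f'DROP TABLE IF EXISTS "{staging}"')


cnx = sqlite3.connect(':memory:')
cnx.execute('CREATE TABLE df_issues (file, issue_short_desc, issue_count)')
staged_write(
    cnx, 'df_issues', ['file', 'issue_short_desc', 'issue_count'],
    [('df_checks_issues.pkl', 'Number should be greater than 0', 1)],
)
written = cnx.execute('SELECT * FROM df_issues').fetchall()
```

The point of the pattern is that a failure while loading the staging table leaves the main table untouched, because the copy into the main table only runs after every row has been staged.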

Then check that the data was written to the table

In [7]:
cnxs.read_from_db('df_issues', 'SELECT * FROM df_issues')

| | key_1 | key_2 | key_3 | file | sub_file | step_number | category | issue_short_desc | issue_long_desc | column | issue_count | issue_idx | grouping |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | | | df_checks_issues.pkl | | 0 | | Number should be greater than 0 | | | 1 | 4 | 2020-05-25 20:35:03.604898 |


---
**GigiSR**