# Data Importer
[![Static Badge](https://img.shields.io/badge/Jupyter_Notebook-F37726?style=for-the-badge)](https://jupyter.org/)

<br>

Performs **ad hoc** database operations that need to be done in bulk.

<br>

## Requirements
- Python 3.6 or later

<br>
<br>

## Installation
Run the pip install command below:

In [None]:
%pip install -r requirements.txt

<br>
<br>

## Initialization

Run the code block below to initialize all the necessary tools.

<br>

> ***❇️ Important*** <br>
>
> You may need to restart the kernel of this notebook if you change anything in the source code.
> 

In [1]:
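import sys

# Make the local "src" folder importable so that DataImporter can be found
sys.path.insert(1, r"src")

import DataImporter as DI


########
# MAIN #
########
# Load the database credentials, pick the target database, then build the importer
Secrets = DI.DBSecrets.load()
Database = DI.DBNames.Prod.value
importer = DI.Importer(Secrets, database = Database, useConnPool = False)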


<br>
<br>

## Dataset Format
A *dataset* is a folder that contains many .csv files.<br>
For simplicity, each .csv file corresponds to one particular table.
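
As a rough illustration of this layout, the sketch below scans a folder and maps each .csv file to a table name. It assumes that the file name (without the extension) is the table name and that the first row of each file holds the column names; neither convention is confirmed here, and `list_dataset_tables` is a hypothetical helper that is not part of `DataImporter`.

```python
import csv
import tempfile
from pathlib import Path


def list_dataset_tables(dataset_dir):
    """Map each table name (assumed to be the .csv file stem) to its column names."""
    tables = {}
    for csv_path in sorted(Path(dataset_dir).glob("*.csv")):
        with csv_path.open(newline="", encoding="utf-8") as handle:
            # Assumption: the first row of every file is a header row
            header = next(csv.reader(handle), [])
        tables[csv_path.stem] = header
    return tables


# Build a throwaway dataset folder so the sketch runs on its own
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "User.csv").write_text("id,name\n1,Alice Margatroid\n", encoding="utf-8")
    Path(tmp, "Building.csv").write_text("id,buildingName\n1,MC\n", encoding="utf-8")
    dataset_tables = list_dataset_tables(tmp)

print(dataset_tables)
```

A listing like this can be a quick sanity check that a dataset folder contains the files you expect before running an import.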

<br>

> ***📝 NOTE:*** <br>
>
> The id keys in the .csv files are only there to make debugging easier.
> 
> These ids will be regenerated on the database side.


<br>
<br>

## Importing a Dataset

The following code block gives an example of importing a dataset. <br>
The data will first be cleaned before being imported.

<br>

For cleaning, we have the following settings:

| Clean Level | Description |
| ----------- | ----------- |
| None        | No data cleaning done |
| Tuples      | Clears all data from every table |
| Tables      | Deletes every table in the database |
| Database    | Deletes an entire database |

<br>

For importing, we have the following settings:

| Build Level | Description |
| ----------- | ----------- |
| Tuples      | Only imports the data into existing tables |
| Tables      | Constructs the required tables, then imports the data |
| Database    | Constructs a database and the required tables, before importing the data | 
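
To make the two tables easier to read together, here is a minimal sketch of how a clean level and a build level could translate into an ordered list of steps. The `Level` enum and `plan_import` function are hypothetical stand-ins written for illustration; they are not the actual `DI.ImportLevel` or `importer.importData` implementation, and the step wording simply paraphrases the tables above.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical stand-in for DI.ImportLevel; the real enum may differ."""
    NONE = 0
    TUPLES = 1
    TABLES = 2
    DATABASE = 3


# One cleaning action per clean level, paraphrased from the Clean Level table
CLEAN_ACTIONS = {
    Level.NONE: "no cleaning",
    Level.TUPLES: "clear all data from every table",
    Level.TABLES: "delete every table",
    Level.DATABASE: "delete the database",
}


def plan_import(clean_level, build_level):
    """Return the ordered steps for cleaning first, then building and importing."""
    if build_level == Level.NONE:
        raise ValueError("the Build Level table has no 'None' entry")
    steps = [CLEAN_ACTIONS[clean_level]]
    # Build levels are cumulative: Database also builds tables, Tables also imports
    if build_level >= Level.DATABASE:
        steps.append("create the database")
    if build_level >= Level.TABLES:
        steps.append("create the required tables")
    steps.append("import the data")
    return steps


full_rebuild = plan_import(Level.DATABASE, Level.DATABASE)
data_only = plan_import(Level.NONE, Level.TUPLES)
print(full_rebuild)
print(data_only)
```

Under this reading, a clean level above the build level (for example cleaning at `Tables` but building at `Tuples`) would leave nothing to import into. Whether the importer guards against that combination is not stated here.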

In [None]:
print("===== STARTING TO IMPORT DATA ========")

importer.importData(r"data/Toy Dataset", cleanLevel = DI.ImportLevel.Database, buildLevel = DI.ImportLevel.Database)

print("========== IMPORT COMPLETE ===========")

Inserting User Data...
Inserting Building Data...


UniqueViolation: duplicate key value violates unique constraint "Building_buildingName_addressLine1_addressLine2_city_provin_key"
DETAIL:  Key ("buildingName", "addressLine1", "addressLine2", city, province, country, "postalCode")=(MC, 200 University Avenue West, MC, Waterloo, Ontario, Canada, N2L 3G1) already exists.


<br>
<br>

## Clearing all Data

The following code block gives an example of cleaning the database. Here the `Tables` clean level is used, which deletes every table.

<br>

> ***❗ WARNING:*** <br>
>
> ONLY DO THIS IF YOU ARE ABSOLUTELY SURE OF WHAT YOU ARE DOING
> 

<br>

For the cleaning settings, please refer to the table at [Importing a Dataset](#importing-a-dataset)


In [None]:
print("===== STARTING TO DELETE DATA ========")

importer.clean(cleanLevel = DI.ImportLevel.Tables)

print("========= DELETION COMPLETE ==========")

Deleting all tables...


<br>
<br>

## Execute Custom SQL

The following code block shows an example of executing a custom SQL command.

In [6]:
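import psycopg2
import psycopg2.sql


# Compose the query with quoted identifiers instead of plain string formatting
selectNameSQL = psycopg2.sql.SQL("SELECT {nameCol} FROM {userTable} LIMIT 1").format(
    nameCol = psycopg2.sql.Identifier("name"),
    userTable = psycopg2.sql.Identifier(DI.TableNames.User.value)
)

# closeConn = False keeps the connection open so the cursor can still be read
conn, cursor, err = importer.executeSQL(selectNameSQL, closeConn = False)

# Close the connection whether or not the query succeeded
try:
    if (err is not None):
        raise err
    print(cursor.fetchone())
finally:
    conn.close()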

('Alice Margatroid',)
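
The same "check the returned error, then always close the connection" pattern can be tried without a PostgreSQL server. The sketch below uses the standard-library `sqlite3` module as a stand-in, with a hypothetical `execute_sql` helper that returns a `(cursor, err)` pair; it only mimics the shape of the result handling above and is not how `importer.executeSQL` is implemented.

```python
import sqlite3


def execute_sql(conn, query, params=()):
    """Hypothetical helper: run a query and return (cursor, err) instead of raising."""
    try:
        return conn.execute(query, params), None
    except sqlite3.Error as err:
        return None, err


conn = sqlite3.connect(":memory:")
try:
    # Small in-memory table so there is something to select from
    conn.execute('CREATE TABLE "User" (name TEXT)')
    conn.execute('INSERT INTO "User" (name) VALUES (?)', ("Alice Margatroid",))

    cursor, err = execute_sql(conn, 'SELECT name FROM "User" LIMIT 1')
    if err is not None:
        raise err
    first_row = cursor.fetchone()

    # A failing query comes back as an error value rather than an exception
    _, missing_err = execute_sql(conn, 'SELECT name FROM "Missing"')
finally:
    # The connection is closed on both the success and the failure path
    conn.close()

print(first_row)
print(type(missing_err).__name__)
```

Returning the error instead of raising it leaves the caller in charge of cleanup, which is why the connection is closed in a `finally` block here.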
