# IEDC_tools Getting started

A short walkthrough of the different functions in `IEDC_tools`.

## Preparatory steps

- Prepare `IEDC_pass.py`. For your convenience, there is a template file `IEDC_pass_TEMPLATE.py` in the root folder.
- Prepare `IEDC_paths.py`. For your convenience, there is a template file `IEDC_paths_TEMPLATE.py` in the root folder.
- Create a Python environment that contains `pymysql`, `pandas`, and `numpy`.
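
Before running the cells below, it can help to confirm that the environment really contains the three packages from the list. The following is only a minimal sketch using the standard library; the helper name `missing_packages` is made up for this example and is not part of `IEDC_tools`.

```python
import importlib.util

# Packages named in the preparatory steps above.
REQUIRED = ("pymysql", "pandas", "numpy")


def missing_packages(names=REQUIRED):
    """Return the package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages were found.")
```

If the list is not empty, install the missing packages into the environment before continuing.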

In [1]:
# Since this notebook is in a subfolder, we need to add the parent folder
#  to the import path

import sys
sys.path.insert(0, '..')

In [2]:
from IEDC_tools import dbio, file_io, validate

# The directory browsing is not yet fully implemented. 
# So far IEDC_tools works only on a single file basis.
# Therefore it is necessary to specify a file. The path is specified in `IEDC_paths.py`

candidate_file = 'test.xlsx'

## Validation

These are some simple validation functions.

In [3]:
# Get the file's classifications

class_names = validate.get_class_names(candidate_file)
class_names

| | classification_id | name | attribute_no | custom_name |
|---|---|---|---|---|
| aspect_1 | custom | origin_process | custom | origin_process__1_F_steel_SankeyFlows_2008_Global |
| aspect_2 | custom | destination_process | custom | destination_process__1_F_steel_SankeyFlows_200... |
| aspect_3 | custom | commodity | custom | commodity__1_F_steel_SankeyFlows_2008_Global |
| aspect_4 | 1 | element | 3 | chemical_elements |
| aspect_5 | 2 | region | 1 | regions_iso_iedc |
| aspect_6 | 3 | time | 1 | time |


In [4]:
# Check whether the classifications above exist in the database,
#  i.e. in classification_definition

validate.check_classification_definition(class_names, crash=False)



[True, True, True, True, True, True]
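
A bare list of booleans is hard to read once something fails. The sketch below pairs each result with its aspect label; it assumes (this is not confirmed by the notebook) that the results come back in the same order as the rows of `class_names`. The helper `failed_aspects` is hypothetical and not part of `IEDC_tools`.

```python
# Aspect labels copied from the `class_names` output above.
aspects = ["aspect_1", "aspect_2", "aspect_3", "aspect_4", "aspect_5", "aspect_6"]

# Result list as printed by the check above.
results = [True, True, True, True, True, True]


def failed_aspects(aspects, results):
    """Return the aspect labels whose check did not pass.

    Assumes both sequences are in the same order (an assumption of this sketch).
    """
    if len(aspects) != len(results):
        raise ValueError("aspects and results must have the same length")
    return [aspect for aspect, ok in zip(aspects, results) if not ok]


print(failed_aspects(aspects, results))
```

An empty list means that every classification was found.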

In [5]:
# To check if the attributes are in the DB,
#  we need to load the data (Value_Master) first

file_data = file_io.read_candidate_data(candidate_file)
file_data.head()

| | origin_process | destination_process | commodity | element | region | time | value | unit nominator | unit denominator | stats_array string | comment |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | End-of-life scrap market | Scrap separation | Postconsumer scrap | Fe | Global | 2008 | 290.0 | Mt | yr | none | none |
| 1 | Iron ore market | Direct reduction | Iron ore | Fe | Global | 2008 | 66.3 | Mt | yr | none | none |
| 2 | Direct reduction | Loss | Iron losses | Fe | Global | 2008 | 0.5 | Mt | yr | none | none |
| 3 | Direct reduction | Electric furnace | DRI | Fe | Global | 2008 | 65.8 | Mt | yr | none | none |
| 4 | Electric furnace | Secondary metallurgy | Liquid steel | Fe | Global | 2008 | 410.3 | Mt | yr | none | none |


In [6]:
# Check in classification_items whether (a) the classification_ids exist and
#  (b) the attributes exist

validate.check_classification_items(class_names, file_data, crash=False)



[True, True, True, ..., True]

(143 values shown in the captured output, all `True`)
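
To make the idea of an attribute check more concrete, here is a minimal sketch of a membership test in plain Python. It is not the implementation of `check_classification_items`; the rows are taken from the `file_data.head()` output above, while `known_items` and the helper `unknown_attributes` are hypothetical stand-ins for what is stored in `classification_items`.

```python
# A few rows from the file_data.head() output above (subset of columns).
rows = [
    {"origin_process": "End-of-life scrap market", "element": "Fe", "region": "Global"},
    {"origin_process": "Iron ore market", "element": "Fe", "region": "Global"},
    {"origin_process": "Direct reduction", "element": "Fe", "region": "Global"},
]

# Hypothetical content of the database: known attribute values per column.
known_items = {
    "element": {"Fe", "C"},
    "region": {"Global"},
}


def unknown_attributes(rows, known_items):
    """Map each checked column to the attribute values missing from known_items."""
    unknown = {}
    for column, known in known_items.items():
        found = {row[column] for row in rows if column in row}
        extra = sorted(found - known)
        if extra:
            unknown[column] = extra
    return unknown


print(unknown_attributes(rows, known_items))
```

An empty dictionary corresponds to the all-`True` case; any entry lists the attribute values that would have to be added first.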

## Writing to the database

On to some more exciting features.

***For the following functions to work, the classifications and attributes must first be deleted from the tables `classification_definitions` and `classification_items`.*** That means they will only work if the two checks above ran without warning. For this demonstration, I run:

```sql
DELETE FROM iedc_review.classification_definition WHERE id IN (48, 49, 50);
DELETE FROM iedc_review.classification_items WHERE classification_id IN (48, 49, 50);
```

The `id`s 48, 49, 50 were named above.
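
If the clean-up should be run from Python rather than from a SQL console, the statements can be built with placeholders instead of hard-coded ids. This is only a sketch: `build_delete` is a hypothetical helper, and the commented-out part assumes an already open `pymysql` connection named `conn`, which is not created here. Table and column names are inserted directly into the string, so they must come from trusted code, never from user input.

```python
def build_delete(table, column, ids):
    """Return (sql, params) for a parameterized DELETE ... WHERE column IN (...)."""
    ids = list(ids)
    if not ids:
        raise ValueError("ids must not be empty")
    # pymysql uses %s as its placeholder style.
    placeholders = ", ".join(["%s"] * len(ids))
    sql = f"DELETE FROM {table} WHERE {column} IN ({placeholders});"
    return sql, ids


statements = [
    build_delete("iedc_review.classification_definition", "id", [48, 49, 50]),
    build_delete("iedc_review.classification_items", "classification_id", [48, 49, 50]),
]

for sql, params in statements:
    print(sql, params)
    # With an open pymysql connection `conn` (assumed, not created here):
    #     with conn.cursor() as cur:
    #         cur.execute(sql, params)
    #     conn.commit()
```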

In [8]:
# This will create the classifications in the `classification_definitions` table
# For demonstration purposes, I will *now* delete classification_ids and attributes
#  that appeared as warnings above.

validate.create_db_class_defs(candidate_file)

Wrote custom classification 'origin_process__1_F_steel_SankeyFlows_2008_Global' to classification_definitions
Wrote custom classification 'destination_process__1_F_steel_SankeyFlows_2008_Global' to classification_definitions
Wrote custom classification 'commodity__1_F_steel_SankeyFlows_2008_Global' to classification_definitions


In [9]:
# Writes the attributes of a custom classification to the database `classification_items`

validate.create_db_class_items(candidate_file)

Wrote attributes for custom classification '51' to classification_items
Wrote attributes for custom classification '52' to classification_items
Wrote attributes for custom classification '53' to classification_items


In [10]:
# Add the contributor to the `users` table, if not known yet

validate.add_user(candidate_file)

User 'Stefan Pauliuk' already exists in db table users


In [11]:
# Add a licence to the `licences` table, if not known yet

validate.add_license(candidate_file)

Licence 'MIT' already exists in db table 'licences'


### Data upload

Finally, it is time to upload the data to the `data` table. **For this to work, the `data` table must not yet contain values for this `dataset_id`.**
For demonstration purposes, I will delete the values in question:

```sql
DELETE FROM iedc_review.data WHERE dataset_id = 1;
```

*The script should also fail if there is a value in the `datasets` table, but we first need to figure out who creates that table.* Therefore `crash=False` for now.
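
The precondition on the `data` table can be illustrated with a small pre-check. The sketch below uses an in-memory SQLite database from the standard library as a stand-in for the MySQL database, so that it runs without a server; the reduced table layout and the helper `dataset_has_rows` are assumptions made for this example. Note that SQLite uses `?` as placeholder, whereas `pymysql` uses `%s`.

```python
import sqlite3


def dataset_has_rows(conn, dataset_id):
    """Return True if the `data` table already holds rows for dataset_id."""
    cur = conn.execute(
        "SELECT COUNT(*) FROM data WHERE dataset_id = ?", (dataset_id,)
    )
    return cur.fetchone()[0] > 0


# Stand-in database with a reduced, assumed layout of the `data` table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (dataset_id INTEGER, value REAL)")
conn.execute("INSERT INTO data VALUES (1, 290.0)")

before = dataset_has_rows(conn, 1)  # rows exist, so an upload should be refused

conn.execute("DELETE FROM data WHERE dataset_id = 1")
after = dataset_has_rows(conn, 1)   # rows removed, so an upload could proceed

print(before, after)
conn.close()
```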

In [4]:
# Upload the data to the `data` table.
# The `data` table must not contain values for this dataset_id yet;
#  for demonstration purposes, they were deleted beforehand.

validate.upload_data(candidate_file, crash=False)

Wrote data for '1_F_steel_SankeyFlows_2008_Global'


That's it for now...