# Overview
This notebook is a brief example of how to change model parameters through a CSV file. We will be using the following function:
- `set_param_file` function: sets parameter values for the context (node, year, technology, and sub-parameter) given in each row of a CSV file

# Build the model
We need to first have the model ready before using the function. All steps of building the model are the same as indicated in the Quickstart notebook. We do not run the model yet as we need to make the changes beforehand.

In [1]:
import pyCIMS
import pprint as pp

# description file
model_description_file = '../../model_descriptions/pyCIMS_model_description_Alberta_Test.xlsb'

# model validator
model_validator = pyCIMS.ModelValidator(model_description_file)
model_validator.validate(verbose=True, raise_warnings=False)

# Model Reader
model_reader = pyCIMS.ModelReader(infile=model_description_file,
                                  sheet_map={'model': 'Model', 
                                             'default_tech': 'Technology_Node templates'},
                                  node_col='Node')

# Model
model = pyCIMS.Model(model_reader)



0 node name/branch mismatches. 
0 references to unspecified nodes. 
0 non-root nodes are never referenced. 
0 nodes were specified but don't provide a service. 
0 nodes had invalid competition types. 
0 nodes requested services of themselves. 
0 nodes have 0 in the output line. 
0 fuel nodes don't have an Life Cycle Cost. 
0 tech compete nodes don't have capital cost. 
0 technologies are missing a base year market share. 
0 tech compete nodes don't contain Technology or Service headings 


# The input file
Let's take a look at what the input CSV file should look like. You will find an example in SetParams_script.csv:

In [2]:
import pandas as pd

pd.read_csv('SetParams_script.csv', delimiter=',')

Unnamed: 0,node,node_regex,param,tech,sub_param,year_operator,year,val_operator,val,search_param,search_operator,search_pattern
0,pyCIMS.Canada.Alberta.Residential.Buildings.Fl...,,Market share,Incandescent,,==,2000,==,0.8,,,
1,pyCIMS.Canada.Alberta.Residential.Buildings.Fl...,,Market share,CFL,,==,2000,==,0.2,,,
2,pyCIMS.Canada.Alberta.Transportation Personal....,,Service requested,,Recent Car,>=,2000,==,0.7,,,
3,pyCIMS.Canada.Alberta.Transportation Personal....,,Service requested,,Old Truck,>=,2000,==,0.3,,,
4,pyCIMS.Canada.Alberta.Transportation Personal....,,Heterogeneity,,,>=,2000,==,25.0,,,
5,.*,,Price Multiplier,,.*,>=,2000,==,3.0,competition type,==,sector
6,.*,,Price Multiplier,,.*,>=,2000,==,3.0,competition type,==,sector no tech
7,.*,,Tax,,CO2,>=,2025,>=,50.0,competition type,==,sector
8,,.*Pumping\.Precision\.Small$,Heterogeneity,,,>=,2000,==,0.6,,,
9,,^pyCIMS\.Canada\.Alberta\.Residential\.Buildin...,Heterogeneity,,,==,2000,==,1.3,,,


The input CSV should contain the following columns:
1. **`node`** : is either empty, '.*', or the name of the node whose parameter you are interested in
    - `empty` : This indicates that the node value should not be taken from this column. The function will look at the node_regex column instead to determine which nodes to change.
    - `.*` : This indicates that the function should look for all nodes satisfying the conditions in the search_param, search_operator, and search_pattern columns.
    - `node name` : This indicates that the function should change the values corresponding to this node name.
2. **`node_regex`** : is either empty or a regex expression
    - `empty` : This indicates that the function should not use this node_regex column to determine which nodes to change. It will look at the node column instead. Note that if node is empty, node_regex cannot be empty. If node_regex is empty, node cannot be empty.
    - `regex expression` : This is the regex expression the function will use to search for nodes that satisfy this pattern. See below for a quick regex tutorial.
3. **`param`** : is the name of the parameter you are interested in. This cell cannot be empty
4. **`tech`** : is either empty, '.*', or the name of the technology you are interested in
    - `empty` : This indicates that there is no technology specified.
    - `.*` : This indicates that the function should look through all possible technologies given the node, parameter, and year from the corresponding columns.
    - `technology name` : This is the name of the technology you are interested in
5. **`sub_param`** : is either empty, '.*', or the name of the sub-parameter you are interested in
    - `empty` : This indicates that there is no sub-parameter specified.
    - `.*` : This indicates that the function should look through all possible sub-parameters given the node, parameter, year, and technology (if specified) from the corresponding columns.
    - `sub-parameter name` : This is the name of the sub-parameter you are interested in
6. **`year_operator`** : is one of <=, <, >=, >, == and indicates the range of years you are interested in. This cell cannot be empty.
7. **`year`** : is an integer value indicating the year you are interested in. It is used with the `year_operator` to determine the list of years you are interested in. This cell cannot be empty.
8. **`val_operator`** : is one of <=, >=, == and is used with the `val` column.
    - `<=` : This indicates that the value will be changed to val if the original value is higher than val
    - `>=` : This indicates that the value will be changed to val if the original value is less than val
    - `==` : This indicates that the value will be changed to val
9. **`val`** : is the new value you would like to change to. This cell cannot be empty.
10. **`search_param`** : is the parameter name to be searched. This cell is only populated when node is .*
11. **`search_operator`** : is the parameter operator to be applied to search_param and search_pattern. It is either empty or ==. This cell is only populated when node is .*
12. **`search_pattern`** : is the value of the parameter specified in search_param to be searched. This cell is only populated when node is .*
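
As a rough illustration of how the `year_operator`/`year` and `val_operator`/`val` pairs described above behave, here is a minimal sketch. `select_years` and `resolve_value` are hypothetical helpers written for this example, not part of pyCIMS, and the list of years is assumed from the model run in this notebook (2000 to 2050 in 5-year steps).

```python
import operator

# Comparison operators allowed in the year_operator column.
YEAR_OPS = {'<=': operator.le, '<': operator.lt, '>=': operator.ge,
            '>': operator.gt, '==': operator.eq}


def select_years(model_years, year_operator, year):
    """Return the model years y for which `y <year_operator> year` holds."""
    compare = YEAR_OPS[year_operator]
    return [y for y in model_years if compare(y, year)]


def resolve_value(old_value, val_operator, val):
    """Return the value a parameter ends up with under the val_operator rules."""
    if val_operator == '==':
        return val                    # always overwrite
    if val_operator == '<=':
        return min(old_value, val)    # lower to val only if currently higher
    if val_operator == '>=':
        return max(old_value, val)    # raise to val only if currently lower
    raise ValueError(f'Unsupported val_operator: {val_operator!r}')


# Assumed model years: 2000 to 2050 in 5-year steps.
model_years = list(range(2000, 2051, 5))

print(select_years(model_years, '>=', 2025))  # [2025, 2030, 2035, 2040, 2045, 2050]
print(select_years(model_years, '==', 2000))  # [2000]
print(resolve_value(30.0, '>=', 50.0))        # 50.0 (raised to val)
print(resolve_value(80.0, '>=', 50.0))        # 80.0 (already at least val)
print(resolve_value(0.92, '==', 0.8))         # 0.8 (always overwritten)
```

Read this way, `>=` acts as a floor and `<=` as a ceiling on the existing value, while `==` replaces it unconditionally.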

Note: the NaN values you see in the above table are empty values in the CSV
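
The "cannot be empty" rules in the column list can be checked before a file is passed to `set_param_file`. The following is a minimal sketch using only the standard-library `csv` module, which reads empty cells as empty strings rather than NaN. `check_row` and `SAMPLE` are hypothetical names for illustration and are not part of pyCIMS; the sample also leaves out the unnamed index column.

```python
import csv
import io

# Two-row stand-in for a script file (hypothetical); the second row is
# deliberately invalid because both node and node_regex are empty.
SAMPLE = (
    "node,node_regex,param,tech,sub_param,year_operator,year,val_operator,val,"
    "search_param,search_operator,search_pattern\n"
    r",.*Pumping\.Precision\.Small$,Heterogeneity,,,>=,2000,==,0.6,,," "\n"
    ",,Heterogeneity,,,>=,2000,==,0.6,,,\n"
)


def check_row(row):
    """Return a list of problems found in one script row (empty if none)."""
    problems = []
    # At least one of node / node_regex must be filled in.
    if not row['node'] and not row['node_regex']:
        problems.append('node and node_regex are both empty')
    # Cells the column description says cannot be empty.
    for required in ('param', 'year', 'val'):
        if not row[required]:
            problems.append(f'{required} is empty')
    if row['year_operator'] not in ('<=', '<', '>=', '>', '=='):
        problems.append(f"invalid year_operator {row['year_operator']!r}")
    if row['val_operator'] not in ('<=', '>=', '=='):
        problems.append(f"invalid val_operator {row['val_operator']!r}")
    return problems


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
for index, row in enumerate(rows):
    print(index, check_row(row))
```

For a file on disk, the same check could be run over `csv.DictReader(open(path, newline=''))`.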

### Regex Tutorial
Regex is short for regular expression and is a sequence of characters that specifies a search pattern. There are many resources online (https://www.rexegg.com/regex-quickstart.html) to learn more about writing Regex expressions but here are some basic ideas we will be using in the following examples:
1. Characters:
    1. `.` represents any character (except line break)
    2. `\` escapes a special character. For example, `.` means any character but `\.` means a period. 
2. Quantifiers:
    1. `*` means zero or more times
    2. `+` means one or more times
3. Anchors and Boundaries:
    1. `^` means the start of a string
    2. `$` means the end of the string
    
You can use https://regex101.com/ to check whether your regex expression is working as expected
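
The same building blocks can also be tried directly in Python with the standard-library `re` module. This is only a small sketch on made-up strings, not on node names from the model:

```python
import re

# `.` matches any character, so an unescaped dot is more permissive
# than a literal period.
print(bool(re.fullmatch('a.c', 'abc')))    # True
print(bool(re.fullmatch('a.c', 'a.c')))    # True
print(bool(re.fullmatch(r'a\.c', 'abc')))  # False
print(bool(re.fullmatch(r'a\.c', 'a.c')))  # True

# `*` allows zero occurrences; `+` requires at least one.
print(bool(re.fullmatch('ab*', 'a')))      # True
print(bool(re.fullmatch('ab+', 'a')))      # False

# `^` and `$` pin the pattern to the start and end of the string.
print(bool(re.search('^Small', 'Small.Pump')))  # True
print(bool(re.search('Small$', 'Small.Pump')))  # False
```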

### Example Regex Expressions

#### Example 1
Match any string that ends with 'Pumping.Precision.Small' (e.g. pyCIMS.Canada.Alberta.Coal Mining.Pumping.Precision.Small, pyCIMS.Canada.Alberta.Pulp  Paper.Pumping.Precision.Small etc)

Let's look at how to build the corresponding Regex expression. 

We want to be able to search for all node names that end with the string 'Pumping.Precision.Small'. In Regex, since `.` is a special character representing any character, we will write the string as `'Pumping\.Precision\.Small'`, where `\.` represents a period. 

We want to look for node names that end with this string so we will add `.*` to the front of the string and `$` to the end. `.*` specifies that we can have zero or more of any character before the string and `$` specifies that there should be nothing after this string.

The final resulting Regex expression is `'.*Pumping\.Precision\.Small$'`
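
To check this expression outside the model, it can be run against a few candidate names with Python's `re` module. This sketch assumes matching behaves like `re.match`; which `re` function pyCIMS actually uses is not stated here, although for this pattern `match`, `search`, and `fullmatch` all agree. The second and third names are hypothetical and included only as non-matching cases.

```python
import re

pattern = re.compile(r'.*Pumping\.Precision\.Small$')

candidates = [
    'pyCIMS.Canada.Alberta.Coal Mining.Pumping.Precision.Small',
    # Hypothetical: extra text after 'Small', so `$` rejects it.
    'pyCIMS.Canada.Alberta.Coal Mining.Pumping.Precision.Small.Extra',
    # Hypothetical: no literal periods, so `\.` rejects it.
    'pyCIMS.Canada.Alberta.Coal Mining.PumpingXPrecisionXSmall',
]

matched = [name for name in candidates if pattern.match(name)]
print(matched)
# ['pyCIMS.Canada.Alberta.Coal Mining.Pumping.Precision.Small']
```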

#### Example 2
Match any string that starts with 'pyCIMS.Canada.Alberta.Residential.Buildings.Floorspace.' (e.g. pyCIMS.Canada.Alberta.Residential.Buildings.Floorspace.Solar Electricity, pyCIMS.Canada.Alberta.Residential.Buildings.Floorspace.Space Conditioning.Apartments etc)

Let's look at how to build the corresponding Regex expression. 

We want to be able to search for all node names that start with the string 'pyCIMS.Canada.Alberta.Residential.Buildings.Floorspace.'. As before, we will begin with the string as `'pyCIMS\.Canada\.Alberta\.Residential\.Buildings\.Floorspace\.'` where `\.` represents a period. 

We want to look for node names that start with this string so we will add `^` to the start and `.*` to the end of the string. `^` specifies that there should be nothing before this string and `.*` specifies that we can have zero or more of any character after the string.

The final resulting Regex expression is `'^pyCIMS\.Canada\.Alberta\.Residential\.Buildings\.Floorspace\..*'`

#### Example 3
Match any string that has '.Residential.Buildings.Floorspace.' anywhere in the string except at the very front or very end of the string (e.g. pyCIMS.Canada.Alberta.Residential.Buildings.Floorspace.Lighting, pyCIMS.Canada.Alberta.Residential.Buildings.Floorspace.Space Conditioning.Single Family Attached etc)

Let's look at how to build the corresponding Regex expression. 

We want to be able to search for all node names that contain the string '.Residential.Buildings.Floorspace.'. As before, we will begin with the string as `'\.Residential\.Buildings\.Floorspace\.'` where `\.` represents a period. 

We want to look for node names that contain this string. We will add `.+` to the front and back of the string, where `.+` represents one or more of any character.

The final resulting Regex expression is `'.+\.Residential\.Buildings\.Floorspace\..+'`
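
The difference between the "starts with" expression from Example 2 and the "contains" expression from Example 3 shows up on edge cases. The sketch below uses Python's `re` module; `bare` and `other_region` are hypothetical names made up for the comparison and may not exist in the model.

```python
import re

starts_with = re.compile(
    r'^pyCIMS\.Canada\.Alberta\.Residential\.Buildings\.Floorspace\..*')
contains = re.compile(r'.+\.Residential\.Buildings\.Floorspace\..+')

lighting = 'pyCIMS.Canada.Alberta.Residential.Buildings.Floorspace.Lighting'
# Hypothetical: nothing follows the final period.
bare = 'pyCIMS.Canada.Alberta.Residential.Buildings.Floorspace.'
# Hypothetical: same tail, different prefix.
other_region = 'pyCIMS.Canada.Ontario.Residential.Buildings.Floorspace.Lighting'

for name in (lighting, bare, other_region):
    print(bool(starts_with.match(name)), bool(contains.match(name)))
# True True    (lighting satisfies both)
# True False   (`.*` accepts an empty tail, `.+` does not)
# False True   (prefix differs, but the substring is still present)
```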

# Using the set_param_file function

<div class="alert alert-block alert-success">
<b></b> The function has 1 argument: filepath.
</div>

This function is under the `model` class and has 1 required argument:
* filepath : the path to the CSV file 

In [3]:
model.set_param_file('SetParams_script.csv')

Row 8: Unable to access parameter at get_param(Tax, pyCIMS.Canada.Alberta.Ethanol, 2025, None, CO2). Corresponding value was not set to 50.0.
Row 8: Unable to access parameter at get_param(Tax, pyCIMS.Canada.Alberta.Ethanol, 2030, None, CO2). Corresponding value was not set to 50.0.
Row 8: Unable to access parameter at get_param(Tax, pyCIMS.Canada.Alberta.Ethanol, 2035, None, CO2). Corresponding value was not set to 50.0.
Row 8: Unable to access parameter at get_param(Tax, pyCIMS.Canada.Alberta.Ethanol, 2040, None, CO2). Corresponding value was not set to 50.0.
Row 8: Unable to access parameter at get_param(Tax, pyCIMS.Canada.Alberta.Ethanol, 2045, None, CO2). Corresponding value was not set to 50.0.
Row 8: Unable to access parameter at get_param(Tax, pyCIMS.Canada.Alberta.Ethanol, 2050, None, CO2). Corresponding value was not set to 50.0.
Row 8: Unable to access parameter at get_param(Tax, pyCIMS.Canada.Alberta.Biodiesel, 2025, None, CO2). Corresponding value was not set to 50.0.
Row 

You will notice that some messages may be printed such as:

`Row 8: Unable to access parameter at get_param(Tax, pyCIMS.Canada.Alberta.Ethanol, 2025, None, CO2). Corresponding value was not set to 50.0.`

This will be printed if the value at a certain context (node, year, param, tech, sub-param) could not be accessed. In the above example, `pyCIMS.Canada.Alberta.Ethanol` satisfied the search_param, search_operator, and search_pattern conditions, but it does not have a Tax parameter at year 2025.
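
When many of these messages are printed, it can help to tally them per row and node. The sketch below assumes the messages are already available as text lines in exactly the format shown above; `MESSAGE` and `skipped` are hypothetical names, and how to capture the output from pyCIMS is not covered here.

```python
import re
from collections import Counter

# Pattern for the message format shown above (assumed to be stable).
MESSAGE = re.compile(
    r'Row (?P<row>\d+): Unable to access parameter at '
    r'get_param\((?P<param>[^,]+), (?P<node>[^,]+), (?P<year>\d+), '
    r'(?P<tech>[^,]+), (?P<sub_param>[^)]+)\)\. '
    r'Corresponding value was not set to (?P<val>\S+)\.$'
)

messages = [
    'Row 8: Unable to access parameter at get_param(Tax, pyCIMS.Canada.Alberta.Ethanol, 2025, None, CO2). Corresponding value was not set to 50.0.',
    'Row 8: Unable to access parameter at get_param(Tax, pyCIMS.Canada.Alberta.Ethanol, 2030, None, CO2). Corresponding value was not set to 50.0.',
    'Row 8: Unable to access parameter at get_param(Tax, pyCIMS.Canada.Alberta.Biodiesel, 2025, None, CO2). Corresponding value was not set to 50.0.',
]

# Count skipped assignments per (row, node) pair.
skipped = Counter()
for line in messages:
    found = MESSAGE.match(line)
    if found:
        skipped[(int(found['row']), found['node'])] += 1

for (row, node), count in skipped.items():
    print(f'row {row}: {node} skipped for {count} year(s)')
```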

# Run the Model
We can now run the model, since all the changes have been applied.

In [4]:
# run the model 
model.run(max_iterations=5, show_warnings=False)

***** ***** year: 2000 ***** *****
iter 0
***** ***** year: 2005 ***** *****
iter 0
iter 1
***** ***** year: 2010 ***** *****
iter 0
iter 1
***** ***** year: 2015 ***** *****
iter 0
iter 1
iter 2
***** ***** year: 2020 ***** *****
iter 0
iter 1
iter 2
iter 3
***** ***** year: 2025 ***** *****
iter 0
iter 1
iter 2
iter 3
***** ***** year: 2030 ***** *****
iter 0
iter 1
iter 2
***** ***** year: 2035 ***** *****
iter 0
iter 1
***** ***** year: 2040 ***** *****
iter 0
iter 1
iter 2
***** ***** year: 2045 ***** *****
iter 0
iter 1
***** ***** year: 2050 ***** *****
iter 0
iter 1


# The set_param_log function

<div class="alert alert-block alert-success">
<b></b> The function has 1 argument: output_file.
</div>

This function is under the `model` class and only has 1 optional argument:
* output_file : the output file location where the change history CSV will be saved. If this is left blank, the file will be written to the current location with the name of the original model description and a timestamp in the filename.

### Example Usages

In [5]:
model.set_param_log(output_file='./change_log.csv') 

Let's take a look at this log file. The first column is the name of the model description used. The last two columns show the previous value of each context (node, year, tech, param, sub-param) and the new value it was changed to.

In [11]:
pd.read_csv('change_log.csv').head(10)

Unnamed: 0,base_model_description,node,year,technology,parameter,sub_parameter,old_value,new_value
0,pyCIMS_model_description_Alberta_Test,pyCIMS.Canada.Alberta.Residential.Buildings.Fl...,2000,Incandescent,Market share,,0.92,0.8
1,pyCIMS_model_description_Alberta_Test,pyCIMS.Canada.Alberta.Residential.Buildings.Fl...,2000,CFL,Market share,,0.08,0.2
2,pyCIMS_model_description_Alberta_Test,pyCIMS.Canada.Alberta.Transportation Personal....,2000,,Service requested,Recent Car,0.642948,0.7
3,pyCIMS_model_description_Alberta_Test,pyCIMS.Canada.Alberta.Transportation Personal....,2005,,Service requested,Recent Car,0.642948,0.7
4,pyCIMS_model_description_Alberta_Test,pyCIMS.Canada.Alberta.Transportation Personal....,2010,,Service requested,Recent Car,0.642948,0.7
5,pyCIMS_model_description_Alberta_Test,pyCIMS.Canada.Alberta.Transportation Personal....,2015,,Service requested,Recent Car,0.642948,0.7
6,pyCIMS_model_description_Alberta_Test,pyCIMS.Canada.Alberta.Transportation Personal....,2020,,Service requested,Recent Car,0.642948,0.7
7,pyCIMS_model_description_Alberta_Test,pyCIMS.Canada.Alberta.Transportation Personal....,2025,,Service requested,Recent Car,0.642948,0.7
8,pyCIMS_model_description_Alberta_Test,pyCIMS.Canada.Alberta.Transportation Personal....,2030,,Service requested,Recent Car,0.642948,0.7
9,pyCIMS_model_description_Alberta_Test,pyCIMS.Canada.Alberta.Transportation Personal....,2035,,Service requested,Recent Car,0.642948,0.7
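
One way to get a quick overview of such a log is to count the changes per parameter and compute how far each value moved. This is a minimal sketch using the standard-library `csv` module. `summarise_changes` and `SAMPLE_LOG` are hypothetical names, the column names are taken from the table above, and the sample rows reuse the first few displayed rows (with node names truncated as displayed and without the index column).

```python
import csv
import io
from collections import Counter

# Three rows copied from the displayed log; node names are truncated
# exactly as they appear in the table above.
SAMPLE_LOG = (
    "base_model_description,node,year,technology,parameter,sub_parameter,old_value,new_value\n"
    "pyCIMS_model_description_Alberta_Test,pyCIMS.Canada.Alberta.Residential.Buildings.Fl...,2000,Incandescent,Market share,,0.92,0.8\n"
    "pyCIMS_model_description_Alberta_Test,pyCIMS.Canada.Alberta.Residential.Buildings.Fl...,2000,CFL,Market share,,0.08,0.2\n"
    "pyCIMS_model_description_Alberta_Test,pyCIMS.Canada.Alberta.Transportation Personal....,2000,,Service requested,Recent Car,0.642948,0.7\n"
)


def summarise_changes(rows):
    """Return (changes per parameter, list of new_value - old_value)."""
    per_parameter = Counter()
    deltas = []
    for row in rows:
        per_parameter[row['parameter']] += 1
        delta = float(row['new_value']) - float(row['old_value'])
        deltas.append(round(delta, 6))  # rounded to hide float noise
    return per_parameter, deltas


per_parameter, deltas = summarise_changes(csv.DictReader(io.StringIO(SAMPLE_LOG)))
print(dict(per_parameter))
print(deltas)
```

To run it on the file written above, open `change_log.csv` with `newline=''` and pass `csv.DictReader(file)` to `summarise_changes` instead of the inline sample.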


# Saving and loading model
To save the model for later use and load it again, you can use the `save_model` and `load_model` functions.

The `save_model` function is under the `model` class and has 2 optional arguments:
* model_file : The model file location where the model file will be saved. If this is left blank, the model will be saved at the current location with the name of the original model description and a timestamp in the filename.
* save_changes : a boolean value that specifies whether the changes will be saved to a CSV file with a similar filename as the model_file

In [6]:
model.save_model(model_file='model_file_test.pkl', save_changes=False) 

In [7]:
model = pyCIMS.load_model(model_file='model_file_test.pkl')