# Running a Monte Carlo Simulation
## Before Starting
Run this Monte Carlo simulation only after completing the first tutorial, which you can download
[here](https://bitbucket.org/phanchieta/lca/src/master/Tutorial.ipynb).

## Creating Uncertainty
The first step in running the Monte Carlo simulation is to define the uncertainty. You can add uncertainty in one or more of the following areas:
- Process Model inputs
- Common Data
- Parameters
- Material Properties

We will go over each of them individually and then run a full simulation. Your simulation may include any combination of these.

## Process Model Uncertainty
The most common case is probably adding uncertainty to the individual process models. As you may recall from the previous tutorial, we created a dictionary containing all the process models, called `Treatment_processes`. Let's take a brief look.

In [None]:
Treatment_processes

We can see our `AD`, `COMP`, `LF`, `WTE`, and `REPROC` models with their objects (e.g. `<AD.AD at 0xb8ae4e0>`) and input types. To add uncertainty, we access this dictionary and modify the instantiated model stored inside it. We will first add uncertainty to `ad_pCasCH4` inside `Curing_Bio`. Note that every model has an input object (for example `AD_input` for `AD` and `Comp_input` for `COMP`) where the input uncertainty can be added. Let's take a look at this dictionary entry before we add the uncertainty.

In [None]:
Treatment_processes['AD']['model'].AD_input.Curing_Bio['ad_pCasCH4']

Now we can add our uncertainty. There are two ways of doing that: rewriting the whole dictionary, or adding the individual fields to it. In this case it is simply easier to rewrite the whole thing.

In [None]:
Treatment_processes['AD']['model'].AD_input.Curing_Bio['ad_pCasCH4']={'Name':'Proportion of emitted C emitted as CH4','amount':0.017,'unit':None,'Reference':'19',
                          'uncertainty_type':3,'loc':0.017,'scale':0.004}
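
The second way, adding the individual fields, leaves the existing entries untouched. Below is a minimal sketch that uses a plain dictionary as a stand-in for `Curing_Bio['ad_pCasCH4']`. The name `entry` and its starting fields are assumptions, taken from the rewritten dictionary above with the uncertainty fields removed.

```python
# Stand-in for Treatment_processes['AD']['model'].AD_input.Curing_Bio['ad_pCasCH4']
# before any uncertainty is added (assumed to hold only the descriptive fields).
entry = {
    'Name': 'Proportion of emitted C emitted as CH4',
    'amount': 0.017,
    'unit': None,
    'Reference': '19',
}

# Add only the uncertainty fields; the existing keys are left as they are.
entry.update({'uncertainty_type': 3, 'loc': 0.017, 'scale': 0.004})

# The same dictionary written out in full, for comparison.
rewritten = {
    'Name': 'Proportion of emitted C emitted as CH4', 'amount': 0.017,
    'unit': None, 'Reference': '19',
    'uncertainty_type': 3, 'loc': 0.017, 'scale': 0.004,
}
print(entry == rewritten)  # True: both approaches give the same fields
```

On the real object, the same `update` call would be made on `Treatment_processes['AD']['model'].AD_input.Curing_Bio['ad_pCasCH4']`, assuming that entry is an ordinary Python dictionary.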

You can see that we added 3 new fields to the dictionary: `uncertainty_type`, `loc` and `scale`. The full documentation for these fields is in the `stats_arrays` package documentation, located [here](https://stats-arrays.readthedocs.io/en/latest/). Briefly, `uncertainty_type=3` is a Normal distribution, and `loc` and `scale` are its mean and standard deviation, respectively. Let's add some more, this time to the Composting model:

In [None]:
Treatment_processes['COMP']['model'].Comp_input.Biological_Degredation['pCasCH4'] = {"Name":"Proportion of emitted C emitted as CH4","amount":0.017,"unit":None,"Reference":'12',
                           'uncertainty_type':3,'loc':0.017,'scale':0.004}

You can choose other distributions, and set bounds such as `minimum` and `maximum`, according to the `stats_arrays` package. All of its distributions are available here.
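
To make `loc` and `scale` concrete, here is a rough sketch that draws values from a Normal distribution with the same mean and standard deviation as the entries above. It uses Python's standard `random` module as a stand-in, not `stats_arrays` or the simulation itself, and the dictionary `entry` is a hypothetical copy of the relevant fields.

```python
import random
import statistics

# Hypothetical copy of the uncertainty fields used above.
entry = {'amount': 0.017, 'uncertainty_type': 3, 'loc': 0.017, 'scale': 0.004}

rng = random.Random(1)  # fixed seed so the draws are repeatable

# Each draw is one possible value of the input in one Monte Carlo run.
draws = [rng.gauss(entry['loc'], entry['scale']) for _ in range(10_000)]

sample_mean = statistics.mean(draws)
sample_std = statistics.stdev(draws)
print(f"mean = {sample_mean:.4f}, std = {sample_std:.4f}")
```

With enough draws, the sample mean and standard deviation settle close to `loc` and `scale`.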
## Common Data
All of the process models share some data and, to avoid duplication, they all read it from the Common Data class. To create uncertainty for all process models, you can add it in the Common Data class. Because the process models instantiate the Common Data class in the background, we must take the following steps:
- Instantiate our own Common Data class
- Add uncertainty to it
- Tell the simulation to use our Common Data class when performing Monte Carlo

Let's first import and instantiate it:

In [None]:
import CommonData as cd
CommonData = cd.CommonData()

Now let's add uncertainty:

In [None]:
CommonData.Land_app['cmpLandDies']={"Name":"Compost application diesel use","amount":0.8,"unit":'L/Mg compost',"Reference":None,
                               'uncertainty_type':3,'loc':0.8,'scale':0.2}

You can add more uncertainty in the same way, according to your needs.
## Parameters
Parameters can also have uncertainty, but since every group of parameters must sum to 1, there are a few considerations. If a group contains only one parameter, that parameter will always be equal to 1, even if uncertainty is added. In a group with more than one parameter, adding uncertainty to a single parameter affects all the others, since they are normalized. To learn more about the parameters, check out the Parameter class in the PySWOLF Documentation.
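
The effect of normalization can be sketched as follows. This assumes a simple proportional rescaling; the normalization that PySWOLF actually applies may differ. The helper `normalize` and the second parameter name are hypothetical and only serve the illustration.

```python
def normalize(fractions):
    """Rescale a group of non-negative fractions so that they sum to 1."""
    total = sum(fractions.values())
    if total <= 0:
        raise ValueError("at least one fraction must be positive")
    return {name: value / total for name, value in fractions.items()}

# Hypothetical group of two parameters; the second name is illustrative only.
group = {'frac_of_Other_Residual_from_AD_to_LF': 0.8,
         'frac_of_Other_Residual_from_AD_to_WTE': 0.2}

# Suppose one Monte Carlo draw moves the first parameter from 0.8 to 1.2.
sampled = dict(group, frac_of_Other_Residual_from_AD_to_LF=1.2)
normalized = normalize(sampled)
print(normalized)

# A group with a single parameter always normalizes to 1.
single = normalize({'only_parameter': 0.37})
print(single)
```

The second parameter was never sampled, yet its normalized value drops below 0.2 because the group must still sum to 1.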

For this example, let's add uncertainty to the parameter `frac_of_Other_Residual_from_AD_to_LF`. Adding uncertainty to parameters is done by calling a function of the Project object and passing a series of arguments. These arguments are the same as the keys in the Common Data and Process Model input dictionaries, and all follow the `stats_arrays` package.

In [None]:
demo.unified_params.add_uncertainty('frac_of_Other_Residual_from_AD_to_LF', loc = 0.8, scale = 0.3, uncertainty_type = 7, minimum = 0, maximum = 2)

## Material Properties
To add uncertainty to the Material Properties, edit the file 'Material Properties - process models.xlsx', which is provided with the checked-out code. This file contains a series of columns that follow the same pattern as the other uncertainty types described above. Fill in those columns to define the uncertainty.

## Running the Simulation
Before we can run the simulation, we have to set up a few things. Since the simulation is an independent module, we must specify the following for our run:

- Functional Unit
- Method(s)
- Project name


In [None]:
functional_unit = {db.get("scenario1") : 1}
method = [('SWOLF_IPCC','SWOLF'),('SWOLF_Acidification','SWOLF')]

The functional unit indicates which scenario we are going to run; we can get it from the database. `method` must be a list of methods that the simulation will iterate through, so even a single method must be passed as a one-element list. In the previous notebook, we defined a variable called `Project_name` with the value "Organic Analysis". We will use it again here.
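
The reason a single method still needs to be wrapped in a list can be sketched in plain Python, assuming the simulation walks over `method` with an ordinary loop. The variable names below are illustrative only.

```python
# One method, still wrapped in a list: iteration yields the whole tuple.
single_method = [('SWOLF_IPCC', 'SWOLF')]
as_list = [m for m in single_method]

# A bare tuple would instead be iterated element by element,
# so each string would be treated as if it were a method.
bare_tuple = ('SWOLF_IPCC', 'SWOLF')
as_tuple = [m for m in bare_tuple]

print(as_list)
print(as_tuple)
```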

Now, because the simulation may or may not include process models, and there can be several process models of one kind, we must tell it which ones to simulate. To do that, we pass two lists: one with the process models themselves and one with their names.

In [None]:
process_models = list()
process_model_names = list()

process_models.append(Treatment_processes['AD']['model'])
process_model_names.append('AD')

process_models.append(Treatment_processes['COMP']['model'])
process_model_names.append('COMP')

These two lists will tell the simulation that we added uncertainty to `AD` and `COMP` and their respective models. This is necessary because you might have `AD1` and `AD2` and only wish to add uncertainty to `AD1`, for example.
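
When many process models are involved, the two lists can also be built from a single list of names so that they stay aligned. The sketch below uses a stand-in dictionary with strings in place of the real model objects, and assumes the simulation pairs the two lists by position.

```python
# Stand-in for the Treatment_processes dictionary from the previous tutorial;
# plain strings take the place of the real model objects.
Treatment_processes_demo = {
    'AD': {'model': 'AD model object'},
    'COMP': {'model': 'COMP model object'},
    'LF': {'model': 'LF model object'},
}

# Names of the processes that received uncertainty.
selected = ['AD', 'COMP']

process_model_names_demo = list(selected)
process_models_demo = [Treatment_processes_demo[name]['model'] for name in selected]

# Both lists have the same length and the same order.
for name, model in zip(process_model_names_demo, process_models_demo):
    print(name, '->', model)
```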

Once this is done, we create our simulation object. This can look a bit tricky because the constructor takes many arguments, but most of them are optional, depending on the simulation. Let's first import the module and instantiate the class.

In [None]:
import building_matrices as bm
pd = bm.ParallelData(functional_unit, 
                     method, 
                     Project_name, 
                     process_models=process_models, 
                     process_model_names=process_model_names, 
                     parameters=demo.unified_params, 
                     common_data=CommonData, 
                     seed = 1)

This class instantiation might look a little intimidating, but it is not hard to understand once we break it down. All the arguments after `Project_name` are optional; for the sake of this example, we are using them all. If you only want Process Model uncertainty, you will not need many of them. `process_models` and `process_model_names` are the two lists we just created, which define the process models we are running uncertainty for. The `parameters` argument must be the `unified_params` attribute of the `Project` object (`demo` here), and `common_data` is the Common Data object that we created and added uncertainty to. During the simulation, this Common Data object replaces the one inside each process model passed in `process_models` and `process_model_names`. So, if you want Common Data uncertainty across all process models, they must all be included in these lists.
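
The `seed = 1` argument is not described here. Assuming it seeds the random number generator so that repeated runs draw the same samples, the idea can be sketched with the standard `random` module (not with `ParallelData` itself). The helper `draw` is hypothetical.

```python
import random

def draw(seed, n=3):
    """Return n Normal(0.017, 0.004) draws from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.gauss(0.017, 0.004) for _ in range(n)]

# The same seed reproduces the same draws; a different seed does not.
print(draw(1) == draw(1))
print(draw(1) == draw(2))
```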

Let's finally run our simulation. The first argument is the number of threads to use. Most modern machines have 4 or 8 cores, so the speedup from running in parallel can be considerable. The second argument is the number of runs. For this simple example, let's use 100.

In [None]:
pd.run(4,100)
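
If you are unsure how many cores a machine has, the thread count can be derived from the operating system instead of being hard-coded. This is a small sketch using only the standard library; the names `n_threads` and `n_runs` are illustrative.

```python
import os

# os.cpu_count() can return None when the count cannot be determined.
available = os.cpu_count() or 1

# Use at most 4 workers, but never more than the machine reports.
n_threads = min(4, available)
n_runs = 100
print(n_threads, n_runs)
```

These values could then be passed as `pd.run(n_threads, n_runs)`.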

Once the simulation is complete, we can convert the results to a DataFrame with:

In [None]:
output=pd.result_to_DF()

Or dump them to a pickle file with:

In [None]:
pd.save_results('results.p')
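
To read such a file back in a later session, the standard `pickle` module should be enough, assuming `save_results` writes an ordinary pickle file. The sketch below writes a placeholder object to a temporary file and reads it back. The object `results_demo` is a stand-in and does not reflect the real structure of the results.

```python
import os
import pickle
import tempfile

# Placeholder object; the real structure written by save_results may differ.
results_demo = {'placeholder': [1, 2, 3]}

# Write to a temporary directory so no real results file is touched.
path = os.path.join(tempfile.mkdtemp(), 'results.p')
with open(path, 'wb') as f:
    pickle.dump(results_demo, f)

# Reading the file back, as one would in a later session.
with open(path, 'rb') as f:
    loaded = pickle.load(f)

print(loaded == results_demo)
```

Only unpickle files that you created yourself or that come from a trusted source, since loading a pickle can execute arbitrary code.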