
# Useful model instance properties and methods
This chapter introduces some properties and methods of the model instance.

First a model and its data are loaded, then a scenario is run, so that we have some content to work with.

A model instance gives the user access to a number of properties and methods that help in managing the model and its results. 

If ```mmodel``` is a model instance, ```mmodel.<property>``` will return a property. Some properties can also be assigned by the user:
>mmodel.property = something 

The model class itself also has a few properties. 

## Import the model class
This class incorporates most of the methods used to manage a model.

Assuming the ModelFlow library has been installed on your machine, the following imports set up your notebook so that you can run the cells in this notebook.

matplotlib.pyplot is also imported so that plots can be manipulated later. 

In [1]:
from modelclass import model 

In [2]:
import matplotlib.pyplot as plt

In [3]:
# housekeeping for development 
%load_ext autoreload   
%autoreload 2

## Class methods to help in Jupyter Notebook



### .widescreen() use Jupyter Notebook in widescreen 
Enables the whole viewing area of the browser.

In [4]:
model.widescreen() 

### .scroll_off() Turn off scrolling cells in Jupyter Notebook
Useful when a cell produces long output and you want to see all of it without scrolling inside the cell.

model.scroll_off()

## .modelload Load a pre-cooked model, data and descriptions 

In this notebook, we will be using a pre-existing model of Pakistan.

The file 'pak.pcim' has been created from an EViews workspace. It contains all that is needed to run the model: 

- Model equations
- Data
- Simulation options 
- Variable descriptions 

Using the 'modelload' method of the 'model' class, a model instance 'mpak' and a 'baseline' DataFrame are created.


In [5]:
mpak,baseline = model.modelload('../models/pak.pcim',run=1,silent=1,keep='Baseline')


**mpak** <br> 
The *modelload* method processes the file and initiates the model, which we call 'mpak' (m for model and pak for Pakistan), with both the equations and the data.

'mpak' is an instance of the model object with which we will work.

**baseline**  <br> 
'baseline' is a Pandas dataframe containing the data that was loaded. This data also exists inside the model object but can be accessed separately through 'baseline'.

**run=1** The model is simulated. The simulation time frame and options from the time the file was dumped will be used. <br>The two objects **mpak.basedf** and **mpak.lastdf** will contain the simulation result. If run=0 the model will not be simulated. 

**silent=1** Output during the simulation is suppressed. If silent is set to 0, information regarding the simulation will be displayed.

**keep='Baseline'** This saves the result in the dictionary mpak.keep_solutions under the key 'Baseline'.

## Create a scenario
Many objects relate to the comparison of different scenarios, so a scenario is created by updating some exogenous variables.<br>
In this case the carbon tax rates for gas, oil and coal are all set to 29 from 2020 to 2100. <br>Then the scenario is simulated. 
<br>After that the mpak object contains a number of useful properties and methods. 

You can find more on this experiment [here](../update/create_experiment.ipynb)

In [6]:
scenario_exo  =  baseline.upd("<2020 2100> PAKGGREVCO2CER PAKGGREVCO2GER PAKGGREVCO2OER = 29")

## () Simulate on a dataframe 
When the model instance is called like ```mpak(dataframe,start,end)``` the model is simulated for the time frame ```start``` to ```end``` using the dataframe. <br>
Just above we created a dataframe ```scenario_exo``` where the tax variables are updated. Now ```mpak``` can be simulated. We simulate from 2020 to 2100. 

In [7]:
scenario = mpak(scenario_exo,2020,2100,keep='Coal, Oil and Gastax : 29') # runs the simulation

## Access results 

Now we have two dataframes with results: ```baseline``` and ```scenario```. These dataframes can be manipulated and visualized
with the tools provided by the **pandas** library and others such as **Matplotlib** and **Plotly**. However, to make things easy the first and
latest simulation results are also stored in the mpak object:

- **mpak.basedf**: Dataframe with the values for baseline
- **mpak.lastdf**: Dataframe with the values for alternative  

This means that .basedf and .lastdf contain the same result after the first simulation. <br>
When new scenarios are simulated, the data in .lastdf is replaced with the latest results.

These dataframes are used by a number of model instance methods as you will see later. 

The user can assign dataframes to both .basedf and .lastdf. This is useful for comparing simulations which are not the first and last. 

In [8]:
print(f'mpak.basedf: Dataframe: with {mpak.basedf.shape[0]} years and {mpak.basedf.shape[1]} variables')
print(f'mpak.lastdf: Dataframe: with {mpak.lastdf.shape[0]} years and {mpak.lastdf.shape[1]} variables')

mpak.basedf: Dataframe: with 121 years and 1291 variables
mpak.lastdf: Dataframe: with 121 years and 1291 variables


### .keep_solutions, A dictionary of dataframes with results

Sometimes we want to compare more than two scenarios. Using ```keep='some description'``` in the simulation call, the dataframe with results is saved in the dictionary .keep_solutions, with the description as key and the dataframe as value.  

In our example we have created two scenarios: a baseline and a scenario with the tax set to 29. So mpak.keep_solutions looks like this: 

In [13]:
print('mpak.keep_solutions contains:')
for key,value in mpak.keep_solutions.items(): 
    print(f'key = {key:25}|Dataframe: {value.shape[0]} years and {value.shape[1]} variables')

mpak.keep_solutions contains:
key = Baseline                 |Dataframe: 121 years and 1291 variables
key = Coal, Oil and Gastax : 29|Dataframe: 121 years and 1291 variables


Sometimes it can be useful to reset keep_solutions so that new solutions can be inspected. This is done by replacing it with an empty dictionary. There are two ways:  
>mpak.keep_solutions = {}

or in the simulation call: 
>mpak(dataframe,start,end,keep='')

### More on manipulating keep_solution here: <a link > 

### .oldkwargs Options in the simulation call are persistent between calls 
When simulating a model the parameters are persistent, so the user only has to provide the 
solution options once. These persistent parameters are stored in the property .oldkwargs.

To reset the options: 
>mpak.oldkwargs = {}

In this case the persistent parameters are: 

In [10]:
mpak.oldkwargs

{'silent': 1, 'keep': 'Coal, Oil and Gastax : 29'}
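
The idea of persistent options can be illustrated with a few lines of plain Python. This is only a sketch of the concept: `ToyModel` is a hypothetical stand-in and not ModelFlow's implementation, and it only remembers keyword options instead of simulating anything.

```python
class ToyModel:
    """Hypothetical stand-in for a model instance; not ModelFlow code."""

    def __init__(self):
        self.oldkwargs = {}

    def __call__(self, **kwargs):
        # Options from earlier calls are kept; new ones override or extend them
        self.oldkwargs = {**self.oldkwargs, **kwargs}
        return dict(self.oldkwargs)   # the options this run would use


toy = ToyModel()
first = toy(silent=1, keep='Baseline')
second = toy(keep='Coal, Oil and Gastax : 29')   # silent=1 is remembered
toy.oldkwargs = {}                               # reset the persistent options
third = toy()
print(first, second, third, sep='\n')
```

After the reset, a call without options starts from an empty set of options again.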

## .current_per, The time frame operations are performed on
Most operations on a model class instance operate on the current time frame. 
It is a subset of the row index of the dataframe which is simulated. 

In this case it is: 

In [11]:
mpak.current_per

Int64Index([2020, 2021, 2022, 2023, 2024, 2025, 2026, 2027, 2028, 2029, 2030,
            2031, 2032, 2033, 2034, 2035, 2036, 2037, 2038, 2039, 2040, 2041,
            2042, 2043, 2044, 2045, 2046, 2047, 2048, 2049, 2050, 2051, 2052,
            2053, 2054, 2055, 2056, 2057, 2058, 2059, 2060, 2061, 2062, 2063,
            2064, 2065, 2066, 2067, 2068, 2069, 2070, 2071, 2072, 2073, 2074,
            2075, 2076, 2077, 2078, 2079, 2080, 2081, 2082, 2083, 2084, 2085,
            2086, 2087, 2088, 2089, 2090, 2091, 2092, 2093, 2094, 2095, 2096,
            2097, 2098, 2099, 2100],
           dtype='int64')

In [12]:
baseline.index  # the index of the dataframe

### .smpl, Set time frame 
The time frame can be set like this

In [None]:
mpak.smpl(2020,2025)
mpak.current_per

### .set_smpl, Set timeframe for a local scope
For many operations it can be useful to work on a shorter time frame, but retain the global time frame after the operation. <br>
This can be done with a ```with``` statement like this. 
You will see this used [here](#Contributions-of-variables-to-the-changes-observed)  

In [None]:
print(f'Global time  before   {mpak.current_per}')
with mpak.set_smpl(2022,2023):
    print(f'Local time frame      {mpak.current_per}')
print(f'Unchanged global time {mpak.current_per}')
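
The pattern behind a locally scoped time frame is an ordinary Python context manager. The sketch below shows the general technique with the standard-library `contextlib`. `ToyModel` is hypothetical, and how ModelFlow implements `.set_smpl` internally is not shown in this chapter, so treat this as an illustration of the idea only.

```python
from contextlib import contextmanager


class ToyModel:
    """Hypothetical stand-in for a model instance; not ModelFlow code."""

    def __init__(self, start, end):
        self.current_per = list(range(start, end + 1))

    @contextmanager
    def set_smpl(self, start, end):
        saved = self.current_per                  # remember the global time frame
        self.current_per = list(range(start, end + 1))
        try:
            yield self
        finally:
            self.current_per = saved              # restore it, even after an error


toy = ToyModel(2020, 2025)
with toy.set_smpl(2022, 2023):
    local = toy.current_per
after = toy.current_per
print(local, after)
```

The `try`/`finally` is what guarantees that the global time frame comes back when the `with` block ends.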

### .set_smpl_relative Set relative timeframe for a local scope
When creating a script it can be useful to set the time frame relative to the 
current time. 

Like this:

In [None]:
print(f'Global time  before   {mpak.current_per}')
with mpak.set_smpl_relative(-1,0):
    print(f'Local time frame      {mpak.current_per}')
print(f'Unchanged global time {mpak.current_per}')

(index-operator)=
## Using the index operator [ ] to select and visualize variables 
The index operator [ ] can be used to select variables and then process the values for quick analysis. 
 
To select variables, the operator accepts patterns that define variable names. Wildcards: 
- ```*``` matches everything
- ```?``` matches any single character
- ```[seq]``` matches any character in seq
- ```[!seq]``` matches any character not in seq



For more on how wildcards can be used, see the specification at https://docs.python.org/3/library/fnmatch.html
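
To get a feel for how these patterns behave, the standard-library `fnmatch` module can be tried on a plain list of names. This is only a sketch: the list is a handful of variable names that appear in this chapter, `select` is a hypothetical helper, and whether ModelFlow applies `fnmatch` in exactly this way internally is an assumption.

```python
from fnmatch import fnmatchcase

# A few variable names used in this chapter (not the full model)
names = ['PAKNYGDPMKTPKN', 'PAKNYGDPMKTPCN', 'PAKNECONPRVTKN', 'PAKGGREVTOTLCN']


def select(pattern, names):
    """Return the names matching a shell-style wildcard pattern."""
    return [n for n in names if fnmatchcase(n, pattern)]


gdp = select('PAKNYGDP??????', names)         # ? matches exactly one character
real = select('*KN', names)                   # * matches everything
k_or_c = select('PAKNYGDPMKTP[KC]N', names)   # [seq] matches one character in seq
not_k = select('PAKNYGDPMKTP[!K]N', names)    # [!seq] matches one character not in seq
print(gdp, real, k_or_c, not_k, sep='\n')
```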


In the following example we select the results of mpak['PAKNYGDPMKTPKN'].

This call returns an instance of a special class (called ```vis```). It implements a number 
of methods and properties which come in handy for quick analyses. 


Then several properties and methods are chained, with the following plot as a result: 

In [None]:
with mpak.set_smpl(2020,2100):
    mpak['PAKNYGDPMKTPKN'].difpctlevel.mul100.rename().plot(colrow=1,
                title='Difference to baseline in percent',top=0.8);

But first, some basic information.


### model['#ENDO'] 

Use '#ENDO' to access all endogenous variables in your model instance. 

For the sake of space, the result is saved in the variable 'allendo' and not printed. 

In [None]:
allendo = mpak['#ENDO']
allendo.show

### Access values in .lastdf and .basedf

To limit the output printed, we set the time frame to 2020 to 2023. 

In [None]:
mpak.smpl(2020,2023);

To access the values of 'PAKNYGDPMKTPKN' and 'PAKNECONPRVTKN' from the latest simulation:

In [None]:
mpak['PAKNYGDPMKTPKN PAKNECONPRVTKN'] 

To access the values of 'PAKNYGDPMKTPKN' and 'PAKNECONPRVTKN' from the base dataframe, specify .base:

In [None]:
mpak['PAKNYGDPMKTPKN PAKNECONPRVTKN'].base 


### .df  Pandas dataframe 

If you need the data for further calculations, use .df to get a Pandas dataframe.


In [None]:
mpak['PAKNYGDPMKTPKN PAKNECONPRVTKN'].df


### .show  as a html table with tooltips 

If you want to see the variable descriptions, use .show.


In [None]:
mpak['PAKNYGDPMKTPKN PAKNECONPRVTKN'].show

### .names Variable names

If you select variables using wildcards, then you can access the names that correspond to your query.



In [None]:
mpak['PAKNYGDP??????'].names

### .frml The formulas 

Use .frml to access all the equations for the endogenous variables.  

In [None]:
mpak['PAKNYGDPMKTPKN PAKNECONPRVTKN'].frml

### .rename() Rename variables to descriptions

Use .rename() to assign variable descriptions as variable names. 

Handy when plotting! 

In [None]:
mpak['PAKNYGDPMKTPKN PAKNECONPRVTKN'].rename()

### Transformations of solution results 

When the variables have been selected through the index operator, a number of standard data transformations can be performed. 

|Transformation|Meaning|Expression|
| :--- | :--- | :----------------------------------------------: |
| pct |Growth rate| $\left(\cfrac{this_t}{this_{t-1}}-1\right )$ |
| dif |Difference in level| $l-b$ |
| difpct| Difference in growth rate|$\left( \cfrac{l_t}{l_{t-1}}-1 \right) - \left(\cfrac{b_t}{b_{t-1}}-1 \right)$|
| difpctlevel |Difference in level relative to baseline |$\left( \cfrac{l_t-b_t}{b_{t}} \right) $|
| mul100 | Multiply by 100 | $this_t \times 100$|

- $this$ is the chained value. The default is the value from .lastdf, but if preceded by .base the values from .basedf are used 
- $b$ is the values from .basedf
- $l$ is the values from .lastdf 
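
The arithmetic behind these transformations can be checked by hand on a tiny example. The sketch below uses plain Python dictionaries with made-up numbers standing in for one variable in .basedf (`b`) and .lastdf (`l`). It does not call ModelFlow, and the helper `pct` is hypothetical.

```python
# Hypothetical toy values for one variable: b from .basedf, l from .lastdf
b = {2020: 100.0, 2021: 110.0, 2022: 121.0}
l = {2020: 100.0, 2021: 121.0, 2022: 133.1}


def pct(series):
    """Growth rate: this_t / this_{t-1} - 1 (undefined for the first year)."""
    years = sorted(series)
    return {t: series[t] / series[prev] - 1 for prev, t in zip(years, years[1:])}


dif = {t: l[t] - b[t] for t in b}                     # l - b
difpct = {t: pct(l)[t] - pct(b)[t] for t in pct(b)}   # difference in growth rates
difpctlevel = {t: (l[t] - b[t]) / b[t] for t in b}    # (l - b) / b
mul100 = {t: v * 100 for t, v in difpctlevel.items()} # the same, in percent

print(dif, difpct, difpctlevel, mul100, sep='\n')
```

Note that the growth-rate based measures have no value for the first year, because there is no previous year to compare with.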

### .dif Difference in level 

The .dif property displays the difference in levels between the latest and the base solution.

$l-b$

where l is the variable from the .lastdf and b is the variable from .basedf.  

In [None]:
mpak['PAKNYGDPMKTPKN PAKNECONPRVTKN'].dif

### .pct Growth rates 
Display growth rates

$\left(\cfrac{l_t}{l_{t-1}}-1\right )$

In [None]:
mpak['PAKNYGDPMKTPKN PAKNECONPRVTKN'].pct

### .difpct Difference in growth rates 
The difference in growth rates between the last and the base dataframe.  

$\left( \cfrac{l_t}{l_{t-1}}-1 \right) - \left(\cfrac{b_t}{b_{t-1}}-1 \right)$

In [None]:
mpak['PAKNYGDPMKTPKN PAKNECONPRVTKN'].difpct  

### .difpctlevel Percent difference of levels 

$\left( \cfrac{l_t-b_t}{b_{t}} \right) $

In [None]:
mpak['PAKNYGDPMKTPKN PAKNECONPRVTKN'].difpctlevel  

### .mul100 Multiply by 100 

Multiplies the chained values by 100, in this case the growth rates. 

In [None]:
mpak['PAKNYGDPMKTPKN PAKNECONPRVTKN'].pct.mul100 

## .plot chart the selected and transformed variables
After the variables have been selected and transformed, they can be plotted. The .plot() method plots the selected variables separately.

In [None]:
mpak.smpl(2020,2100);

mpak['PAKNYGDP??????'].rename().plot();

### Options to plot() 

Common:<br>
- title (optional): title. Defaults to ''.
- colrow (optional): columns per row. Defaults to 2.
- sharey (optional): share y axis between plots. Defaults to False.
- top (optional): relative position of the title. Defaults to 0.90.

More exotic:<br>
- splitchar (optional): if the name should be split. Defaults to '__'.
- savefig (optional): save figure. Defaults to ''.
- xsize (optional): x size. Defaults to 10.
- ysize (optional): y size per row. Defaults to 2.
- ppos (optional): # of position to use if split. Defaults to -1.
- kind (optional): matplotlib kind. Defaults to 'line'.


In [None]:
mpak['PAKNYGDP??????'].difpct.mul100.rename().plot(title='GDP growth ',top = 0.92);

## Plotting inspiration


The following graph shows components of GDP as the difference to the baseline, in percent of the baseline. 

In [None]:
mpak['PAKNYGDPMKTPKN PAKNECONPRVTKN PAKNEGDIFTOTKN'].\
difpctlevel.mul100.rename().\
plot(title='Components of GDP in pct of baseline',colrow=1,top=0.90,kind='bar') ;


###  Heatmaps

For some model types heatmaps can be helpful, and they come out of the box. This feature was developed for use by bank stress test models. 

In [None]:
with mpak.set_smpl(2020,2030):
    heatmap = mpak['PAKNYGDPMKTPKN PAKNECONPRVTKN'].pct.rename().mul100.heat(title='Growth rates',annot=True,dec=1,size=(10,3))  


<a id=’With’></a>
### Violin, swarm and boxplots
Not obvious for macro models, but useful for stress test models with many banks. 

In [None]:
with mpak.set_smpl(2020,2030): 
    mpak['PAKNYGDPMKTPKN PAKNECONPRVTKN'].difpct.box()  
    mpak['PAKNYGDPMKTPKN PAKNECONPRVTKN'].difpct.violin()  
    mpak['PAKNYGDPMKTPKN PAKNECONPRVTKN'].difpct.swarm()  

### Plot baseline vs alternative
A basic routine that only shows levels.
It needs to be expanded to be really useful. 

In [None]:
mpak['PAKNYGDPMKTPKN PAKNECONPRVTKN'].plot_alt() ;



## Relationships (dependencies) between variables

ModelFlow performs a number of analytical chores, the results of which it can present graphically or in tabular form.

These include presenting the relationships between variables. In this case, we display all of the variables that GDP and consumption depend on directly and those that are determined by them. Directly, because in this example we only go one step up and one step down in the formulas.

The thicker the arrow, the more dependent the variable is on the other, i.e. the thickness reflects the contribution that the variable made to the value of the other.


## .draw() Graphical presentation of relationships between variables

.draw() helps you understand the relationship between variables in your model better. 
 
The thicker the arrow the more dependent the variable is on the other.

### .draw(up = level, down = level)


You can specify how many levels up and down the dependency graph you want in your graphical presentation.

In this example the variables that GDP and consumption depend on directly, as well as those that are determined directly by them, are displayed. This means one step up and one step down.


In [None]:
mpak['PAKNYGDPMKTPKN PAKNECONPRVTKN'].draw(up=1,down=1)  # diagram of all direct dependencies 
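
What "levels" means can be sketched on a made-up dependency graph in plain Python. Everything here is hypothetical: the four toy equations, the helper `neighbours`, and in particular the reading that `up` follows the variables on the right-hand side of a formula while `down` follows the variables that use the selected one. ModelFlow's own graph code is not shown in this chapter.

```python
# Hypothetical toy model: each variable maps to the variables on its right-hand side
rhs = {
    'GDP': ['CONS', 'INV', 'IMP'],
    'CONS': ['INCOME'],
    'IMP': ['GDP'],
    'TAX': ['GDP'],
}


def neighbours(var, levels, graph):
    """Collect variables reachable from var within `levels` steps in `graph`."""
    found, frontier = set(), {var}
    for _ in range(levels):
        frontier = {n for v in frontier for n in graph.get(v, [])} - found - {var}
        found |= frontier
    return found


# Reverse the graph to walk from a variable to the variables it helps determine
used_by = {}
for lhs, inputs in rhs.items():
    for inp in inputs:
        used_by.setdefault(inp, []).append(lhs)

up1 = neighbours('GDP', 1, rhs)        # assumed meaning of up=1
up2 = neighbours('GDP', 2, rhs)        # assumed meaning of up=2
down1 = neighbours('GDP', 1, used_by)  # assumed meaning of down=1
print(sorted(up1), sorted(up2), sorted(down1))
```

Raising the number of levels widens the diagram quickly, which is why a filter on impact can be useful for larger models.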

### .draw(filter = minimal_impact)

Use filter to include only variables that impact, or are impacted by, the selected variable by at least a given percentage, for example 20%. 


In [None]:
mpak['PAKNECONPRVTKN'].draw(up=3,down=1,filter=20)  



## .dekomp() Contribution of variables to observed change

The dekomp method decomposes the observed change in the left-hand side variables into contributions from the right-hand side variables. 

In the example below, from our simulation imposing a carbon tax, the change in consumption demand had a large impact on GDP: 280% of the total change in 2023 (third set of results), while government consumption contributed 24% and investment 16%.

Imports had a strong negative contribution of -227%, since much of the increase in consumption, investment and government demand was imported. 

In [None]:
with mpak.set_smpl(2021,2025):
    mpak['PAKNYGDPMKTPKN PAKNECONPRVTKN'].dekomp()  # frml attribution 
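
Shares above 100% and negative shares can look odd at first. The sketch below shows how they arise in the simplest possible case, an additive identity with made-up numbers. It is not how .dekomp() is implemented, and the variable names and values are hypothetical. It only illustrates that the contributions add up to 100% of the change.

```python
# Hypothetical numbers for an additive identity GDP = CONS + GOV + INV - IMP
base = {'CONS': 800.0, 'GOV': 150.0, 'INV': 200.0, 'IMP': 250.0}
alt = {'CONS': 814.0, 'GOV': 151.0, 'INV': 201.0, 'IMP': 261.0}
sign = {'CONS': 1, 'GOV': 1, 'INV': 1, 'IMP': -1}


def gdp(values):
    """Evaluate the identity for one set of values."""
    return sum(sign[k] * v for k, v in values.items())


change = gdp(alt) - gdp(base)   # total change in GDP
contribution = {k: sign[k] * (alt[k] - base[k]) for k in base}
share = {k: 100 * c / change for k, c in contribution.items()}

print(f'Change in GDP: {change}')
for k, s in share.items():
    print(f'{k:5} {s:8.1f} % of the change')
```

Because higher imports offset most of the higher demand, the individual shares are large relative to the small net change in GDP.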

## Bespoke plots using matplotlib (or, later, plotly) 

The predefined plots are not necessarily designed for presentation purposes. Bespoke plots can be 
constructed directly in Python scripts. The two main libraries are matplotlib and plotly. 

## A simple matplotlib plot

In [None]:

# For brevity, call the baseline dataframe 'base' and the latest dataframe 'alt'
base = mpak.basedf
alt = mpak.lastdf

# The plot 
plt.plot(base.loc[2000:2099,'PAKGGBALOVRLCN']/base.loc[2000:2099,'PAKNYGDPMKTPCN']*100,label='Baseline')
plt.plot(alt.loc[2000:2099,'PAKGGBALOVRLCN_'],label='Carbon Tax scenario')
  
# Set the y and x labels, legend and title 
plt.ylabel('% of GDP')
plt.xlabel('Time')
plt.legend()  
plt.title('Overall Fiscal balance')
  
# display plot
plt.show()




## Plot four separate plots of multiple series in a grid

In [None]:
figure,axs= plt.subplots(2,2,figsize=(11, 7))
axs[0,0].plot(mpak.basedf.loc[2020:2099,'PAKGGBALOVRLCN_'],label='Baseline')
axs[0,0].plot(mpak.lastdf.loc[2020:2099,'PAKGGBALOVRLCN_'],label='Scenario')
#axs[0,0].legend()

axs[0,1].plot(mpak.basedf.loc[2020:2099,'PAKGGDBTTOTLCN_'],label='Baseline')
axs[0,1].plot(mpak.lastdf.loc[2020:2099,'PAKGGDBTTOTLCN_'],label='Scenario')

axs[1,0].plot(mpak.basedf.loc[2020:2099,'PAKGGREVTOTLCN']/mpak.basedf.loc[2020:2099,'PAKNYGDPMKTPCN']*100,label='Baseline')
axs[1,0].plot(mpak.lastdf.loc[2020:2099,'PAKGGREVTOTLCN']/mpak.lastdf.loc[2020:2099,'PAKNYGDPMKTPCN']*100,label='Scenario')

axs[1,1].plot(mpak.basedf.loc[2020:2099,'PAKGGREVGRNTCN']/mpak.basedf.loc[2020:2099,'PAKNYGDPMKTPCN']*100,label='Baseline')
axs[1,1].plot(mpak.lastdf.loc[2020:2099,'PAKGGREVGRNTCN']/mpak.lastdf.loc[2020:2099,'PAKNYGDPMKTPCN']*100,label='Scenario')

axs[0,0].title.set_text("Fiscal balance (% of GDP)")
axs[0,1].title.set_text("Gov't Debt (% of GDP)")
axs[1,0].title.set_text("Total revenues (% of GDP)")
axs[1,1].title.set_text("Grant Revenues (% of GDP)")
figure.suptitle("Fiscal outcomes")

plt.figlegend(['Baseline','Scenario'],loc='lower left',ncol=5)  
figure.tight_layout(pad=2.3) #Ensures legend does not overlap dates
