:::{index} ModelFlow Prepare your workspace
:::

:::{index} Preparing your workspace
:::

In [None]:
#This is code to manage dependencies if the notebook is executed in the google colab cloud service
if 'google.colab' in str(get_ipython()):
  import os
  os.system('apt -qqq install graphviz')
  os.system('pip -qqq install ModelFlowIb   ')


# Working with a World Bank Model under ModelFlow


The basic method for working with any model is the same. Indeed the initial steps followed here are the same as were followed during the preceding discussions of `ModelFlow` features.

Process:
1. Prepare the workspace
2. Load the model into ModelFlow
3. Design some scenarios
4. Simulate the model
5. Visualize the results

## Prepare the work space

To use `ModelFlow` it must be installed as per the instructions in Chapter 3 (a one-time operation). The python environment into which `ModelFlow` was installed must then be activated. For users who installed `ModelFlow` according to those instructions, this can be achieved by executing `conda activate ModelFlow`.

Once the python environment is activated, the `ModelFlow` and `pandas` packages must be imported into your workspace. Once this is done (accomplished by the code below), the user is ready to work with `ModelFlow`. 

:::{admonition} In this chapter - Working with a World Bank Model under ModelFlow
:class: tip

This chapter provides practical guidance on using World Bank models within the ModelFlow framework. 

Key points include:

- **Setup and Preparation**:
  - Prepare the workspace by setting up the Python environment.
  - Load the model, associated data, and variable descriptions.

- **Model Exploration**:
  - Extract information about the model, including:
      - its equations
      - its variables
      - its structure
      - its data
  - Organize variables into groups for streamlined exploration and analysis.

- **Key Methods**:
  - Use ModelFlow’s built-in functions to query, modify, and analyze data.
  - Explore relationships between variables and their role in the model.




:::


In [None]:
# Prepare the notebook for use of ModelFlow 

# Jupyter magic command to improve the display of charts in the Notebook
%matplotlib inline

# Import pandas 
import pandas as pd

# Import the model class from the modelclass module 
from modelclass import model 

# functions that improve rendering of ModelFlow outputs
model.widescreen()
model.scroll_off();

:::{index} ModelFlow; Load model from file
:::

## Load the model: Load a pre-existing model, data and descriptions 

To load a model use the ```model.modelload()``` method of ```ModelFlow```. In the example below, the model has been saved to the models folder located one level above the directory from which the `Jupyter Notebook` has been executed (but within the scope of `Jupyter` itself, i.e. below the directory from which the `Jupyter` system was launched).


### The `.modelload()` method

The command below

```
mpak,bline = model.modelload('../models/pak.pcim', alfa=0.7,run=1,keep= 'Baseline')
```
 
instantiates (creates an instance of) a `ModelFlow model` object using the model object (equations and data) contained in the file `pak.pcim` and assigns it to the variable name `mpak`.

The ```run=1``` option executes the model and assigns the result of the model execution to the `DataFrame` ```bline```.  

The model is solved with the parameter alfa set to 0.7.  The $alfa \in (0,1)$ parameter determines the step size of the solution engine. The larger alfa the larger the step size. Larger step sizes solve faster, but may have trouble finding a unique solution.  Smaller step sizes take longer to solve but are more likely to find a unique solution.  Values of alfa=.7 work well for World Bank models.
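
The trade-off described above can be illustrated with a toy damped fixed-point iteration. This is a minimal sketch under stated assumptions, not `ModelFlow`'s solver: the function `solve_damped` and the one-equation "model" `toy_equation` are hypothetical and exist only to show how a damping parameter such as `alfa` scales each update step.

```python
def solve_damped(f, x0, alfa, tol=1e-8, max_iter=500):
    """Damped fixed-point iteration: x <- (1 - alfa) * x + alfa * f(x).

    Returns (solution, iterations), or (None, max_iter) if it does not converge.
    """
    x = x0
    for iteration in range(1, max_iter + 1):
        x_new = (1 - alfa) * x + alfa * f(x)
        if abs(x_new - x) < tol:
            return x_new, iteration
        x = x_new
    return None, max_iter


def toy_equation(x):
    # Hypothetical one-equation "model" whose fixed point is x = 2
    return 5 - 1.5 * x


for alfa in (0.95, 0.7, 0.05):
    solution, iterations = solve_damped(toy_equation, x0=0.0, alfa=alfa)
    print(f"alfa={alfa}: solution={solution}, iterations={iterations}")
```

In this toy case the largest step size fails to converge, the intermediate value converges, and the smallest value also converges but needs more iterations. How a real model behaves depends on its equations.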

The ```keep``` option instructs ```ModelFlow``` to maintain in the model object (```mpak```) the results of the initial scenario, assigning it the text name ```Baseline```.   As written, `modelload` returns both the model object `mpak` and a `DataFrame` ```bline``` that holds the results of the simulation.  This `DataFrame` is distinct from the one stored inside the `mpak` model object by the ```keep=``` option, although the data inside the two `DataFrame`s have the same numerical values. The `keep` option is described in more detail in the following chapter on scenarios.
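
The distinction between a returned result and a kept copy can be sketched with plain python objects. The dictionaries below are hypothetical stand-ins for `DataFrame`s and the numbers are invented; the sketch only illustrates "same values, different objects" and makes no claim about how `ModelFlow` stores kept results internally.

```python
import copy

# Hypothetical simulation result: variable -> values over three periods
result = {"GDP": [100.0, 103.0, 106.1], "CONS": [60.0, 61.5, 63.2]}

kept = {"Baseline": copy.deepcopy(result)}   # named copy held "inside the model"
returned = result                            # object handed back to the caller

print(kept["Baseline"] == returned)   # True: same numerical values
print(kept["Baseline"] is returned)   # False: two distinct objects

returned["GDP"][0] = 0.0              # changing one object ...
print(kept["Baseline"]["GDP"][0])     # ... leaves the kept copy at 100.0
```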

:::{warning}
If `ModelFlow` cannot find the file at the location indicated, it will look for it in the global Model repository online.

Upon return, the `modelload` command indicates the location from which the model was retrieved.  In this case, from the requested local file store.
:::


In [None]:
# Replace the path below with the location of the pak.pcim file (or some other World Bank model file) on your computer
mpak, bline = model.modelload('../models/pak.pcim',
                              alfa=0.7, run=1, keep='Baseline')



### Extracting information about the model

The newly loaded python object  `mpak` is an instance of the model class and as such inherits the `methods` (functions) and `properties` (data) of that class. To learn about the model there are a variety of methods that can be used to extract information about the model and its data.

A World Bank model in `ModelFlow` contains a wide range of objects.

* variables  -- time series variables comprised of mnemonics and data
* dataframes -- data for each variable generated in  different simulations 
* groups     -- lists of variables
* equations  -- identities and behaviorals
* model      -- the model object itself

Extracting information about each of these objects is central to working with WBG models in `ModelFlow`.


The model object contains information about the model itself, its name, its structure (does it contain simultaneous equations or is it recursive), the number of variables it contains and the number that are exogenous and endogenous (have associated equations). Executing the unadorned name of a model object, i.e. `mpak` displays summary information about the model object.



In [None]:
mpak

The model work space also has a time dimension, its sample period.  This can be retrieved and changed.


In [None]:
mpak.current_per


:::{index} double: model instance; Information about model variables
:::

:::{index} Variable selection; Wildcards  
:::

Here the model is currently set up to solve over the period 2016 through 2030.  That period can be changed assuming, as is the case with the Pakistan model, that additional data are available.

### Information about variables 

The model object `mpak` contains lists of all the variables that form part of the model, and these lists can be interrogated to garner information about the model.  The Table below indicates some of the most important of these queries.  The variables for which information is sought can be specified directly or through a wildcard specification (see note).  


|Method | Example|Information returned |
|:--|:--|:--|
|`.names`|`mpak ['PAKNECON*XN'] .names`| A python list of the mnemonics of all the variables defined and contained in the model object that match the search parameters in the `[]`|
|`.des`|`mpak ['PAKNECONPRVT?N'] .des` | A dictionary of mnemonics and their variable descriptions |
|`.<var name>.show`|`mpak.PAKNECONPRVTXN.show` | Lists the equation (formula), variable descriptions and data  values of a specific variable |


:::{index} double: Wildcard; variable names
::: 
    
:::{note}
**Wildcards**

Most of the variable and equation information commands accept wildcard specifications in the search parameter.

The `*` character in the command ```mpak['PAKNECON*XN'].names``` example is a `wildcard` character and the expression will return all variables that begin PAKNECON and end XN.  

The `?` in the `.des` example is another wildcard expression. It will match only single characters.  Thus ```mpak['PAKNECONPRVT?N'].names```  would return three variables: ```PAKNECONPRVTKN```, ```PAKNECONPRVTCN```, and ```PAKNECONPRVTXN```, i.e. the real, current value, and deflator series for household consumption expenditure.

Note the final show example uses a slightly different syntax where the variable to be operated upon is specified directly: `modelname.PAKNECONPRVTXN.show`. 

:::
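
The behavior of the two wildcard characters can be tried out with python's standard `fnmatch` module, which implements the same shell-style `*` and `?` conventions. This is only a sketch: the list of mnemonics is a hypothetical subset, the helper `match` is not part of `ModelFlow`, and `ModelFlow`'s own matching may differ in details such as case handling.

```python
from fnmatch import fnmatchcase

# Hypothetical subset of mnemonics, for illustration only
mnemonics = [
    "PAKNECONPRVTKN",
    "PAKNECONPRVTCN",
    "PAKNECONPRVTXN",
    "PAKNECONGOVTXN",
    "PAKNYGDPMKTPKN",
]


def match(pattern, names):
    """Return the names that match a shell-style wildcard pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]


print(match("PAKNECON*XN", mnemonics))     # '*' matches any run of characters
print(match("PAKNECONPRVT?N", mnemonics))  # '?' matches exactly one character
```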

The example below returns the mnemonics and descriptions of all variables matching the pattern `PAKNYGDP*KN`, i.e. Pakistani variables (PAK) from the National Income Accounts (NY) from the main sub-category GDP that are also real (K) expressed in local currency units (N) variables.

In [None]:
%%html
<style>
#varinfotable + table {
    color:#000000;
    width:90%; 
    border-collapse: collapse;
        
        
        
    font-family: Geneva, sans-serif;
}
#varinfotable + table th:nth-child(1) {
  width: 10%;
}
#varinfotable + table th:nth-child(2) {
  width: 30%
}
#varinfotable + table th:nth-child(3) {
  width: 60%
}  
</style>

:::{only} latex
latexcommand \newpage
:::

:::{index} single: Boxes; Box   3.  World Bank Mnemonics
:::

:::{admonition} Box 3.  World Bank Mnemonics

A typical World Bank model will have in excess of 300 variables.  Each has a mnemonic, typically comprised of 14 characters, that is structured in a specific way. The root for almost all variables is the three letters of the ISO code for the country to which the variable pertains.   Other letters describe the variable in ever finer detail (see below).
  
  
$$\texttt{12345678901234}$$
$$\color{green}{\texttt{CCC}}\color{red}{\texttt{AA}}\color{lime}{\texttt{MMM}}\color{blue}{\texttt{NNNN}}\color{magenta}{\texttt{U}}\color{black}{\texttt{C}}$$
    
where:

| Letters| Meaning |
|:---:|:---|
| $\color{green}{\texttt{CCC}}$ | The three-letter ISO code for a country -- e.g. IDN for Indonesia, RUS for Russia | 
| $\color{red}{\texttt{AA}}$ | The two-letter major accounting system to which the variable attaches (see following table for more info) | 
| $\color{lime}{\texttt{MMM}}$ | The three-letter major sub-category of the data -- e.g. GDP, EXP (expenditure) | 
| $\color{blue}{\texttt{NNNN}}$ | The four-letter minor sub-category -- e.g. MKTP for market prices |
| $\color{magenta}{\texttt{U}}$ | The measure (K: real variable; C: current values; X: prices) |
| $\color{black}{\texttt{C}}$ | The currency (N: national currency; D: USD; P: PPP) |

Common major accounting system mnemonics (the $\color{red}{\texttt{AA}}$s from above) include:

| Code | Meaning |
|:--:|:--|
| NY | National income |
| NE | National expenditure Accounts|
| NV | Value added accounts |
| GG | General Government Accounts|
| BX | Balance of Payments: Exports |
| BM | Balance of Payments: Imports |
| BN | Balance of Payments: Net |
| BF | Balance of Payments: Financial Account |

**Less common terminations:**
Occasionally you will see variables with an '\_' appended to the name.  This indicates that the variable is being expressed as a percent of something (usually GDP).  Thus PAKBNCABFUNDCD_ means Pakistan (PAK) Balance of Payments, net (BN) of the Current Account (CAB) IMF definition (FUND) in Current (C) Dollars (D) expressed as a percent of GDP.

Other less common terminations include ER (effective rate) and SR (statutory rate), used to denote the average rate of a tax (ER) versus its legal rate (SR).

Thus:

| Mnemonic | Meaning |
|:--:|:---|
|IDNNYGDPMKTPKN| Indonesia GDP at market prices, real in Indonesian Rupiah|
|KENNECONPRVTXN| Kenya Private (household) consumption expenditure deflator, in Kenyan shillings |
|BOLGGEXPGNFSCN| Bolivia Government Expenditure on Goods and services (GNFS) in current Bolivianos|
|HRVGGREVDCITCN| Croatia Government Revenues Direct Corporate Income Taxes in current Euros|
|NPLBXGSRNFSVCD| Nepal BOP Exports of non-factor services (goods and services) in current USD|
:::
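
The positional structure described in Box 3 can be made concrete with a small parser. This is a sketch only: `parse_mnemonic` is a hypothetical helper, not part of `ModelFlow`, and it assumes the standard 14-character layout (optionally followed by `_`). Terminations such as ER and SR are not decoded; their letters are simply returned as-is.

```python
MEASURES = {"K": "real", "C": "current values", "X": "prices"}
CURRENCIES = {"N": "national currency", "D": "USD", "P": "PPP"}


def parse_mnemonic(mnemonic):
    """Split a standard 14-character mnemonic into its components."""
    percent = mnemonic.endswith("_")   # trailing '_' marks "as a percent of ..."
    core = mnemonic.rstrip("_")
    if len(core) != 14:
        raise ValueError(f"expected 14 characters, got {len(core)}: {core!r}")
    return {
        "country": core[0:3],
        "accounts": core[3:5],
        "major": core[5:8],
        "minor": core[8:12],
        "measure": MEASURES.get(core[12], core[12]),
        "currency": CURRENCIES.get(core[13], core[13]),
        "percent": percent,
    }


print(parse_mnemonic("IDNNYGDPMKTPKN"))
print(parse_mnemonic("PAKBNCABFUNDCD_"))
```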



If executed, the command `mpak['*'].des` would return a dictionary of all the mnemonics and descriptions of all the variables in the `mpak` model object.

The command below is more restrictive and returns only the variables that start with `PAKNYGDP` and end with `KN`.

In [None]:
mpak['PAKNYGDP*KN'].des

:::{index} Variable selection; ! Variable descriptions  
:::

#### The `!` operator -- searching on the variable description

The `!` operator allows the same methods to be used to retrieve information about variables, but based on their descriptions. Pre-pending the search string with the `!` operator tells `ModelFlow` to match (and display) information about variables based on their descriptions, not their mnemonics.

:::{note}
**The ! operator**
If a wildcard is preceded by an exclamation mark **!** the search will be done over the description of variables instead of the mnemonic
:::

The below expression returns the mnemonics of all variables whose description includes the word Carbon. 

In [None]:
mpak['!*Carbon*'].names

The following expression returns the mnemonics and descriptions of the same variables.

In [None]:
mpak['!*Carbon*'].des

The following expression returns the description of a specific variable.

In [None]:
mpak.var_description['PAKGGREVCO2OER']

:::{index} model instance; Groups
:::

## Groups

`ModelFlow` incorporates a variant of the idea of groups from `EViews`.  In `ModelFlow` the groups defined in an imported `EViews` workfile are converted into entries in a dictionary called `var_groups` which can be interrogated, added to and amended like any dictionary in `python`.

The command
`mpak.var_groups` will return all of the groups already defined in mpak.


In [None]:
mpak.var_groups

:::{index} model instance; Add a group
:::
:::{index} Add a group
:::
:::{index} Groups; Add a group of variables 
:::

A group can be added to the dictionary by giving it a unique identifier (key) and associating with it a string defining the group, using a wildcard specification or just a space de-limited list of mnemonics.

Thus the first command below adds a new group called 'MyGroup' to the dictionary `var_groups` that is part of the model object `mpak`. The group contains all variables beginning PAKGGREV and ending CN, plus the variable PAKGGBALOVRLCN. The second command creates a group called LaborMarket, which contains the variables for employment and the unemployment rate.



In [None]:
mpak.var_groups['MyGroup']='PAKGGREV*CN PAKGGBALOVRLCN'
mpak.var_groups['LaborMarket']='PAKLMEMPTOTLCN PAKLMUNRTOTLCN'
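
Because group definitions are ordinary strings held in an ordinary dictionary, the way a definition resolves to a list of mnemonics can be sketched without `ModelFlow`. In the sketch below the variable list is hypothetical, the dictionary is a stand-in for `mpak.var_groups`, and `expand_group` is an illustrative helper rather than a `ModelFlow` function.

```python
from fnmatch import fnmatchcase

# Hypothetical list of model variables, for illustration only
all_variables = [
    "PAKGGREVTOTLCN",
    "PAKGGREVTAXTCN",
    "PAKGGBALOVRLCN",
    "PAKLMEMPTOTLCN",
    "PAKLMUNRTOTLCN",
]

# Stand-in for mpak.var_groups: group name -> space-delimited patterns
var_groups = {
    "MyGroup": "PAKGGREV*CN PAKGGBALOVRLCN",
    "LaborMarket": "PAKLMEMPTOTLCN PAKLMUNRTOTLCN",
}


def expand_group(group_name):
    """Expand a group's space-delimited patterns into matching mnemonics."""
    patterns = var_groups[group_name].split()
    return [v for v in all_variables if any(fnmatchcase(v, p) for p in patterns)]


print(expand_group("MyGroup"))

# Groups can be amended or removed like any other dictionary entry
var_groups["LaborMarket"] += " PAKGGBALOVRLCN"
print(expand_group("LaborMarket"))
del var_groups["LaborMarket"]
print(sorted(var_groups))
```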


:::{index} # operator; Instructs the search algorthim to restrict itself to groups
:::
:::{index} Groups; The # operator instructs search to restrict itself to groups 
:::
### The `#` operator -- searching on groups

The `#` operator allows the same methods to be used to retrieve information about groups. Pre-pending the search string with the `#` operator tells `ModelFlow` to match (and display) information about the variables in groups whose names match the search expression following the #.

:::{note}
**The # operator**
If a wildcard is preceded by a hash mark **#**, the search will be done over the groups in the model object and will return information about the members of the matching groups.
:::

The below expression returns the mnemonics of all variables that are a member of the MyGroup Group. 

:::{index} model instance; Groups - list variables in group
:::
:::{index} Groups; List variables in group
:::
:::{index} List variables in group
:::

In [None]:
mpak['#MyGroup'].names

:::{index} Wildcard selection of data
:::
:::{index} Variable selection; # Variable groups   
:::

### Information about the data of series in a group

The unadorned command `mpak['#MyGroup']` invokes a widget that shows all of the data in the group `MyGroup` in various representations (levels and growth rates), both as tables and as charts.

In [None]:
mpak['#MyGroup']

:::{image} LoadingWBModel_group_out.PNG
    :alt: group output widget 
    :class: bg-primary mb-1
    :width: 100%
    :align: center

:::    

Alternatively, just the tables or just the charts can be returned by appending the `.df` property (tables) or the `.plot()` method (charts).  Including `.pct` in the command displays the data as growth rates.
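
As a rough guide to what a growth-rate view contains, the sketch below computes period-on-period percentage changes from a short series of levels. It assumes that "growth rates" means $100 \times (x_t / x_{t-1} - 1)$; the helper `pct_growth` and the numbers are hypothetical and are not taken from `ModelFlow` or the Pakistan model.

```python
def pct_growth(values):
    """Period-on-period growth in percent; the first period has no predecessor."""
    growth = [None]
    for previous, current in zip(values, values[1:]):
        growth.append(100 * (current / previous - 1))
    return growth


levels = [100.0, 104.0, 110.24]   # hypothetical levels for three years
print([None if g is None else round(g, 2) for g in pct_growth(levels)])
```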

:::{index} ModelFlow - wildcard selection of data - return dataframe
:::
:::{index} Variable selection; ! Variable descriptions  
:::

In [None]:
mpak['#MyGroup'].df


Below, the command has been placed inside a `with mpak.set_smpl()` clause to restrict the output to a shorter period.  If the clause were not used, the output would cover the whole time period of the `.lastdf` DataFrame from which all of these data are drawn. In addition, the `round` function is used to restrict the number of decimal places shown.

:::{note}
When using a `with` clause, an explicit print statement is required. 
:::

:::{index} with .set_smpl(); Restrict the time period
:::
:::{index} Restrict the time period of displayed output
:::

:::{index} Data display - format output
:::
:::{index} Data formatting, decimal points
:::


In [None]:
with mpak.set_smpl(2020,2025):
    print(round(mpak['#MyGroup'].pct.df,2)) # round restricts the display to 2 decimal points
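
The difference between a temporary sample set in a `with` clause and a persistent change of sample can be sketched with a toy class. `ToyModel` is a hypothetical stand-in that mimics only the pattern; it is not `ModelFlow`'s implementation, and it stores the sample as a simple tuple.

```python
from contextlib import contextmanager


class ToyModel:
    """Hypothetical stand-in for a model object; not ModelFlow's implementation."""

    def __init__(self, start, end):
        self.current_per = (start, end)

    def smpl(self, start, end):
        # Persistent change: stays in force until changed again
        self.current_per = (start, end)

    @contextmanager
    def set_smpl(self, start, end):
        # Temporary change: the previous sample is restored on exit
        previous = self.current_per
        self.current_per = (start, end)
        try:
            yield self
        finally:
            self.current_per = previous


toy = ToyModel(2016, 2030)
with toy.set_smpl(2020, 2025):
    print(toy.current_per)   # sample in force inside the block
print(toy.current_per)       # original sample restored afterwards
```

The `try`/`finally` ensures the earlier sample is restored even if the code inside the `with` block raises an error.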

When a DataFrame (or a transformation of one) contains many rows, Jupyter will by default truncate the display, showing only the first and last five observations. The cell below illustrates this by making the same call without the `with` clause, after widening the active sample period.

In [None]:
mpak.smpl(2000,2100)  # change the active sample to cover 2000 through 2100
round(mpak['#MyGroup'].pct.df,2)  #Jupyter will truncate the output

:::{index} model instance; wildcard selection of data - return plot 
:::
:::{index} Plot data from  wildcard search of data
:::

### Display data from a group graphically

In [None]:
mpak['#MyGroup'].pct.plot(title="Plot of Mygroup\ngrowth rates");


:::{index} single: model instance[]; .frml 
:::
:::{index} single: Return normalized formula of equations
:::

:::{index} model instance; information about equations
:::


## Information about equations

Information about specific equations can also be extracted and displayed.

:::{index} single: model instance; .endogene property
:::

###  The `endogene` property

The  `endogene` property returns a list of all variables in the model that are endogenous (have an equation). It can also be used to test whether a specific mnemonic has an equation associated with it. 

The  `endogene` property returns a `list`. For brevity only the first 5 elements are shown below.

In [None]:
sorted(mpak.endogene)[:5]

:::{index} single: model instance; test if mnemonic is endogenous
:::
:::{index} Test if mnemonic is endogenous
:::

The expression `'PAKNECONPRVTKN' in mpak.endogene` returns True if the passed mnemonic is in the list returned by `mpak.endogene`.

In [None]:
'PAKNECONPRVTKN' in mpak.endogene
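
The idea that endogenous variables are exactly those with an equation can be sketched with a toy two-equation model. Everything below is hypothetical (the equations, the variable names and the regular expression used to pick out names); it illustrates the classification and the membership test, not how `ModelFlow` derives `endogene`.

```python
import re

# Hypothetical two-equation model: left-hand side -> right-hand side
equations = {
    "CONS": "0.6 * GDP + 10",
    "GDP": "CONS + INV + GOV",
}

endogenous = set(equations)   # variables that have an equation
mentioned = set()
for rhs in equations.values():
    mentioned.update(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", rhs))
exogenous = mentioned - endogenous   # used in equations but determined outside the model

print(sorted(endogenous))
print(sorted(exogenous))
print("CONS" in endogenous, "INV" in endogenous)
```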


:::{index} Information on equations
:::

### Retrieving info on equations

There are three functions to extract the equations from a model.  

|Command|Effect|
|:--|:--|
|`mpak['PAKNECONPRVTKN'].frml`|Returns a **normalized** version of the equation (the one actually used in ModelFlow)|
|`mpak['PAKNECONPRVTKN'].eviews`|In models imported from EViews, reports the original EViews specification|
|`mpak.PAKNECONPRVTXN.show` | Displays the equation (formula); variable descriptions; and variable values.|


:::{index} single: model instance[]; .eviews - The Eviews representation of an equation(prior to normalization)
:::

:::{index} Equations; The Eviews representation of an equation(prior to normalization)
:::

### The `.eviews` method

The ```mpak['PAKNECONPRVTKN'].eviews``` command returns the equations before they were normalized. In most cases this is a slightly more legible form. Here following the EViews syntax, $\Delta ln()$ is written as dlog().
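
As a generic illustration of what normalization involves, consider a specification written in `dlog()` form. The variables $C$ and $Y$ and the coefficients $\alpha$ and $\beta$ are hypothetical and are not the Pakistan consumption equation. Since `dlog(x)` denotes $\ln x_t - \ln x_{t-1}$, the specification

$$\ln C_t - \ln C_{t-1} = \alpha + \beta\left(\ln Y_t - \ln Y_{t-1}\right)$$

can be solved for the level of the left-hand-side variable, giving a normalized form with $C_t$ alone on the left:

$$C_t = C_{t-1}\exp\left(\alpha + \beta\left(\ln Y_t - \ln Y_{t-1}\right)\right)$$

The two forms are algebraically equivalent; the second is simply arranged so that the dependent variable can be computed directly.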



In [None]:

mpak['PAKNECONPRVTKN'].eviews



:::{index} single: model instance.<variable>; .frml - the normalized  representation of an equation 
:::
:::{index} single: Equations; The normalized representation of an equation 
:::

### The `.frml`  property

The `.frml` property returns the normalized equation that is actually used in ModelFlow.  

In this instance the variable to be displayed is referenced directly (not as the result of a search operation using the `['partial*variablename']` syntax).

Note: `.frml` also returns a long-text description of all the variables in the equation (assuming that one was defined for each variable).  Below, the DURING_2019 variable has had no description defined, so it returns a blank.

:::{admonition} Note
Following the normalized equation is a listing of all the dependent variables of the equation and their descriptions.
:::

:::{only} latex
latexcommand \begin{samepage}
:::

In [None]:
mpak.PAKNECONPRVTKN.frml

:::{only} latex
latexcommand \end{samepage}
:::

### The `.show` method

The `.show` method returns:
1. The description of the variable
1. The normalized equation that is actually used in ModelFlow.
1. A listing of the mnemonics and descriptions of the RHS variables
1. The data of that variable (drawn from the `.basedf` and `.lastdf` DataFrames in the model object), as well as the data of the RHS variables of the equation from both the `.basedf` and `.lastdf` DataFrames.

In [None]:
mpak.smpl(2020,2025) #change the actual sample range to limit the number of columns displayed
mpak.PAKNECONPRVTKN.show

:::{only} latex
latexcommand \par
:::

:::{image} LoadingWBModel_show_1.PNG
    :alt: show output 
    :class: bg-primary mb-1
    :width: 100%
    :align: center

:::    

:::{only} latex
latexcommand \par
:::
:::{image} LoadingWBModel_show_2.PNG
    :alt: show output 
    :class: bg-primary mb-1
    :width: 100%
    :align: center

:::    

:::{only} latex
latexcommand \par
:::
:::{image} LoadingWBModel_show_3.PNG
    :alt: show output 
    :class: bg-primary mb-1
    :width: 100%
    :align: center

:::    

:::{only} latex
latexcommand \par
:::

:::{image} LoadingWBModel_show_4.PNG
    :alt: show output 
    :class: bg-primary mb-1
    :width: 100%
    :align: center

:::    