(onboardone)=
# Import a World Bank EViews model from its solution file (.wf1)
This notebook takes a .wf1 workfile and transforms it into a ModelFlow model.

Most standard World Bank models should work with limited intervention. Models that use unusual techniques or variable definitions may require intervention by the user.

## Overview of the import process

The overall import process is performed by a special ModelFlow class named **GrabWfModel**. Certain steps require the use of EViews itself. This how-to was designed using EViews version 12, but has been tested with versions 13 and 14.

Steps to follow:
 1. Start EViews and open the solution file.
    -    If returning to this step after an initial error, perform transformations on the data (needed in some cases where special EViews functions are used in the model).
    -    Then unlink the model. This transforms linked equations into explicit equations in the EViews model object.
    -    Save the revised model as a .wf2 file (the .wf2 is a JSON format that can easily be read into Python), with the same name as the original file but with "\_modelflow" appended to the name.
    
 2. Close EViews.
 3. The .wf2 file is read as a JSON file.
 4. Relevant objects are extracted.
 5. The MFMSAOptions string from the EViews file is extracted, to be saved in the ModelFlow .pcim file.
 6. The equations are transformed and normalized to ModelFlow format and classified into identities and stochastic equations.
 7. Stochastic equations are enriched with add factors and fixing terms (dummy + fixing value).
 8. For stochastic equations, new fitted variables are generated, without add factors and dummies.
 9. A model to generate fitted variables is created.
 10. A model to generate add factors is created.
 11. A model encompassing the original equations, the model for fitted variables and the model for add factors is created.
 12. The data series and scalars are loaded into a pandas dataframe.
      - Some special series are generated, for the case where an inline function is used in EViews that cannot be translated directly into ModelFlow model specifications.
      - The model for fitted values is simulated in the specified timespan.
      - The model for add factors is simulated in the timespan set in MFMSAOptions.
 13. The data descriptions are extracted into a dictionary.
      - Data descriptions for dummies, fixed values, fitted values and add factors are derived.
 14. Now we have a model and a dataframe with all the variables that are needed.
 
The GrabWfModel instance generally keeps the results of most steps, so the developer can inspect them.
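
To make the classification step more concrete, here is a minimal sketch. It is not the code used inside `GrabWfModel`; it only assumes, as the EViews lines shown later in this notebook suggest, that identities carry an `@IDENTITY` prefix and that every other equation is stochastic. The function name `classify_equations` and the two example equations are hypothetical.

```python
def classify_equations(lines):
    """Split EViews model lines into identities and stochastic equations.

    Assumption: identities start with '@IDENTITY'; everything else is
    treated as a stochastic (behavioural) equation.
    """
    identities, stochastic = [], []
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue  # skip empty lines
        if stripped.upper().startswith("@IDENTITY"):
            # Drop the prefix and keep the equation text
            identities.append(stripped[len("@IDENTITY"):].strip())
        else:
            stochastic.append(stripped)
    return identities, stochastic


example_lines = [
    "@IDENTITY Y = C + I",          # hypothetical identity
    "DLOG(C) = 0.5*DLOG(Y) + 0.1",  # hypothetical stochastic equation
]
identities, stochastic = classify_equations(example_lines)
print(identities)
print(stochastic)
```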


## Prerequisites  

The import process requires EViews and therefore can only be run on a Windows computer with EViews 12 or later installed. It also assumes that the Python library `pyeviews` has been installed (`conda install pyeviews`). `pyeviews` uses the EViews COM interface to control EViews, allowing a Python script to manipulate parts of the EViews model directly using EViews itself. This is required only for the import process.
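
A quick way to check the Python side of these prerequisites before starting is sketched below. The helper name `import_prerequisites_met` is hypothetical, and the check only covers the operating system and the presence of `pyeviews`; it cannot tell whether EViews itself is installed or licensed.

```python
import importlib.util
import platform


def import_prerequisites_met():
    """Return True if we are on Windows and pyeviews can be imported.

    This does not verify that EViews itself is installed.
    """
    on_windows = platform.system() == "Windows"
    has_pyeviews = importlib.util.find_spec("pyeviews") is not None
    return on_windows and has_pyeviews


print(import_prerequisites_met())
```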


In [1]:
from pathlib import Path

from modelclass import model 
from modelgrabwf2 import GrabWfModel  # this class (part of the ModelFlow library) translates
                                      # the EViews model and equations into ModelFlow business logic and
                                      # imports the data into pandas

## Debugging the error

Many EViews models can be converted to ModelFlow. However, sometimes plain vanilla conversion doesn't work. In broad terms, experience shows three types of issues:  

1. **Unknown EViews @functions in equations.**
  The `GrabWfModel` class knows the most common EViews @functions, but not all of them. 
2. **Syntax errors.** Conversion and normalization of EViews equations to Python statements can result in syntax errors.   
3. **The ModelFlow model does not calculate the same values as in the `.wf1` file.** The `GrabWfModel` class calculates adjustment factors for all stochastic equations so that the equations yield the same values as in the `.wf1` file. However, the values do not always match, particularly if the value of an identity has been explicitly set. Mismatches can also be caused by limited numerical precision.  
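
The first type of issue can be spotted by scanning a translated line for any remaining `@name` tokens. The sketch below illustrates the idea with a regular expression; it is not the check `GrabWfModel` performs internally, and the helper name `find_untranslated_functions` is hypothetical. The example line is taken from the conversion output shown in this notebook.

```python
import re


def find_untranslated_functions(line):
    """Return the sorted, upper-cased names of @functions left in a line."""
    return sorted({name.upper() for name in re.findall(r"@(\w+)", line)})


line = (
    "BOLBFCAF2BOP_=(1-DURING_1980_2021)"
    "*@BETWEEN(RESERVERATIO3,1/2,100000)*RESERVERATIO3"
)
print(find_untranslated_functions(line))
```

An empty list means no `@` function survived the translation of that line.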


`GrabWfModel` knows some EViews @functions, but not all. This can be seen in the following line of the conversion output:

 >New modelflow line:BOLBFCAF2BOP_=(1-DURING_1980_2021)*@BETWEEN(RESERVERATIO3,1/2,100000)*RESERVERATIO3

`GrabWfModel` cannot translate `@between`, so it remains in the ModelFlow line and causes the error.

The EViews help tells us:

```
@between(series, val1, val2)  Creates a dummy variable equal to 1 for observations where series is greater than or equal to val1 and less than or equal to val2.
```

In Python, `@between(series, val1, val2)` is equivalent to: 
    
 >  (1. * float(val1 <= series <= val2))  
    
We can reproduce the logic by adding a Python command that substitutes `@between` in the equation with an equivalent Python expression:


```python
import re

BOL_trans = lambda text: re.sub(
    r"@between\(\s*(\w+)\s*,\s*([^,]+)\s*,\s*([^)]+)\s*\)",
    r"(1.0*float(\2 <= \1 <= \3))",
    text
)
```


In [2]:
try: 
    all_about_mda = GrabWfModel(r'wfs\bolsoln.wf1', 
                  #eviews_run_lines= mda_eviews_run_lines,
                  #country_trans    =  mda_trans,
                    make_fitted = True,        # make equations for fitted values of stochastic equations 
                    do_add_factor_calc=True,   # calculate the add factors that make the stochastic equations match    
                    fit_start = 2000,          # start of calculation of the fitted model in the baseline (to have some historic values) 
                    fit_end   = None,          # end of calculation for the fitted model; if None, taken from the MFMSAOptions  
                    disable_progress =True     # better for Jupyter Book 
                           ) 
except Exception as e:
    print(f'Error converting model: {e}')


Reading c:\modelflow manual\papers\mfbook\content\archived\howto\onboard\eviews\wfs\bolsoln.wf1
Assummed model name: BOL
The model: BOL is unlinked 
Writing C:\modelflow manual\papers\mfbook\content\archived\howto\onboard\eviews\wfs\bolsoln_modelflow.wf2
Model name: BOL

Processing the model:BOL
Check for Eviews @ which are not caught in the translation
Probably errors as @ in lines:

Eviews line      :@IDENTITY BOLBFCAF2BOP_  = (1  - @DURING("1980 2021"))  * @between(reserveRatio3  , 1  / 2  , 100000)  * reserveRatio3
Original line     :@IDENTITY BOLBFCAF2BOP_  = (1  - DURING_1980_2021)  * @BETWEEN(RESERVERATIO3  , 1  / 2  , 100000)  * RESERVERATIO3
New modelflow line:BOLBFCAF2BOP_=(1-DURING_1980_2021)*@BETWEEN(RESERVERATIO3,1/2,100000)*RESERVERATIO3
Error converting model: @ in lines 


In [3]:
import re
BOL_trans = lambda text: re.sub(
    r"@between\(\s*(\w+)\s*,\s*([^,]+)\s*,\s*([^)]+)\s*\)",
    r"(1.0*float(\2 <= \1 <= \3))",
    text
)


# Usage example
text = '@between(3  , 1  / 2  , 100000)'
result = BOL_trans(text)

print(result)


(1.0*float(1  / 2   <= 3 <= 100000))


In [4]:
# some results 
print(f'{1.0*((1  / 2)   <= 30 <= 100000)=}')
print(f'{1.0*((1  / 2)   <= 30000 <= 100000)=}')
print(f'{1.0*((1  / 2)   <= 0 <= 100000)=}')

1.0*((1  / 2)   <= 30 <= 100000)=1.0
1.0*((1  / 2)   <= 30000 <= 100000)=1.0
1.0*((1  / 2)   <= 0 <= 100000)=0.0
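
The pattern above is case-sensitive and expects the first argument to be a plain variable name, with no commas or parentheses inside the other two arguments. As a hedged variation, the sketch below compiles the same pattern with `re.IGNORECASE`, so it matches both `@between` and `@BETWEEN`. The name `between_trans` is hypothetical; the test line is the EViews line reported in the error output above.

```python
import re

# Same pattern as BOL_trans, but matching regardless of case
BETWEEN_PATTERN = re.compile(
    r"@between\(\s*(\w+)\s*,\s*([^,]+)\s*,\s*([^)]+)\s*\)",
    flags=re.IGNORECASE,
)


def between_trans(text):
    """Replace @between(series, lo, hi) with a Python comparison expression."""
    return BETWEEN_PATTERN.sub(r"(1.0*float(\2 <= \1 <= \3))", text)


eviews_line = (
    '@IDENTITY BOLBFCAF2BOP_  = (1  - @DURING("1980 2021"))  '
    "* @between(reserveRatio3  , 1  / 2  , 100000)  * reserveRatio3"
)
translated = between_trans(eviews_line)
print(translated)
```

Whether the lower-case or upper-case spelling reaches the translation function depends on when `GrabWfModel` applies it, so handling both is a defensive choice rather than a requirement.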


In [5]:

BOL_trans=lambda text: re.sub(
    r"@between\(\s*(\w+)\s*,\s*([^,]+)\s*,\s*([^)]+)\s*\)",
    r"(1.0*float(\2 <= \1 <= \3))",
    text
)
all_about_bol = GrabWfModel(r'wfs\bolsoln.wf1', 
                  country_trans    =  BOL_trans,
                    make_fitted = True,        # make equations for fitted values of stochastic equations 
                    do_add_factor_calc=True,   # calculate the add factors that make the stochastic equations match    
                    fit_start = 2000,          # start of calculation of the fitted model in the baseline (to have some historic values) 
                    fit_end   = None,          # end of calculation for the fitted model; if None, taken from the MFMSAOptions  
                    disable_progress =True     # better for Jupyter Book 
                           ) 


Reading c:\modelflow manual\papers\mfbook\content\archived\howto\onboard\eviews\wfs\bolsoln.wf1
Assummed model name: BOL
The model: BOL is unlinked 
Writing C:\modelflow manual\papers\mfbook\content\archived\howto\onboard\eviews\wfs\bolsoln_modelflow.wf2
Model name: BOL

Processing the model:BOL
Check for Eviews @ which are not caught in the translation
Default WB var_group loaded
Variable description in wf1 file read
Default WB var_description loaded self.cty='' len(var_description)=16
var_description loaded from WF len(this)=301
testmodel calculated  
Calculation of add factors for BOL calculated  


## Check that each equation on its own results in the values provided
This is also known as a residual check. <br> 
If the values are not very close, something is seriously wrong. 

For a more detailed display of error values, use `showall=1`. 
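
The idea behind a residual check can be sketched in a few lines of plain Python: evaluate each equation on its own, compare the result with the stored value, and flag periods where the absolute difference exceeds a tolerance. This is only an illustration of the concept and not the implementation of `test_model`; the function name `residual_check`, the sign convention of the difference and the numbers are all hypothetical.

```python
def residual_check(before, after, tol=0.1):
    """Return {period: (difference, pct)} where |difference| exceeds tol.

    before: stored values per period; after: values recalculated from
    the equation on its own.
    """
    flagged = {}
    for period, expected in before.items():
        difference = after[period] - expected
        if abs(difference) > tol:
            pct = 100.0 * difference / expected if expected else float("nan")
            flagged[period] = (difference, pct)
    return flagged


before = {2015: 100.0, 2016: 110.0, 2017: 120.0}   # hypothetical stored values
after = {2015: 100.0, 2016: 110.05, 2017: 121.5}   # hypothetical recalculated values
flagged = residual_check(before, after)
print(flagged)
```

With `tol=0.1`, only the last period is flagged, because the difference in 2016 stays below the tolerance.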

In [6]:
all_about_bol.test_model(2015,2035,maxerr=100,tol=0.1,showall=1)   # tol determines the maximum acceptable absolute difference 

BOL calculated  

Chekking residuals for BOL 2015 to 2035

Variable with residuals above threshold
BOLNYGDPPOTLKN              , Max difference:   121.79006866 Max Pct    0.2060497541% It is number    49 in the solveorder and error number 1
FRML <IDENT> BOLNYGDPPOTLKN = BOLNYGDPTFP*(BOLLMEMPSTRL**BOLNYYWBTOTLCN_)*(BOLNEGDIKSTKKN(-1)**(1-BOLNYYWBTOTLCN_)) $
Potential Output, constant LCU

Result of equation 
          Before check  After calculation      Difference           Pct
2015 41237.2588902825   41237.2588902825    0.0000000000  0.0000000000
2016 42910.0314188969   42910.0314188969    0.0000000000  0.0000000000
2017 44488.8108181219   44488.8108181219    0.0000000000  0.0000000000
2018 46112.2953318973   46112.2953318973    0.0000000000  0.0000000000
2019 47653.1702917458   47653.1702917458    0.0000000000  0.0000000000
2020 49006.1818883145   49006.1818883145    0.0000000000  0.0000000000
2021 49885.0566274899   49885.0566274899    0.0000000000  0.0000000000
2022 51338.420507908

## Extract the model and the baseline
**all_about_bol** has a lot of content, including: 
- .mmodel is the model instance
- .base_input is the baseline where the add factors and the fitted values are calculated 

In [7]:
mbol,baseline = all_about_bol() 
reference = baseline.copy() 

In [8]:
# And the time span of the EViews forecast 
mbol.current_per

Index([2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023, 2024, 2025, 2026, 2027, 2028, 2029, 2030, 2031, 2032, 2033, 2034, 2035], dtype='int64')

## Solve the model
Solving the model shows whether the single-equation residuals have a material impact.  

In [9]:
res = mbol(baseline,2016,2035,silent=0,ldumpvar=0,
           solver='sim',reset_options=True)
mbol.basedf = reference  # so we compare the values with the eviews results 

Will start solving: BOL
Reusing the solver as no new data 
now makelos makes a sim solvefunction
2016 Solved in 6 iterations
2017 Solved in 6 iterations
2018 Solved in 6 iterations
2019 Solved in 6 iterations
2020 Solved in 6 iterations
2021 Solved in 6 iterations
2022 Solved in 6 iterations
2023 Solved in 6 iterations
2024 Solved in 6 iterations
2025 Solved in 17 iterations
2026 Solved in 18 iterations
2027 Solved in 19 iterations
2028 Solved in 23 iterations
2029 Solved in 20 iterations
2030 Solved in 19 iterations
2031 Solved in 18 iterations
2032 Solved in 18 iterations
2033 Solved in 18 iterations
2034 Solved in 18 iterations
2035 Solved in 18 iterations
BOL solved  


In [10]:
# The values in the forecast period don't match. 
# This is expected when the 3 equations identified above don't reproduce the stored values 
mbol['{cty}GGEXPCAPTCN {cty}NYGDPMKTPCN {cty}GGDBTTOTLCN {cty}BNCABFUNDCD'].dif.df

| | BOLGGEXPCAPTCN | BOLNYGDPMKTPCN | BOLGGDBTTOTLCN | BOLBNCABFUNDCD |
|---|---|---|---|---|
| 2016 | 1e-10 | 5e-10 | 1e-10 | 0.0 |
| 2017 | 1e-10 | 2e-10 | 2e-10 | -0.0 |
| 2018 | 0.0 | 2e-10 | 2e-10 | -0.0 |
| 2019 | 0.0 | 6e-10 | 1e-10 | -0.0 |
| 2020 | 0.0 | 5e-10 | -0.0 | -0.0 |
| 2021 | 0.0 | -1e-10 | -3e-10 | -0.0 |
| 2022 | -1e-10 | -9e-10 | -7e-10 | -0.0 |
| 2023 | -1e-10 | -1e-09 | -1.4e-09 | -0.0 |
| 2024 | -3e-10 | -1.6e-09 | -2e-09 | 0.0 |
| 2025 | 49.2676122219 | 452.0225565115 | 95.3120150091 | -2.2032865561 |
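
To locate where the simulation starts to drift from the EViews results, one can look for the first period in which the difference exceeds a small tolerance. The sketch below does this in plain Python, using the `BOLNYGDPMKTPCN` differences from the output above; the helper name `first_divergence` and the tolerance `1e-6` are assumptions chosen for illustration.

```python
# Differences for BOLNYGDPMKTPCN, copied from the output above
gdp_dif = {
    2016: 5e-10, 2017: 2e-10, 2018: 2e-10, 2019: 6e-10, 2020: 5e-10,
    2021: -1e-10, 2022: -9e-10, 2023: -1e-09, 2024: -1.6e-09,
    2025: 452.0225565115,
}


def first_divergence(differences, tol=1e-6):
    """Return the first period whose absolute difference exceeds tol, or None."""
    for period in sorted(differences):
        if abs(differences[period]) > tol:
            return period
    return None


print(first_divergence(gdp_dif))
```

Here the differences up to 2024 are at the level of numerical noise, and the first material deviation appears in 2025.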


## The ModelFlow model does not calculate the same values as in the .wf1 file
The reason is that some identities are not really identities. The offending equations need an add factor that makes the left-hand and right-hand sides equal. 

The simplest way to achieve this is to create a new model where the 3 problem equations are given an add factor, and the add factor is then calculated. 

The lines below do this. The equations are the same as before, but ```add_add_factor=True, calc_add=True```
 introduces an add factor and calculates its value. 

```python
newmbol, newbaseline = mbol_temp.equpdate(
'''
<IDENT> BOLNYGDPPOTLKN = BOLNYGDPTFP*(BOLLMEMPSTRL**BOLNYYWBTOTLCN_)*(BOLNEGDIKSTKKN(-1)**(1-BOLNYYWBTOTLCN_))
<IDENT> BOLBFCAFRACGCD = -BOLBFBOPTOTLCD*(1-BOLBFCAF2BOP_(-1))
<IDENT> BOLNEGDIKSTKKN = BOLNEGDIKSTKKN(-1)*(1-BOLDEPR/100)+BOLNEGDIFTOTKN
''',
    add_add_factor=True, calc_add=True
)
```
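
The arithmetic behind an add factor is simple: the add factor is the stored value minus the value of the right-hand side, and the updated equation adds it back. The sketch below shows this with plain numbers. It does not call ModelFlow, and the function names and values are hypothetical; only the pattern `VAR = rhs + VAR_A`, with `VAR_A = VAR - rhs`, follows what `equpdate` reports in this notebook.

```python
def calc_add_factor(observed, rhs_value):
    """Add factor that closes the gap: VAR_A = VAR - rhs."""
    return observed - rhs_value


def identity_with_add_factor(rhs_value, add_factor):
    """Updated equation: VAR = rhs + VAR_A."""
    return rhs_value + add_factor


observed = 51400.0    # hypothetical value stored in the workfile
rhs_value = 51278.2   # hypothetical value of the right-hand side
add_factor = calc_add_factor(observed, rhs_value)
reproduced = identity_with_add_factor(rhs_value, add_factor)
print(add_factor, reproduced)
```

By construction the updated equation reproduces the stored value in the baseline, which is why the simulation matches the `.wf1` results once the add factors are in place.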



In [11]:
# Make a new model and set the .lastdf to the desired value 
mbol_temp,temp = all_about_bol()
mbol_temp.lastdf = temp

In [12]:
# Update the equations so they include an add factor, and calculate the addfactors 
newmbol,newbaseline = mbol_temp.equpdate(
'''
<IDENT> BOLNYGDPPOTLKN = BOLNYGDPTFP*(BOLLMEMPSTRL**BOLNYYWBTOTLCN_)*(BOLNEGDIKSTKKN(-1)**(1-BOLNYYWBTOTLCN_))
<IDENT> BOLBFCAFRACGCD = -BOLBFBOPTOTLCD*(1-BOLBFCAF2BOP_(-1))
<IDENT> BOLNEGDIKSTKKN = BOLNEGDIKSTKKN(-1)*(1-BOLDEPR/100)+BOLNEGDIFTOTKN
''',
             add_add_factor=True,calc_add=True)


The model:"BOL" got new equations, new model name is:"BOL Updated"
New equation for For BOLNYGDPPOTLKN
Old frml   :FRML <IDENT> BOLNYGDPPOTLKN = BOLNYGDPTFP*(BOLLMEMPSTRL**BOLNYYWBTOTLCN_)*(BOLNEGDIKSTKKN(-1)**(1-BOLNYYWBTOTLCN_)) $
New frml   :FRML <IDENT> BOLNYGDPPOTLKN = (BOLNYGDPTFP*(BOLLMEMPSTRL**BOLNYYWBTOTLCN_)*(BOLNEGDIKSTKKN(-1)**(1-BOLNYYWBTOTLCN_)) + BOLNYGDPPOTLKN_A)                               $
Adjust calc:FRML <IDENT> BOLNYGDPPOTLKN_A = (BOLNYGDPPOTLKN) - (BOLNYGDPTFP*(BOLLMEMPSTRL**BOLNYYWBTOTLCN_)*(BOLNEGDIKSTKKN(-1)**(1-BOLNYYWBTOTLCN_)))$

New equation for For BOLBFCAFRACGCD
Old frml   :FRML <IDENT> BOLBFCAFRACGCD = -BOLBFBOPTOTLCD*(1-BOLBFCAF2BOP_(-1)) $
New frml   :FRML <IDENT> BOLBFCAFRACGCD = (-BOLBFBOPTOTLCD*(1-BOLBFCAF2BOP_(-1)) + BOLBFCAFRACGCD_A)                               $
Adjust calc:FRML <IDENT> BOLBFCAFRACGCD_A = (BOLBFCAFRACGCD) - (-BOLBFBOPTOTLCD*(1-BOLBFCAF2BOP_(-1)))$

New equation for For BOLNEGDIKSTKKN
Old frml   :FRML <IDENT> BOLNEGDIKSTKKN 

In [14]:
# Now the new model with the new add factors is simulated
res = newmbol(newbaseline,silent=0)
newmbol.basedf = reference
newmbol.lastdf = res.loc[:2035,:]  # only the relevant years are saved

Will start solving: updated BOL
Reusing the solver as no new data 
2016 Solved in 6 iterations
2017 Solved in 6 iterations
2018 Solved in 6 iterations
2019 Solved in 6 iterations
2020 Solved in 6 iterations
2021 Solved in 6 iterations
2022 Solved in 6 iterations
2023 Solved in 6 iterations
2024 Solved in 6 iterations
2025 Solved in 6 iterations
2026 Solved in 6 iterations
2027 Solved in 6 iterations
2028 Solved in 6 iterations
2029 Solved in 6 iterations
2030 Solved in 6 iterations
2031 Solved in 6 iterations
2032 Solved in 6 iterations
2033 Solved in 6 iterations
2034 Solved in 6 iterations
2035 Solved in 6 iterations
updated BOL solved  


In [17]:
# Now the values match the values in the wf1 file 
newmbol['{cty}GGEXPCAPTCN {cty}NYGDPMKTPCN {cty}GGDBTTOTLCN {cty}BNCABFUNDCD'].dif.rename().df

| | General Government Capital Expenditure | GDP, Market Prices, LCU mn | General Government Gross Debt | Current Account Balance, US$ mn |
|---|---|---|---|---|
| 2016 | 1e-10 | 5e-10 | 1e-10 | 0.0 |
| 2017 | 1e-10 | 2e-10 | 2e-10 | -0.0 |
| 2018 | 0.0 | 2e-10 | 2e-10 | -0.0 |
| 2019 | 0.0 | 6e-10 | 1e-10 | -0.0 |
| 2020 | 0.0 | 5e-10 | -0.0 | -0.0 |
| 2021 | 0.0 | -1e-10 | -3e-10 | -0.0 |
| 2022 | -1e-10 | -9e-10 | -7e-10 | -0.0 |
| 2023 | -1e-10 | -1e-09 | -1.4e-09 | -0.0 |
| 2024 | -3e-10 | -1.6e-09 | -2e-09 | 0.0 |
| 2025 | 1.4e-09 | 1.4e-08 | 6e-10 | -3e-10 |


## Look at all the ModelFlow frmls
Notice that after the "original" model, the equations for the "fitted" values have been added. <br>
At the end of the listing is the specification of the model that calculates the add factors if a variable is fixed. When processing the equations, the ```model``` class handles this model separately and creates a model instance 
that is used to calculate add factors when a variable is fixed. 

In [18]:
#print(mbol.equations)

In [19]:
mbol.modeldump('bol.pcim')

In [20]:
!dir *.pcim

 Volume in drive C has no label.
 Volume Serial Number is C2DB-095E

 Directory of c:\modelflow manual\papers\mfbook\content\archived\howto\onboard\eviews

25-11-2024  21:48           220.819 bol.pcim
22-11-2024  23:44           220.847 test.pcim
               2 File(s)        441.666 bytes
               0 Dir(s)  656.559.927.296 bytes free
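
The `!dir` command only works on Windows. As a platform-independent alternative, the dumped files can be listed with `pathlib`; this is a small sketch, and the helper name `list_pcim_files` is hypothetical.

```python
from pathlib import Path


def list_pcim_files(folder="."):
    """Return (file name, size in bytes) for every .pcim file in folder."""
    return sorted((p.name, p.stat().st_size) for p in Path(folder).glob("*.pcim"))


for name, size in list_pcim_files():
    print(f"{size:>12,d}  {name}")
```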
