:::{index} pair: DataFrame; Modelflow use  
:::
:::{index} pair: DataFrame; Modelflow extensions  
:::

# Pandas dataframes and modelflow extensions to dataframes

Any class can have both properties (data) and methods (functions that operate on the data of a particular instance of the class). In object-oriented programming languages like Python, classes can be built as supersets of existing classes. The `modelflow` class ```model``` inherits or encapsulates all of the features of the pandas dataframe and extends it in many ways. Some of the methods below are standard pandas methods; others have been added by `modelflow`.

Creating a scenario entails first setting new values for some exogenous variables and then simulating the model. Data in a `dataframe` can be modified directly with built-in pandas functionality such as `.loc[]` and `eval()`. However, to update and transform variables more intuitively, `modelflow` extends these capabilities with two handy methods:
 - **`.upd()`** Returns a DataFrame with updated variables (columns).  
 - **`.mfcalc()`** Returns a DataFrame in which variables have been transformed using algebraic expressions.  

These methods are injected into the pandas DataFrame when the `model` class is imported, so they are ready to use after the `from modelclass import model` line.

In [None]:
# Prepare the notebook for use of modelflow 

# Jupyter magic command to improve the display of charts in the Notebook
%matplotlib inline

# Import pandas 
import pandas as pd

# Import the model class from the modelclass module 
from modelclass import model 

# functions that improve rendering of modelflow outputs
model.widescreen()
model.scroll_off();
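
For comparison, the built-in pandas route mentioned above (`.loc[]` and `eval()`) can be sketched as follows. This is a minimal illustration that uses plain pandas only; the dataframe and its columns `A`, `B` and `C` are made up for the example and do not come from any model.

```python
import pandas as pd

# A small illustrative dataframe with a yearly integer index (made-up data)
df = pd.DataFrame(
    {"A": [1.0, 2.0, 3.0, 4.0], "B": [10.0, 20.0, 30.0, 40.0]},
    index=[2020, 2021, 2022, 2023],
)

# .loc[] overwrites A in the rows labelled 2022 to 2023
# (label-based slices include both end points)
df.loc[2022:2023, "A"] = 42.0

# eval() evaluates an expression over the columns; with an assignment
# it returns a new dataframe that also contains the column C
df = df.eval("C = A + B")
```

Both calls work, but the row selection and the calculation have to be spelled out in pandas terms, which is the kind of bookkeeping that `.upd()` and `.mfcalc()` are meant to reduce.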

:::{index} single: DataFrame; Modelflow specific features
:::
```{index} single: Modelflow; Allowed column names

```

## Modelflow and Pandas DataFrames
A dataframe used with Modelflow has to follow a few conventions, described below, for its column names and its index.

### Column names in  Modelflow 
```{margin} Modelflow variable names
Modelflow places more restrictions on column names than pandas itself does.

```
While pandas `dataframes` are very liberal in what names can be given to columns, ```modelflow``` is more restrictive.

Specifically, in modelflow a variable name must:

* start with a letter
* be upper case

Thus, while all of the names in the table below are legal column names in pandas, some are illegal in modelflow.

:::{index} single: DataFrame; Modelflow naming conventions
:::
:::{index} single: Modelflow; DataFrame naming conventions
:::


| Variable Name | Legal in<br> modelflow? | Reason |
|:-------|:-------------|:--------|
| IB | Yes | <span style='color:Green'>Starts with a letter and is uppercase</span> |
| ib | No | <span style='color:Red'>Lowercase letters are not allowed</span> |
| 42ANSWER | No | <span style='color:Red'>Does not start with a letter</span> |
| \_HORSE1 | No | <span style='color:Red'>Does not start with a letter</span> |
| A_VERY_LONG_NAME_THAT_IS_LEGAL_3 | Yes | <span style='color:Green'>Starts with a letter and is uppercase</span> |
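
The two naming rules can be checked mechanically before a dataframe is handed to modelflow. The sketch below is not part of modelflow: `is_legal_modelflow_name` is a hypothetical helper, and the pattern assumes that, after the leading letter, a name consists only of uppercase letters, digits and underscores (an assumption inferred from the examples in the table, not a documented rule).

```python
import re

# Assumed pattern: one uppercase letter, then uppercase letters, digits or underscores
_LEGAL_NAME = re.compile(r"[A-Z][A-Z0-9_]*")


def is_legal_modelflow_name(name: str) -> bool:
    """Hypothetical helper: True if `name` starts with a letter and is uppercase."""
    return _LEGAL_NAME.fullmatch(name) is not None


# The names from the table above
names = ["IB", "ib", "42ANSWER", "_HORSE1", "A_VERY_LONG_NAME_THAT_IS_LEGAL_3"]
legal = {name: is_legal_modelflow_name(name) for name in names}
```

Here `legal` maps each name to `True` or `False`, reproducing the Yes/No column of the table.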

:::{index} single: Modelflow; time index
:::
:::{index} single: DataFrame; Modelflow time index
:::


### .index and time dimensions in Modelflow
As we saw above, series have indices.  Dataframes also have indices, which are the row names of the dataframe.

In ```modelflow``` the index series is typically understood to represent a date. 

For yearly models, a list of integers, as in the example above, works fine.

For higher-frequency models (quarterly, monthly, weekly, daily, etc.) the index can be one of several pandas date types, but users are encouraged to use `pd.period_range()` to create date indexes.

:::{warning}

Not all date types work well with the graphics routines of modelflow. Users are advised to use the ```pd.period_range()``` function to generate date indexes.

For example:
```
# Build a quarterly PeriodIndex and use it as the row index of the dataframe
dates = pd.period_range(start='1975q1', end='2125q4', freq='Q')
df.index = dates
```

:::
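
The two index styles described above can be put side by side in a short, self-contained sketch. The column name `GDP` and all values are invented for illustration only.

```python
import pandas as pd

# Yearly model: plain integers are enough for the index
annual = pd.DataFrame({"GDP": [100.0, 102.0, 104.0]}, index=[2020, 2021, 2022])

# Quarterly model: build a PeriodIndex with pd.period_range()
dates = pd.period_range(start="2020q1", end="2021q4", freq="Q")
quarterly = pd.DataFrame({"GDP": [float(i) for i in range(len(dates))]}, index=dates)
```

`dates` holds eight quarterly periods, from 2020Q1 to 2021Q4, and `quarterly.index` is a pandas `PeriodIndex`.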

:::{index} single: DataFrame; leads and lags in Modelflow
:::
:::{index} single: Leads in modelflow
:::
:::{index} single: Lags in modelflow
:::



### Leads and lags

`Pandas` does not support the economic idea of leads and lags per se, although the `.shift()` method can be used to emulate it in ordered `dataframes`.

`Modelflow` explicitly supports the idea of leads and lags. A lag is indicated by following the variable name with a negative number in parentheses: -1 for a one-period lag, -2 for a two-period lag, and so on (the number after the negative sign is the number of periods lagged). Positive numbers are used for leads (no + sign is required).

When a method defined by the `modelflow` class encounters something like `A(-1)`, it takes the value from the row above the current row, no matter whether the index is an integer, a year, a quarter or a millisecond. The same goes for leads: `A(+1)` returns the value of `A` in the next row. 

As a result, in a quarterly model `B=A(-4)` would assign `B` the value of `A` from the same quarter of the previous year. 
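
The row-based meaning of `B=A(-4)` and `A(+1)` can be emulated in plain pandas with `.shift()`. The sketch below only illustrates the idea on made-up data; it is not how modelflow implements leads and lags, and the column name `A_LEAD` is invented for the example.

```python
import pandas as pd

# Eight quarters of made-up data: A = 1.0, 2.0, ..., 8.0
dates = pd.period_range(start="2020q1", end="2021q4", freq="Q")
df = pd.DataFrame({"A": [float(i) for i in range(1, 9)]}, index=dates)

# Lag: shift(4) takes the value from four rows above, like B = A(-4)
df["B"] = df["A"].shift(4)

# Lead: shift(-1) takes the value from the next row, like A(+1)
df["A_LEAD"] = df["A"].shift(-1)
```

The first four rows of `B` are missing (`NaN`) because no data exists four quarters earlier, and the last row of `A_LEAD` is missing for the same reason in the other direction.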