# Description of the code

## `model` class
The model is implemented as a class that takes several parameters (e.g. the elasticity of substitution) and a data matrix (population sizes, GDP, TFP, etc.). All parameters are named with Latin or Greek letters. Common parameters are stored as floats or dictionaries, depending on their dimension. Country-specific variables and variables of higher dimensionality (time, region, skill, etc.) are stored in separate pandas DataFrames.

At the moment, the model allows for three climate damage scenarios: `intermediate` (+2 °C), `minimalist` (+0 °C) and `maximalist` (+4 °C).

## `correct_south_south`

The first method that is added to the class is `correct_south_south`. It corrects the resident populations (L) in 2010 to account for south-south migration.

## `MidxDataFrame`

`MidxDataFrame` is a helper method for extracting individual variables from data matrices of different dimensionality.

## `calib_epsilon`

The `calib_epsilon` method calibrates the relation between skill-ratio and TFP ($\overline{A_{r}}, \ \epsilon_{r}, \ \gamma $) for both regions. It takes $L_{r,s,t}$, $A_{r,t}$ (here: for $t = 1980,2010$) as inputs. Outputs are $\overline{A_{r}}, \ \epsilon_{r}, \ \gamma \ \forall \ r  $.

From
\begin{align}
A_{r,t} =& \gamma^t \overline{A_r} G \left(T_{r,t}\right) \left(\Gamma_{r,t}^L\right)^{\epsilon_{r}} \\
\Leftrightarrow \log(A_{r,t}) =& \log \left( \overline{A_{r}} \right) + \gamma \underbrace{\log (t)}_{\mathbb{I} (year = 1980)} + \underbrace{\log \left( G \left(T_{r,t}\right) \right)}_{=0} + \epsilon_{r} \log \left(\Gamma_{r,t}^L\right),\quad  \text{where} \ \Gamma_{r,t}^L = \frac{L_{r,h,t}}{L_{r,l,t}}. 
\end{align}
Estimate for both regions:
\begin{align}
\log(A_{r,t}) =&  \log \left( \overline{A_{r}} \right) +  \epsilon_{r} \ \log(\Gamma_{r,t}^L) + \gamma \ \mathbb{I} (year = 1980).
\end{align}

**Issues:**
* $\gamma$ currently set to 1. Remove that assumption!
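
As a rough illustration of the regression above: with $\gamma = 1$ and $G(T_{r,t}) = 1$, each region has two observations (1980 and 2010) and two unknowns, so $\epsilon_r$ and $\overline{A_r}$ can be solved for exactly. The sketch below assumes this simplified case; the function name and the input numbers are hypothetical and not taken from the model code.

```python
import math


def calib_epsilon_two_points(A_1980, A_2010, ratio_1980, ratio_2010):
    """Solve log(A_t) = log(A_bar) + eps * log(Gamma_t) from two observations.

    Assumes gamma = 1 (no time term) and G(T) = 1, so the system is
    exactly identified for a single region.
    """
    d_log_ratio = math.log(ratio_2010) - math.log(ratio_1980)
    if d_log_ratio == 0:
        raise ValueError("the skill ratio must differ between the two years")
    # Slope of log TFP on the log skill ratio
    eps = (math.log(A_2010) - math.log(A_1980)) / d_log_ratio
    # Back out the intercept from the 2010 observation
    A_bar = math.exp(math.log(A_2010) - eps * math.log(ratio_2010))
    return A_bar, eps


# Made-up numbers for one region, for illustration only
A_bar, eps = calib_epsilon_two_points(
    A_1980=1.0, A_2010=2.0, ratio_1980=0.1, ratio_2010=0.4
)
```

Once the $\gamma = 1$ assumption is removed, the two regions would have to be estimated jointly, since $\gamma$ is common to both.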

## `calib_kappa`
`calib_kappa` calibrates the skill-biased technical change. Inputs are $\Gamma_{n,t}^w$, $L_{r,s,t}$ (here for $t = 2010$), outputs are $\kappa_r$ and $\overline{\Gamma_r^\eta}$.

Estimate
\begin{align}
\Gamma_{r,t}^\eta =& \overline{\Gamma_{r}^\eta} \left( \Gamma_{r,t}^L \right)^{\kappa_r} \\
\Leftrightarrow 
\log \left( \Gamma_{r,t}^\eta \right)=& \log \left( \overline{\Gamma_{r}^\eta} \right) + \kappa_r \log \left( \Gamma_{r,t}^L \right) ,\quad  \text{where} \ \Gamma_{r,t}^L = \frac{L_{r,h,t}}{L_{r,l,t}}.
\end{align}
We assume that there is no skill-biased technical change in agriculture.

$\kappa_n$ is set to half the correlation between $\Gamma^\eta$ and $\Gamma^L$.
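
The rule for $\kappa_n$ can be sketched as follows. This is only an illustration under several assumptions that the text does not settle: the correlation is taken as the Pearson correlation of the levels (not logs), the sample is a hypothetical list of paired observations, and $\overline{\Gamma_n^\eta}$ is backed out from the last observation (standing in for 2010). The function names are hypothetical.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def calib_kappa_sketch(gamma_eta, gamma_L):
    """Return (kappa, gamma_eta_bar) from paired observations.

    kappa is half the correlation between gamma_eta and gamma_L;
    gamma_eta_bar solves gamma_eta = gamma_eta_bar * gamma_L ** kappa
    at the last observation (assumed here to be the 2010 value).
    """
    kappa = 0.5 * pearson(gamma_eta, gamma_L)
    gamma_eta_bar = gamma_eta[-1] / gamma_L[-1] ** kappa
    return kappa, gamma_eta_bar


# Made-up, perfectly correlated data for illustration only
kappa, gamma_eta_bar = calib_kappa_sketch([0.2, 0.4, 0.8], [0.1, 0.2, 0.4])
```

For agriculture, the no-skill-bias assumption corresponds to $\kappa_a = 0$, so $\overline{\Gamma_a^\eta}$ equals the observed $\Gamma_{a,t}^\eta$.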

## `calib_migcosts`
This method uses data on resident populations and the migration matrices. It outputs internal and international migration costs.

**Issues:**
* How to handle $x_{ii}$ or $x_{ij}$ < 0?

**Input:**
* $L_{r,s,t}$
* $M_{i,j,r,s,t}$

**Output:**
* $x_{ii}$
* $x_{ij}$


Migration costs are derived separately for each skill group in $t$ = 2010. For each skill $s$ in $S = \{h, l \}$:

Let $m_{i,i,a,h,t} = 0.3$
\begin{align}
\hat{I_s} =& I_{F,i,a,s,t} - I_{F,i,n,s,t} \\
N^s =& L_{a,s,t} + L_{n,s,t} + M_{i,F,a,s,t} + M_{i,F,n,s,t} - \hat{I_s}
\end{align}


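Hatted symbols denote the corresponding `_hat` variables in the code.
\begin{align}
m_{a,F,s} =& \frac{\hat{M}_{a,F,s}}{\hat{L}_{a,s} - (1-v_{n,s}) \hat{I_s}} \\
m_{n,F,s} =& \frac{\hat{M}_{n,F,s}}{\hat{L}_{n,s} - m_{a,n,s} \left( \hat{L}_{a,s} - (1-v_{n,s}) \hat{I_s} \right) - v_{n,s} \hat{I_s}} \\
N_{a,s} =& \left( \hat{L}_{a,s} - (1-v_{n,s}) \hat{I_s} \right) (1 + m_{a,n,s} + m_{a,F,s}) \\
N_{n,s} =& \left( \hat{L}_{n,s} - m_{a,n,s} \left( \hat{L}_{a,s} - (1-v_{n,s}) \hat{I_s} \right) - v_{n,s} \hat{I_s} \right) (1 + m_{n,F,s}) \\
M_{a,a,s} =& \frac{N_{a,s}}{1 + m_{a,n,s} + m_{a,F,s}} \\
M_{n,n,s} =& \frac{N_{n,s}}{1 + m_{n,F,s}} \\
M_{a,n,s} =& m_{a,n,s} M_{a,a,s} \\
M_{a,F,s} =& m_{a,F,s} M_{a,a,s} \\
M_{n,F,s} =& m_{n,F,s} M_{n,n,s} \\
RES_a =& \left| N_{a,s} - (M_{a,a,s} + M_{a,F,s} + M_{a,n,s}) \right| \\
RES_n =& \left| N_{n,s} - (M_{n,n,s} + M_{n,F,s}) \right|
\end{align}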
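
The accounting steps above can be wrapped into a small runnable function. The sketch below keeps the variable names from the listing; the function name, the helper names `base_a` and `base_n`, and the input numbers are hypothetical, and the function covers a single skill group $s$.

```python
def migration_accounting(L_a_s_hat, L_n_s_hat, M_aF_s_hat, M_nF_s_hat,
                         I_s_hat, v_n_s, m_an_s):
    """Compute native populations and migrant stocks for one skill group."""
    # Common denominators; by construction they equal M_aa_s and M_nn_s
    base_a = L_a_s_hat - (1 - v_n_s) * I_s_hat
    base_n = L_n_s_hat - m_an_s * base_a - v_n_s * I_s_hat

    # International emigrant ratios
    m_aF_s = M_aF_s_hat / base_a
    m_nF_s = M_nF_s_hat / base_n

    # Native populations by region
    N_a_s = base_a * (1 + m_an_s + m_aF_s)
    N_n_s = base_n * (1 + m_nF_s)

    # Stayers and migrant stocks
    M_aa_s = N_a_s / (1 + m_an_s + m_aF_s)
    M_nn_s = N_n_s / (1 + m_nF_s)
    M_an_s = m_an_s * M_aa_s
    M_aF_s = m_aF_s * M_aa_s
    M_nF_s = m_nF_s * M_nn_s

    # Residuals of the adding-up constraints (should be numerically zero)
    RES_a = abs(N_a_s - (M_aa_s + M_aF_s + M_an_s))
    RES_n = abs(N_n_s - (M_nn_s + M_nF_s))

    return {"N_a_s": N_a_s, "N_n_s": N_n_s, "M_aa_s": M_aa_s,
            "M_nn_s": M_nn_s, "M_an_s": M_an_s, "M_aF_s": M_aF_s,
            "M_nF_s": M_nF_s, "RES_a": RES_a, "RES_n": RES_n}


# Made-up inputs for illustration only
out = migration_accounting(L_a_s_hat=100.0, L_n_s_hat=200.0,
                           M_aF_s_hat=5.0, M_nF_s_hat=10.0,
                           I_s_hat=20.0, v_n_s=0.5, m_an_s=0.3)
```

Both residuals are zero by construction, so they only act as a sanity check against coding errors, not as a test of fit to the data.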


COMPLETE THIS PART!

## `calib_pop_growth`
We calibrate the population growth process as follows. We take $L_{r,s,1980}$, $\Gamma_{r,1980}^n$ exogenously from the data. Further, we have computed $N_{r,s,2010}$ in the calibration of migration costs. We constrain $p_{r,h,t}$ to be within a certain range. Then, with $t=1980, t+1 = 2010$: 

For $r$ in $\{a, n\}$:

Let $n_{r,h,t} = \Gamma_{r,t}^n n_{r,l,t}$ 

and 

$p_{r,h,t} = p_{r,h}^{max} - (p_{r,h}^{max}-p_{r,h}^{min}) \cdot \frac{w_{r,h,t,i}^{max}-w_{r,h,t,i}}{w_{r,h,t,i}^{max}-w_{r,h,t,i}^{min}}$, where [$p_{r,h}^{min}$, $p_{r,h}^{max}$] = [0.5,  0.8]
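
The bound on $p_{r,h,t}$ is a linear interpolation between $p_{r,h}^{min}$ and $p_{r,h}^{max}$ driven by where $w$ lies in its range. A minimal sketch, with a hypothetical function name and made-up values of $w$:

```python
def p_high(w, w_min, w_max, p_min=0.5, p_max=0.8):
    """Interpolate p linearly: p_min at w = w_min, p_max at w = w_max."""
    if w_max == w_min:
        raise ValueError("w_max and w_min must differ")
    return p_max - (p_max - p_min) * (w_max - w) / (w_max - w_min)


# Illustration with made-up bounds for w
p_at_max = p_high(w=10.0, w_min=2.0, w_max=10.0)
p_at_min = p_high(w=2.0, w_min=2.0, w_max=10.0)
p_at_mid = p_high(w=6.0, w_min=2.0, w_max=10.0)
```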

By definition:

$N_{r,l,t+1} = L_{r,l,t} (1-p_{r,l,t}) n_{r,l,t} + L_{r,h,t} (1-p_{r,h,t}) \Gamma_{r,t}^n n_{r,l,t}$ (I)


$N_{r,h,t+1} = L_{r,l,t} p_{r,l,t} n_{r,l,t} + L_{r,h,t} p_{r,h,t} \Gamma_{r,t}^n n_{r,l,t}$ (II)

So:


(II) $\Leftrightarrow p_{r,l,t} = \frac{N_{r,h,t+1}-L_{r,h,t} p_{r,h,t} \Gamma_{r,t}^n n_{r,l,t}}{L_{r,l,t} n_{r,l,t}}$ (III)

(III) in (I) $\Rightarrow N_{r,l,t+1} = (L_{r,l,t}+L_{r,h,t} \Gamma_{r,t}^n)n_{r,l,t} - N_{r,h,t+1} + L_{r,h,t} p_{r,h,t} \Gamma_{r,t}^n n_{r,l,t} - L_{r,h,t} p_{r,h,t} \Gamma_{r,t}^n n_{r,l,t}$

$\Leftrightarrow N_{r,l,t+1} + N_{r,h,t+1} = (L_{r,l,t}+L_{r,h,t} \Gamma_{r,t}^n)n_{r,l,t}$

Finally: 

$n_{r,l,t} = \frac{ N_{r,l,t+1} + N_{r,h,t+1}}{L_{r,l,t}+L_{r,h,t} \Gamma_{r,t}^n} $

$p_{r,l,t} = \frac{N_{r,h,t+1}}{N_{r,l,t+1} + N_{r,h,t+1}} \frac{L_{r,l,t}+L_{r,h,t} \Gamma_{r,t}^n}{L_{r,l,t}}-p_{r,h,t} \Gamma_{r,t}^n \frac{L_{r,h,t}}{L_{r,l,t}}$
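
The closed-form solution can be checked numerically by plugging the results back into (I) and (II). The sketch below does this for one region; the function name and the input numbers are hypothetical.

```python
def calib_pop_growth_sketch(L_l, L_h, N_l_next, N_h_next, gamma_n, p_h):
    """Solve (I) and (II) for n_l and p_l, given p_h and the fertility ratio."""
    # Sum of (I) and (II): total natives next period pin down n_l
    n_l = (N_l_next + N_h_next) / (L_l + L_h * gamma_n)
    # Equation (III): p_l from the high-skilled native population
    p_l = (N_h_next - L_h * p_h * gamma_n * n_l) / (L_l * n_l)
    return n_l, p_l


# Made-up numbers for one region, for illustration only
L_l, L_h, gamma_n, p_h = 100.0, 20.0, 0.5, 0.8
N_l_next, N_h_next = 150.0, 50.0
n_l, p_l = calib_pop_growth_sketch(L_l, L_h, N_l_next, N_h_next, gamma_n, p_h)

# Plug back into (I) and (II)
N_l_check = L_l * (1 - p_l) * n_l + L_h * (1 - p_h) * gamma_n * n_l
N_h_check = L_l * p_l * n_l + L_h * p_h * gamma_n * n_l
```

Nothing in the formulas forces $p_{r,l,t}$ into $[0, 1]$, so it is worth checking the result for the chosen $p_{r,h,t}$.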



## `simulation method`
Description

# Problems
## Why does predicted L not match the data?
* L is a function of N_2010 and the internal and international emigrant-to-stayer ratios.
* N_2010 is itself a function of the internal emigrant-to-stayer ratios, fertility, and probabilities.

# Checks

*Checks to correct the missing match with the migration matrix:*
- [x] Stop calibration loop after v, run simulation 1 time: SUM(v_calib(2010)) == SUM(v_simul(2010)) 
- [x] Check resident population before and after calibration of migcost 
- [ ] Stop calibration loop after Mij:  Mijdata(2010) != Mijcali(2010)
- [ ] Stop calibration after mij: 
- [ ] Compare Ms before and after calibration of xij, xii:
    * Result: Stayer ratios are too high!