# 1. Motivation

We construct a simulation for a random-trend (RT) model as defined by [Wooldridge 2010](https://mitpress.mit.edu/books/econometric-analysis-cross-section-and-panel-data), section 11.7.1. The model extends the [first-difference (FD) model](https://en.wikipedia.org/wiki/First-difference_estimator) for panel data and overcomes some of its deficiencies. Our main interest is to find an unbiased estimator when a linear random trend has introduced bias into the classical panel estimation methods.

## 1.1. Basic Panel Data Model

The basic model for the dependent variable follows a linear panel data specification:

<a id='EQ1'></a>

$$
y_{it} = \beta \cdot x_{it} + u_{it} \quad (1)
$$

As we are working with a panel, the index $i$ stands for an individual (person, company, country, etc.), while the index $t$ represents time. The dependent and the independent variables are indicated by $y_{it}$ and $x_{it}$ respectively and vary both over individuals and over time. The error term is represented by $u_{it}$. It can be decomposed into two terms:

$$u_{it} = \alpha_i + e_{it}$$

In this context $\alpha_i$ is an unobserved individual-specific effect that stays constant over time for each individual, and $e_{it}$ is an idiosyncratic error that changes both over time and over individuals. In practice we are mostly interested in the effect of $x$ on $y$, so we are looking for an unbiased and consistent estimate of $\beta$.

Given that $x_{it}$ may be correlated with the individual-specific part of the error term ($\alpha_i$), plain OLS without any data transformation (pooled regression) gives biased results due to [endogeneity bias](https://en.wikipedia.org/wiki/Endogeneity_%28econometrics%29).

Two methods can be used to deal with the endogeneity problem induced by the time-constant part of the error term: the [fixed-effects](https://en.wikipedia.org/wiki/Fixed_effects_model) (FE) method and the first-differences (FD) method.

The fixed-effects (FE) method uses the so-called within transformation, which subtracts from each variable its individual-specific mean over time. This eliminates the individual-specific effect that is constant over time (here: $\alpha_i$) and allows for a consistent [OLS estimation](https://en.wikipedia.org/wiki/Ordinary_least_squares):

<a id='EQ2'></a>
$$y_{it} - \bar{y}_{i} = \beta \cdot (x_{it} - \bar{x}_{i}) + \alpha_i - \alpha_i + e_{it} - \bar{e}_{i} \quad (2)$$ 

with $\bar{y}_{i} = \frac{1}{T} \sum_{t=1}^T y_{it}$, $\bar{x}_{i} = \frac{1}{T} \sum_{t=1}^T x_{it}$ and $\bar{e}_{i} = \frac{1}{T} \sum_{t=1}^T e_{it}$.
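
The within transformation can be sketched in a few lines of pandas. This is only an illustration on a hypothetical toy panel (the column names `id`, `t`, `x`, `y`, the values, and $\beta = 2$ are assumptions, not the data used later): with no idiosyncratic noise, the no-intercept OLS slope on the demeaned data should return $\beta$ exactly, because $\alpha_i$ drops out.

```python
import pandas as pd

# Hypothetical toy panel: 2 individuals, 3 periods, no idiosyncratic error.
beta = 2.0
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2],
    "t":  [1, 2, 3, 1, 2, 3],
    "x":  [1.0, 2.0, 4.0, 3.0, 5.0, 6.0],
})
alpha = df["id"].map({1: 10.0, 2: -5.0})  # assumed individual-specific effects
df["y"] = beta * df["x"] + alpha

# Within transformation: subtract each individual's mean over time.
cols = ["x", "y"]
demeaned = df[cols] - df.groupby("id")[cols].transform("mean")

# OLS slope without intercept on the demeaned data.
beta_fe = (demeaned["x"] * demeaned["y"]).sum() / (demeaned["x"] ** 2).sum()
print(beta_fe)
```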

The first-differences (FD) method controls for the bias by subtracting the previous observation from the current one:

<a id='EQ3'></a>
$$y_{it} - y_{it-1} = \beta \cdot ( x_{it} - x_{it-1}) + \alpha_i - \alpha_{i} + (e_{it} - e_{it-1}) \quad (3) $$

One can clearly see that in both methods $\alpha_i$ cancels out of the regression, which removes any correlation between $\alpha_i$ and $x_{it}$ as a source of bias. Still, a crucial requirement for retrieving reliable estimates is that the independent variable and the idiosyncratic term remain uncorrelated in expectation.
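
As a minimal sketch of this argument, the following compares pooled OLS, FE and FD on one simulated panel. The DGP here is a simplified assumption for illustration only (standard normal draws, $\beta = 1$, $x_{it}$ built as $\alpha_i$ plus noise, hypothetical sizes `N` and `T`); it is not the DGP used in the rest of this study. Rows are assumed to be sorted by individual and then by time.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
N, T, beta = 500, 5, 1.0                      # hypothetical panel size and true beta

ids = np.repeat(np.arange(N), T)              # rows sorted by id, then time
alpha = rng.normal(size=N)                    # individual-specific effect
x = alpha[ids] + rng.normal(size=N * T)       # x is correlated with alpha_i
y = beta * x + alpha[ids] + rng.normal(size=N * T)
df = pd.DataFrame({"id": ids, "x": x, "y": y})

def slope(x, y):
    """OLS slope of y on x without an intercept."""
    return float((x * y).sum() / (x * x).sum())

cols = ["x", "y"]
grouped = df.groupby("id")[cols]

b_pooled = slope(df["x"], df["y"])            # ignores alpha_i, biased upward here
within = df[cols] - grouped.transform("mean") # within transformation (FE)
b_fe = slope(within["x"], within["y"])
diffed = grouped.diff().dropna()              # first differences (FD)
b_fd = slope(diffed["x"], diffed["y"])

print(b_pooled, b_fe, b_fd)
```

Under these assumptions the pooled estimate should sit clearly above the true value, while the FE and FD estimates should both be close to it.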

Things get complicated, however, when the variable of interest $x_{it}$ is correlated not only with the unobserved constant ($\alpha_i$) but also with a linear trend ($g_i \cdot t$). Keeping the variable names as before, the error term can now be decomposed as:

<a id='RT'></a>
$$u_{it} = \alpha_i + g_i \cdot t + e_{it} $$

Now, $g_i$ is the slope of a linear trend that is specific to each individual. Note that if $y_{it}$ is the logarithm of the original variable, $g_i$ can be interpreted as roughly the average growth rate over a period. In that case, the equation is usually referred to as a random-growth model, otherwise simply as a random-trend model. Overall, this presents an additional source of heterogeneity that needs to be dealt with before employing an OLS estimation.

To solve the possible bias problem due to the RT component, the literature states that in the first step we have to calculate the first-differences in order to transform the linear trend into a constant. To illustrate this more formally, taking the first-difference in the RT set-up gives

$$y_{it} - y_{it-1} = \beta \cdot (x_{it} - x_{it-1}) + \alpha_i - \alpha_{i} +   g_i \cdot t - g_{i} \cdot (t-1) + (e_{it} - e_{it-1})$$

$$ \iff \Delta y_{it} = \beta \cdot \Delta x_{it} + g_i + \Delta  e_{it}$$
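
The key step, that differencing turns $\alpha_i + g_i \cdot t$ into the constant $g_i$, can be checked numerically. The values of `alpha_i` and `g_i` below are arbitrary assumptions chosen for illustration.

```python
import numpy as np

alpha_i, g_i = 3.0, 0.5                 # hypothetical values for one individual
t = np.arange(1, 7)                     # periods 1..6
trend_part = alpha_i + g_i * t          # alpha_i + g_i * t

first_diff = np.diff(trend_part)        # alpha_i cancels, g_i remains
second_diff = np.diff(trend_part, n=2)  # differencing again removes g_i

print(first_diff)
print(second_diff)
```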

Thereafter, it is up to the researcher to continue with either the within-transformation or a second first-difference. Note that even though both are fixed-effect methods, we follow the literature and call only the within-transformation the fixed-effect method. We will investigate whether one of the approaches is superior to the other in terms of the estimation bias of the coefficient. It would also be of interest to investigate the standard errors and model selection criteria like $R^2$, AIC and BIC, but this is left for future research.

As we saw, the linear trend has been reduced to a constant term, which can now be canceled out by a second first-difference or a within-transformation. So, we have two possibilities for estimation given the RT model:
- The **FD Method**: taking the first-difference twice; named _pure_ in the tables later on.
- The **FD-FE Method**: first the first-difference, then the fixed-effect transformation; named _mix_.

First-differencing a second time leads to the FD Method:

<a id='EQ4'></a>
$$\Delta^2 y_{it} = \beta \cdot \Delta^2 x_{it} + \Delta^2 e_{it} \quad (4) $$

where $\Delta^2$ denotes taking the first-difference twice.

The alternative is to apply the FE transformation to each first-differenced variable by subtracting from it the mean corresponding to each individual. This leads to the FD-FE method:

<a id='EQ5'></a>
$$\Delta y'_{it} = \beta \cdot \Delta x'_{it} + \Delta e'_{it} \quad (5) $$

where the prime ($'$) denotes that the variables are demeaned.
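
Both estimators in (4) and (5) can be sketched on one simulated random-trend panel. The DGP below is a simplified assumption for illustration (standard normal draws, $\beta = 1$, $x_{it}$ built from $\alpha_i$, $g_i \cdot t$ and noise, hypothetical sizes `N` and `T`) and is not the DGP defined later in this study. The names `b_pure` and `b_mix` mirror the labels _pure_ and _mix_ introduced above; `b_fd_only` is a single first-difference, shown for contrast.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
N, T, beta = 500, 8, 1.0                       # hypothetical panel size and true beta

ids = np.repeat(np.arange(N), T)               # rows sorted by id, then time
t = np.tile(np.arange(1, T + 1), N)
alpha = rng.normal(size=N)                     # individual-specific constant
g = rng.normal(size=N)                         # individual-specific trend slope
x = alpha[ids] + g[ids] * t + rng.normal(size=N * T)
y = beta * x + alpha[ids] + g[ids] * t + rng.normal(size=N * T)
df = pd.DataFrame({"id": ids, "t": t, "x": x, "y": y})

def slope(x, y):
    """OLS slope of y on x without an intercept."""
    return float((x * y).sum() / (x * x).sum())

cols = ["x", "y"]

# Step 1: first difference within each individual (trend becomes the constant g_i).
d1 = df.groupby("id")[cols].diff()
d1["id"] = df["id"]
d1 = d1.dropna()
b_fd_only = slope(d1["x"], d1["y"])            # g_i still in the error term

# Step 2a (pure): difference a second time, equation (4).
d2 = d1.groupby("id")[cols].diff().dropna()
b_pure = slope(d2["x"], d2["y"])

# Step 2b (mix): demean the first differences, equation (5).
dm = d1[cols] - d1.groupby("id")[cols].transform("mean")
b_mix = slope(dm["x"], dm["y"])

print(b_fd_only, b_pure, b_mix)
```

Under these assumptions, a single first-difference should remain visibly biased, while both second-step transformations should return estimates close to the true $\beta$.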

One can clearly see that in (4) and (5) both terms $\alpha_i$ and $g_i \cdot t$ have canceled out. Thereby $\beta$ will not be biased even though the data initially included two different sources of bias.

On the other hand, if we had failed to take first differences in the first place and had started with the within-transformation as in equation (2), we would have obtained the following:

$$y_{it} - \bar{y}_{i} = \beta \cdot ( x_{it} - \bar{x}_{i}) + \alpha_i - \alpha_{i} +   g_i \cdot (t - \bar{t}) + (e_{it} - \bar{e}_{i})$$

A second transformation is then not able to cancel out the time trend effect (we will confirm this [later in the simulation](#MCNonFEFD)). We thereby see that it is crucial to take the first-difference first and not the within-transformation.
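
A small numerical check illustrates why starting with the within-transformation does not help. The slope `g_i` below is an arbitrary assumption. After demeaning, the trend part equals $g_i \cdot (t - \bar{t})$; demeaning again changes nothing, and differencing afterwards leaves the constant $g_i$, which is still in the error term.

```python
import numpy as np

g_i = 0.5                                  # hypothetical trend slope for one individual
t = np.arange(1, 7, dtype=float)           # periods 1..6
trend = g_i * t

within = trend - trend.mean()              # g_i * (t - t_bar), still varies with t
within_again = within - within.mean()      # a second demeaning changes nothing
within_then_fd = np.diff(within)           # leaves the constant g_i, not zero

print(within)
print(within_then_fd)
```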

Given that our group is new to Python but experienced in Stata, we will first run the simulation in Stata using the package [`ipystata`](https://github.com/TiesdeKok/ipystata). This gives us a known language as our benchmark. In a second step we will use only open-source software packages to replicate the results, noting which Python code corresponds to which Stata code. Besides providing a benchmark, this starts a useful _translator_ from Stata to Python. We will also comment on the speed of both languages as well as on the advantages and disadvantages of coding in each.

The assignment continues as follows: (i) First, we explain our [Data Generating Process (DGP)](#DGP) step by step. (ii) Then we run a single [simulation without a random trend](#SimNoTrend) to check that everything works. As a single simulation is not enough to produce definitive answers, from there on we perform Monte Carlo studies with numerous simulation draws, starting with the basic DGP [without a trend](#MCNoTrend). (iii) Next, we run the same simulation with a [constant linear trend](#MCConstTrend) and (iv) with an [individual-specific trend](#MCIndivTrend) in the data generating process. (v) To see how robust the random-trend estimates are, we further run two simulations with [non-linear trends](#MCNonLinTrend). (vi) Finally, we reproduce the results for the individual-specific trend using only [open-source Python packages](#MCPythonRT), such as _numpy_, _pandas_ and _random_.

## 1.2. Setting up Python packages for this study

