From 8e9acfb2595d0262d546fab6589a47a9ce0485c5 Mon Sep 17 00:00:00 2001 From: John Stachurski Date: Sun, 26 Feb 2023 15:19:07 +1100 Subject: [PATCH] misc --- lectures/lln_clt.md | 444 ++++++++++++++++++++++++-------------------- 1 file changed, 244 insertions(+), 200 deletions(-) diff --git a/lectures/lln_clt.md b/lectures/lln_clt.md index a43cdbac9..d3b18627b 100644 --- a/lectures/lln_clt.md +++ b/lectures/lln_clt.md @@ -3,27 +3,32 @@ jupytext: text_representation: extension: .md format_name: myst + format_version: 0.13 + jupytext_version: 1.14.1 kernelspec: - display_name: Python 3 + display_name: Python 3 (ipykernel) language: python name: python3 --- - # LLN and CLT ## Overview -This lecture illustrates two of the most important theorems of probability and statistics: The -law of large numbers (LLN) and the central limit theorem (CLT). +This lecture illustrates two of the most important results in probability and statistics: + +1. the law of large numbers (LLN) and +2. the central limit theorem (CLT). -These beautiful theorems lie behind many of the most fundamental results in econometrics and quantitative economic modeling. +These beautiful theorems lie behind many of the most fundamental results in +econometrics and quantitative economic modeling. The lecture is based around simulations that show the LLN and CLT in action. -We also demonstrate how the LLN and CLT break down when the assumptions they are based on do not hold. +We also demonstrate how the LLN and CLT break down when the assumptions they +are based on do not hold. -This lecture will focus on the univariable case to provide the intuitions for proofs and the generalization to multivariate case [later](https://python.quantecon.org/lln_clt.html#the-multivariate-case). +This lecture will focus on the univariate case (the multivariate case is treated [in a more advanced lecture](https://python.quantecon.org/lln_clt.html#the-multivariate-case)). We'll need the following imports: @@ -34,14 +39,8 @@ import numpy as np import scipy.stats as st ``` -## Relationships - -The LLN gives conditions under which sample moments converge to population moments as sample size increases. - -The CLT provides information about the rate at which sample moments converge to population moments as sample size increases. - (lln_mr)= -## LLN +## The Law of Large Numbers ```{index} single: Law of Large Numbers ``` @@ -60,12 +59,15 @@ This means that $X$ takes values in $\{0,1\}$ and $\mathbb P\{X=1\} = p$. We can think of drawing $X$ as tossing a biased coin where * the coin falls on "heads" with probability $p$ and -* we set $X=1$ if the coin is "heads" and zero otherwise. +* the coin falls on "tails" with probability $1-p$ -The mean of $X$ is +We set $X=1$ if the coin is "heads" and zero otherwise. + +The (population) mean of $X$ is $$ -\mathbb E X = 0 \cdot \mathbb P\{X=0\} + 1 \cdot \mathbb P\{X=1\} = \mathbb P\{X=1\} = p + \mathbb E X + = 0 \cdot \mathbb P\{X=0\} + 1 \cdot \mathbb P\{X=1\} = \mathbb P\{X=1\} = p $$ We can generate a draw of $X$ with `scipy.stats` (imported as `st`) as follows: @@ -76,7 +78,8 @@ X = st.bernoulli.rvs(p) print(X) ``` -In this setting, the LLN tells us if we flip the coin many times, the fraction of heads that we see will be close to $p$. +In this setting, the LLN tells us if we flip the coin many times, the fraction +of heads that we see will be close to the mean $p$. 
Let's check this: @@ -94,64 +97,77 @@ X_draws = st.bernoulli.rvs(p, size=n) print(X_draws.mean()) ``` -Let's connect this to the discussion above, where we said the sample average converges to the "population mean". +Let's connect this to the discussion above, where we said the sample average +converges to the "population mean". + +Think of $X_1, \ldots, X_n$ as independent flips of the coin. -The population mean is the mean in an infinite sample, which equals the true mean, or $\mathbb E X$. +The population mean is the mean in an infinite sample, which equals the +expectation $\mathbb E X$. The sample mean of the draws $X_1, \ldots, X_n$ is $$ -\bar X_n := \frac{1}{n} \sum_{i=1}^n X_i + \bar X_n := \frac{1}{n} \sum_{i=1}^n X_i $$ -which, in this case, is the fraction of draws that equal one (the number of heads divided by $n$). +In this case, it is the fraction of draws that equal one (the number of heads divided by $n$). Thus, the LLN tells us that for the Bernoulli trials above ```{math} :label: exp - -\bar X_n \to \mathbb E X = p -\qquad (n \to \infty) + \bar X_n \to \mathbb E X = p + \qquad (n \to \infty) ``` This is exactly what we illustrated in the code. + (lln_ksl)= ### Statement of the LLN Let's state the LLN more carefully. -The traditional version of the law of large numbers concerns independent and identically distributed (IID) random variables. - -Let $X_1, \ldots, X_n$ be independent and identically distributed random variables. +Let $X_1, \ldots, X_n$ be random variables, all of which have the same +distribution. These random variables can be continuous or discrete. -For simplicity we will assume they are continuous and we let $f$ denote their density function, so that, for any $i$ in $\{1, \ldots, n\}$ +For simplicity we will + +* assume they are continuous and +* let $f$ denote their common density function + +The last statement means that for any $i$ in $\{1, \ldots, n\}$ and any +numbers $a, b$, $$ \mathbb P\{a \leq X_i \leq b\} = \int_a^b f(x) dx $$ -(For the discrete case, we need to replace densities with probability mass functions and integrals with sums.) +(For the discrete case, we need to replace densities with probability mass +functions and integrals with sums.) + +Let $\mu$ denote the common mean of this sample. -Let $\mu$ denote the common mean of this sample: +Thus, for each $i$, $$ - \mu := \mathbb E X = \int_{-\infty}^{\infty} x f(x) dx + \mu := \mathbb E X_i = \int_{-\infty}^{\infty} x f(x) dx $$ -In addition, let +The sample mean is $$ -\bar X_n := \frac{1}{n} \sum_{i=1}^n X_i + \bar X_n := \frac{1}{n} \sum_{i=1}^n X_i $$ +The next theorem is called Kolmogorov's strong law of large numbers. ````{prf:theorem} -The law of large numbers (specifically, Kolmogorov's strong law) states that, if $\mathbb E |X|$ is finite, then +If $X_1, \ldots, X_n$ are IID and $\mathbb E |X|$ is finite, then ```{math} :label: lln_as @@ -160,40 +176,53 @@ The law of large numbers (specifically, Kolmogorov's strong law) states that, if ``` ```` +Here + +* IID means independent and identically distributed and +* $\mathbb E |X| = \int_{-\infty}^\infty |x| f(x) dx$ + + + + ### Comments on the Theorem -What does this last expression mean? +What does the probability one statement in the theorem mean? Let's think about it from a simulation perspective, imagining for a moment that -our computer can generate perfect random samples (which of course [it can't](https://en.wikipedia.org/wiki/Pseudorandom_number_generator)). 
+our computer can generate perfect random samples (although this [isn't strictly true](https://en.wikipedia.org/wiki/Pseudorandom_number_generator)). -Let's also imagine that we can generate infinite sequences so that the statement $\bar X_n \to \mu$ can be evaluated. +Let's also imagine that we can generate infinite sequences so that the +statement $\bar X_n \to \mu$ can be evaluated. -In this setting, {eq}`lln_as` should be interpreted as meaning that the probability of the computer producing a sequence where $\bar X_n \to \mu$ fails to occur -is zero. +In this setting, {eq}`lln_as` should be interpreted as meaning that the +probability of the computer producing a sequence where $\bar X_n \to \mu$ +fails to occur is zero. ### Illustration ```{index} single: Law of Large Numbers; Illustration ``` -Let's now illustrate the LLN using simulation. - -When we illustrate it, we will use a key idea: the sample mean $\bar X$ is itself a random variable. +Let's illustrate the LLN using simulation. -In a sense this is obvious but it can be easy to forget. +When we illustrate it, we will use a key idea: the sample mean $\bar X_n$ is +itself a random variable. -The reason $\bar X_n$ is a random variable is that it's a function of the random variables $X_1, \ldots, X_n$. +The reason $\bar X_n$ is a random variable is that it's a function of the +random variables $X_1, \ldots, X_n$. What we are going to do now is -1. Pick some distribution to draw each $X_i$ from -1. Set $n$ to some large number -1. Generate the draws $X_1, \ldots, X_n$ -1. Calculate the sample mean $\bar X_n$ and record its value in an array `sample_means` -1. Go to step 3 +1. pick some fixed distribution to draw each $X_i$ from +1. set $n$ to some large number -We will continue the loop over steps 3-4 a total of $m$ times, where $m$ is some large integer. +and then repeat the following three instructions. + +1. generate the draws $X_1, \ldots, X_n$ +1. calculate the sample mean $\bar X_n$ and record its value in an array `sample_means` +1. go to step 1. + +We will loop over these three steps $m$ times, where $m$ is some large integer. The array `sample_means` will now contain $m$ draws of the random variable $\bar X_n$. @@ -203,186 +232,185 @@ Moreover, if we repeat the exercise with a larger value of $n$, we should see th This is, in essence, what the LLN is telling us. -Let's run some simulations to visualize LLN +To implement these steps, we will use functions. -```{code-cell} ipython3 -def generate_histogram(X_distribution, n, m): - fig, ax = plt.subplots(figsize=(10, 6)) +Our first function generates a sample mean of size $n$ given a distribution. - def draw_means(X_distribution, n): +```{code-cell} ipython3 +def draw_means(X_distribution, # The distribution of each X_i + n): # The size of the sample mean - # Step 3: Generate n draws: X_1, ..., X_n + # Generate n draws: X_1, ..., X_n X_samples = X_distribution.rvs(size=n) - # Step 4: Calculate the sample mean + # Return the sample mean return np.mean(X_samples) - - # Step 5: Loop m times - sample_means = [draw_means(X_distribution, n) for i in range(m)] - print(f'The mean of sample mean is {round(np.mean(sample_means),2)}') - - # Generate a histogram - ax.hist(sample_means, bins=30, alpha=0.5, density=True) - mu = X_distribution.mean() - if not np.isnan(mu): - ax.axvline(x=mu, ls="--", lw=3, label=fr"$\mu = {mu}$") +``` + +Now we write a function to generate $m$ sample means and histogram them. 
+ +```{code-cell} ipython3 +def generate_histogram(X_distribution, n, m): + + # Compute m sample means + + sample_means = np.empty(m) + for j in range(m): + sample_means[j] = draw_means(X_distribution, n) + + # Generate a histogram + + fig, ax = plt.subplots() + ax.hist(sample_means, bins=30, alpha=0.5, density=True) + μ = X_distribution.mean() # Get the population mean + σ = X_distribution.std() # and the standard deviation + ax.axvline(x=μ, ls="--", c="k", label=fr"$\mu = {μ}$") - ax.set_xlim(min(sample_means), max(sample_means)) - ax.set_xlabel(r'$\bar X_n$', size=12) - ax.set_ylabel('density', size=12) - ax.legend() - plt.show() + ax.set_xlim(μ - σ, μ + σ) + ax.set_xlabel(r'$\bar X_n$', size=12) + ax.set_ylabel('density', size=12) + ax.legend() + plt.show() ``` +Now we call the function. + ```{code-cell} ipython3 -#Step 1: Pick some distribution to draw each $X_i$ from -#Step 2: Set $n$ to some large number -generate_histogram(st.norm(loc=5, scale=2), n=50_000, m=1000) +# pick a distribution to draw each $X_i$ from +X_distribution = st.norm(loc=5, scale=2) +# Call the function +generate_histogram(X_distribution, n=1_000, m=1000) ``` -We can see that the distribution of $\bar X$ is clustered around $\mathbb E X$ as expected. +We can see that the distribution of $\bar X$ is clustered around $\mathbb E X$ +as expected. + +Let's vary `n` to see how the distribution of the sample mean changes. + +We will use a violin plot to show the different distributions. -We can vary values for `n` to see how the distribution changes +Each distribution in the violin plot represents the distribution of $X_n$ for some $n$, calculated by simulation. ```{code-cell} ipython3 -def generate_multiple_hist(X_distribution, ns, m, log_scale=False): - _, ax = plt.subplots(figsize=(10, 6)) +def means_violin_plot(distribution, + ns = [1_000, 10_000, 100_000], + m = 10_000): - def draw_means(X_distribution, n): - X_samples = X_distribution.rvs(size=n) - return np.mean(X_samples) - + data = [] for n in ns: - sample_means = [draw_means(X_distribution, n) for i in range(m)] - if log_scale: - plt.xscale('symlog') - ax.hist(sample_means, bins=40, alpha=0.4, density=True, label=fr'$n = {n}$') + sample_means = [draw_means(distribution, n) for i in range(m)] + data.append(sample_means) - mu = X_distribution.mean() - if not np.isnan(mu): - ax.axvline(x=mu, ls="--", lw=3, label=fr"$\mu = {mu}$") + fig, ax = plt.subplots() + + ax.violinplot(data) + μ = distribution.mean() + ax.axhline(y=μ, ls="--", c="k", label=fr"$\mu = {μ}$") + + labels=[fr'$n = {n}$' for n in ns] + + ax.set_xticks(np.arange(1, len(labels) + 1), labels=labels) + ax.set_xlim(0.25, len(labels) + 0.75) + + + plt.subplots_adjust(bottom=0.15, wspace=0.05) - ax.set_xlim(min(sample_means), max(sample_means)) - ax.set_xlabel(r'$\bar X_n$', size=12) ax.set_ylabel('density', size=12) ax.legend() plt.show() ``` +Let's try with a normal distribution. + ```{code-cell} ipython3 -generate_multiple_hist(st.norm(loc=5, scale=2), - ns=[20_000, 50_000, 100_000], - m=10_000) +means_violin_plot(st.norm(loc=5, scale=2)) ``` -The histogram gradually converges to $\mu$ as the sample size n increases. +As $n$ gets large, more probability mass clusters around the population mean $\mu$. + +Now let's try with a Beta distribution. -You can imagine the result when extrapolating this trend for $n \to \infty$. +```{code-cell} ipython3 +means_violin_plot(st.beta(6, 6)) +``` +We get a similar result. 
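As one more check, any distribution with a finite mean should produce the same pattern. (The exponential distribution and the scale parameter used below are our own choice of example, not part of the original text.)

```{code-cell} ipython3
# One more check, this time with an exponential distribution
# (our choice of example; any distribution with a finite mean should work)
means_violin_plot(st.expon(scale=2))
```

Once again, probability mass concentrates around the population mean (equal to $2$ for this choice of scale) as $n$ gets large.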
+
++++

## Breaking the LLN

-We have to pay attention to the assumptions in the statement of the LLN when we apply it.
+We have to pay attention to the assumptions in the statement of the LLN.

-As indicated by {eq}`lln_as`, LLN can break when $\mathbb E |X|$ is not finite or is not well defined.
+If these assumptions do not hold, then the LLN might fail.

-We can demonstrate this using a simple simulation using a [Cauchy distribution](https://en.wikipedia.org/wiki/Cauchy_distribution) for which it does not have a well-defined $\mu$.
+### Infinite First Moment

+As indicated by the theorem, the LLN can break when $\mathbb E |X|$ is not finite.

-We lost the convergence we have seen before with normal distribution
+We can demonstrate this using the [Cauchy distribution](https://en.wikipedia.org/wiki/Cauchy_distribution).

-```{code-cell} ipython3
-fig, axes = plt.subplots(1, 2, figsize=(15, 6))
-
-def scattered_mean(distribution, burn_in, n, jump, ax, title, color, ylog=False):
-
-    #Set a jump to reduce simulation complexity
-    sample_means = [np.mean(distribution.rvs(size=i))
-                    for i in range(burn_in, n+1, jump)]
-
-    ax.scatter(range(burn_in, n+1, jump), sample_means, s=10, c=color)
-
-    #Change the y-axis to log scale if necessary
-    if ylog:
-        ax.set_yscale("symlog")
-    ax.set_title(title, size=10)
-    ax.set_xlabel(r"$n$", size=12)
-    ax.set_ylabel(r"$\bar X_n$", size=12)
-    yabs_max = max(ax.get_ylim())
-    ax.set_ylim(ymin=-yabs_max, ymax=yabs_max)
-    return ax
-
-scattered_mean(distribution=st.cauchy(),
-               burn_in=1000,
-               n=1_000_000,
-               ax=axes[0],
-               jump=2000,
-               title="Cauchy Distribution",
-               color='#1f77b4',
-               ylog=True)
-
-scattered_mean(distribution=st.norm(),
-               burn_in=1000,
-               n=1_000_000,
-               ax=axes[1],
-               jump=2000,
-               title="Normal Distribution",
-               color='#ff7f0e')
-
-fig.suptitle('Sample Mean with Different Sample Sizes')
-plt.show()
-```
+The Cauchy distribution has the following property:

-We find that unlike normal distribution, Cauchy distribution does not have the convergence that LLN implies.
+If $X_1, \ldots, X_n$ are IID and Cauchy, then so is $\bar X_n$.

-It is also not hard to conjecture that LLN can be broken when the independence assumption is violated.
+This means that the distribution of $\bar X_n$ does not eventually concentrate on a single number.

-Let's go through a very simple example where LLN fails with IID violated:
+Hence the LLN does not hold.

-Assume
+The LLN fails to hold here because the Cauchy distribution violates the
+assumption that $\mathbb E |X|$ is finite (for the Cauchy distribution,
+$\mathbb E |X| = \infty$).

-$$
-X_0 \sim N(0,1)
-$$
++++

+### Failure of the IID Condition
+
+The LLN can also fail to hold when the IID assumption is violated.

-In addition, assume
+For example, suppose that

$$
-X_t = X_{t-1} \quad \text{for} \quad t = 1, ..., n
+    X_0 \sim N(0,1)
+    \quad \text{and} \quad
+    X_i = X_{i-1} \quad \text{for} \quad i = 1, ..., n
$$

-We can then see that
+In this case,

$$
-\bar X_n := \frac{1}{n} \sum_{t=1}^n X_i = X_0 \sim N(0,1)
+    \bar X_n = \frac{1}{n} \sum_{i=1}^n X_i = X_0 \sim N(0,1)
$$

-Therefore, the distribution of the mean of $X$ follows $N(0,1)$.
+Therefore, the distribution of $\bar X_n$ is $N(0,1)$ for all $n$!

-However,
+Does this contradict the LLN, which says that the distribution of $\bar X_n$
+collapses to the single point $\mu$?

-$$
-\mathbb E X_t = \mathbb E X_0 = 0
-$$
+No, the LLN is correct --- the issue is that its assumptions are not
+satisfied.

-Since the distribution of $\bar X$ follows a standard normal distribution, but the expectation $\mathbb E X_t$ is a single number.
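We can confirm this numerically; the sketch below (the values of `m` and `n` are our own choices and play no special role) repeatedly draws the perfectly correlated sample and histograms the resulting sample means.

```{code-cell} ipython3
# Sketch: with X_1 = X_2 = ... = X_n = X_0, the sample mean is just X_0,
# so its distribution remains N(0, 1) no matter how large n is.
m = 10_000     # number of replications (our choice)
n = 1_000      # sample size (our choice; the histogram looks the same for any n)

sample_means = np.empty(m)
for j in range(m):
    X_0 = np.random.randn()      # X_0 ~ N(0, 1)
    X = np.full(n, X_0)          # X_i = X_{i-1} = ... = X_0
    sample_means[j] = X.mean()   # equals X_0 exactly

fig, ax = plt.subplots()
ax.hist(sample_means, bins=30, alpha=0.5, density=True)
ax.set_xlabel(r'$\bar X_n$', size=12)
ax.set_ylabel('density', size=12)
plt.show()
```

Increasing `n` does not change the picture: the histogram keeps tracing out the $N(0,1)$ density instead of collapsing to a point.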
+In particular, the sequence $X_1, \ldots, X_n$ is not independent. -This violates {eq}`exp`, and thus breaks LLN. ```{note} :name: iid_violation -Although in this case, the violation of IID breaks LLN, it is not always the case for correlated data. +Although in this case the violation of IID breaks the LLN, there *are* situations +where IID fails but the LLN still holds. We will show an example in the [exercise](lln_ex3). ``` -## CLT ++++ + +## Central Limit Theorem ```{index} single: Central Limit Theorem ``` -Next, we turn to the central limit theorem, which tells us about the distribution of the deviation between sample averages and population means. +Next, we turn to the central limit theorem (CLT), which tells us about the +distribution of the deviation between sample averages and population means. + ### Statement of the Theorem @@ -394,8 +422,8 @@ In the IID setting, it tells us the following: ````{prf:theorem} :label: statement_clt -If the sequence $X_1, \ldots, X_n$ is IID, with common mean -$\mu$ and common variance $\sigma^2 \in (0, \infty)$, then +If $X_1, \ldots, X_n$ is IID with common mean $\mu$ and common variance +$\sigma^2 \in (0, \infty)$, then ```{math} :label: lln_clt @@ -408,18 +436,17 @@ n \to \infty Here $\stackrel { d } {\to} N(0, \sigma^2)$ indicates [convergence in distribution](https://en.wikipedia.org/wiki/Convergence_of_random_variables#Convergence_in_distribution) to a centered (i.e., zero mean) normal with standard deviation $\sigma$. -### Intuition - -```{index} single: Central Limit Theorem; Intuition -``` The striking implication of the CLT is that for **any** distribution with finite [second moment](https://en.wikipedia.org/wiki/Moment_(mathematics)), the simple operation of adding independent copies **always** leads to a Gaussian curve. + + + ### Simulation 1 -Since the CLT seems almost magical, running simulations that verify its implications is one good way to build intuition. +Since the CLT seems almost magical, running simulations that verify its implications is one good way to build understanding. To this end, we now perform the following simulation @@ -434,6 +461,7 @@ $F(x) = 1 - e^{- \lambda x}$. (Please experiment with other choices of $F$, but remember that, to conform with the conditions of the CLT, the distribution must have a finite second moment.) (sim_one)= + ```{code-cell} ipython3 # Set parameters n = 250 # Choice of n @@ -464,7 +492,7 @@ ax.legend() plt.show() ``` -(Notice the absence of for loops --- every operation is vectorized, meaning that the major calculations are all shifted to optimized C code.) +(Notice the absence of for loops --- every operation is vectorized, meaning that the major calculations are all shifted to fast C code.) The fit to the normal density is already tight and can be further improved by increasing `n`. @@ -476,7 +504,7 @@ The fit to the normal density is already tight and can be further improved by in ```{exercise} :label: lln_ex1 -Repeat the simulation in [simulation 1](sim_one) with [beta distribution](https://en.wikipedia.org/wiki/Beta_distribution). +Repeat the simulation [above1](sim_one) with the [Beta distribution](https://en.wikipedia.org/wiki/Beta_distribution). You can choose any $\alpha > 0$ and $\beta > 0$. ``` @@ -519,7 +547,11 @@ plt.show() ````{exercise} :label: lln_ex2 -Although NumPy doesn't give us a `bernoulli` function, we can generate a draw of $X$ using NumPy via +At the start of this lecture we discussed Bernoulli random variables. 
+NumPy doesn't provide a `bernoulli` function that we can sample from.
+
+However, we can generate a draw of Bernoulli $X$ using NumPy via

```python3
U = np.random.rand()
@@ -534,7 +566,9 @@ Explain why this provides a random variable $X$ with the right distribution.
:class: dropdown
```

-We can write $X$ as $X = \mathbf 1\{U < p\}$ where $\mathbf 1$ is the [indicator function](https://en.wikipedia.org/wiki/Indicator_function) (i.e., 1 if the statement is true and zero otherwise).
+We can write $X$ as $X = \mathbf 1\{U < p\}$ where $\mathbf 1$ is the
+[indicator function](https://en.wikipedia.org/wiki/Indicator_function) (i.e.,
+1 if the statement is true and zero otherwise).

Here we generated a uniform draw $U$ on $[0,1]$ and then used the fact that
@@ -556,22 +590,29 @@ We mentioned above that LLN can still hold sometimes when IID is violated.

Let's investigate this claim further.

-Assume we have a AR(1) process as below:
+Consider the AR(1) process

$$
-X_{t+1} = \alpha + \beta X_t + \sigma \epsilon _{t+1}
+    X_{t+1} = \alpha + \beta X_t + \sigma \epsilon _{t+1}
$$

-and
+where $\alpha, \beta, \sigma$ are constants and $\epsilon_1, \epsilon_2,
+\ldots$ are IID and standard normal.
+
+Suppose that

$$
-X_0 \sim N \left(\frac{\alpha}{1-\beta}, \frac{\sigma^2}{1-\beta^2}\right)
+    X_0 \sim N \left(\frac{\alpha}{1-\beta}, \frac{\sigma^2}{1-\beta^2}\right)
$$

-where $\epsilon_t \sim N(0,1)$
+This process violates the independence assumption of the LLN
+(since $X_{t+1}$ depends on the value of $X_t$).
+
+However, the rest of this exercise shows that LLN-type convergence of the
+sample mean to the population mean still occurs.

-1. Prove this process violated the independence assumption but not the identically distributed assumption;
-2. Show LLN holds using simulations with $\alpha = 0.8$, $\beta = 0.2$.
+1. Prove that the sequence $X_1, X_2, \ldots$ is identically distributed.
+2. Show that LLN convergence holds using simulations with $\alpha = 0.8$, $\beta = 0.2$.

```
@@ -581,43 +622,46 @@ where $\epsilon_t \sim N(0,1)$

**Q1 Solution**

-Given $X_{t+1}$ is dependent on the value of $X_t$, this process is not independent.
+Regarding part 1, we claim that $X_t$ has the same distribution as $X_0$ for
+all $t$.

-To check whether it is identically distributed, we need to check whether the distribution in $T={0...n}$
+To construct a proof, we suppose that the claim is true for $X_t$.

-Let's verify the expectation and variance of this AR(1) process using pen and paper first.
+Now we claim it is also true for $X_{t+1}$.
+
+Observe that we have the correct mean:

$$
\begin{aligned}
-\mathbb E X_{t+1} &= \alpha + \beta \mathbb E X_t \\
-&= \alpha + \beta \frac{\alpha}{1-\beta} \\
-&= \frac{\alpha}{1-\beta}
+    \mathbb E X_{t+1} &= \alpha + \beta \mathbb E X_t \\
+    &= \alpha + \beta \frac{\alpha}{1-\beta} \\
+    &= \frac{\alpha}{1-\beta}
\end{aligned}
$$

+We also have the correct variance:

$$
\begin{aligned}
-\mathrm{Var}(X_{t+1}) &= \beta^2 \mathrm{Var}(X_{t}) + \sigma^2\\
-&= \frac{\beta^2\sigma^2}{1-\beta^2} + \sigma^2 \\
-&= \frac{\sigma^2}{1-\beta^2}
+    \mathrm{Var}(X_{t+1}) &= \beta^2 \mathrm{Var}(X_{t}) + \sigma^2\\
+    &= \frac{\beta^2\sigma^2}{1-\beta^2} + \sigma^2 \\
+    &= \frac{\sigma^2}{1-\beta^2}
\end{aligned}
$$

-We find that expectation and variance are the same $t = 0, ..., n$.
-
-Given both $X_0$ and $\epsilon _{0}$ are normally distributed and independent from each other, the weighted sum is also normally distributed.
+Finally, since both $X_t$ and $\epsilon_0$ are normally distributed and +independent from each other, any linear combinary of these two variables is +also normally distributed. -This holds true for all $X_t$ and $\epsilon _{t}$ where $t = 0, ..., n$ - -Therefore, +We have now shown that $$ -X_t \sim N \left(\frac{\alpha}{1-\beta}, \frac{\sigma^2}{1-\beta^2}\right) \quad t = 0, ..., n + X_{t+1} \sim + N \left(\frac{\alpha}{1-\beta}, \frac{\sigma^2}{1-\beta^2}\right) $$ - -We can conclude this AR(1) process violates the independence assumption but is identically distributed. +We can conclude this AR(1) process violates the independence assumption but is +identically distributed. **Q2 Solution**