diff --git a/lectures/ak2.md b/lectures/ak2.md index e1a516725..cdcf78c0b 100644 --- a/lectures/ak2.md +++ b/lectures/ak2.md @@ -209,7 +209,7 @@ Units of the rental rates are: * for $r_t$, output at time $t$ per unit of capital at time $t$ -We take output at time $t$ as *numeraire*, so the price of output at time $t$ is one. +We take output at time $t$ as **numeraire**, so the price of output at time $t$ is one. The firm's profits at time $t$ are diff --git a/lectures/cake_eating_stochastic.md b/lectures/cake_eating_stochastic.md index ff1187f0b..246b69b54 100644 --- a/lectures/cake_eating_stochastic.md +++ b/lectures/cake_eating_stochastic.md @@ -164,13 +164,13 @@ In summary, the agent's aim is to select a path $c_0, c_1, c_2, \ldots$ for cons 1. nonnegative, 1. feasible in the sense of {eq}`outcsdp0`, 1. optimal, in the sense that it maximizes {eq}`texs0_og2` relative to all other feasible consumption sequences, and -1. *adapted*, in the sense that the action $c_t$ depends only on +1. **adapted**, in the sense that the action $c_t$ depends only on observable outcomes, not on future outcomes such as $\xi_{t+1}$. In the present context -* $x_t$ is called the *state* variable --- it summarizes the "state of the world" at the start of each period. -* $c_t$ is called the *control* variable --- a value chosen by the agent each period after observing the state. +* $x_t$ is called the **state** variable --- it summarizes the "state of the world" at the start of each period. +* $c_t$ is called the **control** variable --- a value chosen by the agent each period after observing the state. ### The Policy Function Approach diff --git a/lectures/cake_eating_time_iter.md b/lectures/cake_eating_time_iter.md index 21f30141f..9fe5d4ad9 100644 --- a/lectures/cake_eating_time_iter.md +++ b/lectures/cake_eating_time_iter.md @@ -237,7 +237,7 @@ whenever $\sigma \in \mathscr P$. It is possible to prove that there is a tight relationship between iterates of $K$ and iterates of the Bellman operator. -Mathematically, the two operators are *topologically conjugate*. +Mathematically, the two operators are **topologically conjugate**. Loosely speaking, this means that if iterates of one operator converge then so do iterates of the other, and vice versa. 
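A toy computation makes the conjugacy idea concrete. The maps below are made-up stand-ins (a scalar contraction $T$ and the homeomorphism $\varphi = \exp$), not the lecture's operators; the point is only that when $K = \varphi \circ T \circ \varphi^{-1}$, every iterate of $K$ is the image under $\varphi$ of the corresponding iterate of $T$, so the two sequences converge or diverge together.

```python
import numpy as np

# Made-up illustration of topological conjugacy: K = phi ∘ T ∘ phi⁻¹
T = lambda v: 0.5 * v + 1.0        # a contraction on R with fixed point 2.0
phi, phi_inv = np.exp, np.log      # a homeomorphism from R onto (0, ∞)
K = lambda u: phi(T(phi_inv(u)))   # the conjugate operator

v, u = 10.0, phi(10.0)             # start K at the image of T's start point
for _ in range(50):
    v, u = T(v), K(u)

print(v, phi_inv(u))               # both sequences settle at the fixed point 2.0
```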
diff --git a/lectures/career.md b/lectures/career.md
index 63cb10626..c8fed1268 100644
--- a/lectures/career.md
+++ b/lectures/career.md
@@ -66,8 +66,8 @@ from matplotlib import cm
 
 In what follows we distinguish between a career and a job, where
 
-* a *career* is understood to be a general field encompassing many possible jobs, and
-* a *job* is understood to be a position with a particular firm
+* a **career** is understood to be a general field encompassing many possible jobs, and
+* a **job** is understood to be a position with a particular firm
 
 For workers, wages can be decomposed into the contribution of job and career
 
diff --git a/lectures/cass_fiscal.md b/lectures/cass_fiscal.md
index fdf5c274d..b7e3646df 100644
--- a/lectures/cass_fiscal.md
+++ b/lectures/cass_fiscal.md
@@ -147,8 +147,8 @@ $$ (eq:gov_budget)
 
 Given a budget-feasible government policy $\{g_t\}_{t=0}^\infty$ and $\{\tau_{ct}, \tau_{kt}, \tau_{nt}, \tau_{ht}\}_{t=0}^\infty$ subject to {eq}`eq:gov_budget`,
 
-- *Household* chooses $\{c_t\}_{t=0}^\infty$, $\{n_t\}_{t=0}^\infty$, and $\{k_{t+1}\}_{t=0}^\infty$ to maximize utility{eq}`eq:utility` subject to budget constraint{eq}`eq:house_budget`, and
-- *Frim* chooses sequences of capital $\{k_t\}_{t=0}^\infty$ and $\{n_t\}_{t=0}^\infty$ to maximize profits
+- **Household** chooses $\{c_t\}_{t=0}^\infty$, $\{n_t\}_{t=0}^\infty$, and $\{k_{t+1}\}_{t=0}^\infty$ to maximize utility {eq}`eq:utility` subject to budget constraint {eq}`eq:house_budget`, and
+- **Firm** chooses sequences of capital $\{k_t\}_{t=0}^\infty$ and $\{n_t\}_{t=0}^\infty$ to maximize profits
 
 $$
 \sum_{t=0}^\infty q_t [F(k_t, n_t) - \eta_t k_t - w_t n_t]
 
diff --git a/lectures/kalman.md b/lectures/kalman.md
index a516a8eb2..fa089320f 100644
--- a/lectures/kalman.md
+++ b/lectures/kalman.md
@@ -85,7 +85,7 @@ One way to summarize our knowledge is a point prediction $\hat x$
 * Then it is better to summarize our initial beliefs with a bivariate probability density $p$
 * $\int_E p(x)dx$ indicates the probability that we attach to the missile being in region $E$.
 
-The density $p$ is called our *prior* for the random variable $x$.
+The density $p$ is called our **prior** for the random variable $x$.
 
 To keep things tractable in our example, we assume that our prior is Gaussian.
 
@@ -317,7 +317,7 @@ We have obtained probabilities for the current location of the state (missile) g
 
 This is called "filtering" rather than forecasting because we are filtering out noise rather than looking into the future.
 
-* $p(x \,|\, y) = N(\hat x^F, \Sigma^F)$ is called the *filtering distribution*
+* $p(x \,|\, y) = N(\hat x^F, \Sigma^F)$ is called the **filtering distribution**
 
 But now let's suppose that we are given another task: to predict the location of the missile after one unit of time (whatever that may be) has elapsed.
 
@@ -331,7 +331,7 @@ Let's suppose that we have one, and that it's linear and Gaussian. In particular
 x_{t+1} = A x_t + w_{t+1}, \quad \text{where} \quad w_t \sim N(0, Q)
 ```
 
-Our aim is to combine this law of motion and our current distribution $p(x \,|\, y) = N(\hat x^F, \Sigma^F)$ to come up with a new *predictive* distribution for the location in one unit of time.
+Our aim is to combine this law of motion and our current distribution $p(x \,|\, y) = N(\hat x^F, \Sigma^F)$ to come up with a new **predictive** distribution for the location in one unit of time.
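As a preview of the derivation that follows, the answer will be another Gaussian, with mean $A \hat x^F$ and covariance $A \Sigma^F A' + Q$. A minimal sketch of this predict step, with every number made up for illustration:

```python
import numpy as np

# Predict step sketch: push N(x_hat_F, Sigma_F) through
# x_{t+1} = A x_t + w_{t+1} with w ~ N(0, Q).
# All numerical values below are made-up placeholders.
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
Q = 0.3 * np.eye(2)

x_hat_F = np.array([0.5, -0.2])            # filtering mean
Sigma_F = np.array([[0.4, 0.1],
                    [0.1, 0.3]])           # filtering covariance

x_hat_new = A @ x_hat_F                    # mean of A x^F + w
Sigma_new = A @ Sigma_F @ A.T + Q          # covariance of A x^F + w

print(x_hat_new)
print(Sigma_new)
```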
In view of {eq}`kl_xdynam`, all we have to do is introduce a random vector $x^F \sim N(\hat x^F, \Sigma^F)$ and work out the distribution of $A x^F + w$ where $w$ is independent of $x^F$ and has distribution $N(0, Q)$. @@ -356,7 +356,7 @@ $$ $$ The matrix $A \Sigma G' (G \Sigma G' + R)^{-1}$ is often written as -$K_{\Sigma}$ and called the *Kalman gain*. +$K_{\Sigma}$ and called the **Kalman gain**. * The subscript $\Sigma$ has been added to remind us that $K_{\Sigma}$ depends on $\Sigma$, but not $y$ or $\hat x$. @@ -373,7 +373,7 @@ Our updated prediction is the density $N(\hat x_{new}, \Sigma_{new})$ where \end{aligned} ``` -* The density $p_{new}(x) = N(\hat x_{new}, \Sigma_{new})$ is called the *predictive distribution* +* The density $p_{new}(x) = N(\hat x_{new}, \Sigma_{new})$ is called the **predictive distribution** The predictive distribution is the new density shown in the following figure, where the update has used parameters. diff --git a/lectures/likelihood_bayes.md b/lectures/likelihood_bayes.md index c9b203f11..da81a560f 100644 --- a/lectures/likelihood_bayes.md +++ b/lectures/likelihood_bayes.md @@ -129,8 +129,8 @@ $$ where we use the conventions that $f(w^t) = f(w_1) f(w_2) \ldots f(w_t)$ and $g(w^t) = g(w_1) g(w_2) \ldots g(w_t)$. -Notice that the likelihood process satisfies the *recursion* or -*multiplicative decomposition* +Notice that the likelihood process satisfies the **recursion** or +**multiplicative decomposition** $$ L(w^t) = \ell (w_t) L (w^{t-1}) . diff --git a/lectures/linear_algebra.md b/lectures/linear_algebra.md index 9e89b2a33..33f3dc53e 100644 --- a/lectures/linear_algebra.md +++ b/lectures/linear_algebra.md @@ -85,7 +85,7 @@ from scipy.linalg import inv, solve, det, eig ```{index} single: Linear Algebra; Vectors ``` -A *vector* of length $n$ is just a sequence (or array, or tuple) of $n$ numbers, which we write as $x = (x_1, \ldots, x_n)$ or $x = [x_1, \ldots, x_n]$. +A **vector** of length $n$ is just a sequence (or array, or tuple) of $n$ numbers, which we write as $x = (x_1, \ldots, x_n)$ or $x = [x_1, \ldots, x_n]$. We will write these sequences either horizontally or vertically as we please. @@ -225,15 +225,15 @@ x + y ```{index} single: Vectors; Norm ``` -The *inner product* of vectors $x,y \in \mathbb R ^n$ is defined as +The **inner product** of vectors $x,y \in \mathbb R ^n$ is defined as $$ x' y := \sum_{i=1}^n x_i y_i $$ -Two vectors are called *orthogonal* if their inner product is zero. +Two vectors are called **orthogonal** if their inner product is zero. -The *norm* of a vector $x$ represents its "length" (i.e., its distance from the zero vector) and is defined as +The **norm** of a vector $x$ represents its "length" (i.e., its distance from the zero vector) and is defined as $$ \| x \| := \sqrt{x' x} := \left( \sum_{i=1}^n x_i^2 \right)^{1/2} @@ -273,7 +273,7 @@ np.linalg.norm(x) # Norm of x, take three Given a set of vectors $A := \{a_1, \ldots, a_k\}$ in $\mathbb R ^n$, it's natural to think about the new vectors we can create by performing linear operations. -New vectors created in this manner are called *linear combinations* of $A$. +New vectors created in this manner are called **linear combinations** of $A$. In particular, $y \in \mathbb R ^n$ is a linear combination of $A := \{a_1, \ldots, a_k\}$ if @@ -282,9 +282,9 @@ y = \beta_1 a_1 + \cdots + \beta_k a_k \text{ for some scalars } \beta_1, \ldots, \beta_k $$ -In this context, the values $\beta_1, \ldots, \beta_k$ are called the *coefficients* of the linear combination. 
+In this context, the values $\beta_1, \ldots, \beta_k$ are called the **coefficients** of the linear combination.
 
-The set of linear combinations of $A$ is called the *span* of $A$.
+The set of linear combinations of $A$ is called the **span** of $A$.
 
 The next figure shows the span of $A = \{a_1, a_2\}$ in $\mathbb R ^3$.
 
@@ -349,7 +349,7 @@ plt.show()
 
 If $A$ contains only one vector $a_1 \in \mathbb R ^2$, then its span is just the scalar multiples of $a_1$, which is the unique line passing through both $a_1$ and the origin.
 
-If $A = \{e_1, e_2, e_3\}$ consists of the *canonical basis vectors* of $\mathbb R ^3$, that is
+If $A = \{e_1, e_2, e_3\}$ consists of the **canonical basis vectors** of $\mathbb R ^3$, that is
 
 $$
 e_1 :=
 
@@ -399,8 +399,8 @@ The condition we need for a set of vectors to have a large span is what's called
 
 In particular, a collection of vectors $A := \{a_1, \ldots, a_k\}$ in $\mathbb R ^n$ is said to be
 
-* *linearly dependent* if some strict subset of $A$ has the same span as $A$.
-* *linearly independent* if it is not linearly dependent.
+* **linearly dependent** if some strict subset of $A$ has the same span as $A$.
+* **linearly independent** if it is not linearly dependent.
 
 Put differently, a set of vectors is linearly independent if no vector is redundant to the span and linearly dependent otherwise.
 
@@ -469,19 +469,19 @@ Often, the numbers in the matrix represent coefficients in a system of linear eq
 
 For obvious reasons, the matrix $A$ is also called a vector if either $n = 1$ or $k = 1$.
 
-In the former case, $A$ is called a *row vector*, while in the latter it is called a *column vector*.
+In the former case, $A$ is called a **row vector**, while in the latter it is called a **column vector**.
 
-If $n = k$, then $A$ is called *square*.
+If $n = k$, then $A$ is called **square**.
 
-The matrix formed by replacing $a_{ij}$ by $a_{ji}$ for every $i$ and $j$ is called the *transpose* of $A$ and denoted $A'$ or $A^{\top}$.
+The matrix formed by replacing $a_{ij}$ by $a_{ji}$ for every $i$ and $j$ is called the **transpose** of $A$ and denoted $A'$ or $A^{\top}$.
 
-If $A = A'$, then $A$ is called *symmetric*.
+If $A = A'$, then $A$ is called **symmetric**.
 
-For a square matrix $A$, the $i$ elements of the form $a_{ii}$ for $i=1,\ldots,n$ are called the *principal diagonal*.
+For a square matrix $A$, the $n$ elements of the form $a_{ii}$ for $i=1,\ldots,n$ are called the **principal diagonal**.
 
-$A$ is called *diagonal* if the only nonzero entries are on the principal diagonal.
+$A$ is called **diagonal** if the only nonzero entries are on the principal diagonal.
 
-If, in addition to being diagonal, each element along the principal diagonal is equal to 1, then $A$ is called the *identity matrix* and denoted by $I$.
+If, in addition to being diagonal, each element along the principal diagonal is equal to 1, then $A$ is called the **identity matrix** and denoted by $I$.
 
 ### Matrix Operations
 
@@ -641,9 +641,9 @@ See [here](https://python-programming.quantecon.org/numpy.html#matrix-multiplica
 
 Each $n \times k$ matrix $A$ can be identified with a function $f(x) = Ax$ that maps $x \in \mathbb R ^k$ into $y = Ax \in \mathbb R ^n$.
 
-These kinds of functions have a special property: they are *linear*.
+These kinds of functions have a special property: they are **linear**.
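The defining property, restated formally just below, is easy to spot-check numerically; a quick sketch with made-up values of $A$, $x$, $y$, $\alpha$ and $\beta$:

```python
import numpy as np

# Spot-check of linearity, f(αx + βy) = αf(x) + βf(y), for f(x) = Ax.
# All values are arbitrary and made up.
A = np.array([[1.0, 2.0, 0.0],
              [0.5, -1.0, 3.0]])    # a 2 x 3 matrix, so f maps R^3 to R^2
f = lambda x: A @ x

x = np.array([1.0, -2.0, 0.5])
y = np.array([3.0, 0.0, 1.0])
alpha, beta = 2.5, -0.7

print(np.allclose(f(alpha * x + beta * y),
                  alpha * f(x) + beta * f(y)))   # True
```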
-A function $f \colon \mathbb R ^k \to \mathbb R ^n$ is called *linear* if, for all $x, y \in \mathbb R ^k$ and all scalars $\alpha, \beta$, we have +A function $f \colon \mathbb R ^k \to \mathbb R ^n$ is called **linear** if, for all $x, y \in \mathbb R ^k$ and all scalars $\alpha, \beta$, we have $$ f(\alpha x + \beta y) = \alpha f(x) + \beta f(y) @@ -773,7 +773,7 @@ In particular, the following are equivalent 1. The columns of $A$ are linearly independent. 1. For any $y \in \mathbb R ^n$, the equation $y = Ax$ has a unique solution. -The property of having linearly independent columns is sometimes expressed as having *full column rank*. +The property of having linearly independent columns is sometimes expressed as having **full column rank**. #### Inverse Matrices @@ -788,7 +788,7 @@ solution is $x = A^{-1} y$. A similar expression is available in the matrix case. In particular, if square matrix $A$ has full column rank, then it possesses a multiplicative -*inverse matrix* $A^{-1}$, with the property that $A A^{-1} = A^{-1} A = I$. +**inverse matrix** $A^{-1}$, with the property that $A A^{-1} = A^{-1} A = I$. As a consequence, if we pre-multiply both sides of $y = Ax$ by $A^{-1}$, we get $x = A^{-1} y$. @@ -800,11 +800,11 @@ This is the solution that we're looking for. ``` Another quick comment about square matrices is that to every such matrix we -assign a unique number called the *determinant* of the matrix --- you can find +assign a unique number called the **determinant** of the matrix --- you can find the expression for it [here](https://en.wikipedia.org/wiki/Determinant). If the determinant of $A$ is not zero, then we say that $A$ is -*nonsingular*. +**nonsingular**. Perhaps the most important fact about determinants is that $A$ is nonsingular if and only if $A$ is of full column rank. @@ -929,8 +929,8 @@ $$ A v = \lambda v $$ -then we say that $\lambda$ is an *eigenvalue* of $A$, and -$v$ is an *eigenvector*. +then we say that $\lambda$ is an **eigenvalue** of $A$, and +$v$ is an **eigenvector**. Thus, an eigenvector of $A$ is a vector such that when the map $f(x) = Ax$ is applied, $v$ is merely scaled. @@ -1034,7 +1034,7 @@ to one. ### Generalized Eigenvalues -It is sometimes useful to consider the *generalized eigenvalue problem*, which, for given +It is sometimes useful to consider the **generalized eigenvalue problem**, which, for given matrices $A$ and $B$, seeks generalized eigenvalues $\lambda$ and eigenvectors $v$ such that @@ -1076,10 +1076,10 @@ $$ $$ The norms on the right-hand side are ordinary vector norms, while the norm on -the left-hand side is a *matrix norm* --- in this case, the so-called -*spectral norm*. +the left-hand side is a **matrix norm** --- in this case, the so-called +**spectral norm**. -For example, for a square matrix $S$, the condition $\| S \| < 1$ means that $S$ is *contractive*, in the sense that it pulls all vectors towards the origin [^cfn]. +For example, for a square matrix $S$, the condition $\| S \| < 1$ means that $S$ is **contractive**, in the sense that it pulls all vectors towards the origin [^cfn]. (la_neumann)= #### {index}`Neumann's Theorem ` @@ -1112,7 +1112,7 @@ $$ \rho(A) = \lim_{k \to \infty} \| A^k \|^{1/k} $$ -Here $\rho(A)$ is the *spectral radius*, defined as $\max_i |\lambda_i|$, where $\{\lambda_i\}_i$ is the set of eigenvalues of $A$. +Here $\rho(A)$ is the **spectral radius**, defined as $\max_i |\lambda_i|$, where $\{\lambda_i\}_i$ is the set of eigenvalues of $A$. 
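A quick numerical check of Gelfand's formula, using a made-up matrix and the spectral norm; the $k$-th root estimates approach the spectral radius as $k$ grows:

```python
import numpy as np

# Gelfand's formula: rho(A) = lim_k ||A^k||^(1/k).  A is made up.
A = np.array([[0.6, 0.4],
              [0.2, 0.5]])

rho = max(abs(np.linalg.eigvals(A)))   # spectral radius, max_i |lambda_i|

for k in (1, 10, 100):
    # spectral norm of A^k, taken to the power 1/k
    approx = np.linalg.norm(np.linalg.matrix_power(A, k), 2) ** (1 / k)
    print(k, approx)

print("rho(A) =", rho)   # the estimates above approach this value
```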
As a consequence of Gelfand's formula, if all eigenvalues are strictly less than one in modulus, there exists a $k$ with $\| A^k \| < 1$. @@ -1128,8 +1128,8 @@ Let $A$ be a symmetric $n \times n$ matrix. We say that $A$ is -1. *positive definite* if $x' A x > 0$ for every $x \in \mathbb R ^n \setminus \{0\}$ -1. *positive semi-definite* or *nonnegative definite* if $x' A x \geq 0$ for every $x \in \mathbb R ^n$ +1. **positive definite** if $x' A x > 0$ for every $x \in \mathbb R ^n \setminus \{0\}$ +1. **positive semi-definite** or **nonnegative definite** if $x' A x \geq 0$ for every $x \in \mathbb R ^n$ Analogous definitions exist for negative definite and negative semi-definite matrices. diff --git a/lectures/linear_models.md b/lectures/linear_models.md index 43ae68038..feb76976a 100644 --- a/lectures/linear_models.md +++ b/lectures/linear_models.md @@ -112,7 +112,7 @@ The primitives of the model are Given $A, C, G$ and draws of $x_0$ and $w_1, w_2, \ldots$, the model {eq}`st_space_rep` pins down the values of the sequences $\{x_t\}$ and $\{y_t\}$. -Even without these draws, the primitives 1--3 pin down the *probability distributions* of $\{x_t\}$ and $\{y_t\}$. +Even without these draws, the primitives 1--3 pin down the **probability distributions** of $\{x_t\}$ and $\{y_t\}$. Later we'll see how to compute these distributions and their moments. @@ -259,7 +259,7 @@ C = \begin{bmatrix} \end{bmatrix} $$ -The matrix $A$ has the form of the *companion matrix* to the vector +The matrix $A$ has the form of the **companion matrix** to the vector $\begin{bmatrix}\phi_1 & \phi_2 & \phi_3 & \phi_4 \end{bmatrix}$. The next figure shows the dynamics of this process when @@ -301,7 +301,7 @@ Now suppose that * $\phi_j$ is a $k \times k$ matrix and * $w_t$ is $k \times 1$ -Then {eq}`eq_ar_rep` is termed a *vector autoregression*. +Then {eq}`eq_ar_rep` is termed a **vector autoregression**. To map this into {eq}`st_space_rep`, we set @@ -345,8 +345,8 @@ where $I$ is the $k \times k$ identity matrix and $\sigma$ is a $k \times k$ mat We can use {eq}`st_space_rep` to represent -1. the *deterministic seasonal* $y_t = y_{t-4}$ -1. the *indeterministic seasonal* $y_t = \phi_4 y_{t-4} + w_t$ +1. the **deterministic seasonal** $y_t = y_{t-4}$ +1. the **indeterministic seasonal** $y_t = \phi_4 y_{t-4} + w_t$ In fact, both are special cases of {eq}`eq_ar_rep`. @@ -376,7 +376,7 @@ The *indeterministic* seasonal produces recurrent, but aperiodic, seasonal fluct ```{index} single: Linear State Space Models; Time Trends ``` -The model $y_t = a t + b$ is known as a *linear time trend*. +The model $y_t = a t + b$ is known as a **linear time trend**. We can represent this model in the linear state space form by taking @@ -462,7 +462,7 @@ $x_0, w_1, w_2, \ldots, w_t$ can be found by using {eq}`st_space_rep` repeatedl \end{aligned} ``` -Representation {eq}`eqob5` is a *moving average* representation. +Representation {eq}`eqob5` is a **moving average** representation. It expresses $\{x_t\}$ as a linear function of @@ -503,7 +503,7 @@ The first term on the right is a cumulated sum of martingale differences and is The second term is a translated linear function of time. -For this reason, $x_{1t}$ is called a *martingale with drift*. +For this reason, $x_{1t}$ is called a **martingale with drift**. 
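A short simulation sketch: with a made-up pair $(A, C)$ in which $A$ has only unit eigenvalues and the second state component is the constant 1, the first component follows a martingale with drift:

```python
import numpy as np

# Simulate x_{t+1} = A x_t + C w_{t+1} for a made-up (A, C) pair.
# A has a (double) unit eigenvalue, so x_{1,t+1} = x_{1,t} + 0.5 + 0.2 w_{t+1}:
# a martingale with drift 0.5 per period.
rng = np.random.default_rng(42)

A = np.array([[1.0, 0.5],
              [0.0, 1.0]])    # second state is the constant 1
C = np.array([[0.2],
              [0.0]])

T = 250
x = np.zeros((2, T))
x[:, 0] = [0.0, 1.0]
for t in range(T - 1):
    w = rng.standard_normal((1,))
    x[:, t + 1] = A @ x[:, t] + C @ w

print(x[0, -1])   # wanders around the deterministic trend 0.5 * t
```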
## Distributions and Moments @@ -548,8 +548,8 @@ As with $\mu_0$, the matrix $\Sigma_0$ is a primitive given in {eq}`st_space_rep As a matter of terminology, we will sometimes call -* $\mu_t$ the *unconditional mean* of $x_t$ -* $\Sigma_t$ the *unconditional variance-covariance matrix* of $x_t$ +* $\mu_t$ the **unconditional mean** of $x_t$ +* $\Sigma_t$ the **unconditional variance-covariance matrix** of $x_t$ This is to distinguish $\mu_t$ and $\Sigma_t$ from related objects that use conditioning information, to be defined below. @@ -763,8 +763,8 @@ In the preceding figure, we approximated the population distribution of $y_T$ by 1. recording each observation $y^i_T$ 1. histogramming this sample -Just as the histogram approximates the population distribution, the *ensemble* or -*cross-sectional average* +Just as the histogram approximates the population distribution, the **ensemble** or +**cross-sectional average** $$ \bar y_T := \frac{1}{I} \sum_{i=1}^I y_T^i @@ -870,7 +870,7 @@ $$ #### Autocovariance Functions -An important object related to the joint distribution is the *autocovariance function* +An important object related to the joint distribution is the **autocovariance function** ```{math} :label: eqnautodeff @@ -958,11 +958,11 @@ the distribution at $T$. Apparently, the distributions of $y_t$ converge to a fixed long-run distribution as $t \to \infty$. -When such a distribution exists it is called a *stationary distribution*. +When such a distribution exists it is called a **stationary distribution**. ### Stationary Distributions -In our setting, a distribution $\psi_{\infty}$ is said to be *stationary* for $x_t$ if +In our setting, a distribution $\psi_{\infty}$ is said to be **stationary** for $x_t$ if $$ x_t \sim \psi_{\infty} @@ -1016,7 +1016,7 @@ Moreover, in view of {eq}`eqnautocov`, the autocovariance function takes the for This motivates the following definition. -A process $\{x_t\}$ is said to be *covariance stationary* if +A process $\{x_t\}$ is said to be **covariance stationary** if * both $\mu_t$ and $\Sigma_t$ are constant in $t$ * $\Sigma_{t+j,t}$ depends on the time gap $j$ but not on time $t$ @@ -1246,7 +1246,7 @@ $$ The right-hand side follows from $x_{t+1} = A x_t + C w_{t+1}$ and the fact that $w_{t+1}$ is zero mean and independent of $x_t, x_{t-1}, \ldots, x_0$. -That $\mathbb{E}_t [x_{t+1}] = \mathbb{E}[x_{t+1} \mid x_t]$ is an implication of $\{x_t\}$ having the *Markov property*. +That $\mathbb{E}_t [x_{t+1}] = \mathbb{E}[x_{t+1} \mid x_t]$ is an implication of $\{x_t\}$ having the **Markov property**. The one-step-ahead forecast error is @@ -1313,7 +1313,7 @@ $V_j$ defined in {eq}`eqob9a` can be calculated recursively via $V_1 = CC'$ and V_j = CC^\prime + A V_{j-1} A^\prime, \quad j \geq 2 ``` -$V_j$ is the *conditional covariance matrix* of the errors in forecasting +$V_j$ is the **conditional covariance matrix** of the errors in forecasting $x_{t+j}$, conditioned on time $t$ information $x_t$. Under particular conditions, $V_j$ converges to @@ -1324,7 +1324,7 @@ Under particular conditions, $V_j$ converges to V_\infty = CC' + A V_\infty A' ``` -Equation {eq}`eqob10` is an example of a *discrete Lyapunov* equation in the covariance matrix $V_\infty$. +Equation {eq}`eqob10` is an example of a **discrete Lyapunov** equation in the covariance matrix $V_\infty$. A sufficient condition for $V_j$ to converge is that the eigenvalues of $A$ be strictly less than one in modulus. 
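The convergence claim is easy to verify numerically: iterate $V_j = CC' + A V_{j-1} A'$ from $V_1 = CC'$ and compare the limit with a direct solution of {eq}`eqob10`. The $(A, C)$ pair below is made up, with the eigenvalues of $A$ strictly inside the unit circle:

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

# Made-up stable pair (A, C); eigenvalues of A are 0.8 and 0.5.
A = np.array([[0.8, 0.1],
              [0.0, 0.5]])
C = np.array([[0.5],
              [0.3]])
Q = C @ C.T

V = np.zeros((2, 2))
for _ in range(500):          # V_j = C C' + A V_{j-1} A', starting from V_1 = CC'
    V = Q + A @ V @ A.T

# solve_discrete_lyapunov solves X = A X A' + Q directly
print(np.allclose(V, solve_discrete_lyapunov(A, Q)))   # True
```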
diff --git a/lectures/lln_clt.md b/lectures/lln_clt.md index 7aa6954ae..d2839bede 100644 --- a/lectures/lln_clt.md +++ b/lectures/lln_clt.md @@ -84,7 +84,7 @@ will converge to their population means. The classical law of large numbers concerns independent and identically distributed (IID) random variables. -Here is the strongest version of the classical LLN, known as *Kolmogorov's strong law*. +Here is the strongest version of the classical LLN, known as **Kolmogorov's strong law**. Let $X_1, \ldots, X_n$ be independent and identically distributed scalar random variables, with common distribution $F$. @@ -563,7 +563,7 @@ $$ \right) =: \boldsymbol \mu $$ -The *variance-covariance matrix* of random vector $\mathbf X$ is defined as +The **variance-covariance matrix** of random vector $\mathbf X$ is defined as $$ \mathop{\mathrm{Var}}[\mathbf X] diff --git a/lectures/markov_asset.md b/lectures/markov_asset.md index 1739dde0f..4421b37b4 100644 --- a/lectures/markov_asset.md +++ b/lectures/markov_asset.md @@ -264,7 +264,7 @@ $$ p_t = \frac{1 + \kappa}{ \rho - \kappa} d_t $$ -This is called the *Gordon formula*. +This is called the **Gordon formula**. (mass_mg)= ### Example 3: Markov Growth, Risk-Neutral Pricing @@ -473,7 +473,7 @@ where $u$ is a concave utility function and $c_t$ is time $t$ consumption of a r Assume the existence of an endowment that follows growth process {eq}`mass_fmce`. -The asset being priced is a claim on the endowment process, i.e., the *Lucas tree* described above. +The asset being priced is a claim on the endowment process, i.e., the **Lucas tree** described above. Following {cite}`Lucas1978`, we suppose that in equilibrium the representative consumer's consumption equals the aggregate endowment, so that $d_t = c_t$ for all $t$. @@ -748,7 +748,7 @@ We'll study an option that gives the owner the right to purchase a consol at a #### An Infinite Horizon Call Option -We want to price an *infinite horizon* option to purchase a consol at a price $p_S$. +We want to price an **infinite horizon** option to purchase a consol at a price $p_S$. The option entitles the owner at the beginning of a period either @@ -757,7 +757,7 @@ The option entitles the owner at the beginning of a period either Thus, the owner either *exercises* the option now or chooses *not to exercise* and wait until next period. -This is termed an infinite-horizon *call option* with *strike price* $p_S$. +This is termed an infinite-horizon **call option** with **strike price** $p_S$. The owner of the option is entitled to purchase the consol at price $p_S$ at the beginning of any period, after the coupon has been paid to the previous owner of the bond. diff --git a/lectures/markov_perf.md b/lectures/markov_perf.md index c7b99c7ad..923cd07af 100644 --- a/lectures/markov_perf.md +++ b/lectures/markov_perf.md @@ -140,7 +140,7 @@ v_i(q_i, q_{-i}) = \max_{\hat q_i} \left\{\pi_i (q_i, q_{-i}, \hat q_i) + \beta v_i(\hat q_i, f_{-i}(q_{-i}, q_i)) \right\} ``` -**Definition** A *Markov perfect equilibrium* of the duopoly model is a pair of value functions $(v_1, v_2)$ and a pair of policy functions $(f_1, f_2)$ such that, for each $i \in \{1, 2\}$ and each possible state, +**Definition** A **Markov perfect equilibrium** of the duopoly model is a pair of value functions $(v_1, v_2)$ and a pair of policy functions $(f_1, f_2)$ such that, for each $i \in \{1, 2\}$ and each possible state, * The value function $v_i$ satisfies Bellman equation {eq}`game4`. 
* The maximizer on the right side of {eq}`game4` equals $f_i(q_i, q_{-i})$. diff --git a/lectures/mle.md b/lectures/mle.md index 929eed27c..7a1942d42 100644 --- a/lectures/mle.md +++ b/lectures/mle.md @@ -183,7 +183,7 @@ In Treisman's paper, the dependent variable --- the number of billionaires $y_i$ Hence, the distribution of $y_i$ needs to be conditioned on the vector of explanatory variables $\mathbf{x}_i$. -The standard formulation --- the so-called *Poisson regression* model --- is as follows: +The standard formulation --- the so-called **Poisson regression** model --- is as follows: ```{math} :label: poissonreg @@ -861,7 +861,7 @@ f(y_i; \boldsymbol{\beta}) = \mu_i^{y_i} (1-\mu_i)^{1-y_i}, \quad y_i = 0,1 \\ \end{aligned} $$ -$\Phi$ represents the *cumulative normal distribution* and +$\Phi$ represents the **cumulative normal distribution** and constrains the predicted $y_i$ to be between 0 and 1 (as required for a probability). diff --git a/lectures/odu.md b/lectures/odu.md index 656b141db..c62519468 100644 --- a/lectures/odu.md +++ b/lectures/odu.md @@ -111,7 +111,7 @@ v(w) ``` The optimal policy has the form $\mathbf{1}\{w \geq \bar w\}$, where -$\bar w$ is a constant called the *reservation wage*. +$\bar w$ is a constant called the **reservation wage**. ### Offer Distribution Unknown @@ -545,7 +545,7 @@ and using $\circ$ for composition of functions yields Equation {eq}`odu_mvf4` can be understood as a functional equation, where $\bar w$ is the unknown function. -* Let's call it the *reservation wage functional equation* (RWFE). +* Let's call it the **reservation wage functional equation** (RWFE). * The solution $\bar w$ to the RWFE is the object that we wish to compute. ## Solving the RWFE diff --git a/lectures/ols.md b/lectures/ols.md index 22b5a0844..bb4cc68af 100644 --- a/lectures/ols.md +++ b/lectures/ols.md @@ -169,7 +169,7 @@ The most common technique to estimate the parameters ($\beta$'s) of the linear model is Ordinary Least Squares (OLS). As the name implies, an OLS model is solved by finding the parameters -that minimize *the sum of squared residuals*, i.e. +that minimize **the sum of squared residuals**, i.e. $$ \underset{\hat{\beta}}{\min} \sum^N_{i=1}{\hat{u}^2_i} diff --git a/lectures/rational_expectations.md b/lectures/rational_expectations.md index dbc6e628a..ec8be4bc8 100644 --- a/lectures/rational_expectations.md +++ b/lectures/rational_expectations.md @@ -309,7 +309,7 @@ Y_{t+1} = H(Y_t) where $Y_0$ is a known initial condition. -The *belief function* $H$ is an equilibrium object, and hence remains to be determined. +The **belief function** $H$ is an equilibrium object, and hence remains to be determined. #### Optimal Behavior Given Beliefs @@ -364,7 +364,7 @@ $$ v_y(y,Y) = a_0 - a_1 Y + \gamma (y' - y) $$ -Substituting this equation into {eq}`comp5` gives the *Euler equation* +Substituting this equation into {eq}`comp5` gives the **Euler equation** ```{math} :label: ree_comp7 @@ -377,7 +377,7 @@ The firm optimally sets an output path that satisfies {eq}`ree_comp7`, taking { * the initial conditions for $(y_0, Y_0)$. * the terminal condition $\lim_{t \rightarrow \infty } \beta^t y_t v_y(y_{t}, Y_t) = 0$. -This last condition is called the *transversality condition*, and acts as a first-order necessary condition "at infinity". +This last condition is called the **transversality condition**, and acts as a first-order necessary condition "at infinity". 
 A representative firm's decision rule solves the difference equation {eq}`ree_comp7` subject to the given initial condition $y_0$ and the transversality condition.
 
 Note that solving the Bellman equation {eq}`comp4` for $v$ and then $h$ in {eq}`ree_comp9` yields
 a decision rule that automatically imposes both the Euler equation {eq}`ree_comp7` and the transversality condition.
 
 ### The Actual Law of Motion for Output
 
 As we've seen, a given belief translates into a particular decision rule $h$.
 
-Recalling that in equilbrium $Y_t = y_t$, the *actual law of motion* for market-wide output is then
+Recalling that in equilibrium $Y_t = y_t$, the **actual law of motion** for market-wide output is then
 
 ```{math}
 :label: ree_comp9a
 
 Y_{t+1} = H(Y_t) = h(Y_t, Y_t)
 ```
 
 Thus, when firms believe that the law of motion for market-wide output is {eq}`r
 
 (ree_def)=
 ### Definition of Rational Expectations Equilibrium
 
-A *rational expectations equilibrium* or *recursive competitive equilibrium* of the model with adjustment costs is a decision rule $h$ and an aggregate law of motion $H$ such that
+A **rational expectations equilibrium** or **recursive competitive equilibrium** of the model with adjustment costs is a decision rule $h$ and an aggregate law of motion $H$ such that
 
 1. Given belief $H$, the map $h$ is the firm's optimal policy function.
 1. The law of motion $H$ satisfies $H(Y)= h(Y,Y)$ for all
 
@@ -469,7 +469,7 @@ s(Y_t, Y_{t+1})
 
 The first term is the area under the demand curve, while the second measures the social costs of changing output.
 
-The *planning problem* is to choose a production plan $\{Y_t\}$ to maximize
+The **planning problem** is to choose a production plan $\{Y_t\}$ to maximize
 
 $$
 \sum_{t=0}^\infty \beta^t s(Y_t, Y_{t+1})
 $$
 
diff --git a/lectures/re_with_feedback.md b/lectures/re_with_feedback.md
index 48a0aae94..da5a647dc 100644
--- a/lectures/re_with_feedback.md
+++ b/lectures/re_with_feedback.md
@@ -78,14 +78,14 @@ first-order and second-order linear difference equations.
 
 ## Linear Difference Equations
 
-We'll use the *backward shift* or *lag* operator $L$.
+We'll use the **backward shift** or **lag** operator $L$.
 
 The lag operator $L$ maps a sequence $\{x_t\}_{t=0}^\infty$ into the sequence $\{x_{t-1}\}_{t=0}^\infty$
 
 We'll deploy $L$ by using the equality $L x_t \equiv x_{t-1}$ in algebraic expressions.
 
-Further, the inverse $L^{-1}$ of the lag operator is the *forward shift*
+Further, the inverse $L^{-1}$ of the lag operator is the **forward shift**
 operator.
 
 We'll often use the equality $L^{-1} x_t \equiv x_{t+1}$ below.
 
@@ -345,7 +345,7 @@ F = (1-\lambda) G (I - \lambda A)^{-1}
 ```
 
 ```{note}
-As mentioned above, an *explosive solution* of difference
+As mentioned above, an **explosive solution** of difference
 equation {eq}`equation_1` can be constructed by adding to the right hand of {eq}`equation_4` a sequence $c \lambda^{-t}$ where $c$ is an arbitrary positive constant.
 
diff --git a/lectures/samuelson.md b/lectures/samuelson.md
index 44aaf3b91..8616328e9 100644
--- a/lectures/samuelson.md
+++ b/lectures/samuelson.md
@@ -86,7 +86,7 @@ equal amount of *aggregate supply*.
 
 Samuelson used the model to analyze how particular values of the marginal propensity to consume and the accelerator coefficient might
-give rise to transient *business cycles* in national output.
+give rise to transient **business cycles** in national output.
 
 Possible dynamic properties include
 
@@ -100,7 +100,7 @@ adds a random shock to the right side of the national income identity
 representing random fluctuations in aggregate demand.
 This modification makes national output become governed by a second-order
-*stochastic linear difference equation* that, with appropriate parameter values,
+**stochastic linear difference equation** that, with appropriate parameter values,
 gives rise to recurrent irregular business cycles.
 
 (To read about stochastic linear difference equations see chapter XI of
 
@@ -152,7 +152,7 @@ and the national income identity
 
 Y_t = C_t + I_t + G_t
 ```
 
-- The parameter $\alpha$ is peoples' *marginal propensity to consume*
+- The parameter $\alpha$ is people's **marginal propensity to consume**
   out of income - equation {eq}`consumption` asserts that people
   consume a fraction of $\alpha \in (0,1)$ of each additional dollar of income.
 - The parameter $\beta > 0$ is the investment accelerator coefficient - equation
 
@@ -193,7 +193,7 @@ a constant value as $t$ becomes large.
 
 We are interested in studying
 
 - the transient fluctuations in $Y_t$ as it converges to its
-  *steady state* level
+  **steady state** level
 - the *rate* at which it converges to a steady state level
 
 The deterministic version of the model described so far --- meaning that
 
@@ -235,7 +235,7 @@ Y_{t+2} - \rho_1 Y_{t+1} - \rho_2 Y_t = 0
 ```
 
 To discover the properties of the solution of {eq}`second_stochastic2`,
-it is useful first to form the *characteristic polynomial*
+it is useful first to form the **characteristic polynomial**
 for {eq}`second_stochastic2`:
 
 ```{math}
 :label: polynomial
 
 z^2 - \rho_1 z - \rho_2
 ```
 
 where $z$ is possibly a complex number.
 
@@ -246,7 +246,7 @@ z^2 - \rho_1 z - \rho_2
 
 where $z$ is possibly a complex number.
 
-We want to find the two *zeros* (a.k.a. *roots*) -- namely
+We want to find the two **zeros** (a.k.a. **roots**) -- namely
 $\lambda_1, \lambda_2$ -- of the characteristic polynomial.
 
 These are two special values of $z$, say $z= \lambda_1$ and
 
diff --git a/lectures/sir_model.md b/lectures/sir_model.md
index 24682939e..5b0c5305c 100644
--- a/lectures/sir_model.md
+++ b/lectures/sir_model.md
@@ -108,9 +108,9 @@ dynamics are
 
 In these equations,
 
-* $\beta(t)$ is called the *transmission rate* (the rate at which individuals bump into others and expose them to the virus).
-* $\sigma$ is called the *infection rate* (the rate at which those who are exposed become infected)
-* $\gamma$ is called the *recovery rate* (the rate at which infected people recover or die).
+* $\beta(t)$ is called the **transmission rate** (the rate at which individuals bump into others and expose them to the virus).
+* $\sigma$ is called the **infection rate** (the rate at which those who are exposed become infected).
+* $\gamma$ is called the **recovery rate** (the rate at which infected people recover or die).
 * the dot symbol $\dot y$ represents the time derivative $dy/dt$.
 
 We do not need to model the fraction $r$ of the population in state $R$ separately because the states form a partition.
 
@@ -141,7 +141,7 @@ As in Atkeson's note, we set
 
 The transmission rate is modeled as
 
-* $\beta(t) := R(t) \gamma$ where $R(t)$ is the *effective reproduction number* at time $t$.
+* $\beta(t) := R(t) \gamma$ where $R(t)$ is the **effective reproduction number** at time $t$.
 
 (The notation is slightly confusing, since $R(t)$ is different to $R$, the symbol that represents the removed state.)
 
diff --git a/lectures/uncertainty_traps.md b/lectures/uncertainty_traps.md
index 2cef53ace..52aa5b4dd 100644
--- a/lectures/uncertainty_traps.md
+++ b/lectures/uncertainty_traps.md
@@ -323,7 +323,7 @@ at once, for a given set of shocks
 
 Notice how the traps only take hold after a sequence of bad draws for the fundamental.
-Thus, the model gives us a *propagation mechanism* that maps bad random draws into long downturns in economic activity. +Thus, the model gives us a **propagation mechanism** that maps bad random draws into long downturns in economic activity. ## Exercises diff --git a/lectures/von_neumann_model.md b/lectures/von_neumann_model.md index b26016286..70c7afd9a 100644 --- a/lectures/von_neumann_model.md +++ b/lectures/von_neumann_model.md @@ -364,11 +364,11 @@ respectively. A pair $(A,B)$ of $m\times n$ non-negative matrices defines an economy. -- $m$ is the number of *activities* (or sectors) -- $n$ is the number of *goods* (produced and/or consumed). -- $A$ is called the *input matrix*; $a_{i,j}$ denotes the +- $m$ is the number of **activities** (or sectors) +- $n$ is the number of **goods** (produced and/or consumed) +- $A$ is called the **input matrix**; $a_{i,j}$ denotes the amount of good $j$ consumed by activity $i$ -- $B$ is called the *output matrix*; $b_{i,j}$ represents +- $B$ is called the **output matrix**; $b_{i,j}$ represents the amount of good $j$ produced by activity $i$ Two key assumptions restrict economy $(A,B)$: @@ -388,28 +388,28 @@ Two key assumptions restrict economy $(A,B)$: ``` ```` -A semi-positive *intensity* $m$-vector $x$ denotes levels at which +A semi-positive **intensity** $m$-vector $x$ denotes levels at which activities are operated. Therefore, -- vector $x^\top A$ gives the total amount of *goods used in - production* -- vector $x^\top B$ gives *total outputs* +- vector $x^\top A$ gives the total amount of **goods used in + production** +- vector $x^\top B$ gives **total outputs** -An economy $(A,B)$ is said to be *productive*, if there exists a +An economy $(A,B)$ is said to be **productive**, if there exists a non-negative intensity vector $x \geq 0$ such that $x^\top B > x^\top A$. The semi-positive $n$-vector $p$ contains prices assigned to the $n$ goods. -The $p$ vector implies *cost* and *revenue* vectors +The $p$ vector implies **cost** and **revenue** vectors -- the vector $Ap$ tells *costs* of the vector of activities -- the vector $Bp$ tells *revenues* from the vector of activities +- the vector $Ap$ tells **costs** of the vector of activities +- the vector $Bp$ tells **revenues** from the vector of activities -Satisfaction of a property of an input-output pair $(A,B)$ called *irreducibility* +Satisfaction of a property of an input-output pair $(A,B)$ called **irreducibility** (or indecomposability) determines whether an economy can be decomposed into multiple "sub-economies".