Merged
Changes from all commits
Commits
30 commits
e3af476
Fix emphasis vs definitions in linear_algebra.md
mmcky Nov 20, 2025
1b86b5b
Fix emphasis vs definitions in linear_models.md
mmcky Nov 20, 2025
e4936b3
Fix emphasis vs definitions in lln_clt.md
mmcky Nov 20, 2025
c2e6432
Fix emphasis vs definitions in markov_asset.md
mmcky Nov 20, 2025
a28806b
Fix emphasis vs definitions in markov_perf.md
mmcky Nov 20, 2025
123fd9f
Fix emphasis vs definitions in mccall_model.md
mmcky Nov 20, 2025
fa73e46
Fix emphasis vs definitions in mle.md
mmcky Nov 20, 2025
70115ca
Fix emphasis vs definitions in ols.md
mmcky Nov 20, 2025
dba0445
Fix emphasis vs definitions in rational_expectations.md
mmcky Nov 20, 2025
ceb29ae
Fix emphasis vs definitions in re_with_feedback.md
mmcky Nov 20, 2025
80c6cdb
Fix emphasis vs definitions in samuelson.md
mmcky Nov 20, 2025
567ea1f
Fix emphasis vs definitions in sir_model.md
mmcky Nov 20, 2025
6c176f1
Fix emphasis vs definitions in uncertainty_traps.md
mmcky Nov 20, 2025
94961bb
Fix emphasis vs definitions in von_neumann_model.md
mmcky Nov 20, 2025
3b8d5e7
Fix emphasis vs definitions in ak_aiyagari.md
mmcky Nov 20, 2025
6726aae
Fix emphasis vs definitions in ak2.md
mmcky Nov 20, 2025
0971d63
Fix emphasis vs definitions in cake eating lectures
mmcky Nov 20, 2025
14d4f64
Fix emphasis vs definitions in career and cass_koopmans_1
mmcky Nov 20, 2025
d0fe1d5
Fix emphasis vs definitions in likelihood_bayes.md
mmcky Nov 20, 2025
4971ec3
Fix emphasis vs definitions in morris_learn.md
mmcky Nov 20, 2025
3325db6
Fix emphasis vs definitions in odu and opt_transport
mmcky Nov 20, 2025
9f1b3ed
Fix emphasis vs definitions in kalman and ifp_advanced
mmcky Nov 20, 2025
3736206
Fix emphasis vs definitions in cass_fiscal.md
mmcky Nov 20, 2025
deb1944
Fix emphasis vs definitions in exchangeable.md
mmcky Nov 20, 2025
c6a0887
Merge branch 'main' into fix-emphasis-definitions-style
mmcky Nov 20, 2025
f3758ed
Revert incorrect emphasis changes back to italic
mmcky Nov 20, 2025
d9d7810
Revert incorrect emphasis-to-bold changes (batch 2)
mmcky Nov 20, 2025
74ee28f
Merge branch 'fix-emphasis-definitions-style' of https://github.com/Q…
mmcky Nov 20, 2025
8d221fd
Revert checked emphasis comments back to italic (batch 3)
mmcky Nov 21, 2025
f0264ea
Fix typos introduced during formatting changes
mmcky Nov 21, 2025
2 changes: 1 addition & 1 deletion lectures/ak2.md
@@ -209,7 +209,7 @@ Units of the rental rates are:
* for $r_t$, output at time $t$ per unit of capital at time $t$


-We take output at time $t$ as *numeraire*, so the price of output at time $t$ is one.
+We take output at time $t$ as **numeraire**, so the price of output at time $t$ is one.

The firm's profits at time $t$ are

6 changes: 3 additions & 3 deletions lectures/cake_eating_stochastic.md
@@ -164,13 +164,13 @@ In summary, the agent's aim is to select a path $c_0, c_1, c_2, \ldots$ for cons
1. nonnegative,
1. feasible in the sense of {eq}`outcsdp0`,
1. optimal, in the sense that it maximizes {eq}`texs0_og2` relative to all other feasible consumption sequences, and
-1. *adapted*, in the sense that the action $c_t$ depends only on
+1. **adapted**, in the sense that the action $c_t$ depends only on
observable outcomes, not on future outcomes such as $\xi_{t+1}$.

In the present context

-* $x_t$ is called the *state* variable --- it summarizes the "state of the world" at the start of each period.
-* $c_t$ is called the *control* variable --- a value chosen by the agent each period after observing the state.
+* $x_t$ is called the **state** variable --- it summarizes the "state of the world" at the start of each period.
+* $c_t$ is called the **control** variable --- a value chosen by the agent each period after observing the state.

### The Policy Function Approach

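The state/control terminology defined in the hunk above is easy to see in a tiny simulation. A minimal sketch, assuming a placeholder law of motion $x_{t+1} = f(x_t - c_t)\xi_{t+1}$ and a fixed-share consumption policy; both are illustrative assumptions, not the lecture's exact primitives, which are given by {eq}`outcsdp0`:

```python
import numpy as np

# Illustrative primitives only; the lecture's are given by {eq}`outcsdp0`
f = lambda k: k**0.4                # assumed production function
sigma = lambda x: 0.7 * x           # assumed policy: consume a fixed share of the state

rng = np.random.default_rng(0)
x = 1.0                             # initial state x_0
for t in range(5):
    c = sigma(x)                    # control c_t, chosen after observing the state x_t only
    xi = rng.lognormal(0.0, 0.1)    # shock xi_{t+1}, realized only after c_t is chosen
    print(f"t={t}: x_t={x:.3f}, c_t={c:.3f}")
    x = f(x - c) * xi               # next period's state x_{t+1}
```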
2 changes: 1 addition & 1 deletion lectures/cake_eating_time_iter.md
@@ -237,7 +237,7 @@ whenever $\sigma \in \mathscr P$.
It is possible to prove that there is a tight relationship between iterates of
$K$ and iterates of the Bellman operator.

-Mathematically, the two operators are *topologically conjugate*.
+Mathematically, the two operators are **topologically conjugate**.

Loosely speaking, this means that if iterates of one operator converge then
so do iterates of the other, and vice versa.
4 changes: 2 additions & 2 deletions lectures/career.md
@@ -66,8 +66,8 @@ from matplotlib import cm

In what follows we distinguish between a career and a job, where

-* a *career* is understood to be a general field encompassing many possible jobs, and
-* a *job* is understood to be a position with a particular firm
+* a **career** is understood to be a general field encompassing many possible jobs, and
+* a **job** is understood to be a position with a particular firm

For workers, wages can be decomposed into the contribution of job and career

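A small sketch of the career/job distinction bolded above, assuming wages split additively into a career component $\theta$ and a job component $\epsilon$; the decomposition is only referenced in this hunk, so the distributions and numbers below are placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder decomposition: wage = theta (career component) + epsilon (job component)
theta = rng.uniform(0, 5)           # career-specific payoff (general field)
eps_current = rng.uniform(0, 5)     # job-specific payoff at the current firm
eps_new = rng.uniform(0, 5)         # a fresh job draw within the same career

print(f"wage at the current job: {theta + eps_current:.2f}")
print(f"wage after switching jobs only (career kept): {theta + eps_new:.2f}")
```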
4 changes: 2 additions & 2 deletions lectures/cass_fiscal.md
@@ -147,8 +147,8 @@ $$ (eq:gov_budget)
Given a budget-feasible government policy $\{g_t\}_{t=0}^\infty$ and $\{\tau_{ct}, \tau_{kt}, \tau_{nt}, \tau_{ht}\}_{t=0}^\infty$ subject to {eq}`eq:gov_budget`,
-- *Household* chooses $\{c_t\}_{t=0}^\infty$, $\{n_t\}_{t=0}^\infty$, and $\{k_{t+1}\}_{t=0}^\infty$ to maximize utility{eq}`eq:utility` subject to budget constraint{eq}`eq:house_budget`, and
-- *Frim* chooses sequences of capital $\{k_t\}_{t=0}^\infty$ and $\{n_t\}_{t=0}^\infty$ to maximize profits
+- **Household** chooses $\{c_t\}_{t=0}^\infty$, $\{n_t\}_{t=0}^\infty$, and $\{k_{t+1}\}_{t=0}^\infty$ to maximize utility{eq}`eq:utility` subject to budget constraint{eq}`eq:house_budget`, and
+- **Firm** chooses sequences of capital $\{k_t\}_{t=0}^\infty$ and $\{n_t\}_{t=0}^\infty$ to maximize profits
$$
\sum_{t=0}^\infty q_t [F(k_t, n_t) - \eta_t k_t - w_t n_t]
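The firm's objective quoted above is a present value of profits. A minimal numerical sketch, assuming a Cobb-Douglas $F$ and placeholder price and quantity sequences; none of these values come from the lecture:

```python
import numpy as np

T = 5
alpha = 0.33
def F(k, n):                          # assumed Cobb-Douglas technology
    return k**alpha * n**(1 - alpha)

q = 0.95 ** np.arange(T)              # Arrow-Debreu prices q_t (placeholder)
eta = np.full(T, 0.12)                # rental rate of capital eta_t (placeholder)
w = np.full(T, 0.7)                   # wage w_t (placeholder)
k = np.full(T, 3.0)                   # capital path (placeholder)
n = np.full(T, 1.0)                   # labor path (placeholder)

# Present value of profits: sum_t q_t [F(k_t, n_t) - eta_t k_t - w_t n_t]
print(np.sum(q * (F(k, n) - eta * k - w * n)))
```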
10 changes: 5 additions & 5 deletions lectures/kalman.md
@@ -85,7 +85,7 @@ One way to summarize our knowledge is a point prediction $\hat x$
* Then it is better to summarize our initial beliefs with a bivariate probability density $p$
* $\int_E p(x)dx$ indicates the probability that we attach to the missile being in region $E$.

-The density $p$ is called our *prior* for the random variable $x$.
+The density $p$ is called our **prior** for the random variable $x$.

To keep things tractable in our example, we assume that our prior is Gaussian.

@@ -317,7 +317,7 @@ We have obtained probabilities for the current location of the state (missile) g
This is called "filtering" rather than forecasting because we are filtering
out noise rather than looking into the future.

-* $p(x \,|\, y) = N(\hat x^F, \Sigma^F)$ is called the *filtering distribution*
+* $p(x \,|\, y) = N(\hat x^F, \Sigma^F)$ is called the **filtering distribution**

But now let's suppose that we are given another task: to predict the location of the missile after one unit of time (whatever that may be) has elapsed.

@@ -331,7 +331,7 @@ Let's suppose that we have one, and that it's linear and Gaussian. In particular
x_{t+1} = A x_t + w_{t+1}, \quad \text{where} \quad w_t \sim N(0, Q)
```

-Our aim is to combine this law of motion and our current distribution $p(x \,|\, y) = N(\hat x^F, \Sigma^F)$ to come up with a new *predictive* distribution for the location in one unit of time.
+Our aim is to combine this law of motion and our current distribution $p(x \,|\, y) = N(\hat x^F, \Sigma^F)$ to come up with a new **predictive** distribution for the location in one unit of time.

In view of {eq}`kl_xdynam`, all we have to do is introduce a random vector $x^F \sim N(\hat x^F, \Sigma^F)$ and work out the distribution of $A x^F + w$ where $w$ is independent of $x^F$ and has distribution $N(0, Q)$.

@@ -356,7 +356,7 @@ $$
$$

The matrix $A \Sigma G' (G \Sigma G' + R)^{-1}$ is often written as
-$K_{\Sigma}$ and called the *Kalman gain*.
+$K_{\Sigma}$ and called the **Kalman gain**.

* The subscript $\Sigma$ has been added to remind us that $K_{\Sigma}$ depends on $\Sigma$, but not $y$ or $\hat x$.

@@ -373,7 +373,7 @@ Our updated prediction is the density $N(\hat x_{new}, \Sigma_{new})$ where
\end{aligned}
```

-* The density $p_{new}(x) = N(\hat x_{new}, \Sigma_{new})$ is called the *predictive distribution*
+* The density $p_{new}(x) = N(\hat x_{new}, \Sigma_{new})$ is called the **predictive distribution**

The predictive distribution is the new density shown in the following figure, where
the update has used parameters.
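The Kalman gain and the predictive update touched in these hunks can be sketched directly with NumPy. The matrices below are made up, and the update step uses the standard textbook form of the one-step-ahead recursion; the lecture's own derivation is elided in this diff:

```python
import numpy as np

# Illustrative 2x2 system; all matrices here are made up, not from the lecture
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])           # state transition
G = np.eye(2)                        # observation matrix
Q = 0.05 * np.eye(2)                 # state shock covariance
R = 0.20 * np.eye(2)                 # measurement noise covariance

x_hat = np.array([0.2, -0.2])        # current estimate \hat x
Sigma = 0.4 * np.eye(2)              # current covariance \Sigma
y = np.array([2.3, -1.9])            # observation

# Kalman gain: K_Sigma = A Sigma G' (G Sigma G' + R)^{-1}
K = A @ Sigma @ G.T @ np.linalg.inv(G @ Sigma @ G.T + R)

# Predictive (one-step-ahead) update, in its standard textbook form
x_new = A @ x_hat + K @ (y - G @ x_hat)
Sigma_new = A @ Sigma @ A.T - K @ G @ Sigma @ A.T + Q

print(x_new)
print(Sigma_new)
```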
4 changes: 2 additions & 2 deletions lectures/likelihood_bayes.md
@@ -129,8 +129,8 @@ $$
where we use the conventions
that $f(w^t) = f(w_1) f(w_2) \ldots f(w_t)$ and $g(w^t) = g(w_1) g(w_2) \ldots g(w_t)$.

-Notice that the likelihood process satisfies the *recursion* or
-*multiplicative decomposition*
+Notice that the likelihood process satisfies the **recursion** or
+**multiplicative decomposition**

$$
L(w^t) = \ell (w_t) L (w^{t-1}) .
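The recursion bolded above says the cumulative likelihood ratio is a running product of per-observation ratios $\ell(w_t) = f(w_t)/g(w_t)$. A minimal sketch, assuming Beta densities for $f$ and $g$ as placeholders rather than the lecture's own choices:

```python
import numpy as np
from scipy.stats import beta

# Placeholder densities; the lecture specifies its own f and g
f = beta(1, 1).pdf
g = beta(3, 1.2).pdf

rng = np.random.default_rng(0)
w = rng.beta(1, 1, size=10)          # a sample w_1, ..., w_10 drawn from f

ell = f(w) / g(w)                    # per-period ratios ell(w_t) = f(w_t) / g(w_t)
L = np.cumprod(ell)                  # L(w^t) = ell(w_t) * L(w^{t-1})
print(L)
```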
64 changes: 32 additions & 32 deletions lectures/linear_algebra.md
@@ -85,7 +85,7 @@ from scipy.linalg import inv, solve, det, eig
```{index} single: Linear Algebra; Vectors
```

-A *vector* of length $n$ is just a sequence (or array, or tuple) of $n$ numbers, which we write as $x = (x_1, \ldots, x_n)$ or $x = [x_1, \ldots, x_n]$.
+A **vector** of length $n$ is just a sequence (or array, or tuple) of $n$ numbers, which we write as $x = (x_1, \ldots, x_n)$ or $x = [x_1, \ldots, x_n]$.

We will write these sequences either horizontally or vertically as we please.

@@ -225,15 +225,15 @@ x + y
```{index} single: Vectors; Norm
```

-The *inner product* of vectors $x,y \in \mathbb R ^n$ is defined as
+The **inner product** of vectors $x,y \in \mathbb R ^n$ is defined as

$$
x' y := \sum_{i=1}^n x_i y_i
$$

-Two vectors are called *orthogonal* if their inner product is zero.
+Two vectors are called **orthogonal** if their inner product is zero.

-The *norm* of a vector $x$ represents its "length" (i.e., its distance from the zero vector) and is defined as
+The **norm** of a vector $x$ represents its "length" (i.e., its distance from the zero vector) and is defined as

$$
\| x \| := \sqrt{x' x} := \left( \sum_{i=1}^n x_i^2 \right)^{1/2}
@@ -273,7 +273,7 @@ np.linalg.norm(x) # Norm of x, take three

Given a set of vectors $A := \{a_1, \ldots, a_k\}$ in $\mathbb R ^n$, it's natural to think about the new vectors we can create by performing linear operations.

-New vectors created in this manner are called *linear combinations* of $A$.
+New vectors created in this manner are called **linear combinations** of $A$.

In particular, $y \in \mathbb R ^n$ is a linear combination of $A := \{a_1, \ldots, a_k\}$ if

@@ -282,9 +282,9 @@ y = \beta_1 a_1 + \cdots + \beta_k a_k
\text{ for some scalars } \beta_1, \ldots, \beta_k
$$

-In this context, the values $\beta_1, \ldots, \beta_k$ are called the *coefficients* of the linear combination.
+In this context, the values $\beta_1, \ldots, \beta_k$ are called the **coefficients** of the linear combination.

-The set of linear combinations of $A$ is called the *span* of $A$.
+The set of linear combinations of $A$ is called the **span** of $A$.

The next figure shows the span of $A = \{a_1, a_2\}$ in $\mathbb R ^3$.
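A quick numerical check of the span/coefficients terminology defined above: ask whether a given $y$ is a linear combination of $a_1$ and $a_2$ by solving for the coefficients. The vectors are arbitrary examples:

```python
import numpy as np

# Arbitrary example vectors in R^3
a1 = np.array([1.0, 0.0, 1.0])
a2 = np.array([0.0, 1.0, 1.0])
y  = np.array([2.0, 3.0, 5.0])         # equals 2*a1 + 3*a2, so it lies in the span

A = np.column_stack([a1, a2])
coeffs, residual, *_ = np.linalg.lstsq(A, y, rcond=None)
print(coeffs)                           # coefficients of the linear combination
print(np.allclose(A @ coeffs, y))       # True => y is in the span of {a1, a2}
```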

@@ -349,7 +349,7 @@ plt.show()
If $A$ contains only one vector $a_1 \in \mathbb R ^2$, then its
span is just the scalar multiples of $a_1$, which is the unique line passing through both $a_1$ and the origin.

-If $A = \{e_1, e_2, e_3\}$ consists of the *canonical basis vectors* of $\mathbb R ^3$, that is
+If $A = \{e_1, e_2, e_3\}$ consists of the **canonical basis vectors** of $\mathbb R ^3$, that is

$$
e_1 :=
@@ -399,8 +399,8 @@ The condition we need for a set of vectors to have a large span is what's called

In particular, a collection of vectors $A := \{a_1, \ldots, a_k\}$ in $\mathbb R ^n$ is said to be

-* *linearly dependent* if some strict subset of $A$ has the same span as $A$.
-* *linearly independent* if it is not linearly dependent.
+* **linearly dependent** if some strict subset of $A$ has the same span as $A$.
+* **linearly independent** if it is not linearly dependent.

Put differently, a set of vectors is linearly independent if no vector is redundant to the span and linearly dependent otherwise.
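Linear dependence and independence, as defined above, can be checked numerically via the rank of the matrix whose columns are the candidate vectors. A short sketch with arbitrary example vectors:

```python
import numpy as np

a1 = np.array([1.0, 2.0, 3.0])
a2 = np.array([0.0, 1.0, 1.0])
a3 = a1 + 2 * a2                           # deliberately redundant

independent = np.column_stack([a1, a2])
dependent = np.column_stack([a1, a2, a3])

print(np.linalg.matrix_rank(independent))  # 2 == number of vectors -> independent
print(np.linalg.matrix_rank(dependent))    # 2 < 3 -> linearly dependent
```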

@@ -469,19 +469,19 @@ Often, the numbers in the matrix represent coefficients in a system of linear eq

For obvious reasons, the matrix $A$ is also called a vector if either $n = 1$ or $k = 1$.

-In the former case, $A$ is called a *row vector*, while in the latter it is called a *column vector*.
+In the former case, $A$ is called a **row vector**, while in the latter it is called a **column vector**.

-If $n = k$, then $A$ is called *square*.
+If $n = k$, then $A$ is called **square**.

-The matrix formed by replacing $a_{ij}$ by $a_{ji}$ for every $i$ and $j$ is called the *transpose* of $A$ and denoted $A'$ or $A^{\top}$.
+The matrix formed by replacing $a_{ij}$ by $a_{ji}$ for every $i$ and $j$ is called the **transpose** of $A$ and denoted $A'$ or $A^{\top}$.

-If $A = A'$, then $A$ is called *symmetric*.
+If $A = A'$, then $A$ is called **symmetric**.

-For a square matrix $A$, the $i$ elements of the form $a_{ii}$ for $i=1,\ldots,n$ are called the *principal diagonal*.
+For a square matrix $A$, the $i$ elements of the form $a_{ii}$ for $i=1,\ldots,n$ are called the **principal diagonal**.

-$A$ is called *diagonal* if the only nonzero entries are on the principal diagonal.
+$A$ is called **diagonal** if the only nonzero entries are on the principal diagonal.

-If, in addition to being diagonal, each element along the principal diagonal is equal to 1, then $A$ is called the *identity matrix* and denoted by $I$.
+If, in addition to being diagonal, each element along the principal diagonal is equal to 1, then $A$ is called the **identity matrix** and denoted by $I$.

### Matrix Operations

@@ -641,9 +641,9 @@ See [here](https://python-programming.quantecon.org/numpy.html#matrix-multiplica

Each $n \times k$ matrix $A$ can be identified with a function $f(x) = Ax$ that maps $x \in \mathbb R ^k$ into $y = Ax \in \mathbb R ^n$.

-These kinds of functions have a special property: they are *linear*.
+These kinds of functions have a special property: they are **linear**.

-A function $f \colon \mathbb R ^k \to \mathbb R ^n$ is called *linear* if, for all $x, y \in \mathbb R ^k$ and all scalars $\alpha, \beta$, we have
+A function $f \colon \mathbb R ^k \to \mathbb R ^n$ is called **linear** if, for all $x, y \in \mathbb R ^k$ and all scalars $\alpha, \beta$, we have

$$
f(\alpha x + \beta y) = \alpha f(x) + \beta f(y)
@@ -773,7 +773,7 @@ In particular, the following are equivalent
1. The columns of $A$ are linearly independent.
1. For any $y \in \mathbb R ^n$, the equation $y = Ax$ has a unique solution.

-The property of having linearly independent columns is sometimes expressed as having *full column rank*.
+The property of having linearly independent columns is sometimes expressed as having **full column rank**.

#### Inverse Matrices

@@ -788,7 +788,7 @@ solution is $x = A^{-1} y$.
A similar expression is available in the matrix case.

In particular, if square matrix $A$ has full column rank, then it possesses a multiplicative
-*inverse matrix* $A^{-1}$, with the property that $A A^{-1} = A^{-1} A = I$.
+**inverse matrix** $A^{-1}$, with the property that $A A^{-1} = A^{-1} A = I$.

As a consequence, if we pre-multiply both sides of $y = Ax$ by $A^{-1}$, we get $x = A^{-1} y$.
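A minimal sketch of solving $y = Ax$ both ways described above, via the inverse matrix and via a direct solver; the matrix and vector are arbitrary nonsingular examples, and `inv` and `solve` are already imported at the top of this lecture:

```python
import numpy as np
from scipy.linalg import inv, solve

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])            # nonsingular, so A^{-1} exists
y = np.array([1.0, 2.0])

x_via_inverse = inv(A) @ y            # x = A^{-1} y
x_via_solve = solve(A, y)             # usually preferred numerically
print(np.allclose(x_via_inverse, x_via_solve))  # True
```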

@@ -800,11 +800,11 @@ This is the solution that we're looking for.
```

Another quick comment about square matrices is that to every such matrix we
-assign a unique number called the *determinant* of the matrix --- you can find
+assign a unique number called the **determinant** of the matrix --- you can find
the expression for it [here](https://en.wikipedia.org/wiki/Determinant).

If the determinant of $A$ is not zero, then we say that $A$ is
-*nonsingular*.
+**nonsingular**.

Perhaps the most important fact about determinants is that $A$ is nonsingular if and only if $A$ is of full column rank.

@@ -929,8 +929,8 @@ $$
A v = \lambda v
$$

-then we say that $\lambda$ is an *eigenvalue* of $A$, and
-$v$ is an *eigenvector*.
+then we say that $\lambda$ is an **eigenvalue** of $A$, and
+$v$ is an **eigenvector**.

Thus, an eigenvector of $A$ is a vector such that when the map $f(x) = Ax$ is applied, $v$ is merely scaled.

@@ -1034,7 +1034,7 @@ to one.

### Generalized Eigenvalues

-It is sometimes useful to consider the *generalized eigenvalue problem*, which, for given
+It is sometimes useful to consider the **generalized eigenvalue problem**, which, for given
matrices $A$ and $B$, seeks generalized eigenvalues
$\lambda$ and eigenvectors $v$ such that

@@ -1076,10 +1076,10 @@ $$
$$

The norms on the right-hand side are ordinary vector norms, while the norm on
-the left-hand side is a *matrix norm* --- in this case, the so-called
-*spectral norm*.
+the left-hand side is a **matrix norm** --- in this case, the so-called
+**spectral norm**.

-For example, for a square matrix $S$, the condition $\| S \| < 1$ means that $S$ is *contractive*, in the sense that it pulls all vectors towards the origin [^cfn].
+For example, for a square matrix $S$, the condition $\| S \| < 1$ means that $S$ is **contractive**, in the sense that it pulls all vectors towards the origin [^cfn].

(la_neumann)=
#### {index}`Neumann's Theorem <single: Neumann's Theorem>`
Expand Down Expand Up @@ -1112,7 +1112,7 @@ $$
\rho(A) = \lim_{k \to \infty} \| A^k \|^{1/k}
$$

-Here $\rho(A)$ is the *spectral radius*, defined as $\max_i |\lambda_i|$, where $\{\lambda_i\}_i$ is the set of eigenvalues of $A$.
+Here $\rho(A)$ is the **spectral radius**, defined as $\max_i |\lambda_i|$, where $\{\lambda_i\}_i$ is the set of eigenvalues of $A$.

As a consequence of Gelfand's formula, if all eigenvalues are strictly less than one in modulus,
there exists a $k$ with $\| A^k \| < 1$.
@@ -1128,8 +1128,8 @@ Let $A$ be a symmetric $n \times n$ matrix.

We say that $A$ is

-1. *positive definite* if $x' A x > 0$ for every $x \in \mathbb R ^n \setminus \{0\}$
-1. *positive semi-definite* or *nonnegative definite* if $x' A x \geq 0$ for every $x \in \mathbb R ^n$
+1. **positive definite** if $x' A x > 0$ for every $x \in \mathbb R ^n \setminus \{0\}$
+1. **positive semi-definite** or **nonnegative definite** if $x' A x \geq 0$ for every $x \in \mathbb R ^n$

Analogous definitions exist for negative definite and negative semi-definite matrices.

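Here is a short sketch tying together several of the linear_algebra.md terms touched in this file: eigenvalues and eigenvectors, the spectral norm, the spectral radius, and a positive definiteness check. The matrix is an arbitrary symmetric example:

```python
import numpy as np
from scipy.linalg import eig

A = np.array([[1.0, 2.0],
              [2.0, 1.0]])            # arbitrary symmetric example matrix

# Eigenvalues and eigenvectors: A v = lambda v
lambdas, vecs = eig(A)
v = vecs[:, 0]
print(np.allclose(A @ v, lambdas[0] * v))    # True

# Spectral norm ||A|| and spectral radius rho(A) = max_i |lambda_i|
print(np.linalg.norm(A, 2), np.max(np.abs(lambdas)))

# Positive definiteness check for a symmetric matrix: all eigenvalues > 0?
# This A fails, since its eigenvalues are 3 and -1.
print(np.all(lambdas.real > 0))              # False
```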