# Knowing the Forecast of Others


\citeN{lucas75}, \citeN{kasa}, and \citeN{townsend} showed
that the assumption that decision makers have to
extract signals about hidden persistent state variables is a potential
source both for additional impulses and for elongated impulse
response functions in business cycle models.

This theme has been
pursued in recent analyses in which decision makers' imperfect
information forces them to pursue an infinite recursion of
forming beliefs about the beliefs of others (e.g., \citeN{ams}).


\citeN{lucas75} sidestepped the problem of forecasting the
forecasts of others by letting decision makers pool their
information before forecasting. \citeN{townsend} bit the bullet:
he did not assume pooling and instead tackled the problem of
forecasting the forecasts of others.  He proposed an
approximate equilibrium of a model in which decision makers
extract signals from endogenous variables (prices).




By applying technical machinery of \citeN{PCL},
 \citeN{PS2005}  showed that there is a recursive representation
of the equilibrium
of the perpetually and symmetrically uninformed model formulated but
not completely solved in section
8 of \citeN{townsend}.  Their  computational method  is recursive: it combines
the Kalman filter with invariant subspace methods for solving systems
of Euler equations.\footnote{See \citeN{ahms} for an account of invariant
subspace methods.}
As \citeN{singleton}, \citeN{kasa}, and \citeN{sargent91}
also found, the equilibrium is fully revealing:
observed prices
tell participants in industry $i$ all of the information
held by participants in market $-i$ (i.e., not $i$).  This means that higher-order beliefs
play no role: seeing equilibrium prices in effect lets decision makers
pool their  information sets.\footnote{See \citeN{ams} for a discussion
of the information assumptions needed to create a situation
in which higher order beliefs appear in equilibrium decision rules.  The way
to read our findings in light of \citeN{ams} is that Townsend's
section 8 model  has too few sources of random shocks relative
to sources of signals to permit higher order beliefs to
play a role.}
The disappearance of higher
order beliefs means that decision makers in this model do not really face
a problem of  forecasting the forecasts of others.  They know those
forecasts because  they are the same as their own.

The presence of
a common  hidden state variable is the only thing that inspires
decision makers in one market to condition their decisions on the
history of prices in the other market. 

\citeN{townsend} noted that in the version of his model with
perpetually and symmetrically uninformed decision makers, the dimension of the state space seemed to explode because it is necessary for
decision makers to keep track of an infinite history of vectors
of observables.

That _curse of dimensionality_ deterred 
Townsend from characterizing  or
computing an equilibrium  of 
that model.

Instead he constructed another model and computed its  equilibrium.
To construct this model he assumed that after a finite number $j$ of periods, the (lagged) value
of the key hidden state variable is revealed to the decision maker.

\citeN{sargent91}  proposed
a way to compute an equilibrium without making Townsend's
approximation. Extending the reasoning of
\citeN{muth60}, Sargent noticed that   it is possible to summarize
the relevant history with a low dimensional object, namely, a
small number of current and lagged forecasting errors.  Positing
an equilibrium in a space of perceived laws of motion for
endogenous variables that take the form of a vector
autoregressive moving average process, Sargent described an equilibrium
as a fixed point of a mapping from the perceived law of motion to
the actual law of motion of that form.  Sargent worked in the time
domain and had to guess and verify the appropriate orders of the
autoregressive and moving average pieces of the equilibrium representation. However, by
working in the frequency domain \citeN{kasa}  showed how to
discover the appropriate orders of the autoregressive and moving average
 parts, and also
how to compute an equilibrium.


  Our recursive computational method, which stays in the time domain, also
discovers  the appropriate orders of the autoregressive and moving
average pieces.  In
addition, by displaying equilibrium representations in the form of
\citeN{PCL}, \citeN{PS2005} showed how the moving average
piece is linked to the innovation
process  of the hidden persistent component of the demand shock.
That scalar innovation process  is the additional state variable
contributed by the problem of extracting a signal from equilibrium
prices that decision makers face in Townsend's model.






This notebook describes components of Robert Townsend's models of two  industries that are 
linked in a single way: shocks to the demand curves for their products have a common component.

We put these components of the model together in several ways that help us appreciate
the structure of the  model that ultimately concerns us -- a model that goes beyond even what Townsend described in his paper.

While keeping all other aspects of the model the same, our procedure is to study consequences
of alternative assumptions about what decision makers observe.

### The setting

Firms in each of two industries $i=1,2$ employ a single factor
of production, capital $k_t^i$, to produce output of a single good,
$y_t^i$.

We let capital letters denote market wide objects and lower
case letters
denote objects chosen by a representative  firm.


A representative firm in industry
$i$  has production function $y_t^i = f k_t^i$, $f >0$, acts as a price
taker with respect to output price $P_t^i$, and maximizes

\begin{equation} \label{town1}
E_0^i \sum_{t=0}^\infty \beta^t \left\{ P_t^i f k_t^i - .5
   h (k_{t+1}^i - k_t^i)^2 \right\} ,
\quad h >0  .\end{equation}

Demand in industry $i$ obeys

\begin{equation} \label{town2}
P_t^i = - b Y_t^i + \theta_t + \epsilon_t^i  , \quad b >0,
\end{equation}

where $Y_t^i = f K_t^i$ is output in market $i$,
$\theta_t$ is a persistent component of a demand shock that is common across
the two industries, and $\epsilon_t^i$ is an industry-specific component of
the demand shock that is i.i.d.\ with time $t$ marginal distribution
${\mathcal N}(0, \sigma_{\epsilon}^2)$.


We assume that $\theta_t$ is governed by

\begin{equation} \label{town2a}
\theta_{t+1} = \rho \theta_t + v_{t}
\end{equation}

where $\{v_{t}\}$ is an i.i.d.\ sequence of Gaussian  shocks with mean
zero and variance $\sigma_v^2$.

To simplify notation, we'll set $h=f=1$.
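As a minimal sketch of the shock structure in (\ref{town2}) and (\ref{town2a}), the following pure-Python snippet simulates $\theta_t$ together with the composite shocks $\theta_t + \epsilon_t^i$ for $i=1,2$. The helper name `simulate_demand_shocks`, the seed, and the initial condition $\theta_0 = 0$ are illustrative assumptions, and the parameter values are example numbers rather than a calibration.

```python
import random

def simulate_demand_shocks(rho, sigma_v, sigma_eps, T, seed=0):
    """Simulate theta_t and the composite shocks theta_t + eps_t^i for i = 1, 2.

    Hypothetical helper: theta_0 = 0 and the seed are illustrative choices.
    """
    rng = random.Random(seed)
    theta = 0.0
    thetas, shock_1, shock_2 = [], [], []
    for _ in range(T):
        thetas.append(theta)
        # Each industry sees the common component plus its own i.i.d. noise.
        shock_1.append(theta + rng.gauss(0.0, sigma_eps))
        shock_2.append(theta + rng.gauss(0.0, sigma_eps))
        # AR(1) law of motion for the common persistent component.
        theta = rho * theta + rng.gauss(0.0, sigma_v)
    return thetas, shock_1, shock_2

rho, sigma_v, sigma_eps = 0.8, 0.5, 0.6
thetas, shock_1, shock_2 = simulate_demand_shocks(rho, sigma_v, sigma_eps, T=50_000)

mean_theta = sum(thetas) / len(thetas)
sample_var = sum((x - mean_theta) ** 2 for x in thetas) / len(thetas)
theory_var = sigma_v ** 2 / (1 - rho ** 2)  # stationary variance of an AR(1)
print(sample_var, theory_var)
```

In a long sample the variance of the simulated $\theta_t$ should be close to $\sigma_v^2/(1-\rho^2)$, the stationary variance implied by (\ref{town2a}).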

#### Big K, little k convention

In equilibrium, $k_t^i = K_t^i$, but as usual we must distinguish between
$k_t^i$ and $K_t^i$ when we pose the firm's optimization problem.

Townsend wanted to assume that at time $t$ firms in industry $i$
observe $k_t^i, Y_t^i, P_t^i, (P^{-i})^t$, where $(P^{-i})^t$ is
the history of prices in the other market up to time $t$.

(Because that turned out to be too challenging, Townsend made an alternative
assumption that eased his calculations: that after a large number $S$ of periods,
firms in industry $i$  observe the noisy signals received by firms in industry $-i$
$S$ periods ago and earlier.  __TOM: ACTUALLY, I THINK HE ASSUMES THAT THEY SEE $\theta_{t-S}$__)

Because the representative firm $i$ sees the aggregate state
variable $Y_t^i$ in its own industry,  as well as the price, it
can infer the total demand shock $\theta_t + \epsilon_{t}^i$.

However, at time $t$, the firm sees only $P_t^{-i}$ and does not
see $Y_t^{-i}$, so that firm $i$ does not appear to see
$\theta_t + \epsilon_t^{-i}$.

### Punch line

Nevertheless, it turns out that in the equilibrium that ultimately interests us
a firm in industry $i$ will be able to infer the
composite shock $\theta_t + \epsilon_t^{-i}$ from the history of random variables that it
does observe at $t$.

We shall proceed  to establish this result in
steps.

#### Strategy

To prepare to solve Townsend's model, we shall first
compute a law of motion for capital in industry $i$ under a sequence of
assumptions about what a firm observes that make its information
increasingly obscure.

We begin  with the most information,
then gradually withdraw information in a way that approaches  and eventually
reaches the
 model that we are ultimately interested in. 

Thus,  we
shall consider the following information assumptions:


  * __Perfect foresight:__ future values of
$\theta_t, \epsilon_{t}^i$ are observed in industry $i$.


  * __Observed but stochastic $\theta_t$:__  while
$\{\theta_t,  \epsilon_{t}^i\}$ are realizations from a stochastic
process, current  and past values of each are observed at time $t$.

  * __One noise-ridden observation on $\theta_t$:__  At time $t$, a history $w^t$ of
a scalar noise-ridden observation on $\theta_t$ is observed.

  * __Two noise-ridden observations on $\theta_t$:__ At time $t$, a history $w^t$ of _two_
 noise-ridden  observations on $\theta_t$ is observed.



Computations for  these problems build one on the other.

We proceed by first finding the solution under perfect foresight.

Then to get the solution with $\theta_t$ observed, we use a _certainty
equivalence principle_ to modify the perfect foresight solution by
replacing future values of $\theta_s, \epsilon_{s}^i, s \geq t$
with mathematical expectations conditioned on $\theta_t$. 

This provides the
solution when $\theta_t$ is observed at $t$ but future $\theta_{t+j}$s and $\epsilon_{t+j}^i$s are not observed. 

To find solutions when only
a history of a noise-ridden observation $w_t$ on $\theta_t$ is
observed, we again apply a certainty equivalence principle and
replace future values of $\theta_s, \epsilon_{s}^i, s \geq t$ with
 expectations conditioned on $w^t$.

In this way, we construct benchmarks against which we can interpret the equilibrium to
Townsend's model XXX that we shall compute  in section \ref{PCL2} by
applying the machinery of \citeN{PCL}. 

Our solution with two
noise-ridden observations on $\theta_t$ will be a __pooling equilibrium__ that turns out to
match
the equilibrium that we seek because equilibrium prices in that equilibrium completely reveal
to firms in industry $i$ the noisy signal about the demand shock received by firms
in industry $-i$.



### Equilibrium conditions

It is
convenient to formulate the  firm's problem as a discrete time Hamiltonian
by forming the Lagrangian for the problem without uncertainty:

\begin{equation}
J=\sum_{t=0}^\infty \beta^t \left\{
 P_t^i  k_t^i - .5   (\mu_t^i)^2  + \phi_t^i \left[
   k_t^i + \mu_t^i - k_{t+1}^i \right]  \right\} \end{equation}
   
where $\{\phi_t^i\}$ is a sequence of Lagrange multipliers on the transition
law for $k_{t+1}^i$ and $\mu_t^i \equiv k_{t+1}^i - k_t^i$ is the change in the firm's capital stock.
First order conditions for the nonstochastic problem are

\begin{eqnarray}
   \phi_t^i & = & \beta \phi_{t+1}^i + \beta  P_{t+1}^i  \label{town4} \\
   \mu_t^i & = & \phi_t^i .  \label{town41} \end{eqnarray}

Substituting the demand function (\ref{town2}) for $P_t^i$,
imposing the condition that the representative firm is
representative ( $k_t^i = K_t^i$),  and using the definition below
of $g_t^i$, the Euler equation (\ref{town4}), lagged by one
period, can be expressed as $- b k_t^i + \theta_t + \epsilon_t^i +
(k_{t+1}^i - k_t^i) - g_t^i =0 $ or

\begin{equation} \label{pcl11}
k_{t+1}^i = (b+1) k_t^i - \theta_t - \epsilon_t^i + g_t^i
\end{equation}

where we define $g_t^i$ by

\begin{equation}\label{town7}
g_t^i = \beta^{-1} (k_t^i
   - k_{t-1}^i)   .
\end{equation}

We can write the Euler equation (\ref{town4}) in terms of $g_t^i$:

\begin{equation} \label{pcl10}
  g_t^i = P_t^i + \beta g_{t+1}^i  .
\end{equation}

In addition, we have the law of motion for $\theta_t$, (\ref{town2a}), and
the demand equation (\ref{town2}).


In summary, with perfect foresight,
 equilibrium conditions for industry $i$  consist of the following
system of difference equations:

\begin{eqnarray}
k_{t+1}^i & = & (1+b)k_t^i - \epsilon_t^i -\theta_t + g_t^i \label{sol1;a} \\
\theta_{t+1} & = & \rho \theta_t + v_t \label{sol1;b} \\
g_{t+1}^i  & = & \beta^{-1} (g_t^i - P_t^i)  \label{sol1;c} \\
P_t^i & = & -b k_t^i + \epsilon_t^i + \theta_t \label{sol1;d} \end{eqnarray}

Without perfect foresight, the same system prevails except that
the following equation replaces
(\ref{sol1;c}):

\begin{equation}
 g_{t+1,t}^i = \beta^{-1} (g_t^i - P_t^i) \end{equation}
 where $x_{t+1,t}$ denotes the mathematical expectation of  $x_{t+1}$ conditional on 
 information at  time $t$.



#### Solution under perfect foresight

Our first step is to compute the equilibrium law of motion
for $k_t^i$ under perfect foresight.
Let $L$ be the lag operator.\footnote{See \citeN{sargent87}, especially
chapters IX and XIV, for the methods used in this section.}
Equations (\ref{pcl10}) and   (\ref{pcl11}) imply the
second order difference equation in $k_t^i$:\footnote{As noted
by \citeN{sargent87}, this difference equation is the Euler equation for
the planning problem   of maximizing the discounted sum of consumer plus
producer surplus.}
\begin{equation} \label{euler1}
\left[ (L^{-1} - (1+b))(1-\beta L^{-1}) + b\right] k_t^i
  = \beta L^{-1} \epsilon_t^i + \beta L^{-1} \theta_t .
\end{equation}
Factor the polynomial in $L$ on the left side as:
\begin{equation}
-\beta [L^{-2} -(\beta^{-1} + (1+b))L^{-1} + \beta^{-1}]
 = \tilde \lambda^{-1}(L^{-1} - \tilde \lambda)(1-\tilde \lambda \beta L^{-1})
\end{equation}
where $|\tilde \lambda | < 1$ is the smaller root and $\lambda$ is
the larger root of $(\lambda-1)(\lambda-1/\beta)=b\lambda$.
Therefore, (\ref{euler1}) can be expressed as
\begin{equation}
\tilde \lambda^{-1}(L^{-1} - \tilde \lambda) (1-\tilde \lambda \beta L^{-1})
k_t^i = \beta L^{-1} \epsilon_t^i + \beta L^{-1} \theta_t .
\end{equation}
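A small numerical sketch can confirm the factorization. It computes the two roots of $(\lambda-1)(\lambda-1/\beta)=b\lambda$ with the quadratic formula and checks that their product is $\beta^{-1}$ and that $\tilde \lambda^{-1} + \beta \tilde \lambda = 1 + \beta(1+b)$, which is what matching coefficients on $L^{-1}$ requires. The function name `characteristic_roots` and the parameter values are illustrative assumptions.

```python
import math

def characteristic_roots(beta, b):
    """Roots of z^2 - (1 + b + 1/beta) z + 1/beta = 0, smaller root first."""
    root_sum = 1.0 + b + 1.0 / beta   # sum of the two roots
    root_prod = 1.0 / beta            # product of the two roots
    disc = math.sqrt(root_sum ** 2 - 4.0 * root_prod)
    return (root_sum - disc) / 2.0, (root_sum + disc) / 2.0

beta, b = 0.9, 0.5  # example values only
lam_tilde, lam = characteristic_roots(beta, b)

# Coefficient on L^{-1} implied by the factored form versus the original polynomial
factored_coef = 1.0 / lam_tilde + beta * lam_tilde
original_coef = 1.0 + beta * (1.0 + b)
print(lam_tilde, lam, factored_coef, original_coef)
```

Because the product of the roots is $\beta^{-1} > 1$ and the smaller root lies inside the unit circle, the larger root equals $(\beta \tilde \lambda)^{-1}$.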
Solving the stable root backwards and the unstable root forwards gives
\begin{equation}
k_{t+1}^i = \tilde \lambda k_t^i + {\tilde \lambda \beta \over 1 -\tilde
\lambda \beta L^{-1}}
  (\epsilon_{t+1}^i + \theta_{t+1}  )
\end{equation}
Thus under perfect foresight the capital stock satisfies
\begin{equation} \label{town5}
k_{t+1}^i = \tilde \lambda k_t^i + \sum_{j=1}^\infty (\tilde \lambda \beta)^j
  (\epsilon_{t+j}^i +  \theta_{t+j}) .
\end{equation}
Next, we shall  use alternative forecasting formulae in
 (\ref{town5}) to compute the equilibrium decision rule under alternative
assumptions about the information available to decision makers in
market $i$.
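The perfect foresight solution (\ref{town5}) can be checked numerically against the Euler equation (\ref{euler1}). The sketch below assumes an arbitrary known path for the composite shock $\epsilon_t^i + \theta_t$ that is zero after a finite horizon, builds the capital path from (\ref{town5}), and verifies that $-\beta k_{t+2}^i + (1+\beta(1+b)) k_{t+1}^i - k_t^i = \beta (\epsilon_{t+1}^i + \theta_{t+1})$, which is (\ref{euler1}) written out term by term. The helper name, the sine-wave shock path, and the parameter values are all illustrative assumptions.

```python
import math

def perfect_foresight_path(lam_tilde, beta, shocks, k0=0.0):
    """Capital path implied by the perfect foresight solution.

    shocks[t] stands for epsilon_t + theta_t; shocks after the end of the
    list are assumed to be zero (an illustrative truncation).
    """
    T = len(shocks)
    # f[t] = sum_{j>=1} (lam_tilde*beta)^j shocks[t+j], built backwards via
    # f[t] = lam_tilde*beta*(shocks[t+1] + f[t+1]).
    f = [0.0] * (T + 1)
    for t in range(T - 1, -1, -1):
        s_next = shocks[t + 1] if t + 1 < T else 0.0
        f[t] = lam_tilde * beta * (s_next + f[t + 1])
    k = [k0]
    for t in range(T):
        k.append(lam_tilde * k[t] + f[t])
    return k

beta, b = 0.9, 0.5  # example values only
root_sum = 1.0 + b + 1.0 / beta
lam_tilde = (root_sum - math.sqrt(root_sum ** 2 - 4.0 / beta)) / 2.0  # stable root

shocks = [math.sin(0.3 * t) for t in range(40)]  # arbitrary known path
k = perfect_foresight_path(lam_tilde, beta, shocks, k0=1.0)

# Residuals of the second order difference equation in k
residuals = [
    -beta * k[t + 2] + (1.0 + beta * (1.0 + b)) * k[t + 1] - k[t]
    - beta * shocks[t + 1]
    for t in range(len(shocks) - 1)
]
max_residual = max(abs(r) for r in residuals)
print(max_residual)
```

The residuals should be zero up to floating point error for any initial $k_0^i$, since the forward-looking sum handles the unstable root and $k_0^i$ only enters through the stable root.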



### Solution with $\theta_t$ stochastic but observed at $t$


If future $\theta$'s are unknown at $t$, it is appropriate to
replace all random variables on the right side of (\ref{town5})
with their conditional expectations based on the information
available to decision makers in market $i$. 

For now, we
assume that this information set is $I_t^p =
\begin{bmatrix} \theta^t &  \epsilon^{it} \end{bmatrix}$, where
$z^t$ represents the infinite history of variable $z_s$ up to time
$t$.

Later we shall  give   firms  less
 information about $\theta_t$.

To obtain the counterpart to (\ref{town5}) under our current
assumption about information, we apply a certainty equivalence principle.
In particular, it is legitimate to  take (\ref{town5}) and replace each term
$( \epsilon_{t+j}^i+ \theta_{t+j}  )$ on the
right side
with
$E[ (\epsilon_{t+j}^i+ \theta_{t+j})  \vert \theta^t ]$.
  After using
(\ref{town2a}) and the i.i.d.\ assumption about $\{\epsilon_t^i\}$, this gives

\begin{equation}
k_{t+1}^i = \tilde \lambda k_t^i + {\tilde \lambda \beta \rho \over 1 -
\tilde \lambda \beta \rho }
\theta_t
\end{equation}

or

\begin{equation} \label{solution1}
k_{t+1}^i = \tilde \lambda k_t^i  + {\rho \over  \lambda - \rho} \theta_t
  \end{equation}

where
$  \lambda \equiv (\beta \tilde \lambda)^{-1} $.
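The step from the geometric sum to (\ref{solution1}) can be verified numerically. The sketch below compares a truncated version of $\sum_{j=1}^\infty (\tilde \lambda \beta \rho)^j$ with the two closed forms above; the function name `theta_coefficient`, the truncation at 200 terms, and the parameter values are illustrative assumptions.

```python
import math

def theta_coefficient(beta, b, rho, n_terms=200):
    """Return three expressions for the coefficient on theta_t."""
    root_sum = 1.0 + b + 1.0 / beta
    lam_tilde = (root_sum - math.sqrt(root_sum ** 2 - 4.0 / beta)) / 2.0
    lam = 1.0 / (beta * lam_tilde)
    x = lam_tilde * beta * rho
    series = sum(x ** j for j in range(1, n_terms + 1))  # truncated sum
    ratio_form = x / (1.0 - x)                           # geometric series limit
    closed_form = rho / (lam - rho)                      # form in terms of lambda
    return series, ratio_form, closed_form

series, ratio_form, closed_form = theta_coefficient(beta=0.9, b=0.5, rho=0.8)
print(series, ratio_form, closed_form)
```

The i.i.d. shocks $\epsilon_{t+j}^i$ drop out because their conditional expectations are zero for $j \geq 1$, so only the $\theta$ terms contribute to the sum.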

For future purposes, it is useful to represent the solution for $k_t^i$
recursively as

\begin{eqnarray}
k_{t+1}^i  & = & \tilde \lambda k_t^i  + {1 \over \lambda - \rho}
\hat \theta_{t+1} \label{sol0;a} \\
\hat \theta_{t+1}  & = & \rho \theta_t \label{sol0;b} \\
\theta_{t+1} & = & \rho \theta_t + v_t.  \label{sol0;c} \end{eqnarray}





## Filtering


#### One noisy signal

We get closer to the model that we ultimately want to study  by now assuming
  that firms in market $i$ do not observe
 $\theta_t$, but instead observe a history of noisy
signals $w^t$. 

In particular, assume that

\begin{eqnarray}
 w_t  & =  & \theta_t + e_t  \label{kf1}  \\
 \theta_{t+1} & = & \rho \theta_t + v_t \label{kf2}
\end{eqnarray}

where $e_t$ and $v_t$  are  mutually independent i.i.d.\ Gaussian
shock processes with means of zero and variances $\sigma_e^2$ and
$\sigma_v^2$, respectively.




Define

\begin{equation} \hat \theta_{t+1} = E(\theta_{t+1} | w^t)
\end{equation}

where $w^t$ denotes the history of the $w_s$ process up  to and including
$t$.   

Associated with the state-space  representation
(\ref{kf1}),(\ref{kf2}) is the _innovations representation_

\begin{eqnarray}
\hat \theta_{t+1}  & =   &  \rho \hat \theta_t + k a_t \label{kf3} \\
 w_t & = & \hat \theta_t + a_t \label{kf4}
\end{eqnarray}

where $a_t \equiv w_t  - E(w_t | w^{t-1}) $ is  the _innovations_
process in $w_t$ and the Kalman gain $k$ is

\begin{equation}\label{kal1}
k = {\rho p \over p + \sigma_e^2} \end{equation} 

and where $p$
satisfies the Riccati equation

\begin{equation}\label{kf6}
p = \sigma_v^2   + { p \rho^2 \sigma_e^2 \over \sigma_e^2 + p}.
\end{equation}
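Because everything is scalar here, the Riccati equation (\ref{kf6}) can be solved in closed form. Multiplying through by $\sigma_e^2 + p$ gives the quadratic $p^2 + [\sigma_e^2(1-\rho^2) - \sigma_v^2] p - \sigma_v^2 \sigma_e^2 = 0$, whose positive root is the steady-state variance. The sketch below computes that root and the gain (\ref{kal1}), and cross-checks by iterating on (\ref{kf6}); the function name and parameter values are illustrative assumptions.

```python
import math

def one_signal_filter(rho, sigma_v, sigma_e):
    """Steady-state variance p and Kalman gain k for one noisy signal."""
    c = sigma_e ** 2 * (1.0 - rho ** 2) - sigma_v ** 2
    # Positive root of p^2 + c p - sigma_v^2 sigma_e^2 = 0
    p = (-c + math.sqrt(c * c + 4.0 * sigma_v ** 2 * sigma_e ** 2)) / 2.0
    k = rho * p / (p + sigma_e ** 2)
    return p, k

rho, sigma_v, sigma_e = 0.8, 0.5, 0.6  # example values only
p, k = one_signal_filter(rho, sigma_v, sigma_e)

# Cross-check: iterate on the Riccati equation starting from p = sigma_v^2
p_iter = sigma_v ** 2
for _ in range(200):
    p_iter = sigma_v ** 2 + p_iter * rho ** 2 * sigma_e ** 2 / (sigma_e ** 2 + p_iter)

print(p, p_iter, k)
```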

Define the state _reconstruction error_ $\tilde \theta_t$ by

\begin{equation}
\tilde \theta_t = \theta_t - \hat \theta_t .
\end{equation}

Then $p = E \tilde \theta_t^2$.
Equations (\ref{kf2}) and (\ref{kf3}) imply

\begin{equation} \label{kf7}
\tilde \theta_{t+1} = (\rho - k) \tilde \theta_t + v_t - k e_t .
\end{equation}

Now notice that we can express $\hat \theta_{t+1}$ as

\begin{equation} \label{kf8}
\hat \theta_{t+1} = [\rho \theta_t + v_t]
  + [ ke_t - (\rho -k) \tilde \theta_t - v_t]  ,
\end{equation}

where the first term in brackets equals
$\theta_{t+1}$ and the second term in brackets equals $-\tilde \theta_{t+1}$.
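The following sketch simulates the state-space system (\ref{kf1}), (\ref{kf2}) alongside the steady-state filter (\ref{kf3}) and checks two things: that the reconstruction error obeys (\ref{kf7}) period by period, and that its sample second moment is close to $p$. The function name, the seed, the zero initial conditions, and the parameter values are illustrative assumptions.

```python
import math
import random

def check_reconstruction_error(rho, sigma_v, sigma_e, T=50_000, seed=0):
    """Simulate the filter and compare theta - theta_hat with recursion (kf7)."""
    c = sigma_e ** 2 * (1.0 - rho ** 2) - sigma_v ** 2
    p = (-c + math.sqrt(c * c + 4.0 * sigma_v ** 2 * sigma_e ** 2)) / 2.0
    k = rho * p / (p + sigma_e ** 2)

    rng = random.Random(seed)
    theta, theta_hat = 0.0, 0.0
    max_gap, sum_sq = 0.0, 0.0
    for _ in range(T):
        e = rng.gauss(0.0, sigma_e)
        v = rng.gauss(0.0, sigma_v)
        theta_tilde = theta - theta_hat
        sum_sq += theta_tilde ** 2
        a = theta + e - theta_hat                  # innovation a_t = w_t - theta_hat_t
        theta_hat_next = rho * theta_hat + k * a   # filter recursion (kf3)
        theta_next = rho * theta + v               # state law (kf2)
        predicted = (rho - k) * theta_tilde + v - k * e  # right side of (kf7)
        gap = abs((theta_next - theta_hat_next) - predicted)
        max_gap = max(max_gap, gap)
        theta, theta_hat = theta_next, theta_hat_next
    return max_gap, sum_sq / T, p

max_gap, second_moment, p = check_reconstruction_error(0.8, 0.5, 0.6)
print(max_gap, second_moment, p)
```

The first check is an algebraic identity and should hold to machine precision; the second is statistical and only holds approximately in a finite sample.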


#### Additional state variable: the $\theta$-reconstruction error

We can express (\ref{solution1}) as

\begin{equation} \label{solution2}
 k_{t+1}^i = \tilde \lambda k_t^i + {1 \over \lambda - \rho}
  E [\theta_{t+1} | \theta^t] .
\end{equation}

An application of a certainty equivalence principle asserts that
when only $w^t$ is observed, the appropriate solution
is found by   replacing the information set
$\theta^t$ with $w^t$ in
(\ref{solution2}). Making this substitution and using  (\ref{kf8})
leads to

\begin{equation} \label{kf9}
k_{t+1}^i = \tilde \lambda k_t^i + {\rho \over  \lambda - \rho} \theta_t
  + {k \over  \lambda - \rho} e_t
  - {\rho - k \over  \lambda - \rho} \tilde \theta_t .
\end{equation}

Simplifying equation (\ref{kf8}), we also have

\begin{equation} \label{kf8a}
\hat \theta_{t+1} = \rho \theta_t
  +  ke_t - (\rho -k) \tilde \theta_t  .
\end{equation}

Equations (\ref{kf9}), (\ref{kf8a}) describe the solution when
$w^t$ is observed.  

Relative to (\ref{solution1}), the solution
acquires a new state variable, namely, the
$\theta$--reconstruction error, $\tilde \theta_t$.

For future
purposes, by using (\ref{kal1}), it is useful to write (\ref{kf9})
as

\begin{equation}
k_{t+1}^i = \tilde \lambda k_t^i + {\rho \over  \lambda - \rho } \theta_t
+ {1 \over  \lambda - \rho} {p \rho \over p + \sigma_e^2} e_t
  - {1 \over \lambda - \rho} {\rho \sigma_e^2 \over p + \sigma_e^2}
  \tilde \theta_t \label{sol2a}
\end{equation}

In summary, when decision makers in market $i$ observe a noisy signal $w_t$ 
on $\theta_t$ at $t$, we can represent an equilibrium law of motion
for $k_t^i$ as

\begin{eqnarray}
k_{t+1}^i & = & \tilde \lambda k_t^i + {1 \over \lambda - \rho}
  \hat \theta_{t+1} \label{sol4;a} \\
\hat \theta_{t+1} & = & \rho \theta_t + {\rho p \over p + \sigma_e^2} e_t
   - {\rho \sigma_e^2 \over p + \sigma_e^2} \tilde \theta_t
  \label{sol4;b} \\
\tilde \theta_{t+1} & = & { \rho \sigma_e^2 \over p + \sigma_e^2} \tilde
    \theta_t - {p \rho \over p + \sigma_e^2} e_t + v_t
  \label{sol4;c} \\
  \theta_{t+1} & = & \rho \theta_t + v_t .  \label{sol4;d} \end{eqnarray}
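As a consistency check on (\ref{sol4;c}), note that $\tilde \theta_t$ depends only on shocks dated $t-1$ and earlier, so it is uncorrelated with $e_t$ and $v_t$. Taking second moments of both sides of (\ref{sol4;c}) in the stationary distribution and then using the Riccati equation (\ref{kf6}) gives

\begin{eqnarray*}
E \tilde \theta_{t+1}^2 & = & \left( {\rho \sigma_e^2 \over p + \sigma_e^2} \right)^2 p
   + \left( {p \rho \over p + \sigma_e^2} \right)^2 \sigma_e^2 + \sigma_v^2 \\
 & = & {\rho^2 \sigma_e^2 p \, (\sigma_e^2 + p) \over (p + \sigma_e^2)^2} + \sigma_v^2 \\
 & = & \sigma_v^2 + {p \rho^2 \sigma_e^2 \over \sigma_e^2 + p} \; = \; p ,
\end{eqnarray*}

so the recursion for $\tilde \theta_t$ reproduces $p = E \tilde \theta_t^2$.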



### Two noisy signals

We now construct a __pooling equilibrium__ 
by assuming that a firm in industry $i$ receives a vector $w_t$ of _two_ noisy signals  on $\theta_t$:
 
\begin{eqnarray}
  \theta_{t+1} & = & \rho \theta_t + v_t  \label{kf20} \\
  w_t   & = &
 \begin{bmatrix} 1 \\ 1 \end{bmatrix}
 \theta_t
   + \begin{bmatrix} e_{1t} \\ e_{2t} \end{bmatrix} \label{kf21}
\end{eqnarray}

To justify calling what we are constructing a __pooling equilibrium__, we can
assume that

\begin{equation}
\begin{bmatrix} e_{1t} \\ e_{2t}  \end{bmatrix} =
\begin{bmatrix} \epsilon_{t}^1 \\ \epsilon_{t}^2  \end{bmatrix}
\end{equation}

so that a firm in industry $i$ observes the noisy signals on $\theta_t$
presented to firms in both industries $i$ and $-i$.


The appropriate innovations representation becomes

\begin{eqnarray}
  \hat \theta_{t+1} & = &  \rho
  \hat \theta_t + k a_t \label{kf22} \\
  w_t  & = &
 \begin{bmatrix} 1 \\ 1 \end{bmatrix} \hat \theta_t + a_t
  \label{kf23}
\end{eqnarray}

where $a_t \equiv w_t - E [w_t | w^{t-1}]$ is a $(2 \times 1)$
vector of innovations in $w_t$ and $k$ is now a $(1 \times 2)$
vector of Kalman gains.  

Formulas for the Kalman filter imply
that

\begin{equation} \label{kf24}
  k ={ \rho  p \over 2 p + \sigma_e^2}
\begin{bmatrix}1 & 1 \end{bmatrix}
\end{equation}

where $p = E \tilde \theta_t \tilde \theta_t^T$ now satisfies the
Riccati equation
\begin{equation}\label{ricc2}
  p = \sigma_v^2 + {p \rho^2 \sigma_e^2 \over 2 p + \sigma_e^2}.
\end{equation}
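The gain formula (\ref{kf24}) can be checked against the general matrix expression for the Kalman gain, which here is $\rho p C^T (C p C^T + R)^{-1}$ with $C = \begin{bmatrix} 1 & 1 \end{bmatrix}^T$ and $R = \sigma_e^2 I$. The sketch below inverts the $2 \times 2$ matrix by hand, solves the Riccati equations for one and two signals in closed form, and confirms that a second signal lowers $p$. The helper names and parameter values are illustrative assumptions, and the closed form is only claimed for one or two signals.

```python
import math

def riccati_variance(rho, sigma_v, sigma_e, n_signals):
    """Positive root of p = sigma_v^2 + p rho^2 sigma_e^2 / (n p + sigma_e^2).

    Hypothetical helper intended for n = 1 or n = 2 signals, as in the text.
    """
    n = n_signals
    c = sigma_e ** 2 * (1.0 - rho ** 2) - n * sigma_v ** 2
    disc = math.sqrt(c * c + 4.0 * n * sigma_v ** 2 * sigma_e ** 2)
    return (-c + disc) / (2.0 * n)

def two_signal_gain(rho, p, sigma_e):
    """Kalman gain rho * p * C' (C p C' + R)^{-1}, C = [1, 1]', R = sigma_e^2 I."""
    d = p + sigma_e ** 2   # diagonal entry of S = C p C' + R
    det = d * d - p * p    # determinant of the 2x2 matrix S
    s_inv = [[d / det, -p / det], [-p / det, d / det]]
    # Row vector [1, 1] times S^{-1}, scaled by rho * p
    return [rho * p * (s_inv[0][j] + s_inv[1][j]) for j in range(2)]

rho, sigma_v, sigma_e = 0.8, 0.5, 0.6  # example values only
p_one = riccati_variance(rho, sigma_v, sigma_e, n_signals=1)
p_two = riccati_variance(rho, sigma_v, sigma_e, n_signals=2)
gain = two_signal_gain(rho, p_two, sigma_e)
gain_formula = rho * p_two / (2.0 * p_two + sigma_e ** 2)
print(p_one, p_two, gain, gain_formula)
```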



Thus, when the representative firm in industry $i$
observes _two_ noisy signals on $\theta_t$,
we can express the equilibrium law of motion for capital recursively as

\begin{eqnarray}
k_{t+1}^i & = & \tilde \lambda k_t^i + {1 \over \lambda - \rho}
  \hat \theta_{t+1} \label{sol3;a} \\
\hat \theta_{t+1} & = & \rho \theta_t + {\rho p \over 2  p + \sigma_e^2}
(e_{1t}+e_{2t})
   - {\rho \sigma_e^2 \over 2 p + \sigma_e^2} \tilde \theta_t
  \label{sol3;b} \\
\tilde \theta_{t+1} & = & { \rho \sigma_e^2 \over 2 p + \sigma_e^2} \tilde
    \theta_t - {p \rho \over 2 p + \sigma_e^2}
(e_{1t}+e_{2t}) +v_t
  \label{sol3;c} \\
  \theta_{t+1} & = & \rho \theta_t + v_t .  \label{sol3;d} \end{eqnarray}
  
  
Pearlman and Sargent verify that the above representation is equivalent to
what one obtains by using the machinery of PCL XXXXX

#### TOM REWRITE THE FOLLOWING

We shall encounter versions of precisely these formulae again in
section \ref{PCL2} where we compute the equilibrium of Townsend's
model in which the representative firm in industry $i$ receives a
second noisy signal on $\theta_t$ by inferring it from $P_t^{-i}$
and the other information that it has at time $t$. By extracting
signals from the endogenous state variables, it will turn out that
the firm recovers exactly the same process for the key additional
state variable, the state reconstruction error $\tilde \theta_t$,
that imperfect information contributes to the dynamics.





## System Description

\begin{eqnarray*}
k_{t+1}^{i} & = & \tilde{\lambda}k_{t}^{i}+\frac{1}{\lambda-\rho}\hat{\theta}_{t+1}\\
\hat{\theta}_{t+1} & = & \rho\theta_{t}+\frac{\rho p}{2p+\sigma_{e}^{2}}\left(e_{1,t}+e_{2,t}\right)-\frac{\rho\sigma_{e}^{2}}{2p+\sigma_{e}^{2}}\tilde{\theta}_{t}\\
\tilde{\theta}_{t+1} & = & \frac{\rho\sigma_{e}^{2}}{2p+\sigma_{e}^{2}}\tilde{\theta}_{t}-\frac{p\rho}{2p+\sigma_{e}^{2}}\left(e_{1,t}+e_{2,t}\right)+v_{t}\\
\theta_{t+1} & = & \rho\theta_{t}+v_{t}\\
e_{1,t},e_{2,t} & \sim & \mathcal{N}\left(0,\sigma_{e}^{2}\right)\\
v_{t} & \sim & \mathcal{N}\left(0,\sigma_{v}^{2}\right)
\end{eqnarray*}

where:

\begin{eqnarray*}
\left(\tilde{\lambda}-1\right)\left(\tilde{\lambda}-\frac{1}{\beta}\right) & = & b\tilde{\lambda}\\
\left(\lambda-1\right)\left(\lambda-\frac{1}{\beta}\right) & = & b\lambda\\
\tilde{\lambda} & \leq & \lambda\\
p & = & \sigma_{v}^{2}+\frac{p\rho^{2}\sigma_{e}^{2}}{2p+\sigma_{e}^{2}}
\end{eqnarray*}

Parameters: $\beta$, $\rho$, $b$, $\sigma_v$, and $\sigma_e$

## Computational Strategy

#### Step 1: Solve for $\tilde{\lambda}$ and $\lambda$

1. Cast $\left(\lambda-1\right)\left(\lambda-\frac{1}{\beta}\right)=b\lambda$ as $p\left(\lambda\right)=0$ where $p$ is a polynomial function of $\lambda$.
2. Use `numpy.roots` to solve for the roots of $p$
3. Verify $\lambda \approx \frac{1}{\beta\tilde{\lambda}}$

Note that $p\left(\lambda\right)=\lambda^{2}-\left(1+b+\frac{1}{\beta}\right)\lambda+\frac{1}{\beta}$. 

#### Step 2: Solve for $p$

1. Cast $p=\sigma_{v}^{2}+\frac{p\rho^{2}\sigma_{e}^{2}}{2p+\sigma_{e}^{2}}$ as a discrete matrix Riccati equation.
2. Use `quantecon.solve_discrete_riccati` to solve for $p$
3. Verify $p \approx\sigma_{v}^{2}+\frac{p\rho^{2}\sigma_{e}^{2}}{2p+\sigma_{e}^{2}}$

Note that:

\begin{eqnarray*}
A & = & \left[\begin{array}{c}
\rho\end{array}\right]\\
B & = & \left[\begin{array}{c}
\sqrt{2}\end{array}\right]\\
R & = & \left[\begin{array}{c}
\sigma_{e}^{2}\end{array}\right]\\
Q & = & \left[\begin{array}{c}
\sigma_{v}^{2}\end{array}\right]\\
N & = & \left[\begin{array}{c}
0\end{array}\right]
\end{eqnarray*}
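To see why these scalars work, assume the convention documented for `quantecon.solve_discrete_riccati`, namely that it solves $X = A'XA - (N + B'XA)'(B'XB + R)^{-1}(N + B'XA) + Q$ for $X$. Substituting the $1 \times 1$ matrices above gives

\begin{eqnarray*}
X & = & \rho^{2}X-\frac{\left(\sqrt{2}\rho X\right)^{2}}{2X+\sigma_{e}^{2}}+\sigma_{v}^{2}\\
 & = & \sigma_{v}^{2}+\rho^{2}X\,\frac{\left(2X+\sigma_{e}^{2}\right)-2X}{2X+\sigma_{e}^{2}}\\
 & = & \sigma_{v}^{2}+\frac{X\rho^{2}\sigma_{e}^{2}}{2X+\sigma_{e}^{2}}
\end{eqnarray*}

which is the Riccati equation for $p$ with $X = p$. The choice $B = \sqrt{2}$ is what produces the $2p$ in the denominator.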

#### Step 3: Represent the system using `quantecon.LinearStateSpace`

\begin{eqnarray*}
\left[\begin{array}{c}
k_{t+1}^{i}\\
\hat{\theta}_{t+1}\\
\tilde{\theta}_{t+1}\\
\theta_{t+1}
\end{array}\right] & = & \underbrace{\left[\begin{array}{cccc}
\tilde{\lambda} & 0 & \frac{1}{\lambda-\rho}\frac{-\rho\sigma_{e}^{2}}{2p+\sigma_{e}^{2}} & \frac{\rho}{\lambda-\rho}\\
0 & 0 & \frac{-\rho\sigma_{e}^{2}}{2p+\sigma_{e}^{2}} & \rho\\
0 & 0 & \frac{\rho\sigma_{e}^{2}}{2p+\sigma_{e}^{2}} & 0\\
0 & 0 & 0 & \rho
\end{array}\right]}_{A}\left[\begin{array}{c}
k_{t}^{i}\\
\hat{\theta}_{t}\\
\tilde{\theta}_{t}\\
\theta_{t}
\end{array}\right]+\underbrace{\left[\begin{array}{ccc}
\frac{\sigma_{e}}{\lambda-\rho}\frac{\rho p}{2p+\sigma_{e}^{2}} & \frac{\sigma_{e}}{\lambda-\rho}\frac{\rho p}{2p+\sigma_{e}^{2}} & 0\\
\sigma_{e}\frac{\rho p}{2p+\sigma_{e}^{2}} & \sigma_{e}\frac{\rho p}{2p+\sigma_{e}^{2}} & 0\\
-\sigma_{e}\frac{\rho p}{2p+\sigma_{e}^{2}} & -\sigma_{e}\frac{\rho p}{2p+\sigma_{e}^{2}} & \sigma_{v}\\
0 & 0 & \sigma_{v}
\end{array}\right]}_{C}\left[\begin{array}{c}
z_{1,t+1}\\
z_{2,t+1}\\
z_{3,t+1}
\end{array}\right]\\
G & = & \left[\begin{array}{cccc}
0 & 0 & 0 & 0\end{array}\right]\\
H & = & \left[\begin{array}{c}
0\end{array}\right]\\
\left[\begin{array}{c}
z_{1,t+1}\\
z_{2,t+1}\\
z_{3,t+1}
\end{array}\right] & \sim & \mathcal{N}\left(0,I\right)
\end{eqnarray*}

Initial state: $\left[\begin{array}{cccc}
0 & 0 & 0 & 0\end{array}\right]'$

As usual, this representation is one of many possible representations.

In [92]:
import numpy as np
import quantecon as qe
from plotly.subplots import make_subplots
import plotly.graph_objects as go
import plotly.express as px
import plotly.offline as pyo


pyo.init_notebook_mode(connected=True)

In [93]:
β = 0.9  # Discount factor
ρ = 0.8  # Persistence parameter for the hidden state
b = 0.5  # Demand curve parameter
σ_v = 0.5  # Standard deviation of shock to θ_t 
σ_e = 0.6  # Standard deviation of shocks to w_t

In [94]:
# Compute λ_tilde and λ as the roots of λ^2 - (1 + b + 1/β) λ + 1/β = 0
poly = np.array([1, -(1 + b + 1 / β), 1 / β])
roots_poly = np.roots(poly)
λ_tilde = roots_poly.min()
λ = roots_poly.max()

In [95]:
# Verify that λ = (βλ_tilde) ^ (-1)
tol = 1e-12
np.max(np.abs(λ - 1 / (β * λ_tilde))) < tol

True

In [96]:
A_ricc = np.array([[ρ]])
B_ricc = np.array([[np.sqrt(2)]])
R_ricc = np.array([[σ_e ** 2]])
Q_ricc = np.array([[σ_v ** 2]])
N_ricc = np.zeros((1, 1))
p = qe.solve_discrete_riccati(A_ricc, B_ricc, Q_ricc, R_ricc, N_ricc).item()

In [97]:
# Verify that p = σ_v^2 + (pρ^2σ_e^2) / (2p + σ_e^2)
tol = 1e-12
np.abs(p - (σ_v ** 2 + p * ρ ** 2 * σ_e ** 2 / (2 * p + σ_e ** 2))) < tol

True

In [98]:
term_0 = -ρ * σ_e ** 2 / (2 * p + σ_e ** 2)
term_1 = ρ * p / (2 * p + σ_e ** 2)

# State vector ordering: [k, θ_hat, θ_tilde, θ]
A_lss = np.array([[λ_tilde, 0., term_0 / (λ - ρ), ρ / (λ - ρ)],
                  [0., 0., term_0, ρ],
                  [0., 0., -term_0, 0.],
                  [0., 0., 0., ρ]])

# Shock vector ordering: [e_1 / σ_e, e_2 / σ_e, v / σ_v], each standard normal
C_lss = np.array([[term_1 * σ_e / (λ - ρ), term_1 * σ_e / (λ - ρ), 0.],
                  [term_1 * σ_e, term_1 * σ_e, 0.],
                  [-term_1 * σ_e, -term_1 * σ_e, σ_v],
                  [0., 0., σ_v]])

G_lss = np.zeros((1, 4))

In [99]:
mu_0 = np.array([0., 0., 0., 0.])

lss = qe.LinearStateSpace(A_lss, C_lss, G_lss, mu_0=mu_0)

In [100]:
ts_length = 100_000
x, y = lss.simulate(ts_length)

In [101]:
# Plot sample time path
t = 300

subplot_titles = [r'$k^{i}_t$',
                  r'$\hat{\theta}_t$',
                  r'$\tilde{\theta}_t$',
                  r'$\theta_t$']

fig = make_subplots(rows=x.shape[0], cols=1, subplot_titles=subplot_titles)

for idx in range(x.shape[0]):
    fig.add_trace(go.Scatter(y=x[idx, :t],
                             legendgroup='trend'),
                  row=idx+1,
                  col=1)

fig.update_layout(height=1200)

fig.show()

In [102]:
# Verify: \theta = \hat{\theta} + \tilde{\theta} 
np.max(np.abs(x[1] + x[2] - x[3]))

4.440892098500626e-16

In [103]:
fig = px.histogram(x[2], title=r'$\mathrm{Histogram: }\: \tilde{\theta}_{t}$')
fig.update_layout(height=500)   

In [104]:
# Compute the mean of \tilde{\theta}
x[2].mean()

-0.0043980058497289165

#### Comment for Tom

For all $k_{t}^{i},\theta_{t},\tilde{\theta}_{t}\in\mathbb{R}$, the event $\left.k_{t+1}^{i}<0\right|k_{t}^{i},\theta_{t},\tilde{\theta}_{t}$ does not have probability 0 given that the support of the normal distribution is the real line. How should this be reconciled with interpreting $k_{t}^{i}$ as capital?