<br><font color = darkblue size=6><strong>CQF Master Notes</strong></font> 

# The Random Behavior of Assets

## The three main types of analysis used in finance

1. Fundamental analysis: Not good, as the market can stay irrational for longer than you can stay solvent. 

2. Technical analysis: Not good

3. Quantitative analysis:
    * The unpredictability of asset prices is the most important feature of financial modeling
    * Because there is so much randomness, the most successful mathematical models of financial assets have a probabilistic foundation.
    * The absolute value of the investment is of less interest; we are more interested in the **return**, which means the ‘relative’ growth in the value of an asset, together with accumulated cashflows (such as dividends), over some period, based on the value that the asset started with.
    * In finance theory this return is usually treated as being **random**.
    * The random return is often assumed to be Normally distributed. This is not perfect but is a good starting point
    * The asset price can then be modeled as a lognormal random walk

## Modeling Returns

### Discrete Time

If the asset value on the ith day is denoted by $S_i$, then the return from day i to day i + 1 is given by:

$R_i = \frac{S_{i+1}-S_i}{S_i}$

We will model the returns each day as random, and independent from one day to the next.

* Mean = $\overline{R} = \frac{1}{M}\sum_{i=1}^{M}R_i$

* Sample standard deviation = $\sqrt{\frac{1}{M-1}\sum_{i=1}^{M}(R_i-\overline{R})^2}$
* Where M is the number of returns in the sample
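
The sample statistics above can be sketched in a few lines of Python. This is a minimal illustration, not a production estimator: the price list is made up, and the function names `simple_returns` and `mean_and_sample_std` are hypothetical helpers introduced here.

```python
import math

def simple_returns(prices):
    """Return R_i = (S_{i+1} - S_i) / S_i for consecutive prices."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]

def mean_and_sample_std(returns):
    """Mean and sample standard deviation (divisor M - 1) of the returns."""
    m = len(returns)
    mean = sum(returns) / m
    var = sum((r - mean) ** 2 for r in returns) / (m - 1)
    return mean, math.sqrt(var)

# Hypothetical closing prices, for illustration only.
prices = [100.0, 102.0, 101.0, 103.0, 106.0]
returns = simple_returns(prices)
mean, std = mean_and_sample_std(returns)
print(returns, mean, std)
```

Five prices give M = 4 returns, which is why the divisor in the standard deviation is 3 here.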

We are then curious about the shape of the distribution (i.e. PDF):
* Step 1: standardize the distribution to give it a mean of zero and a standard deviation of one.
* Step 2: compare it to the standardized normal distribution: 
    $\frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}\phi^2}$, where $\phi$ is a standardized Normal random variable (mean zero, standard deviation one).
    
If we assume that the empirical returns can be modeled by a Normal distribution then we can model R:

$R_i = \frac{S_{i+1}-S_i}{S_i}$ = mean + std x $\phi$, where mean and std are known, constant, and non-zero.

    Note: Commodities may show seasonal behaviour, so the mean and standard deviation may vary with time.
    
    
Assume $\delta t$ is the time step where $\delta t =\frac{t}{M}$ and $\delta t \rightarrow 0$:

* mean = $\mu\delta t$. Ignoring the randomness for the moment, $\frac{S_{i+1}-S_i}{S_i} = \mu\delta t$
   
   so, $S_{i+1} = S_i(1+\mu\delta t)$
   
   and $S_M = S_0(1+\mu\delta t)^M = S_0e^{M\log(1+\mu\delta t)} \approx S_0e^{M\mu\delta t} = S_0e^{\mu M\delta t} = S_0e^{\mu t}$
   
   Therefore, $S(t) = S_0e^{\mu t}$
   
   $\mu$ is called the growth rate or drift rate.
   
   
* standard deviation = $\sigma\delta t^{1/2}$, where $\sigma$ is a parameter measuring the amount of randomness, or volatility, or annualized standard deviation of returns.

    note: $Std[X+Y] \neq Std[X]+Std[Y]$, but $Var[X+Y] = Var[X] + Var[Y]$ when X and Y are independent
    
Therefore, 
$R_i = \frac{S_{i+1}-S_i}{S_i}$ = mean + std x $\phi = \mu\delta t+\sigma\phi\delta t^{1/2}$

$\rightarrow S_{i+1}-S_i = \mu S_i\delta t+\sigma S_i\phi\delta t^{1/2}$

$\rightarrow S_{i+1}= (1+\mu\delta t)S_i+\sigma S_i\phi\delta t^{1/2}$, equations in this form are the basis for **Monte Carlo simulations**. This is a discrete-time model for a **random walk** of the asset.
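
As a rough sketch of how this recursion is used in a Monte Carlo simulation, the snippet below iterates it with the standard library only. The parameter values (drift 10%, volatility 20%, 252 daily steps) and the name `simulate_path` are illustrative assumptions, not part of the notes.

```python
import math
import random

def simulate_path(s0, mu, sigma, dt, n_steps, rng):
    """Iterate S_{i+1} = (1 + mu*dt) S_i + sigma S_i phi sqrt(dt)."""
    path = [s0]
    for _ in range(n_steps):
        phi = rng.gauss(0.0, 1.0)  # phi ~ N(0, 1), independent each step
        s = path[-1]
        path.append((1.0 + mu * dt) * s + sigma * s * phi * math.sqrt(dt))
    return path

rng = random.Random(42)  # fixed seed so the run is repeatable
path = simulate_path(s0=100.0, mu=0.1, sigma=0.2, dt=1 / 252, n_steps=252, rng=rng)
print(path[-1])
```

Setting `sigma=0.0` removes the random term and reproduces the deterministic growth $S_0(1+\mu\delta t)^M$ derived earlier.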


Because of their different scalings with time, the growth and volatility have different effects on the asset path.
* The growth is not apparent over short timescales. The volatility dominates in the short term.
* Over long timescales, for instance decades, the growth becomes important.


### Continuous Time

We want to get to a continuous time model because maths is easier in continuous time.

#### The Wiener process

d· means ‘the change in’ some quantity. So dS is the ‘change in the asset price.’ 

But this change will be in continuous time. In effect, we will go to the limit $\delta t \rightarrow 0$.

In this limit the random term becomes the increment of a Wiener process: $\phi\delta t^{1/2} \rightarrow dX$

Here dX is a random variable, drawn from a Normal distribution with mean zero and variance dt:
* $E[dX] = 0$ and $E[dX^2] = dt$.


#### Stochastic Differential Equation (SDE)

Using the Wiener process notation, the asset price model $S_{i+1}-S_i = \mu S_i\delta t+\sigma S_i\phi\delta t^{1/2}$ becomes, in the limit $\delta t \rightarrow 0$, $dS = \mu Sdt+\sigma SdX$

It is a continuous-time model of an asset price. It is the most widely accepted model for equities, currencies, commodities and indices, and the foundation of much finance theory.


# Binomial Model (Need to review the additional notes)

The most ‘accessible’ approach to option pricing is the binomial model. It is a very useful tool for conveying the ideas of delta hedging and no arbitrage, in addition to the subtle concept of risk neutrality and option pricing.

## Option Pricing Theory

Key assumptions
* Short selling allowed
* No arbitrage opportunities

Relaxable assumptions:

* Frictionless markets - no transaction costs, limits to trading or taxes
* Perfect liquidity
* Known volatility and interest rates
* No dividends on the underlying

## Binomial Model

Key assumptions:

* an asset value changes only at discrete time intervals
* fractional trading is allowed
* an asset's worth can change to one of only two possible new values at each time step.


The stock volatility is of key importance in the pricing of options, but the drift is not.

Assume $\Delta$ is the number of shares, then:

$\Delta = \frac{V^+-V^-}{S^+-S^-} = \frac{\mbox{Range of option payoffs}}{\mbox{Range of stock prices}}$

We can think of ∆ as the sensitivity of the option to changes in the stock

Delta hedging means choosing ∆ such that the portfolio value does not depend on the direction of the stock.

The above is in discrete time, discrete stock. When we go to continuous time continuous stock $\Delta$ will become $\frac{\partial V}{\partial S}$

If interest rate $r$ is not zero, then we need a discount factor.

Consider a portfolio $\Pi$, long an option and short $\Delta$ assets. $V^+$ and $V^-$ denote the option values corresponding to asset prices $S^+$ and $S^-$.

No-arbitrage suggests that:

$\Pi e^{rT} = \Pi^- = \Pi^+$
$=V^- -\Delta S^-$
$= V^- - (\frac{V^+-V^-}{S^+-S^-})S^-$
$=\frac{V^-(S^+ - S^-)-S^-(V^+-V^-)}{S^+ - S^-}$
$=\frac{V^-S^+ - S^-V^+}{S^+ - S^-}$


$(V-\Delta S)e^{rT}=\frac{V^-S^+ - S^-V^+}{S^+ - S^-}$

$Ve^{rT}=\frac{V^-S^+ - S^-V^+}{S^+ - S^-} +(\frac{V^+-V^-}{S^+-S^-})Se^{rT} $

$=(\frac{e^{rT}S-S^-}{S^+ - S^-})V^+ +(\frac{S^+-e^{rT}S}{S^+-S^-})V^- $

$ = qV^+ + (1-q)V^-$

$V = e^{-rT}(qV^+ + (1-q)V^-)$

Where we define $q = \frac{e^{rT}S-S^-}{S^+ - S^-}$, with $0<q<1$ whenever $S^- < e^{rT}S < S^+$ (which no arbitrage requires)

If compounding is discrete,

$ q = \frac{(1+rT)S-S^-}{S^+ - S^-}$

and $V = \frac{1}{1+rT}(qV^+ + (1-q)V^-)$

q is a risk-neutral probability which comes about from insistence on no arbitrage. It has nothing to do with the real probabilities of $S^+$ and $S^-$ occurring. Pricing a call using the real probability, p, will probably make you a profit, but a loss is also possible. Pricing an option using the risk-neutral probability, q, and delta hedging, you will certainly make neither a profit nor a loss.

### Application

Calculate Option value V:
    
* Step 1: Calculate $\Delta$ using $\Delta = \frac{V^+-V^-}{S^+-S^-}$
* Step 2: Calculate $V$ using $V-\Delta S = e^{-rT}(V^+ - \Delta S^+)$ 
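
The two steps can be sketched as follows, alongside the risk-neutral formula $V = e^{-rT}(qV^+ + (1-q)V^-)$ so the two routes can be compared. The numbers (stock at 100 moving to 101 or 99, a call struck at 100, zero interest rate) and the function names are illustrative assumptions.

```python
import math

def one_step_option_value(s, s_up, s_down, v_up, v_down, r, T):
    """Value an option by delta hedging over one step of length T."""
    delta = (v_up - v_down) / (s_up - s_down)            # Step 1
    hedged_pv = math.exp(-r * T) * (v_up - delta * s_up)  # PV of the riskless payoff
    value = hedged_pv + delta * s                         # Step 2, solved for V
    return delta, value

def risk_neutral_value(s, s_up, s_down, v_up, v_down, r, T):
    """Same value via the risk-neutral probability q."""
    q = (math.exp(r * T) * s - s_down) / (s_up - s_down)
    return q, math.exp(-r * T) * (q * v_up + (1 - q) * v_down)

# Hypothetical numbers: stock at 100 moves to 101 or 99; call with strike 100.
delta, v_hedge = one_step_option_value(100.0, 101.0, 99.0, 1.0, 0.0, r=0.0, T=1.0)
q, v_rn = risk_neutral_value(100.0, 101.0, 99.0, 1.0, 0.0, r=0.0, T=1.0)
print(delta, v_hedge, q, v_rn)  # 0.5 0.5 0.5 0.5
```

Both routes give the same option value, and neither uses the real-world probability of an up move.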

## Introducing Symbols

In the binomial model we assume that the asset, which initially has the value S, can, during a time step δt, either:
* rise to a value $u × S$ or
* fall to a value $v × S$,

with $0 < v < 1 < u$.

The 'real-world' probability of a rise is p and so the probability of a fall is 1 − p.

How should we choose u, v and p?
Let’s choose them so that they have the same mean and standard deviation as our earlier lognormal random walk!


* Equation 1 (mean of the change in asset price): $\mu S\delta t = puS +(1-p)vS-S$

* Equation 2 (variance of the change in asset price): $\sigma^2S^2\delta t = p((u-1)S-\mbox{mean})^2+(1-p)((v-1)S-\mbox{mean})^2$
$=S^2(p(u-1-(pu+(1-p)v-1))^2+(1-p)(v-1-(pu+(1-p)v-1))^2)$
$=S^2p(1-p)(u-v)^2$

Equations 1 and 2 have infinitely many solutions because there are two equations for three unknowns. Paul Wilmott chose:

* $u = 1 + \sigma\sqrt{\delta t}$

* $v = 1 - \sigma\sqrt{\delta t}$

* $p = \frac{1}{2} +\frac{\mu\sqrt{\delta t}}{2\sigma}$
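
A quick numerical check of this choice, as a sketch with illustrative parameter values: the one-step mean of the return equals $\mu\delta t$, and the variance equals $\sigma^2\delta t$ up to a term of order $\delta t^2$. The function names are hypothetical.

```python
import math

def wilmott_parameters(mu, sigma, dt):
    """u, v and p as chosen above."""
    u = 1.0 + sigma * math.sqrt(dt)
    v = 1.0 - sigma * math.sqrt(dt)
    p = 0.5 + mu * math.sqrt(dt) / (2.0 * sigma)
    return u, v, p

def one_step_moments(u, v, p):
    """Mean and variance of the return S_{i+1}/S_i - 1 over one step."""
    mean = p * (u - 1.0) + (1.0 - p) * (v - 1.0)
    var = p * (u - 1.0 - mean) ** 2 + (1.0 - p) * (v - 1.0 - mean) ** 2
    return mean, var

mu, sigma, dt = 0.1, 0.2, 1.0 / 252.0  # illustrative values
u, v, p = wilmott_parameters(mu, sigma, dt)
mean, var = one_step_moments(u, v, p)
print(mean, mu * dt)         # mean matches mu*dt
print(var, sigma ** 2 * dt)  # variance matches sigma^2*dt up to O(dt^2)
```

The small gap in the variance is $\mu^2\delta t^2$, which is negligible as $\delta t \rightarrow 0$.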

In this case:

$\Delta = \frac{V^+-V^-}{(u-v)S}$

$V = \frac{1}{1+r\delta t}(p'V^+ + (1-p')V^-)$, where $p' = \frac{1}{2} + \frac{r\sqrt{\delta t}}{2\sigma}$

The option value at any time is the present value of the risk-neutral expected value at any later time.
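
Applying the one-step formula repeatedly, from expiry back to today, prices an option on a multi-step tree. The sketch below assumes a European payoff and uses the $u$, $v$ and $p'$ given above; the function name `binomial_price` and the inputs (strike 100, $r = 5\%$, $\sigma = 20\%$, 200 steps) are illustrative assumptions.

```python
import math

def binomial_price(payoff, s0, r, sigma, T, n_steps):
    """European option value on a tree with u = 1 + sigma*sqrt(dt), v = 1 - sigma*sqrt(dt)."""
    dt = T / n_steps
    u = 1.0 + sigma * math.sqrt(dt)
    v = 1.0 - sigma * math.sqrt(dt)
    p_rn = 0.5 + r * math.sqrt(dt) / (2.0 * sigma)  # risk-neutral p'
    disc = 1.0 / (1.0 + r * dt)
    # Option values at expiry: node j has had j up-moves.
    values = [payoff(s0 * u ** j * v ** (n_steps - j)) for j in range(n_steps + 1)]
    # Step back through the tree, one time step at a time.
    for step in range(n_steps, 0, -1):
        values = [disc * (p_rn * values[j + 1] + (1.0 - p_rn) * values[j])
                  for j in range(step)]
    return values[0]

call = binomial_price(lambda s: max(s - 100.0, 0.0), 100.0, 0.05, 0.2, 1.0, 200)
put = binomial_price(lambda s: max(100.0 - s, 0.0), 100.0, 0.05, 0.2, 1.0, 200)
print(call, put)
```

Because the pricing rule is linear, the call and put from the same tree satisfy put-call parity with the tree's own discount factor.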


## Black-Scholes equation

If $\delta t \rightarrow 0$, we’ll end up with what’s known as a partial differential equation, the famous Black–Scholes equation, which is continuous time.

Now let’s represent the option prices, not by a collection of numbers associated with places in a tree but by a function of stock price S and time t. Call that function V(S, t). Think of V(S, t) as being the option value at the root of a single branch.

Then,

* $V^+ = V(uS,t+\delta t) = V((1+\sigma\sqrt{\delta t})S, t+\delta t)$
$\approx V(S,t) + \frac{\partial V}{\partial t}\delta t + \frac{\partial V}{\partial S}\sigma\sqrt{\delta t}S+\frac{1}{2}\frac{\partial^2V}{\partial S^2}\sigma^2\delta tS^2 + o(\delta t)$

* $V^- = V(vS,t+\delta t) = V((1-\sigma\sqrt{\delta t})S, t+\delta t)$
$\approx V(S,t) + \frac{\partial V}{\partial t}\delta t - \frac{\partial V}{\partial S}\sigma\sqrt{\delta t}S+\frac{1}{2}\frac{\partial^2V}{\partial S^2}\sigma^2\delta tS^2 + o(\delta t)$



* $\Delta = \frac{V^+-V^-}{(u-v)S}$
$= \frac{V((1+\sigma\sqrt{\delta t})S, t+\delta t) - V((1-\sigma\sqrt{\delta t})S, t+\delta t)}{2\sigma\sqrt{\delta t}S}$
$\approx\frac{2\frac{\partial V}{\partial S}\sigma\sqrt{\delta t}S}{2\sigma\sqrt{\delta t}S}$
$=\frac{\partial V}{\partial S}$, as $\delta t \rightarrow 0$
* pricing equation: $V-\Delta S = \frac{1}{1+r\delta t}(V^+ - \Delta S^+)$ 

$\rightarrow V-\frac{\partial V}{\partial S}S = \frac{1}{1+r\delta t}((V(S,t) + \frac{\partial V}{\partial t}\delta t + \frac{\partial V}{\partial S}\sigma\sqrt{\delta t}S+\frac{1}{2}\frac{\partial^2V}{\partial S^2}\sigma^2\delta tS^2) - \frac{\partial V}{\partial S}S(1+\sigma\sqrt{\delta t}))$


$ = \frac{1}{1+r\delta t}(V + \frac{\partial V}{\partial t}\delta t - \frac{\partial V}{\partial S}S+\frac{1}{2}\frac{\partial^2V}{\partial S^2}\sigma^2\delta tS^2)$

$\rightarrow V+Vr\delta t -\frac{\partial V}{\partial S}S - \frac{\partial V}{\partial S}Sr\delta t = V + \frac{\partial V}{\partial t}\delta t - \frac{\partial V}{\partial S}S+\frac{1}{2}\frac{\partial^2V}{\partial S^2}\sigma^2\delta tS^2$

$\frac{\partial V}{\partial t}+\frac{1}{2}\sigma^2S^2\frac{\partial^2V}{\partial S^2} + rS\frac{\partial V}{\partial S} -rV=0$, this is the Black-Scholes equation.

That is, the continuous-time limit of the binomial model is the Black–Scholes equation.
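
One way to get a feel for the equation is to plug a known solution into it numerically. The sketch below uses the standard closed-form value of a European call (a well-known result that is not derived in these notes) and estimates the left-hand side of the Black–Scholes equation with central differences. The step sizes `hs` and `ht` and all parameter values are arbitrary illustrative choices.

```python
import math

def norm_cdf(x):
    """Standard Normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(s, t, strike, expiry, r, sigma):
    """Standard closed-form European call value (assumed here, not derived above)."""
    tau = expiry - t
    d1 = (math.log(s / strike) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s * norm_cdf(d1) - strike * math.exp(-r * tau) * norm_cdf(d2)

def bs_residual(V, s, t, r, sigma, hs=1e-2, ht=1e-4):
    """Central-difference estimate of V_t + 0.5 sigma^2 S^2 V_SS + r S V_S - r V."""
    v_t = (V(s, t + ht) - V(s, t - ht)) / (2 * ht)
    v_s = (V(s + hs, t) - V(s - hs, t)) / (2 * hs)
    v_ss = (V(s + hs, t) - 2 * V(s, t) + V(s - hs, t)) / hs ** 2
    return v_t + 0.5 * sigma ** 2 * s ** 2 * v_ss + r * s * v_s - r * V(s, t)

V = lambda s, t: bs_call(s, t, strike=100.0, expiry=1.0, r=0.05, sigma=0.2)
print(bs_residual(V, s=100.0, t=0.0, r=0.05, sigma=0.2))  # close to zero
```

The residual is not exactly zero because the derivatives are finite-difference estimates, but it should be tiny compared with the individual terms.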

# Transition Probability Density Function for Random Walk

Modern finance theory, especially derivatives theory, is based on the random movement of financial quantities.
We are now going to explore the simple idea of the random walk and see its relationship to differential equations.
This is achieved via the concept of a transition probability density function.

* Random walks have associated differential equations for their probability density functions, and are naturally related to the normal distribution: $p(y,t;y',t') = \frac{1}{2c\sqrt{\pi (t'-t)}}exp(-\frac{(y'-y)^2}{4c^2(t'-t)})$


## The trinomial random walk

y is the value of our random variable. The variable y can either rise, fall or take the same value after a time step δt. These movements have certain probabilities associated with them. α is a probability for up and down. δy is the size of the move in y.

Suppose the top branch is chosen after one time step. After the second there are three places that y could be.

After lots of time steps we end up with a random walk.

Often we are interested in the probabilistic properties of the random walk rather than the outcome of a single realization.

## The transition probability density function

To analyze the probabilistic properties of the random walk, we introduce the transition probability density function p(y, t; y', t') defined by:

$Prob(a < y' < \mbox{b at time t'}| \mbox{y at time t}) = \int_a^bp(y,t; y',t')dy'$

In words this is “the probability that the random variable y' lies between a and b at time t' in the future, given that it started out with value y at time t.”

The transition probability density function can be used to answer the question, “What is the probability of the variable y' being in a specified range at time t' in the future given that it started out with value y at time t?”

The transition probability density function p(y, t; y', t') satisfies two equations:

    (1) Forward Kolmogorov Equation:  involving derivatives with respect to the future state and time (y' and t'). That is, (y', t') vary and (y, t) are fixed

    (2) Backward Kolmogorov Equation: involving derivatives with respect to the current state and time (y and t). That is, (y, t) vary and (y', t') are fixed

### Forward Kolmogorov Equation: $\frac{\partial p}{\partial t'} = c^2\frac{\partial^2p}{\partial y'^2}$


The variable y takes the value y' at time t', but how did it get there?

In our trinomial walk we can only get to the point y' from the three values y' + δy, y' and y' − δy.

$p(y, t; y', t') = α p(y, t; y' + δy, t' − δt) +(1 − 2α)p(y, t; y', t' − δt) + α p(y, t; y' − δy, t' − δt)$

Now do a Taylor series expansion on each of the terms, using the shorthand notation p(y,t;y',t')=p:

$p = $
$\alpha (p - \frac{\partial p}{\partial t'}\delta t + \frac{\partial p}{\partial y'}\delta y + \frac{1}{2}\frac{\partial^2p}{\partial y'^2}\delta y^2)$ 
$+ (1-2\alpha)(p-\frac{\partial p}{\partial t'}\delta t)$
$+ \alpha (p - \frac{\partial p}{\partial t'}\delta t - \frac{\partial p}{\partial y'}\delta y + \frac{1}{2}\frac{\partial^2p}{\partial y'^2}\delta y^2)$

$\delta t\frac{\partial p}{\partial t'}= \alpha\delta y^2\frac{\partial^2p}{\partial y'^2}$

Let's define $\frac{\alpha\delta y^2}{\delta t} = c^2$, for some finite, non-zero c.

The final equation is now:

$\frac{\partial p}{\partial t'} = c^2\frac{\partial^2p}{\partial y'^2}$

This is the Fokker–Planck or forward Kolmogorov equation. It is a forward parabolic partial differential equation, requiring
initial conditions at time t and to be solved for t' > t

This equation is to be used if there is some special state now and you want to know what could happen later. For example, you
know the current value of y and want to know the distribution of values at some later date.

Observations:
* This is a partial differential equation for p as a function of two independent variables y' and t'
* It is an example of a diffusion equation / heat equation
* y and t are rather like parameters in this problem, think of them as starting quantities for the random walk. i.e. fixed.
* You need that special relationship between α, δt and δy to get this diffusion equation: $\delta y \sim O(\sqrt{\delta t})$
* This is also an example of Brownian motion.
* When we get on to financial applications the quantity c will be related to volatility: $\frac{\partial p}{\partial t'} = c^2\frac{\partial^2p}{\partial y'^2} \rightarrow \frac{\partial w}{\partial t}=\frac{1}{2}\sigma^2\frac{\partial^2w}{\partial x^2}$
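
The link between the trinomial walk and the diffusion equation can be checked by evolving the walk's probabilities exactly, step by step, with the same update rule used in the derivation. This is a sketch with illustrative values of α, δy and δt; it compares the variance of the walk with $2c^2(t'-t)$, the variance of the Normal density that solves the forward equation.

```python
def trinomial_distribution(alpha, n_steps):
    """Exact probabilities of the walk's position (in units of dy) after n_steps."""
    probs = {0: 1.0}  # start at y with certainty
    for _ in range(n_steps):
        new = {}
        for k, pk in probs.items():
            new[k + 1] = new.get(k + 1, 0.0) + alpha * pk        # up move
            new[k] = new.get(k, 0.0) + (1.0 - 2.0 * alpha) * pk  # no move
            new[k - 1] = new.get(k - 1, 0.0) + alpha * pk        # down move
        probs = new
    return probs

alpha, dy, dt, n_steps = 0.25, 0.1, 0.01, 100  # illustrative values
c2 = alpha * dy ** 2 / dt                      # c^2 = alpha * dy^2 / dt
probs = trinomial_distribution(alpha, n_steps)
variance = sum(p * (k * dy) ** 2 for k, p in probs.items())
print(variance, 2.0 * c2 * n_steps * dt)       # both approximately 0.5
```

Halving δy while dividing δt by four keeps $c^2$ fixed, which is the $\delta y \sim O(\sqrt{\delta t})$ scaling noted above.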



### Backward Kolmogorov Equation: $\frac{\partial p}{\partial t} = - c^2\frac{\partial^2p}{\partial y^2}$


Now we come to find the backward equation. This will be useful if we want to calculate probabilities of reaching a specified final state from various initial states. It will be a backward parabolic partial differential equation requiring conditions imposed in the future, and solved backwards in time.

Whereas the forward equation had independent variables t' and y', the backward equation has variables t and y.

$p(y, t; y', t') = α p(y + δy, t + δt; y', t') +(1 − 2α)p(y, t + δt; y', t') + α p(y − δy, t + δt; y', t')$

$p = \alpha (p + \frac{\partial p}{\partial t}\delta t + \frac{\partial p}{\partial y}\delta y + \frac{1}{2}\frac{\partial^2p}{\partial y^2}\delta y^2)$ 
$+ (1-2\alpha)(p+\frac{\partial p}{\partial t}\delta t)$
$+ \alpha (p + \frac{\partial p}{\partial t}\delta t - \frac{\partial p}{\partial y}\delta y + \frac{1}{2}\frac{\partial^2p}{\partial y^2}\delta y^2)$

$\rightarrow \frac{\partial p}{\partial t}\delta t = -\alpha\frac{\partial^2p}{\partial y^2}\delta y^2$

$\rightarrow \frac{\partial p}{\partial t} = - c^2\frac{\partial^2p}{\partial y^2}$

Warning: More general random walks lead to significantly more complicated forward and backward equations, and their relationship is no longer as simple as a change of sign.


## Similarity Solutions

Generally partial differential equations are hard to solve explicitly, but sometimes they can be simplified to ordinary differential equations.

* Step 1: Create a new variable combining y' and t' in a special way
* Step 2: Reduce the dimension of the problem

The equation to be solved is:

$\frac{\partial p}{\partial t'} = c^2\frac{\partial^2p}{\partial y'^2}$

This equation has an infinite number of solutions. It has different solutions for different initial conditions and different boundary conditions.

The initial condition tells you how the solution starts off. We must specify p as a function of y' at some point in time, t'.

Boundary conditions tell you how the function behaves on specified y' boundaries. Diffusion equations typically need two boundary conditions.

We are now going to find a very simple solution. It is very simple and very special because, unlike most solutions of the diffusion equation, it does not depend on the two independent variables y' and t' separately but only on a combination of them.

Let us seek a solution of the form:

$p = t^{'a}f(\frac{y'}{t^{'b}}) = t^{'a}f(\xi)$ 

Here a and b are constants. Note that f is a function of only the one variable

$\xi = \frac{y'}{t^{'b}} = y't^{'-b}$

$\frac{\partial\xi}{\partial y'}= t^{'-b}$

$\frac{\partial\xi}{\partial t'} = -by't^{'-b-1}$

$\frac{\partial p}{\partial t'} = \frac{\partial}{\partial t'}[t^{'a}f(\xi)]$
$=at^{'a-1}f(\xi) + t^{'a}\frac{df}{d\xi}\frac{\partial\xi}{\partial t'}$
$=at^{'a-1}f(\xi) + t^{'a}\frac{df}{d\xi}(-by't^{'-b-1})$
$=at^{'a-1}f(\xi) -by't^{'a-b-1}\frac{df}{d\xi}$

$\frac{\partial p}{\partial y'} = t^{'a}\frac{df}{d\xi}\frac{\partial\xi}{\partial y'}$
$ = t^{'a}\frac{df}{d\xi}t'^{-b}$
$ = t^{'a-b}\frac{df}{d\xi}$

$\frac{\partial^2p}{\partial y'^2} = t^{'a-b}\frac{d^2f}{d\xi^2}\frac{\partial\xi}{\partial y'}$
$ = t^{'a-b}\frac{d^2f}{d\xi^2}t^{'-b}$
$= t^{'a-2b}\frac{d^2f}{d\xi^2}$

Let’s substitute these into the forward Kolmogorov equation:
$at^{'a-1}f(\xi) -by't^{'a-b-1}\frac{df}{d\xi}$
$=c^2t^{'a-2b}\frac{d^2f}{d\xi^2}$

$at^{'a-1}f(\xi) -b\xi t^{'a-1}\frac{df}{d\xi}$
$=c^2t^{'a-2b}\frac{d^2f}{d\xi^2}$

$af(\xi) -b\xi\frac{df}{d\xi}$
$=c^2t^{'1-2b}\frac{d^2f}{d\xi^2}$



The left-hand side of this equation is only a function of ξ, whereas the right-hand side depends on both ξ and t'. This is only possible if the right-hand side is also independent of t'. Therefore,

$1-2b =0 \rightarrow b =\frac{1}{2}$

$af(\xi) -\frac{1}{2}\xi\frac{df}{d\xi}$
$=c^2\frac{d^2f}{d\xi^2}$

If we can solve the above, then we have found a solution of our original equation in the form

$p = t^{'a}f(\frac{y'}{\sqrt{t'}} )$

And this isn’t just a single solution, it is a whole family of solutions because we can choose the constant a.

However, for our present problem, only one value of a is relevant. Remember that p represents a probability. That means that its integral must be one:

$\int_{-\infty}^{\infty}p(y',t')dy' = 1 = \int_{-\infty}^{\infty}t^{'a}f(\frac{y'}{\sqrt{t'}})dy'$  ---(1)

Let $u = \frac{y'}{\sqrt{t'}}$

$\frac{du}{dy'} = t'^{-1/2}$

$\rightarrow dy' = t'^{1/2}du$


Equation (1) then becomes $t'^{a+1/2}\int_{-\infty}^{\infty}f(u)du =1$

$\because \int_{-\infty}^{\infty}f(u)du$ is a constant that does not depend on $t'$,
$\therefore$ this can hold for all $t'$ only if $a = -\frac{1}{2}$, and then $\int_{-\infty}^{\infty}f(u)du =1$

The ordinary differential equation is now:

$-\frac{1}{2}f(\xi) -\frac{1}{2}\xi\frac{df}{d\xi}$
$=c^2\frac{d^2f}{d\xi^2}$

$-\frac{1}{2}(f(\xi) +\xi\frac{df}{d\xi})$
$=c^2\frac{d^2f}{d\xi^2}$

$-\frac{1}{2}(\frac{d}{d\xi}\xi f(\xi))$
$=c^2\frac{d^2f}{d\xi^2}$



Integrate once:

$-\frac{1}{2}\xi f(\xi) =c^2\frac{df}{d\xi}$+Constant

Here, Constant = 0 because as $\xi \rightarrow \infty$, $f(\xi) \rightarrow 0$

$\frac{df}{f} = -\frac{1}{2c^2}\xi d\xi$

$\int\frac{df}{f} = -\frac{1}{2c^2}\int\xi d\xi$

$log f = -\frac{1}{2c^2}\frac{1}{2}\xi^2$ + Constant

$f = Ae^{-\xi^2/4c^2}$, where A is an arbitrary constant

The constant A is chosen so that the integral of f is one

$\int_{R}f(\xi)d\xi = 1$

$\int Ae^{-\xi^2/4c^2}d\xi = 1$

$A\int e^{-\xi^2/4c^2}d\xi = 1$ ---(2)

Assume $x = \frac{\xi}{2c}$

$\frac{dx}{d\xi} = \frac{1}{2c}$

$d\xi = 2cdx$

Equation (2) becomes $2cA\int e^{-x^2}dx = 1$

$2cA\sqrt{\pi} = 1$

$A= \frac{1}{2c\sqrt{\pi}}$


Therefore, 

$f = \frac{1}{2c\sqrt{\pi}}e^{-\xi^2/4c^2}$

$p = t^{'-1/2}\frac{1}{2c\sqrt{\pi}}e^{-(y'/\sqrt{t'})^2/4c^2}$
$=\frac{1}{2c\sqrt{\pi t'}}exp(-\frac{y'^2}{4c^2t'})$

So, $y' \sim N(0, 2c^2t')$

Minor generalization. . . suppose that y' has value y at time t then we have:

$p(y,t;y',t') = \frac{1}{2c\sqrt{\pi (t'-t)}}exp(-\frac{(y'-y)^2}{4c^2(t'-t)})$

And this is our transition probability density function for our random walk!
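
To use this density in practice, integrate it over the range of interest. The sketch below does that with a simple trapezoidal rule; the value $c = 0.5$, the integration ranges and the function names are illustrative assumptions.

```python
import math

def transition_pdf(y, t, y_next, t_next, c):
    """p(y, t; y', t') for the simple random walk, with y_next = y' and t_next = t'."""
    tau = t_next - t
    return math.exp(-(y_next - y) ** 2 / (4 * c ** 2 * tau)) / (2 * c * math.sqrt(math.pi * tau))

def prob_between(a, b, y, t, t_next, c, n=2000):
    """Trapezoidal estimate of Prob(a < y' < b at t' | y at t)."""
    h = (b - a) / n
    total = 0.5 * (transition_pdf(y, t, a, t_next, c) + transition_pdf(y, t, b, t_next, c))
    total += sum(transition_pdf(y, t, a + i * h, t_next, c) for i in range(1, n))
    return total * h

c = 0.5  # illustrative value, so the variance is 2 c^2 (t' - t) = 0.5 (t' - t)
print(prob_between(-10.0, 10.0, y=0.0, t=0.0, t_next=1.0, c=c))  # close to 1
print(prob_between(0.0, 10.0, y=0.0, t=0.0, t_next=1.0, c=c))    # close to 0.5
```

The first integral covers essentially the whole distribution, and the second covers the upper half, which has probability one half by symmetry about the starting value.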


## Apply Transition Probability Density Function to Ito

Let’s look at the equations governing the probability distribution for an arbitrary random walk:

$dG_t = A(t, G_t)dt+B(t, G_t)dX_t$, for the variable G

Need to define $\phi^+(y,t), \phi^-(y,t)$ such that the mean of Trinomial Random Walk matches that of the continuous time SDE. Here $\phi^+, \phi^-$ are the probability of going up and down, so the probability for no change is $1-\phi^+-\phi^-$. Also need to ensure the variance of discrete time random variable matches that of the continuous time SDE. 

$E[dG] = E[Adt]+E[BdX]$
$= Adt$

$V[dG] = V[Adt] + V[BdX] = 0 + B^2dt$

$V[Adt] = 0$ because, given the current values of $t$ and $G_t$, $Adt$ is deterministic, and the variance of a deterministic quantity is 0. 

### Forward Kolmogorov Equation (or Fokker-Planck)

$\frac{\partial p}{\partial t'}=\frac{1}{2}\frac{\partial^2}{\partial y'^2}(B(y',t')^2p)-\frac{\partial}{\partial y'}(A(y',t')p)$ (not derived in class)

In the simple trinomial walk above (M1L3), $A(y',t')=0$ and $B(y',t')^2 = 2c^2$, so $B(y',t')^2 = 1$ corresponds to $c^2 = \frac{1}{2}$.

This is a linear parabolic differential equation. y, and t are fixed.

For example, in the case of $dS=\mu S dt+\sigma SdX$:

$A(S,t)=\mu S$

$B(S,t)=\sigma S$

So, 

$\frac{\partial p}{\partial t'}=\frac{1}{2}\frac{\partial^2}{\partial y'^2}(B(y',t')^2p)-\frac{\partial}{\partial y'}(A(y',t')p)$

$=\frac{1}{2}\frac{\partial^2}{\partial S'^2}(\sigma^2 S'^2p)-\frac{\partial}{\partial S'}(\mu S'p)$

to solve for $p(S,t;S',t')$ we need to reduce the above equation to a 1-dimensional heat equation:

(1) Change time

(2) log S

(3) transformation of x-axis

...we will derive this in Module 3:


$p(S,t;S',t') = \frac{1}{\sigma S'\sqrt{2\pi(t'-t)}}\exp\left(-\frac{(\log(S/S')+(\mu-\frac{1}{2}\sigma^2)(t'-t))^2}{2\sigma^2(t'-t)}\right)$

This is the transition PDF for the stock price. So if you want the probability that the stock price lies in some range at a future time, integrate the above over that range, given the price today.
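
As a sketch of that integration, the snippet below estimates the probability that the price ends up in a given range using a midpoint rule (which avoids evaluating the density at $S' = 0$). The inputs ($S = 100$ today, one year ahead, $\mu = 10\%$, $\sigma = 20\%$) and the function names are illustrative assumptions.

```python
import math

def lognormal_transition_pdf(s, t, s_next, t_next, mu, sigma):
    """p(S, t; S', t') for dS = mu S dt + sigma S dX, with s_next = S' and t_next = t'."""
    tau = t_next - t
    z = math.log(s / s_next) + (mu - 0.5 * sigma ** 2) * tau
    return math.exp(-z ** 2 / (2 * sigma ** 2 * tau)) / (sigma * s_next * math.sqrt(2 * math.pi * tau))

def prob_price_between(a, b, s, t, t_next, mu, sigma, n=4000):
    """Midpoint-rule estimate of Prob(a < S' < b at t' | S at t)."""
    h = (b - a) / n
    return h * sum(lognormal_transition_pdf(s, t, a + (i + 0.5) * h, t_next, mu, sigma)
                   for i in range(n))

# Illustrative inputs: S = 100 today, one year ahead, mu = 10%, sigma = 20%.
print(prob_price_between(90.0, 120.0, 100.0, 0.0, 1.0, 0.1, 0.2))
```

Note that the real-world drift $\mu$ appears here, so these are real-world probabilities, not risk-neutral ones.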

#### The steady-state distribution

Some random walks have a steady-state distribution. That is, in the long run as $t' \rightarrow \infty$ the distribution p(y, t; y', t') as a function of y' settles down to be independent of the starting state y and time t, and no longer changes with $t'$. That is, $p(y,t;y',t') \rightarrow p_\infty(y')$. Therefore, $\frac{\partial p}{\partial t'} \rightarrow 0$, and $\frac{\partial}{\partial y'} \rightarrow \frac{d}{dy'}$. Possible examples are stochastic differential equation models for interest rates, inflation, volatility.

Some random walks have no such steady state even though they have a time-independent equation. For example the lognormal random walk either grows without bound or decays to zero.

If there is a steady-state distribution $p_\infty(y')$ then it satisfies the ordinary differential equation $\frac{1}{2}\frac{d^2}{dy'^2}(B^2p_\infty)-\frac{d}{dy'}(Ap_\infty)=0$   (FKE for the steady-state distribution)

Here, $p_\infty$ represents the steady-state distribution because $t' \rightarrow \infty$.

Example: The Vasicek model

$dr = \gamma (\overline r - r) dt+\sigma dX$

$= - \gamma (r - \overline r) dt+\sigma dX$


The steady-state distribution p∞(r') satisfies:

$\frac{1}{2}\sigma^2\frac{d^2p_\infty}{dr'^2}-\gamma\frac{d}{dr'}((\overline r-r')p_\infty)=0$

Now, let's solve for $p_\infty$:

$\frac{1}{2}\sigma^2\frac{d^2p_\infty}{dr'^2} = -\gamma\frac{d}{dr'}((r'- \overline r)p_\infty)$

Integrate once:

$\frac{1}{2}\sigma^2\frac{dp_\infty}{dr'} = -\gamma(r'- \overline r)p_\infty+ Constant$

$\because$ $p_\infty$ is a pdf & r' is a random variable

$\therefore$ as $r' \rightarrow \infty$, $p_\infty \rightarrow 0$, $(r'-\overline r)p_\infty \rightarrow 0$ and $\frac{dp_\infty}{dr'} \rightarrow 0$. So Constant = 0.

$\frac{1}{2}\sigma^2\frac{dp_\infty}{dr'} = -\gamma(r'- \overline r)p_\infty$


$\frac{dp}{p}=-\frac{2\gamma}{\sigma^2}(r-\overline r)dr$

integrate:

$\int\frac{dp}{p}=-\frac{2\gamma}{\sigma^2}\int(r-\overline r)dr$

$log p = -\frac{2\gamma}{\sigma^2}(\frac{1}{2}(r-\overline r)^2) + Constant$

$log p = -\frac{\gamma}{\sigma^2}(r-\overline r)^2 + Constant$

$p = Ae^{-\frac{\gamma}{\sigma^2}(r-\overline r)^2}$

To calculate A, use $\int_{R}pdr=1$ and use integration by substitution:

put $x=\frac{\sqrt{\gamma}}{\sigma}(r-\overline r)$

$\frac{dx}{dr} = \frac{\sqrt{\gamma}}{\sigma}$

$\frac{\sigma}{\sqrt{\gamma}}dx=dr$

$\int_{R}pdr=1$

$A\int e^{-x^2}dr =1$

$A\int e^{-x^2}\frac{\sigma}{\sqrt{\gamma}}dx =1$

$\frac{\sigma}{\sqrt{\gamma}}A\int e^{-x^2}dx = 1$

$\frac{\sigma}{\sqrt{\gamma}}A\sqrt{\pi} = 1$

$A=\frac{1}{\sigma}{\sqrt{\frac{\gamma}{\pi}}}$


$p_\infty (r) = \frac{1}{\sigma}{\sqrt{\frac{\gamma}{\pi}}}e^{-\frac{\gamma}{\sigma^2}(r-\overline r)^2}$

In other words, the interest rate r is Normally distributed with mean $\overline r$ and standard deviation $\sigma/\sqrt{2\gamma}$
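
The result can be checked numerically: the density should integrate to one, with mean $\overline r$ and variance $\sigma^2/2\gamma$. The sketch below uses a midpoint rule; the parameter values and function names are illustrative assumptions.

```python
import math

def vasicek_steady_pdf(r, r_bar, gamma, sigma):
    """Steady-state density p_inf(r) derived above."""
    return math.sqrt(gamma / math.pi) / sigma * math.exp(-gamma * (r - r_bar) ** 2 / sigma ** 2)

def moments(pdf, lo, hi, n=20000):
    """Midpoint-rule estimates of total probability, mean and variance."""
    h = (hi - lo) / n
    xs = [lo + (i + 0.5) * h for i in range(n)]
    total = h * sum(pdf(x) for x in xs)
    mean = h * sum(x * pdf(x) for x in xs)
    var = h * sum((x - mean) ** 2 * pdf(x) for x in xs)
    return total, mean, var

r_bar, gamma, sigma = 0.05, 0.5, 0.02  # illustrative parameters
pdf = lambda r: vasicek_steady_pdf(r, r_bar, gamma, sigma)
total, mean, var = moments(pdf, r_bar - 0.3, r_bar + 0.3)
print(total, mean, var, sigma ** 2 / (2 * gamma))
```

A larger mean-reversion speed γ narrows the distribution, while a larger σ widens it, as the formula for the standard deviation suggests.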

### Backward Kolmogorov Equation

$\frac{\partial p}{\partial t} + \frac{1}{2}B(y,t)^2\frac{\partial^2p}{\partial y^2}+A(y,t)\frac{\partial p}{\partial y}=0$


# Applied Stochastic Calculus

## Brownian Motion: $X_t$

It is the continuous-time limit of our discrete time random walk. It is denoted as $X(t)$, $X_t$, $W$, $W(t)$ or $W_t$. 

**Classical definition of a Brownian motion:**

1. $X_0$ = 0
2. the sample paths $t \rightarrow X_t$ are continuous
3. independent increments: for $t_1 < t_2 < t_3 < t_4$ the increments $X_{t_4}-X_{t_3}, X_{t_2}-X_{t_1}$ are independent
4. normally distributed increments: $X_t-X_s$ ~ $N(0,|t-s|)$  (note here we are using the left hand integral, i.e. non-anticipatory.)

### Construction

Toss a coin $n$ times within a fixed and finite period $t$, winning or losing a bet of $\sqrt{t/n}$ dollars on each toss, i.e. $R_i = +\sqrt{t/n}$ or $-\sqrt{t/n}$ with probability 1/2 each. For the $i$th toss:

$E[R_i]=\sum x_if_i = (\sqrt{t/n})\times(1/2)+(-\sqrt{t/n})\times(1/2)=0$

$V[R_i]=E[R_i^2] - E[R_i]^2=\sum x_i^2f_i = (\sqrt{t/n})^2 = t/n$

$E[R_iR_j]=0$ for $i \neq j$, because the tosses are independent

Denote $S_i = \sum_{j=1}^i R_j$, subject to $S_0 =0$, then:
        
$E[S(t)] = E[\lim_{n\rightarrow\infty}\sum_{i=1}^n R_i]
=\lim_{n\rightarrow\infty}\sum_{i=1}^n E[R_i] = \lim_{n\rightarrow\infty} n\times0=0$

$V[S(t)] = V[\lim_{n\rightarrow\infty}\sum_{i=1}^n R_i]
=\lim_{n\rightarrow\infty}\sum_{i=1}^n V[R_i] = \lim_{n\rightarrow\infty} n\times t/n=t$ (the variances add because the tosses are independent)

    * Mean and Variance (Math Primer p.167)
        * Mean: $u = E[X] = \sum_{i=1}^n x_if(x_i)$
        * Variance: $V[X] = E[(X-u)^2] = E[X^2] - u^2 = \sum_{i=1}^n (x_i)^2f(x_i) - u^2$

As you toss coins more times, the path follows a random walk.

As $n \rightarrow \infty$, $t/n \rightarrow 0$. The limiting process for this random walk as the time steps go to zero is called Brownian Motion, denoted as $X(t)$, $X_t$, $W$, $W(t)$ or $W_t$. The bet size is scaled down with $n$ so that S does not shoot off to infinity. It is the continuous-time limit of our discrete time random walk.
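
The mean and variance of the coin-tossing sum can be computed exactly for any finite $n$ by summing over the binomial distribution of the number of wins. This is a sketch with a hypothetical function name; it needs Python 3.8 or later for `math.comb`.

```python
import math

def coin_toss_moments(t, n):
    """Exact mean and variance of S = sum of n bets of +/- sqrt(t/n)."""
    bet = math.sqrt(t / n)
    mean = 0.0
    second = 0.0
    for wins in range(n + 1):
        prob = math.comb(n, wins) / 2 ** n  # P(exactly `wins` winning tosses)
        s = bet * (wins - (n - wins))       # net winnings
        mean += prob * s
        second += prob * s * s
    return mean, second - mean ** 2

for n in (10, 100, 1000):
    print(n, coin_toss_moments(t=1.0, n=n))  # variance stays at t for every n
```

The variance equals $t$ for every $n$, which is exactly why the bet size $\sqrt{t/n}$ is the right scaling for a finite, non-trivial limit.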

### Properties

**1. $X_0 = 0$**

**2. Continuity**: The paths are continuous everywhere

**3. Finiteness**

**4. The Markov Property**

i.e. Memoryless State

The conditional distribution of $X(t)$ given information up until $\tau < t$ depends only on $X(\tau)$.
    
**5. The Martingale Property**

i.e. $W_t$ is a martingale (Asset prices don't follow this property until they are discounted.)

Discrete version: $E[S_{n+1}|S_n] = S_n$

Continuous version:
$E_s[W_t|F_s] = E_s[W_t|W_s]$, where $s<t$

$=E_s[W_t-W_s+W_s|W_s]$

$=E_s[W_t-W_s|W_s] + E_s[W_s|W_s]$, we know $W_t-W_s \sim N(0, |t-s|)$, so:

$=0 + W_s = W_s$

**6. Quadratic Variation**

The quadratic variation of the random walk is: $Q[X(t)] = V^2 = \sum_{j=1}^n (X(t_j)-X(t_{j-1}))^2 = n \times (\sqrt{t/n})^2 = t$ 
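
A simulated coin-tossing path makes this concrete: whatever the sequence of heads and tails, every squared increment is $t/n$, so the sum of squared increments is $t$. The sketch below uses the standard library and a fixed seed; the function names are hypothetical.

```python
import math
import random

def coin_toss_path(t, n, rng):
    """One realization of the scaled coin-tossing walk on [0, t] with n steps."""
    step = math.sqrt(t / n)
    path = [0.0]
    for _ in range(n):
        path.append(path[-1] + (step if rng.random() < 0.5 else -step))
    return path

def quadratic_variation(path):
    """Sum of squared increments along the path."""
    return sum((b - a) ** 2 for a, b in zip(path, path[1:]))

rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (10, 1000, 100000):
    path = coin_toss_path(t=1.0, n=n, rng=rng)
    print(n, quadratic_variation(path))  # close to t = 1 for every n
```

The printed values differ from 1 only by floating-point rounding, while the end point of the path itself changes from run to run.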

(1) Proof of Quadratic Variation Property:

Consider $F(X_t)=X^2_t$, evaluated at $t = T$

By Ito I:

$X^2_T = X^2_0 + \frac{1}{2}\int_{0}^{T}2dt + \int_{0}^{T}2X_tdX_t$

$= 0+ \int_{0}^{T}dt + 2\int_{0}^{T}X_tdX_t$

Taking the expectation:

$E[X^2_T] = E[\int_{0}^{T}dt] + 2E[\int_{0}^{T}X_tdX_t]$

Because the Ito integral is a martingale, $E[\int_{0}^{T}X_tdX_t] = 0$.

We get $E[X^2(T)]= T$   ---Quadratic variation property


(2) Now, prove $E[\int_{0}^{T}X_tdX_t] = 0$:

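Approximate the integral by a sum over time steps, $\int_{0}^{T}X_tdX_t \approx \sum_{i=0}^{N-1}f(t_i,X_i)(X_{i+1}-X_i)$ with $f(t_i,X_i)=X_i$. Each increment $X_{i+1} - X_i$ plays the role of $dX$: it has $E[X_{i+1}-X_i]=0$ (because $dX$ ~ $N(0,dt)$) and is independent of $X_i$.


Therefore, $E[\int_{0}^{T}X_tdX_t] = E[\sum_{i=0}^{N-1}f(t_i,X_i)(X_{i+1}-X_i)] $
$= \sum_{i=0}^{N-1}E[f(t_i,X_i)]E[X_{i+1}-X_i] = 0$ 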



**7. Normality**

Increments are Gaussian (because of the CLT: each coin toss follows a Bernoulli distribution, but the sum of a very large number of them is Gaussian), i.e. $X(t_i) - X(t_{i-1})$ ~ $N(0, |t_i-t_{i-1}|)$

Therefore:

(1) $E[dX_t] = 0$

(2) $V[dX_t]=dt$

(3) The pdf for $X_t-X_s$ if t > s, is $\frac{1}{{\sqrt {2\pi |t-s|}}}e^{-\frac{(X_t-X_s)^2}{2|t - s|}}$

(4) And increments are independent from each other

(5) $dX$ ~ $\phi\sqrt{dt}$, where $\phi$ ~ $N(0,1)$, i.e. they both have the same mean and variance. Therefore, we can say the movement for each timestep is O($\sqrt{dt})$ (of the order of $\sqrt{dt}$; we say order because $\sqrt{dt}$ is multiplied by the random $\phi$.) 

(6) $X_t$ is also normally distributed because $X_t = X_t-X_0$ ~ $N(0,t)$

## Brownian Motion with Drift: $dS = \mu dt+ \sigma dX$

$dX$ is an increment of Brownian motion, known as a Wiener process and is a Normally distributed random variable such that dX ∼ N (0, dt)

* Math Primer Review: 
    * Function: A function denoted f(x) of a single variable x is a rule that assigns each element of a set X (written x $\in$ X) to exactly one element y of a set Y (y $\in$ Y). y = f(x); x $\rightarrow$ f(x); f: X $\rightarrow$ Y
    * Random variables: Random variables assign numbers to events. Thus a random variable (RV) X is a function which maps from the sample space $\Omega$ to the set of real numbers, i.e. X: $\omega$ $\in$ $\Omega$ $\rightarrow$ R
    * Stochastic variable: time-dependent random variable
    * Stochastic calculus is very important because of the underlying random nature of financial markets. 

$\sigma dX$ is a scaled Brownian increment, with $\sigma dX \sim N(0, \sigma^2dt)$

If we add a drift term $\mu dt$ then $dS = \mu dt+ \sigma dX$. This is called Brownian Motion with Drift, or Generalized Wiener Process.


## Geometric Brownian Motion: $\frac{dS}{S}=\mu dt+\sigma dX$

With the above equation, S (the stock price) can become negative, which is not realistic. That is why we introduce Geometric Brownian Motion (also called the lognormal random walk or exponential Brownian motion): $\frac{dS}{S}=\mu dt+\sigma dX$. Starting from a positive value, S never becomes negative. Here the drift scales with the time step and the diffusion scales with $\sqrt{\mbox{time step}}$. To see where the exponential behaviour comes from, make things simple and set the volatility to zero, i.e. $\sigma =0$; then $\frac{dS}{S}=\mu dt$

To solve S we need to integrate both sides:

$\int{\frac{dS}{S}}=\mu\int dt$

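$\log S_t = \mu t + C$

$\frac{S_t}{S_0} = Ae^{\mu t}$, where the constant of integration has been written as $C = \log(S_0A)$

At time $t=0$, $S_t = S_0$ 

$S_0 = S_0Ae^{\mu \cdot 0} = S_0A \rightarrow A = 1$

$S_t = S_0e^{\mu t}$   (exponential growth: the zero-volatility case of the lognormal walk)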


## Ito, Ito Integral, and Ito Product Rule

Goal: We are interested in knowing what happens for $t \rightarrow t+dt$, i.e.: finding out $dG$

* Ito: Stochastic Differential Equation (SDE): $dG_t = A(t, G_t)dt+B(t, G_t)dX_t$

* Ito Integral: $\int_{0}^{t}B(s, G_s)dX_s$

* The integral form of Ito:
    * $G_t = X + \int_{0}^{t} A(s,G_s)ds + \int_{0}^{t}B(s, G_s)dX_s$, assuming $G_0 = X$ (here $s$ is a dummy variable of integration)

Ito I & Ito II: dealing with function of Brownian Motion

Ito III: dealing with function of Stochastic Process




The reason we need Ito is that we cannot apply everyday calculus to stochastic variables:

* The stochastic variable here is Brownian Motion, $X_t$, i.e. $\{X_t: t \in R^+\}$ (a family of random variables indexed by time)

* Everyday calculus says: $f'(x) = \lim_{\delta x\rightarrow 0} \frac{f(x+\delta x) - f(x)}{\delta x}$

* Applying to $X_t$:
$\frac{dX}{dt} = \lim_{\delta t \rightarrow 0}\frac{X(t+\delta t)-X(t)}{\delta t}$
= $\lim_{\delta t \rightarrow 0}\frac{O(\sqrt {\delta t})}{\delta t}$
= $\lim_{\delta t \rightarrow 0}\frac{1}{\sqrt {\delta t}} \rightarrow$ doesn't exist (differentiable nowhere)

For this reason, we instead work with stochastic differential equations.

### Stochastic Process

Stochastic process means: $\{G_t: t \in R^+\}$

Here $G_t$ can mean any asset class. For example $S_t$ for stock price, $R_t$ for interest rate, $\sigma_t$ for volatility, etc. 

We are interested in knowing what happens for $t \rightarrow t+dt$, i.e.:

$dG = A(t, G_t)dt+B(t, G_t)dX_t$, this is an SDE for $G$, or a random walk for $dG$.

* $A(t, G_t)dt$ is deterministic - $A(t, G_t)$ is drift or growth
* $B(t, G_t)dX$ is random - $B(t, G_t)$ is diffusion or volatility.

The most general way of writing down an SDE is: $dG_t = A(t, G_t)dt+B(t, G_t)dX_t$, for an arbitrary stochastic process $G$. $G$ could be anything. 

Assume $G_0=X$. Now do integration:

$\int_{0}^{t}dG_s$
$= \int_{0}^{t} A(s,G_s)ds + \int_{0}^{t}B(s, G_s)dX_s$, where s is a dummy variable of integration

(Note $\int_{0}^{t}B(s, G_s)dX_s$ is an Ito integral, because we are integrating against a Brownian Motion.)

$G_t -G_0 = \int_{0}^{t} A(s,G_s)ds + \int_{0}^{t}B(s, G_s)dX_s$

$G_t = G_0 + \int_{0}^{t} A(s,G_s)ds + \int_{0}^{t}B(s, G_s)dX_s$
$= X + \int_{0}^{t} A(s,G_s)ds + \int_{0}^{t}B(s, G_s)dX_s$



### Ito I: $dF = \frac{1}{2}\frac{d^2F}{dX^2}dt + \frac{dF}{dX}dX$

Use Ito I for functions that look like $F=F(X)$. For example: $X^n, \sin X, e^X, \log X$.

According to the Mean Square Limit (or Mean Square Convergence): $E[dX^2] = dt$, and $dX^2$ itself behaves like $dt$ as $dt \rightarrow 0$. Note here $dX^2 = (dX)^2$.

i.e. in the mean square sense, $dX^2 \rightarrow dt$ as $dt \rightarrow 0$

We saw a similar outcome from Quadratic Variation: $\sum_{i=1}^{n}(X_{t_i}-X_{t_{i-1}})^2 \rightarrow t$ as the partition of $[0,t]$ is refined

Implication: whenever you see $dX^2$, replace it with $dt$, i.e. $dX^2 = dt$

According to the Taylor series expansion (TSE), we get $F(X+dX) = F(X) +\frac{dF}{dX}dX +\frac{1}{2}\frac{d^2F}{dX^2}dX^2 + \dots$ 

In a regular TSE, say, $f(x+\delta x) = f(x) +\frac{df}{dx}\delta x +\frac{1}{2}\frac{d^2f}{dx^2}\delta x^2 + \dots$,
$\delta x^2$ is much smaller than $\delta x$ so we can ignore it, but for Brownian Motion $dX^2$ cannot be ignored: it behaves like $dt$. 

Total Change = $dF = F(X+dX) - F(X) = \frac{dF}{dX}dX +\frac{1}{2}\frac{d^2F}{dX^2}dX^2 = \frac{dF}{dX}dX +\frac{1}{2}\frac{d^2F}{dX^2}dt$

The extra $\frac{1}{2}\frac{d^2F}{dX^2}dt$ term is what differentiates stochastic calculus from everyday calculus. 

$\frac{dF}{dX}$ = diffusion

$\frac{1}{2}\frac{d^2F}{dX^2}$ = drift
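
To see the extra $dt$ term at work, the sketch below checks Ito I along one simulated path for $F(X)=\sin X$, one of the example functions above. It is an illustration under assumed parameters (seed, $T$, number of steps), not a proof: it compares $F(X_T)-F(X_0)$ with the discrete sums of $\frac{dF}{dX}dX$ and $\frac{1}{2}\frac{d^2F}{dX^2}dt$.

```python
import math
import random

rng = random.Random(0)        # fixed seed (assumption)
T, n = 1.0, 10000             # illustrative horizon and step count (assumptions)
dt = T / n

X = 0.0
diffusion_sum = 0.0           # sum of F'(X_i) dX_i, evaluated at the left endpoint
drift_sum = 0.0               # sum of (1/2) F''(X_i) dt
for _ in range(n):
    dX = rng.gauss(0.0, 1.0) * math.sqrt(dt)
    diffusion_sum += math.cos(X) * dX            # dF/dX = cos X
    drift_sum += 0.5 * (-math.sin(X)) * dt       # (1/2) d2F/dX2 = -(1/2) sin X
    X += dX

lhs = math.sin(X) - math.sin(0.0)   # F(X_T) - F(X_0)
rhs = diffusion_sum + drift_sum     # what Ito I predicts
print(f"F(X_T) - F(X_0)       = {lhs:.4f}")
print(f"Ito I right-hand side = {rhs:.4f}")
print(f"without the dt term   = {diffusion_sum:.4f}")
```

The first two numbers should agree up to discretization noise, while dropping the $dt$ term generally leaves a visible gap.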

### Ito II: $dF = (\frac{\partial F}{\partial t}+\frac{1}{2}\frac{\partial ^2 F}{\partial X^2})dt + \frac{\partial F}{\partial X}dX$ 

Use Ito II for functions that look like $F=F(t, X_t)$. For example: $t^2X^2, e^{t+X}, t^2 sinX$

Now we are dealing with two variables: $t \rightarrow t+dt$, and $X \rightarrow X+dX$, so we need to use two-dimensional TSE:

$F(t+dt, X+dX) = F(t,X) + \frac{\partial F}{\partial t}dt + \frac{\partial F}{\partial X}dX+\frac{1}{2}\frac{\partial ^2 F}{\partial X^2}dX^2$
$= F(t,X) + \frac{\partial F}{\partial t}dt + \frac{\partial F}{\partial X}dX+\frac{1}{2}\frac{\partial ^2 F}{\partial X^2}dt$

Total Change 
$= dF = F(t+dt, X+dX) - F(t,X) = (\frac{\partial F}{\partial t}+\frac{1}{2}\frac{\partial ^2 F}{\partial X^2})dt + \frac{\partial F}{\partial X}dX$ 

### Ito III: $dV = (\mu S\frac{dV}{dS}+\frac{1}{2}\sigma^2S^2\frac{d^2V}{dS^2})dt +(\sigma S\frac{dV}{dS})dX$


Rather than being interested in a function of a Brownian motion, we are now interested in a function of a stochastic process.

Ito III can be used for any financial contract whose value depends on the stock price.

Suppose we now wish to extend Ito I to consider the change in an option price $V(S)$ where the underlying variable $S$ follows a geometric Brownian motion. An obvious question we may ask is: what is the jump in $V$ when $S \rightarrow S+dS$?


Using T.S.E:

$V(S+dS)=V(S) + \frac{dV}{dS}dS +\frac{1}{2}\frac{d^2V}{dS^2}dS^2$

$dV = \frac{dV}{dS}dS +\frac{1}{2}\frac{d^2V}{dS^2}dS^2$

$= \frac{dV}{dS}(\mu Sdt+\sigma SdX) +\frac{1}{2}\frac{d^2V}{dS^2}\sigma^2S^2dt$

$= (\mu S\frac{dV}{dS}+\frac{1}{2}\sigma^2S^2\frac{d^2V}{dS^2})dt +(\sigma S\frac{dV}{dS})dX$

Hint: 

* if $dG = A(t, G_t)dt+B(t, G_t)dX_t$

* then $dG^2  =A^2dt^2 + 2ABdtdX+B^2dX^2 = O(dt^2)+O(dt^{3/2}) + B^2dt = B^2dt$ (keeping only the leading-order term)

Therefore, $dS^2 = (\mu Sdt + \sigma SdX)^2 = \sigma^2S^2dt$

Ito II vs Ito III: in Ito II the underlying variable is still a Brownian motion $X$; in Ito III the underlying $S$ is a stochastic process with its own drift and diffusion. 

#### Important Examples

**1. V (S) = logS**



$\frac{dV}{dS}=\frac{1}{S}$

$\frac{d^2V}{dS^2}=-\frac{1}{S^2}$

$dV = (\mu S(\frac{1}{S})+\frac{1}{2}\sigma^2S^2(-\frac{1}{S^2}))dt +\sigma S(\frac{1}{S})dX$ 

$d(\log S) = (\mu-\frac{1}{2}\sigma^2)dt +\sigma dX$

We want to solve for S because we want to determine stock prices. Here $\mu, \sigma \in R$ are constants.

Integrating both sides between 0 and t:

$\int_{0}^{t}d(\log S_\tau) = \int_{0}^{t}(\mu-\frac{1}{2}\sigma^2)d\tau +\int_{0}^{t}\sigma dX_\tau$   (t>0)

$\log\frac{S_t}{S_0} = (\mu-\frac{1}{2}\sigma^2)t + \sigma(X(t)-X(0))$


Take exponential on both sides, assuming X(0)=0 and S(0) = $S_0$:

$S(t) = S_0\exp((\mu-\frac{1}{2}\sigma^2)t + \sigma X(t))$

$= S_0\exp((\mu-\frac{1}{2}\sigma^2)t + \sigma\phi\sqrt{t})$

Usage:

$\int_{t}^{T}: S_T=S_texp((\mu-\frac{1}{2}\sigma^2)(T-t) + \sigma\phi\sqrt{T-t})$

$\int_{t}^{t+\delta t}: S_{t+\delta t}=S_texp((\mu-\frac{1}{2}\sigma^2)\delta t + \sigma\phi\sqrt{\delta t})$
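
The last formula can be used directly to sample the asset price. The sketch below is illustrative only (the function name `gbm_exact_step`, the seed and the parameter values are assumptions): it draws terminal prices in one exact step and compares the sample mean with $S_0e^{\mu T}$, the known mean of a lognormal random walk.

```python
import math
import random

def gbm_exact_step(S, mu, sigma, dt, phi):
    """One exact lognormal step: S * exp((mu - sigma^2/2) dt + sigma * phi * sqrt(dt))."""
    return S * math.exp((mu - 0.5 * sigma ** 2) * dt + sigma * phi * math.sqrt(dt))

rng = random.Random(1)                          # fixed seed (assumption)
S0, mu, sigma, T = 100.0, 0.05, 0.2, 1.0        # illustrative values (assumptions)
n_paths = 20000

terminal = [gbm_exact_step(S0, mu, sigma, T, rng.gauss(0.0, 1.0)) for _ in range(n_paths)]
sample_mean = sum(terminal) / n_paths
theory_mean = S0 * math.exp(mu * T)
print(f"sample mean of S_T: {sample_mean:.2f}, theory: {theory_mean:.2f}")
print(f"smallest simulated price: {min(terminal):.2f}")
```

Because the price is an exponential of a Normal variable, every simulated value is strictly positive, which is the property that motivated the geometric model.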

**2. Vasicek Interest rate model for short-term interest rates**

$dr = \gamma (\overline r - r) dt+\sigma dX$

$= - \gamma (r - \overline r) dt+\sigma dX$

$\gamma$ refers to the reversion rate and $\overline r$ denotes the mean rate

For the behavior of the interest rate, a closed form solution does exist. That's why people love Vasicek. 

Do a substitution: 

$u_t = r_t - \overline r$

then  $du  = d(r - \overline r) = dr$, because $\overline r$ is a constant  

$du = - \gamma udt+\sigma dX$

$du + \gamma udt = \sigma dX$

Now, consider an integrating factor, i.e. multiply by $e^{\gamma t}$    (need to review the integrating factor on math primer)

$e^{\gamma t}(du + \gamma udt) = e^{\gamma t}\sigma dX$

$d(e^{\gamma t}u) = e^{\gamma t}\sigma dX$

Integrate over (0, t)

$\int_{0}^{t}d(e^{\gamma s}u_s) = \int_{0}^{t}e^{\gamma s}\sigma dX_s$

$e^{\gamma t}u_t - u_0 = \sigma\int_{0}^{t}e^{\gamma s}dX_s$

$u_t = u_0e^{-\gamma t}+\sigma\int_{0}^{t}e^{\gamma(s-t)}dX_s$


Using integration by parts or the stochastic integration formula:

$=u_0e^{-\gamma t}+\sigma(X(t) - \gamma\int_{0}^{t}X(s)e^{\gamma(s-t)}ds)$    (need to review "integration by parts")


For $\int_{0}^{t} e^{\gamma(s-t)} dX_s$, using integration by parts $\int v\,dw = vw - \int w\,dv$ (the factors are named $v$ and $w$ here to avoid a clash with $u_t$):

$v = e^{\gamma(s-t)} \rightarrow dv=\gamma e^{\gamma(s-t)}ds$

$dw = dX \rightarrow w=X$


### Ito IV: $dV = (\frac{\partial V}{\partial t}+\mu S\frac{\partial V}{\partial S}+\frac{1}{2}\sigma^2S^2\frac{\partial^2V}{\partial S^2})dt+\sigma S\frac{\partial V}{\partial S}dX$


We will often want to simulate paths of correlated random walks. We may want to examine the statistical properties of a portfolio of stocks, or value a convertible bond under the assumption of random asset price and random interest rates.

Example:

Assets $S_1$ and $S_2$ both follow lognormal random walks with correlation $\rho$.
In continuous time we write:

$dS_1 = \mu_1S_1 dt+\sigma_1S_1 dX_1$

$dS_2 = \mu_2S_2 dt+\sigma_2S_2 dX_2$

with $E[dX_1 dX_2] = \rho dt$.  (this implies $E[\phi_1\phi_2]=\rho$)

In discrete time:

$S_{1_{i+1}}-S_{1_{i}} = S_{1_{i}}(\mu_1\delta t+\sigma_1\phi_1\delta t^{1/2})$

$S_{2_{i+1}}-S_{2_{i}} = S_{2_{i}}(\mu_2\delta t+\sigma_2\phi_2\delta t^{1/2})$

Q: How can we choose a $\phi_1$ and a $\phi_2$ which are both Normally distributed, both have mean zero and standard deviation of one,
and with a correlation of $\rho$ between them?

A: This can be done in two steps: first pick two uncorrelated Normally distributed random variables, and then combine them.

Step 1: 

Simulate independent $\epsilon_1$ and $\epsilon_2$ such that $\epsilon_1$, $\epsilon_2$ ~ $N(0,1)$

$E[\epsilon_1\epsilon_2]=0$

The target is a pair $\phi_1$ and $\phi_2$ such that $\phi_1$, $\phi_2$ ~ $N(0,1)$

$E[\phi_1\phi_2]=\rho$

Recall $E[\phi_1]=E[\phi_2]=0$ and $V[\phi_1]= V[\phi_2] =1$

Step 2: Convert these independent Normal numbers into correlated Normals by taking a linear combination.

Set $\phi_1=\epsilon_1$

and $\phi_2=\alpha\epsilon_1+\beta\epsilon_2$

(1) We know $E[\phi_1\phi_2]=\rho$

$E[\epsilon_1(\alpha\epsilon_1+\beta\epsilon_2)] = \rho$

$\alpha E[\epsilon_1^2]+\beta E[\epsilon_1\epsilon_2] = \rho$

$\alpha(1)+\beta(0)=\rho$

$\therefore \alpha=\rho$

(2) We know $E[\phi_2^2]=1$

$E[(\alpha\epsilon_1+\beta\epsilon_2)^2]= 1$

$E[\alpha^2\epsilon_1^2+\beta^2\epsilon_2^2+2\alpha\beta\epsilon_1\epsilon_2] = 1$

$\alpha^2E[\epsilon_1^2]+\beta^2E[\epsilon_2^2]+2\alpha\beta E[\epsilon_1\epsilon_2]=1$

$\alpha^2(1)+\beta^2(1)+2\alpha\beta(0)=1$

$\rho^2+\beta^2=1$

$\beta = \sqrt{1-\rho^2}$ (taking the positive root)

Therefore, $\phi_1=\epsilon_1$, $\phi_2=\rho\epsilon_1+\sqrt{1-\rho^2}\epsilon_2$

Weighted sums of independent Normally distributed numbers are themselves Normally distributed!

If $X_i$ ~ $N(\mu_i, \sigma_i^2)$ are independent for $i=1, ..., n$, then

$\sum_{i=1}^{n}w_iX_i$ ~ $N(\sum_{i=1}^{n}w_i\mu_i,\sum_{i=1}^{n}w_i^2\sigma_i^2)$
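
The two-step recipe above can be sketched in a few lines. This is an illustration with assumed values for $\rho$, the seed and the sample size; the helper name `correlated_pair` is hypothetical.

```python
import math
import random

def correlated_pair(rho, rng):
    """Return (phi1, phi2), both N(0,1), with correlation rho."""
    eps1 = rng.gauss(0.0, 1.0)                  # step 1: independent N(0,1) draws
    eps2 = rng.gauss(0.0, 1.0)
    phi1 = eps1                                 # step 2: linear combination
    phi2 = rho * eps1 + math.sqrt(1.0 - rho ** 2) * eps2
    return phi1, phi2

rng = random.Random(7)                          # fixed seed (assumption)
rho, n = 0.6, 20000                             # illustrative values (assumptions)
pairs = [correlated_pair(rho, rng) for _ in range(n)]

m1 = sum(p[0] for p in pairs) / n
m2 = sum(p[1] for p in pairs) / n
v1 = sum((p[0] - m1) ** 2 for p in pairs) / n
v2 = sum((p[1] - m2) ** 2 for p in pairs) / n
cov = sum((p[0] - m1) * (p[1] - m2) for p in pairs) / n
sample_rho = cov / math.sqrt(v1 * v2)
print(f"sample variance of phi2: {v2:.3f} (target 1)")
print(f"sample correlation:      {sample_rho:.3f} (target {rho})")
```

For more than two assets the same idea is usually carried out with a Cholesky factor of the correlation matrix; the two-asset formula is the $2\times 2$ special case.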

Ito IV:

Suppose $V=V(t,S)$, what happens when $t \rightarrow t+dt$, $S \rightarrow  S+dS$?

Use 2D TSE:

$V(t+dt, S+dS) = V(t,S)$
$+\frac{\partial V}{\partial t}dt$
$+\frac{\partial V}{\partial S}dS$
$+\frac{1}{2}\frac{\partial^2V}{\partial S^2}dS^2$


$dV = \frac{\partial V}{\partial t}dt$
$+\frac{\partial V}{\partial S}(\mu Sdt+\sigma SdX)$
$+\frac{1}{2}\frac{\partial^2V}{\partial S^2}(\sigma^2S^2dt)$

$dV = (\frac{\partial V}{\partial t}+\mu S\frac{\partial V}{\partial S}+\frac{1}{2}\sigma^2S^2\frac{\partial^2V}{\partial S^2})dt$
$+\sigma S\frac{\partial V}{\partial S}dX$



### Ito V: 

2 stocks, $\frac{dS_i}{S_i}=\mu_idt+\sigma_i dX_i$ for $i=1,2$, with $E(dX_1dX_2)=\rho dt$

Write dV where $V=V(t,S_1,S_2)$ see M1L6.

### Ito's Integral (Stochastic Integration formula):

Properties of Ito Integrals:

1. Linearity:

$\int_{0}^{T}(\alpha f(X_t)+\beta g(X_t))dX_t=\alpha\int_{0}^{T}f(X_t)dX_t+\beta\int_{0}^{T}g(X_t)dX_t$

2. Ito Isometry:

$E[(\int_{0}^{T}f_tdX_t)^2]=E[\int_{0}^{T}f^2_tdt]$

This is used in calculating the variance of a stochastic process.
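
As a numerical illustration of the Ito isometry, the sketch below takes $f_t = X_t$ (an assumed example), for which both sides equal $\int_0^T t\,dt = T^2/2$. Seed, step count and path count are illustrative assumptions, and the discrete sums only approximate the integrals.

```python
import math
import random

rng = random.Random(3)                          # fixed seed (assumption)
T, n_steps, n_paths = 1.0, 100, 10000           # illustrative values (assumptions)
dt = T / n_steps
sqrt_dt = math.sqrt(dt)

lhs_total = 0.0   # accumulates (integral of X dX)^2 over paths
rhs_total = 0.0   # accumulates integral of X^2 dt over paths
for path in range(n_paths):
    X = 0.0
    ito_integral = 0.0
    time_integral = 0.0
    for step in range(n_steps):
        dX = rng.gauss(0.0, 1.0) * sqrt_dt
        ito_integral += X * dX                  # left-endpoint (Ito) sum
        time_integral += X * X * dt
        X += dX
    lhs_total += ito_integral ** 2
    rhs_total += time_integral

lhs = lhs_total / n_paths
rhs = rhs_total / n_paths
print(f"E[(int X dX)^2] ~ {lhs:.3f}")
print(f"E[int X^2 dt]   ~ {rhs:.3f}")
print(f"theory T^2/2    = {T ** 2 / 2:.3f}")
```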


#### Integration for Ito II: $\int_{0}^{t} \frac{\partial F}{\partial X}dX= F(t,X_t)-F(0,X_0) - \int_{0}^{t}(\frac{\partial F}{\partial t}+\frac{1}{2}\frac{\partial^2F}{\partial X^2})dt$


Start with Ito II: 

$\frac{\partial F}{\partial X}dX = dF - (\frac{\partial F}{\partial t}+\frac{1}{2}\frac{\partial ^2 F}{\partial X^2})dt $

Integrate both sides over [0, t]:

$\int_{0}^{t} \frac{\partial F}{\partial X}dX$
$= \int_{0}^{t} dF - \int_{0}^{t}(\frac{\partial F}{\partial t}+\frac{1}{2}\frac{\partial^2F}{\partial X^2})dt$
$= F(t,X_t)-F(0,X_0) - \int_{0}^{t}(\frac{\partial F}{\partial t}+\frac{1}{2}\frac{\partial^2F}{\partial X^2})dt$

$\int_{0}^{t} \frac{\partial F}{\partial X}dX$ is called an Ito Integral.

#### Integration for Ito I: $\int_{0}^{t} \frac{\partial F}{\partial X}dX= F(X_t)-F(X_0) - \frac{1}{2}\int_{0}^{t}\frac{\partial^2F}{\partial X^2}dt$


### Ito Product Rule

Let $dX_t = a(t,X_t)dt+b(t,X_t)dW^{(1)}_t$, and $dY_t=c(t,Y_t)dt+d(t,Y_t)dW^{(2)}_t$ 

Find $dF$ for $F=X_tY_t$

Using T.S.E:

$dF=\frac{\partial F}{\partial X}dX+\frac{\partial F}{\partial Y}dY+\frac{1}{2}\frac{\partial^2F}{\partial X^2}dX^2$
$+\frac{1}{2}\frac{\partial^2F}{\partial Y^2}dY^2 +\frac{\partial^2 F}{\partial X\partial Y}dXdY$

$= YdX + XdY + 0 + 0 + dXdY$

So:

$d(XY) = YdX + XdY + dXdY$  --- (Ito rule for products). $dXdY$ cannot be ignored because it is $O(dt)$.

Note, this is different from the everyday differentiation: $(fg)'=f'g+fg'$
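
The product rule can be illustrated on a single path. The sketch below makes simplifying assumptions that are not in the notes: zero drift and unit diffusion ($a=c=0$, $b=d=1$), and driving Brownian motions with correlation $\rho$, so that the accumulated $dXdY$ term should be close to $\rho T$.

```python
import math
import random

rng = random.Random(11)                 # fixed seed (assumption)
rho, T, n = 0.5, 1.0, 10000             # illustrative values (assumptions)
dt = T / n
sqrt_dt = math.sqrt(dt)

X = Y = 0.0
int_Y_dX = 0.0      # sum of Y_i dX_i
int_X_dY = 0.0      # sum of X_i dY_i
cross = 0.0         # sum of dX_i dY_i
for _ in range(n):
    e1 = rng.gauss(0.0, 1.0)
    e2 = rng.gauss(0.0, 1.0)
    dX = e1 * sqrt_dt
    dY = (rho * e1 + math.sqrt(1.0 - rho ** 2) * e2) * sqrt_dt
    int_Y_dX += Y * dX
    int_X_dY += X * dY
    cross += dX * dY
    X += dX
    Y += dY

product = X * Y
ito_rhs = int_Y_dX + int_X_dY + cross
print(f"X_T * Y_T               = {product:.4f}")
print(f"Y dX + X dY + dX dY sum = {ito_rhs:.4f}")
print(f"sum of dX dY            = {cross:.4f} (compare rho * T = {rho * T})")
```

In discrete form the identity holds exactly step by step; the stochastic content is that the cross term does not vanish in the limit.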

## Simulating random walks

### Simulating the lognormal random walk

The lognormal random walk model for assets can be written in continuous time as:

$dS = \mu S dt+\sigma S dX$

In discrete time we write it as:

$\delta S=\mu S\delta t +\sigma S\phi\sqrt{\delta t}$

$S_{i+1} - S_i = S_i(\mu \delta t +\sigma\phi\sqrt{\delta t})$

$i$ means $t_i$

$S_{i+1} =S_i(1+\mu \delta t +\sigma\phi\sqrt{\delta t})$

$S_{i+1}$ is the value at the next timestep. 

We can easily simulate the model using a spreadsheet with the above formula. The method is called the Euler-Maruyama method. Its accuracy is $O(\sqrt{\delta t})$


For a generic stochastic process $dy = A(y,t)dt +B(y,t)dX$

We get $y_{n+1} = y_n + A(y_n,t_n)\delta t+B(y_n,t_n)\Delta X_n$, where $\Delta X_n = \phi\sqrt{\delta t}$

This will be proved in Module 3. 
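
A minimal sketch of the generic update, written in Python rather than a spreadsheet. The function name `euler_maruyama` and all parameter values are assumptions for illustration; the lognormal walk is recovered by choosing $A(y,t)=\mu y$ and $B(y,t)=\sigma y$.

```python
import math
import random

def euler_maruyama(A, B, y0, T, n_steps, rng):
    """Terminal value of dy = A(y,t) dt + B(y,t) dX using the Euler-Maruyama update."""
    dt = T / n_steps
    y, t = y0, 0.0
    for _ in range(n_steps):
        dX = rng.gauss(0.0, 1.0) * math.sqrt(dt)    # Delta X_n = phi * sqrt(dt)
        y = y + A(y, t) * dt + B(y, t) * dX
        t += dt
    return y

mu, sigma, S0, T = 0.05, 0.2, 100.0, 1.0            # illustrative values (assumptions)
rng = random.Random(8)                              # fixed seed (assumption)
n_paths = 5000

terminal = [
    euler_maruyama(lambda y, t: mu * y, lambda y, t: sigma * y, S0, T, 50, rng)
    for _ in range(n_paths)
]
sample_mean = sum(terminal) / n_paths
theory_mean = S0 * math.exp(mu * T)
print(f"Euler-Maruyama mean of S_T: {sample_mean:.2f}, exact mean: {theory_mean:.2f}")
```

Passing the drift and diffusion as functions keeps the stepping loop unchanged when the model changes.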

### Simulating other random walks

This method is not restricted to the lognormal random walk.

The following is a stochastic differential equation model for an interest rate, that goes by the name of an Ornstein-Uhlenbeck
process (an example of a mean-reverting random walk), or when used in an interest rate context the Vasicek model:

$dr = (\eta -\gamma r)dt+\sigma dX$

In discrete time we can approximate this by:

$r_{i+1} = r_i + (\eta -\gamma r_i)\delta t+\sigma\phi\sqrt{\delta t}$
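
The discrete update can be sketched as follows. Parameter values, the seed and the variable names are assumptions; the comparison value uses the known mean of this process, $E[r_T] = \frac{\eta}{\gamma} + (r_0 - \frac{\eta}{\gamma})e^{-\gamma T}$.

```python
import math
import random

eta, gamma, sigma = 0.10, 2.0, 0.02     # illustrative values: long-run mean eta/gamma = 0.05
r0, T, n_steps, n_paths = 0.10, 1.0, 100, 2000
dt = T / n_steps
rng = random.Random(9)                  # fixed seed (assumption)

finals = []
for path in range(n_paths):
    r = r0
    for step in range(n_steps):
        phi = rng.gauss(0.0, 1.0)
        r = r + (eta - gamma * r) * dt + sigma * phi * math.sqrt(dt)
    finals.append(r)

sample_mean = sum(finals) / n_paths
theory_mean = eta / gamma + (r0 - eta / gamma) * math.exp(-gamma * T)
print(f"simulated mean of r_T: {sample_mean:.5f}")
print(f"theoretical mean:      {theory_mean:.5f}")
```

Starting above the long-run level, the simulated mean is pulled back toward $\eta/\gamma$, which is the mean-reverting behavior described above. Note that nothing in this model stops individual paths from going negative.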



# Martingales

Here we focus on "Continuous time", not discrete time.

A martingale is a driftless stochastic process (or constant mean process). This is the nice thing about martingales: it enables us to focus on the diffusion part. 

If something is a martingale, then its expected change is zero, i.e. the expected future value equals the current value. 


Three Encounters with Martingale:
1. Martingales as a class of stochastic process
2. Exponential martingales (module 3): Can be used to price derivatives
3. Equivalent martingale measures (module 3):

    Binomial model: we move from the real world measure to risk-neutral pricing; in this case the underlying stock price doesn't really matter. 

    That is: for $\frac{dS_i}{S_i}=\mu_idt+\sigma_idW_i$, the $\mu_i$ is going to disappear and be replaced with a constant growth rate, the same across all stocks.

## Conditional Expectations

### Components

Triple (or 3-tuple): ($\Omega, F, P$) are always given:

$\Omega$: sample space (i.e. the set of all possible outcomes)

$F_t$: filtration (i.e. a record of information). We record more and more information as outcomes are revealed over time. Each $F_t$ is a $\sigma$-field (or $\sigma$-algebra), and the filtration is the increasing family of them. For example, we talked about the conditional expectation $E[S_6|S_5]$: we can record $S_5$, and $S_5$ together with all earlier information makes up the filtration at time $t_5$, i.e. all information up until time $t_5$. We can use that information to predict where we will be in the future. We don't need to know all the info in the filtration; just $S_5$ is enough. $F_0$ carries no information (it is the trivial $\sigma$-algebra $\{\emptyset, \Omega\}$).

$P$: probability measure, which is a generalized CDF for assigning probabilities to outcomes. For example, $P(\omega_1)$, $P(\omega_2)$. The random variables are $X(\omega_1)$, $X(\omega_2)$. In the continuous case we are looking at the probability of an interval, $x \rightarrow x+dx$, so we need a theory of integration based on measure theory.

### Adapted (Measurable) Process

* It's adapted to a filtration. 

* A stochastic process $S_t$ is said to be adapted to the Filtration $F_t$ (or measurable with respect to $F_t$, or $F_t$-adapted) if the value of $S_t$ at time t is known given the information set $F_t$.
* We are interested in calculating $E_t[S_T|F_t]$ (We are interested in calculating the expected value of $S_T$ given $F_t$)
* Given that we don't need to know all of $F_t$, sometimes we write the above as $E_t[S_T|S_t]$
* We can also write it as $Y=E[X|F] = E_t(X_{t+1}|F_t)$

### Properties of Conditional Expectations

If $X, Y$ are integrable random variables and $\alpha, \beta$ are constants then:
1. Linearity: this is using the linear property of an integral
$E[\alpha X+\beta Y|F] = \alpha E[X|F] + \beta E[Y|F]$

* Here, integrable means the expected absolute value is finite, i.e. $E[|X|]<\infty$, where $E[h(X)] = \int_{-\infty}^{\infty}h(x)p(x)dx$

2. Tower Property (i.e. iterated expectations)
if $F \subset G$

* $E[E[X|G]|F]=E[X|F]$
* the smallest filtration always wins

3. A special case of Tower property:
* Because the smallest filtration wins, no information wins
* No information = unconditional expectation
* i.e. $E[E[X|F]]=E[X]$

4. Taking out what is known:
if X is F-measurable, then the value of X is known once we know F. Therefore $E[X|F]=X$


5. Taking out what is known (2):
* if $X$ is $F$-measurable but $Y$ is not, then $E[XY|F]=XE[Y|F]$.
* for example, if you are trying to figure out Apple price, but somebody sneaked in Google price, which is useless for your analysis.
* We don't know $Y$. $Y$ is a random variable.
* X behaves like a scalar

6. Independence: 
* if $X$ is independent from $F$, then knowing $F$ is useless to predict the value of $X$.
* i.e. $E[X|F] = E[X]$

7. Positivity:
* if $X \geq 0$, then $E[X|F] \geq 0$

8. Jensen's Inequality:
let $f$ be a convex function (e.g. $|X|$, $X^2$, $e^X$), then $f(E[X|F]) \leq E[f(X)|F]$

## Discrete Time Martingales

We will start with discrete time and later we can focus on continuous time.

A discrete time stochastic process {$M_t:t=0,...,T$} such that $M_t$ is $F_t$-measurable for $t \in$ {$0,...,T$} is a martingale if $E[|M_t|] < \infty$ (i.e. integrability) and $E_t[M_{t+1}|F_t]=M_t$ (i.e. a Martingale is a driftless process), which can also be written as $E_t[M_{t+1}|M_t]=M_t$



Key takeaway: 
* We go from 0 to time t, and at time t, we decided to calculate the value for time t+1.
* the mean doesn't change, i.e. $E[M_{t+1}] = E[M_t]$:
    * $E_t[M_{t+1}|F_t]=M_t \rightarrow E[E_t[M_{t+1}|F_t]]=E[M_t]$, due to Tower Property, the equation can be reduced to $E[M_{t+1}]=E[M_t]$
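
A small exact check of these two takeaways, using an assumed example that is not in the notes: a symmetric random walk $M_t$ built from fair $\pm 1$ steps. All $2^T$ paths are enumerated, so no simulation noise is involved.

```python
from collections import defaultdict
from itertools import product

T = 6
paths = list(product([-1, 1], repeat=T))    # every path of fair +/-1 steps, equally likely

def M(path, t):
    """Value of the walk at time t: the sum of the first t steps."""
    return sum(path[:t])

# Unconditional mean at each time: should stay constant (here 0).
means = [sum(M(p, t) for p in paths) / len(paths) for t in range(T + 1)]

# Conditional expectation E[M_{t+1} | F_t]: group paths by their history up to time t.
t = 3
groups = defaultdict(list)
for p in paths:
    groups[p[:t]].append(M(p, t + 1))
martingale_ok = all(sum(vals) / len(vals) == sum(history) for history, vals in groups.items())

print("E[M_t] for t = 0..T:", means)
print("E[M_{t+1} | F_t] equals M_t for every history:", martingale_ok)
```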

## Continuous Time Martingales

A continuous time stochastic process {$M_t:t \in R^+$} such that $M_t$ is $F_t$-measurable for $t \in R^+$ is a martingale if $E[|M_t|] < \infty$ (i.e. integrability) and $E_s[M_t|F_s]=M_s$ (i.e. a Martingale is a driftless process), where $0 \leq s \leq t$, which can also be written as $E_s[M_t|M_s]=M_s$. $M_s$ is our starting point (s is now, whereas t is the future). We have information up to time s.

**Proving that a Continuous Time Stochastic Process is a Martingale**
Consider a stochastic process $Y(t)$ with $dY(t)=f(Y_t,t)dt + g(Y_t,t)dX(t)$, where $Y(0)=Y_0$

Y(t) is a martingale if and only if it satisfies the martingale condition $E[Y_t|F_s]=Y_s$, $0 \leq s \leq t$.

Proof:

$dY(t)=f(Y_t,t)dt + g(Y_t,t)dX(t)$

Integrating both sides: 

$Y(t) = Y(s) + \int_{s}^{t}f(Y_u,u)du+\int_{s}^{t}g(Y_u, u)dX(u)$

Taking the expectation conditional on the filtration at time s, we get:

$E[Y_t|F_s] = E[Y(s) + \int_{s}^{t}f(Y_u,u)du+\int_{s}^{t}g(Y_u, u)dX(u)|F_s]$

$=Y(s) + E[\int_{s}^{t}f(Y_u,u)du|F_s] + 0$  (the last item is 0 because an Ito integral is a martingale)

So, $Y(t)$ is a martingale iff $E[\int_{s}^{t}f(Y_u,u)du|F_s]=0$. This is why we say that martingales are "driftless processes".

Consider a Brownian Motion with drift: $\mu t+X_t$. Is this a martingale?

Answer: no

Solution:

$E[\mu t+X_t|F_s]=E[\mu t+X_t-X_s+X_s|F_s]=E[X_t-X_s]+E[\mu t+X_s|F_s]$

The first item is zero, because $X_t-X_s$ ~ $N(0,t-s)$ is independent of $F_s$  
The second item is $\mu t+X_s \neq \mu s+X_s$ (for $\mu \neq 0$ and $t > s$), so not a martingale 
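
The same conclusion can be seen by simulation. In the sketch below the values of $\mu$, $s$, $t$ and the observed $X_s$ are assumptions chosen for illustration; the conditional mean of $\mu t + X_t$ given the information at time $s$ comes out near $\mu t + X_s$, not near the current value $\mu s + X_s$.

```python
import math
import random

rng = random.Random(5)              # fixed seed (assumption)
mu, s, t = 0.5, 1.0, 3.0            # illustrative drift and times (assumptions)
X_s = 0.7                           # assumed known value of the Brownian motion at time s
n = 20000

Y_s = mu * s + X_s                  # current value of the process
# X_t = X_s + increment, with increment ~ N(0, t - s) independent of the past
samples = [mu * t + X_s + rng.gauss(0.0, 1.0) * math.sqrt(t - s) for _ in range(n)]
cond_mean = sum(samples) / n

print(f"current value Y_s:            {Y_s:.3f}")
print(f"estimated E[Y_t | F_s]:       {cond_mean:.3f}")
print(f"theory mu*t + X_s:            {mu * t + X_s:.3f}")
```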


## Lévy's Martingale Characterisation

Let $X_t$, $t > 0$ be a stochastic process and let $F_t$ be the filtration generated by it. $X_t$ is a Brownian motion if the following conditions are satisfied:

1. $X_0$ = 0
2. the sample paths $t \rightarrow X_t$ are continuous
3. $X_t$ is a martingale with respect to the filtration $F_t$
4. Quadratic Variation property: $|X_t|^2 -t$ is a martingale with respect to the filtration $F_t$

## Ito integrals and martingales

Ito integrals are martingales

Consider the stochastic process $y(t) = X^2(t)$

Ito I tells us: $dF = \frac{1}{2}\frac{d^2F}{dX^2}dt + \frac{dF}{dX}dX$

Therefore $dy = 2XdX + dt$

Integrate both sides over $[0, T]$, using $X(0)=0$:

$y(T) = \int_{0}^{T}2X(t)dX(t) + \int_{0}^{T}dt$

$X^2(T) = \int_{0}^{T}2X(t)dX(t) + T$

Because $E[X^2(T)]=T$ (the variance of $X(T)$), taking expectations gives $E[\int_{0}^{T}2X(t)dX(t)] = 0$

For the $dt$ integral, Fubini's Theorem lets us swap expectation and integration: 
$E[\int_{0}^{T}f(X_t)dt]=\int_{0}^{T}E[f(X_t)]dt$

This is consistent with the Ito integral $\int_{0}^{T}2X(t)dX(t)$ being a martingale. In fact, all Ito integrals $\int_{0}^{T} g(t,X_t)dX_t$ are Martingales (under technical integrability conditions on $g$). 

### Martingale Representation Theorem

The converse also holds: we can represent any martingale as an Ito integral - 
i.e.:

If $M_t$ is a martingale, then there exists a function $g(t,X_t)$ satisfying suitable technical conditions such that 
$M_T = M_0 + \int_{0}^{T}g(t,X_t)dX_t$. We have to find that $g(t,X_t)$.

### Riemann integral

There are a number of ways to approximate $\int_{0}^{T}f(t)dt$, and they will all be equal to each other as $N \rightarrow \infty$.

1. left hand rectangle rule:

$\int_{0}^{T}f(t)dt=\lim_{N\rightarrow\infty}\sum_{i=0}^{N-1}f(t_i)(t_{i+1}-t_i)$



2. right hand rectangle rule:

$\int_{0}^{T}f(t)dt=\lim_{N\rightarrow\infty}\sum_{i=0}^{N-1}f(t_{i+1})(t_{i+1}-t_i)$



3. trapezium rule:

$\int_{0}^{T}f(t)dt=\lim_{N\rightarrow\infty}\sum_{i=0}^{N-1}\frac{1}{2}(f(t_i)+f(t_{i+1}))(t_{i+1}-t_i)$


4. midpoint rule:

$\int_{0}^{T}f(t)dt=\lim_{N\rightarrow\infty}\sum_{i=0}^{N-1}f(\frac{1}{2}(t_i+t_{i+1}))(t_{i+1}-t_i)$


Now, consider the stochastic integral of the form:

$\int_{0}^{T}f(t,X)dX=\int_{0}^{T}f(t,X(t))dX(t)$, where $X(t)$ is a Brownian motion.


Following the Riemann integral, we can approximate the above integral as:

(1) $\lim_{N\rightarrow\infty}\sum_{i=0}^{N-1}f(t_i,X_i)(X_{i+1}-X_i)$ --- (left hand rectangle rule)

or (2) $\lim_{N\rightarrow\infty}\sum_{i=0}^{N-1}f(t_{i+1},X_{i+1})(X_{i+1}-X_i)$  --- (right hand rectangle rule)

or (3) $\lim_{N\rightarrow\infty}\sum_{i=0}^{N-1}f(t_{i+\frac{1}{2}},X_{i+\frac{1}{2}})(X_{i+1}-X_i)$  --- (midpoint rule)

However, in the case of a stochastic integrator $dX(t)$, (1) $\neq$ (2) $\neq$ (3)

(1) = $\int_{0}^{T}f(t,X_t)dX_t$, which results in the Ito Integral. It is non-anticipatory because: given that we are at time $t_i$, we know $X_i=X(t_i)$, and therefore we know $f(t_i,X_i)$. The only uncertainty is $X_{i+1}$, but as $N \rightarrow \infty$, $X_{i+1}-X_i = dX \rightarrow 0$.

(2) is anticipatory, given that at time $t_i$ we know $X_i$ but are uncertain about the future value of $X_{i+1}$. Thus we are uncertain about both the value of $f(t_{i+1}, X_{i+1})$ and the value of $(X_{i+1} - X_i)$
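
The gap between rules (1) and (2) is easy to see numerically for $f(t,X)=X$. The sketch below (seed and step count are assumptions) evaluates both sums on the same path. From $X^2(T) = \int_{0}^{T}2X\,dX + T$, the Ito value is $\frac{1}{2}(X_T^2 - T)$, and the two sums differ by the quadratic variation, which is close to $T$.

```python
import math
import random

rng = random.Random(2)          # fixed seed (assumption)
T, n = 1.0, 10000               # illustrative values (assumptions)
dt = T / n

X = 0.0
left_sum = 0.0                  # rule (1): integrand at the left endpoint
right_sum = 0.0                 # rule (2): integrand at the right endpoint
for _ in range(n):
    dX = rng.gauss(0.0, 1.0) * math.sqrt(dt)
    X_next = X + dX
    left_sum += X * dX
    right_sum += X_next * dX
    X = X_next

ito_value = 0.5 * (X ** 2 - T)
print(f"left-hand sum  = {left_sum:.4f}")
print(f"Ito value      = {ito_value:.4f}")
print(f"right-hand sum = {right_sum:.4f}")
print(f"right - left   = {right_sum - left_sum:.4f} (compare T = {T})")
```

For an ordinary smooth integrator the difference between the two sums would vanish as $N \rightarrow \infty$; here it does not.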

Example:

From Ito I: we know $3\int_{0}^{T}X^2dX=X(T)^3-X(0)^3-3\int_{0}^{T}X(t)dt$

This can also be derived from the Riemann-style sum:

$3\int_{0}^{T}X^2dX= \lim_{N \rightarrow \infty}3\sum_{i=0}^{N-1}X^2_i(X_{i+1}-X_i)$

Using $3b^2(a-b)=a^3-b^3-3b(a-b)^2-(a-b)^3$, we can assign $a = X_{i+1}, b =X_i$, hence:

$\lim_{N \rightarrow \infty}\sum_{i=0}^{N-1}3X^2_i(X_{i+1}-X_i)$
$=\lim_{N \rightarrow \infty}(\sum_{i=0}^{N-1}X^3_{i+1} - \sum_{i=0}^{N-1}X^3_i - \sum_{i=0}^{N-1}3X_i(X_{i+1}-X_i)^2 - \sum_{i=0}^{N-1}(X_{i+1}-X_i)^3)$

$=X^3_N - X^3_0 - \int_{0}^{T}3X(t)dt -O(dt^{3/2}) = X(T)^3 - X(0)^3 - \int_{0}^{T}3X(t)dt$
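
A quick pathwise check of this identity, as a sketch with assumed seed and step count: the left-hand Ito sum $3\sum X_i^2(X_{i+1}-X_i)$ is compared with $X(T)^3 - 3\int_{0}^{T}X(t)dt$, where the time integral is approximated by a left-hand Riemann sum and $X(0)=0$.

```python
import math
import random

rng = random.Random(4)          # fixed seed (assumption)
T, n = 1.0, 10000               # illustrative values (assumptions)
dt = T / n

X = 0.0
ito_sum = 0.0                   # 3 * sum of X_i^2 (X_{i+1} - X_i)
time_integral = 0.0             # 3 * sum of X_i dt
for _ in range(n):
    dX = rng.gauss(0.0, 1.0) * math.sqrt(dt)
    ito_sum += 3.0 * X * X * dX
    time_integral += 3.0 * X * dt
    X += dX

lhs = ito_sum
rhs = X ** 3 - time_integral
print(f"3 * int X^2 dX        ~ {lhs:.4f}")
print(f"X_T^3 - 3 * int X dt  ~ {rhs:.4f}")
```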

## Exponential Martingales

Consider the stochastic process Y (t) satisfying the SDE $dY (t) = f(t)dt + g(t)dX(t)$, $Y(0) = Y_0$ (initial condition), where $f(t)$ and $g(t)$ are two time-dependent functions and $X(t)$ is a standard Brownian motion. How do we choose f(t) so that $Z(t) = e^{Y(t)}$ is a martingale (i.e. driftless process)?

$dZ$ is the SDE for Z

$\frac{dZ}{dY}=e^Y = \frac{d^2Z}{dY^2}$


$dY^2 = f^2dt^2+g^2dX^2 + 2fgdtdX$
$= O(dt^2) + g^2dt + O(dt^{3/2}) = g^2dt$


$dZ = de^Y = \frac{dZ}{dY}dY + \frac{1}{2}\frac{d^2Z}{dY^2}dY^2$

$=\frac{dZ}{dY}(fdt + gdX) + \frac{1}{2}\frac{d^2Z}{dY^2}g^2dt$

$=e^Y(f+\frac{1}{2}g^2)dt + e^YgdX$

Z is a martingale if and only if it is a driftless process. Therefore $f+\frac{1}{2}g^2 = 0 \rightarrow f=-\frac{1}{2}g^2$

Now we have $dZ = gZdX$. This exponential martingale forms the basis of option pricing. 

so, $dY = fdt + gdX = -\frac{1}{2}g^2dt + gdX$

$Y(T) = Y_0 -\frac{1}{2}\int_{0}^{T}g^2dt+\int_{0}^{T}gdX$


$Z(T) = e^{Y(T)} = \exp(Y_0 -\frac{1}{2}\int_{0}^{T}g^2dt+\int_{0}^{T}gdX)$ 

$= Z_0\exp(-\frac{1}{2}\int_{0}^{T}g^2dt+\int_{0}^{T}gdX)$
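
For a constant $g$ the result reduces to $Z(T) = Z_0\exp(-\frac{1}{2}g^2T + gX(T))$, which is easy to test by simulation. The sketch below assumes a constant $g$, $Z_0 = 1$, and illustrative values for the seed and sample size. It also shows what happens without the $-\frac{1}{2}g^2$ correction: the mean drifts up to $e^{g^2T/2}$.

```python
import math
import random

rng = random.Random(6)              # fixed seed (assumption)
g, T, n = 0.5, 1.0, 20000           # constant g and other illustrative values (assumptions)

with_drift_fix = []                 # Z_T = exp(-g^2 T / 2 + g X_T), i.e. f = -g^2/2
without_fix = []                    # exp(g X_T), i.e. f = 0
for _ in range(n):
    X_T = rng.gauss(0.0, 1.0) * math.sqrt(T)
    with_drift_fix.append(math.exp(-0.5 * g ** 2 * T + g * X_T))
    without_fix.append(math.exp(g * X_T))

mean_Z = sum(with_drift_fix) / n
mean_no_fix = sum(without_fix) / n
print(f"E[Z_T] with f = -g^2/2: {mean_Z:.4f} (martingale: Z_0 = 1)")
print(f"E[e^(g X_T)] with f = 0: {mean_no_fix:.4f} (theory {math.exp(0.5 * g ** 2 * T):.4f})")
```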

## Equivalent Measure

Changing Probability Measure: Change from real world probabilities $p$ and $(1-p)$ to risk-neutral world probabilities $q$ and $(1-q)$.

If two measures P and Q share the same sample space and if P(A) = 0 implies Q(A) = 0 for every event A, we say that Q is absolutely continuous with respect to P and denote this by Q << P.

* all impossible events under P remain impossible under Q
* The probability mass of the possible events will be distributed differently under P and Q. That is, an event with positive probability under P may have a different (but still defined) probability under Q.

If Q << P and P << Q then the two measures are said to be equivalent, denoted by P ~ Q.
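
On a finite sample space these definitions can be checked outcome by outcome. The sketch below uses a hypothetical two-outcome (up/down) space; the function names and the probabilities are assumptions for illustration only.

```python
def absolutely_continuous(Q, P):
    """True if Q << P on a finite sample space: P(w) = 0 implies Q(w) = 0."""
    return all(Q.get(w, 0.0) == 0.0 for w, p in P.items() if p == 0.0)

def equivalent(P, Q):
    """True if P ~ Q, i.e. Q << P and P << Q."""
    return absolutely_continuous(Q, P) and absolutely_continuous(P, Q)

P = {"up": 0.6, "down": 0.4}    # "real world" probabilities p, 1 - p (assumed values)
Q = {"up": 0.5, "down": 0.5}    # "risk-neutral" probabilities q, 1 - q (assumed values)
R = {"up": 1.0, "down": 0.0}    # a measure that rules out "down"

print("P ~ Q:", equivalent(P, Q))                     # same possible outcomes, different weights
print("R << P:", absolutely_continuous(R, P))         # P has no impossible outcomes
print("P << R:", absolutely_continuous(P, R))         # "down" is impossible under R but not under P
```

Checking single outcomes is enough here because, on a finite space, an event has probability zero exactly when each of its outcomes does.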
