<a href="https://colab.research.google.com/github/kevinrchilders/stochastics/blob/main/stochastics.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Stochastics

---

From a special topics course at the University of Utah in Spring 2020.

---

## Markov chains

---

### A small web of sites

Consider a tiny web consisting of 6 sites. Site 1 is John’s homepage. John teaches calculus, and his site points to sites 2 and 3. Site 2 is the course syllabus and does not point to any other sites. Site 3 is the calculus course website. It points back to John’s homepage (site 1) and to the course syllabus (site 2). It also points to Emily’s webpage, site 5. (Emily is the TA for the course.) Site 4 belongs to a friend of Emily’s, Jack. It points to Emily’s website (site 5) and to Jack’s old website, site 6. Emily’s website points to both of Jack’s pages, the new one (site 4) and the old one (site 6). Jack’s old webpage (site 6) points to his new webpage (site 4).

Suppose that from any site you are equally likely to follow any of its links. This creates a Markov chain, since your decision of what to click next depends only on your current location. There's one issue: site 2 doesn't have any links. So let's just say that from site 2 you are equally likely to jump to any of the six sites. Then the transition matrix $\Pi$, whose entry in row $i$ and column $j$ is the probability of moving from site $j$ to site $i$ (so each column sums to 1), is given by:

In [71]:
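import numpy as np

# Column j (0-indexed) holds the probabilities of moving from site j+1 to each site
Pi = np.array([[0, 1/6, 1/3, 0, 0, 0],
               [1/2, 1/6, 1/3, 0, 0, 0],
               [1/2, 1/6, 0, 0, 0, 0],
               [0, 1/6, 0, 0, 1/2, 1],
               [0, 1/6, 1/3, 1/2, 0, 0],
               [0, 1/6, 0, 1/2, 1/2, 0]])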

print(Pi)

[[0.         0.16666667 0.33333333 0.         0.         0.        ]
 [0.5        0.16666667 0.33333333 0.         0.         0.        ]
 [0.5        0.16666667 0.         0.         0.         0.        ]
 [0.         0.16666667 0.         0.         0.5        1.        ]
 [0.         0.16666667 0.33333333 0.5        0.         0.        ]
 [0.         0.16666667 0.         0.5        0.5        0.        ]]
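
As a cross-check on the hand-typed matrix, here is a minimal sketch that rebuilds it from the link structure described above. The names `links` and `Pi_built` are illustrative and not part of the original notebook; the sketch assumes the same convention that column $j$ describes where you go from site $j$.

```python
import numpy as np

# Outgoing links for each site, as described in the text (sites are 1-indexed).
links = {
    1: [2, 3],
    2: [],          # the syllabus has no outgoing links
    3: [1, 2, 5],
    4: [5, 6],
    5: [4, 6],
    6: [4],
}

n = len(links)
Pi_built = np.zeros((n, n))
for src, targets in links.items():
    if targets:
        # Follow each outgoing link with equal probability.
        for dst in targets:
            Pi_built[dst - 1, src - 1] = 1 / len(targets)
    else:
        # Site with no links: jump to any of the n sites uniformly.
        Pi_built[:, src - 1] = 1 / n

# Each column should be a probability distribution over the next site.
print(Pi_built.sum(axis=0))
```

Comparing `Pi_built` against the matrix typed in by hand (for instance with `np.allclose`) is a quick way to catch a misplaced entry.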


Suppose that we wish to "rank" the sites.
We could do this by computing the invariant measure (which tells us the probability of being at each site after a large number of clicks).
Notice that $\Pi$ has an eigenvalue of 1, as every stochastic matrix does.

In [72]:
np.linalg.eigvals(Pi)[0]

(1.0000000000000013+0j)

We can find the invariant measure by solving for an eigenvector of $\Pi$ with eigenvalue 1, and normalizing so that the entries add to 1.

In [73]:
evec = np.linalg.eig(Pi)[1][:, 0]

# Check that evec is an eigenvector with eigenvalue 1: Pi*evec should equal evec

print('Pi*evec =\t', np.round(np.dot(Pi, evec),5))
print('evec =\t\t', np.round(evec,5))

Pi*evec =	 [0.     +0.j 0.     +0.j 0.     +0.j 0.74278+0.j 0.37139+0.j 0.55709+0.j]
evec =		 [0.     +0.j 0.     +0.j 0.     +0.j 0.74278+0.j 0.37139+0.j 0.55709+0.j]


In [74]:
# Drop imaginary parts (which appear to be 0)
evec = np.real(evec)
print('Before normalizing:\t',np.round(evec,5))

# Scale so that the entries sum to 1
evec = evec / np.sum(evec)
print('After normalizing:\t',np.round(evec,5))

Before normalizing:	 [0.      0.      0.      0.74278 0.37139 0.55709]
After normalizing:	 [0.      0.      0.      0.44444 0.22222 0.33333]
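
The cells above pick column 0 of the eigenvector matrix, which works here because the eigenvalue 1 happens to come first; NumPy does not guarantee any particular ordering of eigenvalues. A slightly more defensive sketch, using a hypothetical helper `invariant_measure` that is not part of the original notebook, selects the eigenvalue closest to 1 instead:

```python
import numpy as np

def invariant_measure(P):
    """Eigenvector of P for the eigenvalue closest to 1, scaled so its entries sum to 1.

    Assumes P is column-stochastic and that the eigenvalue 1 is simple.
    """
    evals, evecs = np.linalg.eig(P)
    k = np.argmin(np.abs(evals - 1))   # index of the eigenvalue nearest 1
    v = np.real(evecs[:, k])           # drop the (zero) imaginary parts
    return v / v.sum()

Pi = np.array([[0, 1/6, 1/3, 0, 0, 0],
               [1/2, 1/6, 1/3, 0, 0, 0],
               [1/2, 1/6, 0, 0, 0, 0],
               [0, 1/6, 0, 0, 1/2, 1],
               [0, 1/6, 1/3, 1/2, 0, 0],
               [0, 1/6, 0, 1/2, 1/2, 0]])

print(np.round(invariant_measure(Pi), 5))
```

Dividing by the sum also takes care of the sign, since an eigenvector is only determined up to a scalar multiple.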


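So the invariant measure is given by $[0, 0, 0, \frac49, \frac29, \frac39]^T$.  Notice that in this model sites 1, 2, and 3 are transient: sites 4, 5, and 6 link only to each other, so once you reach one of them you never return to sites 1, 2, or 3.  So it isn't surprising that the invariant measure gives zero probability to these states.
Since sites 1, 2, and 3 all tie at zero, we can't fully rank the sites.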
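
To see the probability mass drain out of sites 1, 2, and 3, one can apply powers of $\Pi$ to a starting distribution. The sketch below starts at site 1; the variable names `start` and `dist` and the particular click counts are arbitrary choices for illustration.

```python
import numpy as np

Pi = np.array([[0, 1/6, 1/3, 0, 0, 0],
               [1/2, 1/6, 1/3, 0, 0, 0],
               [1/2, 1/6, 0, 0, 0, 0],
               [0, 1/6, 0, 0, 1/2, 1],
               [0, 1/6, 1/3, 1/2, 0, 0],
               [0, 1/6, 0, 1/2, 1/2, 0]])

# Start at site 1 with probability 1.
start = np.zeros(6)
start[0] = 1.0

for n_clicks in (1, 5, 20, 100):
    # Distribution over sites after n_clicks steps of the chain.
    dist = np.linalg.matrix_power(Pi, n_clicks) @ start
    print(n_clicks, np.round(dist, 4))
```

After enough clicks the first three entries shrink toward zero and the last three approach the invariant measure found above.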

To fix this, we can add to each site (other than 2) a small chance of jumping to any site.
Say that with 15% probability we make a random jump, and with 85% probability we follow a link.

The new transition matrix is then given by:

In [75]:
Pi = Pi * 0.85 + (0.15 / 6)

print(Pi)

[[0.025      0.16666667 0.30833333 0.025      0.025      0.025     ]
 [0.45       0.16666667 0.30833333 0.025      0.025      0.025     ]
 [0.45       0.16666667 0.025      0.025      0.025      0.025     ]
 [0.025      0.16666667 0.025      0.025      0.45       0.875     ]
 [0.025      0.16666667 0.30833333 0.45       0.025      0.025     ]
 [0.025      0.16666667 0.025      0.45       0.45       0.025     ]]
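
In matrix form, the update in the cell above can be written as follows, where $J$ (a symbol introduced here for illustration, not used elsewhere in these notes) denotes the $6 \times 6$ matrix of all ones:

$$\Pi_{\text{new}} = 0.85\,\Pi + \frac{0.15}{6}\,J$$

Each column of $\Pi$ sums to 1 and each column of $J$ sums to 6, so each column of $\Pi_{\text{new}}$ sums to $0.85 \cdot 1 + \frac{0.15}{6} \cdot 6 = 1$. The new matrix is therefore still a valid transition matrix. Column 2 is unchanged, since $0.85 \cdot \frac16 + \frac{0.15}{6} = \frac16$.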


Now every site has a positive probability of leading directly to every other site, so all states are recurrent.
We can calculate the invariant measure as before:

In [87]:
evec = np.real(np.linalg.eig(Pi)[1][:,0])
evec = evec / np.sum(evec)

print('Invariant measure: ', np.round(evec,3))

Invariant measure:  [0.052 0.074 0.057 0.349 0.2   0.269]
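
Since the invariant measure describes where you are after a large number of clicks, it can also be approximated without an eigenvector routine by repeatedly applying the transition matrix to a starting distribution (power iteration). This is a sketch under assumptions: the name `G` for the modified matrix, the uniform starting vector, the tolerance, and the iteration cap are all arbitrary choices.

```python
import numpy as np

Pi = np.array([[0, 1/6, 1/3, 0, 0, 0],
               [1/2, 1/6, 1/3, 0, 0, 0],
               [1/2, 1/6, 0, 0, 0, 0],
               [0, 1/6, 0, 0, 1/2, 1],
               [0, 1/6, 1/3, 1/2, 0, 0],
               [0, 1/6, 0, 1/2, 1/2, 0]])

# Matrix with the 15% random jump, built the same way as in the text.
G = 0.85 * Pi + 0.15 / 6

# Start from the uniform distribution and click until nothing changes.
p = np.full(6, 1 / 6)
for step in range(1000):
    p_next = G @ p
    converged = np.abs(p_next - p).sum() < 1e-12
    p = p_next
    if converged:
        break

print(np.round(p, 3))
```

The result should agree with the eigenvector computation up to the chosen tolerance.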


We can now rank the sites (most important first) as:

4, 6, 5, 2, 3, 1
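
The ranking can also be read off programmatically. A minimal sketch, using the rounded invariant measure printed above (the names `measure` and `ranking` are illustrative):

```python
import numpy as np

# Rounded invariant measure from the output above; entry i belongs to site i+1.
measure = np.array([0.052, 0.074, 0.057, 0.349, 0.2, 0.269])

# Sort site indices by decreasing probability, then shift to 1-based site numbers.
ranking = np.argsort(-measure) + 1
print(ranking)  # [4 6 5 2 3 1]
```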

### Probability of absorption