## Markov Processes via Example
A rental car company has three locations in Washington, DC: one at each major airport (Dulles, Reagan National, and Baltimore Washington International, but let's go with Airports 1, 2, and 3).

The company manager really wants to have the same number of cars at each location, since it makes inventory simpler. However, a weird thing happens. Over time, the number of cars at each location shifts.

The reason for the shift is that although most people drop their car off at the airport they picked it up at, some drive the car to a different airport and fly out of that one.

We can capture this data as a transition matrix:

$$ \text{transition probabilities} = 
\begin{bmatrix}
    .8&.1&.1\\
    .2&.5&.3\\
    .2&.2&.6\\
\end{bmatrix}
$$

The entry in row 2, column 3 is the probability of moving FROM state 2 TO state 3 (rows are "from", columns are "to"). So on any given day, 30% of the passengers who rent cars at Airport 2 drop them off at Airport 3. Likewise, 50% of the passengers who pick up cars at Airport 2 drop those cars back at Airport 2.

Graphically:

[TODO: draw and upload image]

In code:

In [22]:
import numpy as np
trans = np.array([
 [.8, .1, .1],
 [.2, .5, .3],
 [.2, .2, .6]])
trans

array([[ 0.8,  0.1,  0.1],
       [ 0.2,  0.5,  0.3],
       [ 0.2,  0.2,  0.6]])
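Since every rented car has to be dropped off somewhere, each row of the matrix should sum to 1. A quick sanity check, as a sketch (the names `row_sums` and `is_stochastic` are ours, not part of the original notebook):

```python
import numpy as np

trans = np.array([
    [.8, .1, .1],
    [.2, .5, .3],
    [.2, .2, .6]])

# Each row holds the drop-off probabilities for cars picked up at one airport,
# so each row must be non-negative and sum to 1.
row_sums = trans.sum(axis=1)
is_stochastic = bool(np.allclose(row_sums, 1) and (trans >= 0).all())
print(row_sums, is_stochastic)
```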

### Using the matrix
A transition matrix is very easy to use. We put the current distribution of cars (1/3, 1/3, 1/3) on the left as a row vector and do the matrix multiplication. The result is the fraction of cars at each location after one step. In this case, it's (4/10, 4/15, 1/3).

In [69]:
starting = np.array([1/3,1/3,1/3])
np.dot(starting, trans)

array([ 0.4       ,  0.26666667,  0.33333333])

So, Airport 1 gained some cars, Airport 2 lost some, and Airport 3 held steady.

We could just as well write down the total number of cars [120, 120, 120] and multiply to get [144, 96, 120], but working in terms of the percentage of cars will be useful later.
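The count version can be checked the same way. This is a small sketch using the fleet of 360 cars from the text (the names `cars` and `after_one_step` are illustrative):

```python
import numpy as np

trans = np.array([
    [.8, .1, .1],
    [.2, .5, .3],
    [.2, .2, .6]])

# 120 cars at each airport, written as a row vector of counts.
cars = np.array([120, 120, 120])
after_one_step = np.dot(cars, trans)

print(after_one_step)        # counts per airport after one update
print(after_one_step.sum())  # no cars are created or destroyed
```

Dividing the counts by the fleet size recovers the fractions used in the rest of this section.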

Because the new state can be found knowing only the just-prior state, this is a Markov process. [New state only caring about most recent state is the definition of a Markov process]
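One consequence of depending only on the previous state is that several updates collapse into a single matrix power. A minimal sketch, assuming the same `trans` and starting vector as above (the names `step_by_step` and `two_step` are ours):

```python
import numpy as np

trans = np.array([
    [.8, .1, .1],
    [.2, .5, .3],
    [.2, .2, .6]])
starting = np.array([1/3, 1/3, 1/3])

# Apply the one-step update twice...
step_by_step = np.dot(np.dot(starting, trans), trans)

# ...or apply the two-step matrix once. Both give the same distribution.
two_step = np.dot(starting, np.linalg.matrix_power(trans, 2))

print(step_by_step)
print(two_step)
```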

### In the limit
If we run the update process a bunch, we'll see what the rental manager sees:

In [25]:
cur =  starting
for _ in range(50):
    print(cur)
    cur = np.dot(cur, trans)
cur

[ 0.33333333  0.33333333  0.33333333]
[ 0.4         0.26666667  0.33333333]
[ 0.44  0.24  0.32]
[ 0.464  0.228  0.308]
[ 0.4784  0.222   0.2996]
[ 0.48704  0.21876  0.2942 ]
[ 0.492224  0.216924  0.290852]
[ 0.4953344  0.2158548  0.2888108]
[ 0.49720064  0.215223    0.28757636]
[ 0.49832038  0.21484684  0.28683278]
[ 0.49899223  0.21462201  0.28638576]
[ 0.49939534  0.21448738  0.28611728]
[ 0.4996372   0.21440668  0.28595612]
[ 0.49978232  0.21435828  0.28585939]
[ 0.49986939  0.21432925  0.28580135]
[ 0.49992164  0.21431184  0.28576653]
[ 0.49995298  0.21430139  0.28574563]
[ 0.49997179  0.21429512  0.28573309]
[ 0.49998307  0.21429136  0.28572557]
[ 0.49998984  0.2142891   0.28572106]
[ 0.49999391  0.21428775  0.28571835]
[ 0.49999634  0.21428693  0.28571672]
[ 0.49999781  0.21428645  0.28571575]
[ 0.49999868  0.21428615  0.28571516]
[ 0.49999921  0.21428598  0.28571481]
[ 0.49999953  0.21428587  0.2857146 ]
[ 0.49999972  0.21428581  0.28571448]
[ 0.49999983  0.21428577  0.2857144 ]

array([ 0.5       ,  0.21428571,  0.28571429])

Half the cars end up at Airport 1! Infuriating!

The manager goes mad and gets a fleet of drivers to reset the cars. This time they'll put more cars at Airports 2 and 3 to combat the drift!

In [27]:
starting = np.array([2/10,4/10,4/10])

cur =  starting
for _ in range(50):
    print(cur)
    cur = np.dot(cur, trans)
cur

[ 0.2  0.4  0.4]
[ 0.32  0.3   0.38]
[ 0.392  0.258  0.35 ]
[ 0.4352  0.2382  0.3266]
[ 0.46112  0.22794  0.31094]
[ 0.476672  0.22227   0.301058]
[ 0.4860032  0.2190138  0.294983 ]
[ 0.49160192  0.21710382  0.29129426]
[ 0.49496115  0.21597095  0.28906789]
[ 0.49697669  0.21529517  0.28772814]
[ 0.49818601  0.21489088  0.2869231 ]
[ 0.49891161  0.21464866  0.28643973]
[ 0.49934697  0.21450344  0.2861496 ]
[ 0.49960818  0.21441633  0.28597549]
[ 0.49976491  0.21436408  0.28587101]
[ 0.49985894  0.21433273  0.28580832]
[ 0.49991537  0.21431393  0.28577071]
[ 0.49994922  0.21430264  0.28574814]
[ 0.49996953  0.21429587  0.2857346 ]
[ 0.49998172  0.21429181  0.28572647]
[ 0.49998903  0.21428937  0.2857216 ]
[ 0.49999342  0.21428791  0.28571867]
[ 0.49999605  0.21428703  0.28571692]
[ 0.49999763  0.2142865   0.28571587]
[ 0.49999858  0.21428619  0.28571523]
[ 0.49999915  0.214286    0.28571485]
[ 0.49999949  0.21428588  0.28571463]
[ 0.49999969  0.21428582  0.28571449]
[ 0.49999982  0.2142

array([ 0.5       ,  0.21428571,  0.28571429])

My god, the same distribution emerged! Feel free to play with the starting conditions; you'll always get this distribution back.
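To try many starting conditions at once, here is a sketch that draws a few random starting splits and runs each one forward. The seed, the number of trials, and the use of a Dirichlet draw to get a random vector summing to 1 are our choices, not part of the original notebook:

```python
import numpy as np

trans = np.array([
    [.8, .1, .1],
    [.2, .5, .3],
    [.2, .2, .6]])

rng = np.random.default_rng(0)
finals = []
for trial in range(5):
    # A random way to split the fleet across the three airports (sums to 1).
    cur = rng.dirichlet(np.ones(3))
    for step in range(100):
        cur = np.dot(cur, trans)
    finals.append(cur)
finals = np.array(finals)

# Every row should show the same long-run distribution.
print(finals.round(6))
```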

It turns out that the long-run distribution depends only on the structure of the transition matrix.
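If the long-run distribution depends only on the matrix, we should be able to get it from the matrix directly, without iterating. One standard route is to look for a row vector `pi` with `pi = pi · trans`, which is an eigenvector of the transposed matrix with eigenvalue 1. A sketch under that assumption (the variable names are illustrative):

```python
import numpy as np

trans = np.array([
    [.8, .1, .1],
    [.2, .5, .3],
    [.2, .2, .6]])

# Left eigenvectors of trans are ordinary (right) eigenvectors of trans.T.
eigvals, eigvecs = np.linalg.eig(trans.T)

# Pick the eigenvector whose eigenvalue is (numerically) 1.
idx = np.argmin(np.abs(eigvals - 1))
v = np.real(eigvecs[:, idx])

# Rescale so the entries sum to 1, i.e. form a probability distribution.
stationary = v / v.sum()
print(stationary)
```

The result should agree with the vector the loops above settle on.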

### Intuition
It should be clear that the long-run distribution is special in that once we reach it we don't leave. In essence, the number of cars flowing out of any given airport is equal to the number of cars flowing into that airport.
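The flow balance can be checked numerically. In this sketch we read the long-run distribution printed above as the fractions 1/2, 3/14, and 2/7 (our reading of the decimals, not something the notebook states), and compare the cars leaving each airport with the cars arriving from the other two:

```python
import numpy as np

trans = np.array([
    [.8, .1, .1],
    [.2, .5, .3],
    [.2, .2, .6]])

# Assumed exact form of the long-run distribution seen in the output above.
pi = np.array([1/2, 3/14, 2/7])

stay = np.diag(trans)                   # probability a car is returned where it was rented
outflow = pi * (1 - stay)               # share of the fleet leaving each airport
inflow = np.dot(pi, trans) - pi * stay  # share arriving from the other airports

print(outflow)
print(inflow)
```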

The number of cars flowing out of a node depends on its row of the transition matrix [which can't change] and the number of cars present. Suppose that the number of cars coming in is more or less fixed (75), and that a location presently has 100 cars and loses 80% of them at each update. This location will lose 80 cars and gain 75, ending up at 95 in inventory. The next day it will only lose 76 cars (instead of the 80 it just lost) because it has a lower number of cars in stock. It will net lose one car this time, ending with a stock of 94.

This can be thought of almost like plumbing: the outflow from a node depends on the pipe sizes (transition probabilities), but also on the pressure at the node itself (number of cars present).

Equilibrium occurs when losses match gains, in this case at 93.75 cars.
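The single-location story is easy to replay. A sketch using the numbers from the text (fixed inflow of 75, 80% loss per update, 100 cars to start); the variable names are ours:

```python
inflow = 75.0      # cars arriving per update (held fixed in this toy version)
loss_rate = 0.8    # fraction of the current stock that leaves per update
stock = 100.0

history = [stock]
for _ in range(20):
    stock = stock - loss_rate * stock + inflow
    history.append(stock)

# Losses match gains when loss_rate * stock == inflow.
equilibrium = inflow / loss_rate

print(history[:4])
print(stock, equilibrium)
```

The stock goes 100, 95, 94, ... and settles at the level where 80% of the stock equals the 75 cars coming in.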

The general situation is a little more detailed since inflow isn't static, but we can see why equilibria should occur and be stable: if stock is above its equilibrium level, the node will emit more cars than it takes in and drift back down; if stock is below its equilibrium level, it will take in more cars than it emits and drift back up.

Later on, we'll develop precise conditions for equilibria to exist and be unique. It will boil down to "no disconnected nodes" and "no deterministic loops".


### Zooming in
Furious that the same distribution keeps emerging, the manager studies the problem some more. They add a red Ferrari to the fleet and track which airports it spends its time at.

Let's write the simulation:

In [65]:
import matplotlib.pyplot as plt
runtime = 100000
ferrari_locations = np.empty(runtime)

#the ferrari starts at index 2 (Airport 3, since the code counts from 0)
possible_locations = [0,1,2]
cur_location = 2 
#the code is much faster if location is encoded as [0,0,1] and we
#reduce via np.random.multinomial instead of random.choice, but
#we want to emphasize that the ferrari has a definite location at each step
for i in range(runtime):
    #get the probabilities of moving to each location from the current one
    location_probs = trans[cur_location,:]
    #decide which location the car actually moved to
    actual_location = np.random.choice(possible_locations,p=location_probs)
    
    #store results
    cur_location = actual_location
    ferrari_locations[i] = actual_location
    
#find the percentage of time at each location
count_at_location = np.histogram(ferrari_locations, bins=[0,1,2,3])[0]
print(count_at_location/runtime)

[ 0.50227  0.21445  0.28328]


It happened again! The Ferrari wandered along its own particular path between the airports, but the fraction of time it spends at each airport matches the long-run distribution we found above.

If we observe the Ferrari's location over a large set of times (or a random set of times), we'll get back samples from the equilibrium distribution (index 0, i.e. Airport 1, comes up half the time, etc.).
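Checking in at random times can be sketched as follows. The seed, the shorter run length, and the number of check-ins are our choices to keep the example quick, and `rng.choice` from NumPy's `Generator` stands in for `np.random.choice`:

```python
import numpy as np

trans = np.array([
    [.8, .1, .1],
    [.2, .5, .3],
    [.2, .2, .6]])

rng = np.random.default_rng(0)
runtime = 20000
path = np.empty(runtime, dtype=int)

# Follow one car; it has a single definite location at every step.
loc = 2
for i in range(runtime):
    loc = rng.choice(3, p=trans[loc])
    path[i] = loc

# Look up the car's location at 5000 randomly chosen times.
times = rng.integers(0, runtime, size=5000)
freqs = np.bincount(path[times], minlength=3) / times.size
print(freqs)
```

The observed frequencies should land near the long-run distribution, up to sampling noise.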

### Exploitation
Here, some very clever people had an idea. What if, instead of just three states, we had lots of states representing, say, x-values instead of airports? We could cleverly rig up a transition matrix so that the long-term distribution is, say, a binomial distribution over those x-values.

Then, to get a sample from the binomial we could just check in on the position of the Ferrari at any given time!

The basic idea is sound, but there are implementation issues we must overcome. We'll need to figure out how to build such a clever transition matrix, and if we want to support continuous distributions we'll need to jump from a finite number of states to an infinite number.
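As a preview, here is one standard way to rig such a matrix for a finite set of states: the Metropolis rule. This is only a sketch and not necessarily the construction developed later. The target Binomial(10, 0.3), the "propose one step left or right" move, and the names `target` and `P` are all assumptions chosen for illustration:

```python
import numpy as np
from math import comb

# Target long-run distribution: Binomial(n, p) over the states 0..n.
n, p = 10, 0.3
target = np.array([comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)])

# Metropolis rule: from state i, propose i-1 or i+1 with probability 1/2 each,
# and accept the move with probability min(1, target[j] / target[i]).
P = np.zeros((n + 1, n + 1))
for i in range(n + 1):
    for j in (i - 1, i + 1):
        if 0 <= j <= n:
            P[i, j] = 0.5 * min(1.0, target[j] / target[i])
    # Rejected moves and proposals that fall off the ends leave the car where it is.
    P[i, i] = 1.0 - P[i].sum()

# Start from an even split and run the update, exactly as with the airports.
cur = np.full(n + 1, 1 / (n + 1))
for _ in range(5000):
    cur = np.dot(cur, P)

print(np.abs(cur - target).max())
```

The printed gap between the long-run distribution and the binomial target should be essentially zero, which is the property we wanted the matrix to have.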

This idea, 'follow the Ferrari', or less poetically, "just pick a starting state and whack it with a well-chosen transition rule a few thousand times", is the kernel of all of Markov Chain Monte Carlo, one of the most revolutionary ideas in computation.