The Drunkard's Walk In Detail

Introduction

I would like to spend some time on a simple random walk on the non-negative integers 0 ... k called "the drunkard's walk." A good amount of probabilistic intuition can be had by studying this system. This note is a quick way to derive some of the observations used in "A Slightly Unfair Game" and to set up some of the machinery used in Wald's "Sequential Analysis" (a relative of A/B testing).

This differs from many treatments of Markov chains in that we are analyzing an "absorbing chain" (instead of a "regular" or "ergodic" one, see Kemeny, Snell, Finite Markov Chains, 2nd Edition, 1976, Springer Verlag), and we are using bespoke recurrence arguments instead of the usual move to linear algebra. In fact, I would like to prove a standard result that I think many will find surprising: conditioning on where this Markov chain will eventually stop greatly changes the observed transition probabilities, yet preserves the Markov property of independence from the past.

The drunkard's walk

Pick an integer k > 0. The states of our random process will be the integers 0 through k. The integers 0 and k are both "stop conditions" called "stop zero" and "stop non-zero". For an integer i strictly larger than 0 and strictly less than k, our random process is: pick the next integer to be either i+1 (called moving up or moving right) or i-1 (called moving down or moving left) with equal "50/50" probability. We stop the process when the state i is either 0 or k. This is called a "random walk" and is often known as "the drunkard's walk". Some variations have "stay probabilities" prior to stopping, but we will not need these here.

An example of this system (the "Pk[] system") for k = 4 is given below.

A lot can be seen directly for this walk. First we show the following:

  1. The drunkard's walk has the Markov property (and what that property is).
  2. The probability of a walk started in state-i "stopping non-zero" (first reaching k, with no prior visits to 0) is i/k.
  3. The expected time of a walk started in state-i to hit either of the stopping conditions (reaching 0 or k for the first time) is i * (k - i).

The Markov property

The drunkard's walk is a famous example of a Markov chain, or a sequential random process with the Markov property.

The Markov property says that the relative probability of future states of a system depends only on the current state, and not on older details of history or how long the process has been running. The Markov property is exactly what lets us make statements such as "the probability of moving from state 2 to state 3 is 1/2" without having to refer to the entire history of the random process prior to the state in question.

Deriving the probability of "stopping non-zero"

Let prob_posk(i) denote the probability that the drunkard's walk started at integer i (0 ≤ i ≤ k) stops at k before ever visiting 0.

For any i with 0 < i < k we can expand prob_posk(i) by one walk step to get:


   prob_posk(i) = (1/2) prob_posk(i-1) + (1/2) prob_posk(i+1)

It is then a matter of algebra to check that prob_posk(i) = i/k obeys this recurrence, and has the desired values for the boundary i = 0, k.

As a warm-up we algebraically confirm the claim.

import sympy
import numpy as np
rng = np.random.default_rng(2023)
i, k = sympy.symbols("i k")

# check that i/k satisfies the one-step recurrence for prob_posk()
check = sympy.expand(
    ((1/2) * (i-1)/k + (1/2) * (i+1)/k) 
    - i/k
)

assert check == 0
print(check)
0
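
As a further check, here is a small Monte Carlo sketch (our own; the helper name estimate_stop_probability and the choice k = 4 are just for illustration, and it reuses the rng and numpy import from above) that estimates the stop-at-k probability by simulation and compares it to i/k.

# Monte Carlo check of the i/k claim (a sketch; helper name and k = 4 are ours)
def estimate_stop_probability(*, k, start_i, n_reps=100000):
    """Estimate the probability a fair walk started at start_i stops at k."""
    hits = 0
    for rep in range(n_reps):
        state_i = start_i
        while (state_i > 0) and (state_i < k):
            if rng.integers(2) > 0:   # fair coin: move up or down
                state_i = state_i + 1
            else:
                state_i = state_i - 1
        if state_i == k:
            hits = hits + 1
    return hits / n_reps

for start_i in range(1, 4):
    est = estimate_stop_probability(k=4, start_i=start_i)
    assert np.abs(est - start_i / 4) < 0.01
    print(f"start {start_i}: estimated {est:.3f}, theory {start_i / 4:.3f}")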

Deriving the probability of "stopping zero" for p ≠ 1/2

Let prob_zerop,k(i) denote the probability that an unfair drunkard's walk started at integer i (0 ≤ i ≤ k) stops at 0 before ever visiting k, when the probability of increasing the state is p and of decreasing the state is 1-p.

For k > 1 we can expand prob_zerop,k(1) by one walk step to get:


   prob_zerop,k(1) = 
     (1-p)   # visit zero in 1 step
     + p prob_zerop,k-1(1) prob_zerop,k(1)   # move up, return, and try again

For 0 < p < 1 with p ≠ 1/2 it is then a matter of algebra to check that prob_zerop,k(1) = ((p/(1-p))**(k-1) - 1) / ((p/(1-p))**k - 1) obeys this recurrence and has boundary condition prob_zerop,1(1) = 0.

We confirm the claim.

i, k, p = sympy.symbols("i k p")

# claimed closed form for prob_zerop,k(1)
def f(k):
    return ((p/(1-p))**(k-1) - 1) / ((p/(1-p))**k - 1)

num, den = sympy.fraction((f(k) - (
    (1-p)
    + p * f(k-1) * f(k)
)).together())

print("numerator: " + str(num))
print("denominator: " + str(den))
numerator: p*((p/(1 - p))**k - 1) - p*((p/(1 - p))**(k - 2) - 1) - (p/(1 - p))**k + (p/(1 - p))**(k - 1)
denominator: (p/(1 - p))**k - 1
# show numerator simplifies to zero (divide out common term)
check = (num / (p/(1-p))**(k-1)).expand().simplify()

assert check == 0
print(check)
0
# check boundary condition
assert f(1) == 0
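
We can also sanity check the closed form by simulation. The sketch below is ours: p = 2/3 and k = 5 are arbitrary example values, and it reuses the f() helper, the sympy symbol p, the rng, and the numpy import from above.

# Monte Carlo check of the closed form (a sketch; p = 2/3 and k = 5 are example values)
def estimate_stop_zero(*, k, p, n_reps=100000):
    """Estimate the probability an unfair walk started at 1 stops at 0 (up-probability p)."""
    hits_zero = 0
    for rep in range(n_reps):
        state_i = 1
        while (state_i > 0) and (state_i < k):
            if rng.random() < p:
                state_i = state_i + 1   # move up with probability p
            else:
                state_i = state_i - 1   # move down with probability 1-p
        if state_i == 0:
            hits_zero = hits_zero + 1
    return hits_zero / n_reps

p_example, k_example = sympy.Rational(2, 3), 5
theory = float(f(k_example).subs(p, p_example))   # evaluate the claimed formula
est = estimate_stop_zero(k=k_example, p=2/3)
assert np.abs(est - theory) < 0.01
print(f"estimated {est:.3f}, theory {theory:.3f}")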

Of special interest is the p = 2/3 case, as it satisfies the limiting recurrence for k = ∞ with a "hit zero" probability of 1/2:


   prob_zero2/3,∞(1) = 
     (1 - 2/3)
     + 2/3 prob_zero2/3,∞(1) prob_zero2/3,∞(1)

z = sympy.symbols('z')

# solve the k = ∞ fixed point equation z = (1 - 2/3) + (2/3) z**2
soln = sympy.solve(
    z - ((1 - 2/3) + (2/3) * z * z),
    z
)

assert np.sum([zi == 1/2 for zi in soln]) > 0
soln
[0.500000000000000, 1.00000000000000]

Deriving the "expected time to stop"

Let e_timek(i) denote the expected number of steps the (fair) drunkard's walk started at integer i (0 ≤ i ≤ k) takes to reach either of 0 or k for the first time. We are not yet specifying which one is first reached.

For any i with 0 < i < k we can expand e_timek(i) by one walk step to get: e_timek(i) = 1 + (1/2) e_timek(i-1) + (1/2) e_timek(i+1). It is again a matter of algebra to check that e_timek(i) = i * (k-i) obeys this recurrence, and has the desired values on the boundary.

Let's confirm this claim.

i, k = sympy.symbols("i k")

# check that i * (k - i) satisfies the one-step recurrence for e_timek()
check = sympy.expand(
    (1 + (1/2) * (i-1) * (k-(i-1)) + (1/2) * (i+1) * (k-(i+1)))
    - i * (k-i)
)

assert check == 0
print(check)
0

The above directly gives us the common "expected time to move distance d is around d**2 steps" observation, without the usual appeal to the additivity of variance over independent steps.
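
As an empirical companion to the algebraic check (a Monte Carlo sketch of ours; k = 6 is an arbitrary example value, and it reuses rng and numpy from above), we can estimate the expected stopping time and compare it to i * (k - i).

# Monte Carlo check of the expected stopping time (a sketch; k = 6 is an example value)
def estimate_stop_time(*, k, start_i, n_reps=50000):
    """Estimate the expected number of steps until a fair walk hits 0 or k."""
    total_steps = 0
    for rep in range(n_reps):
        state_i = start_i
        steps = 0
        while (state_i > 0) and (state_i < k):
            if rng.integers(2) > 0:
                state_i = state_i + 1
            else:
                state_i = state_i - 1
            steps = steps + 1
        total_steps = total_steps + steps
    return total_steps / n_reps

for start_i in range(1, 6):
    est = estimate_stop_time(k=6, start_i=start_i)
    assert np.abs(est - start_i * (6 - start_i)) < 0.5
    print(f"start {start_i}: estimated {est:.2f}, theory {start_i * (6 - start_i)}")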

Basic probability notation

Define a legal or valid trajectory as a sequence of states s = s1, s2, ..., where:

  • All the sj are integers in the range 0 through k.
  • If sj isn't 0 or k, then |sj - sj+1| = 1 (the process has not yet stopped)
  • If sj is 0 or k, then sj+1 = sj (the chain has "stopped").

Let Sk(i) denote the set of all possible sequences s of integers obeying the above rules and where the first state s1 = i.

In an appendix we define the probability measure Pk,i[s] on Sk(i) that tells us the probability of observing a given sequence s starting from state i. We have the extra parameter (i) in our notation to avoid having to assume an initial starting distribution.

Throughout this note we use P[expression] as shorthand for the sum over sequences s of expression(s) P[s]. We also use P[expression | condition] as shorthand for P[condition and expression] / P[condition] (when P[condition] > 0).

Conditioning on the outcome

An interesting (and maybe even surprising) fact is: the following systems also have the Markov property and are therefore themselves Markov chains:

  • S⇐,k(i) defined as all the s in Sk(i) that "eventually stopped at 0", with the probability measure P⇐,k,i[] induced by restricting to S⇐,k(i).
  • S⇒,k(i) defined as all the s in Sk(i) that "eventually stopped at k", with the probability measure P⇒,k,i[] induced by restricting to S⇒,k(i).

We can in fact establish that P⇐,k[] (and similarly P⇒,k[]) has the Markov property. The argument is in an appendix.

Empirical conditional probabilities

We can show that the observed transition probabilities for P⇒,k[] are very different from the 1/2s seen for Pk[].

Let's set up some code to estimate these transition probabilities from simulation.

# our example k

k = 4
# observe the empirical transition probabilities for right absorbed systems
observed = dict()
for start_i in range(1, k):
    downs = 0
    ups = 0
    for rep in range(1000000):
        first_up = 0
        first_down = 0
        state_i = start_i
        # run until we hit the stopping conditions
        while (state_i > 0) and (state_i < k):
            coin_flip = rng.binomial(n=1, p=0.5, size=1)[0]
            if coin_flip > 0.5:
                if (first_up + first_down) == 0:
                    first_up = 1
                state_i = state_i + 1
            else:
                if (first_up + first_down) == 0:
                    first_down = 1
                state_i = state_i - 1
        # only count right-absorbed paths
        if state_i == k:
            downs = downs + first_down
            ups = ups + first_up
    observed[f"probability of seeing {start_i} to {start_i+1}"] = ups / (ups + downs) 

observed
{'probability of seeing 1 to 2': 1.0,
 'probability of seeing 2 to 3': 0.7505207655904511,
 'probability of seeing 3 to 4': 0.6665507149026005}

The conditional transition probabilities

It is a fact that the Pk,i[] probability system is Markov on Sk(i) (see appendix). Given this, we can write transition probabilities independent of history (for not yet stopped sequences, i.e. i ≠ 0, k):

  • prob_upk(i) = Pk,i[s2 = i+1 | s1 = i] = 1/2
  • prob_downk(i) = Pk,i[s2 = i-1 | s1 = i] = 1/2

where Pk,i[] is the probability measure from the Sk(i) system.

In an appendix we show the P⇐,k[] and P⇒,k[] probability measures are indeed Markov. This then allows us to define the following symbols for their transition probabilities (independent of earlier history):

  • prob_up⇐,k(i) = P⇐,k,i[s2 = i+1 | s1 = i]
  • prob_down⇐,k(i) = P⇐,k,i[s2 = i-1 | s1 = i]
  • prob_up⇒,k(i) = P⇒,k,i[s2 = i+1 | s1 = i]
  • prob_down⇒,k(i) = P⇒,k,i[s2 = i-1 | s1 = i]

Where P⇐,k,i[] is the probability measure from the S⇐,k(i) system and P⇒,k,i[] is the probability measure from the S⇒,k(i) system.

We can show for 0 < i < k:

  • prob_up⇒,k(i) = (i+1)/(2*i)
  • prob_down⇒,k(i) = 1 - prob_up⇒,k(i)

(and similar for prob_up⇐,k(i) and prob_down⇐,k(i))

The P⇒,k[] system for k = 4 with the above transition probabilities is portrayed below.

This means the sequences that eventually stop at k actually look like they are attracted in that direction! An example of this is seen in Dr. Zumel's animations in A Slightly Unfair Game.

The P⇐,k[] system for k = 4 is as follows.

The 50/50 system is a mixture of these two systems, with the mixture proportions varying by state label.
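
We can confirm this mixture claim symbolically (a small sympy sketch of ours, using the conditioned up-probability claimed above and the i ↔ k-i, up ↔ down symmetry described in the appendix): weighting each conditioned system's up-probability by the probability of its outcome recovers the unconditional 1/2.

# check: mixing the two conditioned systems recovers the fair 1/2 up-probability
i, k = sympy.symbols("i k")

prob_stop_k = i / k                             # chance of eventually stopping at k from i
prob_stop_0 = (k - i) / k                       # chance of eventually stopping at 0 from i
prob_up_stop_k = (i + 1) / (2 * i)              # claimed prob_up in the stop-at-k system
prob_up_stop_0 = (k - i - 1) / (2 * (k - i))    # from the i -> k-i, up <-> down symmetry

check = sympy.simplify(
    prob_stop_k * prob_up_stop_k
    + prob_stop_0 * prob_up_stop_0
    - sympy.Rational(1, 2)
)

assert check == 0
print(check)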

Comparing observation to theory

Let's check that our claimed theoretical transition probabilities in S⇒,k(i) are in fact close to our empirical estimates.

# show the theoretical up transitions in absorbed systems
prob_up_given_stop_k = lambda *, k, i: 0 if (i<=0) or (i>=k) else (i+1)/(2*i)
prob_down_given_stop_k = lambda *, k, i: 0 if (i<=1) or (i>=k) else 1 - prob_up_given_stop_k(k=k, i=i)
prob_up_given_stop_0 = lambda *, k, i: 0 if (i<=0) or (i>=k-1) else prob_down_given_stop_k(k=k, i=k-i)
prob_down_given_stop_0 = lambda *, k, i: 0 if (i<=0) or (i>=k) else prob_up_given_stop_k(k=k, i=k-i)
theoretical = {f"probability of seeing {i} to {i+1}": prob_up_given_stop_k(k=k, i=i) for i in range(1, k)}

theoretical
{'probability of seeing 1 to 2': 1.0,
 'probability of seeing 2 to 3': 0.75,
 'probability of seeing 3 to 4': 0.6666666666666666}
# confirm measurement matches theory
assert set(theoretical.keys()) == set(observed.keys())
for key, v_theoretical in theoretical.items():
    v_observed = observed[key]
    assert np.abs(v_theoretical - v_observed) < 1e-3

Conclusion

The point of this note is to get some familiarity with deep facts about a specific random walk, before delegating to the usual general analysis tools.

We have derived quite a few results for the "half up, half down" bounded drunkard's walk that stops at 0 or k (k > 0):

  • The probability of stopping at k after starting at i is prob_posk(i) = i/k.
  • The expected number of steps to stop after starting at i is e_timek(i) = i * (k-i). In particular, for a start state i ~ k/2 this is (k/2) * (k - k/2) = k**2 / 4 expected run time.
  • The random process of looking only at sequences that stop at 0 is in fact itself a Markov chain.
  • The random process of looking only at sequences that stop at k is in fact itself a Markov chain.
  • If we condition on the walk stopping at k, then the probability of "stepping up" from state i (observing an i to i+1 transition on a sequence that has not yet stopped) is prob_up⇒,k(i) = (i+1)/(2*i).

The above is a bit more detail than one usually tolerates in analyzing a Markov chain. Showing that conditioning on where the Markov chain stops preserves the Markov property is of interest. Also deriving the expected quadratic run time directly is quite nice.

Thank you to Dr. Nina Zumel for comments and for preparing the diagrams.

Appendices

Probability notation in detail

Let's replace our informal probability measure with a strict measure on infinite sequences. Without a pre-defined probability space and measure, we are essentially making up probability statements as we go (instead of being able to derive them from a fixed base).

A run trajectory is a sequence of states s = s1, s2, ..., where:

  • All the sj are integers in the range 0 through k.
  • If sj isn't 0 or k, then |sj - sj+1| = 1 (the process has not stopped)
  • If sj is 0 or k, then sj+1 = sj (the chain has "stopped").

Let Sk(i) denote the set of all possible sequences s of integers obeying the above rules where the first state s1 = i. Define n_transition(s) as the number of j > 1 such that sj-1 ≠ sj.

Define the probability measure Pk,i[] of s in Sk(i) assigning:

  • Pk,i[s] = ⊥ if s not in Sk(i) (invalid trajectories are not part of the probability space; "⊥" is called "bottom" and represents an invalid state).
  • Pk,i[s] = 1, if s1 = 0 or k. Call such s "stopped."
  • Pk,i[s] = 0, when n_transition(s) is not finite. Call such s "never stopped."
  • Pk,i[s] = (1/2)**n_transition(s), when n_transition(s) is finite. Call such s "stopped" or "eventually stopped."

We have the extra parameter (i) in our notation to avoid having to assume an initial starting distribution.

One can check that Pk,i[s] ≥ 0 for all s in Sk(i) and that the sum over s in Sk(i) of Pk,i[s] is 1. This meets the definition of a probability measure, so we can apply known probability theorems such as Bayes' law. The never-stopped sequences have probability measure zero (so they can be ignored in probability arguments; this is amusing, as the never-stopped sequences are the uncountable part of the sequence space).
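
To make the measure concrete, here is a small enumeration sketch (ours; k = 3 and the cutoffs are arbitrary example values): summing (1/2)**n_transition(s) over the stopped trajectories found within a cutoff number of moves approaches 1 as the cutoff grows.

# enumerate stopped trajectories for small k and check their probability mass approaches 1
def stopped_mass(*, k, start_i, max_transitions):
    """Sum (1/2)**n_transition(s) over trajectories that stop within max_transitions moves."""
    mass = 0.0
    # frontier holds (state, transitions so far) for not-yet-stopped prefixes
    # (for k = 3 the frontier stays tiny; larger k would need a smarter representation)
    frontier = [(start_i, 0)]
    for step in range(max_transitions):
        next_frontier = []
        for state, n in frontier:
            for move in (-1, +1):
                new_state = state + move
                if new_state in (0, k):
                    mass += 0.5 ** (n + 1)           # trajectory stops here
                else:
                    next_frontier.append((new_state, n + 1))
        frontier = next_frontier
    return mass

for cutoff in (2, 10, 50):
    print(f"k=3, start=1, cutoff={cutoff}: mass={stopped_mass(k=3, start_i=1, max_transitions=cutoff):.6f}")

assert stopped_mass(k=3, start_i=1, max_transitions=50) > 0.999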

The above random process P[] has "the Markov property" on Sk(): history becomes irrelevant. That is, we claim:


   Pk,v1[sj+1 = v | s1=v1, ..., sj=vj] = Pk,vj[sj+1 = v | sj=vj]

when s1=v1, ..., sj=vj, sj+1 = v is a valid prefix of an s in Sk(v1).

Throughout this note we use P[expression] as shorthand for the sum over sequences s of expression(s) P[s]. We also use P[expression | condition] as shorthand for P[condition and expression] / P[condition] (when P[condition] > 0). For example Pk,i[sj+1 = v | sj=vj] denotes (sum over s in Sk(i) with sj=vj and sj+1 = v of Pk,i[s]) / (sum over s in Sk(i) with sj=vj of Pk,i[s]).

We often use the Markov property in analysis as follows: we can always pretend we are at the first move in a sequence! For 0 < i < k we have Pk,i[sj+1 = v2 | sj = v1] = Pk,v1[s2 = v2 | s1 = v1] (for s in Sk(i)).

Appendix: Pk[] has the Markov property

We sketch a rough outline of an argument for why Pk[] over Sk() has the Markov property. This is an ugly proof of a standard fact.

The Markov property itself consists of statements of the form:


   P[sj+1=v | sj=vj, sj-1=vj-1] = P[sj+1=v | sj=vj]

(when P[sj=vj, sj-1=vj-1] is non-zero, so that the conditional probabilities are defined). I.e.: earlier history becomes irrelevant in conditional probabilities.

Call (s1, ..., sj) "A(s)" and (sj, ...) "B(s)". We have n_transition(s) = n_transition(A(s)) + n_transition(B(s)). So we can factor P[] as:


  Pk,i[f(A(s)) g(B(s))] = Pk,s1[f(A(s))] Pk,sj[g(B(s))]

(for s in Sk(i)).

The above follows from the usual factoring of sums of the form (sum over a in A, b in B of f(a) g(b)) as (sum over a in A of f(a)) times (sum over b in B of g(b)).

As the system is Markov over our states, it is justified to define prob_upk(i) as the probability the next state is i+1, given the current state is i.

Appendix: the derived measures

We define derived probability measures (for each i) as follows (a small numeric illustration appears after the definitions):

  • S⇐,k(i) defined as all the s in Sk(i) that "eventually stopped at 0".
  • P⇐,k,i[] as: P⇐,k,i[s] = ⊥ if s not in S⇐,k(i), else Pk,i[s] / (sum over z in S⇐,k(i) of Pk,i[z]).
  • S⇒,k(i) defined as all the s in Sk(i) that "eventually stopped at k".
  • P⇒,k,i[] as: P⇒,k,i[s] = ⊥ if s not in S⇒,k(i), else Pk,i[s] / (sum over z in S⇒,k(i) of Pk,i[z]).
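
To see a restricted measure in action, here is an enumeration sketch (ours; k = 4, i = 2, and the cutoff are example values): we approximate P⇒,k,i[s2 = i+1] by summing Pk,i[s] over stop-at-k trajectories split by the direction of the first move, and the ratio approaches the 3/4 value measured and derived elsewhere in this note.

# approximate the restricted measure's up-probability P_{=>,4,2}[s2 = 3] by finite enumeration
def cond_up_probability(*, k, start_i, max_transitions):
    """Mass of stop-at-k trajectories starting with an up move, over mass of all stop-at-k trajectories."""
    stop_k_mass = {+1: 0.0, -1: 0.0}   # mass absorbed at k, keyed by direction of the first move
    surviving = {+1: {}, -1: {}}       # state -> mass of not-yet-stopped prefixes, by first move
    for first_move in (+1, -1):
        state = start_i + first_move
        if state == k:
            stop_k_mass[first_move] += 0.5
        elif state != 0:
            surviving[first_move][state] = 0.5
    for step in range(max_transitions - 1):
        for first_move in (+1, -1):
            nxt = {}
            for state, mass in surviving[first_move].items():
                for move in (+1, -1):
                    new_state = state + move
                    if new_state == k:
                        stop_k_mass[first_move] += mass * 0.5
                    elif new_state != 0:
                        nxt[new_state] = nxt.get(new_state, 0.0) + mass * 0.5
            surviving[first_move] = nxt
    return stop_k_mass[+1] / (stop_k_mass[+1] + stop_k_mass[-1])

approx = cond_up_probability(k=4, start_i=2, max_transitions=200)
assert abs(approx - 3/4) < 1e-6
print(approx)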

Appendix: P⇐,k[] and P⇒,k[] have the Markov property

Here we establish the Markov property for our conditioned measures and compute the new transition probabilities. This uses standard probability facts (such as Bayes' law) and the Markov property of Pk[] (a standard fact, also established in an earlier appendix).

We will assume k > 1 and 0 < vj < k for j = 1...u to avoid trivial corner cases. To check whether P⇒,k,v1[su+1 = 1+vu | s1=v1, ..., su=vu] is Markov we need to show it equals f(vu) for some function f() depending on the history only through the current state vu.

We have:


P⇒,k,v1[su+1 = 1+vu | s1=v1, ..., su=vu]
  = Pk,v1[su+1 = 1+vu | s1=v1, ..., su=vu, s stops at k]
  = Pk,v1[s stops at k | s1=v1, ..., su=vu, su+1 = 1+vu] 
    * Pk,v1[su+1 = 1+vu | s1=v1, ..., su=vu]
    / Pk,v1[stops at k | s1=v1, ..., su=vu]
  = Pk,1+vu[s stops at k | s1 = 1+vu] 
    * Pk,vu[s2 = 1+vu | s1=vu]
    / Pk,vu[stops at k | s1=vu]

At this point we have established the transition probabilities are Markov with respect to the state, as we now have terms involving only the most recent state.

In the above:

  • The first line is definitional.
  • The second line is an application of Bayes' Law P[A|B] = P[B|A] P[A] / P[B], with A = "su+1 = 1+vu" and B = "s stops at k".
  • The third line is the known Markov property of P[].

We can now justify defining prob_up⇒,k(i) as the probability the next state is i+1, given the current state is i and given the chain eventually stops at k.

We have proved the relation:


prob_up⇒,k(i)
  = prob_upk(i) 
      * Pk,1+i[s stops at k | s1 = 1+i] 
      / Pk,i[stops at k | s1 = i]

This relates a quantity we want to know about the chain known to halt at k (prob_up⇒,k(i)) to known facts about the unconstrained Markov chain. Roughly, we can think of this as saying prob_up⇒,k(i) reads off what fraction of the unconstrained chain's stop-at-k trajectories from i begin with an up move.

To calculate the probability we can substitute in our known values for these probabilities.


prob_up⇒,k(i)
  = prob_upk(i) 
      * Pk,1+i[s stops at k | s1 = 1+i] 
      / Pk,i[stops at k | s1 = i]
  = (1/2)
    * ((1+i)/k)
    / (i/k)
  = (i+1)/(2*i)

In the above:

  • The second line is substituting in our previously calculated probabilities of "stopping non-zero", and the 1/2 transition probability under P[].
  • The last step is algebra.

For 0 < i < k we can write the transition probability (independent of older history, and re-checked with sympy below) as:

  • prob_up⇒,k(i) = (i+1)/(2*i)
  • prob_down⇒,k(i) = 1 - prob_up⇒,k(i)
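
In the spirit of the earlier algebra checks, a quick sympy sketch (ours) re-verifies the substitution step above.

# re-check the substitution algebra for the conditioned up transition probability
i, k = sympy.symbols("i k")

check = sympy.simplify(
    sympy.Rational(1, 2) * ((i + 1) / k) / (i / k)   # (1/2) * P[stop at k from i+1] / P[stop at k from i]
    - (i + 1) / (2 * i)
)

assert check == 0
print(check)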

A similar argument establishes that P⇐,k[] is Markov on S⇐,k(). And we can get prob_up⇐,k and prob_down⇐,k by replacing i with k-i and also swapping the roles of up and down.

As a side note: the above work leads me to wonder if "backward chain arguments" are perhaps generalizations of Bayes' law. In particular, I am anxious to re-read the brilliant "coupling from the past" arguments in Propp, James Gary; Wilson, David Bruce, "Exact sampling with coupled Markov chains and applications to statistical mechanics" (1996), Proceedings of the Seventh International Conference on Random Structures and Algorithms (Atlanta, GA, 1995), pp. 223–252, MR1611693. Another way of looking at it: we are using the original "fair" chain as a substitute system to analyze the chain "known to stop at k."

Appendix: A non-Markov Conditioning

A bit of wisdom from Gian-Carlo Rota's "A Mathematician's Gossip":

Some theorems are hygienic prescriptions meant to guard us against potentially unpleasant complications. Authors of mathematics books frequently forget to give any hint as to what these complications would be. This omission makes their exposition incomprehensible.

Rota extends this by arguing statements such as "all extremely regular rings are fully normal" should not be motivated by demonstrating examples of extremely regular rings. Instead one needs to motivate such statements by showing examples of the horror of non-extremely regular rings that in addition are not even fully normal.

So let's demonstrate an example of a future conditioning that does not in fact leave the Markov property intact. This shows why conditioning on the stopping outcome (also an event in the future) leaving the Markov property intact is something to celebrate.

Consider a Markov chain walking on the integers 0 through 4 (with 0 and 4 as stopping states). Start the random walk at state 2 and condition on the chain stopping in exactly 4 steps. This conditioned process is not Markov on the original states, as its transition probabilities no longer depend only on the state (independent of history or time).

There are exactly 4 sequences obeying the above conditioning:

  • +-++
  • +---
  • -+++
  • -+--

The non-Markov behavior can be seen directly: the second move must always be in the opposite direction of the first, and the fourth move must be in the same direction as the third. In particular, from state 3 the conditioned chain always moves down on the second move, yet always moves up on the fourth move. So the transition probabilities are no longer a function of the current state alone.
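
A quick enumeration (a sketch of ours) confirms both the count of four sequences and the time dependence just described.

# enumerate move sequences of length 4 from state 2 that first hit {0, 4} on exactly the 4th move
from itertools import product

valid = []
for moves in product((+1, -1), repeat=4):
    states = [2]
    for m in moves:
        states.append(states[-1] + m)
    hit_times = [t for t, s in enumerate(states) if s in (0, 4)]
    if hit_times and hit_times[0] == 4:    # stopped for the first time on move 4
        valid.append(moves)

print(valid)
assert len(valid) == 4

# tabulate the moves the conditioned chain takes out of state 3, by move index
moves_from_3 = dict()
for moves in valid:
    states = [2]
    for m in moves:
        states.append(states[-1] + m)
    for t in range(4):
        if states[t] == 3:
            moves_from_3.setdefault(t, set()).add(moves[t])

print(moves_from_3)
assert moves_from_3[1] == {-1}   # the second move from state 3 is always down
assert moves_from_3[3] == {+1}   # the fourth move from state 3 is always up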