I would like to spend some time on a simple random walk on the non-negative integers 0 ... k
called "the drunkard's walk." A good amount of probabilistic intuition can be had by studying this system. This note is a quick way to derive some of the observations used in "A Slightly Unfair Game" and to set up some of the machinery used in Wald's "Sequential Analysis" (a relative of A/B testing).
This differs from many treatments of Markov chains in that we are analyzing an "absorbing chain" (instead of a "regular" or "ergodic" one, see Kemeny, Snell, Finite Markov Chains, 2nd Edition, 1976, Springer Verlag), and we are using bespoke recurrence arguments instead of the usual move to linear algebra. In fact I would like to prove a standard result that I think many will find unlikely or surprising: conditioning on where this Markov chain eventually will stop greatly changes the observed transition probabilities, yet does preserve the Markov property of independence from the past.
Pick an integer k > 0. The states of our random process will be the integers 0 through k. The integers 0 and k are both "stop conditions" called "stop zero" and "stop non-zero". For an integer i strictly larger than 0 and strictly less than k, our random process is: pick the next integer to be either i+1 (called moving up or moving right) or i-1 (called moving down or moving left) with equal "50/50" probability. We stop the process when the state i is either 0 or k. This is called a "random walk" and is often known as "the drunkard's walk". Some variations have "stay probabilities" prior to stopping, but we will not need these here.
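For concreteness, here is a minimal simulation sketch of a single run of this walk (the helper name run_walk and the use of numpy's default_rng are my own illustrative choices, not part of the definition above).
# a minimal sketch of one run of the drunkard's walk (helper name run_walk is illustrative)
import numpy as np

def run_walk(start_i: int, k: int, rng) -> list:
    """Simulate one drunkard's walk from start_i until it first hits 0 or k."""
    states = [start_i]
    state_i = start_i
    while 0 < state_i < k:
        # move up or down with equal probability
        state_i = state_i + rng.choice([-1, 1])
        states.append(int(state_i))
    return states

example_rng = np.random.default_rng(2023)
print(run_walk(start_i=2, k=4, rng=example_rng))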
An example of this system (the "Pk[] system") for k = 4 is given below.
A lot can be directly seen for this walk. First we show the following.
- The drunkard's walk has the Markov property (and what that property is).
- The probability of a walk started in state-i "stopping non-zero" (first reaching k, with no prior visits to 0) is i/k.
- The expected time of a walk started in state-i to hit either of the stopping conditions (reaching 0 or k for the first time) is i * (k - i).
The drunkard's walk is a famous example of a Markov chain, or a sequential random process with the Markov property.
The Markov property says that the relative probability of future states of a system depends only on the current state, and not on older details of history or how long the process has been running. The Markov property is exactly saying that we can make statements such as "the probability of moving from state 2 to state 3 is 1/2" without having to refer to the entire history of the random process prior to the state in question.
Let prob_posk(i) denote the probability that the drunkard's walk started at integer i (0 ≤ i ≤ k) stops at k before ever visiting 0.
For any i with 0 < i < k we can expand prob_posk(i) by one walk step to get:
prob_posk(i) = (1/2) prob_posk(i-1) + (1/2) prob_posk(i+1)
It is then a matter of algebra to check that prob_posk(i) = i/k obeys this recurrence, and has the desired values for the boundary i = 0, k.
As a warm up we algebraically confirm the claim.
import sympy
import numpy as np

rng = np.random.default_rng(2023)
i, k = sympy.symbols("i k")
# check that i/k satisfies the one-step recurrence
check = sympy.expand(
    ((1/2) * (i-1)/k + (1/2) * (i+1)/k)
    - i/k
)
assert check == 0
print(check)
0
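As an additional (purely empirical) sanity check, we can estimate prob_posk(i) by simulation and compare it to i/k (the choices k = 4 and 100000 repetitions below are arbitrary demonstration values).
# Monte Carlo check (not a proof): estimate prob_posk(i) and compare to i/k
k_demo = 4
n_rep = 100000
for start_i in range(1, k_demo):
    stopped_nonzero = 0
    for rep in range(n_rep):
        state_i = start_i
        while 0 < state_i < k_demo:
            state_i += 1 if rng.uniform() < 0.5 else -1
        stopped_nonzero += (state_i == k_demo)
    print(f"start {start_i}: empirical {stopped_nonzero / n_rep:.3f}, theory {start_i / k_demo:.3f}")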
Let prob_zerop,k(i) denote the probability that an unfair drunkard's walk started at integer i (0 ≤ i ≤ k) stops at 0 before ever visiting k, when the probability of increasing the state is p and of decreasing the state is 1-p.
For k > 1 we can expand prob_zerop,k(1) by one walk step to get:
prob_zerop,k(1) =
(1-p) # visit zero in 1 step
+ p prob_zerop,k-1(1) prob_zerop,k(1) # move up, return, and try again
For 0 < p < 1 with p ≠ 1/2 it is then a matter of algebra to check that prob_zerop,k(1) = ((p/(1-p))**(k-1) - 1) / ((p/(1-p))**k - 1) obeys this recurrence and has the boundary condition prob_zerop,1(1) = 0.
We confirm the claim.
i, k, p = sympy.symbols("i k p")

def f(k):
    # claimed closed form for prob_zerop,k(1)
    return ((p/(1-p))**(k-1) - 1) / ((p/(1-p))**k - 1)

# difference between the closed form and the one-step expansion, as a single fraction
num, den = sympy.fraction((f(k) - (
    (1-p)
    + p * f(k-1) * f(k)
)).together())
print("numerator: " + str(num))
print("denominator: " + str(den))
numerator: p*((p/(1 - p))**k - 1) - p*((p/(1 - p))**(k - 2) - 1) - (p/(1 - p))**k + (p/(1 - p))**(k - 1)
denominator: (p/(1 - p))**k - 1
# show numerator simplifies to zero (divide out common term)
check = (num / (p/(1-p))**(k-1)).expand().simplify()
assert check == 0
print(check)
0
# check boundary condition
assert f(1) == 0
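As an extra (non-symbolic) sanity check, we can also estimate this probability by simulation for one arbitrary choice of p and k (p = 0.6 and k = 5 are demonstration values I picked, not values from the text).
# numeric spot check of prob_zerop,k(1) for one arbitrary choice of p and k
p_val, k_val = 0.6, 5
r = p_val / (1 - p_val)
theory = (r**(k_val - 1) - 1) / (r**k_val - 1)
hits_zero = 0
n_rep = 100000
for rep in range(n_rep):
    state_i = 1
    while 0 < state_i < k_val:
        # step up with probability p_val, down with probability 1 - p_val
        state_i += 1 if rng.uniform() < p_val else -1
    hits_zero += (state_i == 0)
print(f"empirical {hits_zero / n_rep:.3f}, theory {theory:.3f}")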
Of special interest is the p = 2/3 case, as it satisfies the limiting recurrence for k = ∞ with a "hit zero" probability of 1/2:
prob_zero2/3,∞(1) =
(1 - 2/3)
+ 2/3 prob_zero2/3,∞(1) prob_zero2/3,∞(1)
z = sympy.symbols('z')
# solve the limiting recurrence z = (1 - 2/3) + (2/3) * z**2
soln = sympy.solve(
    z - ((1 - 2/3) + (2/3) * z * z),
    z
)
assert np.sum([zi == 1/2 for zi in soln]) > 0
soln
[0.500000000000000, 1.00000000000000]
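We can also roughly check the k = ∞ claim by simulation; since we cannot run forever, the sketch below treats the walk as having escaped once it reaches an arbitrary cap (the cap of 100 and the repetition count are my own choices, so this is an approximation rather than a proof).
# rough simulation of the k = infinity, p = 2/3 case; reaching the escape cap is
# treated as "never returns to zero" (an approximation, not a proof)
escape_cap = 100
n_rep = 20000
hits_zero = 0
for rep in range(n_rep):
    state_i = 1
    while 0 < state_i < escape_cap:
        state_i += 1 if rng.uniform() < 2/3 else -1
    hits_zero += (state_i == 0)
print(f"empirical hit-zero frequency: {hits_zero / n_rep:.3f} (theory: 0.5)")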
Let e_timek(i) denote the expected number of steps the (fair) drunkard's walk started at integer i (0 ≤ i ≤ k) takes to reach either of 0 or k for the first time. We are not yet specifying which one is first reached.
For any i with 0 < i < k we can expand e_timek(i) by one walk step to get: e_timek(i) = 1 + (1/2) e_timek(i-1) + (1/2) e_timek(i+1).
It is again a matter of algebra to check that e_timek(i) = i * (k-i) obeys this recurrence, and has the desired values on the boundary.
Let's confirm this claim.
i, k = sympy.symbols("i k")
# check that i * (k-i) satisfies the one-step expected-time recurrence
check = sympy.expand(
    (1 + (1/2) * (i-1) * (k-(i-1)) + (1/2) * (i+1) * (k-(i+1)))
    - i * (k-i)
)
assert check == 0
print(check)
0
The above directly gives us the common "expected time to move distance d is around d**2 steps" observation, without the usual appeal to linearity of expectation applied to independent variances.
Define a legal or valid trajectory as a sequence of states s = s0, s1, ..., where:

- All the sj are integers in the range 0 through k.
- If sj isn't 0 or k, then |sj - sj+1| = 1 (the process has not yet stopped).
- If sj is 0 or k, then sj+1 = sj (the chain has "stopped").
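These rules are mechanical enough to encode; the following small checker (the helper name is_valid_prefix is an illustrative choice of mine, not from the text) tests whether a finite prefix of a sequence obeys them.
# check whether a finite prefix of a sequence obeys the trajectory rules above
# (the helper name is_valid_prefix is an illustrative choice, not from the text)
def is_valid_prefix(states, *, k: int) -> bool:
    if any((sj < 0) or (sj > k) for sj in states):
        return False  # states must stay in 0 ... k
    for sj, sj_next in zip(states, states[1:]):
        if sj in (0, k):
            if sj_next != sj:
                return False  # once stopped, the state must repeat
        elif abs(sj - sj_next) != 1:
            return False  # before stopping, every move is +/- 1
    return True

assert is_valid_prefix([2, 3, 4, 4, 4], k=4)
assert not is_valid_prefix([2, 3, 3], k=4)  # illegal "stay" before stopping
assert not is_valid_prefix([1, 0, 1], k=4)  # illegal move after stopping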
Let Sk(i) denote the set of all possible sequences s of integers obeying the above rules and where the first state s1 = i. In an appendix we define the probability measure Pk,i[s] on Sk(i) that tells us the probability of observing a given sequence s starting from state i. We have the extra parameter (i) in our notation to avoid having to assume an initial starting distribution.
Throughout this note we use P[expression] as shorthand for the sum over s of expression(s) P[s]. We also use P[expression | condition] as shorthand for P[condition and expression] / P[condition] (when P[condition] > 0).
An interesting (and maybe even surprising) fact is: the following systems also have the Markov property and are therefore themselves Markov chains:
- S⇐,k(i), defined as all the s in Sk(i) that "eventually stopped at 0", with the probability measure P⇐,k,i[] induced by restricting to S⇐,k(i).
- S⇒,k(i), defined as all the s in Sk(i) that "eventually stopped at k", with the probability measure P⇒,k,i[] induced by restricting to S⇒,k(i).
We can in fact establish that P⇐,k[] (and similarly P⇒,k[]) has the Markov property. The argument is in an appendix. We can show that the observed transition probabilities for P⇒,k[] are very different from the 1/2s seen for Pk[].
Let's set up some code to estimate these transition probabilities from simulation.
# our example k
k = 4
# observe the empirical transition probabilities for right absorbed systems
observed = dict()
for start_i in range(1, k):
    downs = 0
    ups = 0
    for rep in range(1000000):
        first_up = 0
        first_down = 0
        state_i = start_i
        # run until we hit the stopping conditions
        while (state_i > 0) and (state_i < k):
            coin_flip = rng.binomial(n=1, p=0.5, size=1)[0]
            if coin_flip > 0.5:
                if (first_up + first_down) == 0:
                    first_up = 1
                state_i = state_i + 1
            else:
                if (first_up + first_down) == 0:
                    first_down = 1
                state_i = state_i - 1
        # only count right-absorbed paths
        if state_i == k:
            downs = downs + first_down
            ups = ups + first_up
    observed[f"probability of seeing {start_i} to {start_i+1}"] = ups / (ups + downs)
observed
{'probability of seeing 1 to 2': 1.0,
'probability of seeing 2 to 3': 0.7505207655904511,
'probability of seeing 3 to 4': 0.6665507149026005}
It is a fact that the Pk,i[] probability system is Markov on Sk(i) (see appendix). Given this, we can write transition probabilities independent of history (for not yet stopped sequences, i.e. i ≠ 0, k):
prob_upk(i) = Pk,i[s2 = i+1 | s1 = i] = 1/2
prob_downk(i) = Pk,i[s2 = i-1 | s1 = i] = 1/2
where Pk,i[] is the probability measure from the Sk(i) system.
In an appendix we show the P⇐,k[] and P⇒,k[] probability measures are indeed Markov. This then allows us to define the following symbols for their transition probabilities (independent of earlier history):
prob_up⇐,k(i) = P⇐,k,i[s2 = i+1 | s1 = i]
prob_down⇐,k(i) = P⇐,k,i[s2 = i-1 | s1 = i]
prob_up⇒,k(i) = P⇒,k,i[s2 = i+1 | s1 = i]
prob_down⇒,k(i) = P⇒,k,i[s2 = i-1 | s1 = i]
where P⇐,k,i[] is the probability measure from the S⇐,k(i) system and P⇒,k,i[] is the probability measure from the S⇒,k(i) system.
We can show for 0 < i < k:
prob_up⇒,k(i) = (i+1)/(2*i)
prob_down⇒,k(i) = 1 - prob_up⇒,k(i)
(and similar for prob_up⇐,k(i) and prob_down⇐,k(i)).
The P⇒,k[] system for k = 4 with the above transition probabilities is portrayed below.
This means the sequences that eventually stop at k actually look like they are attracted in that direction! Examples of this are Dr. Zumel's animations in "A Slightly Unfair Game".
The P⇐,k[] system for k = 4 is as follows.
The 50/50 system is a mixture of these two systems, with the mixture proportions varying by state label.
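We can confirm this mixture claim symbolically (my own addition): weighting the two conditioned up-probabilities by the i/k and (k-i)/k stopping probabilities recovers the unconditioned 1/2. Here prob_up⇐,k(i) = (k-i-1)/(2*(k-i)) is the form obtained from the i → k-i substitution described in the appendix.
# symbolic check: mixing the two conditioned up-probabilities by the stopping
# probabilities i/k and (k-i)/k recovers the unconditioned 1/2 up-probability
i_sym, k_sym = sympy.symbols("i k", positive=True)
prob_up_right = (i_sym + 1) / (2 * i_sym)                    # up-probability given eventual stop at k
prob_up_left = (k_sym - i_sym - 1) / (2 * (k_sym - i_sym))   # up-probability given eventual stop at 0
mixture = (i_sym / k_sym) * prob_up_right + ((k_sym - i_sym) / k_sym) * prob_up_left
assert sympy.simplify(mixture - sympy.Rational(1, 2)) == 0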
Let's check our claimed theoretical transition probabilities in S⇒,k(i) are in fact close to our empirical estimates.
# show the theoretical up transitions in absorbed systems
prob_up_given_stop_k = lambda *, k, i: 0 if (i<=0) or (i>=k) else (i+1)/(2*i)
prob_down_given_stop_k = lambda *, k, i: 0 if (i<=1) or (i>=k) else 1 - prob_up_given_stop_k(k=k, i=i)
prob_up_given_stop_0 = lambda *, k, i: 0 if (i<=0) or (i>=k-1) else prob_down_given_stop_k(k=k, i=k-i)
prob_down_given_stop_0 = lambda *, k, i: 0 if (i<=0) or (i>=k) else prob_up_given_stop_k(k=k, i=k-i)
theoretical = {f"probability of seeing {i} to {i+1}": prob_up_given_stop_k(k=k, i=i) for i in range(1, k)}
theoretical
{'probability of seeing 1 to 2': 1.0,
'probability of seeing 2 to 3': 0.75,
'probability of seeing 3 to 4': 0.6666666666666666}
# confirm measurement matches theory
assert set(theoretical.keys()) == set(observed.keys())
for key, v_theoretical in theoretical.items():
    v_observed = observed[key]
    assert np.abs(v_theoretical - v_observed) < 1e-3
The point of this note is to get some familiarity with deep facts about a specific random walk, before delegating to the usual general analysis tools.
We have derived quite a few results for the "half up, half down" bounded drunkard's walk that stops at 0 or k (k > 0):
- The probability of stopping at k after starting at i is prob_posk(i) = i/k.
- The expected number of steps to stop after starting at i is e_timek(i) = i * (k-i). In particular for a start state i ~ k/2 this is k/2 * (k - k/2) = k**2 / 4 expected run time.
- The random process of looking only at sequences that stop at 0 is in fact itself a Markov chain.
- The random process of looking only at sequences that stop at k is in fact itself a Markov chain.
- If we condition on the walk stopping at k, then the probability of "stepping up" from state i (observing an i to i+1 transition on a sequence that has not yet stopped) is prob_up⇒,k(i) = (i+1)/(2*i).
The above is a bit more detail than one usually tolerates in analyzing a Markov chain. Showing that conditioning on where the Markov chain stops preserves the Markov property is of interest. Also deriving the expected quadratic run time directly is quite nice.
Thank you to Dr. Nina Zumel for comments and for preparing the diagrams.
Let's replace our informal probability measure with a strict measure on infinite sequences. Without a pre-defined probability space and measure, we are essentially making up probability statements as we go (instead of being able to derive them from a fixed base).
A run trajectory is a sequence of states s = s0, s1, ..., where:

- All the sj are integers in the range 0 through k.
- If sj isn't 0 or k, then |sj - sj+1| = 1 (the process has not stopped).
- If sj is 0 or k, then sj+1 = sj (the chain has "stopped").
Let Sk(i) denote the set of all possible sequences s of integers obeying the above rules where the first state s1 = i. Define n_transition(s) as the number of j > 1 such that sj-1 ≠ sj.
Define the probability measure Pk,i[] of s in Sk(i) assigning:

- Pk,i[s] = ⊥ if s is not in Sk(i) (invalid trajectories, not part of the probability space; "⊥" is called "bottom" and represents an invalid state).
- Pk,i[s] = 1, if s1 = 0 or s1 = k. Call such s "stopped."
- Pk,i[s] = 0, when n_transition(s) is not finite. Call such s "never stopped."
- Pk,i[s] = 1/2**n_transition(s), when n_transition(s) is finite. Call such s "stopped" (or "eventually stopped").
We have the extra parameter (i) in our notation to avoid having to assume an initial starting distribution.
One can check that Pk,i[s] ≥ 0 for all s in Sk(i) and that the sum over s in Sk(i) of Pk,i[s] equals 1. This meets the definition of a probability measure, so we can apply known probability theorems such as Bayes' law. The never stopped sequences have probability measure zero (so they can be ignored in probability arguments; this is amusing, as this subset is the uncountable portion of the sequences).
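As a small concrete check (my own addition, with k = 4, start state 2, and a length cap of 20 chosen arbitrarily), we can enumerate bounded-length walks and watch the total probability of the stopped trajectories approach 1.
# enumerate all move sequences of bounded length from i = 2 with k = 4 and total
# the fraction that have stopped; this equals the sum of 1/2**n_transition(s) over
# stopped trajectories with at most max_transitions transitions
from itertools import product

k_demo, start_i, max_transitions = 4, 2, 20
stopped_count = 0
for moves in product((-1, 1), repeat=max_transitions):
    state_i = start_i
    for move in moves:
        state_i += move
        if state_i in (0, k_demo):
            stopped_count += 1
            break
print(stopped_count / 2**max_transitions)  # close to (and never more than) 1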
The above random process P[] has "the Markov property" on Sk(): history becomes irrelevant. That is, we claim:
Pk,v1[sj+1 = v | s1=v1, ..., sj=vj] = Pk,vj[sj+1 = v | sj=vj]
when s1=v1, ..., sj=vj, sj+1 = v is a valid prefix of an s in Sk(v1).
Throughout this note we use P[expression] as shorthand for the sum over s of expression(s) P[s]. We also use P[expression | condition] as shorthand for P[condition and expression] / P[condition] (when P[condition] > 0). For example Pk,i[sj+1 = v | sj=vj] denotes (sum over s in Sk(i) with (sj=vj) and (sj+1 = v) of Pk,i[s]) / (sum over s in Sk(i) with sj=vj of Pk,i[s]).
We often use the Markov property in analysis as follows: we can always pretend we are at the first move in a sequence! For 0 < i < k we have Pk,i[sj+1 = v2 | sj = v1] = Pk,v1[s2 = v2 | s1 = v1] (for s in Sk(i)).
We sketch a rough outline of an argument why Pk[] over Sk() has the Markov property. This is an ugly proof of a standard fact.
The Markov property itself consists of statements of the form:
P[sj+1=v | sj=vj, sj-1=vj-1] = P[sj+1=v | sj=vj]
(when P[sj+1=v | sj=vj, sj-1=vj-1] is non-zero). I.e.: earlier history becomes irrelevant in conditional probabilities.
Call (s1, ..., sj) "A(s)" and (sj, ...) "B(s)". We have n_transition(s) = n_transition(A(s)) + n_transition(B(s)). So we can factor P[] as:

Pk,i[f(A(s)) g(B(s))] = Pk,s1[f(A(s))] Pk,sj[g(B(s))]

(for s in Sk(i)).
The above follows from the usual factoring of sums of the form sum over a in A, b in B of f(a) g(b) as (sum over a in A of f(a)) (sum over b in B of g(b)).
As the system is Markov over our states, it is justified to define prob_upk(i) as the probability the next state is i+1, given the current state is i.
We define derived probability measures (for each i):

- S⇐,k(i), defined as all the s in Sk(i) that "eventually stopped at 0".
- P⇐,k,i[] as: P⇐,k,i[s] = ⊥ if s not in S⇐,k(i) else Pk,i[s] / (sum over z in S⇐,k(i) of Pk,i[z]).
- S⇒,k(i), defined as all the s in Sk(i) that "eventually stopped at k".
- P⇒,k,i[] as: P⇒,k,i[s] = ⊥ if s not in S⇒,k(i) else Pk,i[s] / (sum over z in S⇒,k(i) of Pk,i[z]).
Here we establish the Markov property for our conditioned measures and compute the new transition probabilities. This uses standard probability facts (such as Bayes' law) and the Markov property of Pk[] (a standard fact, also established in an earlier appendix).
We will assume k > 1 and 0 < vj < k for j = 1...u to avoid trivial corner cases. To check if P⇒,k,v1[vu+1 = 1+vu | s1=v1, ..., su=vu] is Markov we need to show it equals f(vu) for some function f() independent of all state except vu and vu+1.
We have:
P⇒,k,v1[vu+1 = 1+vu | s1=v1, ..., su=vu]
= Pk,v1[vu+1 = 1+vu | s1=v1, ..., su=vu, s stops at k]
= Pk,v1[s stops at k | s1=v1, ..., su=vu, vu+1 = 1+vu]
* Pk,v1[vu+1 = 1+vu | s1=v1, ..., su=vu]
/ Pk,v1[stops at k | s1=v1, ..., su=vu]
= Pk,1+vu[s stops at k | s1 = 1+vu]
* Pk,vu[vu+1 = 1+vu | s1=vu]
/ Pk,vu[stops at k | s1=vu]
At this point we have established the transition probabilities are Markov with respect to the state, as we now have terms involving only the most recent state.
In the above:

- The first line is definitional.
- The second line is an application of Bayes' Law P[A|B] = P[B|A] P[A] / P[B], with A = "vu+1 = 1+vu" and B = "s stops at k".
- The third line is the known Markov property of P[].
We can now justify defining prob_up⇒,k(i) as the probability the next state is i+1, given the current state is i and given the chain eventually stops at k.
We have proved the relation:
prob_up⇒,k(i)
= prob_upk(i)
* Pk,1+i[s stops at k | s1 = 1+i]
/ Pk,i[stops at k | s1 = i]
This relates a quantity we want to know (prob_up⇒,k(i)) about the chain known to halt at k to only known facts about the unconstrained Markov chain. Roughly we can think of this as saying prob_up⇒,k(i) is read off as what fraction of the unconstrained chain's positive stops are coming from a higher state.
To calculate the probability we can substitute in our known values for these probabilities.
prob_up⇒,k(i)
= prob_upk(i)
* Pk,1+i[s stops at k | s1 = 1+i]
/ Pk,i[stops at k | s1 = i]
= (1/2)
* ((1+i)/k)
/ (i/k)
= (i+1)/(2*i)
In the above:

- The second line is substituting in our previously calculated probabilities of "stopping non-zero", and the 1/2 transition probability under P[].
- The last step is algebra (checked mechanically below).
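As a quick symbolic confirmation of that last algebra step (my own addition, not part of the original derivation):
# confirm (1/2) * ((1+i)/k) / (i/k) simplifies to (i+1)/(2*i)
i, k = sympy.symbols("i k", positive=True)
expr = sympy.Rational(1, 2) * ((1 + i) / k) / (i / k)
assert sympy.simplify(expr - (i + 1) / (2 * i)) == 0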
For 0 < i < k we can write the transition probability (independent of older history) as:
prob_up⇒,k(i) = (i+1)/(2*i)
prob_down⇒,k(i) = 1 - prob_up⇒,k(i)
A similar argument establishes that P⇐,k[] is Markov on S⇐,k(). And we can get prob_up⇐,k and prob_down⇐,k by replacing i with k-i and also swapping the roles of up and down.
As a side note: the above work leads me to wonder if "backward chain arguments" are perhaps generalizations of Bayes' law. In particular I am anxious to re-read the brilliant "coupling from the past" arguments in Propp, James Gary; Wilson, David Bruce, "Exact sampling with coupled Markov chains and applications to statistical mechanics" (1996), Proceedings of the Seventh International Conference on Random Structures and Algorithms (Atlanta, GA, 1995), pp. 223-252, MR1611693. Another way of looking at it: we are using the original "fair" chain as a substitute system to analyze the chains "known to stop at k."
A bit of wisdom from Gian-Carlo Rota's "A Mathematician's Gossip":
Some theorems are hygienic prescriptions meant to guard us against potentially unpleasant complications. Authors of mathematics books frequently forget to give any hint as to what these complications would be. This omission makes their exposition incomprehensible.
Rota extends this by arguing statements such as "all extremely regular rings are fully normal" should not be motivated by demonstrating examples of extremely regular rings. Instead one needs to motivate such statements by showing examples of the horror of non-extremely regular rings that in addition are not even fully normal.
So let's demonstrate an example where conditioning on the future does not in fact leave the Markov property intact. This is why the fact that conditioning on the stopping outcome (also a future event) does leave the Markov property intact is something to celebrate.
Consider a Markov chain walking on the integers 0 through 4 (with 0 and 4 as stopping states). Start the random walk at state 2 and condition on the chain stopping in exactly 4 steps. The conditioned process is not Markov on the original states, as its transition probabilities no longer depend only on the current state (independent of history or time).
There are exactly 4 sequences obeying the above conditioning:
+-++
+---
-+++
-+--
The non-Markov condition can be seen: the second move must always be in the opposite direction of the first, and the fourth move must be in the same direction as the third. For example, when the conditioned walk is at state 3 after one move it must step down, while at state 3 after three moves it must step up. So transition probabilities are no longer a function of the current state alone.
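The enumeration is small enough to perform exhaustively in code (a quick sketch of the k = 4, start-at-2, stop-in-exactly-4-steps conditioning described above).
# enumerate all +/-1 move sequences of length 4 from state 2 and keep those that
# first hit 0 or 4 exactly on the 4th step
from itertools import product

conditioned = []
for moves in product((-1, 1), repeat=4):
    states = [2]
    for move in moves:
        states.append(states[-1] + move)
        if states[-1] in (0, 4):
            break
    if (len(states) == 5) and (states[-1] in (0, 4)):
        conditioned.append("".join("+" if m > 0 else "-" for m in moves))
print(conditioned)  # the four sequences listed above (possibly in a different order)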