# NumPy Example: Random Walks
Simulating random walks is an illustrative application of array operations. Let’s first consider a simple random walk starting at 0 with steps of 1 and -1 occurring with equal probability. Here is a pure Python way to implement a single random walk with 1,000 steps using the built-in random module:

In [3]:
import random
position = 0
walk = [position]
steps = 1000
for _ in range(steps):  # range replaces Python 2's xrange
    step = 1 if random.randint(0, 1) else -1
    position += step
    walk.append(position)

You might observe that walk is simply the cumulative sum of the random steps and can be evaluated as an array expression. Thus, I use the np.random module to draw 1,000 coin flips at once, set these to 1 and -1, and compute the cumulative sum:

In [5]:
import numpy as np

nsteps = 1000
draws = np.random.randint(0, 2, size=nsteps)
steps = np.where(draws > 0, 1, -1)
walk = steps.cumsum()
walk.min()

-4

In [6]:
walk.max()

45
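
As a quick sanity check on the claim that the walk is the cumulative sum of its steps, the sketch below builds the same walk twice from one set of steps: once with a loop and once with `cumsum`. The seed and the use of `np.random.default_rng` are assumptions made here for reproducibility and are not part of the original session.

```python
import numpy as np

# Assumption: a seeded Generator, so the run is repeatable.
rng = np.random.default_rng(12345)
nsteps = 1000
steps = np.where(rng.integers(0, 2, size=nsteps) > 0, 1, -1)

# Loop version: accumulate the position one step at a time.
position = 0
loop_walk = []
for step in steps:
    position += step
    loop_walk.append(position)

# Vectorized version: a single cumulative sum over the whole array.
vector_walk = steps.cumsum()

print(np.array_equal(loop_walk, vector_walk))  # True
```

Note that the pure Python version earlier also stores the starting position 0 as the first element of its list, so that list has one more entry than the `cumsum` result.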

Here we might want to know how long it took the random walk to get at least 10 steps away from the origin 0 in either direction. np.abs(walk) >= 10 gives us a boolean array indicating where the walk has reached or exceeded 10, but we want the index of the first 10 or -10. This can be computed using `argmax`, which returns the index of the first occurrence of the maximum value in the boolean array (True is the maximum value):

In [7]:
(np.abs(walk) >= 10).argmax()

151
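
One caveat with this approach: when the boolean array contains no True at all, `argmax` returns 0, which is indistinguishable from a crossing at the very first step. A minimal sketch of a guard follows; the helper name `first_crossing` and the choice of -1 as the "never crossed" marker are illustrative assumptions, not something from the original text.

```python
import numpy as np

def first_crossing(walk, level):
    """Return the first index where |walk| >= level, or -1 if it never happens.

    Hypothetical helper for illustration only.
    """
    hits = np.abs(walk) >= level
    if not hits.any():
        return -1
    return int(hits.argmax())

walk = np.array([1, 2, 1, 0, -1, -2, -3, -2])
print(first_crossing(walk, 3))   # 6
print(first_crossing(walk, 10))  # -1
```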

## Simulating Many Random Walks at Once
If your goal is to simulate many random walks, say 5,000 of them, you can generate all of them with minor modifications to the above code. When passed a 2-tuple as the size, the numpy.random functions generate a 2D array of draws, and we can compute the cumulative sum along each row (axis 1) to compute all 5,000 random walks in one shot:

In [8]:
nwalks = 5000
nsteps = 1000
draws = np.random.randint(0, 2, size=(nwalks, nsteps)) # 0 or 1
steps = np.where(draws > 0, 1, -1)
walks = steps.cumsum(1)
walks

array([[  1,   2,   1, ...,  54,  55,  56],
       [ -1,  -2,  -1, ...,  48,  49,  48],
       [  1,   2,   1, ...,  18,  19,  20],
       ..., 
       [  1,   2,   1, ..., -94, -93, -92],
       [  1,   0,   1, ..., -10,  -9,  -8],
       [  1,   2,   1, ...,   8,   9,  10]])
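
The same simulation can also be sketched with NumPy's newer `Generator` interface, which makes it easy to seed the run so the numbers repeat. The seed value below is an arbitrary assumption, and the results will not match the outputs shown in this section.

```python
import numpy as np

# Assumption: seed 0 is arbitrary; any fixed seed gives repeatable draws.
rng = np.random.default_rng(seed=0)

nwalks, nsteps = 5000, 1000
draws = rng.integers(0, 2, size=(nwalks, nsteps))  # 0 or 1 (upper bound is exclusive)
steps = np.where(draws > 0, 1, -1)

# axis=1 accumulates along each row, so every row is one complete walk.
walks = steps.cumsum(axis=1)
print(walks.shape)  # (5000, 1000)
```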

Now, we can compute the maximum and minimum values obtained over all of the walks:

In [9]:
walks.max()

126

In [10]:
walks.min()

-123

Out of these walks, let’s compute the minimum crossing time to 30 or -30. This is slightly tricky because not all 5,000 of them reach 30 or -30. We can check this using the `any` method:

In [11]:
hits30 = (np.abs(walks) >= 30).any(1)
hits30

array([ True,  True,  True, ...,  True, False,  True], dtype=bool)

In [12]:
hits30.sum() # Number that hit 30 or -30

3352

We can use this boolean array to select the rows of walks that actually cross the absolute 30 level, and call `argmax` across axis 1 to get the crossing times:

In [13]:
crossing_times = (np.abs(walks[hits30]) >= 30).argmax(1)
crossing_times.mean()

512.26252983293557
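
If it is more convenient to keep one entry per walk rather than dropping the walks that never cross, a possible variant is to fill those entries with a sentinel. The sketch below uses -1 as that sentinel and a smaller simulation with a fixed seed; all of these choices are assumptions for illustration.

```python
import numpy as np

# Assumption: a small seeded simulation, separate from the session above.
rng = np.random.default_rng(seed=1)
steps = np.where(rng.integers(0, 2, size=(200, 500)) > 0, 1, -1)
walks = steps.cumsum(axis=1)

reached = np.abs(walks) >= 30
hits30 = reached.any(axis=1)

# argmax gives 0 for rows that are entirely False, so replace those with -1.
crossing_times = np.where(hits30, reached.argmax(axis=1), -1)

print(hits30.sum())                       # number of walks that hit 30 or -30
print(crossing_times[hits30].min())       # minimum crossing time among them
```

Because each step moves the walk by exactly 1, no walk can reach 30 or -30 before index 29, which is a handy check on the result.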

Feel free to experiment with distributions for the steps other than equal-sized coin flips. You need only use a different random number generation function, such as `normal`, to generate normally distributed steps with some mean and standard deviation:

In [14]:
steps = np.random.normal(loc=0, scale=0.25, size=(nwalks, nsteps))
steps

array([[-0.06156853,  0.04186237,  0.12350331, ...,  0.44286713,
        -0.04890024, -0.05832853],
       [ 0.07992379, -0.49431927, -0.28209059, ..., -0.10821859,
         0.07765109,  0.21418729],
       [-0.04153766,  0.09211521, -0.12049272, ...,  0.05107135,
         0.29538126, -0.0713954 ],
       ..., 
       [-0.02207018,  0.37111419,  0.07684467, ...,  0.34328896,
         0.70356956, -0.37050724],
       [ 0.00475714, -0.16765689,  0.08845308, ...,  0.17437522,
         0.67641765, -0.13024071],
       [-0.1421904 , -0.1309853 ,  0.2816003 , ..., -0.33062671,
         0.0279834 ,  0.18297758]])
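
To turn normally distributed steps into walks, the same `cumsum` and `argmax` pattern applies. The following is a minimal sketch under stated assumptions: the seed and the threshold `level = 5.0` are hypothetical values chosen here, not taken from the text.

```python
import numpy as np

# Assumption: seeded Generator for repeatability.
rng = np.random.default_rng(seed=2)

nwalks, nsteps = 5000, 1000
steps = rng.normal(loc=0, scale=0.25, size=(nwalks, nsteps))
walks = steps.cumsum(axis=1)  # one walk per row

level = 5.0  # hypothetical threshold, picked only for illustration
reached = np.abs(walks) >= level
hits = reached.any(axis=1)

# First index at which each crossing walk reaches the threshold.
crossing_times = reached[hits].argmax(axis=1)

print(hits.mean())  # fraction of walks that reach the threshold
```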