## Key points

1. Data is stored in memory in a linear fashion
1. A `numpy.ndarray` is a versatile object that can be used to store and process various kinds of numeric data efficiently
1. For numerical work on large datasets, Python lists are slower and more difficult to work with than an `ndarray`

## Computing the mean efficiently

As a simple first example, we will consider the problem of computing the average (or *mean*) of a collection of random numbers between 0 and 1 (e.g., `0.241`, `0.429`, `0.012`). To generate these numbers, we can use the function `random.random`:

In [6]:
import random
random.random()

0.46284729488486553

Now that we know how to generate a single random number, we can construct a simple `for` loop to generate a `list` of them:

In [48]:
N = 100
data = []
for i in range(N):
    data.append(random.random())

In [51]:
print(data)

[0.23534029388323263, 0.2512830314067692, 0.2735303484717052, 0.18963695918767165, 0.6744353068402732, 0.8915058106321414, 0.5339676520138728, 0.10608903391678681, 0.8577010029444818, 0.2066981578170951, 0.2894081994229112, 0.5352551873196615, 0.8136032490658739, 0.6987560545396956, 0.58982886385295, 0.6156873234190439, 0.05437231312947799, 0.07872863090926374, 0.7013832991630957, 0.7481294446268315, 0.6210270285145206, 0.9288247708077876, 0.8442902511630704, 0.6721149693261215, 0.6233513554372632, 0.24280250707636886, 0.5524249260623049, 0.5387181844492188, 0.0675918158287383, 0.3433498896920838, 0.6679989643120914, 0.19073853527035445, 0.909096409992837, 0.8451816757611608, 0.8887406325227667, 0.8278399798161655, 0.5395083202077453, 0.5656483393374391, 0.8159977737934211, 0.8384802064328517, 0.4847958642491441, 0.7214511504534955, 0.9283875350873286, 0.8362140738690768, 0.25869471225574314, 0.8093466839982717, 0.1169236060553217, 0.6375750601989424, 0.15984810045610154, 0.55887625654

A simple method for computing the average of a list of numbers is as follows:

In [54]:
data_sum = 0
for num in data:
    data_sum += num
data_mean = data_sum/N

In [55]:
print(data_mean)

0.5107446201922261
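
The loop above can be cross-checked against two standard-library alternatives. The following is a minimal sketch, not part of the original notebook: the fixed seed and the names `loop_mean`, `builtin_mean`, and `stdlib_mean` are assumptions introduced here for illustration. The three results may differ in the last few digits because each approach accumulates floating-point rounding differently, so they are compared with a tolerance rather than for exact equality.

```python
import random
import statistics

random.seed(0)  # fixed seed, chosen here only so the sketch is repeatable
N = 100
data = [random.random() for _ in range(N)]

# Explicit loop, as in the cells above
data_sum = 0
for num in data:
    data_sum += num
loop_mean = data_sum / N

# Built-in sum, dividing by len(data) so no separate N is needed
builtin_mean = sum(data) / len(data)

# statistics.fmean from the standard library (Python 3.8 or later)
stdlib_mean = statistics.fmean(data)

print(loop_mean, builtin_mean, stdlib_mean)
```

Dividing by `len(data)` instead of a separately tracked `N` avoids a wrong answer if the list is later regenerated with a different length.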


## Exercise 1

### Writing a `mean` function

Using the loop above, write a function called `mean`, which accepts a single input called `data` (expected to be a `list`), and returns the average of the values in `data`. For example, your function should behave as follows:

```.python
test_data = [0, 0.5, 0.1]
print(mean(test_data))
```

```.output
0.19999999999999998
```

    

In [None]:
# %load solution_1.py

In [56]:
test_data = [0, 0.5, 0.1]
print(mean(test_data))

0.19999999999999998
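
The result prints as `0.19999999999999998` rather than `0.2` because `0.1` and `0.2` have no exact binary floating-point representation, so the division picks up a small rounding error. A short sketch of how such results are usually compared follows; it repeats the arithmetic directly instead of calling `mean`, so it does not depend on any particular exercise solution, and the name `value` is introduced here only for illustration.

```python
import math

# Same arithmetic as mean([0, 0.5, 0.1]): sum the values, divide by the count
value = (0 + 0.5 + 0.1) / 3

print(value)                     # the rounded floating-point result
print(value == 0.2)              # exact comparison fails
print(math.isclose(value, 0.2))  # tolerance-based comparison succeeds
```

When checking a numerical function against an expected value, `math.isclose` (or `numpy.isclose` for arrays) is generally a safer test than `==`.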


Let's now time our `mean` function for a few different data sizes:

In [59]:
%time mean(data)

CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 10.7 µs


In [37]:
import numpy as np

In [38]:
np.mean(data)

0.55299919997144653

In [40]:
print(mean(data))

0.5529991999714464


In [41]:
print(np.mean(data))

0.552999199971


In [42]:
%timeit mean(data)

9.02 µs ± 125 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [43]:
%timeit np.mean(data)

33.9 µs ± 131 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)


In [44]:
N = 100000
data = np.random.rand(N)

In [45]:
%timeit mean(data)

28.8 ms ± 6.4 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [46]:
%timeit np.mean(data)

131 µs ± 167 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
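
Taken together, the timings above point in two directions. For the 100-element list, `np.mean` (33.9 µs) is slower than the loop-based `mean` (9.02 µs), most likely because NumPy must first convert the list into an array. For the 100000-element `ndarray`, `np.mean` (131 µs) is roughly 200 times faster than the loop (28.8 ms). The sketch below repeats the comparison outside IPython with the standard `timeit` module. The `mean` function here is a stand-in for the Exercise 1 solution, which may differ, and the helper `time_per_call`, the seed, and the repeat counts are assumptions made for illustration. Absolute numbers will vary by machine.

```python
import timeit

import numpy as np


def mean(data):
    """Loop-based mean; a stand-in for the Exercise 1 solution."""
    data_sum = 0
    for num in data:
        data_sum += num
    return data_sum / len(data)


def time_per_call(func, arg, number):
    """Return the best-of-3 average time per call, in seconds."""
    timings = timeit.repeat(lambda: func(arg), number=number, repeat=3)
    return min(timings) / number


rng = np.random.default_rng(0)          # seeded only for repeatability
small_list = rng.random(100).tolist()   # plain Python list, N = 100
large_array = rng.random(100_000)       # numpy.ndarray, N = 100000

cases = [
    ("list, N=100", small_list, 1000),
    ("ndarray, N=100000", large_array, 5),
]
for label, arg, number in cases:
    t_loop = time_per_call(mean, arg, number)
    t_numpy = time_per_call(np.mean, arg, number)
    print(f"{label}: loop {t_loop * 1e6:.1f} µs, np.mean {t_numpy * 1e6:.1f} µs")
```

Taking the minimum over several repeats follows the usual `timeit` advice, since slower runs mostly reflect interference from other processes, not the code being measured.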
