In [8]:
import numpy as np

## Arrays and Vectorization

**Arrays** are sequences of data points that all share the same type (most often numbers).  Numpy lets us operate on the whole sequence at once, without writing a for-loop, using a technique called **vectorization**.

So, instead of writing:

```python
values = [1, 3, 6, 8, 2]
squares = [value ** 2 for value in values]
```

We can instead write:

```python
values = np.array([1, 3, 6, 8, 2])
squares = values ** 2
```
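
As a quick sanity check, the sketch below runs both versions side by side and confirms they produce the same numbers. The names `squares_loop` and `squares_vec` are made up for this illustration.

```python
import numpy as np

values = [1, 3, 6, 8, 2]

# Loop-based version: square each element with a list comprehension.
squares_loop = [value ** 2 for value in values]

# Vectorized version: ** is applied element-wise to the whole array.
squares_vec = np.array(values) ** 2

print(squares_loop)          # [1, 9, 36, 64, 4]
print(squares_vec.tolist())  # [1, 9, 36, 64, 4]
```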

Besides the **array()** function, which builds arrays (instances of the **ndarray** class), Numpy also includes many math functions that make analysis much easier.  Let's try some out!

## Numpy Exercises

Turn the following list into an array:

```python
my_list = [3, 6, 2, 10]
```
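
One possible answer, as a minimal sketch (the name `my_array` is just an illustrative choice):

```python
import numpy as np

my_list = [3, 6, 2, 10]
my_array = np.array(my_list)  # copies the list's values into a new ndarray

print(my_array)        # [ 3  6  2 10]
print(type(my_array))  # <class 'numpy.ndarray'>
```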

Add 20 to each element of the array [-3, 5, 2].

In [14]:
x = np.array([-3, 5, 2])
x + 20

array([17, 25, 22])

In [16]:
np.add(np.array([-3, 5, 2]), 20)

array([17, 25, 22])

Get the absolute value of all integers from -6 to 6.

In [22]:
data = np.array(range(-6, 7))
np.absolute(data)

array([6, 5, 4, 3, 2, 1, 0, 1, 2, 3, 4, 5, 6])

In [27]:
data = np.arange(-6, 7)
np.abs(data)

array([6, 5, 4, 3, 2, 1, 0, 1, 2, 3, 4, 5, 6])

Get the first 5 values in the array.

In [30]:
data[:5]

array([-6, -5, -4, -3, -2])

Reverse the order of the array.

In [31]:
data[::-1]

array([ 6,  5,  4,  3,  2,  1,  0, -1, -2, -3, -4, -5, -6])

### Building Arrays

Numpy has some convenient array-building functions as well.  Some commonly used examples are **arange()**, **linspace()**, **zeros()**, and the random number generation functions in **random**.

Make an array containing the numbers 1 to 15.

In [32]:
np.arange(1, 16)

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15])

In [34]:
np.linspace(1, 15, 15)

array([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12., 13.,
       14., 15.])

In [35]:
np.array(range(1, 16))

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15])

Make an array containing 20 zeros.

In [33]:
np.zeros(20)

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0.])

Make an array containing 20 ones!

In [36]:
np.ones(20)

array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1.])

In [37]:
np.zeros(20) + 1

array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1.])

In [38]:
np.ones(20) * 1 + 1 - 1

array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1.])

Generate an array of 10 random numbers from Numpy's **random** submodule, using any function you want.

In [45]:
x = np.random.random((3, 5))
x

array([[0.5798537 , 0.90839763, 0.63574701, 0.49924858, 0.56818441],
       [0.02762499, 0.34086583, 0.13595581, 0.99024784, 0.87429868],
       [0.871498  , 0.3436759 , 0.1247118 , 0.83962319, 0.19700305]])
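
Note that the cell above produces a 3x5 array, so it holds 15 values rather than 10. A sketch that yields exactly 10 values is shown here, using the `default_rng()` generator interface. The seed value and the names `rng` and `samples` are arbitrary choices for this illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seed chosen arbitrarily, for reproducibility
samples = rng.random(10)             # 10 floats drawn uniformly from [0, 1)

print(samples.shape)  # (10,)
```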

Get the mean of these numbers.

In [46]:
np.mean(x)

0.5291290944111886

In [47]:
x.mean()

0.5291290944111886
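
With a 2-D array, `mean()` averages over every element by default. The sketch below, on a small made-up array called `grid`, shows how the `axis` argument gives per-column or per-row means instead.

```python
import numpy as np

grid = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])

print(grid.mean())        # 3.5, the mean of all six values
print(grid.mean(axis=0))  # [2.5 3.5 4.5], one mean per column
print(grid.mean(axis=1))  # [2. 5.], one mean per row
```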

In [48]:
np.transpose(x)

array([[0.5798537 , 0.02762499, 0.871498  ],
       [0.90839763, 0.34086583, 0.3436759 ],
       [0.63574701, 0.13595581, 0.1247118 ],
       [0.49924858, 0.99024784, 0.83962319],
       [0.56818441, 0.87429868, 0.19700305]])

In [49]:
x.transpose()

array([[0.5798537 , 0.02762499, 0.871498  ],
       [0.90839763, 0.34086583, 0.3436759 ],
       [0.63574701, 0.13595581, 0.1247118 ],
       [0.49924858, 0.99024784, 0.83962319],
       [0.56818441, 0.87429868, 0.19700305]])

In [50]:
x.T

array([[0.5798537 , 0.02762499, 0.871498  ],
       [0.90839763, 0.34086583, 0.3436759 ],
       [0.63574701, 0.13595581, 0.1247118 ],
       [0.49924858, 0.99024784, 0.83962319],
       [0.56818441, 0.87429868, 0.19700305]])

What is the sum of these numbers?

In [54]:
x = np.random.random(10)
np.sum(x)

5.061527355589156

The standard deviation?

In [55]:
np.std(x)

0.2444230534429837

Subtract the mean of the array from each element in the array (a.k.a. "mean-centering" the values).

In [60]:
x - np.mean(x)

array([ 0.37387813, -0.01406969,  0.17948647,  0.15516739,  0.02740746,
       -0.32360428, -0.2110894 , -0.19955168,  0.33116786, -0.31879226])
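
A useful check on mean-centering is that the centered values should average to zero, up to floating-point error. Here is a small sketch on a made-up array where the arithmetic is easy to follow by hand:

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0])
centered = x - np.mean(x)  # the mean is 5.0, subtracted from every element

print(centered)                           # [-3. -1.  1.  3.]
print(np.isclose(centered.mean(), 0.0))   # True
```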

### Slicing Arrays

Slice only the values of the array [1, 5, 3, 8, 7] that are greater than 3.

In [68]:
x = np.array([1, 5, 3, 8, 7])
mask = x > 3
mask

array([False,  True, False,  True,  True])

In [69]:
x[~mask]  # inverted mask: the values that are NOT greater than 3

array([1, 3])

In [70]:
x[x > 3]

array([5, 8, 7])

Make an array of boolean values that says which elements of the array [4, 2, 10, 6, 1, 7] are even. (Hint: The result should be [True, True, True, True, False, False])

In [74]:
x = np.array([4, 2, 10, 6, 1, 7])
x % 2 < 1

array([ True,  True,  True,  True, False, False])

In [75]:
x = np.array([4, 2, 10, 6, 1, 7])
x % 2 == 0

array([ True,  True,  True,  True, False, False])

Make an array that says which elements in the array [4, 2, 10, 6, 1, 7] are equal to 10.

In [77]:
x = np.array([4, 2, 10, 6, 1, 7])
x == 10

array([False, False,  True, False, False, False])

Select only the values from the exercise above that are above the mean.

In [78]:
x[x > np.mean(x)]

array([10,  6,  7])

Multiply all the values in the array [2, 5, 1, 10, 6] by 100, but only if they are less than 6.

In [126]:
x = np.array([2, 5, 1, 10, 6])
x[x < 6] *= 100  # in-place update of only the elements selected by the mask
x

array([200, 500, 100,  10,   6])

In [134]:
x = np.array([1, 2, 3, 4, 5])
y = x[1:4]
y[0] = 999
x

array([  1, 999,   3,   4,   5])
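
The cell above changes `x` because a basic slice is a view onto the same memory, not a copy. The sketch below contrasts a view with an explicit `.copy()`; the names `view` and `copy` are illustrative only.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])

view = x[1:4]         # basic slice: shares memory with x
copy = x[1:4].copy()  # explicit copy: independent of x

copy[0] = 999         # does not touch x
print(x)              # [1 2 3 4 5]

view[0] = 999         # writes through to x
print(x)              # [  1 999   3   4   5]

print(np.shares_memory(x, view), np.shares_memory(x, copy))  # True False
```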

In [85]:
i = 0

In [None]:
i = i + 1

In [95]:
i += 1
i

10

### Translating Algorithms into Code

Calculate the standard deviation of an array's values, without using the numpy.std() function.  (Formula can be found here: http://www.mathsisfun.com/data/standard-deviation-formulas.html)

1. Work out the Mean (the simple average of the numbers)
2. Then for each number: subtract the Mean and square the result
3. Then work out the mean of those squared differences.
4. Take the square root of that and we are done!
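
The four steps above can be sketched one line at a time. The sample data and the intermediate variable names are made up for this illustration. Because the steps divide by the number of values, this is the population standard deviation, which is also what `np.std()` computes by default (`ddof=0`).

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

mean = np.mean(x)                  # step 1: the mean is 5.0
squared_diffs = (x - mean) ** 2    # step 2: squared distance from the mean
variance = np.mean(squared_diffs)  # step 3: mean of the squared differences, 4.0
std = np.sqrt(variance)            # step 4: square root, 2.0

print(std)                          # 2.0
print(np.isclose(std, np.std(x)))   # True
```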


In [120]:
x = np.random.normal(10, 1, size=10000)
%timeit np.sqrt(np.mean((x - np.mean(x)) ** 2))

39.5 µs ± 413 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)


In [121]:
%timeit np.std(x)

39.3 µs ± 3.78 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)


In [123]:
def standard_deviation(x):
    # Population standard deviation: root of the mean squared deviation from the mean.
    return np.sqrt(np.mean((x - np.mean(x)) ** 2))

In [124]:
standard_deviation(x)

1.0078775965073816