In [None]:
import sys; sys.path.append("..")
#from os.path import abspath; print(abspath('.'))
from utils import count_down

# Current Homework

![homework: factory method](hw_factory.png)

There are two errors in the following solution. What are they?

In [None]:
@staticmethod
def filled(self, rows, cols, value):
    sample = []
    for i in range(cols):
        sample.append(value)
    result = []
    for i in range(rows):
        result.append(sample)
    self._matrix = result
    
count_down(3)

In [None]:
rows = cols = 3; value = 2
sample = []
for i in range(cols):
    sample.append(value)
result = []
for i in range(rows):
    result.append(sample)
    
result

In [None]:
result[0][1] = 3
result

* The `filled` method is a factory method, which is typically a static method or a class method. Static methods are not called on instances and do not know about any specific instance, so using ``self`` is wrong. Instead, a temporary matrix should be created and then returned using the class's own constructor.
* The first row of the matrix (a list object) is generated once and then appended over and over again. This means the very same object is reused for all rows, so if you change an element of one row, you change it in all rows.
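
A corrected version could look like the sketch below. The `Matrix` class and its constructor signature are assumptions, since the actual homework class is only shown in the figure. The point is that the factory builds a fresh list for every row and returns a new instance instead of touching `self`.

```python
class Matrix:
    """Hypothetical stand-in for the homework class (assumed constructor)."""

    def __init__(self, rows):
        self._matrix = rows

    @staticmethod
    def filled(rows, cols, value):
        # Build a new list per row so that the rows do not share one object.
        data = [[value for _ in range(cols)] for _ in range(rows)]
        # No `self` here: return a new instance via the constructor.
        return Matrix(data)


m = Matrix.filled(3, 3, 2)
m._matrix[0][1] = 3  # only the first row changes
print(m._matrix)
```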

# Numerical Computing with NumPy

## Introductory example

Python built-in collections like `list` offer a flexible way of storing and manipulating data. As discussed previously, collections usually just store references to objects. While this is very convenient when writing code, it comes with costs in performance and memory.

Let's look at an example. Say we took one million measurements in an experiment and now want to compute their mean. We could do it in the following manner.

In [None]:
import random 
measurements = [random.randint(150, 200) for _ in range(1_000_000)]
measurements

In [None]:
import random 
measurements = [random.randint(150, 200) for _ in range(1_000_000)]

def calculate_mean(measurements):
    accumulator = 0
    for measurement in measurements:
        accumulator += measurement
    
    mean = accumulator / len(measurements)
    return mean

%timeit calculate_mean(measurements)

This is rather slow, since in every iteration Python has to rebind the loop variable and then check whether the `+` operation is supported between the `accumulator` and the current `measurement`. This prevents it from trying to add objects that can't be added, but in this case we are pretty sure that we are only dealing with integers. If we could tell the interpreter that we are only adding integers, we could skip all that type checking and speed up the operation. NumPy was designed for this purpose.


To use NumPy we have to import it. The import is usually aliased as `np` so we have to type less later on. Aliasing is only recommended if the alias is well established in the community of the respective package.

In [None]:
import numpy as np

NumPy's standard data type is the `ndarray` (which stands for n-dimensional array). In the simplest case, a NumPy array can be created from a list.

In [None]:
measurements_array = np.array(measurements)
measurements_array

In [None]:
type(measurements_array)

They behave very similarly to lists but have a fixed data type underneath. NumPy automatically notices that all our values are integers and chooses an appropriate data type, on most platforms an integer that takes up 64 bits of memory. See https://docs.scipy.org/doc/numpy-1.13.0/user/basics.types.html for an overview of the available types.

In [None]:
measurements_array[0]

In [None]:
measurements_array[0:5]

In [None]:
measurements_array.dtype
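
To make the memory argument from the beginning of this section concrete, the sketch below compares a small list with an array of 64-bit integers. The variable names are made up for illustration, the dtype is set explicitly so the example does not depend on the platform default, and the exact size of Python objects depends on the interpreter.

```python
import sys
import numpy as np

small_list = list(range(1000))
small_array = np.array(small_list, dtype=np.int64)

# Each array element occupies exactly 8 bytes.
print(small_array.itemsize, small_array.nbytes)

# The list stores references; every int object needs memory on top of that.
list_bytes = sys.getsizeof(small_list) + sum(sys.getsizeof(x) for x in small_list)
print(list_bytes)
```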

Moreover, NumPy offers a lot of routines for mathematical operations on arrays. Let's see if we actually gained something by using NumPy.

In [None]:
%timeit np.mean(measurements_array)

Almost 100x speedup in comparison to the pure Python implementation! Having convinced ourselves that NumPy is useful, let's take a more in-depth look at the NumPy array.

## Anatomy of arrays
Every array has a bunch of attributes that yield information about what it is.

### dtype

`.dtype` gives information about the data type. Arrays can contain bools, ints, unsigned ints, floats or complex numbers of various byte sizes. They can also store strings or Python objects, but that has very few use cases.

In [None]:
values = [0, 1, 2, 3, 4]
int_arr = np.array(values, dtype='int')
int_arr, int_arr.dtype

If the dtype does not match the given values, numpy will cast everything to that data type.

In [None]:
bool_arr = np.array(values, dtype='bool')
bool_arr, bool_arr.dtype

If no explicit data type is given, numpy will choose the "smallest common denominator". In the following example, everything becomes a float, as ints can be represented as floats, but not vice versa.

In [None]:
values = [0, 1, 2.5, 3, 4]
float_arr = np.array(values)
float_arr, float_arr.dtype

However, once the data type is set, everything will be coerced to that type.

In [None]:
int_arr[1] = 2.5
int_arr, int_arr.dtype

These non-Python data types force us to again think about problems like overflow etc.

In [None]:
values = [0, 1, 2, 3, 4]
uint_arr = np.array(values, dtype='uint8')  # unsigned 8-bit integers hold values from 0 to 255
uint_arr, uint_arr.dtype

In [None]:
uint_arr[1] += 255
uint_arr

...and can lead to some problems when comparing them to standard Python types:

In [None]:
type(measurements_array[0]) == type(183)

In [None]:
np.array([1.2-1.0], dtype=np.float32)[0] == 1.2-1.0

For better comparisons, you can compare using an epsilon-value:

In [None]:
epsilon = 0.00001
abs(np.array([1.2-1.0], dtype=np.float32)[0] - (1.2-1.0) ) < epsilon

http://effbot.org/pyfaq/why-are-floating-point-calculations-so-inaccurate.htm
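
NumPy also ships helpers for this kind of tolerance-based comparison. The following is a minimal sketch that relies on the default tolerances of `np.isclose`; the variable names are chosen for illustration only. The single-precision value is converted to a Python float first, so that both sides of the exact comparison are double precision.

```python
import numpy as np

single = float(np.array([1.2 - 1.0], dtype=np.float32)[0])
double = 1.2 - 1.0

exact = single == double           # exact comparison of the two values
close = np.isclose(single, double)  # comparison with a tolerance

print(exact, close)
```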

### shape and ndim
`.shape` is very important for keeping track of arrays with more than one dimension. It is a tuple with the number of elements in each dimension. `.ndim` is just the total number of dimensions.

In [None]:
values = [0, 1, 2, 3, 4]
one_dim_arr = np.array(values)
one_dim_arr

In [None]:
one_dim_arr.shape

In [None]:
one_dim_arr.ndim

In [None]:
values = [[0, 1, 2, 3, 4]] * 3
two_dim_arr = np.array(values)
two_dim_arr

In [None]:
[[0, 1, 2, 3, 4]] * 3

In [None]:
two_dim_arr.shape

In [None]:
two_dim_arr.ndim

In [None]:
two_dim_arr[1,1] = 10

In [None]:
two_dim_arr

In [None]:
values = [[[0, 1, 2, 3, 4]] * 3] * 6
three_dim_arr = np.array(values)
three_dim_arr

In [None]:
three_dim_arr.shape

In [None]:
three_dim_arr.ndim

### Other attributes

In [None]:
two_dim_arr

In [None]:
two_dim_arr.T

In [None]:
print(dir(two_dim_arr))

## Creating arrays
We already saw how arrays can be created from Python lists (the same works with tuples). However, we often would like to create arrays directly, without creating Python objects. This can be accomplished by several utility functions.

The equivalent of `range`.

In [None]:
np.arange(9)

In [None]:
np.arange(start=2, stop=14, step=2)

Creating an array with a certain number of values in a certain interval.

In [None]:
np.linspace(start=-5, stop=5, num=10)

An array containing zeros. The default `dtype` is `float`.

In [None]:
np.zeros(5)

`np.zeros` takes a `shape` argument that lets us create multidimensional arrays.

In [None]:
np.zeros((2, 3, 2))

The same goes for `ones`, `empty` and `full`.

In [None]:
np.ones(shape=(2, 3, 2))

In [None]:
# Corresponds to whatever was left in memory. Using zeros for initialising arrays is usually safer.
np.empty(shape=(2, 3, 2))

In [None]:
np.full(shape=(2, 3, 2), fill_value=42)

### Exercise

Create a 3x3 array that consists solely of True values.

In [None]:
count_down(2)

In [None]:
np.ones((3,3), dtype=bool)

`np.random` contains a lot of functions to create arrays filled with random values of various probability distributions.

In [None]:
np.random.random((3, 3))

In [None]:
np.random.randint(0, 10, (5, 5))

Using ``np.random.randint`` and a boolean dtype, you can create random boolean arrays!

### Exercise
 
Create a 5x5 array in which *statistically* 1/4 of items are False, all others being True.

In [None]:
count_down(3)

You can use the method ``astype(bool)`` to convert an integer-array into a boolean-array

In [None]:
np.random.randint(0, 4, (5, 5)).astype(bool)

Note that in this array, only *on average* 1/4 of the elements are False. To create an array where precisely 1/4 of the elements are False, one would create an array with that many False values and ``shuffle`` it, as we'll see later on.

``np.repeat`` repeats elements of an array:

In [None]:
np.repeat(3, 5)

In [None]:
np.repeat([[1,2],[3,4]], 2)

### Reshape

In [None]:
a = np.arange(start=2, stop=14)
print(a.shape)
a

In [None]:
b = a.reshape(3, 4)
b

Passing -1 for one dimension lets NumPy figure out the size of that dimension automatically.

In [None]:
a.reshape(-1, 2)

### Exercise

Using only what you've seen so far and only using numpy, create a two-dimensional array that contains the sequence [1, 2, 3] in every row (10 rows)

In [None]:
count_down(5)

In [None]:
# What you're supposed to create:
np.array([1, 2, 3]*10).reshape(10, -1)
# However, NumPy arrays don't support the *10 repetition syntax the way lists do (for arrays, * multiplies element-wise), and list repetition can lead to the aforementioned side effects.

The transpose of the matrix can help you greatly!

In [None]:
np.repeat(np.arange(1, 4), 10).reshape(3, -1).T

### Comparing Arrays

In [None]:
epsilon = 0.000000000001
a = np.zeros((3,3))
b = np.zeros((3,3))
a[0,0] += 0.5*epsilon
a == b

In [None]:
(a == b).all()

Flaws of this approach:
* If either a or b is empty and the other one contains a single element, this will return True (the comparison a == b is broadcast to an empty array, for which `all` returns True).
* If a and b don't have the same shape and aren't broadcastable, this approach will raise an error.
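
The first flaw can be reproduced with a short sketch (the array names are illustrative):

```python
import numpy as np

empty = np.array([])
single = np.array([1.0])

# Broadcasting turns the comparison into an empty boolean array ...
comparison = empty == single
print(comparison.shape)

# ... and all() of an empty array is True, although the arrays differ.
print(comparison.all())
```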

Instead, use the functions NumPy provides!

In [None]:
np.array_equal(a, b)

In [None]:
np.allclose(a, b)

In [None]:
np.isclose(a, b)

## Masking
Logical arrays, i.e. arrays containing boolean values, can be used to index other arrays. These logical arrays are then called masks. This is especially useful to index based on logical conditions.

In [None]:
# A simple integer array.
arr = np.arange(1, 6)
arr

In [None]:
# A boolean array of the same shape as arr.
mask = np.array([True, False, True, False, True])
mask

Using the mask for indexing returns an array with only elements at positions where `mask` is `True`.  

In [None]:
arr[mask]

Luckily for us, comparison operators in NumPy work element-wise and return a boolean array:

In [None]:
arr < 10

Because of this, we can use direct comparison as a mask:

In [None]:
arr[arr < 10]

Masks can be used for assignment, which keeps the shape of the original array (thanks to *fancy indexing*, which will come up later).

In [None]:
arr[mask] = 10
arr

### Exercise

Replace all odd numbers in the given array with -1

In [None]:
count_down(3)

In [None]:
arr = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
arr

In [None]:
arr[arr % 2 == 1] = -1
arr

## Mathematical operations
Numpy contains a lot of mathematical functions that operate on arrays in a vectorized manner. That means that they are applied to each element, without explicit for-loops. Vectorized functions are called `ufuncs` (universal functions) in Numpy.

### Standard arithmetic

In [None]:
arr = np.arange(9)
arr

In [None]:
arr * 3

In [None]:
arr + (arr*2)

In [None]:
arr - arr

In [None]:
1 /0

In [None]:
arr / arr

In [None]:
arr * arr

In [None]:
arr ** 2

### Exercise
Create a 1-dimensional array that repeats the sequence [1, 2, 3] 30 times!

In [None]:
count_down(3)

In [None]:
a = np.arange(3*30) % 3 + 1
a

In [None]:
var1 = lambda: np.repeat(np.arange(1, 4), 30).reshape(3, -1).T.flatten()
var2 = lambda: np.arange(3*30) % 3 + 1
var3 = lambda: np.array([[1,2,3] for _ in range(30)]).flatten()

%timeit var1()
%timeit var2()
%timeit var3()

Using `@` you can even do matrix multiplication. In the case of 1d arrays, this is the inner product between two vectors.

In [None]:
arr @ arr

In [None]:
# That's the same as
np.sum(arr * arr)

### Some standard functions

In [None]:
arr

In [None]:
np.log(arr)

In [None]:
np.exp(arr)

In [None]:
np.sin(arr)

Always try to use vectorized ufuncs instead of explicit loops!
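
As a small illustration of this advice, the sketch below computes square roots once with an explicit Python loop and once with the ufunc `np.sqrt`. The variable names are illustrative; both approaches give the same values, but the ufunc does its looping in compiled code.

```python
import math
import numpy as np

values = np.arange(1, 6)

# Explicit loop: one Python-level function call per element.
looped = np.array([math.sqrt(v) for v in values])

# Vectorized ufunc: a single call that loops in compiled code.
vectorized = np.sqrt(values)

print(np.allclose(looped, vectorized))
```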

### Broadcasting
What happens if you try to add arrays of different shapes? Numpy will try to expand the arrays according to three rules and try to make their shapes match, so the operation can be applied elementwise. 

**1. Rule** If the arrays have different numbers of dimensions, the smaller shape is padded with ones on its left side.<br/>
            Example: (5 x 3) + (3) &rarr; (5 x 3) + (**1** x 3)<br/>
**2. Rule** If the number of the dimensions matches, but the size of a dimension does not, dimensions with the size of 1 are expanded.<br/>
            Example: (5 x 3) + (1 x 3) &rarr; (5 x 3) + (**5** x 3)<br/>
**3. Rule** If the shapes of the arrays still differ after applying Rules 1 and 2, a broadcasting error is raised.

The figure below gives an illustration (source https://jakevdp.github.io/PythonDataScienceHandbook/02.05-computation-on-arrays-broadcasting.html) ![](broadcasting.png)




The Numpy documentation gives further insights https://docs.scipy.org/doc/numpy-1.14.0/user/basics.broadcasting.html. 
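
The first two rules can also be traced by hand. The sketch below performs the padding and the expansion explicitly with `np.newaxis` and `np.broadcast_to`. NumPy does this implicitly and without copying data, so this is only meant to illustrate the rules, not to show how NumPy implements them.

```python
import numpy as np

a = np.arange(15).reshape((5, 3))
b = np.arange(3)

# Rule 1: pad the smaller shape with a one on the left: (3,) -> (1, 3).
b_padded = b[np.newaxis, :]

# Rule 2: stretch the dimension of size 1 to match: (1, 3) -> (5, 3).
b_expanded = np.broadcast_to(b_padded, a.shape)

print(b_padded.shape, b_expanded.shape)
print(np.array_equal(a + b, a + b_expanded))
```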


In [None]:
a = np.arange(15).reshape((5, 3))
a

In [None]:
b = np.arange(3)
b

In [None]:
a + b 

Here is a case in which broadcasting fails.

In [None]:
c = np.arange(4)
a + c

### Aggregation functions
Aggregation functions are functions that reduce the dimensionality of an array. They provide an `axis` argument to specify which dimension to reduce.

In [None]:
np.random.seed(1)
two_dim_arr = np.random.randint(0, high=20, size=(4, 4))
two_dim_arr

If just the array is passed, the aggregation operation is performed over the whole array.

In [None]:
np.min(two_dim_arr)

The optional `axis` argument allows us to specify which dimension should be aggregated. You can think of it as the operation being applied to all entries that are obtained by keeping the indices in all dimensions fixed except for the `axis` dimension.
Let's look at the result of the minimum operation with `axis=0`:

In [None]:
np.min(two_dim_arr, axis=0)

The axis concept extends to arrays with more than two dimensions.

In [None]:
np.random.seed(1)
three_dim_arr = np.random.randint(0, high=20, size=(4, 4, 4))
three_dim_arr

In [None]:
np.min(three_dim_arr, axis=0)

Here the entry at index `[0, 0]`, i.e. `5` is the minimum of the following values. 

In [None]:
for i in range(4):
    print(three_dim_arr[i, 0, 0])

Let's demonstrate all axes again with another three-dimensional array:

In [None]:
a = np.array([[[2,4],[6,9]],[[3,1],[7,8]],[[4,5],[9, 0]]])

In [None]:
a, a.shape

In [None]:
np.min(a)

In [None]:
np.min(a, axis=0)

Setting the `axis` argument is the same as going through all other axes of the array in turn and returning the aggregate for every combination of their indices.

In [None]:
for i in range(a.shape[1]):
    for j in range(a.shape[2]):
        print(a[:, i, j])

For axis=1, we loop through axis 0 and axis 2:

In [None]:
a

In [None]:
np.min(a, axis=1)

In [None]:
for i in range(a.shape[0]):
    for j in range(a.shape[2]):
        print(a[i, :, j])

...and finally, for axis 2 we loop through axis 0 and 1

In [None]:
a

In [None]:
np.min(a, axis=2)

In [None]:
for i in range(a.shape[0]):
    for j in range(a.shape[1]):
        print(a[i, j, :])

The shape of the resulting array is simply the shape of the original array, leaving the specified axis out:

In [None]:
mins = []
for i in range(a.shape[0]):
    for j in range(a.shape[1]):
        mins.append(min(a[i,j,:]))
np.array(mins).reshape([a.shape[0], a.shape[1]])

In [None]:
for ax in range(3):
    print(np.min(a, axis=ax).shape == a.shape[:ax]+a.shape[ax+1:])

...however, of course, using numpy is much faster than looping over the array:

In [None]:
def find_min_manual(arr):
    mins = []
    for i in range(arr.shape[0]):
        for j in range(arr.shape[1]):
            mins.append(min(arr[i,j,:]))
    return np.array(mins).reshape([arr.shape[0], arr.shape[1]])

%timeit find_min_manual(a)
%timeit np.min(a, axis=2)

**End of Tuesday lecture**

#### More than one Dimension

Aggregation functions can also aggregate more than one dimension at once.

In [None]:
np.min(three_dim_arr, axis=(1, 2))

Here the entry at index `[2]`, i.e. `3` is the minimum of the following values. 

In [None]:
for i in range(4):
    for j in range(4):
        print(three_dim_arr[2, i, j])

#### Other aggregation functions.

In [None]:
two_dim_arr

In [None]:
np.max(two_dim_arr)

In [None]:
np.max(two_dim_arr, axis=0)

In [None]:
np.max(two_dim_arr, axis=1)

In [None]:
np.sum(two_dim_arr)

In [None]:
np.sum(two_dim_arr, axis=0)

In [None]:
np.sum(two_dim_arr, axis=1)

Many of these functions are also available as methods on the array object.

In [None]:
two_dim_arr.sum(axis=0)

### Exercise

Create a lambda function that flattens a given NumPy array so that it is one-dimensional.

In [None]:
a = np.arange(64).reshape((2,2,2,2,2,2))
a

In [None]:
count_down(3)

In the end, you want an array of shape (product_of_all_dimensions,).

In [None]:
flatten = lambda x: x.reshape(np.prod(x.shape))  # np.prod multiplies all dimension sizes

In [None]:
flatten(a)

...of course, there's also a NumPy method for that.

In [None]:
a.flatten()

## Advanced indexing
Numpy provides indexing methods that go beyond the indexing techniques known from standard Python sequences.


### Multidimensional indexing
Instead of doing subsequent indexing as with standard Python lists you can index all dimensions at once.

In [None]:
two_dim_list = [
    [ 0,  1,  2],
    [ 3,  4,  5],
    [ 6,  7,  8],
    [ 9, 10, 11],
    [12, 13, 14]
]
two_dim_list[2][1]

In [None]:
inner_list = two_dim_list[2]
inner_list[1]

In [None]:
two_dim_arr = np.array(two_dim_list)
two_dim_arr[2, 1]

You can use a colon to get all values from a dimension.

In [None]:
large_two_dim_arr = np.arange(81).reshape((9, 9))
large_two_dim_arr

In [None]:
large_two_dim_arr[:, 1]

Standard slicing with `(start, stop, step)` works as expected.

In [None]:
large_two_dim_arr[:, 1:3]

In [None]:
large_two_dim_arr[:, 2:7:2]

Slices of an array are always `views`. That means you just "view" the same chunk of memory from a different perspective. This saves a lot of memory, but it also means that the original array will be changed if you change the view.

In [None]:
arr_slice = large_two_dim_arr[:, 1]
arr_slice[:] = 0
large_two_dim_arr

In [None]:
large_two_dim_arr[:, 2] =0
large_two_dim_arr
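
If the original array should stay untouched, an explicit copy can be taken instead of working on the view. The sketch below uses `np.shares_memory` to show the difference; the array names are chosen for illustration.

```python
import numpy as np

original = np.arange(9).reshape((3, 3))

view = original[:, 1]                # slicing returns a view
independent = original[:, 1].copy()  # explicit copy with its own memory

print(np.shares_memory(original, view))
print(np.shares_memory(original, independent))

independent[:] = 0  # does not touch `original`
print(original[:, 1])
```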

If you need all values from several consecutive dimensions you can use ellipsis (`...`) as a shorthand.

In [None]:
# Ellipsis is an actual Python object.
print(...)

In [None]:
# np.stack joins arrays along a new axis.
four_dim_arr = np.stack((np.ones((3, 3, 3)), 
                         np.ones((3, 3, 3)) * 2, 
                         np.ones((3, 3, 3)) * 3, 
                         np.ones((3, 3, 3)) * 4))
four_dim_arr

In [None]:
four_dim_arr.shape

In [None]:
four_dim_arr[3, :, :, :]

In [None]:
four_dim_arr[1,..., 1]

In [None]:
four_dim_arr[1, :, :, 1]

### Fancy indexing
You can pass an array containing indices. This is especially useful for drawing random items from an array.

In [None]:
arr = np.arange(9) + 10
arr

In [None]:
indices = np.array([1, 4, 5])
arr[indices]

The resulting array will reflect the shape of the index array.

In [None]:
indices = np.array([[1, 4],
                    [5, 7]])
arr[indices]

You can index each dimension separately.

In [None]:
two_dim_arr

In [None]:
x_indices = np.array([3, 4])
y_indices = np.array([1, 2])
two_dim_arr[x_indices, y_indices] # Corresponds to indexing at [3, 1] and [4, 2].

### Advanced Masking

In [None]:
a = np.arange(9)
a

In [None]:
a % 3 == 0

In [None]:
a[a % 3 == 0]

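Assigning values through a mask works because boolean masking is a form of fancy indexing: the mask selects positions in the original array, and the assignment writes to exactly those positions. The same assignment can therefore be written with an explicit index array.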

In [None]:
a[a % 3 == 0] = 0
a

In [None]:
a[np.array([0, 3, 6])] = 10
a

#### Combining masks

Masks can be created using comparison operators, for example to get all the entries in an array that are greater than two.


In [None]:
arr = np.arange(1, 6)
greater_two = arr > 2
greater_two


In [None]:
arr[greater_two]

Or even shorter.

In [None]:
arr[arr > 2]

Different masks can be combined using bitwise logical operators. These are the vectorized versions of the logical operators and should not be confused with `and`, `or` and `not`, which try to evaluate the truth value of a whole object.

In [None]:
smaller_or_equal_four = arr <= 4
smaller_or_equal_four   

Bitwise and `&`.

In [None]:
greater_two & smaller_or_equal_four

In [None]:
# This does not work.
greater_two and smaller_or_equal_four

In [None]:
arr[greater_two & smaller_or_equal_four]

Bitwise or using `|`.

In [None]:
arr[greater_two | smaller_or_equal_four]

Bitwise xor using `^`.

In [None]:
arr[greater_two ^ smaller_or_equal_four]

Bitwise negation using `~`.

In [None]:
# Gives everything smaller or equal to 2.
arr[~greater_two]
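
When the comparisons are written inline instead of being stored in variables first, each comparison needs its own parentheses, because `&` binds more tightly than `>` and `<=`. A minimal sketch (the name `between` is illustrative):

```python
import numpy as np

arr = np.arange(1, 6)

# Parentheses are required: `&` binds more tightly than `>` and `<=`.
between = arr[(arr > 2) & (arr <= 4)]
print(between)
```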

#### Using np.where

Assigning through a mask changes the original array in place, whereas sometimes the original array should remain unchanged. ``np.where`` returns the indices of an array where the given condition is true.

In [None]:
a = np.arange(9).reshape(3, 3)
a[a % 3 == 0] = 0
a

In [None]:
a = np.arange(9).reshape(3, 3)
tmp = np.where(a % 3 == 0)
tmp

In [None]:
b = np.ones((3, 3))
b[tmp] = 0
b

``where`` can also be used to assign values to a new array:

In [None]:
np.where(a % 3 == 0, 0, a)

### Exercise

Create an array of 8 8-dimensional one-hot-vectors

In [None]:
count_down(5)

In [None]:
np.array(np.arange(8*8).reshape((8, 8)) % 9 == 0, dtype=int)

In [None]:
np.eye(8, dtype=int)

### Exercise

Create a new array in which all odd numbers of ``a`` are replaced with -1, without changing ``a``.

In [None]:
a = np.arange(10)
a

In [None]:
count_down(3)

In [None]:
np.where(a % 2 == 1, -1, a)

### Exercise

Figure out the common items of the arrays a and b, where the arrays match:

In [None]:
a = np.array([1,2,3,2,3,4,3,4,5,6])
b = np.array([7,2,10,2,7,4,9,4,9,8])

In [None]:
count_down(2)

In [None]:
np.unique(a[np.where(a == b)])

In [None]:
np.intersect1d(a,b)

## Extending arrays

### Adding new dimensions with `np.newaxis`

Instead of `np.newaxis`, `None` can be used.

In [None]:
one_dim_arr = np.arange(5)
one_dim_arr, one_dim_arr.shape

In [None]:
two_dim_arr = one_dim_arr[np.newaxis, :]
two_dim_arr, two_dim_arr.shape

In [None]:
two_dim_arr = one_dim_arr[:, np.newaxis, None]
two_dim_arr, two_dim_arr.shape

Adding new dimensions is useful, for example, when TensorFlow expects batched inputs but you want to provide a single data point for prediction:

In [None]:
one_dim_arr[:, None]
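
For the batching use case, a sketch could look as follows. It assumes a model that expects inputs of shape `(batch_size, n_features)`, which is a common convention but is not stated in the text; no TensorFlow call is made, and the variable names are illustrative.

```python
import numpy as np

datapoint = np.arange(5)          # one sample with 5 features, shape (5,)
batch = datapoint[np.newaxis, :]  # a batch holding that single sample, shape (1, 5)

print(datapoint.shape, batch.shape)
```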

### Combining arrays
There are many ways to combine existing arrays, like `np.append`, `np.concatenate` and `np.stack`. However, these operations always require the whole array to be copied. Therefore, it often makes more sense to allocate an array of the size you need later upfront and then just fill the respective parts.

In [None]:
np.concatenate((one_dim_arr, one_dim_arr))

In [None]:
np.stack((one_dim_arr, one_dim_arr))

In [None]:
np.append(one_dim_arr, one_dim_arr)
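
The advice to allocate the array upfront can be sketched as follows. The chunk sizes and values are made up for illustration; the point is that the result is allocated once and then filled slice by slice, instead of growing it with `np.append` inside a loop.

```python
import numpy as np

n_chunks, chunk_size = 4, 3

# Allocate the final array once ...
result = np.zeros(n_chunks * chunk_size, dtype=int)

# ... and fill it chunk by chunk instead of calling np.append in a loop.
for i in range(n_chunks):
    start = i * chunk_size
    result[start:start + chunk_size] = np.full(chunk_size, i)

print(result)
```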

There are also the functions ``np.vstack`` (row-wise-stacking) and ``np.hstack`` (column-wise-stacking):
* hstack is equivalent to concatenation along the second axis, except for 1-D arrays where it concatenates along the first axis
* vstack is equivalent to concatenation along the first axis after 1-D arrays of shape (N,) have been reshaped to (1,N).

In [None]:
two_dim_arr = np.arange(16).reshape(4, -1)
two_dim_arr_2 = np.arange(16).reshape(4, -1) + 16
two_dim_arr_2

In [None]:
np.hstack((two_dim_arr, two_dim_arr_2))

In [None]:
np.vstack((two_dim_arr, two_dim_arr_2))

## np.random

* random.seed
* https://docs.scipy.org/doc/numpy/reference/generated/numpy.random.choice.html#numpy.random.choice
* https://docs.scipy.org/doc/numpy/reference/generated/numpy.random.shuffle.html#numpy.random.shuffle
* https://docs.scipy.org/doc/numpy/reference/generated/numpy.random.permutation.html#numpy.random.permutation
* Using random, you can e.g. create an array where precisely 1/4 of the elements are True
* Create random indices and fancy indexing to keep the order for input+target
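
The last two points can be sketched as follows. The array sizes and names are chosen for illustration, and the seed is fixed only to make the run repeatable.

```python
import numpy as np

np.random.seed(0)

# Exactly 1/4 of the 16 entries are True; shuffling randomizes their positions.
flags = np.zeros(16, dtype=bool)
flags[:4] = True
np.random.shuffle(flags)
flags = flags.reshape(4, 4)
print(flags.sum())

# One permutation of indices keeps inputs and targets aligned.
inputs = np.arange(10).reshape(5, 2)
targets = np.arange(5)
order = np.random.permutation(len(inputs))
shuffled_inputs, shuffled_targets = inputs[order], targets[order]
print(shuffled_inputs, shuffled_targets)
```

Because both arrays are indexed with the same `order`, row `i` of `shuffled_inputs` still belongs to entry `i` of `shuffled_targets`.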

### Shuffling arrays

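``np.random.shuffle`` shuffles an array along the first index. That means a one-dimensional array is completely shuffled, whereas for multidimensional arrays, only the order of the sub-arrays along the first axis changes while their contents stay the same.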

In [None]:
a = np.arange(10)
np.random.shuffle(a)
a

In [None]:
a = np.arange(9).reshape(3, 3)
np.random.shuffle(a)
a

## Further Readings

NumPy chapter from Jake VanderPlas's "Python Data Science Handbook" https://jakevdp.github.io/PythonDataScienceHandbook/02.00-introduction-to-numpy.html

[Video tutorial from Scipy 2017](https://youtu.be/lKcwuPnSHIQ)


In [None]:
from IPython.display import YouTubeVideo
YouTubeVideo('lKcwuPnSHIQ')