### Drill: NumPy Vectorization and Multidimensional Arrays

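Vectorization lets us apply an operation to every element of an array at once, which is usually much faster than processing the values one at a time in a Python loop.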

- Make a random array of integers in the range of 1-10 with a size of 5 rows by 3 columns; name this array `ar`
- Next double all values in the array `ar`

In [1]:
import numpy as np

np.random.seed(5678)

ar = np.random.randint(1, 11, size=(5, 3))
ar

array([[ 5,  8,  3],
       [ 5,  4,  6],
       [ 3,  7,  4],
       [ 1,  6,  4],
       [ 7, 10,  4]])

In [2]:
ar * 2

array([[10, 16,  6],
       [10,  8, 12],
       [ 6, 14,  8],
       [ 2, 12,  8],
       [14, 20,  8]])

Notice that operations like this can be chained. Instead of writing a <code>for</code> loop or using some other technique, NumPy lets us write concise, fast code.
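To make the comparison concrete, here is a minimal sketch that doubles the array with an explicit loop and with a vectorized expression. It assumes the values of `ar` printed above, typed in by hand so that the example does not depend on the random seed; the names `doubled_loop` and `doubled_vec` are only illustrative.

```python
import numpy as np

# Values copied from the `ar` output shown above
ar = np.array([[5, 8, 3],
               [5, 4, 6],
               [3, 7, 4],
               [1, 6, 4],
               [7, 10, 4]])

# Loop version: visit every element and double it
doubled_loop = np.empty_like(ar)
for i in range(ar.shape[0]):
    for j in range(ar.shape[1]):
        doubled_loop[i, j] = ar[i, j] * 2

# Vectorized version: one expression, applied element-wise
doubled_vec = ar * 2

# Both approaches give the same result
print(np.array_equal(doubled_loop, doubled_vec))
```

The vectorized form is shorter, and on large arrays it is typically much faster because the looping happens inside NumPy's compiled code.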

- Try this by squaring the `ar` array and then dividing it by 0.05!

In [3]:
(ar ** 2) / 0.05

array([[ 500., 1280.,  180.],
       [ 500.,  320.,  720.],
       [ 180.,  980.,  320.],
       [  20.,  720.,  320.],
       [ 980., 2000.,  320.]])

In practice we often need to transform a whole data set. For instance, when the values for the objects or phenomena of interest are very small or very close together, we can apply a log transformation to the data. This is a common technique in data visualization.

- Using the array `ar`, apply the natural log (base e) to all the values

In [4]:
np.log(ar)

array([[1.60943791, 2.07944154, 1.09861229],
       [1.60943791, 1.38629436, 1.79175947],
       [1.09861229, 1.94591015, 1.38629436],
       [0.        , 1.79175947, 1.38629436],
       [1.94591015, 2.30258509, 1.38629436]])
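Since it is easy to mix up the bases, the sketch below shows the three logarithm functions NumPy provides. The `values` array is made up for illustration and is not part of the drill.

```python
import numpy as np

# Illustrative values, chosen so the base-2 and base-10 results are easy to read
values = np.array([1, 2, 4, 8, 10])

natural = np.log(values)    # natural log, base e
base2 = np.log2(values)     # base 2
base10 = np.log10(values)   # base 10

# Change of base: log2(x) equals ln(x) / ln(2)
base2_via_ln = np.log(values) / np.log(2)

print(natural)
print(base2)
print(base10)
```

`np.log` is always the natural log; use `np.log2` or `np.log10` when you want a different base.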

<h3>Practice with Multidimensional Arrays</h3>

- Let's make a new 4 x 5 array named `newArray` of randomized integer values from 10 to 14 (the upper bound passed to `np.random.randint` is exclusive, so 15 itself is never drawn)

In [5]:
newArray = np.random.randint(10, 15, size=(4, 5))
newArray

array([[13, 11, 10, 13, 13],
       [11, 11, 12, 11, 10],
       [14, 10, 12, 12, 14],
       [11, 12, 14, 12, 10]])

- What is the shape of the array `newArray`? 
- How would you reshape the array to be a 2 x 10 array?

In [6]:
newArray.shape

(4, 5)

In [7]:
newArray.reshape((2,10))

array([[13, 11, 10, 13, 13, 11, 11, 12, 11, 10],
       [14, 10, 12, 12, 14, 11, 12, 14, 12, 10]])
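A reshape only works when the new shape holds exactly the same number of elements. The following sketch, which assumes the `newArray` values printed above, shows two related points: passing `-1` lets NumPy infer one dimension, and an incompatible shape raises an error.

```python
import numpy as np

# Values copied from the `newArray` output shown above
newArray = np.array([[13, 11, 10, 13, 13],
                     [11, 11, 12, 11, 10],
                     [14, 10, 12, 12, 14],
                     [11, 12, 14, 12, 10]])

# -1 asks NumPy to work out the missing dimension (20 elements / 2 rows = 10)
wide = newArray.reshape(2, -1)
print(wide.shape)

# 3 x 7 would need 21 elements, so this raises ValueError
try:
    newArray.reshape(3, 7)
except ValueError as err:
    print("reshape failed:", err)
```

Note that `reshape` returns a new array object and leaves the shape of `newArray` itself unchanged.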

When deciding what data to select for further processing, you often check it against some criteria first. For example, when integrating many data files, summary statistics such as the minimum and maximum are a quick way to check that the values fall within the expected range and make sense.

- For the array `newArray`, find the min, max, variance and mean.

In [8]:
print('Min :', newArray.min())
print('Max :', newArray.max())
print('Var :', newArray.var())
print('Mean:', newArray.mean())

Min : 10
Max : 14
Var : 1.7600000000000002
Mean: 11.8
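The same methods accept an `axis` argument when you want one statistic per row or per column instead of a single number. This sketch assumes the `newArray` values printed above; the variable names and the 10 to 14 range check are illustrative.

```python
import numpy as np

# Values copied from the `newArray` output shown above
newArray = np.array([[13, 11, 10, 13, 13],
                     [11, 11, 12, 11, 10],
                     [14, 10, 12, 12, 14],
                     [11, 12, 14, 12, 10]])

# axis=0 collapses the rows, giving one value per column
col_mins = newArray.min(axis=0)

# axis=1 collapses the columns, giving one value per row
row_means = newArray.mean(axis=1)

# Range check: are all values inside the expected interval?
in_range = ((newArray >= 10) & (newArray <= 14)).all()

print(col_mins)
print(row_means)
print(in_range)
```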


- What happens when we add both of the arrays (`ar` and `newArray`) together?

In [9]:
ar + newArray

ValueError: operands could not be broadcast together with shapes (5,3) (4,5) 

In [10]:
np.add(ar, newArray)

ValueError: operands could not be broadcast together with shapes (5,3) (4,5) 
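Both calls fail for the same reason: NumPy compares the shapes starting from the last dimension, and each pair of dimensions must either match or contain a 1. Here the trailing dimensions are 3 and 5, so the arrays cannot be broadcast. As a minimal sketch of shapes that do work, the example below adds a row and a column to `ar`; it assumes the `ar` values printed earlier, and `row` and `col` are made-up arrays for illustration.

```python
import numpy as np

# Values copied from the `ar` output shown earlier
ar = np.array([[5, 8, 3],
               [5, 4, 6],
               [3, 7, 4],
               [1, 6, 4],
               [7, 10, 4]])

row = np.array([100, 200, 300])     # shape (3,): matches the last dimension of ar
col = np.arange(5).reshape(5, 1)    # shape (5, 1): the 1 stretches across columns

plus_row = ar + row   # row is added to every row of ar
plus_col = ar + col   # col is added to every column of ar

print(plus_row.shape, plus_col.shape)

# Incompatible shapes still fail: trailing dimensions 3 and 5 differ
try:
    ar + np.zeros((4, 5))
except ValueError as err:
    print("broadcast failed:", err)
```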

- How would you make the 4 x 5 array `newArray` into a one-dimensional array of 20 elements?

In [11]:
newArray.flatten()

array([13, 11, 10, 13, 13, 11, 11, 12, 11, 10, 14, 10, 12, 12, 14, 11, 12,
       14, 12, 10])
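Note that the result has shape `(20,)`, which is not the same as a two-dimensional 1 x 20 array. The sketch below, assuming the `newArray` values printed above, compares `flatten` with two related calls.

```python
import numpy as np

# Values copied from the `newArray` output shown above
newArray = np.array([[13, 11, 10, 13, 13],
                     [11, 11, 12, 11, 10],
                     [14, 10, 12, 12, 14],
                     [11, 12, 14, 12, 10]])

flat_copy = newArray.flatten()       # 1-D, always a copy
flat_view = newArray.ravel()         # 1-D, a view when the memory layout allows it
one_row = newArray.reshape(1, -1)    # 2-D, a single row of 20 values

print(flat_copy.shape)
print(flat_view.shape)
print(one_row.shape)

# Changing the copy does not affect the original array
flat_copy[0] = -1
print(newArray[0, 0])
```

Use `flatten` when you need an independent copy, and `ravel` when a view is acceptable and you want to avoid copying the data.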