## Section 5a: Introduction to NumPy

Two main libraries are used during data analysis in Python: NumPy and Pandas. This notebook covers NumPy; Pandas is covered in the next notebook.

Most people use NumPy for its arrays. NumPy arrays are very similar to Python lists. However, they come with many built-in functions, which gives them much more flexibility than a plain Python list.
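As a small illustration of that flexibility, the sketch below doubles every value in a plain list and in an array. The variable names are made up for this example.

```python
import numpy as np

values = [1, 2, 3, 4]

# A plain list needs an explicit loop (or comprehension) for element-wise maths
doubled_list = [v * 2 for v in values]

# A NumPy array applies the operation to every element at once
doubled_array = np.array(values) * 2

print(doubled_list)    # [2, 4, 6, 8]
print(doubled_array)   # [2 4 6 8]
print(values * 2)      # list repetition, not arithmetic: [1, 2, 3, 4, 1, 2, 3, 4]
```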

### Section 5a.1 Creation of NumPy Arrays

#### Section 5a.1.1 Creation through a list

The creation of a NumPy array can be through a list:

In [None]:
import numpy as np
np.array([2,5.6,3,1,4])

array([2. , 5.6, 3. , 1. , 4. ])

In [None]:
[5,4.3,47,6]

[5, 4.3, 47, 6]

All elements in a NumPy array must have the same type. If any one of the elements is a floating-point (decimal) number, all the elements are cast to floating-point numbers.

In [None]:
np.array([5.555, 1, 2, 3])

array([5.555, 1.   , 2.   , 3.   ])

As the array above shows, every number has been cast to a floating-point number.
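One way to confirm the cast is to inspect the `dtype` attribute of the array. The sketch below uses illustrative variable names; the exact integer type depends on the platform, so only its kind is stated.

```python
import numpy as np

ints_only = np.array([1, 2, 3])
mixed = np.array([5.555, 1, 2, 3])

print(ints_only.dtype)  # an integer type (its width depends on the platform)
print(mixed.dtype)      # float64
```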

The type of the elements can also be set explicitly through the dtype argument.

In [None]:
np.array([1,2,3,4], dtype='float32')

array([1., 2., 3., 4.], dtype=float32)

In [None]:
temp_list = [4,2,5,6]

In [None]:
np.array(temp_list)

array([4, 2, 5, 6])

#### Section 5a.1.2 Creation through a function


In [None]:
np.zeros(5, dtype=float)

array([0., 0., 0., 0., 0.])

In [None]:
np.zeros((4,5))

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

In [None]:
np.zeros((4,5,2))

array([[[0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.]],

       [[0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.]],

       [[0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.]],

       [[0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.]]])

In [None]:
np.ones((2,5))

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

In [None]:
np.full((2,5), 1.23)

array([[1.23, 1.23, 1.23, 1.23, 1.23],
       [1.23, 1.23, 1.23, 1.23, 1.23]])

#### Section 5a.1.3 Creation through range


Create an array of values from 0 up to 20 (20 not included), where the difference between consecutive numbers is 2.5.

In [None]:
np.arange(0, 20, 2.5)

array([ 0. ,  2.5,  5. ,  7.5, 10. , 12.5, 15. , 17.5])

In [None]:
np.arange(0,22, 3)

array([ 0,  3,  6,  9, 12, 15, 18, 21])

Create an array of 10 values evenly spaced between 0 and 18.


In [None]:
# np.linspace(start, end, number of values)
np.linspace(0, 18, 10)

array([ 0.,  2.,  4.,  6.,  8., 10., 12., 14., 16., 18.])
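The two functions differ in how the range is described: `np.arange` takes a step and leaves out the stop value, while `np.linspace` takes a number of values and includes the stop value by default. A short side-by-side sketch:

```python
import numpy as np

stepped = np.arange(0, 18, 2)     # step of 2, the stop value 18 is excluded
spaced = np.linspace(0, 18, 10)   # 10 values, the stop value 18 is included

print(stepped)                    # [ 0  2  4  6  8 10 12 14 16]
print(spaced)                     # [ 0.  2.  4.  6.  8. 10. 12. 14. 16. 18.]
print(len(stepped), len(spaced))  # 9 10
```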

#### Section 5a.1.4 Creation through random function

Create a 2 by 5 array with random values between 0 and 1.

In [None]:
np.random.random((2,5))

array([[0.34871567, 0.73559236, 0.34755562, 0.81966324, 0.94359915],
       [0.4199247 , 0.67231091, 0.54005404, 0.37793906, 0.92725998]])

If you need the values in a different range, for example between 3 and 9:

In [None]:
upper = 9
lower = 3

np.random.random((2,5)) * (upper - lower) + lower

array([[5.99188055, 6.1219152 , 8.24782255, 4.6702928 , 8.09135305],
       [5.07252857, 5.32123876, 8.57970091, 4.36915448, 4.83145535]])
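The same scaling is also available through NumPy's `Generator` interface, where `uniform(low, high, size)` draws from the requested range directly. This is a sketch only; the seed value 0 is an arbitrary choice made here so that the run is repeatable.

```python
import numpy as np

lower, upper = 3, 9
rng = np.random.default_rng(0)  # seed chosen arbitrarily for reproducibility

# Manual scaling, as in the cell above
scaled = rng.random((2, 5)) * (upper - lower) + lower

# Built-in equivalent: samples from the half-open interval [lower, upper)
direct = rng.uniform(lower, upper, size=(2, 5))

print(scaled)
print(direct)
```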

Create a 2 by 5 array of values drawn from a normal distribution centered at 5 with a standard deviation of 7.

In [None]:
np.random.normal(5, 7, (2,5))

array([[ 5.04190189,  1.85409631, 14.00573054,  0.74693551,  0.18273862],
       [ 8.42970207, 15.25943114,  0.88419586,  3.28022678,  1.94899141]])
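With only ten values it is hard to see the mean and standard deviation in the output. A rough check, assuming a much larger sample and an arbitrary seed of 42, is to draw many values and compute both statistics:

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed, used only for repeatability
samples = rng.normal(loc=5, scale=7, size=100_000)

print(samples.mean())  # close to 5
print(samples.std())   # close to 7
```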

Create a 2 by 5 array of random integers between 0 and 10 (10 not included).

In [None]:
np.random.randint(0,10, (2,5))

array([[2, 0, 8, 6, 5],
       [9, 4, 0, 7, 4]])

Create an identity matrix.

In [None]:
np.eye(5)

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

### Section 5a.2 NumPy Arrays Fundamentals

#### Section 5a.2.1 NumPy Arrays Indexing

In [None]:
np.random.seed(5)

np_array0 = np.random.randint(10, size=5)
np_array1 = np.random.randint(10, size=(2,5))
np_array2 = np.random.randint(10, size=(2,5,3))


In [None]:
print(np_array0)
print(np_array1)
print(np_array2)

[3 6 6 0 9]
[[8 4 7 0 0]
 [7 1 5 7 0]]
[[[1 4 6]
  [2 9 9]
  [9 9 1]
  [2 7 0]
  [5 0 0]]

 [[4 4 9]
  [3 2 4]
  [6 9 3]
  [3 2 1]
  [5 7 4]]]


In [None]:
np_array0.ndim

1

In [None]:
temp_tuple = np_array1.shape

In [None]:
print(temp_tuple[0])
print(temp_tuple[1])

2
5


In [None]:
np_array0.size

5
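The three attributes are related: `ndim` is the number of axes, `shape` gives the length of each axis, and `size` is the product of those lengths. A small sketch with an array made up for this example:

```python
import numpy as np

arr = np.zeros((2, 5, 3))

print(arr.ndim)   # 3
print(arr.shape)  # (2, 5, 3)
print(arr.size)   # 30
```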

In [None]:
np_array0

array([3, 6, 6, 0, 9])

In [None]:
np_array0[0]

3

In [None]:
np_array1

array([[8, 4, 7, 0, 0],
       [7, 1, 5, 7, 0]])

In [None]:
np_array1[1,-4]

1

In [None]:
np_array1[1,2]

5
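Negative indices count from the end of an axis, so in a row of length 5 the index -4 points at the same element as index 1. The sketch below hard-codes the values of `np_array1` so that it runs on its own.

```python
import numpy as np

arr = np.array([[8, 4, 7, 0, 0],
                [7, 1, 5, 7, 0]])

# Row 1: index -4 and index 1 refer to the same position
print(arr[1, -4], arr[1, 1])   # 1 1

# Row 0: index -1 is the last element, the same as index 4
print(arr[0, -1], arr[0, 4])   # 0 0
```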

7

0

#### Section 5a.2.2 NumPy Arrays Slicing

The colon (:) character is used to access a slice of the array. The slice notation has three sections:

```
some_array[start:stop:step]
```
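Each of the three parts is optional. The sketch below, using an array made up for this example, shows a few combinations and also that a basic slice is a view on the original data rather than a copy.

```python
import numpy as np

arr = np.arange(10)   # [0 1 2 3 4 5 6 7 8 9]

print(arr[2:8:3])     # [2 5]
print(arr[:3])        # [0 1 2]
print(arr[7:])        # [7 8 9]

# A basic slice is a view: writing through it changes the original array
view = arr[:3]
view[0] = 100
print(arr[0])         # 100
```

If an independent copy is needed, `arr[:3].copy()` can be used instead.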

In [None]:
np_array0

array([3, 6, 6, 0, 9])

In [None]:
np_array0[:2]

array([3, 6])

In [None]:
np_array0[0:4:2]

array([3, 6])

Every other element, starting at index 0. Hence we get indexes 0, 2 and 4.

In [None]:
np_array0[0::2]

array([3, 6, 9])

Reverse the elements in the array:

In [None]:
np_array0[::-1]

array([9, 0, 6, 6, 3])

In [None]:
np_array0[::-2]

array([9, 6, 3])

Every other element in reverse order, starting from index 3:

In [None]:
np_array0

array([3, 6, 6, 0, 9])

In [None]:
np_array0[3::-2]

array([0, 6])

Multi-dimensional arrays take one slice per axis, separated by commas. The first two rows and the first three columns of `np_array1` are:

In [None]:
np_array1[:2,:3]

array([[8, 4, 7],
       [7, 1, 5]])


Every other column

In [None]:
np_array1

array([[8, 4, 7, 0, 0],
       [7, 1, 5, 7, 0]])

In [None]:
np_array1[:2,::2]

array([[8, 7, 0],
       [7, 5, 0]])

### Section 5a.3 Combining NumPy Arrays

In [None]:
x = np_array0
y = -np_array0[::-1]

In [None]:
x

array([3, 6, 6, 0, 9])

In [None]:
y

array([-9,  0, -6, -6, -3])

In [None]:
np.concatenate([x,y])

array([ 3,  6,  6,  0,  9, -9,  0, -6, -6, -3])

In [None]:
np.hstack([x,y])

array([ 3,  6,  6,  0,  9, -9,  0, -6, -6, -3])

In [None]:
np.vstack([x,y])

array([[ 3,  6,  6,  0,  9],
       [-9,  0, -6, -6, -3]])

In [None]:
np_array1

array([[8, 4, 7, 0, 0],
       [7, 1, 5, 7, 0]])

In [None]:
np.concatenate([np_array1, np_array1])

array([[8, 4, 7, 0, 0],
       [7, 1, 5, 7, 0],
       [8, 4, 7, 0, 0],
       [7, 1, 5, 7, 0]])

In [None]:
np.concatenate([np_array1, np_array1], axis=1)

array([[8, 4, 7, 0, 0, 8, 4, 7, 0, 0],
       [7, 1, 5, 7, 0, 7, 1, 5, 7, 0]])

In [None]:
print(np_array0)
print(np_array1)

[3 6 6 0 9]
[[8 4 7 0 0]
 [7 1 5 7 0]]


In [None]:
np.vstack([np_array0, np_array1])

array([[3, 6, 6, 0, 9],
       [8, 4, 7, 0, 0],
       [7, 1, 5, 7, 0]])

In [None]:

np_array0.shape

(5,)

In [None]:
x_1array = np_array1
x_2array = np_array1

print(x_1array.shape)
print(x_2array.shape)

# np_array1.shape == np_array1.shape
if x_1array.shape[1] == x_2array.shape[1]:
  print(np.vstack([x_1array, x_2array]))


(2, 5)
(2, 5)
[[8 4 7 0 0]
 [7 1 5 7 0]
 [8 4 7 0 0]
 [7 1 5 7 0]]


In [None]:
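# sensor_list is not defined anywhere in this notebook; a placeholder is assumed here
sensor_list = [1, 2, 3]
curr_sensor = []
for i in sensor_list:
  curr_sensor.append(i)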

In [None]:
# Start from a flat list; it is reshaped into a column, [[99],[99]], further down
array_to_append = [99,99]
array_to_append = np.array(array_to_append)

In [None]:
np_array1

array([[8, 4, 7, 0, 0],
       [7, 1, 5, 7, 0]])

In [None]:
x_size = array_to_append.shape[0]
array_to_append = array_to_append.reshape(x_size,1)

In [None]:
array_to_append

array([[99],
       [99]])

In [None]:
np.hstack([np_array1, array_to_append])

array([[ 8,  4,  7,  0,  0, 99],
       [ 7,  1,  5,  7,  0, 99]])
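Two shorter ways to add a column are sketched below, with the values of `np_array1` hard-coded so the example runs on its own: `reshape(-1, 1)` lets NumPy work out the number of rows, and `np.column_stack` treats a 1-D array as a column without any reshaping.

```python
import numpy as np

arr = np.array([[8, 4, 7, 0, 0],
                [7, 1, 5, 7, 0]])
new_column = np.array([99, 99])

# reshape(-1, 1): the -1 asks NumPy to infer the number of rows
via_hstack = np.hstack([arr, new_column.reshape(-1, 1)])

# np.column_stack turns the 1-D array into a column before stacking
via_column_stack = np.column_stack([arr, new_column])

print(via_column_stack)
print(np.array_equal(via_hstack, via_column_stack))  # True
```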

### Section 5a.4 NumPy Functions

In [None]:
np_array4 = np.arange(1,5)
print(np_array4)

[1 2 3 4]


In [None]:
print(np_array4 // 2)
# print(np_array4 + 5)
# print(np_array4 - 5)
# print(np_array4 * 2)
# print(np_array4 / 2)

[0 1 1 2]


In [None]:
print(np_array4)
print(np.exp(np_array4))
print(np.exp2(np_array4))
print(np.power(2, np_array4))

[1 2 3 4]
[ 2.71828183  7.3890561  20.08553692 54.59815003]
[ 2.  4.  8. 16.]
[ 2  4  8 16]


In [None]:
print(np.log(np_array4))
print(np.log2(np_array4))
print(np.log10(np_array4))

[0.         0.69314718 1.09861229 1.38629436]
[0.        1.        1.5849625 2.       ]
[0.         0.30103    0.47712125 0.60205999]


In [None]:
big_array = np.random.rand(1000000)
%timeit sum(big_array)
%timeit np.sum(big_array)

97.2 ms ± 27.8 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
397 µs ± 8.77 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


In [None]:
%timeit min(big_array)
%timeit np.min(big_array)

60.3 ms ± 1.71 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
470 µs ± 21.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
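`%timeit` is an IPython magic and only works inside a notebook or IPython shell. Outside of those, a rough comparison can be sketched with the standard-library `timeit` module; the array size and the number of runs below are arbitrary choices made to keep the example quick.

```python
import timeit
import numpy as np

big_array = np.random.rand(100_000)  # smaller than above to keep the run short

# Time 5 runs of each version
builtin_time = timeit.timeit(lambda: sum(big_array), number=5)
numpy_time = timeit.timeit(lambda: np.sum(big_array), number=5)

print(f"built-in sum: {builtin_time:.4f} s for 5 runs")
print(f"np.sum:       {numpy_time:.4f} s for 5 runs")
```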


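In [None]:
np_array1

array([[8, 4, 7, 0, 0],
       [7, 1, 5, 7, 0]])

In [None]:
np_array1.shape

(2, 5)

In [None]:
np.min(np_array1, axis=0)

array([7, 1, 5, 0, 0])

In [None]:
np.min(np_array1, axis=1)

array([0, 0])

In [None]:
np.min(np_array1)

0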

The following table provides a list of useful aggregation functions available in NumPy:

|Function Name      |   NaN-safe Version  | Description                                   |
|-------------------|---------------------|-----------------------------------------------|
| ``np.sum``        | ``np.nansum``       | Compute sum of elements                       |
| ``np.prod``       | ``np.nanprod``      | Compute product of elements                   |
| ``np.mean``       | ``np.nanmean``      | Compute mean of elements                      |
| ``np.std``        | ``np.nanstd``       | Compute standard deviation                    |
| ``np.var``        | ``np.nanvar``       | Compute variance                              |
| ``np.min``        | ``np.nanmin``       | Find minimum value                            |
| ``np.max``        | ``np.nanmax``       | Find maximum value                            |
| ``np.argmin``     | ``np.nanargmin``    | Find index of minimum value                   |
| ``np.argmax``     | ``np.nanargmax``    | Find index of maximum value                   |
| ``np.median``     | ``np.nanmedian``    | Compute median of elements                    |
| ``np.percentile`` | ``np.nanpercentile``| Compute rank-based statistics of elements     |
| ``np.any``        | N/A                 | Evaluate whether any elements are true        |
| ``np.all``        | N/A                 | Evaluate whether all elements are true        |
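The NaN-safe versions matter as soon as an array has missing values: the regular functions return `nan`, while the NaN-safe ones skip the missing entries. A sketch with made-up readings:

```python
import numpy as np

readings = np.array([1.0, 2.0, np.nan, 4.0])  # made-up data with one missing value

print(np.sum(readings))      # nan
print(np.nansum(readings))   # 7.0
print(np.nanmean(readings))  # about 2.33, the mean of the three valid values
```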

### Section 5a.5 Sorting Functions

In [None]:
medium_array = np.random.rand(10000)

In [None]:
%timeit np.sort(medium_array)

688 µs ± 28.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


107 ms ± 6.87 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


103 ms ± 19.3 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [None]:
np.sort(np_array1, axis=0)

array([[7, 1, 5, 0, 0],
       [8, 4, 7, 7, 0]])

In [None]:
np.sort(np_array1, axis=1)

array([[0, 0, 4, 7, 8],
       [0, 1, 5, 7, 7]])
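`np.sort` returns a sorted copy and leaves the original array alone. When the positions of the sorted values are needed rather than the values themselves, `np.argsort` returns the indices that would sort the array. A sketch using the values of `np_array0`, with a stable sort requested so that equal values keep their original order:

```python
import numpy as np

arr = np.array([3, 6, 6, 0, 9])

sorted_copy = np.sort(arr)              # sorted copy; arr itself is unchanged
order = np.argsort(arr, kind="stable")  # indices that would sort arr

print(sorted_copy)   # [0 3 6 6 9]
print(order)         # [3 0 1 2 4]
print(arr[order])    # [0 3 6 6 9]
```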

### Section 5a.6 Indexing

In [None]:
np_array1

array([[8, 4, 7, 0, 0],
       [7, 1, 5, 7, 0]])

In [None]:
np_array1 < 3

array([[False, False, False,  True,  True],
       [False,  True, False, False,  True]])

In [None]:
np.argwhere(np_array1 < 3)

array([[0, 3],
       [0, 4],
       [1, 1],
       [1, 4]])
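The boolean array produced by a comparison can itself be used as an index, which selects the matching values instead of their positions. The sketch below hard-codes the values of `np_array1`; replacing the matches with -1 is an arbitrary choice for illustration.

```python
import numpy as np

arr = np.array([[8, 4, 7, 0, 0],
                [7, 1, 5, 7, 0]])

mask = arr < 3

print(arr[mask])                # [0 0 1 0]
print(mask.sum())               # 4 (True counts as 1)
print(np.where(mask, -1, arr))  # values below 3 replaced by -1
```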
