# Big Data Real-Time Analytics with Python and Spark

## Chapter 2 - Data Manipulation in Python with Numpy
- Documentation: https://numpy.org/

In [1]:
# Python version
from platform import python_version
print('The version used in this notebook is: ', python_version())

The version used in this notebook is:  3.8.8


In [2]:
# Import Numpy module
import numpy as np

In [3]:
# package version used in this notebook
%reload_ext watermark
%watermark -a "Bianca Amorim" --iversion

Author: Bianca Amorim

numpy: 1.20.1



### Arrays Transposition and Reshaping

In [6]:
array1 = np.array([[65, 23, 19], [41, 87, 10]]); array1

array([[65, 23, 19],
       [41, 87, 10]])

In [7]:
# T is an attribute (a property), not a method, so it takes no parentheses
array1.T

array([[65, 41],
       [23, 87],
       [19, 10]])

In [8]:
# You can also use np.transpose, the equivalent NumPy function
np.transpose(array1)

array([[65, 41],
       [23, 87],
       [19, 10]])

> Transposition is used constantly when building artificial neural networks, where matrix dimensions must line up before the matrices can be multiplied.
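
As a rough illustration of why transposition matters for matrix products, the sketch below multiplies the array from above by a second matrix. The `weights` values are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import numpy as np

# Same values as array1 above: shape (2, 3)
inputs = np.array([[65, 23, 19], [41, 87, 10]])

# Hypothetical weight matrix, also shape (2, 3)
weights = np.array([[1, 0, 2], [0, 1, 1]])

# inputs @ weights would raise an error: (2, 3) @ (2, 3) has mismatched inner dimensions.
# Transposing weights gives (2, 3) @ (3, 2), which is a valid product of shape (2, 2).
product = inputs @ weights.T

# For a 2-D array, .T returns a view, so no data is copied
shares_memory = np.shares_memory(weights, weights.T)

print(product)
print(shares_memory)
```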

### Broadcasting

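Broadcasting is how NumPy performs element-wise operations on arrays whose shapes differ but are compatible. A dimension of size 1 is virtually stretched to match the other array, without copying the data. In the example below I add an array to its own transpose: a (3x1) column plus a (1x3) row pairs every element with every other element, so the result grows into a (3x3) structure.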

In [9]:
array2 = np.array([[33, 34, 35]])

In [10]:
# array (1x3)
array2

array([[33, 34, 35]])

In [11]:
# array (3x1)
array2.T

array([[33],
       [34],
       [35]])

In [12]:
# array (3x1) + array (1x3) = array (3x3)
array2.T + array2

array([[66, 67, 68],
       [67, 68, 69],
       [68, 69, 70]])
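
To make the stretching step visible, here is a minimal sketch that expands both operands to (3x3) with `np.broadcast_to` and checks that the explicit sum matches the broadcast sum. The variable names are illustrative only.

```python
import numpy as np

row = np.array([[33, 34, 35]])   # shape (1, 3)
col = row.T                      # shape (3, 1)

# Expand each operand to the common shape (3, 3); these are read-only views
row_stretched = np.broadcast_to(row, (3, 3))
col_stretched = np.broadcast_to(col, (3, 3))

explicit_sum = col_stretched + row_stretched
broadcast_sum = col + row        # NumPy does the stretching implicitly

print(row_stretched)
print(col_stretched)
print(np.array_equal(explicit_sum, broadcast_sum))
```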

### Converting matrices into vectors

In [13]:
# Arrays of arrays (matrix)
array3 = np.array([np.arange(1, 6), np.arange(10, 15)])

In [14]:
array3

array([[ 1,  2,  3,  4,  5],
       [10, 11, 12, 13, 14]])

In [18]:
array3.T

array([[ 1, 10],
       [ 2, 11],
       [ 3, 12],
       [ 4, 13],
       [ 5, 14]])

In [15]:
array3.ndim

2

In [16]:
# Flatten all the elements of the matrix into a one-dimensional structure
array3.ravel()

array([ 1,  2,  3,  4,  5, 10, 11, 12, 13, 14])

In [17]:
array3.T.ravel()

array([ 1, 10,  2, 11,  3, 12,  4, 13,  5, 14])

> The **ravel()** method takes all the elements of an array and puts them into a one-dimensional structure.
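
One detail worth knowing: `ravel()` returns a view of the original data when it can, while `flatten()` always returns a copy. The sketch below, with illustrative variable names, shows the difference on the same matrix and also that `order='F'` reads the elements column by column.

```python
import numpy as np

matrix = np.array([np.arange(1, 6), np.arange(10, 15)])

flat_view = matrix.ravel()     # a view here, because matrix is contiguous in memory
flat_copy = matrix.flatten()   # always an independent copy

flat_view[0] = 100             # also changes matrix[0, 0]
flat_copy[1] = 200             # leaves matrix untouched

# Column-major order gives the same sequence as matrix.T.ravel()
column_major = matrix.ravel(order='F')

print(matrix)
print(column_major)
```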

**In many situations it is not practical to pass the matrix directly to an algorithm**, so I load my data into a matrix, apply transformations, make the necessary adjustments and, when it is time to run the ML algorithm, I convert the matrix into a vector. I simply change the format of that structure completely.

**Computer vision** is one of the hottest areas in AI today. It is used in face recognition systems, autonomous car systems, and many other applications. When we process an image in a face recognition system, we convert the image, which is a matrix, into a vector.
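
A minimal sketch of that conversion, assuming a made-up batch of three 4x4 grayscale "images" filled with placeholder values (no real image data is loaded here):

```python
import numpy as np

# Hypothetical batch: 3 images, each 4 x 4 pixels
images = np.arange(3 * 4 * 4).reshape(3, 4, 4)

# Flatten each image into a vector of 16 values; -1 lets NumPy infer the length
vectors = images.reshape(len(images), -1)

# The original layout can be recovered by reshaping back
restored = vectors.reshape(images.shape)

print(vectors.shape)
```

Each row of `vectors` is one image laid out as a flat vector, which is the layout that many ML algorithms expect as input.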

### Logical Operators

In [19]:
np.random.seed(100)
array4 = np.random.randint(1, 10, size = (4, 4))

In [20]:
array4

array([[9, 9, 4, 8],
       [8, 1, 5, 3],
       [6, 3, 3, 3],
       [2, 1, 9, 5]])

In [21]:
# Is any value in array4 divisible by 7 (remainder equal to 0)?
np.any((array4 % 7) == 0)

False

In [22]:
# Are all the values inside array4 less than 11?
np.all(array4 < 11)

True

To find the position of the elements that satisfy a condition, I can use **where()**.

In [23]:
# Return the index of each row whose element in column 1 is greater than 2 and less than 9
np.where(np.logical_and(array4[:,1] > 2, array4[:,1] < 9))

(array([2]),)

In [26]:
# Same result as the previous cell, computed as the intersection of the two sets of indices
np.intersect1d(np.where(array4[:,1] > 2), np.where(array4[:,1] < 9))

array([2])

In [27]:
array4[2]

array([6, 3, 3, 3])
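
The same selection can also be sketched with a boolean mask, which returns the matching rows directly instead of their indices. The array values are typed in by hand here so the example does not depend on the random seed.

```python
import numpy as np

# Same values as array4 above
data = np.array([[9, 9, 4, 8],
                 [8, 1, 5, 3],
                 [6, 3, 3, 3],
                 [2, 1, 9, 5]])

# & combines two boolean arrays element-wise; the parentheses are required
mask = (data[:, 1] > 2) & (data[:, 1] < 9)

# Indexing with the mask keeps only the rows where it is True
matching_rows = data[mask]

print(mask)
print(matching_rows)
```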

### Arrays Reshaping and sorting

In [39]:
array5 = np.arange(1, 16); array5

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15])

In [41]:
array5.reshape(3, 5)

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15]])
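
Reshaping only works when the new shape holds exactly the same number of elements. As a small sketch of that rule, the code below lets NumPy infer one dimension with `-1` and then tries an incompatible shape.

```python
import numpy as np

values = np.arange(1, 16)            # 15 elements

# -1 asks NumPy to infer the missing dimension: 15 / 3 = 5 columns
three_rows = values.reshape(3, -1)

# 15 elements cannot fill a 4 x 4 matrix, so reshape raises a ValueError
reshape_failed = False
try:
    values.reshape(4, 4)
except ValueError:
    reshape_failed = True

print(three_rows.shape)
print(reshape_failed)
```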

In [32]:
array6 = np.array([[3,2], [10, -1]]); array6

array([[ 3,  2],
       [10, -1]])

In [33]:
# axis = 1 sorts the values within each row (in place)
array6.sort(axis = 1)

In [34]:
array6

array([[ 2,  3],
       [-1, 10]])

In [35]:
array7 = np.array([[3,2], [10, -1]])

In [36]:
array7

array([[ 3,  2],
       [10, -1]])

In [37]:
# axis = 0 sorts the values within each column (in place)
array7.sort(axis = 0)

In [38]:
# The first column does not change because it is already sorted
array7

array([[ 3, -1],
       [10,  2]])
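
The **sort()** method changes the array in place. When the original must be kept, `np.sort` returns a sorted copy instead, and `np.argsort` gives the order needed to rearrange whole rows. A short sketch with the same values as above:

```python
import numpy as np

original = np.array([[3, 2], [10, -1]])

# np.sort returns a new array and leaves original unchanged
sorted_rows = np.sort(original, axis=1)   # sort within each row
sorted_cols = np.sort(original, axis=0)   # sort within each column

# argsort returns the indices that would sort column 1,
# which can be used to reorder entire rows by that column
order = np.argsort(original[:, 1])
rows_by_second_column = original[order]

print(sorted_rows)
print(sorted_cols)
print(rows_by_second_column)
```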

### The End