# Python for Data Science course: PART 2

<img src="images/numpy.png" width="500">

Numpy is the core library for scientific computing in Python. It provides a high-performance multidimensional array object, and tools for working with these arrays. If you are already familiar with MATLAB, you might find this tutorial useful to get started with Numpy.
## Table of contents:
* [Numpy Arrays](#numpyArrays)
   * [Datatypes](#datatypes)
   * [Initialization](#initialization)
   * [Creating arrays](#creation)
   * [Basic Instructions](#basicInstructions)
* [Array indexing](#Arrayindexing)
   * [Slicing](#slicing)
   * [Integer indexing](#intIndexing)
   * [Boolean indexing](#boolIndexing)
* [Array Operations](#operations)
   * [Arithmetic](#arithmeticOperations)
   * [Comparison](#comparison)
   * [Aggregation](#aggregation)
   * [Statistical](#stat)
   * [Copying and Sorting](#copyingandsorting)
* [Array Manipulation](#manipulation)

In [120]:
# importing the library and giving a shortcut to use in the code
import numpy as np

## 1.- Numpy Arrays<a class="anchor" id="numpyArrays"></a>

A numpy array is a grid of values, all of the same type, and is indexed by a tuple of nonnegative integers. The number of dimensions is the rank of the array; the shape of an array is a tuple of integers giving the size of the array along each dimension.
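To make the "all of the same type" rule concrete, here is a minimal sketch (the names `mixed` and `grid` are illustrative and not part of the original notebook). It shows NumPy choosing one common dtype for mixed inputs, and how rank and shape relate.

```python
import numpy as np

# Mixing integers and floats: NumPy picks one common dtype for the whole array
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)             # float64

# A 2 x 3 array has rank 2, and its shape lists the size along each dimension
grid = np.array([[1, 2, 3],
                 [4, 5, 6]])
print(grid.ndim, grid.shape)   # 2 (2, 3)
```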

### 1.0-. Datatypes <a class="anchor" id="datatypes"></a>

In [4]:
np.int64       # Signed 64-bit integer type
np.float32     # Single-precision floating point (np.float64 is the standard double precision)
np.complex128  # Complex number made of two 64-bit floats
np.bool_       # Boolean type storing True and False values
np.object_     # Python object type
np.bytes_      # Fixed-length byte-string type (formerly np.string_)
np.str_        # Fixed-length unicode string type (formerly np.unicode_)

numpy.str_
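
The practical difference between these types is how many bytes each element takes and how much precision it keeps. The following sketch (variable names are illustrative) prints the element size of a few dtypes and shows that 0.1 is stored differently in single and double precision.

```python
import numpy as np

# itemsize is the number of bytes used by one element of that dtype
for dt in (np.int64, np.float32, np.float64, np.complex128):
    print(np.dtype(dt).name, np.dtype(dt).itemsize)

# float32 keeps fewer significant digits than float64,
# so the same decimal literal is rounded differently
single = np.float32(0.1)
double = np.float64(0.1)
print(float(single) == float(double))   # False
```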

### 1.1-. Initialization <a class="anchor" id="initialization"></a>
We can initialize numpy arrays from nested Python lists, and access elements using square brackets:
#### 1.1.1-. Numpy 1D

In [5]:
a = np.array([1, 2, 3])   # Create a rank 1 array

In [6]:
# Shows the array type
type(a)

numpy.ndarray

In [7]:
# Shows the array shape
a.shape

(3,)

In [8]:
# Change an element of the array
a[0] = 5
print(a)

[5 2 3]


#### 1.1.2-. Numpy 2D

In [9]:
b = np.array([[1,2,3],
              [4,5,6]])    # Create a rank 2 array

In [10]:
# Shows the array type
type(b)

numpy.ndarray

In [11]:
b.shape

(2, 3)

### 1.2-. Creating arrays in Numpy:<a class="anchor" id="creation"></a>

Create an array of all zeros

In [12]:
a = np.zeros((2,2))   # Create an array of all zeros
print(a)              # Prints "[[ 0.  0.]
                      #          [ 0.  0.]]"

[[0. 0.]
 [0. 0.]]


Create an array of all ones

In [13]:
b = np.ones((1,2))    # Create an array of all ones
print(b)              # Prints "[[ 1.  1.]]"

[[1. 1.]]


Create a constant array

In [14]:
c = np.full((2,2), 7)  # Create a constant array
print(c)               # Prints "[[7 7]
                       #          [7 7]]"

[[7 7]
 [7 7]]
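
The result above is an integer array because `np.full` takes its dtype from the fill value unless told otherwise. A short sketch of the three cases (the names `ints`, `floats` and `forced` are illustrative):

```python
import numpy as np

ints = np.full((2, 2), 7)                       # integer fill value -> integer array
floats = np.full((2, 2), 7.0)                   # float fill value -> float64 array
forced = np.full((2, 2), 7, dtype=np.float32)   # dtype given explicitly

print(ints.dtype.kind, floats.dtype, forced.dtype)   # i float64 float32
```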


Create a 2x2 identity matrix

In [15]:
d = np.eye(2)         # Create a 2x2 identity matrix
print(d)              # Prints "[[ 1.  0.]
                      #          [ 0.  1.]]"

[[1. 0.]
 [0. 1.]]


Create an array filled with random values

In [16]:
e = np.random.random((2,2))  # Create an array filled with random values
print(e)                     # Might print "[[ 0.91940167  0.08143941]
                             #               [ 0.68744134  0.87236687]]"

[[0.56239089 0.56756568]
 [0.94048688 0.45963114]]


### 1.3-. Basic Instructions:<a class="anchor" id="basicInstructions"></a>

In [17]:
a = np.array([[1,2,3,4], 
              [5,6,7,8], 
              [9,10,11,12]])

Knowing the shape of the array.

In [18]:
a.shape

(3, 4)

Knowing the length of the array

In [19]:
len(a)

3

Knowing the number of dimensions of the array

In [18]:
a.ndim

2

Knowing the total size of the array

In [20]:
a.size

12

Knowing the type of the array

In [21]:
a.dtype

dtype('int32')

In [22]:
a.dtype.name

'int32'

Changing the array data type (`astype` returns a new array)

In [23]:
a.astype('float32')

array([[ 1.,  2.,  3.,  4.],
       [ 5.,  6.,  7.,  8.],
       [ 9., 10., 11., 12.]], dtype=float32)
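
Note that `astype` does not modify the array it is called on. The sketch below (with an illustrative name `as_float`) checks this, and also shows what happens to the fractional part when converting floats to integers.

```python
import numpy as np

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12]])

as_float = a.astype('float32')   # new array with the requested dtype
print(as_float.dtype)            # float32
print(a.dtype.kind)              # i: the original is still an integer array

# Converting floats to integers drops the fractional part (truncates toward zero)
truncated = np.array([1.9, -1.9]).astype('int64')
print(truncated)                 # [ 1 -1]
```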

## 2.- Array indexing<a class="anchor" id="Arrayindexing"></a>

### 2.1.- Slicing: <a class="anchor" id="slicing"></a>
Slicing is an operation to extract a subset of elements from an array.
Similar to Python lists, numpy arrays can be sliced. Since arrays may be multidimensional, you must specify a slice for each dimension of the array:

In [28]:
#Create the following rank 2 array with shape (3, 4)
# [[ 1  2  3  4]
#  [ 5  6  7  8]
#  [ 9 10 11 12]]
a = np.array([[1,2,3,4], 
              [5,6,7,8], 
              [9,10,11,12]])
a

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [29]:
# Use slicing to pull out the subarray consisting of the first 2 rows
# and columns 1 and 2; b is the following array of shape (2, 2):
# [[2 3]
#  [6 7]]
b = a[:2, 1:3]
b

array([[2, 3],
       [6, 7]])

In [36]:
# A slice of an array is a view into the same data, so modifying it
# will modify the original array.

print(a[0, 1])   # Prints "2"

2


In [31]:
print(a)

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]


In [34]:
b[0, 0] = 77     # b[0, 0] is the same piece of data as a[0, 1]

In [35]:
print(a[0, 1])   # Prints "77"
print(a)

77
[[ 1 77  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
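
Two details of slicing are easy to miss: mixing an integer index with a slice lowers the rank of the result, and an explicit `.copy()` is needed when the original must stay untouched. A minimal sketch, using illustrative names (`row_r1`, `row_r2`, `view`, `safe`):

```python
import numpy as np

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12]])

row_r1 = a[1, :]     # integer index + slice -> rank 1, shape (4,)
row_r2 = a[1:2, :]   # slice + slice -> rank 2, shape (1, 4)
print(row_r1.shape, row_r2.shape)

view = a[:2, 1:3]            # basic slice: shares memory with a
safe = a[:2, 1:3].copy()     # explicit copy: independent of a
safe[0, 0] = 77              # does not touch a
print(np.shares_memory(a, view), np.shares_memory(a, safe))   # True False
print(a[0, 1])               # 2
```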


### 2.2.- Integer indexing:<a class="anchor" id="intIndexing"></a>
Integer array indexing lets you construct arbitrary arrays using the data from another array:

In [37]:
a = np.array([[1,2], 
              [3, 4], 
              [5, 6]])

In [38]:
# An example of integer array indexing.
# The returned array will have shape (3,)
print(a[[0, 1, 2], [0, 1, 0]])  # Prints "[1 4 5]"

[1 4 5]


In [39]:
# The above example of integer array indexing is equivalent to this:
print(np.array([a[0, 0], a[1, 1], a[2, 0]]))  # Prints "[1 4 5]"

[1 4 5]


In [40]:
# When using integer array indexing, you can reuse the same
# element from the source array:
print(a[[0, 0], [1, 1]])  # Prints "[2 2]"

[2 2]


In [41]:
# Equivalent to the previous integer array indexing example
print(np.array([a[0, 1], a[0, 1]]))  # Prints "[2 2]"

[2 2]


Selecting or mutating one element from each row of a matrix:

In [43]:
a = np.array([[1,2,3], 
              [4,5,6], 
              [7,8,9], 
              [10, 11, 12]])
print(a)

[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]


In [44]:
# Create an array of indices
indices = np.array([0, 2, 0, 1])

# Select one element from each row of a using the index array `indices`
print(a[np.arange(4), indices])  # Prints "[ 1  6  7 11]"

[ 1  6  7 11]


In [45]:
# Mutate one element from each row of a using the index array `indices`
a[np.arange(4), indices] += 10

print(a)  # prints "[[11  2  3]
          #          [ 4  5 16]
          #          [17  8  9]
          #          [10 21 12]]"

[[11  2  3]
 [ 4  5 16]
 [17  8  9]
 [10 21 12]]
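
Unlike slicing, reading with integer array indexing produces a new array rather than a view. The following sketch (the name `picked` is illustrative) shows that changing the result leaves the source untouched.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4],
              [5, 6]])

picked = a[[0, 1, 2], [0, 1, 0]]   # integer array indexing -> new array [1 4 5]
picked[0] = 100                    # changes only the copy

print(picked)    # [100   4   5]
print(a[0, 0])   # 1, the source array is unchanged
```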


### 2.3.- Boolean array indexing: <a class="anchor" id="boolIndexing"></a>
Boolean array indexing lets you pick out arbitrary elements of an array. Frequently this type of indexing is used to select the elements of an array that satisfy some condition.

In [47]:
a = np.array([[1,2], 
              [3, 4], 
              [5, 6]])
print(a)

[[1 2]
 [3 4]
 [5 6]]


In [48]:
bool_idx = (a > 2)   # Find the elements of a that are bigger than 2;
                     # this returns a numpy array of Booleans of the same
                     # shape as a, where each slot of bool_idx tells
                     # whether that element of a is > 2.

print(bool_idx)      # Prints "[[False False]
                     #          [ True  True]
                     #          [ True  True]]"

[[False False]
 [ True  True]
 [ True  True]]


In [50]:

# We use boolean array indexing to construct a rank 1 array
# consisting of the elements of a corresponding to the True values
# of bool_idx
print(a[bool_idx])  # Prints "[3 4 5 6]"

[3 4 5 6]


In [51]:
# We can do all of the above in a single concise statement:
print(a[a > 2])     # Prints "[3 4 5 6]"

[3 4 5 6]
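
Conditions can also be combined, and a mask can be used to assign values. A small sketch with illustrative names (`even_and_big`, `clipped`):

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4],
              [5, 6]])

# Combine conditions with & (and) and | (or); each condition needs parentheses
even_and_big = a[(a > 2) & (a % 2 == 0)]
print(even_and_big)   # [4 6]

# A boolean mask can also appear on the left-hand side of an assignment
clipped = a.copy()
clipped[clipped > 4] = 4
print(clipped)        # [[1 2]
                      #  [3 4]
                      #  [4 4]]
```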


## 3.- Array Operations<a class="anchor" id="operations"></a>

In [52]:
a = np.array([[1, 2, 3], 
              [4, 5, 6], 
              [7, 8, 9]])
b = np.array([[1, 2, 1], 
              [3, 4, 1], 
              [5, 6, 1]])

### 3.1.- Arithmetic operations<a class="anchor" id="arithmeticOperations"></a>

Subtraction

In [53]:
a-b

array([[0, 0, 2],
       [1, 1, 5],
       [2, 2, 8]])

In [54]:
np.subtract(a, b)

array([[0, 0, 2],
       [1, 1, 5],
       [2, 2, 8]])

Addition

In [55]:
a+b

array([[ 2,  4,  4],
       [ 7,  9,  7],
       [12, 14, 10]])

In [56]:
np.add(a, b)

array([[ 2,  4,  4],
       [ 7,  9,  7],
       [12, 14, 10]])

Division

In [57]:
a/b

array([[1.        , 1.        , 3.        ],
       [1.33333333, 1.25      , 6.        ],
       [1.4       , 1.33333333, 9.        ]])

In [58]:
np.divide(a, b)

array([[1.        , 1.        , 3.        ],
       [1.33333333, 1.25      , 6.        ],
       [1.4       , 1.33333333, 9.        ]])

Multiplication

In [59]:
a*b

array([[ 1,  4,  3],
       [12, 20,  6],
       [35, 48,  9]])

In [60]:
np.multiply(a, b)

array([[ 1,  4,  3],
       [12, 20,  6],
       [35, 48,  9]])

Exponential (e raised to each element)

In [61]:
np.exp(a)

array([[2.71828183e+00, 7.38905610e+00, 2.00855369e+01],
       [5.45981500e+01, 1.48413159e+02, 4.03428793e+02],
       [1.09663316e+03, 2.98095799e+03, 8.10308393e+03]])

Square root

In [62]:
np.sqrt(a)

array([[1.        , 1.41421356, 1.73205081],
       [2.        , 2.23606798, 2.44948974],
       [2.64575131, 2.82842712, 3.        ]])

Element-wise sines

In [63]:
np.sin(a)

array([[ 0.84147098,  0.90929743,  0.14112001],
       [-0.7568025 , -0.95892427, -0.2794155 ],
       [ 0.6569866 ,  0.98935825,  0.41211849]])

Element-wise cosines

In [64]:
np.cos(a)

array([[ 0.54030231, -0.41614684, -0.9899925 ],
       [-0.65364362,  0.28366219,  0.96017029],
       [ 0.75390225, -0.14550003, -0.91113026]])

Element-wise logarithm

In [65]:
np.log(a)

array([[0.        , 0.69314718, 1.09861229],
       [1.38629436, 1.60943791, 1.79175947],
       [1.94591015, 2.07944154, 2.19722458]])

Dot product (matrix product for 2D arrays)

In [66]:
a.dot(b)

array([[ 22,  28,   6],
       [ 49,  64,  15],
       [ 76, 100,  24]])
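
For 2D arrays, `a.dot(b)` is the matrix product, which is different from the element-wise `a*b` shown earlier. The sketch below checks that the `@` operator and `np.matmul` give the same result, and works out the first entry by hand.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
b = np.array([[1, 2, 1], [3, 4, 1], [5, 6, 1]])

product = a @ b
print(np.array_equal(product, a.dot(b)))          # True
print(np.array_equal(product, np.matmul(a, b)))   # True

# First entry by hand: row 0 of a times column 0 of b
first = 1*1 + 2*3 + 3*5
print(first, product[0, 0])                       # 22 22
```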

### 3.2.- Comparison<a class="anchor" id="comparison"></a>

In [67]:
# Element-wise comparison
a == b

array([[ True,  True, False],
       [False, False, False],
       [False, False, False]])

In [68]:
a < 2

array([[ True, False, False],
       [False, False, False],
       [False, False, False]])

In [69]:
# Array-wise comparison
np.array_equal(a, b)

False
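
An element-wise comparison returns a boolean array, which can be reduced with `any`, `all` or `sum`. For floating-point values a tolerance-based check such as `np.allclose` is usually safer than `==`. A minimal sketch (the names `matches` and `x` are illustrative):

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
b = np.array([[1, 2, 1], [3, 4, 1], [5, 6, 1]])

matches = (a == b)
print(matches.any())   # True: at least one position matches
print(matches.all())   # False: not every position matches
print(matches.sum())   # 2: each True counts as 1

# Floating-point results are better compared with a tolerance
x = np.array([0.1 + 0.2])
print(x == 0.3, np.allclose(x, 0.3))   # [False] True
```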

### 3.3.- Aggregation<a class="anchor" id="aggregation"></a>

In [71]:
a

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [72]:
# Total sum of all the elements
print(a.sum())

45


In [73]:
# Column totals: sum down the rows (axis 0)
print(a.sum(axis=0))

[12 15 18]


In [74]:
# Row totals: sum across the columns (axis 1)
print(a.sum(axis=1))

[ 6 15 24]


In [75]:
a.min(axis=1)

array([1, 4, 7])

In [76]:
a.max(axis=0)

array([7, 8, 9])

In [77]:
a.cumsum(axis=0)

array([[ 1,  2,  3],
       [ 5,  7,  9],
       [12, 15, 18]], dtype=int32)
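
With a square array it is hard to see which axis an aggregation removes. The sketch below uses a hypothetical non-square array `x` so that the result shapes make the `axis` argument explicit.

```python
import numpy as np

x = np.array([[1, 2, 3],
              [4, 5, 6]])      # shape (2, 3)

col_totals = x.sum(axis=0)     # axis 0 collapsed -> one value per column
row_totals = x.sum(axis=1)     # axis 1 collapsed -> one value per row
print(col_totals)              # [5 7 9]
print(row_totals)              # [ 6 15]

# keepdims=True keeps the collapsed axis with length 1
kept = x.sum(axis=1, keepdims=True)
print(kept.shape)              # (2, 1)
```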

### 3.4.- Statistical <a class="anchor" id="stat"></a>

In [83]:
a

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [84]:
np.mean(a)

5.0

In [85]:
# Median
np.median(a)

5.0

In [86]:
# Correlation coefficients between the rows of a (each row is treated as a variable)
np.corrcoef(a)

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

In [87]:
# Standard deviation
np.std(a)

2.581988897471611
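
By default `np.std` computes the population standard deviation (dividing by N). The sample standard deviation (dividing by N - 1) is available through the `ddof` argument. A short sketch, with illustrative variable names, that also recomputes the default value from its definition:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

population_std = np.std(a)        # divides by N = 9
sample_std = np.std(a, ddof=1)    # divides by N - 1 = 8
print(population_std, sample_std)

# The default value computed directly from the definition
deviations = a - a.mean()
by_hand = np.sqrt((deviations ** 2).sum() / a.size)
print(by_hand)
```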

### 3.5.- Copying and Sorting <a class="anchor" id="copyingandsorting"></a>

In [113]:
a = np.array([[9, 2, 3], 
              [4, 1, 6], 
              [7, 8, 3]])

In [114]:
copy = np.copy(a) # or copy = a.copy()

In [119]:
a.sort(axis=1) 
a

array([[2, 3, 9],
       [1, 4, 6],
       [3, 7, 8]])

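`ndarray.sort` has no `reversed` argument. To sort in descending order, sort first and then reverse the result, for example `np.sort(a, axis=1)[:, ::-1]`.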

In [112]:
copy

array([[9, 2, 3],
       [4, 1, 6],
       [7, 8, 3]])
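
`a.sort()` sorts in place, which is why a copy was taken first. As an alternative sketch (the names `by_row`, `descending` and `order` are illustrative), `np.sort` returns a sorted copy and `np.argsort` returns the ordering itself, which is handy for reordering whole rows.

```python
import numpy as np

a = np.array([[9, 2, 3],
              [4, 1, 6],
              [7, 8, 3]])

by_row = np.sort(a, axis=1)     # sorted copy, a is untouched
descending = by_row[:, ::-1]    # reverse each row for descending order
order = np.argsort(a[:, 0])     # row order that sorts the first column

print(by_row)       # rows: [2 3 9], [1 4 6], [3 7 8]
print(descending)   # rows: [9 3 2], [6 4 1], [8 7 3]
print(a[order])     # rows: [4 1 6], [7 8 3], [9 2 3]
```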

## 4.- Array Manipulation<a class="anchor" id="manipulation"></a>

In [58]:
a

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

Transposing

In [59]:
np.transpose(a)

array([[1, 4, 7],
       [2, 5, 8],
       [3, 6, 9]])

Change array shape

In [60]:
# Flatten the array with ravel and drop the last element, leaving 8 values
new = a.ravel()[:-1]
print(new)

[1 2 3 4 5 6 7 8]


In [61]:
new.reshape((2,4))

array([[1, 2, 3, 4],
       [5, 6, 7, 8]])
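
When reshaping, one dimension can be given as `-1` and NumPy infers it from the total number of elements. The sketch below also contrasts this with `flatten`, which always returns a copy. The names `wide` and `flat_copy` are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

wide = a.ravel()[:-1].reshape((2, -1))   # -1 -> NumPy works out 4 columns
print(wide.shape)    # (2, 4)

flat_copy = a.flatten()   # always a copy of the data
flat_copy[0] = 100
print(a[0, 0])       # 1, the original is unchanged
```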

Adding or removing elements

In [62]:
a = [1,2,3,4,5]
b = [6,7,8,9]
# Append items to an array
a_b = np.append(a, b)
print(a_b)

# Insert items in an array
a_in = np.insert(a, 0, 44)
print(a_in)

# Delete items from an array
a_del=np.delete(a_in, [0])
print(a_del)

[1 2 3 4 5 6 7 8 9]
[44  1  2  3  4  5]
[1 2 3 4 5]
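
`np.append`, `np.insert` and `np.delete` each return a new array and leave their input as it was. A minimal sketch with illustrative names:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

appended = np.append(a, [6, 7])
inserted = np.insert(a, 0, 44)
deleted = np.delete(a, [0, 1])   # remove the elements at positions 0 and 1

print(a)          # [1 2 3 4 5], unchanged by all three calls
print(appended)   # [1 2 3 4 5 6 7]
print(inserted)   # [44  1  2  3  4  5]
print(deleted)    # [3 4 5]
```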


Combining arrays

In [63]:
a = np.array([[1, 2, 3], 
              [4, 5, 6], 
              [7, 8, 9]])
b = np.array([[1, 2, 1], 
              [3, 4, 1], 
              [5, 6, 1]])

# Concatenate vertically (both calls give the same result; only the last one is displayed)
np.concatenate((a,b), axis = 0)
np.vstack((a,b))

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9],
       [1, 2, 1],
       [3, 4, 1],
       [5, 6, 1]])

In [64]:
# Concatenate horizontally (both calls give the same result; only the last one is displayed)
np.concatenate((a,b), axis = 1)
np.hstack((a,b))

array([[1, 2, 3, 1, 2, 1],
       [4, 5, 6, 3, 4, 1],
       [7, 8, 9, 5, 6, 1]])

Splitting arrays

In [65]:
# Split the array horizontally into 3 equal parts (one per column)
np.hsplit(a, 3)

[array([[1],
        [4],
        [7]]), array([[2],
        [5],
        [8]]), array([[3],
        [6],
        [9]])]

In [66]:
# Split the array vertically into 3 equal parts (one per row)
np.vsplit(b,3) 

[array([[1, 2, 1]]), array([[3, 4, 1]]), array([[5, 6, 1]])]
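
An integer second argument means "this many equal parts". To split at a given position instead, pass a list of indices. For parts of unequal size, `np.array_split` can be used. A short sketch (the names `left`, `right` and `parts` are illustrative):

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

left, right = np.hsplit(a, [1])        # split before column index 1
print(left.shape, right.shape)         # (3, 1) (3, 2)

parts = np.array_split(a, 2, axis=0)   # uneven split allowed: 2 rows + 1 row
print([p.shape for p in parts])        # [(2, 3), (1, 3)]
```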