## **Numpy (Numerical Python)**
**Numpy** is a Python library which provides arrays, along with mathematical and statistical methods.  
A **numpy array** is like a list with superpowers:
- Unlike lists, numpy arrays have *dimensions*, that is, they are 1D, 2D, 3D, etc.  
- A **1D array** is known as a **vector**. It is like a geometric line, having only the dimension of length.
- A **2D array** is known as a **matrix**. It is like a rectangle, but instead of having length and width, it consists of **rows** and **columns**--like a spreadsheet.
- A **3D array** is like a spreadsheet of rows and columns, but where each cell contains multiple values. It is analogous to a storage unit of cubby holes, where each hole contains multiple items.
- **reshape**: A numpy array can be *reshaped* into a new array having a different number of dimensions, and/or different configuration of rows and columns.
- **transpose**: A numpy array can be *transposed*, such that the rows become columns and the columns become rows. Excel offers the same operation as *Transpose* (under Paste Special); it is not the same thing as a pivot table.

Some numpy methods and properties:
- **np.array(list)** takes a list as its argument and returns an array.
- **my_arr.shape** returns a **tuple** giving the size of each dimension of **my_arr**; for a 2D array this is (rows, cols)
- **my_arr.ndim** returns the number of dimensions of **my_arr**
- **my_arr2 = my_arr.reshape(num_rows,num_cols)** returns a new array containing  
  the number of rows and columns passed to *reshape* method
- **my_arr2 = my_arr.transpose()** returns a new array where the old rows are the new  
 columns and the old columns are the new rows

**import numpy as np** is the command for importing the numpy module

In [None]:
# import the numpy module
import numpy as np
import random

**np.array(list)**  
The **array()** method takes a list as its argument and returns an array.



**Only 1 datatype per array**.

Unlike list items, the items of a numpy array must all be of the same datatype.  
Since numpy arrays are used in data science, they usually hold integers or floats.  
A numpy array may also hold strings, but if strings and numbers are mixed in the same list, numpy converts every item to a string.  
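
As a minimal sketch of the one-datatype rule, the array's `dtype` property reports which datatype numpy settled on. The variable names `mixed_nums` and `mixed_strs` are made up for this illustration:

```python
import numpy as np

# ints and floats together: every item is stored as a float
mixed_nums = np.array([1, 2, 3.5])
print(mixed_nums.dtype)  # float64

# numbers and strings together: every item is stored as a string
mixed_strs = np.array([1, 2, "three"])
print(mixed_strs)             # ['1' '2' 'three']
print(mixed_strs.dtype.kind)  # U (unicode string)
```

Checking `dtype` right after building an array is a quick way to catch numbers that were stringified by accident.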

In [None]:
# declare a list containing items of mixed datatype:
# this "mixed-bag" list contains data from a single row of an employees table:

employee = ["Smith", "Jane", "VP Sales", 78000, True]
print(type(employee[-1])) # bool
print(type(employee[-2])) # int
print(type(employee[-3])) # str
print('employee:', employee)

# pass the list to the array method, and print the result:
emp_arr = np.array(employee)

print('emp_arr:', emp_arr, type(emp_arr)) # <class 'numpy.ndarray'>

# L@@K: the numpy array items have all been stringified

<class 'bool'>
<class 'int'>
<class 'str'>
employee: ['Smith', 'Jane', 'VP Sales', 78000, True]
emp_arr: ['Smith' 'Jane' 'VP Sales' '78000' 'True'] <class 'numpy.ndarray'>


In [None]:
# declare a list of numbers:
nums_list = [2,3,5,7,9,12,16,23]

# print the list and its datatype:
print('nums_list:', nums_list, type(nums_list))
# nums_list: [2, 3, 5, 7, 9, 12, 16, 23] <class 'list'>

nums_list: [2, 3, 5, 7, 9, 12, 16, 23] <class 'list'>


In [None]:
# pass nums_list to the numpy array() method, saving the returned array as nums_arr:
nums_arr = np.array(nums_list)

# print the array and its datatype.
print('nums_arr:', nums_arr, type(nums_arr))
# nums_arr: [ 2  3  5  7  9 12 16 23] <class 'numpy.ndarray'>

# L@@K: numpy array items are not separated by commas.

nums_arr: [ 2  3  5  7  9 12 16 23] <class 'numpy.ndarray'>


**arr.shape** returns a **tuple** containing the shape of the array.  
The number of items in the **tuple** equals the number of dimensions.  
**arr.ndim** returns the number of dimensions as an integer
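
A short sketch of how **shape** and **ndim** relate, using a small 2D array. The name `grid` is hypothetical and used only for this example:

```python
import numpy as np

# a 2D array with 2 rows and 3 columns
grid = np.array([[1, 2, 3], [4, 5, 6]])

print(grid.shape)       # (2, 3)
print(grid.ndim)        # 2
print(len(grid.shape))  # 2, the same as ndim
```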

In [None]:
# numpy arrays have properties that lists do not have: shape, ndim
# shape property returns a tuple with one number per dimension; (rows, cols) for a 2D array
print('nums_arr.shape:', nums_arr.shape) # (8,)

# ndim property returns the number of dimensions
print('nums_arr.ndim:', nums_arr.ndim) # 1
# list does not have shape or ndim properties -- that's numpy stuff

nums_arr.shape: (8,)
nums_arr.ndim: 1


**Visualize a one-dimensional array as a *vector* (line) rather than as a “one-column stack”**

**array1 = np.array([1,2,3,4])**
The shape of **array1** is the tuple **(4,)**. It has four “rows” but is one-dimensional. It is therefore
a vector (line), having length as its only dimension. So, the term “row” is a misnomer here.
Having no columns, array1 should not be visualized as a one-column stack of multiple rows. Instead, it is more accurate to visualize the array1 vector as a *number line*.

In [None]:
# declare a list of a dozen numbers:
dozen = [2,4,5,12,7,43,32,87,33,48,63,81]

# make a numpy array from the list:
doz_arr = np.array(dozen)

# print the array, along with its shape and dimensions:
print('doz_arr:\n', doz_arr)
# [ 2  4  5 12  7 43 32 87 33 48 63 81]
print('shape:', doz_arr.shape) # (12,)
print('ndim:', doz_arr.ndim) # 1

# L@@K: the shape of (12,) with nothing after the comma
# indicates a 1D array--a vector

doz_arr:
 [ 2  4  5 12  7 43 32 87 33 48 63 81]
shape: (12,)
ndim: 1


**Reshaping the Array**  
You can re-arrange the array into any shape, as long as the total number of elements (or the product
of the shape) remains the same. We use array.reshape(rows,columns). For example if the shape is (2,3),
you can only reshape to an array with 6 elements such as (1,6), (3,2) or even (6,). Also note that the
order of the elements remains the same after reshaping and you will see soon why this is important.

**arr.reshape(rows,cols)** for reshaping an array.   

A 2D array cannot be made from a **ragged list**: all rows must contain the same number of items.
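
The element-count rule for reshaping can be checked directly. This is a minimal sketch; the name `six` and the use of `np.arange` to build the sample array are illustrative choices, not part of the lesson's data:

```python
import numpy as np

six = np.arange(6)  # 1D array of 6 items: [0 1 2 3 4 5]

# 2 * 3 = 6 and 3 * 2 = 6, so both of these work
print(six.reshape(2, 3).shape)  # (2, 3)
print(six.reshape(3, 2).shape)  # (3, 2)

# 4 * 2 = 8, which does not match the 6 items, so numpy raises ValueError
try:
    six.reshape(4, 2)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("reshape failed:", err)
```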

In [None]:
# reshape doz_arr as a 4 row, 3 col array:
_4x3 = doz_arr.reshape(4,3)
print('_4x3:\n', _4x3)
print('shape:', _4x3.shape) # (4,3)
print('ndim:', _4x3.ndim) # 2

# (12,) can be reshaped into a 2 row, 6 col matrix (2,6)
# (12,) can be reshaped into a 4 row, 3 col matrix (4,3)
# (12,) can be reshaped into a 1 row, 12 col matrix (1,12)
# (12,) can be reshaped into a 12 row, 1 col matrix (12,1)

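# a nested list containing a non-number item:
# every row has 3 items, so this list is NOT ragged, but the last item is an empty string
asymmetrical_list = [ [ 2,  4,  5], [12,  7, 43], [32, 87, 33], [48, 63, ''] ]
asym_arr = np.array(asymmetrical_list)
# L@@K: no error, but all the items have been stringified
# a truly ragged list, such as one with a last row of [48, 63], cannot make a 2D array
print('asym_arr:', asym_arr)

# let's see the diff between (12,1) and (1,12)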
_1x12 = _4x3.reshape(1,12)
print('_1x12:', _1x12)
print('shape:', _1x12.shape) # (1,12)
print('ndim:', _1x12.ndim) # 2
# _1x12: [[ 2  4  5 12  7 43 32 87 33 48 63 81]]

# L@@K: the (1,12) has double square brackets

_12x1 = _4x3.reshape(12,1)
print('_12x1:\n', _12x1)
print('shape:', _12x1.shape) # (12,1)
print('ndim:', _12x1.ndim) # 2
# _12x1:
# [[ 2]
#  [ 4]
#  [ 5]
#  [12]
#  [ 7]
#  [43]
#  [32]
#  [87]
#  [33]
#  [48]
#  [63]
#  [81]]

# L@@K: the (12,1) is a one-column stack

_4x3:
 [[ 2  4  5]
 [12  7 43]
 [32 87 33]
 [48 63 81]]
shape: (4, 3)
ndim: 2
asym_arr: [['2' '4' '5']
 ['12' '7' '43']
 ['32' '87' '33']
 ['48' '63' '']]
_1x12: [[ 2  4  5 12  7 43 32 87 33 48 63 81]]
shape: (1, 12)
ndim: 2
_12x1:
 [[ 2]
 [ 4]
 [ 5]
 [12]
 [ 7]
 [43]
 [32]
 [87]
 [33]
 [48]
 [63]
 [81]]
shape: (12, 1)
ndim: 2


Set the shape / dimensionality of a new array at the time it is declared, using the reshape method; three or more dimensions are allowed:

In [None]:
# make a 3D arr from doz_arr:
_2x3x2 = doz_arr.reshape(2,3,2)
print(_2x3x2, _2x3x2.shape, _2x3x2.ndim)

[[[ 2  4]
  [ 5 12]
  [ 7 43]]

 [[32 87]
  [33 48]
  [63 81]]] (2, 3, 2) 3


**Accessing elements of a numpy array**  
Inner items may be accessed by a second set of square brackets;  
this is how inner items in a nested list are accessed.

**[num1][num2]** is nested-list-accessor-style.   
**[row_index,col_index]** 2D array items can be accessed  
using single square brackets, with the row index and column index separated by a comma.  
**[row_start_index:row_end_index,col_start_index:col_end_index]** Slices of a 2D array can be accessed  
using single square brackets, with the row range and column range separated by a comma.

Setting (changing) values of numpy array items is normally done with **[row_index, col_index]**:
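
A brief sketch of both accessor styles, followed by setting a value. The array `grid` is a made-up example, not the lesson's data:

```python
import numpy as np

grid = np.array([[10, 20, 30], [40, 50, 60]])

print(grid[1][2])  # 60, nested-list style: two pairs of brackets
print(grid[1, 2])  # 60, numpy style: row and column in one pair of brackets

# set the item in row 0, column 1
grid[0, 1] = 99
print(grid[0])  # [10 99 30]
```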

In [None]:
# declare a nested list of 20 nums: 4 rows of 5 items each
# the list is saved to a variable so that it can be sliced before it becomes an array:
nums_list = [ [3,5,2,3,5], [4,6,7,11,13], [14,16,17,1,3], [11,12,17,23,29] ]

# make slices / selections from the "2D" nested list:
# get the 13
print(nums_list[1][-1]) # 13
# get the [7,11,13] as a mini-list:
print(nums_list[1][2:]) # [7, 11, 13]
# get [16,17]:
print(nums_list[2][1:3]) # [16, 17]
# get [3,2,5]
print(nums_list[0][::2]) # [3,2,5]
# get [3,2,5]
print(nums_list[0][::2]) # [3,2,5]
# get [4,6,7,11,13], [14,16,17,1,3]
print(nums_list[1:3]) # [ [4,6,7,11,13], [14,16,17,1,3] ]
print(nums_list[1][-2:],nums_list[2][-2:])

nums_arr = np.array(nums_list)
print('nums_arr:\n', nums_arr, nums_arr.shape, nums_arr.ndim) # (4,5) 2

nums_flat = nums_arr.reshape(20,)
print('nums_flat:\n', nums_flat, nums_flat.shape, nums_flat.ndim)
# [ 3  5  2  3  5  4  6  7 11 13 14 16 17  1  3 11 12 17 23 29] (20,) 1

13
[7, 11, 13]
[16, 17]
[3, 2, 5]
[3, 2, 5]
[[4, 6, 7, 11, 13], [14, 16, 17, 1, 3]]
[11, 13] [1, 3]
nums_arr:
 [[ 3  5  2  3  5]
 [ 4  6  7 11 13]
 [14 16 17  1  3]
 [11 12 17 23 29]] (4, 5) 2
nums_flat:
 [ 3  5  2  3  5  4  6  7 11 13 14 16 17  1  3 11 12 17 23 29] (20,) 1

**sub-set / take a slice of a numpy array**.

**arr[row_start_index:row_end_index, col_start_index:col_end_index]**.

**: sets the range**

**negative indexing** works, so **-1** refers to the last row or column
**end index is exclusive** -- not included in the result.

**array[:, col_index]** gets all rows in that column.

**array[row_index,:]** gets all columns in that row

**array[:end_index]** nothing before the : means start at the beginning
**array[start_index:]** nothing after the : means go to the end.
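
The slicing rules above can be sketched on a small array. The name `table` and its values are hypothetical and chosen so the results are easy to check by eye:

```python
import numpy as np

table = np.array([[ 1,  2,  3,  4],
                  [ 5,  6,  7,  8],
                  [ 9, 10, 11, 12]])

print(table[:, 0])     # all rows, first column: [1 5 9]
print(table[1, :])     # second row, all columns: [5 6 7 8]
print(table[:2, -2:])  # first 2 rows, last 2 columns
print(table[1:, 1:3])  # rows 1 to the end, columns 1 and 2 (end index 3 is excluded)
```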

In [None]:
print(nums_arr)
# sub-set / take a slice of a numpy array value as you would a list
# syntax:
# arr[start_index:end_index_excl]
print()
_5x4 = nums_arr.reshape(5,4)
print(_5x4)
# The _5x4 arr:
# [[ 3  5  2  3]
#  [ 5  4  6  7]
#  [11 13 14 16]
#  [17  1  3 11]
#  [12 17 23 29]]

# make a new array called _2x4 that prints the following:
#  [11 13 14 16]
#  [17  1  3 11]

# _2x
print()

[[ 3  5  2  3  5]
 [ 4  6  7 11 13]
 [14 16 17  1  3]
 [11 12 17 23 29]]

[[ 3  5  2  3]
 [ 5  4  6  7]
 [11 13 14 16]
 [17  1  3 11]
 [12 17 23 29]]



Practice selecting parts of the **_5x4** array:

In [None]:
print('get slices from this _5x4 arr:')
print()
# [[ 3  5  2  3]
#  [ 5  4  6  7]
#  [11 13 14 16]
#  [17  1  3 11]
#  [12 17 23 29]]

# 1. get the first row:
print('\nfirst row:\n', _5x4[0])
# first row:
# [3 5 2 3]

# 2. get the last row:
print('\nlast row:\n', _5x4[-1])
# last row:
# [12 17 23 29]

# 3. get the last row, last 2 cols: [23 29]
print('\nlast row, last 2 cols:\n', _5x4[-1,-2:])
# last row, last 2 cols:
# [23 29]

# 4. get all rows, but just the first col:
print('\nall rows, first col:\n',  _5x4[:,0])
# all rows, first col:
# [ 3  5 11 17 12]

# 5. get the first 3 cols in the 2nd row
print('\n2nd row, first 3 cols:\n', _5x4[1,:3])
# 2nd row, first 3 cols:
# [5  4  6]

# 6. get the tic-tac-toe / 3x3, lower right corner:
print('\n3x3 lower right corner:\n', _5x4[-3:,-3:])
# 3x3 lower right corner:
# [[13 14 16]
#  [ 1  3 11]
#  [17 23 29]]

# 7. get the middle "6-pack":
print('\nmiddle 6-pack:\n')
# middle 6-pack:
# [[ 4  6]
#  [13 14]
#  [ 1  3]]

# 8. replace the middle 3 rows, next to the last col, with all 5's:
# _5x
print('\nmiddle 3 rows, next to the last col, all 5\'s:\n')
# middle 3 rows, next to the last col, all 5's:
# [[ 3  5  2  3]
#  [ 5  4  5  7]
#  [11 13  5 16]
#  [17  1  5 11]
#  [12 17 23 29]]

# 9. replace the upper left 2x2 with all 9's:
# _5x
print('\nupper left 2x2 with all 9\'s:\n')
# [[ 9  9  2  3]
#  [ 9  9  5  7]
#  [11 13  5 16]
#  [17  1  5 11]
#  [12 17 23 29]]

get slices from this _5x4 arr:


first row:
 [3 5 2 3]

last row:
 [12 17 23 29]

last row, last 2 cols:
 [23 29]

all rows, first col:
 [ 3  5 11 17 12]

2nd row, first 3 cols:
 [5 4 6]

3x3 lower right corner:
 [[13 14 16]
 [ 1  3 11]
 [17 23 29]]

middle 6-pack:


middle 3 rows, next to the last col, all 5's:


upper left 2x2 with all 9's:



**CHALLENGE:**.  

On your own: do these numpy array selection / subsetting exercises:


In [None]:
# given these 9 numbers, make a 3x3 numpy array called _3x3
# you can chain the reshape() method directly onto the array() method
# to make the 3x3 array in one line:
# 2,3,5,7,11,13,17,23,29
# _3x
print("_3x3:\n")
# [[2 3 5]
# [7 11 13]
# [17 23 29]]

# 1. get the first row, first 2 cols:
print('\nfirst row, first 2 cols:\n')
# first row, first 2 cols:
# [2 3]

# 2. get the whole last column:
print('\nlast col:\n')
# last col:
# [5 13 29]

# 3. get the last 2 items, middle column:
print('\nlast 2 items, middle col:\n')
# [11 23]

# 4. replace the middle row with all 8's, and print the result:
# expected result
# _3x
print('\nmiddle row 8\'s:\n')
# [[2 3 5]
# [8 8 8]
# [17 23 29]]

# 5. replace the last col with all 7's, and print the result:
# _3x
print('\nlast col 7\'s:\n')
# [[2 3 7]
# [8 8 7]
# [17 23 7]]

# 6. replace the bottom left 2x2 with all 4's:
# _3x
print('\nbottom left 2x2 all 4\'s:\n')
# [[2 3 7]
#  [4 4 7]
#  [4 4 7]]

# make an array called _5x6 from these 30 numbers
# the array should have 5 rows and 6 cols, for a shape of (5,6)
# 3,5,2,3,5,7,4,6,7,11,13,9,14,16,17,61,3,9,11,12,17,23,29,65,4,6,7,11,13,9
# _5x

print('\n_5x6:\n')
# [[ 3  5  2  3  5  7]
#  [ 4  6  7 11 13  9]
#  [14 16 17 61  3  9]
#  [11 12 17 23 29 65]
#  [ 4  6  7 11 13  9]]

# 7. get the next to last row, last 4 cols:
print('\nnext to last row, last 4 cols:\n')
# next to last row, last 4 cols:
# [17 23 29 65]

# 8. get the middle row, middle 4 cols
print('\nmiddle row, midddle 4 cols:\n')
# middle row, midddle 4 cols:
# [16 17 61  3]

# 9. get the all rows, last column:
print('\nall rows, last column:\n')
# all rows, last column:
# [ 7  9  9 65  9]

# 10. replace the middle 3x2 6-pack (3 rows, 2 cols) with all 1's:

print('\nmiddle 2x3 6-pack, all 2\'s:\n')
# [[ 3  5  2  3  5  7]
#  [ 4  6  1  1 13  9]
#  [14 16  1  1  3  9]
#  [11 12  1  1 29 65]
#  [ 4  6  7 11 13  9]]

_3x3:


first row, first 2 cols:


last col:


last 2 items, middle col:


middle row 8's:


last col 7's:


bottom left 2x2 all 4's:


_5x6:


next to last row, last 4 cols:


middle row, midddle 4 cols:


all rows, last column:


middle 2x3 6-pack, all 2's:



**Doing Math Across Arrays**  
In numpy arrays you can perform math on corresponding items in two or more arrays.  
For **arr1 + arr2**, each item in **arr1** will be added to the item at the same index in **arr2**.

In [None]:
# math across lists doesn't work:
# declare two lists and add them together with plus-sign (+)
list1 = [10,12,15,18]
list2 = [22,33,44,55]

# add the lists together:
list3 = list1 + list2
# list1.extend(list2)
print(list3) # [10, 12, 15, 18, 22, 33, 44, 55]

# L@@K: we get this: [10, 12, 15, 18, 22, 33, 44, 55]
# corresponding items were NOT added together: [32,45,59,73]

# multiply list1 by 3:
list4 = list1 * 3
print(list4)
# L@@K: multiplying a list by 3 makes one long list with the sequence repeated 3 times:
# [10, 12, 15, 18, 10, 12, 15, 18, 10, 12, 15, 18]

[10, 12, 15, 18, 22, 33, 44, 55]
[10, 12, 15, 18, 10, 12, 15, 18, 10, 12, 15, 18]


**Vector Operations** means that calculations are applied item by item across whole arrays, with no loop required.

In [None]:
# math across numpy arrays DOES work via Vector Operation
# corresponding items are added, multiplied, etc

# declare two arrays from list1 and list2:
arr1 = np.array(list1)
print(arr1)

arr2 = np.array(list2)
print(arr2)

# add the arrays together; this yields the sum of the corresponding items:
arr3 = arr1 + arr2
print(arr3) # [32 45 59 73]
print()

# multiplying arrays together yields the product of corresponding items
arr4 = arr1 * arr2
print(arr4) # [220 396 660 990]

# multiply arr1 by 5; this multiplies each item in arr1 by 5
arr5 = arr1 * 5
print(arr5) # [50 60 75 90]

[10 12 15 18]
[22 33 44 55]
[32 45 59 73]

[220 396 660 990]
[50 60 75 90]


**np.dot(arr1,arr2)** method.  
For two 1D arrays of the same length, the dot method multiplies corresponding items  
and then finds the sum of the products; the result is one number:

In [None]:
# dot() method
# the dot method multiplies corresponding items in two arrays,
# and then finds the sum of the products; the result is one number:
sum_prod_arr1_arr2 = np.dot(arr1,arr2)
print('sum_prod_arr1_arr2:', sum_prod_arr1_arr2) # 2266 (sum of [220 396 660 990])

sum_prod_arr1_arr2: 2266


In [None]:
# multiply the dot product of arr1 and arr2 (one number: 2266) by a 1D array
lets_see = np.dot(arr1,arr2) * np.array([3,5,6,7])
print(lets_see) # [ 6798 11330 13596 15862]

[ 6798 11330 13596 15862]
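
The cell above multiplies a single number by a 1D array. For comparison, here is a minimal sketch of **np.dot** applied to an actual 2D array and a 1D array. The names `matrix`, `vector` and `result` are made up for this illustration:

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])   # shape (2, 3)
vector = np.array([10, 20, 30])  # shape (3,)

# each row of the matrix is multiplied item by item with the vector,
# and the products are summed, giving one number per row
result = np.dot(matrix, vector)
print(result)        # [140 320]
print(result.shape)  # (2,)
```

The first result is 1*10 + 2*20 + 3*30 = 140 and the second is 4*10 + 5*20 + 6*30 = 320. The number of columns in the matrix has to match the length of the vector.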


In [None]:
import math

**np.sqrt(nums_list)** method takes a number or a list of numbers as its argument.  
It returns the square root of the number(s).


In [None]:
# pass a number to the np.sqrt() method:
print(np.sqrt(81)) # 9.0
math.sqrt(81) # core python equivalent: also 9.0, but it only takes one number at a time
# pass a list of numbers to the np.sqrt() method:
nums = [5,6,7,8,9,14,16,21,25,30,36]
nums_sqrts = np.sqrt(nums)
print(nums_sqrts)
# [2.23606798 2.44948974 2.64575131 2.82842712 3.         3.74165739
#  4.         4.58257569 5.         5.47722558 6.        ]

9.0
[2.23606798 2.44948974 2.64575131 2.82842712 3.         3.74165739
 4.         4.58257569 5.         5.47722558 6.        ]


**min(list)** returns the minimum value of a list.  
**max(list)** returns the maximum value of a list.  
**np.mean(nums_list)** returns the mean of a list of values.
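
A small sketch of these calls on sample data, with **np.median** shown as well. The list `scores` is hypothetical and deliberately different from the exercise numbers in the next cell:

```python
import numpy as np

scores = [4, 8, 15, 16, 23, 42]
scores_arr = np.array(scores)

# core python min() and max() work on a list
print(min(scores), max(scores))  # 4 42

# numpy versions work on a list or an array
print(np.min(scores_arr), np.max(scores_arr))  # 4 42

print(np.mean(scores_arr))    # 18.0
print(np.median(scores_arr))  # 15.5
```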

In [None]:
# make an array called nums_arr from a list of nums:
# 3, 5, 8, 12, 18, 25, 32, 45, 54, 67, 72

print('\nnums_arr:')

# get the min and max of the list
# use core python (no numpy)
print('\nmin:')
print('\nmax:')
# OR use numpy method:
print('\nnp.min:')
print('\nnp.max:')

# to get the mean or median value of a list, use numpy:
print('\nnp.mean:')


nums_arr:

min:

max:

np.min:

np.max:

np.mean:


**np.random.randint(low, high, size)**  
returns an array of the specified size (number of items), with each value
drawn from the range low to high, where low is inclusive and high is exclusive.
So randint(1, 101) covers 1-100. For size, if you pass in just one number, you create
a one-dimensional array. For a multi-dimensional array, pass in a tuple of the specified shape.
So (3, 4) would make a 2D array of 3 rows x 4 columns.
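
A hedged sketch of **randint** with and without a size argument. It relies on numpy's documented behavior that the upper bound is excluded. Since randint can repeat values, the last line shows one possible way to draw non-repeating numbers, using **np.random.choice** with `replace=False`. All variable names are made up for this example:

```python
import numpy as np

# one random int from 1 to 6 (7 is excluded)
one_roll = np.random.randint(1, 7)

# 1D array of 10 random ints from 1 to 6; repeats are possible
ten_rolls = np.random.randint(1, 7, size=10)

# 2D array of 3 rows x 4 cols, with values from 1 to 100
rand_grid = np.random.randint(1, 101, size=(3, 4))

# 10 non-repeating ints from 1 to 100: sample without replacement
unique_picks = np.random.choice(np.arange(1, 101), size=10, replace=False)

print(one_roll)
print(ten_rolls)
print(rand_grid)
print(unique_picks)
```

The printed values differ on every run, so no expected output is shown.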

In [None]:
# get a random value from the arr
# numpy has random built in, so no need to import random separately

# generate a random integer from 1-100:
print('random int from 1-100:')

# generate 10 non-repeating integers from 1-100
print('Sample 10 non-repeating ints 1-100:')
print() # should print an array of 10 different ints; the values vary on every run

random int from 1-100:
Sample 10 non-repeating ints 1-100:



**Transposing**  
Not to be confused with reshaping, which does not change the sequence of array items.  
Reshaping only changes the row-col configuration.

**arr.transpose()** method is called on a 2D array and returns a new array where the rows  
from the original array are the columns and the columns are the rows.
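
As a quick sketch of what transposing does to individual items: the item at [row, col] in the original lands at [col, row] in the result. The **.T** property used here is numpy's shorthand for **transpose()**; the name `small` is hypothetical:

```python
import numpy as np

small = np.array([[10, 20, 30],
                  [40, 50, 60]])  # shape (2, 3)

flipped = small.transpose()
print(flipped.shape)  # (3, 2)

# the item at row 0, col 2 of the original is at row 2, col 0 of the transpose
print(small[0, 2], flipped[2, 0])  # 30 30

# .T is shorthand for transpose()
print(np.array_equal(small.T, flipped))  # True
```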

In [None]:
# using these 12 numbers, declare an array called _3x4, having shape (3,4)
_3x4 = np.array([102,13,5,88,12,32,157,24,45,77,54,37]).reshape(3,4)
print('_3x4:\n', _3x4)
# [[102  13   5  88]
#  [ 12  32 157  24]
#  [ 45  77  54  37]]

# using the reshape() method, make an array called _4x3, having shape (4,3):
_4x3 = _3x4.reshape(4,3)
print('\n_4x3:\n', _4x3)
# [[102  13   5]
#  [ 88  12  32]
#  [157  24  45]
#  [ 77  54  37]]
# L@@K: reshaping does NOT change the sequence: all nums are in same order

# transpose changes the order of the array items,
# since the rows become cols and the cols become rows

# transpose the (3,4) array. It becomes a (4,3), BUT this is not the same
# as reshaping as (4,3). With transpose, the order of the items has changed:
_3x4_transposed = _3x4.transpose()
print('\n_3x4_transposed:\n', _3x4_transposed)

# L@@K: the sequence of numbers is not the same as in the original.
# This is because the rows are now cols, and vice-versa:

# [[102  12  45]
#  [ 13  32  77]
#  [  5 157  54]
#  [ 88  24  37]]

_3x4:
 [[102  13   5  88]
 [ 12  32 157  24]
 [ 45  77  54  37]]

_4x3:
 [[102  13   5]
 [ 88  12  32]
 [157  24  45]
 [ 77  54  37]]

_3x4_transposed:
 [[102  12  45]
 [ 13  32  77]
 [  5 157  54]
 [ 88  24  37]]


In [None]:
# CHALLENGE: Exercise with reshape vs. transpose:
# 1, 2, 3, 4, 5, 6

# given these 6 numbers, make a (2,3) numpy array:
# _2x
print('_2x3:\n')
# [[1 2 3]
#  [4 5 6]]

# reshape as (3,2):
# _3x
print('\n_3x2:\n')
# [[1 2]
#  [3 4]
#  [5 6]]
# L@@K: the sequence of numbers is the same for both _2x3 and _3x2

# transpose the _2x3 array. The shape becomes (3,2) but the number sequence has changed:
# _2x
print('\n_2x3_transposed:\n')
# [[1 4]
#  [2 5]
#  [3 6]]

_2x3:


_3x2:


_2x3_transposed:

