# MADS Numpy Tutorial  -- WORKING DRAFT
The purpose of this notebook is to provide a showcase of some useful features of the numpy library. 

#### NumPy is a Python library used for working with arrays.
#### Why Use Numpy: 
1. NumPy aims to provide an array object that is up to 50x faster than traditional Python lists.
2. The array object in NumPy is called _ndarray_; it comes with many supporting functions that make arrays easy to work with.
3. Arrays are very frequently used in data science, where speed and resources are very important.
4. Many other libraries are built on NumPy, so understanding NumPy well makes it easier to use those tools effectively.
     
#### Why are NumPy arrays faster than lists?
NumPy arrays are stored in one contiguous block of memory, unlike lists, whose elements are separate Python objects. Processes can therefore access and manipulate array data very efficiently because of the way memory is cached inside CPUs. This behavior is called locality of reference in computer science.
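The speed difference can be measured with a rough timing sketch like the one below. The array size and repeat count are arbitrary choices for illustration, and the absolute timings depend on the machine, so treat the numbers as indicative only.

```python
import timeit

import numpy as np

n = 200_000
py_list = list(range(n))                  # Python list of separate int objects
np_arr = np.arange(n, dtype=np.int64)     # contiguous ndarray holding the same values

# Time 10 runs of each summation; results vary from machine to machine.
list_time = timeit.timeit(lambda: sum(py_list), number=10)
array_time = timeit.timeit(lambda: np_arr.sum(), number=10)

print(f"list sum:  {list_time:.4f} s")
print(f"array sum: {array_time:.4f} s")
```

Both calls compute the same total; only the time taken differs.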
     


## Topics:
1. Creating numpy arrays
2. Random
3. Copy vs. View
4. Shape
5. Join
6. Sorting and Index Operations
7. ufuncs
8. Polynomials

References:
1. numpy github codebase: https://github.com/numpy/numpy/tree/main/numpy
2. W3school - NumPy
3. Numpy Documentation: https://numpy.org/doc/stable/index.html


## 1. Create NumPy Array Objects

In [11]:
import numpy as np

py_list = [1, 2, 3, 4, 5] #python list object
np_arr = np.array(py_list) # convert python list object to numpy ndarray object
type(np_arr)

numpy.ndarray

In [2]:
py_tuple = (1, 2, 3, 4, 5) #python tuple object
np_arr = np.array(py_tuple) # convert python tuple object to numpy ndarray object
type(np_arr)

numpy.ndarray

### Dimensions in Arrays
What are the 'Dimensions' of an Array? 
    * The dimension of an array is the level of the array's depth

0-D Array, 1-D Array, 2-D Array ....

In [3]:
#0-D Array
num = 42 # an integer
arr = np.array(num) # convert the integer to an ndarray
print('The type of the array is: ', type(arr))
print('The dimension of the array is:', arr.ndim)
print(arr)

The type of the array is:  <class 'numpy.ndarray'>
The dimension of the array is: 0
42


In [4]:
#1-D Array
py_list = [42, 43, 45, 65] # a python list 
arr = np.array(py_list) # convert the list to a one dim ndarray
print('The type of the array is: ', type(arr))
print('The dimension of the array is:', arr.ndim)
print(arr)

The type of the array is:  <class 'numpy.ndarray'>
The dimension of the array is: 1
[42 43 45 65]


In [5]:
#2-D Array
# An array that has 1D arrays as its elements is called a 2-D array
py_list = [[42, 43, 45, 65], [1, 2, 3, 4]] # a python list 
arr = np.array(py_list) # convert the list to a two dim ndarray
print('The type of the array is: ', type(arr))
print('The dimension of the array is:', arr.ndim)
print(arr)

The type of the array is:  <class 'numpy.ndarray'>
The dimension of the array is: 2
[[42 43 45 65]
 [ 1  2  3  4]]


In [6]:
# 3-D Array 
# An array that has 2-D arrays as its elements is called a 3-D array
py_list = [[[42, 43, 45, 65], [1, 2, 3, 4]] , [[1, 2, 3, 4], [34, 56, 22, 12]]] # a python list 
arr = np.array(py_list) # convert the list to a three dim ndarray
print('The type of the array is: ', type(arr))
print('The dimension of the array is:', arr.ndim)
print(arr)

The type of the array is:  <class 'numpy.ndarray'>
The dimension of the array is: 3
[[[42 43 45 65]
  [ 1  2  3  4]]

 [[ 1  2  3  4]
  [34 56 22 12]]]


#### Use ndmin to create an n-dimensional array

In [7]:
n = 4
py_list = [1, 3, 4, 4, 6, 9]
arr = np.array(py_list, ndmin=n)
#verify the dimension 
print(arr.ndim == 4)

True


### Data types in NumPy 
#### Below is a list of all data types in NumPy and the characters used to represent them.
    i - integer
    b - boolean
    u - unsigned integer
    f - float
    c - complex float
    m - timedelta
    M - datetime
    O - object
    S - string
    U - unicode string
    V - fixed chunk of memory for other type ( void )
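
As a quick check of the character codes listed above, every dtype exposes its code through the `.kind` attribute. The sample arrays and their labels below are made up for illustration.

```python
import numpy as np

# Hypothetical sample arrays, one per kind of data.
samples = {
    "int32": np.array([1, 2, 3], dtype="i4"),
    "uint8": np.array([1, 2, 3], dtype="u1"),
    "float64": np.array([1.0, 2.0]),
    "bool": np.array([True, False]),
    "complex128": np.array([1 + 2j]),
    "unicode": np.array(["a", "bc"]),
}

# Collect the one-letter kind code of each array's dtype.
kinds = {label: arr.dtype.kind for label, arr in samples.items()}

for label, kind in kinds.items():
    print(f"{label:>10}: dtype={samples[label].dtype}, kind={kind!r}")
```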


#### Create an array with a specific data type

In [8]:
# Create an array with data type 4 bytes integer
arr = np.array([1, 2, 3, 4], dtype='i4')
print(arr)
print(arr.dtype)

[1 2 3 4]
int32


In [9]:
#Create an array with data type string
arr = np.array([1, 2, 3, 4], dtype='S')
print(arr)
print(arr.dtype)

[b'1' b'2' b'3' b'4']
|S1


In [10]:
# convert an existing array to an integer type
arr = np.array([1.1, 2.1, 3.1])

newarr = arr.astype('int')
newarr

array([1, 2, 3])

## 2. NumPy Random

In [11]:
values = 10
arr = np.empty(values) # uninitialized array: the contents are whatever was in memory, not random numbers
arr

array([4.66204532e-310, 0.00000000e+000, 6.01347002e-154, 1.04990471e-153,
       6.65858640e+164, 5.98149210e-154, 4.40722557e+169, 5.47480657e-096,
       5.55487189e+141, 6.09343068e-013])

In [12]:
# Generate a random integer from 0 to 99 (the upper bound, 100, is excluded)
from numpy import random

x = random.randint(100)

print(x)

77


In [13]:
# Generate a random float in the interval [0, 1)
from numpy import random

x = random.rand()

print(x)

0.4349341827027322


In [14]:
# Generate Random Array
# The randint() method takes a size parameter where you can specify the shape of an array.

from numpy import random

x=random.randint(100, size=(5))

print(x)

[ 9 45 68 38 64]


In [15]:
# Generate a 2-D array with 3 rows, each row containing 5 random integers from 0 to 99
x = random.randint(100, size=(3, 5))

print(x)

[[77  1 44 66 65]
 [30 32  6 16 14]
 [42 98 55 61 16]]


In [16]:
from numpy import random

x = random.choice([3, 5, 7, 9])

print(x)

5
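
The outputs above change on every run. When reproducible results are needed, one option is NumPy's `Generator` interface, created with `np.random.default_rng`. The sketch below assumes an arbitrary seed of 42 and mirrors the integer, float, and choice examples from this section.

```python
import numpy as np

# Two generators created with the same seed produce the same sequence.
rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)

ints_a = rng_a.integers(0, 100, size=5)   # integers in [0, 100)
ints_b = rng_b.integers(0, 100, size=5)

print(ints_a)
print(np.array_equal(ints_a, ints_b))     # same seed, same values

floats = rng_a.random(3)                  # floats in [0.0, 1.0)
pick = rng_a.choice([3, 5, 7, 9])         # one element chosen at random
print(floats, pick)
```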


## 3. NumPy Array Copy vs View
Some NumPy operations return a view of the original array, so changes made through the view also change the original data. Other operations return a new array with its own copy of the data.

In [17]:
# copy of an array is a new array and should be used if you plan to manipulate the data 
# independently of the original values

arr1 = np.array([1, 2, 3, 4, 5]) # first arr 
arr2 = arr1.copy() # copy of first arr
arr1[0] = 42 # change index 0 element of arr1 to 42

print(arr1)
print(arr2)

[42  2  3  4  5]
[1 2 3 4 5]


In [18]:
# view of an array is just a view of the original array 

arr1 = np.array([1, 2, 3, 4, 5]) # first arr 
arr2 = arr1.view() # view of first arr (shares its data)
arr1[0] = 42 # change index 0 element of arr1 to 42

print(arr1)
print(arr2)

[42  2  3  4  5]
[42  2  3  4  5]


In [19]:
# slicing a numpy array produces a view into the original array
# operations on a slice change the original data
array = np.arange(9)
print(array)

slc = array[3:6]
print(slc)
slc[:] = 42 # assign through the slice; this writes into the original array's memory
print(slc)

print(array)

[0 1 2 3 4 5 6 7 8]
[3 4 5]
[42 42 42]
[ 0  1  2 42 42 42  6  7  8]
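
To check whether a result is a view or a copy, the `.base` attribute and `np.shares_memory` can help. The following is a small sketch with made-up variable names, not an exhaustive test for every NumPy operation.

```python
import numpy as np

original = np.arange(9)

sliced = original[3:6]          # basic slicing returns a view
copied = original[3:6].copy()   # .copy() allocates new memory
fancy = original[[3, 4, 5]]     # indexing with a list of indices returns a copy

print(sliced.base is original)               # the view's base is the original array
print(copied.base is None)                   # a copy owns its data
print(np.shares_memory(original, sliced))    # view shares memory with the original
print(np.shares_memory(original, fancy))     # fancy-indexed result does not
```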


## 4. NumPy Array Shape and reshape
#### Shape: The shape of an array is the number of elements in each dimension.


In [20]:
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

print(arr.ndim)
print(arr.shape)


2
(2, 4)


### Reshape  arrays
Being aware of the shape of data is critical in Machine Learning applications since so many of the functions are driven by 
operations based on linear algebra. 

Reshaping means changing the shape of an array. The shape of an array is the number of elements in each dimension. By reshaping we can add or remove dimensions or change the number of elements in each dimension.

#### Reshape From 1-D to 2-D

In [21]:
# Convert the following 1-D array with 12 elements into a 2-D array.
# The outermost dimension will have 4 arrays, each with 3 elements:

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
print(arr)
print(arr.ndim)
print(arr.shape)

newarr = arr.reshape(4, 3)
print('After reshaping..')
print(arr)
print(newarr.ndim)
print(newarr.shape)
newarr

[ 1  2  3  4  5  6  7  8  9 10 11 12]
1
(12,)
After reshaping..
[ 1  2  3  4  5  6  7  8  9 10 11 12]
2
(4, 3)


array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])

#### Reshape From 1-D to 3-D

In [22]:
# Convert the following 1-D array with 12 elements into a 3-D array.
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])

newarr = arr.reshape(2, 3, 2)

newarr

array([[[ 1,  2],
        [ 3,  4],
        [ 5,  6]],

       [[ 7,  8],
        [ 9, 10],
        [11, 12]]])

### Flattening the arrays
Flattening an array means converting a multidimensional array into a 1-D array.

In [23]:
arr = np.array([[1, 2, 3], [4, 5, 6]])
print("Original dimensions: ", arr.ndim)
print(arr)
newarr = arr.reshape(-1)
print(f"New array has dimensions {newarr.ndim}: ", newarr)


Original dimensions:  2
[[1 2 3]
 [4 5 6]]
New array has dimensions 1:  [1 2 3 4 5 6]


### Can We Reshape Into any Shape?

In [24]:
# Try converting a 1-D array with 8 elements to a 2-D array with 3 elements in each dimension
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

#  newarr = arr.reshape(3, 3)  # raises a ValueError when uncommented

#  print(newarr)

#### Yes, as long as both shapes contain the same total number of elements.

We can reshape a 1-D array of 8 elements into a 2-D array with 2 rows of 4 elements, but not into a 2-D array with 3 rows of 3 elements, because that would require 3 x 3 = 9 elements.
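
The rule can be checked directly by catching the error that `reshape` raises for an incompatible shape. This is a minimal sketch; the `failed` flag is only there to record the outcome.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

ok = arr.reshape(2, 4)           # 2 * 4 == 8, so this succeeds
print(ok.shape)

failed = False
try:
    arr.reshape(3, 3)            # 3 * 3 == 9 != 8, so this raises
except ValueError as err:
    failed = True
    print("reshape failed:", err)
```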



### The Unknown Dimension
You are allowed to have one "unknown" dimension. Meaning that you do not have to specify an exact number for one of the dimensions in the reshape method. Pass -1 as the value, and NumPy will calculate this number for you.

In [25]:
# Convert 1D array with 8 elements to 3D array with 2x2 elements:

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

newarr = arr.reshape(2, 2, -1)

print(newarr)
print("New array's dimensions:", newarr.ndim)
print("First level: ", newarr[0])  # if the goal is to work with a 2x2 shaped object accessing this level will be needed
print("Second level:", newarr[1][0])
print("Third level: ", newarr[1][0][1])

[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]
New array's dimensions: 3
First level:  [[1 2]
 [3 4]]
Second level: [5 6]
Third level:  6


### New Axis

In [26]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
arr
arr.shape

(8,)

In [27]:
new_arr = arr[:, np.newaxis]
print(new_arr.shape)
new_arr

(8, 1)


array([[1],
       [2],
       [3],
       [4],
       [5],
       [6],
       [7],
       [8]])

In [28]:
new_2 = arr[np.newaxis, :]
new_2

array([[1, 2, 3, 4, 5, 6, 7, 8]])
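
For comparison, the same column and row shapes can be produced with `reshape` and `np.expand_dims`. The sketch below only checks that the results match the `np.newaxis` versions above.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

column = arr[:, np.newaxis]   # shape (8, 1)
row = arr[np.newaxis, :]      # shape (1, 8)

# np.newaxis is an alias for None; reshape and expand_dims give the same arrays.
print(np.newaxis is None)
print(np.array_equal(column, arr.reshape(-1, 1)))
print(np.array_equal(row, np.expand_dims(arr, axis=0)))
```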

### Transpose

Aligning the shape of data becomes critical with some Machine Learning techniques such as Singular Value Decomposition or Principal Component Analysis.  The Transpose _numpy_ function is useful for switching the rows and columns and aligning the data.   

In [29]:
arr = np.arange(8).reshape(2,4)
print(arr)
arr.shape

[[0 1 2 3]
 [4 5 6 7]]


(2, 4)

In [30]:
arr_T = arr.T
arr_T
arr_T.shape

(4, 2)

In [31]:
arr_T.transpose()

array([[0, 1, 2, 3],
       [4, 5, 6, 7]])

## 5. NumPy Array Join

In [32]:
# Join two 2-D arrays side by side (axis=1), so each row of the result is longer:
arr1 = np.array([[1, 2], [3, 4]])

arr2 = np.array([[5, 6], [7, 8]])

arr = np.concatenate((arr1, arr2), axis=1)

print(arr)

[[1 2 5 6]
 [3 4 7 8]]


#### Joining Arrays Using Stack Functions
Stacking is similar to concatenation; the difference is that stacking joins the arrays along a new axis.

In [33]:
arr1 = np.array([1, 2, 3])

arr2 = np.array([4, 5, 6])

arr = np.stack((arr1, arr2), axis=1)

print(arr)

[[1 4]
 [2 5]
 [3 6]]
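
Comparing result shapes makes the "new axis" point concrete. In this sketch, concatenating two 1-D arrays keeps the result 1-D, while stacking them adds a dimension.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

joined = np.concatenate((a, b))        # joins along the existing axis
stacked_rows = np.stack((a, b), axis=0)   # new axis first: each input becomes a row
stacked_cols = np.stack((a, b), axis=1)   # new axis last: each input becomes a column

print(joined.shape)         # (6,)
print(stacked_rows.shape)   # (2, 3)
print(stacked_cols.shape)   # (3, 2)
```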


In [34]:
# hstack: stack horizontally; 1-D arrays are joined end to end

arr1 = np.array([1, 2, 3])

arr2 = np.array([4, 5, 6])

arr = np.hstack((arr1, arr2))

print(arr)

[1 2 3 4 5 6]


In [35]:
# vstack: stack vertically; each 1-D array becomes a row
arr1 = np.array([1, 2, 3])

arr2 = np.array([4, 5, 6])

arr = np.vstack((arr1, arr2))

print(arr)

[[1 2 3]
 [4 5 6]]


In [36]:
# dstack: stack along the third axis (depth); elements at the same position are paired together
arr1 = np.array([1, 2, 3])

arr2 = np.array([4, 5, 6])

arr = np.dstack((arr1, arr2))

print(arr)

[[[1 4]
  [2 5]
  [3 6]]]


## 6. Sorting and Index Operations

### Sort

In [37]:
# numpy can either sort in place 
arr = np.random.randint(10, size=(10))
print(arr)
arr.sort()
print(arr)

[6 0 4 9 4 4 8 1 4 1]
[0 1 1 4 4 4 4 6 8 9]


In [38]:
# or numpy can return a separate sorted array
arr = np.random.randint(10, size=(10))
print(arr)
sorted_arr = np.sort(arr)
print(sorted_arr)
print(arr)

[8 9 7 8 4 2 6 2 6 9]
[2 2 4 6 6 7 8 8 9 9]
[8 9 7 8 4 2 6 2 6 9]


### NumPy Searching Arrays
_np.where_ returns **the indices** of the elements that meet a search criterion.

In [39]:
# case 1: single term

arr = np.array([1, 2, 3, 4, 5, 4, 4])

x = np.where(arr == 4) # search term = 4

print(x) # the indices of the original array that equal the search term

(array([3, 5, 6]),)


In [40]:
# case 2: derived value

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

x = np.where(arr%2 == 0) # produce the indices of the even number values in the original array

print(x)

(array([1, 3, 5, 7]),)


In [41]:
# case 3: binary search in a sorted array (np.searchsorted does not sort for you)

arr = np.sort(np.array([4, 1, 8, 9])) # sorted: [1, 4, 8, 9]

x = np.searchsorted(arr, 9) # index at which 9 would be inserted to keep arr sorted; here, the position of 9

print(x)

3


### Conditional assignment
NumPy can assign values conditionally with the _np.where_ function, similar to Python's conditional (ternary) expression: `x if <condition> else y`

The statement takes the form: `np.where(<condition on the array>, <value if condition is True>, <value if condition is False>)`.

In [42]:
arr = np.random.randint(10, size=(10))
arr

array([1, 6, 1, 2, 9, 8, 8, 9, 8, 1])

In [43]:
# odd members of arr are kept as they are; even members are increased by 1
all_odd = np.where(arr % 2 == 1, arr, arr + 1)
all_odd

array([1, 7, 1, 3, 9, 9, 9, 9, 9, 1])

### Boolean Masks
The truth values of conditional statements can be used to filter the array

In [44]:
evens = np.where(arr % 2 == 0) # returns the indices of the even numbers in arr
print("The original array: ", arr)
print(f"The even indices {evens}")
print(f"The even members of arr {arr[evens]}")  # selects within the array for the indices passing the condition

The original array:  [1 6 1 2 9 8 8 9 8 1]
The even indices (array([1, 3, 5, 6, 8]),)
The even members of arr [6 2 8 8 8]
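
The condition itself is already a Boolean array and can be used as a mask directly, without going through `np.where`. The sketch below reuses the values of `arr` printed above; the second condition (`arr > 5`) is an added illustration.

```python
import numpy as np

arr = np.array([1, 6, 1, 2, 9, 8, 8, 9, 8, 1])

mask = arr % 2 == 0                  # Boolean array, True where the value is even
print(mask)
print(arr[mask])                     # keeps only the elements where mask is True
print(mask.sum())                    # True counts as 1, so this counts the even values

# Conditions can be combined element-wise with & (and) and | (or).
large_evens = arr[(arr % 2 == 0) & (arr > 5)]
print(large_evens)
```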


### Argsort

In [45]:
# np.argsort(array) returns the indices that would sort the array in ascending order;
# indexing the original array with those indices yields the sorted values

arr = np.random.randint(10, size=(10))
arr

array([4, 5, 2, 3, 3, 9, 5, 4, 7, 4])

In [46]:
order_indices = np.argsort(arr)
order_indices

array([2, 3, 4, 0, 7, 9, 1, 6, 8, 5])

In [47]:
print("The original order: ", arr)
print("The indices needed to place the values in order: ", order_indices)
print("Once the values at the given indices have been moved: ", arr[order_indices])

The original order:  [4 5 2 3 3 9 5 4 7 4]
The indices needed to place the values in order:  [2 3 4 0 7 9 1 6 8 5]
Once the values at the given indices have been moved:  [2 3 3 4 4 4 5 5 7 9]
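
A common use of these indices is to reorder a second, related array. The sketch below uses hypothetical `scores` and `labels` arrays and reverses the index array to get a descending order.

```python
import numpy as np

# Hypothetical data: labels[i] belongs to scores[i].
scores = np.array([72, 95, 88, 60])
labels = np.array(["c", "a", "b", "d"])

ascending = np.argsort(scores)    # indices from lowest to highest score
descending = ascending[::-1]      # reversed: highest score first

print(scores[descending])
print(labels[descending])         # labels rearranged to follow the scores
```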


To reorder a multidimensional array by one of its rows, apply _np.argsort_ to that row and use the resulting indices to select the columns. Every other row is then rearranged according to that row's order.

In [48]:
arr = np.random.randint(10, size = (3, 10))
arr

array([[9, 9, 9, 4, 7, 6, 7, 2, 3, 6],
       [1, 7, 2, 7, 6, 0, 4, 2, 5, 3],
       [5, 9, 4, 4, 2, 8, 6, 7, 4, 3]])

In [49]:
# the index values of the middle row values needed to put the middle row in order
middle_sort = np.argsort(arr[1])
middle_sort

array([5, 0, 2, 7, 9, 6, 8, 4, 1, 3])

In [50]:
# ordering the full 3x10 array by the middle row's order
print(arr)
print(arr[:, middle_sort])

[[9 9 9 4 7 6 7 2 3 6]
 [1 7 2 7 6 0 4 2 5 3]
 [5 9 4 4 2 8 6 7 4 3]]
[[6 9 9 2 6 7 3 7 9 4]
 [0 1 2 2 3 4 5 6 7 7]
 [8 5 4 7 3 6 4 2 9 4]]


Rows 0 and 2 are ordered according to the index values from row 1.

### Lexsort

In [51]:
# np.lexsort returns a sort order computed from several key arrays,
# so that ties in one key are broken by another key
# the arrangement is slightly counter-intuitive: np.lexsort((<secondary_array>, <primary_array>)), the last key is the primary one

In [52]:
arr_a = np.array(['rain', 'light', 'mirage', 'sound'])
arr_b = np.array(['sun', 'moon', 'moon', 'bird'])

In [53]:
order = np.lexsort([arr_a, arr_b])
sorted_arrs = zip(arr_a[order], arr_b[order])

In [54]:
for item in sorted_arrs:
    print(item)

('sound', 'bird')
('light', 'moon')
('mirage', 'moon')
('rain', 'sun')


The second array (the last key) sets the primary order; duplicate values are then ordered according to the first.
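
The same key ordering can be seen with a name-sorting sketch. The `first` and `last` arrays are made up for illustration: `last` is passed last, so it is the primary key, and `first` only breaks ties.

```python
import numpy as np

# Hypothetical data: first[i] and last[i] describe the same person.
first = np.array(["Zoe", "Amy", "Bob", "Cal"])
last = np.array(["Smith", "Jones", "Smith", "Adams"])

# Sort by last name; people sharing a last name are ordered by first name.
order = np.lexsort((first, last))

for f, l in zip(first[order], last[order]):
    print(l, f)
```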

## 7. NumPy ufuncs
#### What are ufuncs?
_ufuncs_ stands for "Universal Functions"; they are NumPy functions that operate element by element on ndarray objects.
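
As a small sketch of what that means in practice: the arithmetic operators on arrays call the corresponding ufuncs, so the two spellings below give the same results.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([4, 5, 6, 7])

print(isinstance(np.add, np.ufunc))               # np.add is a ufunc object
print(np.array_equal(a + b, np.add(a, b)))        # + uses np.add
print(np.array_equal(a * b, np.multiply(a, b)))   # * uses np.multiply
```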

#### Add the Elements of Two Lists

In [55]:
x = [1, 2, 3, 4]
y = [4, 5, 6, 7]
z = np.add(x, y)

print(z)

[ 5  7  9 11]


#### Subtraction

In [56]:
arr1 = np.array([10, 20, 30, 40, 50, 60])
arr2 = np.array([20, 21, 22, 23, 24, 25])

newarr = np.subtract(arr1, arr2)

print(newarr)

[-10  -1   8  17  26  35]


#### Multiplication

In [57]:
arr1 = np.array([10, 20, 30, 40, 50, 60])
arr2 = np.array([20, 21, 22, 23, 24, 25])

newarr = np.multiply(arr1, arr2)

print(newarr)

[ 200  420  660  920 1200 1500]


#### Division

In [58]:
arr1 = np.array([10, 20, 30, 40, 50, 60])
arr2 = np.array([3, 5, 10, 8, 2, 33])

newarr = np.divide(arr1, arr2)

print(newarr)

[ 3.33333333  4.          3.          5.         25.          1.81818182]


#### Power   
$ x^y $

In [59]:
arr1 = np.array([10, 20, 30, 40, 50, 60])
arr2 = np.array([3, 5, 6, 8, 2, 33])

newarr = np.power(arr1, arr2)

print(newarr)

[         1000       3200000     729000000 6553600000000          2500
             0]
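
The final `0` in the output is not the true value of 60 to the power 33: the result is too large for a 64-bit integer, and NumPy integer arithmetic wraps around instead of raising an error. The sketch below assumes `int64` inputs and shows that casting to float gives an approximate value of the right magnitude.

```python
import numpy as np

base = np.array([60], dtype=np.int64)
exponent = np.array([33], dtype=np.int64)

wrapped = np.power(base, exponent)                        # exceeds the int64 range and wraps around
as_float = np.power(base.astype(np.float64), exponent)   # approximate, but the magnitude is right

print(wrapped)
print(as_float)
print(60 ** 33)   # exact value using Python's arbitrary-precision int
```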


NumPy raises a `ValueError` if an integer array is raised to a negative integer power:

In [69]:
integer_values = np.array([1, 2, 3])
# neg_pow = np.power(integer_values, -2)

A workaround is to cast the integers to a float data type first, so that NumPy can represent the fractional results:

In [65]:
integer_values_as_float = (integer_values).astype(float)
neg_power = np.power(integer_values_as_float, -2)
neg_power

array([1.        , 0.25      , 0.11111111])

#### Remainder

In [None]:
arr1 = np.array([10, 20, 30, 40, 50, 60])
arr2 = np.array([3, 7, 9, 8, 2, 33])

newarr = np.mod(arr1, arr2)

print(newarr)

#### Quotient and Mod

In [None]:
arr1 = np.array([10, 20, 30, 40, 50, 60])
arr2 = np.array([3, 7, 9, 8, 2, 33])

newarr = np.divmod(arr1, arr2)

print(newarr)

#### Other ufuncs

#### Finding LCM (Lowest Common Multiple)

In [None]:
# case 1 
num1 = 4
num2 = 6

x = np.lcm(num1, num2)

print(x)

In [None]:
# case 2
# To find the Lowest Common Multiple of all values in an array, you can use the reduce() method.
arr = np.array([3, 6, 9])

x = np.lcm.reduce(arr)

print(x)

#### Finding GCD (Greatest Common Divisor)

In [None]:
# case 1
num1 = 6
num2 = 9

x = np.gcd(num1, num2)

print(x)

In [None]:
# case 2
arr = np.array([20, 8, 32, 36, 16])

x = np.gcd.reduce(arr)

print(x)
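
The `reduce()` method used in the LCM and GCD examples is available on binary ufuncs in general: it applies the function cumulatively until one value is left. A minimal sketch with `np.add` and `np.multiply` for comparison:

```python
import numpy as np

arr = np.array([3, 6, 9])

total = np.add.reduce(arr)             # ((3 + 6) + 9), same as arr.sum()
product = np.multiply.reduce(arr)      # ((3 * 6) * 9), same as arr.prod()
lcm_all = np.lcm.reduce(arr)           # lcm(lcm(3, 6), 9)
running = np.add.accumulate(arr)       # keeps the intermediate results

print(total, product, lcm_all)
print(running)
```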

#### Create Sets in NumPy (set elements are unique)

In [None]:
arr = np.array([1, 1, 1, 2, 3, 4, 5, 5, 6, 7])

x = np.unique(arr)

print(x)

#### Finding Union

In [None]:
arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([3, 4, 5, 6])

newarr = np.union1d(arr1, arr2)

print(newarr)

#### Finding Intersection

In [None]:
arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([3, 4, 5, 6])

newarr = np.intersect1d(arr1, arr2, assume_unique=True)

print(newarr)

### Other useful NumPy operations

#### Trigonometric Functions

In [None]:
theta = np.linspace(0, np.pi, 5)
print('sin(theta): ', np.sin(theta))
print('cos(theta): ', np.cos(theta))

#### Logarithms

In [None]:
x = np.random.randint(1, 4, size=4)
print("x =", x)
print("ln(x) = ", np.log(x))
print("log2(x) = ", np.log2(x))

#### Aggregation: max and min

In [None]:
arr = np.random.rand(20)
print(arr)
# maximum
print('maximum:', np.max(arr))
# Minimum
print('minimum: ', np.min(arr))

#### Argmax and Argmin
    syntax: numpy.argmax(a, axis=None, out=None)
    The numpy.argmax() function returns indices of the max element of the array in a particular axis. 
    Parameters:
    
    a: array_like input array
    axis: int, optional
    By default, the index is into the flattened array, otherwise along the specified axis.
    
    out: array, optional
    If provided, the result will be inserted into this array. It should be of the appropriate shape and dtype.
    
    Returns: index_array ndarray of ints
    

In [None]:
a = np.arange(6).reshape(2,3) + 10
print(a)
# case 1: if axis is not specified, the array will be flattened
print(np.argmax(a))
print(np.argmin(a))
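
When no axis is given, the returned index refers to the flattened array. If the row and column position is needed instead, `np.unravel_index` can convert it, as in this sketch:

```python
import numpy as np

a = np.arange(6).reshape(2, 3) + 10

flat_index = np.argmax(a)                           # index into the flattened array
row, col = np.unravel_index(flat_index, a.shape)    # convert to (row, column)

print(flat_index)
print(row, col)
print(a[row, col] == a.max())
```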

In [None]:
# case 2: axis=0 performs the operation down each column (one result per column)
print(np.argmax(a, axis=0))
print(np.argmin(a, axis=0))

In [None]:
# case 3: axis=1 performs the operation across each row (one result per row)
print(np.argmax(a, axis=1))
print(np.argmin(a, axis=1))

#### Multidimensional aggregates

In [None]:
np.random.seed(10)
daily_temperatures = np.random.randint(20, 100, size=(4, 6))
print(daily_temperatures)
# find the min and max down each column (axis=0), giving one value per column
print('lowest daily temperature: ',daily_temperatures.min(axis=0))
print('highest daily temperature: ',daily_temperatures.max(axis=0))
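
The effect of the `axis` argument is easier to see on a small, fixed array. The values below are made up for illustration; note how the result shape loses the axis that was aggregated over.

```python
import numpy as np

temps = np.array([[10, 20, 30],
                  [40, 50, 60]])    # shape (2, 3): 2 rows, 3 columns

col_min = temps.min(axis=0)   # collapse the rows: one minimum per column
row_min = temps.min(axis=1)   # collapse the columns: one minimum per row
overall = temps.max()         # no axis: a single value for the whole array

print(col_min, col_min.shape)
print(row_min, row_min.shape)
print(overall)
```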

## 8. Polynomials

### Defining Polynomials

It is possible to model a polynomial by returning a function that uses the variables:

$y = 2x^2 + x + 1$

In [3]:
def polynomial():
    return lambda x: 2 * x**2 + x + 1

In [18]:
poly = polynomial()

In [5]:
x = 2
poly(x)

11

In [7]:
def polynomial_2():
    return lambda a,b: 2 * a**2 * b + 2 * b + 1

In [8]:
poly_2 = polynomial_2()

In [9]:
poly_2(1, 2)

9

### Using numpy

$y = x^2 + 2x - 3$

In [21]:
poly_3 = np.poly1d([1, 2, -3])

In [22]:
poly_3(2)

5

### Derivative of a polynomial

In [14]:
poly_3_der = poly_3.deriv()

In [15]:
poly_3_der(2)

6

### Solving the roots of a Polynomial

In [23]:
np.roots(poly_3)

array([-3.,  1.])
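
NumPy also provides the `numpy.polynomial.Polynomial` class as an alternative to `np.poly1d`. The sketch below models the same polynomial, $y = x^2 + 2x - 3$; note that, unlike `np.poly1d`, this class takes coefficients from the lowest degree to the highest.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Coefficients in increasing order of degree: -3 + 2x + 1x^2
p = Polynomial([-3, 2, 1])

value = p(2)              # evaluate at x = 2
slope = p.deriv()(2)      # derivative 2x + 2, evaluated at x = 2
roots = p.roots()         # values of x where the polynomial is zero

print(value)
print(slope)
print(roots)
```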