### Intro To NumPy:
NumPy, which stands for Numerical Python, is a powerful library for numerical computing in Python. It provides support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on them. It extends the base capabilities of Python with a richer set of data types, including more numeric types, vectors, and matrices, as well as many matrix functions. NumPy and Python work together fairly seamlessly: Python arithmetic operators work on NumPy data types, and many NumPy functions accept Python data types.

In [91]:
import numpy as np    # it is an unofficial standard to use np for numpy
import time

### Vectors
Vectors, as you will use them in this course, are ordered arrays of numbers of a single type. A vector does not, for example, contain both characters and numbers. The number of elements in the array is often referred to as the dimension, though mathematicians may prefer rank.

### NumPy Arrays
NumPy's basic data structure is an indexable, n-dimensional array containing elements of the same type (dtype).
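
As a small illustration of the "same type" rule, the sketch below shows what happens when integers and a float are mixed in one array. The variable names are made up for this example.

```python
import numpy as np

# Mixing integers and a float: NumPy stores every element with one common dtype.
mixed = np.array([1, 2, 3.5])
print(mixed, mixed.dtype)          # the integers are upcast to floats

# A dtype can also be requested explicitly at creation time.
ints_as_float = np.array([1, 2, 3], dtype=np.float64)
print(ints_as_float, ints_as_float.dtype)

# ndim is the number of axes, size is the total number of elements.
print(mixed.ndim, mixed.size)
```
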

In [92]:
#Vector Creation

a = np.zeros(4);                             print(f"np.zeros(4):\na = {a},\na's shape = {a.shape},\na's data type = {a.dtype}\n")
a = np.zeros((4,));                          print(f"np.zeros((4,)):\na = {a},\na's shape = {a.shape},\na's data type = {a.dtype}\n")
a = np.random.random_sample(4);             print(f"np.random.random_sample(4):\na = {a},\na's shape = {a.shape},\na's data type = {a.dtype}")

np.zeros(4):
a = [0. 0. 0. 0.],
a's shape = (4,),
a's data type = float64

np.zeros((4,)):
a = [0. 0. 0. 0.],
a's shape = (4,),
a's data type = float64

np.random.random_sample(4):
a = [0.20451517 0.27931367 0.43080526 0.67084387],
a's shape = (4,),
a's data type = float64


In [93]:
#manual method
arr = np.array([1,2,3,4]);          print(f"NumPy Array: {arr}, Array Shape: {arr.shape}, Array data type: {arr.dtype}")

NumPy Array: [1 2 3 4], Array Shape: (4,), Array data type: int32


### Shape and Reshape NumPy Arrays:
The shape of an array defines its dimensions. Learn how to manipulate the shape of NumPy arrays and reshape them according to your requirements.

In [124]:
# Shape and Reshape
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
reshaped_arr = arr.reshape(-1, 3)
print("Original Array:", arr)
print(f"Reshaped Array:\n {reshaped_arr}\n Reshaped Array Shape: {reshaped_arr.shape}")

Original Array: [1 2 3 4 5 6 7 8 9]
Reshaped Array:
 [[1 2 3]
 [4 5 6]
 [7 8 9]]
 Reshaped Array Shape: (3, 3)
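
The `-1` argument used above asks NumPy to infer that dimension from the array's size. The following sketch, using a hypothetical 12-element array, shows the inference in both positions and what happens when the requested shape cannot hold all the elements.

```python
import numpy as np

arr = np.arange(1, 13)             # 12 elements: 1..12

# -1 tells NumPy to work out that dimension from the total size.
three_cols = arr.reshape(-1, 3)    # 12 / 3 = 4 rows
two_rows = arr.reshape(2, -1)      # 12 / 2 = 6 columns
print(three_cols.shape, two_rows.shape)

# A shape that does not match the element count raises ValueError.
reshape_failed = False
try:
    arr.reshape(5, -1)             # 12 is not divisible by 5
except ValueError as e:
    reshape_failed = True
    print("reshape failed:", e)
```
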


### Sorting NumPy Arrays The Right Way:
NumPy provides efficient functions for sorting arrays. `np.sort` returns a sorted copy and leaves the original array unchanged.

In [109]:
# Sorting a NumPy array
arr = np.array([3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5])
sorted_arr = np.sort(arr)
print("Original Array:", arr)
print("Sorted Array:", sorted_arr)

Original Array: [3 1 4 1 5 9 2 6 5 3 5]
Sorted Array: [1 1 2 3 3 4 5 5 5 6 9]
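
Beyond `np.sort`, a few related calls are often useful. The sketch below is one possible illustration: `np.argsort` returns the indices that would sort the array, a reversed slice gives descending order, and the `ndarray.sort` method sorts in place. The `kind="stable"` argument is chosen here so that equal values keep their original relative order.

```python
import numpy as np

arr = np.array([3, 1, 4, 1, 5, 9, 2, 6])

# Indices that would sort arr; a stable sort keeps ties in original order.
order = np.argsort(arr, kind="stable")
print("argsort:", order)
print("arr[order]:", arr[order])

# Descending order: sort ascending, then reverse with a slice.
descending = np.sort(arr)[::-1]
print("descending:", descending)

# In-place sort: modifies the array itself and returns None.
arr_copy = arr.copy()
arr_copy.sort()
print("sorted in place:", arr_copy)
```
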


### Searching NumPy Arrays The Easy Way:
Learn techniques for finding elements in NumPy arrays, including boolean comparisons combined with `np.where` and the `np.searchsorted` function.

In [110]:
# Searching in a NumPy array
arr = np.array([1, 2, 3, 4, 5])
index = np.where(arr == 3)
print("Original Array:", arr)
print("Index of 3:", index[0])

Original Array: [1 2 3 4 5]
Index of 3: [2]
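
The `searchsorted` function mentioned above is not used in the cell, so here is a minimal sketch of it. It assumes the input array is already sorted in ascending order, and the array values are made up for illustration.

```python
import numpy as np

sorted_arr = np.array([1, 3, 5, 7, 9])   # must already be sorted

# Index at which the value would be inserted to keep the array sorted.
pos_present = np.searchsorted(sorted_arr, 5)   # 5 is found at index 2
pos_missing = np.searchsorted(sorted_arr, 6)   # 6 would go before 7
print(pos_present, pos_missing)

# searchsorted alone does not say whether the value exists; check explicitly.
found = pos_present < sorted_arr.size and sorted_arr[pos_present] == 5
print("5 is present:", found)

# Indices of all elements matching a condition, as a flat array.
indices = np.flatnonzero(sorted_arr > 4)
print("indices where > 4:", indices)
```
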


### How To Filter NumPy Arrays:
Filtering allows us to extract specific elements from an array based on certain conditions. Understand the techniques for filtering NumPy arrays effectively.

In [111]:
# Filtering in a NumPy array
arr = np.array([1, 2, 3, 4, 5])
filtered_arr = arr[arr > 2]
print("Original Array:", arr)
print("Filtered Array:", filtered_arr)

Original Array: [1 2 3 4 5]
Filtered Array: [3 4 5]
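
Conditions can also be combined. The sketch below, with a made-up array, uses the element-wise operators `&` (and) and `|` (or); each comparison needs its own parentheses because these operators bind more tightly than `>` or `==`.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

# A boolean mask has one True/False entry per element.
mask = (arr > 2) & (arr % 2 == 0)
print("mask:", mask)
selected = arr[mask]
print("greater than 2 and even:", selected)

# Element-wise OR of two conditions.
either = arr[(arr < 2) | (arr > 5)]
print("less than 2 or greater than 5:", either)

# np.where with three arguments replaces instead of filtering.
replaced = np.where(arr > 3, 0, arr)
print("values above 3 set to 0:", replaced)
```
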


### Mean, Median, and Standard Deviation Calculation

In [112]:
# Mean, Median, and Standard Deviation Calculation with NumPy
arr = np.array([1, 2, 3, 4, 5, 6, 6, 7, 8, 8, 9])

# Mean calculation
mean_value = np.mean(arr)
print("Mean:", mean_value)

# Median calculation
median_value = np.median(arr)
print("Median:", median_value)

# Standard deviation (note: arr is reassigned to a smaller array here)
arr = np.array([1, 2, 3, 4, 5])
std_deviation = np.std(arr)
print("Standard Deviation:", std_deviation)

Mean: 5.363636363636363
Median: 6.0
Standard Deviation: 1.4142135623730951
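
By default `np.std` computes the population standard deviation (it divides by `n`). The sketch below compares it with the sample standard deviation, obtained with `ddof=1` (divide by `n - 1`), and with a manual calculation on the same five-element array.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Population standard deviation: divide the squared deviations by n.
pop_std = np.std(arr)

# Sample standard deviation: ddof=1 divides by n - 1 instead.
sample_std = np.std(arr, ddof=1)

# The population value written out by hand.
manual = np.sqrt(np.mean((arr - arr.mean()) ** 2))

print(f"population std: {pop_std:.4f}")
print(f"sample std:     {sample_std:.4f}")
print(f"manual:         {manual:.4f}")
```
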


### Indexing and Slicing
**Indexing:** Indexing means referring to an element of an array by its position within the array. 

**Slicing:** Slicing means getting a subset of elements from an array based on their indices.

In [94]:
#vector indexing
a = np.arange(10)
print(a)

#access an element
print(f"a[2].shape: {a[2].shape} a[2] = {a[2]}")#Accessing an element returns a scalar 

#access the last element
print(f"\na[-1] = {a[-1]}")

#outside range of the vector
try:
    c = a[10]
except Exception as e:
    print("\nThe error message you'll see is:")
    print(e)

[0 1 2 3 4 5 6 7 8 9]
a[2].shape: () a[2] = 2

a[-1] = 9

The error message you'll see is:
index 10 is out of bounds for axis 0 with size 10


In [95]:
#Slicing
a = np.arange(10)
print(f"a                  =  {a}")

#5 consecutive elements (start:stop:step) -> from start up to, but not including, stop
c = a[2:7];     print("a[2:7] or a[2:7:1] = ", c)

#3 elements separated by two
c = a[2:7:2];     print("a[2:7:2]           = ", c)

#all elements at and above index 3
c = a[3:];        print("a[3:]              = ", c)

#all elements below index 3
c = a[:3];        print("a[:3]              = ", c)

#all elements
c = a[:];         print("a[:]               = ", c);

a                  =  [0 1 2 3 4 5 6 7 8 9]
a[2:7] or a[2:7:1] =  [2 3 4 5 6]
a[2:7:2]           =  [2 4 6]
a[3:]              =  [3 4 5 6 7 8 9]
a[:3]              =  [0 1 2]
a[:]               =  [0 1 2 3 4 5 6 7 8 9]
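
One detail worth knowing when slicing: a basic slice of a NumPy array is a view onto the same data rather than a copy, so writing through the slice changes the original. The following sketch illustrates this with made-up variable names, and shows `.copy()` as one way to get independent data.

```python
import numpy as np

a = np.arange(10)

# A basic slice is a view: it shares memory with a.
view = a[2:5]
view[0] = 99                       # also changes a[2]
print("a after writing through the view:", a)

# .copy() produces independent data.
independent = a[2:5].copy()
independent[1] = -1                # a is not affected
print("a after writing to the copy:     ", a)

# A negative step walks the array backwards.
reversed_a = a[::-1]
print("reversed:", reversed_a)
```
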


### Single Vector Operations

In [96]:
a= np.array([1,2,3,4])
print(f"a             : {a}")

#negate all the elements of the vector array
b = -a
print(f"b = -a        : {b}")

#sum of all elements returns a scalar
b = np.sum(a)
print(f"b = np.sum(a) : {b}")

#mean
b = np.mean(a)
print(f"b = np.mean(a): {b}")

#square of all the elements
b = a**2
print(f"b = a**2      : {b}")

a             : [1 2 3 4]
b = -a        : [-1 -2 -3 -4]
b = np.sum(a) : 10
b = np.mean(a): 2.5
b = a**2      : [ 1  4  9 16]


### Vector Vector element-wise operations

In [97]:
a = np.array([ 1, 2, 3, 4])
b = np.array([-1,-2, 3, 4])
print(f"Binary operators work element wise: {a + b}")

Binary operators work element wise: [0 0 6 8]


In [98]:
#try a mismatched vector operation
c = np.array([1,2])
try:
    d = a + c
except Exception as e:
    print("The error message you'll see is:")
    print(e)

The error message you'll see is:
operands could not be broadcast together with shapes (4,) (2,) 


### Scalar Vector operations
Vectors can be 'scaled' by scalar values. A scalar value is just a number. The scalar multiplies all the elements of the vector.

In [99]:
a = np.array([1, 2, 3, 4])

# multiply a by a scalar
b = 5 * a 
print(f"b = 5 * a : {b}")

b = 5 * a : [ 5 10 15 20]


### Vector Vector dot product
The dot product multiplies the values in two vectors element-wise and then sums the result. Vector dot product requires the dimensions of the two vectors to be the same. 

**Using a for loop**, implement a function which returns the dot product of two vectors. Given inputs $a$ and $b$, the function should return:
$$ x = \sum_{i=0}^{n-1} a_i b_i $$
Assume both `a` and `b` are the same shape.

In [100]:
def my_dot(a,b):
    """
    Compute the dot product of two vectors

    Args:
      a (ndarray (n,)):  input vector
      b (ndarray (n,)):  input vector with same dimension as a

    Returns:
      x (scalar): dot product of a and b
    """
    x = 0
    for i in range(a.shape[0]):
        x = x + a[i] * b[i]
    return x

In [101]:
# self-designed dot product using a for loop
a = np.array([ 1, 2, 3, 4])
b = np.array([-1, 4, 3, 4])

#start  = timer()
print(f"my_dot(a,b) = {my_dot(a,b)}")
#end = timer()

#print(f"Time of execution using for loop = {end-start}")

my_dot(a,b) = 32


In [102]:
a = np.array([ 1, 2, 3, 4])
b = np.array([-1, 4, 3, 4])

#start = timer()
print(f"np.dot(a,b) = {np.dot(a,b)}")
#end = timer()

#print(f"Time of execution using inbuilt vector dot product function = {end-start}")

np.dot(a,b) = 32
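
The definition above (multiply element-wise, then sum) can be written directly in NumPy. The sketch below uses the same two vectors and shows three equivalent spellings for 1-D arrays: an explicit product and sum, `np.dot`, and the `@` operator.

```python
import numpy as np

a = np.array([ 1, 2, 3, 4])
b = np.array([-1, 4, 3, 4])

# Element-wise product followed by a sum, mirroring the formula.
via_sum = np.sum(a * b)

# Library routine and the @ operator give the same scalar for 1-D inputs.
via_dot = np.dot(a, b)
via_matmul = a @ b

print(f"a * b       = {a * b}")
print(f"np.sum(a*b) = {via_sum}")
print(f"np.dot(a,b) = {via_dot}")
print(f"a @ b       = {via_matmul}")
```
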


### The Need for Speed: vector vs for loop
We use the NumPy library because it improves both speed and memory efficiency.

In [103]:
#Timeit module provides a simple way to time small bits of Python code. 
#It has both a Command-Line Interface as well as a callable one.
import timeit 
from timeit import default_timer as timer

In [107]:
np.random.seed(1)
a = np.random.rand(10000000)
b = np.random.rand(10000000)

start = timer()
c = np.dot(a,b)
end = timer()

print(f"np.dot = {c:.4f}\nVectorized version takes {1000*(end-start):.4f} ms")

start = timer()
c = my_dot(a,b)
end = timer()

print(f"my_dot = {c:.4f}\nLoop version takes {1000*(end-start):.4f} ms")

np.dot = 2501072.5817
Vectorized version takes 7.1486 ms
my_dot = 2501072.5817
Loop version takes 1893.3546 ms


So, vectorization provides a large speedup in this example (roughly 265x in the run above). This is because NumPy makes better use of the data parallelism available in the underlying hardware. GPUs and modern CPUs implement Single Instruction, Multiple Data (SIMD) pipelines, allowing multiple operations to be issued in parallel. This is critical in machine learning, where the data sets are often very large.
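
A single start/stop measurement can be noisy. As one possible refinement, the sketch below uses `timeit.repeat` to run each version several times and keeps the best time. The helper `loop_dot`, the smaller array size, and the repeat counts are choices made for this example, and the absolute numbers will vary by machine.

```python
import timeit
import numpy as np

def loop_dot(a, b):
    """Dot product with an explicit Python loop (for comparison only)."""
    total = 0.0
    for i in range(a.shape[0]):
        total += a[i] * b[i]
    return total

rng = np.random.default_rng(1)
a = rng.random(100_000)            # smaller than above so the loop stays quick
b = rng.random(100_000)

# Best of 3 runs; the vectorized call is repeated 10 times per run, then averaged.
vec_time = min(timeit.repeat(lambda: np.dot(a, b), number=10, repeat=3)) / 10
loop_time = min(timeit.repeat(lambda: loop_dot(a, b), number=1, repeat=3))

print(f"np.dot  : {1000 * vec_time:.4f} ms per call")
print(f"loop_dot: {1000 * loop_time:.4f} ms per call")
```
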

### Matrix Creation

In [114]:
a = np.zeros((1, 5))                                       
print(f"a shape = {a.shape}\na = {a}\n")                     

a = np.zeros((2, 1))                                                                   
print(f"a shape = {a.shape}\na = {a}\n") 

a = np.random.random_sample((1, 1))  
print(f"a shape = {a.shape}\na = {a}") 

a shape = (1, 5)
a = [[0. 0. 0. 0. 0.]]

a shape = (2, 1)
a = [[0.]
 [0.]]

a shape = (1, 1)
a = [[0.04997798]]


In [117]:
# NumPy routines which allocate memory and fill with user specified values
a = np.array([[5], [4], [3]]);   print(f"a shape = {a.shape}\nnp.array: a = {a}")
a = np.array([[5],   # One can also
              [4],   # separate values
              [3]]); # into separate rows
print(f"a shape = {a.shape}\nnp.array: a = {a}")

a shape = (3, 1)
np.array: a = [[5]
 [4]
 [3]]
a shape = (3, 1)
np.array: a = [[5]
 [4]
 [3]]


### Operations on Matrices

### Indexing

In [137]:
a = np.arange(12).reshape(-1,2)
# reshape(-1, 2): the result has exactly 2 columns, and NumPy infers the
# number of rows by dividing the array length by the number of columns (12 / 2 = 6)
print(f"a.shape = {a.shape}\na = {a}\n")

print(f"a[2,0].shape = {a[2,0].shape}\na[2,0] = {a[2,0]}\ntype(a[2,0]) = {type(a[2, 0])}\n")

print(f"a[2].shape = {a[2].shape}\na[2] = {a[2]}\ntype(a[2]) = {type(a[2])}")

a.shape = (6, 2)
a = [[ 0  1]
 [ 2  3]
 [ 4  5]
 [ 6  7]
 [ 8  9]
 [10 11]]

a[2,0].shape = ()
a[2,0] = 4
type(a[2,0]) = <class 'numpy.int32'>

a[2].shape = (2,)
a[2] = [4 5]
type(a[2]) = <class 'numpy.ndarray'>


### Slicing

In [145]:
a = np.arange(20).reshape(-1,10)
print(f"a = {a}\n")

print(f"a[0, 2:7:1] = {a[0, 2:7:1]}\na[0, 2:7:1].shape = {a[0, 2:7:1].shape}\ntype(a[0, 2:7:1]) = {type(a[0, 2:7:1])}\n")

print(f"a[:, 2:7:1] = {a[:, 2:7:1]}\na[:, 2:7:1].shape = {a[:, 2:7:1].shape}\ntype(a[:, 2:7:1]) = {type(a[:, 2:7:1])}")

print(f"a[1, :] = {a[1, :]}\na[1, :].shape = {a[1, :].shape}")

a = [[ 0  1  2  3  4  5  6  7  8  9]
 [10 11 12 13 14 15 16 17 18 19]]

a[0, 2:7:1] = [2 3 4 5 6]
a[0, 2:7:1].shape = (5,)
type(a[0, 2:7:1]) = <class 'numpy.ndarray'>

a[:, 2:7:1] = [[ 2  3  4  5  6]
 [12 13 14 15 16]]
a[:, 2:7:1].shape = (2, 5)
type(a[:, 2:7:1]) = <class 'numpy.ndarray'>
a[1, :] = [10 11 12 13 14 15 16 17 18 19]
a[1, :].shape = (10,)
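
The same syntax selects columns. The sketch below, on the same 2 x 10 matrix, contrasts an integer index (which drops that axis) with a one-element slice (which keeps it), and shows a step applied to the column axis. The variable names are made up for this example.

```python
import numpy as np

a = np.arange(20).reshape(-1, 10)

# An integer index removes that axis: the result is 1-D.
col = a[:, 3]
print(f"a[:, 3]   = {col}, shape = {col.shape}")

# A one-element slice keeps the axis: the result is a 2-D column.
col_2d = a[:, 3:4]
print(f"a[:, 3:4] shape = {col_2d.shape}")

# Every fifth column of every row.
block = a[:, ::5]
print(f"a[:, ::5] = {block}")
```
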
