# Lesson: numpy Package

numpy is described as "the fundamental package for scientific computing with Python". It underpins pandas and many AI and ML packages.

It provides the np.ndarray data type (array for short). Unlike lists, all the elements in an array must be the same data type (usually int or float). We can do element-wise operations on these arrays to make our code simpler and shorter. Operations on arrays are also typically much faster than the equivalent loops over lists.
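As a small sketch of the two points above (a single data type per array, and element-wise operations replacing loops), the names `mixed`, `values`, `doubled_list` and `doubled_array` below are made up for illustration:

```python
import numpy as np

# Mixing ints and a float: numpy converts every element to one common dtype
mixed = np.array([1, 2, 3.5])
print("dtype:", mixed.dtype)  # float64

# A plain list needs a loop or comprehension for element-wise work
values = [1, 2, 3]
doubled_list = [v * 2 for v in values]

# The array equivalent is a single expression
doubled_array = np.array(values) * 2
print(doubled_list, doubled_array)
```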


In [None]:
import numpy as np

## Create numpy arrays

Create a numpy array from a list.  

We will use these arrays in many of the examples below.

In [None]:
squares = np.array([1, 4, 9, 16, 25])
print("squares:", squares)
print("type:", type(squares))

array1 = np.array([3,4,5])
array2 = np.array([30,40,50])
print("array1:", array1)
print("array2:", array2)


A numpy array can have 1, 2 or more dimensions.

In [None]:
# A numpy array can also have 2 or more dimensions
list_of_lists = [[1,2,3], [4,5,6]]
array2d = np.array(list_of_lists) # from a list of lists
array2d

An array has many useful properties, such as its shape and its number of dimensions.

In [None]:
print("shape:", array2d.shape)
print("number of dimensions:", array2d.ndim)
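A few other properties are often handy alongside `shape` and `ndim`. This is a minimal sketch; `grid` is a hypothetical name holding the same values as `array2d`, and the exact integer dtype printed depends on the platform.

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6]])  # same values as array2d

print("dtype:", grid.dtype)        # data type shared by every element
print("size:", grid.size)          # total number of elements
print("itemsize:", grid.itemsize)  # bytes used by one element
```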

Create numpy arrays using numpy functions

In [None]:
np_zeros = np.zeros(3) # 3 elements, all initialised to 0 (floats by default)
np_zeros

In [None]:
np_ones = np.ones(3) # 3 elements, all initialised to 1 (floats by default)
np_ones
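numpy has several other creation functions in the same family. The sketch below shows a few common ones; the variable names are illustrative only.

```python
import numpy as np

filled = np.full(3, 7)         # three elements, all with value 7
counting = np.arange(5)        # integers 0 up to, but not including, 5
spaced = np.linspace(0, 1, 5)  # 5 evenly spaced values from 0 to 1 inclusive
zeros_2d = np.zeros((2, 3))    # pass a tuple to get a 2D shape

print(filled)
print(counting)
print(spaced)
print(zeros_2d)
```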

## Index and Slice Arrays

We can index and slice arrays in the usual fashion.

In [None]:
# returns the element at index 2
indexed = squares[2] 
indexed

In [None]:
# returns the elements from index 2 up to, but not including, index 4
sliced = squares[2:4] 
sliced

With a 2D array, we use the [i, j] format as the index, where i selects the row and j selects the column.

In [None]:
print("element in first row, first column of array2d:", array2d[0,0])
print("type:", type(array2d[0,0]))
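The [i, j] format also accepts slices in either position. A short sketch, using a hypothetical `grid` with the same values as `array2d`:

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6]])

first_row = grid[0, :]     # every column of row 0
last_column = grid[:, -1]  # last column of every row
sub_block = grid[:, 0:2]   # all rows, columns 0 and 1

print("first row:", first_row)
print("last column:", last_column)
print("sub block:\n", sub_block)
```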

## Array Operations
We can take advantage of element-wise operations. We don't need to loop through the elements of the array.

Add (or multiply, subtract, divide...) a scalar value to each element in the array.

In [None]:
print("squares", squares)
print("squares + 10", squares + 10)

Add (or multiply...) two arrays.  This is an element-wise operation.  The arrays must have the same shape, or shapes that are compatible under broadcasting (see Broadcasting).

In [None]:
print("add two arrays:", array1 + array2)
print("multiply two arrays:", array1 * array2)

In [None]:
# the dot product of two 1D arrays is the sum of the products of their corresponding elements
np.dot(array1, array2)
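To make the definition concrete, the dot product can be checked by hand. This sketch uses hypothetical names `a` and `b` with the same values as `array1` and `array2` at this point in the lesson:

```python
import numpy as np

a = np.array([3, 4, 5])
b = np.array([30, 40, 50])

by_hand = (a * b).sum()  # 3*30 + 4*40 + 5*50
print("by hand:", by_hand)
print("np.dot:", np.dot(a, b))
print("@ operator:", a @ b)  # @ gives the same result for 1D arrays
```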

### Broadcasting
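Broadcasting allows operations between arrays of different but compatible shapes. Here array1 has shape (3,) and array2d has shape (2, 3), so array1 is added to each row of array2d.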

In [None]:
print("array1:\n", array1)
print("array2d:\n", array2d)
print("array1 + array2d:\n", array1 + array2d)
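The sketch below illustrates when shapes are compatible: numpy compares shapes from the right, and two dimensions match if they are equal or one of them is 1. The names `row`, `grid` and `column` are illustrative only.

```python
import numpy as np

row = np.array([3, 4, 5])                # shape (3,)
grid = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)

# (3,) matches the last axis of (2, 3), so row is added to each row of grid
print(row + grid)

# A column of shape (2, 1) is stretched across the 3 columns instead
column = np.array([[10], [20]])
print(grid + column)

# Shapes (2, 3) and (2,) are not compatible: 3 and 2 differ and neither is 1
try:
    grid + np.array([1, 2])
except ValueError as err:
    print("cannot broadcast:", err)
```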


## Filter arrays with boolean expressions

A boolean expression on an array returns an array of booleans with the same shape as the original array.

In [None]:
squares < 15

We can use this boolean array as an index (a boolean mask) to return a smaller array with only those elements that meet the criteria.

In [None]:
squares[squares < 15]
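Conditions can also be combined. A minimal sketch, assuming the same `squares` values as above; note that numpy uses `&` and `|` here rather than Python's `and` and `or`.

```python
import numpy as np

squares = np.array([1, 4, 9, 16, 25])

# Parentheses are required because & and | bind more tightly than < and >
middle = squares[(squares > 3) & (squares < 20)]
extremes = squares[(squares < 3) | (squares > 20)]

# True counts as 1, so summing a mask counts the matching elements
count_small = (squares < 15).sum()

print("middle:", middle)
print("extremes:", extremes)
print("count below 15:", count_small)
```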

## Statistical (aggregation) operations

We can aggregate the elements of a numeric array.


In [None]:
print("squares:", squares)
print("total sum:", squares.sum())
print("smallest value:", squares.min())
print("largest value:", squares.max())
print("average:", squares.mean())
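On a 2D array, the same methods accept an `axis` argument to aggregate down the columns or across the rows. A short sketch, with a hypothetical `grid` holding the same values as `array2d`:

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6]])

print("total sum:", grid.sum())              # all elements
print("column sums:", grid.sum(axis=0))      # collapse the rows
print("row sums:", grid.sum(axis=1))         # collapse the columns
print("row averages:", grid.mean(axis=1))
```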


## Reshape Arrays

In [None]:
evens = np.arange(2, 26, 2) # start at 2, stop before 26, step by 2
print("evens\n", evens)
print("shape:", evens.shape) # note the trailing comma in the result to indicate a tuple of 1 element

In [None]:
reshaped = evens.reshape(3, 4) # reshape to 3 rows, 4 columns
print("reshaped\n", reshaped)
print("shape:", reshaped.shape)
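Reshaping only works when the new shape holds exactly the same number of elements. The sketch below also shows `-1`, which asks numpy to infer one dimension; the names `auto` and `flat` are illustrative only.

```python
import numpy as np

evens = np.arange(2, 26, 2)  # 12 elements

auto = evens.reshape(2, -1)  # numpy infers 6 columns from the 12 elements
flat = auto.reshape(-1)      # back to a 1D array
print("auto shape:", auto.shape)
print("flat shape:", flat.shape)

# 12 elements cannot fill a 5x2 array, so this raises a ValueError
try:
    evens.reshape(5, 2)
except ValueError as err:
    print("cannot reshape:", err)
```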

## Stack Arrays

In [None]:
# Create a couple of 2D arrays
array1 = np.array([[1,2,3], [4,5,6]])
array2 = np.array([[7,8,9], [10,11,12]])
print("array1\n", array1)
print("array2\n", array2)


In [None]:
# Stack vertically
vstacked = np.vstack((array1, array2)) 
print("vstacked\n", vstacked)

In [None]:
# Stack horizontally
hstacked = np.hstack((array1, array2))
print("hstacked\n", hstacked)
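For 2D arrays like these, the same results can be obtained with `np.concatenate` and an explicit `axis`. A minimal sketch, using hypothetical names `top` and `bottom` with the same values as `array1` and `array2`:

```python
import numpy as np

top = np.array([[1, 2, 3], [4, 5, 6]])
bottom = np.array([[7, 8, 9], [10, 11, 12]])

rows_joined = np.concatenate((top, bottom), axis=0)  # same as vstack here
cols_joined = np.concatenate((top, bottom), axis=1)  # same as hstack here

print("rows joined shape:", rows_joined.shape)
print("columns joined shape:", cols_joined.shape)
```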

## Random number generation

Create a numpy array with a set of random values.

In [None]:
np.random.randn(5) # 5 samples from the standard normal distribution

In [None]:
np.random.randn(4, 2) # 4x2 array of random numbers

In [None]:
# simulate 10 throws of a fair die - a discrete, uniform distribution
dice_throws = np.random.randint(low=1, high=7, size=10)  # high is exclusive, so 7 gives values 1 to 6
dice_throws
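Newer numpy code often uses a `Generator` object created with `np.random.default_rng`, which can be seeded so that results are reproducible. This is a sketch of that alternative, not a change to the cells above; the seed value 42 and the variable names are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # the same seed gives the same sequence

normal_sample = rng.standard_normal(5)       # standard normal distribution
dice = rng.integers(low=1, high=7, size=10)  # high is exclusive: values 1 to 6

print(normal_sample)
print(dice)
```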