# Modules and NumPy

## Modules

In [None]:
# A Python module is just a fancy name for a Python file containing code that you can use in a different Python program.
# This system lets you keep your code 'modular', meaning it is separated into different sections,
# each designed to serve a specific purpose.
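
As a minimal sketch of using a module, the example below imports `math`, a module that ships with Python's standard library, and uses two of the names it defines. A `.py` file of your own can be imported the same way (by its filename without the extension), assuming it sits in the same folder as the program that imports it.

```python
# math is a module from Python's standard library
import math

# use a function defined inside the module with the module.name syntax
root = math.sqrt(16)
print(root)

# a module can also hold constants
print(math.pi)
```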

## Multi-Dimensional Lists (Matrices)

In [33]:
# I've mentioned before how an element in a list is simply a variable, and you can do to it anything
# you can do with a variable.
# Just like a variable can contain a list, a list can also contain other lists, like this:
matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
]
print(matrix)

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]


In [28]:
# One thing you have to fully understand to work well with matrices is how to get an element inside.
# To get the element in the first 'row' and the second 'column', you do:
x = matrix[0][1]
print(x)

2


In [29]:
# It helps to break it down into steps:

# This gives the first row:
row = matrix[0]
print(row)

[1, 2, 3]


In [30]:
# and then, to get the second element in this row, we already know what to do:
x = row[1]
print(x)

2


In [31]:
# so combining those steps into one line gives us:
x = matrix[0][1]
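
The same row-then-element idea works for loops. The sketch below redefines the matrix from above so it can run on its own; `second_column` is just an illustrative name.

```python
matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
]

# visit every element: the outer loop picks a row,
# the inner loop picks each element in that row
for row in matrix:
    for value in row:
        print(value)

# collect the second column by taking element 1 from each row
second_column = []
for row in matrix:
    second_column.append(row[1])
print(second_column)
```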

## NumPy

NumPy (short for Numerical Python) is a very widely used Python library that makes it convenient and efficient to work with large or complex numerical datasets.

The library is designed around enhancing Python's existing lists: instead of basic Python lists, NumPy uses its own type of list called an 'ndarray' (usually simply referred to as a NumPy array), which can do practically everything we've seen the basic Python list do, and more.

### Importing NumPy

In [1]:
import numpy as np

### Creating a NumPy array from an existing list or matrix 

In [30]:
# simply call the np.array function, and pass in your existing list as an argument
a1 = [1,2,3,4]
a = np.array(a1)
print(a)

b1 = [5,6,7,8]
b = np.array(b1)
print(b)

[1 2 3 4]
[5 6 7 8]


In [32]:
# numpy also handles matrices
c1 = [
    [4, 5, 6, 7],
    [8, 9, 10, 11]
]
c = np.array(c1)
print(c)

[[ 4  5  6  7]
 [ 8  9 10 11]]


### Creating a 'default' NumPy array

In [22]:
# array of all zeros
zeros = np.zeros(5)
print(zeros)

[0. 0. 0. 0. 0.]


In [23]:
# array filled with a constant: np.full(number of elements, value)
constants = np.full(5, 7)
print(constants)

[7 7 7 7 7]


In [24]:
# range from starting number (inclusive) to ending number (exclusive), with given step
nums = np.arange(1,15,2)
print(nums)

[ 1  3  5  7  9 11 13]


In [25]:
# another range function, similar to above, but instead of a step you give it the number of elements to create.
# Unlike arange, the ending number is included in the result.
# NOTE: keep this one in mind, it's very handy for creating evenly spaced graph labels
spaced = np.linspace(1,15,5)
print(spaced)

[ 1.   4.5  8.  11.5 15. ]


In [27]:
# generate random numbers in the range [0,1)
randoms = np.random.random(5)
print(randoms)

[0.57599596 0.33467527 0.11094669 0.86316818 0.29447629]
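
The creation functions above were all given a single number, which produces a 1D array. As a small sketch, passing a `(rows, columns)` tuple instead produces a 2D array; the variable names here are only illustrative.

```python
import numpy as np

# pass a (rows, columns) tuple instead of a single number to get a 2D array
zeros_2d = np.zeros((2, 3))
print(zeros_2d)

# np.full accepts a shape tuple in the same way
sevens_2d = np.full((2, 3), 7)
print(sevens_2d)
```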


## Useful properties of a NumPy array

In [34]:
# get the 'shape' of an array: the number of elements along each dimension.
# A 1D array has a single entry; a 2D array gives (rows, columns). We will only deal with arrays of up to 2 dimensions.
print(a.shape)
print(c.shape)

(4,)
(2, 4)


In [35]:
# len() gives the size of the first dimension:
# the number of rows for a 2D array, or the number of elements for a 1D array
print(len(a))
print(len(c))

4
2


In [36]:
# size property gives the total number of elements
print(a.size)
print(c.size)

4
8
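
Two more properties are often useful alongside these. The sketch below recreates `a` and `c` so it runs on its own; the exact integer type printed for `c` can differ between platforms, so no output is shown for it.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
c = np.array([[4, 5, 6, 7], [8, 9, 10, 11]])

# ndim is the number of dimensions (the length of the shape tuple)
print(a.ndim)
print(c.ndim)

# dtype is the type shared by every element in the array
print(c.dtype)
print(np.zeros(5).dtype)
```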


## Operations with NumPy arrays

In [37]:
# being a library designed for numerical processing,
# of course numpy makes it easy to do mathematical operations
# on entire arrays. For example:

# multiply all values in an array by a number:
print(a * 2)

[2 4 6 8]


In [38]:
# add one array to another
# NOTE: be careful, if your arrays don't have the same shape,
# numpy will try to stretch one of them to match the other (this is called 'broadcasting').
# The result may or may not be what you intended, so make sure you double check it.
print(a + b)

[ 6  8 10 12]


In [40]:
# here, a and c are different shapes, but since a has 4 elements and c has 4 columns,
# numpy assumes that you want to add a to each row of c.
print(a + c)

[[ 5  7  9 11]
 [ 9 11 13 15]]

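
As a sketch of when adding arrays of different shapes works and when it fails, the example below adds a 4-element array and then a 3-element array to a matrix with 4 columns. The names `row_of_four` and `row_of_three` are illustrative.

```python
import numpy as np

c = np.array([[4, 5, 6, 7], [8, 9, 10, 11]])

# 4 elements line up with the 4 columns of c, so the array is added to each row
row_of_four = np.array([1, 2, 3, 4])
print(row_of_four + c)

# 3 elements do not line up with 4 columns, so numpy raises a ValueError
row_of_three = np.array([1, 2, 3])
try:
    print(row_of_three + c)
except ValueError as error:
    print("ValueError:", error)
```

When you see this error, comparing the `shape` of both arrays is usually the quickest way to find the mismatch.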

In [44]:
# numpy also has a few convenient functions used a lot in data processing
print(np.sqrt(a))
print(np.log(a))
print(np.sin(a))
# but we won't really use many of these for our purposes

[1.         1.41421356 1.73205081 2.        ]
[0.         0.69314718 1.09861229 1.38629436]
[ 0.84147098  0.90929743  0.14112001 -0.7568025 ]


## Slicing and filtering

In [47]:
# Remember how we could slice lists? numpy arrays can do the same and MUCH more:

# get the first row of a 2D matrix:
print(c[0])

[4 5 6 7]


In [48]:
# get the first two columns in all rows:
print(c[:,:2])

[[4 5]
 [8 9]]
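
A few more slicing patterns, sketched on the same matrix (recreated here so the example runs on its own). The comma form `c[row, column]` reaches the same element as the two-step `c[row][column]` used for plain lists.

```python
import numpy as np

c = np.array([[4, 5, 6, 7], [8, 9, 10, 11]])

# a single element: row 0, column 1
element = c[0, 1]
print(element)

# one whole column comes back as a 1D array
second_column = c[:, 1]
print(second_column)

# negative indexes count from the end, so this is the last row
last_row = c[-1]
print(last_row)
```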


In [49]:
# How about filtering? Easy:
# get all elements in c that are even:
print(c[c % 2 == 0])

[ 4  6  8 10]


In [50]:
# get all elements less than or equal to 7:
print(c[c <= 7])

[4 5 6 7]


In [51]:
# if you just use the expressions inside the brackets by themselves,
# it returns a numpy array of true/false values,
# showing exactly which elements pass or fail the condition
print(c % 2 == 0)

[[ True False  True False]
 [ True False  True False]]
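
Conditions can also be combined. The sketch below uses numpy's element-wise operators `&` (and) and `~` (not); Python's `and` and `not` keywords do not work on whole arrays, and each condition needs its own parentheses.

```python
import numpy as np

c = np.array([[4, 5, 6, 7], [8, 9, 10, 11]])

# keep elements that are even AND greater than 5
even_and_large = c[(c % 2 == 0) & (c > 5)]
print(even_and_large)

# ~ flips every True/False value, so this keeps the elements that are NOT even
odd = c[~(c % 2 == 0)]
print(odd)
```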


## Aggregate functions

In [58]:
# Another common data processing task is aggregate functions,
# i.e. functions that take a bunch of data and process it down into just one number
# for example: sum, mean (average), max, min, etc.

# numpy has built in functions for the most commonly used ones:
print(c.sum())
print(c.mean())
print(c.max())
print(c.min())

60
7.5
11
4
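
These functions can also aggregate per column or per row rather than over the whole array, using the `axis` argument. A short sketch on the same matrix:

```python
import numpy as np

c = np.array([[4, 5, 6, 7], [8, 9, 10, 11]])

# axis=0 collapses the rows, giving one result per column
column_sums = c.sum(axis=0)
print(column_sums)

# axis=1 collapses the columns, giving one result per row
row_means = c.mean(axis=1)
print(row_means)
```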
