# Packages and Modules

A Python **module** is a single `.py` file containing Python code that can be used elsewhere.

A Python **package** is a collection of related modules.

Packages and modules are where Python shows its true potential: many publicly available, well-designed packages provide immensely useful functionality.

## Built-in Packages and Modules

Python ships with a number of built-in packages and modules (the standard library) that are available without any extra installation. We will look at a few here.

Packages and modules can be loaded using the `import` keyword.
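
A few common forms of the `import` statement are sketched below, using the built-in `math` module. The alias `m` is an arbitrary name chosen only for illustration.

```python
import math                 # import the whole module; access names as math.<name>
import math as m            # the same module under a shorter alias
from math import sqrt, pi   # import specific names directly

print(math.sqrt(16))   # 4.0
print(m.sqrt(16))      # 4.0
print(sqrt(16))        # 4.0
print(pi == math.pi)   # True
```

All three forms refer to the same underlying module, so which one to use is mostly a matter of readability.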

### `math`

`math` offers a number of mathematical operations.

In [None]:
import math

Use any method within a package via the syntax `<package-name>.<method>`. All available methods in the `math` package are listed [at this link](https://docs.python.org/3/library/math.html).

Method examples

In [None]:
a = math.floor(1.15) # rounds down number to closest whole value
print(a)
b = math.floor(3.99)
print(b)

In [None]:
a = math.sqrt(4) # square root of a number
print(a)
b = math.sqrt(18.76)
print(b)

In [None]:
a = math.log(10) # natural logarithm (base e) of a number
print(a)
b = math.log(16.5)
print(b)
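
With a single argument, `math.log` returns the natural (base e) logarithm. As a brief sketch, other bases can be requested through an optional second argument or through the dedicated `math.log2` and `math.log10` functions. The variable names are illustrative only.

```python
import math

natural = math.log(math.e)   # natural logarithm, base e
base2 = math.log2(8)         # base-2 logarithm
base10 = math.log10(1000)    # base-10 logarithm
custom = math.log(81, 3)     # logarithm of 81 in base 3 (may carry floating-point rounding)

print(natural, base2, base10, custom)
```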

In [None]:
a = math.pi # pi
print(a)
b = math.e # Euler's number
print(b)

### `random`

`random` offers various functions for generating random numbers.

In [None]:
import random

Method examples

In [None]:
a = random.randint(0, 10) # generate a random integer between 0 and 10 (both endpoints included)
print(a)

In [None]:
_list = ['apple', 'orange', 'banana']
a = random.choice(_list) # randomly pick one element from a sequence
print(a)

In [None]:
a = random.random() # generate a random float in the interval [0, 1)
print(a)
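
The values above change on every run. When reproducible results are needed, for example while debugging, the generator can be seeded with `random.seed`. The sketch below uses an arbitrary seed value of 42.

```python
import random

random.seed(42)  # fix the generator state (42 is an arbitrary choice)
first = [random.randint(0, 10) for _ in range(5)]

random.seed(42)  # re-seeding with the same value replays the same sequence
second = [random.randint(0, 10) for _ in range(5)]

print(first)
print(second)
print(first == second)  # True
```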

A number of other built-in packages and modules can be found at [this link](https://docs.python.org/3/library/index.html). Personally, I do **not** recommend trying to learn them all up front. Rather, as you continue to code in Python, you will naturally run into situations where you want to do something specific, and an internet search or a question to an LLM will introduce you to new packages and modules.

## 3rd Party Packages and Modules

A vast number of useful, well-maintained packages and modules have been developed by independent groups. Once these packages are installed into a Python environment (using Anaconda is recommended), they can be loaded into a Python script via an `import` statement.

The environment [here](../environment.yml) that we already installed includes a number of frequently used data science packages. We will look at a few here. A specific tutorial on the `pandas` package can be found in the [pandas-tutorial](../../pandas-tutorial/README.md).

### `numpy`

`numpy` offers an `ndarray` data structure and a substantial number of mathematical and statistical operations. [Documentation can be found here](https://numpy.org/doc/stable/).

Package imports can be aliased using the `as` keyword. For example, `numpy` is almost always aliased as `np`.

In [None]:
import numpy as np

In [None]:
a = np.array([[1, 2, 3, 10],
              [4, 5, 6, 11],
              [7, 8, 9, 12]])

print(type(a)) # numpy.ndarray
print(a)

In [None]:
print(a.shape) # 2 dimensional array with 3 rows by 4 columns
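
Besides `shape`, an `ndarray` carries a few other descriptive attributes. A short sketch, re-creating the same 3 by 4 array so the block runs on its own:

```python
import numpy as np

a = np.array([[1, 2, 3, 10],
              [4, 5, 6, 11],
              [7, 8, 9, 12]])

print(a.ndim)   # number of dimensions: 2
print(a.size)   # total number of elements: 12
print(a.dtype)  # element type; an integer type whose exact name depends on the platform
```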

#### NumPy array elements can be accessed like a Python list along each axis of the array

In [None]:
firstrow = a[0]
print(firstrow)

In [None]:
subarray = a[1:, 0:2] # rows from index 1 onward (second and third rows), first and second columns
print(subarray)
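
One difference from nested Python lists is worth noting: a NumPy array accepts one index per axis inside a single pair of brackets, which makes selecting a column straightforward. A minimal sketch using the same 3 by 4 array (the variable names are illustrative):

```python
import numpy as np

a = np.array([[1, 2, 3, 10],
              [4, 5, 6, 11],
              [7, 8, 9, 12]])

element = a[1, 2]       # row index 1, column index 2
same_element = a[1][2]  # list-style chained indexing gives the same value
last_column = a[:, -1]  # every row, last column

print(element, same_element)  # 6 6
print(last_column.tolist())   # [10, 11, 12]
```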

#### Arrays can be created in a variety of ways. Here are a few

In [None]:
a = np.zeros((2, 3)) # array of all zeros
print(a)

In [None]:
a = np.ones((2, 3)) # array of all ones
print(a)

In [None]:
a = np.arange(4) # integers 0 through 3
print(a)

In [None]:
a = np.linspace(0, 10, num=5) # 5 points evenly spaced between 0 and 10
print(a)

In [None]:
a = np.array([1, 2, 3, 4]) # create an array from a list
print(a)

In [None]:
a = np.random.normal(0, 1, (2, 3)) # Array of standard normal random variable samples
print(a)
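
For reproducible random arrays, NumPy also provides a seedable `Generator` object through `np.random.default_rng`. The sketch below uses an arbitrary seed of 0, and the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)              # seeded generator (0 is an arbitrary choice)
normal_samples = rng.normal(0, 1, (2, 3))   # standard normal samples, shape (2, 3)
integer_samples = rng.integers(-5, 5, (2, 2))  # integers from -5 up to, but not including, 5

print(normal_samples)
print(integer_samples)
```

Creating a new generator with the same seed reproduces the same samples, which helps when comparing results between runs.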

#### Arrays can easily be filtered

In [None]:
_filter = a < 0
subarray = a[_filter] # get all negative elements
print(subarray)

In [None]:
a[_filter] = 0 # replace all negative elements with 0
print(a)

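The `&` (and) and `|` (or) operators can be used to combine filters. Wrap each condition in parentheses, since these operators take precedence over comparisons such as `<` and `>`.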

In [None]:
a = np.random.normal(0, 1, (2, 3)) # Array of standard normal random variable samples
print(a)

In [None]:
_filter = (a < 0) | (a > 1)
subarray = a[_filter] # get all negative elements or elements greater than 1
print(subarray)

In [None]:
a[_filter] = np.floor(2*np.abs(a[_filter])) # replace all elements found by the filter with the floor of 2 times the absolute value of each element
print(a)
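
Because the arrays above are random, the filtered output differs between runs. The same idea can be sketched on a fixed array so the result is predictable. The array `values` and the filter names are chosen only for illustration.

```python
import numpy as np

values = np.array([-2.0, -0.5, 0.3, 0.9, 1.5, 3.0])

outside = (values < 0) | (values > 1)   # True where an element is negative OR greater than 1
inside = (values >= 0) & (values <= 1)  # True where an element lies between 0 and 1 inclusive

print(outside.tolist())          # [True, True, False, False, True, True]
print(values[outside].tolist())  # [-2.0, -0.5, 1.5, 3.0]
print(values[inside].tolist())   # [0.3, 0.9]
```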

#### Mathematical operations on arrays

`+`, `-`, `/`, and `*` perform standard addition, subtraction, division, and multiplication **elementwise** between two arrays of the same shape.

In [None]:
a = np.arange(1, 5) # numbers 1 through 4
a = np.reshape(a, (2, 2)) # reshape to 2x2 matrix
print(a)

In [None]:
b = np.random.randint(-5, 5, (2, 2)) # 2x2 array of random integers from -5 up to (but not including) 5
print(b)

In [None]:
addition = a+b
print(addition)

In [None]:
subtraction = a-b
print(subtraction)

In [None]:
multiplication = a*b
print(multiplication)

In [None]:
division = b/a
print(division)

In [None]:
# Mathematical operations will apply a scalar to each element of an array

print(a, '\n')
print(6+a, '\n')
print(6.53-a, '\n')
print(a/4, '\n')
print(4/a, '\n')
print(8*a, '\n')
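
Note that `*` multiplies matching elements and is not the matrix product from linear algebra. As a brief sketch, the `@` operator computes the matrix product instead. The fixed arrays here are chosen only for illustration.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[0, 1],
              [1, 0]])

elementwise = a * b      # multiplies matching elements
matrix_product = a @ b   # combines rows of a with columns of b

print(elementwise.tolist())     # [[0, 2], [3, 0]]
print(matrix_product.tolist())  # [[2, 1], [4, 3]]
```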

In [None]:
b_max = np.max(b) # get max element of an array
b_min = np.min(b) # get min element of an array
print(b_min, b_max)

In [None]:
b_sum = np.sum(b) # add all elements of an array
print(b_sum)
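
These reductions also accept an `axis` argument to work along rows or columns instead of over the whole array. A minimal sketch on a fixed 2 by 3 array (the array `c` and the result names are illustrative):

```python
import numpy as np

c = np.array([[1, 2, 3],
              [4, 5, 6]])

total = np.sum(c)                # sum over every element
column_sums = np.sum(c, axis=0)  # collapse the rows: one sum per column
row_maxima = np.max(c, axis=1)   # collapse the columns: one max per row

print(total)                 # 21
print(column_sums.tolist())  # [5, 7, 9]
print(row_maxima.tolist())   # [3, 6]
```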

The `numpy` beginners guide can be found [here](https://numpy.org/doc/stable/user/absolute_beginners.html).

As above, I personally do **not** recommend trying to learn everything `numpy` has to offer up front, as it is a very large package. Rather, start with the beginners guide. As you continue to code in Python, you will naturally run into situations where you want to do something specific, and an internet search or a question to an LLM will introduce you to new `numpy` functionality.

### `matplotlib`

`matplotlib` offers a number of plotting and visualization methods. [Documentation can be found here](https://matplotlib.org/stable/index.html)

In [None]:
import matplotlib.pyplot as plt # import pyplot for plotting
import numpy as np

#### Example plots

In [None]:
x = [0, 1, 2, 3, 4, 5]
y = [0, -1, 2, -3, 4, -5]

plt.plot(x, y) # plot (x, y) pairs
plt.show() # show plot

In [None]:
x = np.linspace(-2*np.pi, 2*np.pi, 100)
y_cos = np.cos(x)
y_sin = np.sin(x)

plt.plot(x, y_cos) # plot (x, y_cos) pairs
plt.plot(x, y_sin) # plot (x, y_sin) pairs
plt.xlabel('x axis')
plt.ylabel('Trigonometric Function Value')
plt.title('Example plot of two sets of data')
plt.show() # show plot

In [None]:
x = np.arange(10)
exp_growth = np.exp(x)
exp_decay = np.exp(-x+len(x)-1)

plt.plot(x, exp_growth, linestyle='--', label='EXP growth')
plt.plot(x, exp_decay, marker='o', label='EXP decay')
plt.legend()
plt.show() # show plot
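
The examples above call the `pyplot` functions directly. Matplotlib also offers an explicit interface built on `Figure` and `Axes` objects, which tends to be easier to manage once a figure has several panels. A minimal sketch, with panel names `ax_cos` and `ax_sin` chosen only for illustration:

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-2*np.pi, 2*np.pi, 100)

# One row with two side-by-side panels; each panel is its own Axes object
fig, (ax_cos, ax_sin) = plt.subplots(1, 2, figsize=(8, 3))

ax_cos.plot(x, np.cos(x))
ax_cos.set_title('cos(x)')
ax_cos.set_xlabel('x axis')

ax_sin.plot(x, np.sin(x))
ax_sin.set_title('sin(x)')
ax_sin.set_xlabel('x axis')

fig.tight_layout()  # adjust spacing so labels and titles do not overlap
plt.show() # show plot
```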

The quick start guide can be found [here](https://matplotlib.org/stable/users/explain/quick_start.html).

Tutorials can be found [here](https://matplotlib.org/stable/tutorials/index.html).

Again, I personally do **not** recommend trying to learn everything `matplotlib` has to offer up front, as it is a very large package. Rather, start with the quick start guide. As you continue to code in Python, you will naturally run into situations where you want to do something specific, and an internet search or a question to an LLM will introduce you to new `matplotlib` functionality.

---

# Exercises

1. Import the `matplotlib` and `numpy` packages using aliased names.

2. Create a numpy ndarray of 25 evenly spaced points over the range [-10, 10].

3. Generate a numpy array of uniformly distributed randomly sampled points over the range [0, 3] with the same shape as the data in Exercise 2.

4. Display a line plot of the data from Exercises 2 and 3 showing the point markers. Add a title and axis labels to the plot.

5. Display a scatter plot of the data from Exercises 2 and 3. Add a title and axis labels to the plot.

6. Generate a numpy array of 100000 randomly sampled points from the standard normal distribution.

7. Plot a histogram of the data from Exercise 6 to visualize the probability density function for the data.

8. Repeat Exercise 7, but now use 100 bins in your histogram and set the histogram *density* keyword argument to True. Afterwards, additionally display the following data as a line graph on the same plot:

In [None]:
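x = np.linspace(-4, 4, 1000) # 1000 evenly spaced points between -4 and 4
y = np.exp(-x**2/2)/np.sqrt(2*np.pi) # standard normal probability density function evaluated at x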