Before you turn this problem in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel$\rightarrow$Restart) and then **run all cells** (in the menubar, select Cell$\rightarrow$Run All).

Make sure you fill in any place that says `YOUR CODE HERE` or "YOUR ANSWER HERE", as well as your name and collaborators below:

In [None]:
NAME = ""
COLLABORATORS = ""

---

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#What-is-a-Python-Module?" data-toc-modified-id="What-is-a-Python-Module?-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>What is a Python Module?</a></span></li><li><span><a href="#How-to-import-existing-modules" data-toc-modified-id="How-to-import-existing-modules-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>How to import existing modules</a></span><ul class="toc-item"><li><span><a href="#from-...-import-..." data-toc-modified-id="from-...-import-...-2.1"><span class="toc-item-num">2.1&nbsp;&nbsp;</span><code>from</code> ... <code>import</code> ...</a></span></li><li><span><a href="#from-...-import-*" data-toc-modified-id="from-...-import-*-2.2"><span class="toc-item-num">2.2&nbsp;&nbsp;</span><code>from</code> ... <code>import *</code></a></span></li><li><span><a href="#Example-with-some-Python-built-in-modules" data-toc-modified-id="Example-with-some-Python-built-in-modules-2.3"><span class="toc-item-num">2.3&nbsp;&nbsp;</span>Example with some Python built-in modules</a></span></li><li><span><a href="#Making-your-own-Modules" data-toc-modified-id="Making-your-own-Modules-2.4"><span class="toc-item-num">2.4&nbsp;&nbsp;</span>Making your own Modules</a></span></li></ul></li><li><span><a href="#Packages-in-Python" data-toc-modified-id="Packages-in-Python-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Packages in Python</a></span><ul class="toc-item"><li><span><a href="#Create-Your-Own-Package" data-toc-modified-id="Create-Your-Own-Package-3.1"><span class="toc-item-num">3.1&nbsp;&nbsp;</span>Create Your Own Package</a></span></li></ul></li><li><span><a href="#Exercise" data-toc-modified-id="Exercise-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Exercise</a></span></li><li><span><a href="#Popular-Python-Packages" data-toc-modified-id="Popular-Python-Packages-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Popular Python Packages</a></span><ul class="toc-item"><li><span><a href="#Numpy" 
data-toc-modified-id="Numpy-5.1"><span class="toc-item-num">5.1&nbsp;&nbsp;</span>Numpy</a></span></li><li><span><a href="#Pandas" data-toc-modified-id="Pandas-5.2"><span class="toc-item-num">5.2&nbsp;&nbsp;</span>Pandas</a></span></li><li><span><a href="#Matplotlib" data-toc-modified-id="Matplotlib-5.3"><span class="toc-item-num">5.3&nbsp;&nbsp;</span>Matplotlib</a></span></li><li><span><a href="#Scikit-Learn-(Advanced)" data-toc-modified-id="Scikit-Learn-(Advanced)-5.4"><span class="toc-item-num">5.4&nbsp;&nbsp;</span>Scikit-Learn (Advanced)</a></span></li><li><span><a href="#Keras-(Advanced)" data-toc-modified-id="Keras-(Advanced)-5.5"><span class="toc-item-num">5.5&nbsp;&nbsp;</span>Keras (Advanced)</a></span></li></ul></li></ul></div>

# Python Modules

**After this class, you will know:**
 
* 1) What is a Python module?

* 2) How to import existing modules?

* 3) How to create your own module?

* 4) How to create your own package?

* 5) What are some of the most popular Python packages?

## What is a Python Module?

* You have seen how you can **reuse** code in your program by defining functions. What if you want to reuse a number of functions in **other** programs that you write? The answer is a **module**.

* A module is basically a file containing all your functions and variables that you have defined. To reuse the module in other programs, the filename of the module **must** have a `.py` extension.

**Example**

In [None]:
import math

In [None]:
math.exp(2)

In [None]:
# The “sys” module in Python provides functions and variables that
# interact with the Python runtime environment.
import sys

In [None]:
# print the current version of Python
sys.version

In [None]:
sys.path

In [None]:
a = 1
print('The size of integer a (in bytes) is: ', sys.getsizeof(a))

In [None]:
?sys.getsizeof

* Let us look more closely at what happened in the statements above.
    * First, we **import** the `sys` module using the *import* statement. This tells Python that we want to use the module. The `sys` module contains functionality related to the Python interpreter and its environment.
    * When Python executes an `import` statement, it **looks for** the module, first among the built-in modules (`sys` is one of them) and then, for a module stored in a file such as `mymodule.py`, in the directories listed in the `sys.path` variable. If the module is found, the statements in the main block of that module run and the module is then made available for you to use. Note that this initialization is done only **the first time** we import a module.
    * The `sys.path` variable contains the current directory, the directories listed in the `PYTHONPATH` environment variable, and the installation-dependent default directories. If you want to build **your own** module, you have to put it in one of the directories listed in `sys.path`.
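
The search path and the one-time initialization described above can be checked directly. The following is a minimal sketch using only the standard library; `math` stands in for any module, and the alias `math_again` is just an illustrative name.

```python
import sys
import math

# sys.path is an ordinary list of directory names, searched in order
print(type(sys.path).__name__)
for entry in sys.path[:3]:
    print(repr(entry))

# Modules that were already imported are cached in sys.modules,
# so a second import reuses the same module object instead of re-running it
import math as math_again
print("math" in sys.modules)
print(math_again is math)
```

Because the second import returns the cached object, editing a module file after it has been imported has no effect until the module is reloaded or the kernel is restarted.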

## How to import existing modules

### `from` ... `import` ...

* Python's **from** statement allows you to import specific attributes and functions **directly** from a module into the current namespace. The **from...import** statement has the following syntax:

```py
from modulename import name1, name2, ... nameN
```

* For example, earlier we ran `import sys` and then accessed the module attribute as `sys.version` and the module function as `sys.getsizeof`. In the example below, we import the attribute and the function directly with the `from...import...` statement, so they can be used without the `sys.` prefix.

**Example**

In [None]:
from sys import version, getsizeof

In [None]:
version

In [None]:
print('Python integer size is: ', getsizeof(1))

### `from` ... `import *`

* It is also possible to import **all** attributes and functions from a module into the current namespace by using the `from ... import *` statement.

**Example**

In [None]:
from sys import *

In [None]:
path

* Since we imported everything from the `sys` module, we can refer to its `path` attribute directly.

**Common Practice**

* Even though the `from...import` statements save us from typing the module name each time we use an attribute or function, they are **not** recommended in general, because imported names can silently overwrite existing names in the current namespace. The plain `import module` form is preferred, since it makes your program much more **readable**.
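
One practical reason for this advice is name clashes. The sketch below uses two standard-library modules, `math` and `cmath`, which both define `sqrt`; importing the name from both leaves only the last one visible, while the qualified form stays unambiguous.

```python
import cmath
import math

from math import sqrt
from cmath import sqrt  # silently replaces the sqrt imported from math

# The bare name now refers to the complex-number version
print(sqrt is cmath.sqrt)

# Qualified names always say which module they come from
print(math.sqrt(4))
print(cmath.sqrt(-1))
```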

### Example with some Python built-in modules

* https://docs.python.org/


In [None]:
# The “os” module in Python is used to interact with 
# the operating system and offers OS-level functionalities.
import os

directory = os.getcwd()
print(directory)

In [None]:
# The “sys” module in Python provides functions and variables
# that interact with the Python runtime environment.
import sys

print("Python version:", sys.version)
print("Command line arguments:", sys.argv)

In [None]:
# The calendar module allows operations and manipulations related to calendars.
import calendar
# Text calendar for October of an example year
cal_october = calendar.month(2023, 10)
print(cal_october)

In [None]:
# The "datetime" module allows for manipulation and reading of date and time values.
import datetime
date_today = datetime.date.today()
time_now = datetime.datetime.now().time()
print(date_today)
print(time_now)

In [None]:
# The math module offers mathematical functions used for advanced arithmetic operations.
import math
sqrt_val = math.sqrt(64)
pi_const = math.pi
print(sqrt_val)
print(pi_const)

In [None]:
# The "statistics" module provides functions for calculating mathematical 
# statistics of numeric (Real-valued) data.
import statistics

data = [20.7, -2.3, 19.2, 18.3, 0, 14.4]
print(sorted(data))
print("Median is: ", statistics.median(data))
print("Average is: ", statistics.mean(data))
# With no repeated values, Python 3.8+ returns the first value as the mode
print("Mode is: ", statistics.mode(data))
print("Sample standard deviation is: ", statistics.stdev(data))

### Making your own Modules

* Creating your own module is simple. You just need to write a `.py` file containing the functions/attributes you want to import, and place the file in one of the directories listed in `sys.path`, for example the directory where this notebook is saved.

  * Go to your JupyterHub Home Page, select the "Files" tab, then use New File to open a text editor.
  
  * Type the following code:
  
  ```python
  def MyModule():
      print("This is my first module.")
  ```
  
  * Save the file with the name **mymodule1.py**.

In [None]:
import mymodule1 

In [None]:
dir(mymodule1)

In [None]:
mymodule1.MyModule()
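
The same steps can also be carried out from code, which makes the role of `sys.path` explicit. This is only a sketch under stated assumptions: the module name `demo_module_sketch` and its function `greet` are hypothetical, and the file is written to a temporary directory rather than next to the notebook.

```python
import importlib
import sys
import tempfile
from pathlib import Path

module_name = "demo_module_sketch"  # hypothetical name, not part of the course files
source = 'def greet():\n    return "This is my first module."\n'

with tempfile.TemporaryDirectory() as tmp_dir:
    # Step 1: write the .py file
    Path(tmp_dir, module_name + ".py").write_text(source)
    # Step 2: make its directory visible to the import system
    sys.path.insert(0, tmp_dir)
    try:
        importlib.invalidate_caches()
        # Step 3: import the module and call its function
        demo = importlib.import_module(module_name)
        message = demo.greet()
    finally:
        # Undo the changes so the sketch leaves no trace
        sys.path.remove(tmp_dir)
        sys.modules.pop(module_name, None)

print(message)
```

When the file sits in the notebook's own directory, the `sys.path` step is not needed, because that directory is already on the search path.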

## Packages in Python

* A *function* helps you reuse blocks of code, and a *module* helps you reuse functions in other programs. In this section, we introduce how to organize modules into a **package**.

* A package is a hierarchical directory structure that groups related modules together; it can contain modules, sub-packages, sub-sub-packages, and so on.

* There are many famous and widely used Python packages; [here](https://pythontips.com/2013/07/30/20-python-libraries-you-cant-live-without/) is a list of a few of them. You will almost certainly use some of them in the future.

* We can also build our own Python package. The following example demonstrates how to do it. 

### Create Your Own Package

1. We want to create a package called "GPA_Booster", which includes a set of modules that help students improve their GPA in various subjects, e.g. Calculus, Linear Algebra, Data Structure, etc.

2. Create a folder named "GPA_Booster" and put it in one of the directories in `sys.path` so that Python can find your package when you use the `import` statement.

3. Go into the "GPA_Booster" folder and create three `.py` files: `Calculus.py`, `LinearAlgebra.py`, and `DataStructure.py`.

4. In `Calculus.py`, define a function like the one below:
```py
def Calculus_func():
    print("This function will help you improve Calculus GPA.")
```

5. Define `LinearAlgebra_func` in `GPA_Booster/LinearAlgebra.py` and `DataStructure_func` in `GPA_Booster/DataStructure.py` in the same way.

6. Now, create one more file named `__init__.py` in the "GPA_Booster" folder, i.e. `GPA_Booster/__init__.py`.

7. To make all of your functions available when you import the GPA_Booster package, put the statements below in `__init__.py`.
```py
from .Calculus import Calculus_func
from .LinearAlgebra import LinearAlgebra_func
from .DataStructure import DataStructure_func
__all__ = ['Calculus_func', 'LinearAlgebra_func', 'DataStructure_func']
```

8. After you add these lines to `__init__.py`, all of these functions are available when you import the GPA_Booster package.

* If the `__init__.py` file defines a list called `__all__`, then only the names listed there are imported by `from GPA_Booster import *`. In the above example, we list all three functions, one from each of the three modules.

In [None]:
# import the package
import GPA_Booster

In [None]:
# call the function
GPA_Booster.Calculus_func()
GPA_Booster.LinearAlgebra_func()
GPA_Booster.DataStructure_func()

Change `__all__` in `__init__.py` to the following, restart the kernel, and try again with `from GPA_Booster import *`. 

```python
from .Calculus import Calculus_func
from .LinearAlgebra import LinearAlgebra_func
from .DataStructure import DataStructure_func
__all__ = ['Calculus_func', 'LinearAlgebra_func']
```

In [1]:
from GPA_Booster import *

In [2]:
Calculus_func()

This function will help you improve Calculus GPA.


In [3]:
LinearAlgebra_func()

This function will help you improve Linear Algebra GPA.


In [4]:
DataStructure_func()

NameError: name 'DataStructure_func' is not defined
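
The effect of `__all__` can be reproduced in a self-contained way. The sketch below builds a throwaway package in a temporary directory; the package name `demo_pkg_sketch` and its functions are hypothetical stand-ins for GPA_Booster. It shows that `__all__` limits what `from ... import *` brings in, while names left out of `__all__` remain reachable through the package name.

```python
import importlib
import sys
import tempfile
from pathlib import Path

pkg_name = "demo_pkg_sketch"  # hypothetical package used only for this sketch

with tempfile.TemporaryDirectory() as tmp_dir:
    pkg_dir = Path(tmp_dir, pkg_name)
    pkg_dir.mkdir()
    (pkg_dir / "first.py").write_text("def first_func():\n    return 'first'\n")
    (pkg_dir / "second.py").write_text("def second_func():\n    return 'second'\n")
    (pkg_dir / "__init__.py").write_text(
        "from .first import first_func\n"
        "from .second import second_func\n"
        "__all__ = ['first_func']\n"
    )
    sys.path.insert(0, tmp_dir)
    try:
        importlib.invalidate_caches()
        # Run the star import in a separate namespace so we can inspect it
        namespace = {}
        exec(f"from {pkg_name} import *", namespace)
        # Qualified access still reaches names that __all__ leaves out
        pkg = importlib.import_module(pkg_name)
        qualified_result = pkg.second_func()
    finally:
        sys.path.remove(tmp_dir)
        for name in [n for n in sys.modules if n.startswith(pkg_name)]:
            del sys.modules[name]

star_names = sorted(n for n in namespace if not n.startswith("__"))
print(star_names)
print(qualified_result)
```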

## Exercise


**Question**. Create a module named `mygrades` that contains the following functions:

```py
    increase_quiz(grades)
    increase_assignment(grades)
    increase_project(grades)
    increase_exam(grades)
    increase_all(grades)
```

`grades` is a list containing `[quiz, assignment, project, exam]`. Calling one of the first four functions once increases the respective grade by 10, and `increase_all(grades)` increases each grade by 10. The maximum grade for each category is 100. 

Moreover, the module defines a list `default_grades = [25, 25, 25, 25]`.

**Example**
```py
    increase_quiz([20, 40, 40, 20]) -> [30, 40, 40, 20]
    increase_quiz([100, 40, 40, 20]) -> [100, 40, 40, 20]
    increase_project([100, 40, 40, 20]) -> [100, 40, 50, 20]
    increase_all([100, 40, 40, 20]) -> [100, 50, 50, 30]
```

The following program should run without error:
```py
    import mygrades as mg
    mg.increase_quiz(mg.default_grades) -> [35, 25, 25, 25]
    mg.increase_exam(mg.default_grades) -> [25, 25, 25, 35]
    mg.increase_assignment([95,95,50,50]) -> [95, 100, 50, 50]
```

**Warning** 
* If you change the code in your module, you need to restart the Python kernel or reload the module by running the following code:
```py
import importlib
importlib.reload(mg)  # pass the name the module was imported under (here `mg`)
```
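
To see why the reload is needed, the sketch below edits a module file after importing it. The module name `reload_demo_sketch` and its variable `BONUS` are hypothetical and unrelated to `mygrades`; the file lives in a temporary directory.

```python
import importlib
import sys
import tempfile
from pathlib import Path

module_name = "reload_demo_sketch"  # hypothetical module used only for this sketch

with tempfile.TemporaryDirectory() as tmp_dir:
    module_file = Path(tmp_dir, module_name + ".py")
    module_file.write_text("BONUS = 10\n")
    sys.path.insert(0, tmp_dir)
    try:
        importlib.invalidate_caches()
        demo = importlib.import_module(module_name)
        before = demo.BONUS

        # Edit the file on disk: the already-imported module does not change
        module_file.write_text("BONUS = 5  # edited\n")
        stale = demo.BONUS

        # Reloading re-executes the file and updates the module object
        importlib.reload(demo)
        after = demo.BONUS
    finally:
        sys.path.remove(tmp_dir)
        sys.modules.pop(module_name, None)

print(before, stale, after)
```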

In [None]:

# The following tests whether you have created your module.
import mygrades as mg

mg.increase_quiz(mg.default_grades)
mg.increase_exam(mg.default_grades)
mg.increase_assignment([95,95,50,50])

# If you change the code in your module, you need to restart the Python kernel or reload it with the following
import importlib
importlib.reload(mg)


## Popular Python Packages

### Numpy 

https://numpy.org/

**What Is Numpy?**

Numpy is one of the most popular libraries for numerical computing and machine learning in Python. TensorFlow and other libraries use Numpy internally to perform many operations on tensors. The array interface is the most important feature of Numpy.

**Features Of Numpy**
* Interactive: Numpy is very interactive and easy to use.
* Mathematics: It makes complex mathematical implementations very simple.
* Intuitive: It makes coding easy and the concepts easy to grasp.
* Lots of interaction: It is widely used, so there are many open source contributions.

**Uses of Numpy**

The array interface can be used to express images, sound waves, and other raw binary streams as N-dimensional arrays of real numbers. Knowledge of Numpy is therefore important for full stack developers who want to apply machine learning.
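
As a small illustration of the array interface, the sketch below treats a tiny 2x3 grid of made-up pixel intensities as an image. The values and the variable names `image` and `brighter` are assumptions chosen for the example.

```python
import numpy as np

# A tiny 2x3 grayscale "image": each entry is a pixel intensity in [0, 1]
image = np.array([[0.0, 0.5, 1.0],
                  [0.2, 0.4, 0.6]])

print(image.shape)  # rows and columns
print(image.ndim)   # number of dimensions

# Operations apply to every element at once, without an explicit loop
brighter = np.clip(image + 0.5, 0.0, 1.0)
print(brighter)
print(image.mean())
```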

### Pandas
https://pandas.pydata.org/

**What Is Pandas?**

Pandas is a data analysis library in Python that provides high-level data structures and a wide variety of analysis tools. One of the great features of this library is the ability to express complex operations on data in one or two commands. Pandas has many built-in methods for grouping, combining, and filtering data, as well as time-series functionality.

**Features Of Pandas**

Pandas makes the entire process of manipulating data easier. Support for operations such as re-indexing, iteration, sorting, aggregation, concatenation, and visualization is among the feature highlights of Pandas.
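
A few of these operations can be sketched on a small, made-up table. The column names `group` and `score` and all the values are assumptions for illustration only.

```python
import pandas as pd

scores = pd.DataFrame({"group": ["a", "b", "a"], "score": [3, 1, 2]})
more_scores = pd.DataFrame({"group": ["b"], "score": [5]})

# Concatenation: stack the two tables and renumber the rows
combined = pd.concat([scores, more_scores], ignore_index=True)

# Sorting: highest score first
ranked = combined.sort_values("score", ascending=False)

# Aggregation: total score per group
totals = combined.groupby("group")["score"].sum()

print(ranked)
print(totals)
```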

**Applications of Pandas**

Recent releases of the pandas library include hundreds of new features, bug fixes, enhancements, and API changes. Improvements have targeted its ability to group and sort data, to select the best-suited output for the `apply` method, and to support operations on custom types.

Data analysis is the main use of Pandas. When combined with other libraries and tools, Pandas also offers high functionality and a good amount of flexibility.

In [None]:
import pandas as pd

**Example:**

Store passenger data of the Titanic. For a number of passengers, we know the name (characters), age (integers) and sex (male/female) data.

To manually store data in a table, create a DataFrame. When using a Python dictionary of lists, the dictionary keys will be used as column headers and the values in each list as columns of the DataFrame.

A DataFrame is a 2-dimensional data structure that can store data of different types (including characters, integers, floating point values, categorical data and more) in columns. It is similar to a spreadsheet, a SQL table or the data.frame in R.

In [None]:
df = pd.DataFrame(
    {
        "Name": [
            "Braund, Mr. Owen Harris",
            "Allen, Mr. William Henry",
            "Bonnell, Miss. Elizabeth",
        ],
        "Age": [22, 35, 58],
        "Sex": ["male", "male", "female"],
    }
)

In [None]:
df

Working with the data in the column Age

In [None]:
df["Age"]

Create a Series from scratch as well:

In [None]:
ages = pd.Series([22, 35, 58], name="Age")
ages

To find the maximum age, we can select the Age column of the DataFrame and apply max():

In [None]:
df["Age"].max()

Or apply it to the Series:

In [None]:
ages.max()

Next, compute some basic statistics of the numerical data in the table. 

The describe() method provides a quick overview of the numerical data in a DataFrame. As the Name and Sex columns are textual data, these are by default not taken into account by the describe() method.

In [None]:
df.describe()

### Matplotlib 
https://matplotlib.org/

Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. Matplotlib makes easy things easy and hard things possible.

* Create publication quality plots.
* Make interactive figures that can zoom, pan, update.
* Customize visual style and layout.
* Export to many file formats.
* Embed in JupyterLab and Graphical User Interfaces.
* Use a rich array of third-party packages built on Matplotlib.

In [None]:
import matplotlib.pyplot as plt
import numpy as np

Define the data

In [None]:
N = 5
menMeans = (20, 35, 30, 35, -27)
womenMeans = (25, 32, 34, 20, -25)
menStd = (2, 3, 4, 1, 2)
womenStd = (3, 5, 2, 3, 3)
ind = np.arange(N)    # the x locations for the groups
width = 0.35       # the width of the bars: can also be len(x) sequence

Stacked bar plot with error bars

In [None]:
fig, ax = plt.subplots()

p1 = ax.bar(ind, menMeans, width, yerr=menStd, label='Men')
p2 = ax.bar(ind, womenMeans, width, bottom=menMeans, yerr=womenStd, label='Women')

ax.axhline(0, color='grey', linewidth=0.8)
ax.set_ylabel('Scores')
ax.set_title('Scores by group and gender')
ax.set_xticks(ind, labels=['G1', 'G2', 'G3', 'G4', 'G5'])
ax.legend()

# Label with label_type 'center' instead of the default 'edge'
ax.bar_label(p1, label_type='center')
ax.bar_label(p2, label_type='center')
ax.bar_label(p2)

fig.set_size_inches(15, 15)  # resize this figure; plt.figure() would open a new, empty one
plt.show()

Horizontal bar chart

In [None]:
# Fixing random state for reproducibility
np.random.seed(19680801)

# Example data
people = ('Tom', 'Dick', 'Harry', 'Slim', 'Jim')
y_pos = np.arange(len(people))
performance = 3 + 10 * np.random.rand(len(people))
error = np.random.rand(len(people))

fig, ax = plt.subplots()

hbars = ax.barh(y_pos, performance, xerr=error, align='center')
ax.set_yticks(y_pos, labels=people)
ax.invert_yaxis()  # labels read top-to-bottom
ax.set_xlabel('Performance')
ax.set_title('How fast do you want to go today?')

# Label with specially formatted floats
ax.bar_label(hbars, fmt='%.2f')
ax.set_xlim(right=15)  # adjust xlim to fit labels

plt.show()

Some of the more advanced things that one can do with bar labels

In [None]:
fig, ax = plt.subplots()

hbars = ax.barh(y_pos, performance, xerr=error, align='center')
ax.set_yticks(y_pos, labels=people)
ax.invert_yaxis()  # labels read top-to-bottom
ax.set_xlabel('Performance')
ax.set_title('How fast do you want to go today?')

# Label with given captions, custom padding and annotate options
ax.bar_label(hbars, labels=['±%.2f' % e for e in error],
             padding=8, color='b', fontsize=14)
ax.set_xlim(right=16)

plt.show()

Plotting categorical variables

You can pass categorical values (i.e. strings) directly as x- or y-values to many plotting functions:

In [None]:
data = {'apple': 10, 'orange': 15, 'lemon': 5, 'lime': 20}
names = list(data.keys())
values = list(data.values())

fig, axs = plt.subplots(1, 3, figsize=(9, 3), sharey=True)
axs[0].bar(names, values)
axs[1].scatter(names, values)
axs[2].plot(names, values)
fig.suptitle('Categorical Plotting')

This works on both axes:

In [None]:
cat = ["bored", "happy", "bored", "bored", "happy", "bored"]
dog = ["happy", "happy", "happy", "happy", "bored", "bored"]
activity = ["combing", "drinking", "feeding", "napping", "playing", "washing"]

fig, ax = plt.subplots()
ax.plot(activity, dog, label="dog")
ax.plot(activity, cat, label="cat")
ax.legend()

plt.show()

Plotting the coherence of two signals

An example showing how to plot the coherence of two signals.

In [None]:
# Fixing random state for reproducibility
np.random.seed(19680801)

dt = 0.01
t = np.arange(0, 30, dt)
nse1 = np.random.randn(len(t))                 # white noise 1
nse2 = np.random.randn(len(t))                 # white noise 2

# Two signals with a coherent part at 10Hz and a random part
s1 = np.sin(2 * np.pi * 10 * t) + nse1
s2 = np.sin(2 * np.pi * 10 * t) + nse2

fig, axs = plt.subplots(2, 1)
axs[0].plot(t, s1, t, s2)
axs[0].set_xlim(0, 2)
axs[0].set_xlabel('time')
axs[0].set_ylabel('s1 and s2')
axs[0].grid(True)

cxy, f = axs[1].cohere(s1, s2, 256, 1. / dt)
axs[1].set_ylabel('coherence')

fig.tight_layout()
plt.show()

### Scikit-Learn (Advanced)
https://scikit-learn.org/stable/

**What Is Scikit-learn?**

Scikit-learn is a Python library built on NumPy and SciPy. It is considered one of the best libraries for working with complex data.

The library is under active development. One modification is to the cross-validation feature, which can now use more than one metric. Training methods such as logistic regression and nearest neighbors have also received small improvements.

**Features Of Scikit-Learn**
* Cross-validation: There are various methods to check the accuracy of supervised models on unseen data.
* Unsupervised learning algorithms: There is a wide range of algorithms on offer, from clustering, factor analysis, and principal component analysis to unsupervised neural networks.
* Feature extraction: Useful for extracting features from images and text (e.g. bag of words).

**Where are we using Scikit-Learn?**

It contains numerous algorithms for implementing standard machine learning and data mining tasks such as dimensionality reduction, classification, regression, clustering, and model selection.

### Keras (Advanced)
https://keras.io/

**What Is Keras?**

Keras is considered one of the coolest machine learning libraries in Python. It provides an easier mechanism for expressing neural networks. Keras also provides some of the best utilities for compiling models, processing datasets, visualizing graphs, and much more.

On the backend, Keras uses either Theano or TensorFlow internally. Other popular backends such as CNTK can also be used. Keras is comparatively slow compared with other machine learning libraries, because it creates a computational graph using the backend infrastructure and then uses that graph to perform operations. All models in Keras are portable.

**Features Of Keras**

* It runs smoothly on both CPU and GPU.
* Keras supports almost all neural network models: fully connected, convolutional, pooling, recurrent, embedding, etc. Furthermore, these models can be combined to build more complex models.
* Keras, being modular in nature, is incredibly expressive, flexible, and apt for innovative research.
* Keras is a completely Python-based framework, which makes it easy to debug and explore.

**Where are we using Keras?**

You are already constantly interacting with features built with Keras — it is in use at Netflix, Uber, Yelp, Instacart, Zocdoc, Square, and many others. It is especially popular among startups that place deep learning at the core of their products.

Keras contains numerous implementations of commonly used neural network building blocks such as layers, objectives, activation functions, optimizers and a host of tools to make working with image and text data easier. 

Plus, it provides many pre-processed datasets, such as MNIST, and pre-trained models, such as VGG, Inception, SqueezeNet, and ResNet.

Keras is also a favorite among deep learning researchers, coming in at #2. Keras has also been adopted by researchers at large scientific organizations, in particular CERN and NASA.