### MEDC0106: Bioinformatics in Applied Biomedical Science

<p align="center">
  <img src="../../resources/static/Banner.png" alt="MEDC0106 Banner" width="90%"/>
  <br>
</p>

---------------------------------------------------------------

# 03 - Modules and packages

*Written by:* Oliver Scott

**This notebook provides a general introduction to using modules and packages in Python.**

Do not be afraid to make changes to the code cells to explore how things work!

### What are modules?

In programming, a **module** is a part of software that contains code for a specific function. When developing large projects, breaking code into modules helps keep it **readable** and **maintainable**. For example, if we were building a game, one module might handle physics, while another manages what is rendered on the screen.

In Python, a module is simply a file with a `.py` extension, containing functions, variables, and classes. When we import this code into Python, we use the name of the file without the extension. For example, a file called *test.py* defines a module, and its corresponding name in Python would be *test*.

Python includes a relatively large collection of useful modules, such as `math`; however, we can also define our own modules or download them from a third party.
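
As a minimal sketch of defining our own module, the cell below writes a small `.py` file and then imports it. The module name `greetings`, its contents, and the use of a temporary directory are all hypothetical choices made for illustration; in a real project you would normally create the file in a text editor next to your notebook or script.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a temporary directory to hold our hypothetical module file
module_dir = Path(tempfile.mkdtemp())

# Write a tiny module containing one variable and one function
(module_dir / "greetings.py").write_text(
    'default_name = "world"\n'
    "\n"
    "def greet(name=default_name):\n"
    '    return f"Hello, {name}!"\n'
)

# Python searches the directories listed in sys.path when importing
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()

import greetings  # the module name is the file name without '.py'

print(greetings.default_name)
print(greetings.greet())
print(greetings.greet("MEDC0106"))
```

The name we import (`greetings`) comes directly from the file name (`greetings.py`), as described above.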

### What are packages?

Packages are simply collections of modules and other packages, organised within a directory. A regular package is marked by a special `__init__.py` file, which tells Python that the directory defines a package. For example:

```
audio/
    __init__.py
    formats/
        __init__.py
        mp3.py
        wav.py
    effects/
        __init__.py
        echo.py
        fade.py  
```

The structure above defines a package named `audio`, which contains two sub-packages: `formats` and `effects`. The `formats` package includes two modules, `mp3` and `wav`, while the `effects` package contains the modules `echo` and `fade`. In this notebook, we won’t dwell on package structures, but it’s useful to understand their layout. If you’d like to learn more about creating and structuring packages, consider exploring the extra resources linked below.

### The Python Standard Library

Python includes an extensive [collection](https://docs.python.org/3/library/) of modules and packages that are valuable to any programmer. Becoming familiar with the standard library is essential, as it offers solutions to many common programming tasks, enabling you to solve problems quickly without writing new code. A helpful guide to the standard library is available on the Python website [here](https://docs.python.org/3.8/tutorial/stdlib.html).
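
To give a flavour of what the standard library offers, here is a small sketch that uses two of its modules, `collections` and `statistics`. The DNA sequence and read lengths are made-up values chosen purely for illustration.

```python
from collections import Counter
import statistics

# A made-up DNA sequence (illustrative only)
sequence = "ATGCGATACGCTTGA"

# Counter tallies how often each character occurs
base_counts = Counter(sequence)
print(base_counts)

# Fraction of bases that are G or C
gc_fraction = (base_counts["G"] + base_counts["C"]) / len(sequence)
print("GC fraction:", round(gc_fraction, 2))

# Made-up sequencing read lengths
read_lengths = [98, 101, 100, 99, 102]
print("Mean read length:", statistics.mean(read_lengths))
print("Median read length:", statistics.median(read_lengths))
```

Neither task required writing a counting loop or an averaging function by hand; the standard library already provides both.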

### Third-party extensions

The Python Standard Library, while extensive, is not exhaustive. Additional components can be installed from the [Python Package Index](https://pypi.org/) (PyPI), where you’ll find everything from individual modules and packages to complete application development frameworks. The scientific community also frequently uses [Anaconda](https://www.anaconda.com/products/individual), which allows easy installation of complex packages, including those with compiled code (e.g., C, C++, Rust). If you are viewing these notebooks through Binder, it has already installed several third-party packages that we will use in these sessions.

----

## Contents

1. [Importing modules](#Importing-modules)
2. [Writing modules](#Writing-modules)
3. [Packages](#Packages)
4. [Third-party modules and packages](#Third-party-modules-and-packages)
5. [Discussion](#Discussion)

----

### Extra resources

This introduction to Python is by no means comprehensive. Below are some links to external resources for learning Python if you are interested.

- [Real Python](https://realpython.com/) - Free Python tutorials
- [Codecademy](https://www.codecademy.com/learn/learn-python-3) - Python lessons
- [Cheat-Sheets](https://ehmatthes.github.io/pcc_2e/cheat_sheets/cheat_sheets/) - Python reference sheets

----

## Importing modules

In Python, modules can be imported using the `import` statement:

```python
import mymodule
```

Attributes contained within the imported module can then be accessed using the 'dot' `.` syntax:

```python
mymodule.myvariable
mymodule.myfunction()
```

The examples below show the most common ways to import functionality from modules.

In [None]:
import math

print("Pi:", math.pi)  # Modules can define useful variables

double_pi = math.pi * 2  # We can use the imported variables just like normal variables 

# Modules may also define functions that we can utilise in our own code
print("Degrees in Pi radians:", math.degrees(math.pi))
print("Degrees in 2*Pi radians:", math.degrees(double_pi))

Python's `from` statement allows you to import specific attributes from a module.

```python
from mymodule import attr1[, attr2[, attr3, ..., attrN]]
```

In [None]:
from math import pi, degrees  # Importing two attributes (variable, function)

print("Pi:", pi)
print("Degrees in Pi radians:", degrees(pi))

You can also import attributes using custom names. This is useful for shortening a module, function, or variable name, or to avoid conflicts with names in your own code. **Aliasing** does not affect functionality.

```python
import mymodule as module
from mymodule import attr1 as attr
```

In the example below we import `pi` from math as `PI`:

In [None]:
from math import pi as PI

print("Pi:", PI)
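
Aliasing is particularly handy when two modules define attributes with the same name. As a small sketch, both `math` and `cmath` (the standard library module for complex numbers) provide a `sqrt` function; the alias names `real_sqrt` and `complex_sqrt` below are our own illustrative choices.

```python
from math import sqrt as real_sqrt
from cmath import sqrt as complex_sqrt

# Each alias refers to a different function, so neither overwrites the other
print(real_sqrt(16))      # square root as a float
print(complex_sqrt(-16))  # square root as a complex number
```

Without the aliases, the second import would silently replace the first, because both would be bound to the name `sqrt`.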

Jupyter allows you to see a module's contents in a handy popup window. Just type the module name followed by a dot and hit `TAB`.

<p align="center">
  <img src="../../resources/static/tooltip.png" alt="Tooltip"/>
  <br>
</p>

Try this in the cell block below:

1. Uncomment the line below.
2. Move your cursor after the dot.
3. Press the Tab key on your keyboard.

In [None]:
# math.

Using the `help` function also gives us an overview of a module's contents.

***Note:*** *you can also use the help function to give information about functions and classes*.

In [None]:
help(math)
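
The output of `help(math)` is quite long. As a lighter-weight sketch, the built-in `dir` function lists the names a module defines, and `help` can be pointed at a single function instead of the whole module. The variable name `public_names` is just an illustrative choice.

```python
import math

# dir() returns the names defined in a module; skip the ones starting with '_'
public_names = [name for name in dir(math) if not name.startswith("_")]
print(len(public_names), "public names, for example:", public_names[:5])

# help() also works on a single function
help(math.degrees)
```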

## Packages

Packages are a convenient way to compartmentalise a large amount of functionality. They are imported with the same `import` and `from` syntax as modules:

```python
# Import mypackage
import mypackage

# Import a subpackage
from mypackage import subpackage

# Import a module from a subpackage
from mypackage.subpackage import mymodule

# Import a function from mymodule
from mypackage.subpackage.mymodule import myfunction
```

If we refer back to the `audio` package example from earlier, importing the module `wav` would be as simple as:

```python
from audio.formats import wav
```

We could then use functions defined in `wav`:

```python
wav.read_wav_file(...)
```
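
The `audio` package is only an illustration and does not exist on your system, but we can sketch a cut-down version of it to see the import working. The cell below recreates part of the layout in a temporary directory; the body of `read_wav_file` is a hypothetical placeholder that returns a string and does not read any audio.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Recreate part of the 'audio' layout inside a temporary directory
root = Path(tempfile.mkdtemp())
formats_dir = root / "audio" / "formats"
formats_dir.mkdir(parents=True)

# Empty __init__.py files mark the directories as packages
(root / "audio" / "__init__.py").write_text("")
(formats_dir / "__init__.py").write_text("")

# A placeholder module with a hypothetical function
(formats_dir / "wav.py").write_text(
    "def read_wav_file(path):\n"
    '    return f"Pretending to read {path}"\n'
)

# Make the temporary directory visible to the import system
sys.path.insert(0, str(root))
importlib.invalidate_caches()

from audio.formats import wav

print(wav.__name__)
print(wav.read_wav_file("example.wav"))
```

Notice that the dotted import path (`audio.formats`) mirrors the directory structure (`audio/formats/`).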

## Third-party modules and packages

We can also import third-party extensions using the `import` statement. Binder has handled the installation for us in this case, but we often use a package installer like [pip](https://pypi.org/project/pip/) or [conda](https://docs.conda.io/en/latest/) to install packages. In fact, we can install packages from inside Jupyter using pip like so:

```
!pip install numpy
```

If you were to try this you would probably get a message similar to this:

```
Requirement already satisfied: numpy in /srv/conda/envs/notebook/lib/python3.9/site-packages (1.21.2)
```
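
We can also check from within Python whether a package is available, without running an installer. The sketch below uses the standard library's `importlib` for this; the helper name `describe_package` is our own, and it assumes the import name matches the name the package was installed under (true for numpy and pandas, but not for every package).

```python
from importlib import metadata, util


def describe_package(name):
    """Report whether a top-level package can be imported, and its version."""
    # find_spec returns None when Python cannot locate the package
    if util.find_spec(name) is None:
        return f"{name}: not installed"
    try:
        return f"{name}: version {metadata.version(name)}"
    except metadata.PackageNotFoundError:
        # e.g. standard library modules, which are not installed via pip
        return f"{name}: importable, but no version metadata found"


for package in ["numpy", "pandas", "a_package_that_does_not_exist"]:
    print(describe_package(package))
```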

Two of the most popular Python packages are [numpy](https://numpy.org/) and [pandas](https://pandas.pydata.org/), which we will learn more about in the next session. Below is a brief example of importing these packages and using some of their functionalities.

In [None]:
import numpy

# Create a numpy array (similar to a list of numbers)
array = numpy.array([1, 2, 3, 4, 5, 6])

# We can compute some statistics using numpy
print('Mean:', numpy.mean(array))
print('STD:', numpy.std(array))

It is very common to see numpy and pandas shortened to `np` and `pd`, respectively.

```python
import pandas as pd
import numpy as np
```

The functionality remains the same while making the code look a little cleaner.

## Discussion

Python has become a very popular programming language in recent years due to its expressiveness and its 'easy-to-read' syntax. The scientific Python stack is now vast and hence it is easy to become productive when using Python in a project. Learning the basics of Python may end up being a major asset in your future career path.

Feel free to add more cells and experiment with the concepts you have learnt.

In the next session we will learn to use the popular packages [numpy](https://numpy.org/) and [pandas](https://pandas.pydata.org/), which are very powerful tools for data analysis.

Now you should try to complete the **exercises** in `04_exercises.ipynb`.

If you want to learn more, there are some extra external resources linked at the beginning of this notebook. You can click [here](#Contents) to go back to the top.