# 3. Modules and packages

Sometimes one needs more than the built-in functions and structures. Just like plugins in Fiji, Python has extensions called modules, and several modules are sometimes combined into a so-called package. These extensions provide, for example, the ability to look for files in a folder, use specific mathematical operations, or create new data structures such as 2D images.

## 3.1 Built-in modules

Python comes with a large list of built-in modules designed for various tasks, which you can find [here](https://docs.python.org/3/py-modindex.html). To illustrate how this works, we are going to look at one of them, called math, which provides mathematical functions.

To load a module we simply write:

In [1]:
import math

To learn how the module works, we can read the documentation [online](https://docs.python.org/3/library/math.html).

All the functions of a module are called using the same kind of syntax as methods: math.function()
Reading the documentation, we see for example that we have these two functions:

In [2]:
help(math.radians)

Help on built-in function radians in module math:

radians(x, /)
    Convert angle x from degrees to radians.



In [2]:
help(math.cos)

Help on built-in function cos in module math:

cos(x, /)
    Return the cosine of x (measured in radians).



Let's first convert a given angle into radians:

In [4]:
angle = math.radians(90)
print(angle)

1.5707963267948966


And let's calculate the cosine of this:

In [5]:
math.cos(angle)

6.123233995736766e-17

We get a tiny number. Because of the limited precision of floating-point numbers, the value is not exactly 0.
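Because of this rounding, comparing such results with `==` is fragile. The following is a minimal sketch of a tolerance-based comparison using `math.isclose` from the standard library; the tolerance `abs_tol=1e-12` is an arbitrary choice made for this illustration, not a recommended value.

```python
import math

angle = math.radians(90)
cosine = math.cos(angle)

# Exact comparison fails because cosine is about 6e-17, not 0
exactly_zero = (cosine == 0)

# Comparison with an absolute tolerance (1e-12 is an arbitrary choice here)
nearly_zero = math.isclose(cosine, 0, abs_tol=1e-12)

print(exactly_zero, nearly_zero)
```

An absolute tolerance is needed here because the default relative tolerance of `math.isclose` cannot succeed when one of the values is exactly 0.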

## 3.2 External modules

The modules offered by Python are still limited in scope, and often one needs to use modules written by external projects. Most of the time, such modules are hosted on services that make their installation very simple. If you use the Anaconda interface you can do it interactively. Otherwise you can use the pip or conda commands. For example, a small project called tifffile allows one to read most microscopy images. To install it, one would use the following command in a command line tool: pip install tifffile

Luckily one can also install modules directly in Jupyter by using the same command prefaced with the ! sign:

In [6]:
!pip install tifffile

Collecting tifffile
  Using cached tifffile-2020.2.16-py3-none-any.whl (130 kB)
Collecting imagecodecs>=2020.1.31
  Using cached imagecodecs-2020.2.18-cp38-cp38-macosx_10_13_x86_64.whl (7.9 MB)
Installing collected packages: imagecodecs, tifffile
Successfully installed imagecodecs-2020.2.18 tifffile-2020.2.16


For this course, we are going to use three important packages widely used by the scientific community: Numpy to handle images, Matplotlib to plot images, and Pandas to handle tabular data and statistics. They are pre-installed on the computer hosting this notebook, and are also included in the Anaconda distribution. Otherwise, just like the tifffile module, you can install them using conda or pip.
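If you are unsure whether a package is available on your computer, you can ask Python before importing it. This is a small sketch using `importlib.util.find_spec` from the standard library; the helper name `is_installed` is made up for this example, and the printed result depends on what is installed on your machine.

```python
import importlib.util

def is_installed(module_name):
    """Hypothetical helper: return True if Python can locate this module."""
    return importlib.util.find_spec(module_name) is not None

# math is built in, the others depend on your installation
for name in ["math", "numpy", "matplotlib", "pandas", "tifffile"]:
    print(name, is_installed(name))
```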

Let's start by briefly seeing how to import Numpy. We will learn about its usage in the next chapter.

In [7]:
import numpy

Just like before, we can directly access numpy functions, e.g. the numpy.mean() function, which calculates the mean of a list:

In [8]:
numpy.mean([2,6,3,8,5])

4.8

Related functions are sometimes grouped into submodules. Those can still be accessed using the same "dot" logic as functions. For example, all functions that allow one to create random numbers of different types are grouped in the random [submodule](https://docs.scipy.org/doc/numpy/reference/routines.random.html). To access the function that draws a normally distributed number, one then uses:

In [9]:
numpy.random.normal()

-2.848820514403985
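The same dot logic applies to submodules of the built-in modules. As a small sketch, `os.path` is a submodule of the built-in `os` module that groups functions for handling file paths; the path used here is made up for illustration.

```python
import os

# A made-up path used only for illustration
path = "images/experiment1/cell.tif"

# module.submodule.function(), the same pattern as numpy.random.normal()
filename = os.path.basename(path)
name, extension = os.path.splitext(filename)

print(filename)
print(name, extension)
```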

As for built-in functions, we have access to a help notice that gives us information on the function:

In [10]:
help(numpy.random.normal)

Help on built-in function normal:

normal(...) method of numpy.random.mtrand.RandomState instance
    normal(loc=0.0, scale=1.0, size=None)
    
    Draw random samples from a normal (Gaussian) distribution.
    
    The probability density function of the normal distribution, first
    derived by De Moivre and 200 years later by both Gauss and Laplace
    independently [2]_, is often called the bell curve because of
    its characteristic shape (see the example below).
    
    The normal distributions occurs often in nature.  For example, it
    describes the commonly occurring distribution of samples influenced
    by a large number of tiny, random disturbances, each with its own
    unique distribution [2]_.
    
    .. note::
        New code should use the ``normal`` method of a ``default_rng()``
        instance instead; see `random-quick-start`.
    
    Parameters
    ----------
    loc : float or array_like of floats
        Mean ("centre") of the distribution.
    scale : fl

In [11]:
numpy.random.normal(0,1,10)

array([ 0.13938756,  1.18722964,  0.46685366,  0.80862247,  0.39153953,
       -1.23395003,  0.43733666,  0.00982073, -0.06822408,  0.98766885])

## 3.3 Other ways to load modules

Typing numpy.random.normal() every time we need that function can quickly become tedious. There are two ways to abbreviate that long expression.

First we can shorten the name of the module that we import using:

In [3]:
import numpy as np

Then np simply replaces numpy in all the above expressions, e.g.:

In [4]:
np.random.normal()

-1.2363082282973985

If we know that we are going to use a specific function a lot, and it has a distinctive name that will help us remember where it comes from, we can also import just that function using:

In [14]:
from numpy.random import normal

This allows us to now use this very short version:

In [15]:
normal()

-1.6275759355627402
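These import styles only change the name under which a function is reachable, not the function itself. A minimal sketch with the built-in math module (the alias `m` is an arbitrary choice for this example):

```python
import math
import math as m
from math import cos

# All three names point to the same function object
print(math.cos is m.cos)
print(m.cos is cos)

# So all three calls give the same result
print(math.cos(0), m.cos(0), cos(0))
```

Aliases are a convention rather than a rule: `np` for numpy is so widespread that most code you will read uses it.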

## 3.4 Function arguments

Until now, we have mostly seen relatively simple built-in functions which take one argument, e.g.:

In [16]:
mystring = 'my string'
len(mystring)

9

where mystring is the argument.

However, when using modules, we will very often see functions that take multiple arguments. We have in fact just seen one when reading the help for the ```normal()``` function:

In [17]:
help(numpy.random.normal)

Help on built-in function normal:

normal(...) method of numpy.random.mtrand.RandomState instance
    normal(loc=0.0, scale=1.0, size=None)
    
    Draw random samples from a normal (Gaussian) distribution.
    
    The probability density function of the normal distribution, first
    derived by De Moivre and 200 years later by both Gauss and Laplace
    independently [2]_, is often called the bell curve because of
    its characteristic shape (see the example below).
    
    The normal distributions occurs often in nature.  For example, it
    describes the commonly occurring distribution of samples influenced
    by a large number of tiny, random disturbances, each with its own
    unique distribution [2]_.
    
    .. note::
        New code should use the ``normal`` method of a ``default_rng()``
        instance instead; see `random-quick-start`.
    
    Parameters
    ----------
    loc : float or array_like of floats
        Mean ("centre") of the distribution.
    scale : fl

The important line here is ```normal(loc=0.0, scale=1.0, size=None)```
It tells us that the ```normal``` function can take **3** arguments: loc, scale and size, which are defined later in the description (mean, standard deviation and number of points). So instead of just calling:

In [18]:
np.random.normal()

0.17456224804242743

We can specify all these parameters:

In [19]:
np.random.normal(10,2,5)

array([11.6454606 ,  9.89082451, 10.91989786,  6.39336423, 10.38349362])

We now have an array of 5 numbers drawn from a Gaussian with mean = 10 and standard deviation = 2.

But what did Numpy do before, when we didn't give explicit values? It used the default values (0, 1, None) given in the function description ```normal(loc=0.0, scale=1.0, size=None)```
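Default values can also be read programmatically. The sketch below uses `inspect.signature` from the standard library on `draw_normal`, a hypothetical pure-Python stand-in written for this example that only mimics the signature of ```normal()```; it is not how Numpy implements the function.

```python
import inspect
import random

def draw_normal(loc=0.0, scale=1.0, size=None):
    """Hypothetical stand-in that mimics the signature of normal()."""
    if size is None:
        return random.gauss(loc, scale)
    return [random.gauss(loc, scale) for _ in range(size)]

# Collect each parameter name together with its default value
signature = inspect.signature(draw_normal)
defaults = {name: p.default for name, p in signature.parameters.items()}

print(signature)
print(defaults)
```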

Of course we don't have to specify all parameters. We can for example just give the first two, mean and standard deviation:

In [20]:
np.random.normal(10,2)

9.528503839955198

But what if we want to specify just the last parameter? How does Python know that we want to change ```size``` and not ```loc```? **We have to explicitly name the parameter we mean:**

In [21]:
np.random.normal(size = 10)

array([ 0.7054198 , -0.15598511,  0.86876286, -0.16211019, -1.9110427 ,
       -0.94831676,  0.74759153,  0.65742365, -0.18157261, -1.0454794 ])

Function parameters do not necessarily have default values. For example, if we call the ```cos()``` function without arguments we get an error:

In [5]:
np.cos()

ValueError: invalid number of arguments

**The general rule is: parameters that do not have default values come FIRST when one calls the function. Optional parameters follow. If arguments are given in their original order, there is no need to specify which one is meant; otherwise they HAVE TO BE NAMED.**
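The rule can be tried out on a small function of our own. In this sketch, `rescale` and its parameters `factor` and `offset` are hypothetical names invented for the example: `values` has no default and is therefore required, while the other two are optional.

```python
def rescale(values, factor=1.0, offset=0.0):
    """Hypothetical helper: multiply each value by factor, then add offset."""
    return [v * factor + offset for v in values]

data = [1, 2, 3]

print(rescale(data))             # only the required argument
print(rescale(data, 2))          # positional: 2 is taken as factor
print(rescale(data, offset=10))  # skip factor by naming offset

# Leaving out the required argument raises a TypeError
try:
    rescale()
except TypeError as error:
    print("TypeError:", error)
```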