# Libraries and modules

## What is a library?

A library is a collection of modules that are grouped together because they provide similar or related functionality.

A library or its modules are sometimes called a `package` (there are slight differences between these terms, but they don't matter at the moment).

Examples of libraries:
* [pandas](https://pypi.org/project/pandas/): "provides fast, flexible, and expressive data structures designed to make working with "relational" or "labeled" data both easy and intuitive".
* [fuzzywuzzy](https://pypi.org/project/fuzzywuzzy/): "fuzzy string matching like a boss. It uses Levenshtein Distance to calculate the differences between sequences in a simple-to-use package."
* [scikit-learn](https://pypi.org/project/scikit-learn/): "a Python module for machine learning".
* [haversine](https://pypi.org/project/haversine/): "Calculate the distance (in various units) between two points on Earth using their latitude and longitude."

They are tools other people have written and shared with the community.

Python is famous for how easily you can integrate other people's code into your own. Libraries are what make this possible.

![](https://imgs.xkcd.com/comics/python.png)
Source: https://xkcd.com/353

Library modules can be made available in your code with the `import` statement.

## The Python standard library

The standard library contains many very useful modules. It is distributed together with Python, so its modules can always be used without installing anything.

You can learn more about the standard library and its modules here: https://docs.python.org/3/library/

If you want to use these modules in your code, you need to `import` them into your script (or notebook).

#### Example 1: The `math` module

The `math` module provides common mathematical functions.

See the documentation: https://docs.python.org/3/library/math.html#module-math

For example, imagine we want to find the square of 5 (written $5^2$ in mathematical notation, or `5*5` in Python). We don't necessarily have to use the `math` module; we could just do:

In [None]:
5*5

Or, if we want to calculate 5 to the power of 10 (written $5^{10}$ in mathematical notation, or `5*5*5*5*5...` in Python), we could do:

In [None]:
5*5*5*5*5*5*5*5*5*5

However, there is a function in the `math` module for this calculation.

See the docs: https://docs.python.org/3/library/math.html#math.pow

First, we need to import the module:

In [None]:
import math

Then, to use any of the imported module's functions, we use "dot notation": the name of the module, then a dot, then the name of the function we wish to use.

As you can see in the documentation, the syntax for using this function is: `math.pow(x, y)`, which returns `x` raised to the power `y`.

For example, `math.pow(5, 2)` calculates $5^2$, and `math.pow(5, 10)` calculates $5^{10}$.

In [None]:
import math

number = 5
number_power_of_two = math.pow(number, 2)
print(number_power_of_two)

Note that the same can be written a bit differently.

You can import just one function from the module, using the following syntax:

```
from [module] import [function]
```

For example:

```
from math import pow
```

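Then, to use it in your code, instead of using the dot notation `math.pow(x, y)`, you can call the function directly: `pow(x, y)`. Be aware that Python also has a built-in function called `pow`, and this import replaces it under that name for the rest of the session.

The code in the following cell does exactly the same as the previous code cell: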

In [None]:
from math import pow

number = 5
number_power_of_two = pow(number, 2)
print(number_power_of_two)
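
Python also has a built-in power operator, `**`, so the `math` module is not strictly required for this calculation. The short sketch below compares the two options; the variable names are only for illustration. It also shows why `5^2` should not be typed into Python to compute a square: there, `^` is the bitwise XOR operator.

```python
import math

number = 5

# The ** operator raises a number to a power and keeps integers as integers
square_with_operator = number ** 2

# math.pow converts its arguments to floats and returns a float
square_with_math = math.pow(number, 2)

# In Python, ^ is the bitwise XOR operator, not exponentiation
xor_result = number ^ 2

print(square_with_operator)  # 25
print(square_with_math)      # 25.0
print(xor_result)            # 7
```

Which one to use is mostly a matter of taste: `**` is shorter and preserves integers, while `math.pow` always works in floating point.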

#### Example 2: The `statistics` module

The `statistics` module provides functions for calculating basic statistics.

Documentation: https://docs.python.org/3/library/statistics.html

For example, you can find the arithmetic mean of some data, using the `mean` function:

Documentation: https://docs.python.org/3/library/statistics.html#statistics.mean

In [None]:
from statistics import mean

list_of_numbers = [3, 4, 5, 2, 6, 3, 6, 3, 7, 9]

print(mean(list_of_numbers))
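
The same module offers other summary statistics. As a small sketch, assuming the same list as above, several functions can be imported in a single line by separating them with commas:

```python
from statistics import mean, median, stdev

list_of_numbers = [3, 4, 5, 2, 6, 3, 6, 3, 7, 9]

# Arithmetic mean: the sum (48) divided by the number of values (10)
print(mean(list_of_numbers))    # 4.8

# Median: the middle value of the sorted data (here, halfway between 4 and 5)
print(median(list_of_numbers))  # 4.5

# Sample standard deviation: how spread out the values are
print(stdev(list_of_numbers))
```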

#### Example 3: The `random` module

The `random` module is used a lot by researchers to randomise data or select random elements from a sample.

Let's imagine we want to get a random element from a list. We can use the `choice` function.

See documentation: https://docs.python.org/3/library/random.html#random.choice

Run the example (several times!):

In [None]:
import random

list_of_authors = ["Shelley", "Austen", "Dickens", "Woolf", "Wilde", "Forster"]

random_author = random.choice(list_of_authors)

print(random_author)
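
Sometimes you want a "random" result that is the same every time the code runs, for example so that others can reproduce your analysis. One way to do this is to fix the seed of the random number generator with `random.seed`. The sketch below assumes the same list of authors; the seed value `42` is arbitrary.

```python
import random

list_of_authors = ["Shelley", "Austen", "Dickens", "Woolf", "Wilde", "Forster"]

# Fixing the seed makes the sequence of "random" results repeatable
random.seed(42)
first_pick = random.choice(list_of_authors)

# Resetting the same seed restarts the same sequence
random.seed(42)
second_pick = random.choice(list_of_authors)

# Both picks are identical because the seed was the same
print(first_pick == second_pick)  # True

# sample() picks several distinct elements without modifying the list
three_authors = random.sample(list_of_authors, 3)
print(three_authors)
```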

## External libraries

There are many very useful modules that belong to libraries developed outside the Python standard library.

These libraries not only need to be imported: they also need to be installed before you can use them.

In other words, `import` does not install libraries. It just makes them available to your current notebook session, assuming they are already installed.
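
If you are not sure whether a library is installed, you can ask Python to look for it without importing it. The following is a minimal sketch using `importlib.util.find_spec` from the standard library; the helper name `is_installed` and the non-existent module name are made up for this illustration.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can find a top-level module with this name."""
    return importlib.util.find_spec(module_name) is not None


# math is part of the standard library, so it is always found
print(is_installed("math"))  # True

# A made-up name that is (presumably) not installed anywhere
print(is_installed("a_module_that_does_not_exist"))  # False
```

Trying `import` directly also works: if the library is missing, Python raises a `ModuleNotFoundError`.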

Most libraries can be found in the Python Package Index (PyPI): https://pypi.org

### Installing libraries

If you are working locally, make sure you are in the correct environment.

You can install libraries from the terminal with one of the following two commands:
```
conda install pandas
```

```
pip install pandas
```

You can do the same from a Jupyter notebook cell, but in this case you need to start the command with `!` or `%`:
```
!pip install pandas
```

Then you can import the newly-installed library with `import`:

In [None]:
import pandas

A final note on `import`: you can give the library an "alias", a shorter name that you will use throughout your code. For example, it's common practice to import the `pandas` library and rename it to `pd`:

In [None]:
import pandas as pd

Note that, if you do this, you will need to call its functions with the short name you've defined (`pd`), not `pandas`.
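
Aliases work the same way for any module. The sketch below uses the standard library's `statistics` module, so it runs without installing anything; the alias `stats` is just a choice made for this example.

```python
import statistics as stats

list_of_numbers = [3, 4, 5, 2, 6, 3, 6, 3, 7, 9]

# The module is reachable through its alias
alias_result = stats.mean(list_of_numbers)
print(alias_result)  # 4.8

# The original name was never defined, so using it raises a NameError
try:
    statistics.mean(list_of_numbers)
    original_name_works = True
except NameError:
    original_name_works = False

print(original_name_works)  # False
```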

## Don't reinvent the wheel

It is very likely that much of the functionality you will need already exists. People have been developing Python code and applications non-stop for 30 years!

Whenever possible, reuse code that has been shared in a package rather than reinventing the wheel. Widely used libraries have been tested by many users, and their functions have been fixed, updated, and made more efficient over time.