# Packages

-    [Install package](#Install-package)  
-    [Import package](#Import-package)  
-    [Math package](#Math-package)  
-    [Subpackages](#Subpackages)

Bundling every function and method ever written into a single Python distribution would be a mess: you would install a huge amount of code that you never use. Python solves this with *packages*. A package is a directory of Python scripts, and each script is called a *module*. These modules define functions, methods, and new Python types aimed at solving particular problems.
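To make the package/module distinction concrete, here is a minimal sketch that inspects **json**, a package from the standard library (chosen only because it needs no installation; it is not one of the packages discussed here). It relies on the fact that a package object carries a `__path__` attribute pointing at its directory, while a plain module does not.

```python
import json            # a package: a directory containing several modules
import json.decoder    # one module (a single script) inside that package

# A package exposes __path__, the directory (or directories) holding its modules.
package_dirs = list(json.__path__)

# A plain module is a single file and has no __path__ attribute.
is_package = hasattr(json, "__path__")
decoder_is_package = hasattr(json.decoder, "__path__")

print(package_dirs)                     # location depends on your installation
print(is_package, decoder_is_package)
```

The printed directory differs from system to system, so no fixed output is shown for it.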

There are thousands of packages available on the internet. Among the most commonly used for data science are **numpy** for working efficiently with arrays, **matplotlib** for data visualization, and **scikit-learn** for machine learning.

### **Install package**

Not all packages ship with Python by default. To use one, you first install it on your system and then add an `import` statement to your script to tell Python that you want to use it.  
To install packages you can use `pip`, a package management system for Python.

In [1]:
!pip3 install numpy # install numpy package.

Collecting numpy
  Downloading numpy-1.19.4-cp38-cp38-manylinux2010_x86_64.whl (14.5 MB)
Installing collected packages: numpy
Successfully installed numpy-1.19.4
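If you want to confirm from inside Python that an installation worked, one possible check is sketched below. The helper names `is_installed` and `installed_version` are made up for this illustration; they wrap `importlib.util.find_spec` and `importlib.metadata.version` from the standard library (the latter assumes Python 3.8 or newer).

```python
import importlib.util
from importlib import metadata

def is_installed(package_name):
    """Return True if Python can locate a top-level package or module with this name."""
    return importlib.util.find_spec(package_name) is not None

def installed_version(distribution_name):
    """Return the installed version string, or None if the distribution is missing."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None

# The results depend on what is installed on your system.
print(is_installed("numpy"))
print(installed_version("numpy"))
```

Note that the import name and the name used with `pip` are not always identical, so the two helpers may need different arguments for some packages.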


### **Import package**

**numpy** is now installed on your system. Before you can use it, you need to import the package, or a specific module of it, with the `import` statement.

In [5]:
import numpy

A commonly used function in **numpy** is `array`, which takes a list as input. To call `array`, you need to prefix it with the package name using *dot notation*. Otherwise Python raises a `NameError`.

In [8]:
array([1, 2, 3])

NameError: name 'array' is not defined

In [9]:
numpy.array([1, 2, 3])

array([1, 2, 3])

You can also shorten the `numpy` prefix by referring to the package under a different name.

In [12]:
import numpy as np
np.array([1, 2, 3])

array([1, 2, 3])

Suppose you only want to use the `array` function from the **numpy** package. Instead of importing the whole package, you can import just this function. You then no longer need dot notation when calling `array`.

In [13]:
from numpy import array
array([1, 2, 3])

array([1, 2, 3])

The more standard `import numpy` is often preferred, as it makes very clear that you are working with the **numpy** package.
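The three import styles only change the name you type; they all give you the same function object. The sketch below illustrates this with the standard-library **math** module (used here instead of **numpy** purely so that it runs without installing anything).

```python
import math                 # full name: call as math.sqrt
import math as m            # alias: call as m.sqrt
from math import sqrt       # selective import: call as sqrt

# All three names point at one and the same function.
same_object = math.sqrt is m.sqrt is sqrt

print(same_object)                           # True
print(math.sqrt(16), m.sqrt(16), sqrt(16))   # 4.0 4.0 4.0
```

With the selective form, nothing in the call `sqrt(16)` tells the reader where `sqrt` comes from, which is one reason the prefixed forms are often easier to follow in longer scripts.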

### **Math package**

For a fancy clustering algorithm, you want to find the circumference, `C`, and area, `A`, of a circle. When the radius of the circle is `r`, you can calculate `C` and `A` as:  
`C=2πr`  
`A=πr²`  
To use the constant `pi`, you'll need the **math** package, which ships with Python and needs no installation.

In [14]:
import math

Define `r` and calculate `C` and `A`:

In [15]:
r = 0.43
C = 2 * math.pi * r
A = math.pi * r ** 2

Print out the result of the calculation:

In [16]:
print("Circumference: " + str(C))
print("Area: " + str(A))

Circumference: 2.701769682087222
Area: 0.5808804816487527
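If you need these values for more than one radius, the formulas `C=2πr` and `A=πr²` can be wrapped in a small function. This is only a sketch; the name `circle_metrics` is invented for the example.

```python
import math

def circle_metrics(r):
    """Return (circumference, area) of a circle with radius r."""
    circumference = 2 * math.pi * r
    area = math.pi * r ** 2      # pi times r squared
    return circumference, area

C, A = circle_metrics(0.43)
print("Circumference: " + str(C))
print("Area: " + str(A))
```

Note that `**` binds more tightly than `*`, so `math.pi * r ** 2` squares only `r`, as the formula requires.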


Let's say the Moon's orbit around planet Earth is a perfect circle, with a radius `r` (in km). 

In [18]:
r = 192500

Use a selective import to import the `radians` function from the **math** package.

In [19]:
from math import radians

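Calculate the distance travelled by the Moon over 12 degrees of its orbit. The length of an arc is the radius times the angle in radians, so convert the 12 degrees with `radians` first: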

In [20]:
r * radians(12)

40317.10572106901
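The same calculation can be written as a reusable function and sanity-checked: a full 360-degree turn should give the circumference `2πr`. The function name `arc_length` is an assumption made for this sketch.

```python
from math import radians, pi, isclose

def arc_length(radius, angle_degrees):
    """Distance travelled along a circle of the given radius over angle_degrees."""
    return radius * radians(angle_degrees)

moon_distance = arc_length(192500, 12)
print(moon_distance)

# Sanity check: a full orbit equals the circumference 2 * pi * r.
full_orbit_matches = isclose(arc_length(192500, 360), 2 * pi * 192500)
print(full_orbit_matches)
```

`isclose` is used for the comparison because floating-point results can differ in the last digits.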

### **Subpackages**

Packages may contain subpackages. To import a function from a subpackage, you name the subpackage with *dot notation* as well. For example, the function `inv()` lives in the **linalg** subpackage of the **scipy** package. You can import it, here under the alias `my_inv`, as follows:

In [22]:
from scipy.linalg import inv as my_inv
my_inv

<function scipy.linalg.basic.inv(a, overwrite_a=False, check_finite=True)>
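The same dotted pattern works for any nested package. As a sketch that runs without installing **scipy**, the example below uses the standard library instead: `xml` is a package, `etree` is a subpackage inside it, and `ElementTree` is a module that provides the `fromstring` function. The alias `parse_xml` and the sample XML string are made up for the illustration.

```python
# package.subpackage.module -> function, imported under an alias
from xml.etree.ElementTree import fromstring as parse_xml

# Parse a small, made-up XML snippet and read a value from it.
root = parse_xml("<orbit><radius unit='km'>192500</radius></orbit>")
radius = int(root.find("radius").text)

print(root.tag, radius)   # orbit 192500
```

Each dot moves one level deeper, and the `as` clause works the same way as in `import numpy as np`.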