# Modules in Python

One reason Python is so versatile is its broad ecosystem of tools and packages, which add specialized functionality on top of "bare" Python.

## Loading Modules: the ``import`` Statement

For loading built-in and third-party modules, Python provides the ``import`` statement.

#### <font color='green'>Good</font>
import <font color='green'>sys</font>

from os import <font color='green'>path</font>

import statistics <font color='green'>as stats</font>

from custom_package import <font color='green'>mode</font>

from statistics import <font color='green'>mean, median</font>

#### <font color='red'>Bad:</font> can silently overwrite previously imported names
from math import <font color='red'><b>*</b></font>

from pylab import <font color='red'><b>*</b></font>
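To see why star imports are risky, here is a minimal sketch using two standard-library modules, `math` and `cmath`, which both define `sqrt`. The choice of modules is an assumption made for illustration, and explicit `from ... import sqrt` lines stand in for the star imports; a star import rebinds every shared name in the same way, without any warning.

```python
from math import sqrt    # sqrt refers to math.sqrt
from cmath import sqrt   # sqrt now refers to cmath.sqrt; no warning is raised

result = sqrt(4)
print(result)            # a complex number, not the float 2.0

# Importing the modules themselves keeps both functions available.
import math
import cmath

print(math.sqrt(4), cmath.sqrt(4))
```

Keeping the module name as a prefix makes it obvious which `sqrt` is being called.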

Today we will import the **numpy** module, a powerful and flexible maths package.

In [2]:
import numpy as np # Because I am too lazy to write numpy every time

# ![](http://www.numpy.org/_static/numpy_logo.png) 
##### NumPy supports arrays, which are very useful for numerical computations
* Arrays are N-dimensional: 1D (vector), 2D (matrix), ..., N-D
* Arrays are (generally) faster than lists
* Many packages use NumPy arrays to store data
* Arrays let you perform calculations in one command, without `for` loops or list comprehensions
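As a small illustration of the last point, the sketch below converts some temperatures from Celsius to Fahrenheit twice: once with a list comprehension and once with a single array expression. The sample values and variable names are made up for this example.

```python
import numpy as np

temps_c = [12.5, 18.0, 21.3, 9.8]   # made-up sample values

# Plain Python: convert element by element with a list comprehension.
temps_f_list = [t * 9 / 5 + 32 for t in temps_c]

# NumPy: the same arithmetic is applied to the whole array at once.
temps_f_arr = np.array(temps_c) * 9 / 5 + 32

print(temps_f_list)
print(temps_f_arr)
```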

We will start with some randomly generated data to test numpy's functionality:

array([[ -4,   9,  -5,  -6,   2],
       [-10,   3,   6,   0,  -4],
       [-10,   5,   6,  -4,   2],
       [ -1,   1,  -9, -10,   8],
       [ -1,  -5,   6,   3,  -3]])

### Looking for help?

* Documentation: http://docs.scipy.org/doc/numpy/reference/
* Google is your friend! Especially links to Stack Overflow
* Use the `help` function (pressing Tab shows the available options)

In [5]:
help(np.mean)

Help on function mean in module numpy.core.fromnumeric:

mean(a, axis=None, dtype=None, out=None, keepdims=<no value>)
    Compute the arithmetic mean along the specified axis.
    
    Returns the average of the array elements.  The average is taken over
    the flattened array by default, otherwise over the specified axis.
    `float64` intermediate and return values are used for integer inputs.
    
    Parameters
    ----------
    a : array_like
        Array containing numbers whose mean is desired. If `a` is not an
        array, a conversion is attempted.
    axis : None or int or tuple of ints, optional
        Axis or axes along which the means are computed. The default is to
        compute the mean of the flattened array.
    
        .. versionadded:: 1.7.0
    
        If this is a tuple of ints, a mean is performed over multiple axes,
        instead of a single axis or all the axes as before.
    dtype : data-type, optional
        Type to use in computing the mean.  Fo

### Creating an array from a list

In [34]:
a1d = np.array([3, 4, 5, 6])
a1d

array([3, 4, 5, 6])

In [7]:
a2d = np.array([[10.,   20, 30], [9, 8, 5]])
a2d

array([[10., 20., 30.],
       [ 9.,  8.,  5.]])

Can you guess what the following slices are equal to? Print them to check your understanding.

In [25]:
a2d[0,0]

10.0

In [26]:
a2d[0,1:]

array([20., 30.])

In [27]:
a2d[:,2]

array([30.,  5.])
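One detail worth checking when slicing is the shape of the result. The sketch below reuses `a2d` from above (the other variable names are chosen for illustration) and shows that an integer index removes an axis, while a slice keeps it.

```python
import numpy as np

a2d = np.array([[10., 20, 30], [9, 8, 5]])

row = a2d[1, :]       # integer index on axis 0: the result is 1D
row_2d = a2d[1:, :]   # slice on axis 0: the result stays 2D
last = a2d[-1, -1]    # negative indices count from the end

print(row.shape)      # (3,)
print(row_2d.shape)   # (1, 3)
print(last)           # 5.0
```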

**Exercise** Create a 2D NumPy array from the following list and assign it to the variable "a":

In [28]:
# [[2, 3.2, 5.5, -6.4, -2.2, 2.4],
#  [1, 22, 4, 0.1, 5.3, -9],
#  [3, 1, 2.1, 21, 1.1, -2]]

### Array attributes

#### ndarray.ndim
The number of axes (dimensions) of the array. Older NumPy documentation refers to the number of dimensions as the rank; this is not the same as matrix rank in linear algebra.

In [30]:
a2d.ndim

2

#### ndarray.shape
The dimensions of the array: a tuple giving the size along each axis.

In [31]:
a2d.shape

(2, 3)
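The attributes are related: `ndim` is the length of `shape`. The short sketch below checks this for `a2d` and also prints two further standard attributes, `size` and `dtype`, which are not covered above.

```python
import numpy as np

a2d = np.array([[10., 20, 30], [9, 8, 5]])

print(a2d.ndim == len(a2d.shape))   # True: one shape entry per axis
print(a2d.size)                     # 6, the total number of elements (2 * 3)
print(a2d.dtype)                    # float64, because the list contains a float
```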

### Functions for creating arrays
#### ``arange([start,] stop[, step,], dtype=None)``
#### evenly spaced, defined by step

In [13]:
np.arange(1, 9, 2)

array([1, 3, 5, 7])

#### ``linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None)``
#### evenly spaced, defined by length

In [15]:
np.linspace(0, 1, 11)   # start, end, num-points

array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])
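The main practical difference between the two functions is how they treat the end point. In the sketch below (values chosen for illustration), `arange` stops before `stop`, while `linspace` includes it by default; `retstep=True` additionally returns the spacing that `linspace` computed.

```python
import numpy as np

by_step = np.arange(0, 1, 0.25)    # step given; stop (1) is excluded
by_count = np.linspace(0, 1, 5)    # number of points given; stop (1) is included

# retstep=True returns the array together with the spacing between points.
grid, step = np.linspace(0, 1, 5, retstep=True)

print(by_step)
print(by_count)
print(step)
```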

**Exercise**

Create arrays of evenly spaced numbers

In [None]:
# Numbers from 1 to 10 in steps of 1

# From 0 to -2 in steps of -0.4

# 100 steps from -pi to pi (hint: use np.pi)

#### Create an array filled with zeros

In [44]:
np.zeros((2, 3))

array([[0., 0., 0.],
       [0., 0., 0.]])
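By default `np.zeros` returns floating-point values, as the trailing dots in the output show. A minimal sketch of requesting a different type with the `dtype` argument follows; `np.ones` is included as the analogous function for an array filled with ones.

```python
import numpy as np

z_float = np.zeros((2, 3))             # default dtype is float64
z_int = np.zeros((2, 3), dtype=int)    # request integers instead
ones = np.ones(4)                      # same idea, filled with ones

print(z_float.dtype)
print(z_int.dtype)
print(ones)
```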

#### Create an array with random numbers

In [17]:
np.random.rand(4)       # From a uniform distribution between 0 and 1

array([0.50755507, 0.0211933 , 0.43352176, 0.44631306])

In [18]:
np.random.normal(0, 1, size=4)      # Gaussian (mean, std dev, num samples)

array([ 0.65034618, -0.51433646,  0.53942869,  1.52676162])

In [20]:
np.random.randint(-10,high=10,size=(5,5)) # Random integers in a specified range
# How does this function work? Try uncommenting the next line to read the documentation for this function
#np.random.randint?

array([[ -4, -10,   8,  -8,   2],
       [  7,  -6,   4,  -8,  -6],
       [  7,   4,  -5,   2,  -5],
       [ -2,   8,  -7,  -1,   8],
       [  7,   7,  -5,   1,   1]])
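Random numbers differ on every run, which makes results hard to reproduce. One common approach, sketched below, is to fix the seed with `np.random.seed` before drawing; the seed value 42 is an arbitrary choice. The sketch also checks that the `high` bound of `randint` is exclusive.

```python
import numpy as np

np.random.seed(42)   # arbitrary seed value, chosen for illustration
first = np.random.randint(-10, high=10, size=(5, 5))

np.random.seed(42)   # resetting the seed replays the same sequence
second = np.random.randint(-10, high=10, size=(5, 5))

print(np.array_equal(first, second))          # True
print(first.min() >= -10, first.max() < 10)   # True True: high is exclusive
```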

#### Grid generation
* A common task is to generate a pair of arrays that represent data coordinates. 
* Useful for interpolation or for mapping contours.
* When orthogonal 1D coordinate arrays already exist, NumPy's `meshgrid` function is very useful:

In [22]:
x = np.linspace(-5, 5, 3)
y = np.linspace(10, 40, 4)
print(x)
print(y)

[-5.  0.  5.]
[10. 20. 30. 40.]


In [23]:
x2d, y2d = np.meshgrid(x, y)
print(x2d)
print(y2d)

[[-5.  0.  5.]
 [-5.  0.  5.]
 [-5.  0.  5.]
 [-5.  0.  5.]]
[[10. 10. 10.]
 [20. 20. 20.]
 [30. 30. 30.]
 [40. 40. 40.]]
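A typical use of the two grids is to evaluate a function of `x` and `y` at every grid point in one expression. The function `x**2 + y` below is an arbitrary choice for illustration.

```python
import numpy as np

x = np.linspace(-5, 5, 3)
y = np.linspace(10, 40, 4)
x2d, y2d = np.meshgrid(x, y)

# Evaluate z = x**2 + y at every (x, y) pair; no loops are needed.
z = x2d**2 + y2d

print(z.shape)   # (4, 3): one row per y value, one column per x value
print(z)
```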


Transpose arrays with `.T`

In [24]:
y2d.T

array([[10., 20., 30., 40.],
       [10., 20., 30., 40.],
       [10., 20., 30., 40.]])

### Statistical methods of arrays

In [37]:
print('array a1d                       :', a1d)
print('Minimum and maximum             :', a1d.min(), a1d.max())
print('Index of minimum and maximum    :', a1d.argmin(), a1d.argmax())
print('Sum and product of all elements :', a1d.sum(), a1d.prod())
print('Mean and standard deviation     :', a1d.mean(), a1d.std())
print('Median and 75th percentile      :', np.median(a1d), np.percentile(a1d,75))

array a1d                       : [3 4 5 6]
Minimum and maximum             : 3 6
Index of minimum and maximum    : 0 3
Sum and product of all elements : 18 360
Mean and standard deviation     : 4.5 1.118033988749895
Median and 75th percentile      : 4.5 5.25
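Note that `std()` divides by the number of elements N by default, giving the population standard deviation. If the sample standard deviation (dividing by N - 1) is wanted instead, the `ddof` argument can be used, as in this short sketch:

```python
import numpy as np

a1d = np.array([3, 4, 5, 6])

pop_std = a1d.std()              # divides by N (default, ddof=0)
sample_std = a1d.std(ddof=1)     # divides by N - 1

print(pop_std)
print(sample_std)
```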


### Operations over a given axis

In [39]:
print(a2d)
print('sum array  :', a2d.sum())
print('sum axis 0 :', a2d.sum(axis=0))
print('sum axis 1 :', a2d.sum(axis=1))

[[10. 20. 30.]
 [ 9.  8.  5.]]
sum array  : 82.0
sum axis 0 : [19. 28. 35.]
sum axis 1 : [60. 22.]
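The axis passed to a reduction is the one that disappears from the result: `axis=0` collapses the rows and `axis=1` collapses the columns. The sketch below shows the resulting shapes, and uses the `keepdims` argument to keep the reduced axis with length 1 so that each row can be divided by its own sum. The variable names are chosen for illustration.

```python
import numpy as np

a2d = np.array([[10., 20, 30], [9, 8, 5]])

col_sums = a2d.sum(axis=0)                  # shape (3,): one value per column
row_sums = a2d.sum(axis=1)                  # shape (2,): one value per row
row_sums_2d = a2d.sum(axis=1, keepdims=True)   # shape (2, 1)

# With keepdims=True, each row can be divided by its own sum directly.
fractions = a2d / row_sums_2d

print(col_sums.shape, row_sums.shape, row_sums_2d.shape)
print(fractions.sum(axis=1))                # each row now sums to 1
```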


**Exercise** Using the array 'a' we created earlier, find:
* The maximum value
* The 90th percentile 
* The mean along axis 0
* The sum along axis 1

### Shape manipulation

In [40]:
b = np.array([[1, 2, 3], [4, 5, 6]])

In [41]:
b.flatten()

array([1, 2, 3, 4, 5, 6])

In [42]:
b.reshape(3,2)

array([[1, 2],
       [3, 4],
       [5, 6]])

In [43]:
b.repeat(3)

array([1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6])
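A few variations on these methods are sketched below: `reshape` accepts `-1` for one dimension and infers it from the array size, `flatten` returns a copy so the original is unaffected by later changes, and `repeat` takes an optional `axis` argument. The variable names are chosen for illustration.

```python
import numpy as np

b = np.array([[1, 2, 3], [4, 5, 6]])

c = b.reshape(-1, 2)       # -1 lets NumPy infer the row count: shape (3, 2)

flat = b.flatten()         # flatten returns a copy of the data
flat[0] = 99               # changing the copy leaves b untouched

rows_twice = b.repeat(2, axis=0)   # repeat each row instead of each element

print(c.shape)
print(b[0, 0])             # still 1
print(rows_twice)
```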