# Python Modules - NumPy and SciPy

## Objectives

* Know about numerical functions provided by NumPy (`numpy`)
* Understand the concept of a NumPy array (`numpy.ndarray`)
* Know about scientific functions provided by SciPy (`scipy`)
* Know how to do linear regression with `scipy.stats.linregress`
* Know how to access the `numpy` and `scipy` documentation

**Time**: 15 minutes

## NumPy

NumPy (`numpy`) is a large and extremely well developed module that provides mathematical functions and datatypes for Python, from the simple to the complex. It is too large to cover in full, so we will only introduce you to a couple of functions today.

<div style="background-color:#cdefff; border-radius: 5px; padding: 10pt"><strong>Task:</strong> Create a new code cell beneath this cell and import the <code style="background-color:#cdefff">numpy</code> module. It is conventional to give the <code style="background-color:#cdefff">numpy</code> module the alias <code style="background-color:#cdefff">np</code>.</div>

<br>
<div style="background-color:#cdefff; border-radius: 5px; padding: 10pt"><strong>Task:</strong> Find the NumPy Documentation on-line. Can you easily navigate the documentation to find useful functions?</div>

One of the key features of `numpy` is the introduction of a new datatype: the `numpy.ndarray`.

<div style="background-color:#cdefff; border-radius: 5px; padding: 10pt"><strong>Task:</strong> Search the on-line NumPy documentation to find the <code style="background-color:#cdefff">numpy.ndarray</code> page. Create a new Markdown cell beneath this one - list the four most important features of a <code style="background-color:#cdefff">numpy.ndarray</code> as discussed by the group.</div>

<div style="background-color:#cdefff; border-radius: 5px; padding: 10pt"><strong>Task:</strong> Run the code cell beneath this one to see how to create a simple <code style="background-color:#cdefff">numpy.ndarray</code>. Note how <code style="background-color:#cdefff">numpy</code> has easy methods for calculating things like the mean and standard deviation of an array without having to write loops.</div>

In [None]:
# Create a 3x3 array with the numbers 1 to 9
myArray = np.array([[1, 2, 3],
                    [4, 5, 6],
                    [7, 8, 9]])
print(myArray)

# Calculate the mean and standard deviation of myArray
print('The mean of myArray is {0:.2f} ...'.format(myArray.mean()))
print('...and the standard deviation is {0:.2f}.'.format(myArray.std()))

# Calculate the mean of each column and of each row
print('The mean of each column of myArray is:')
print(myArray.mean(axis=0))
print('The mean of each row of myArray is:')
print(myArray.mean(axis=1))

The `numpy.ndarray` is particularly important if you plan to analyse images or other data with two or more dimensions, e.g. geological recordings.
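
As a minimal sketch of why arrays suit image-like data, the cell below builds a small made-up 4x5 "image" and inspects it. The values and the names `image`, `corner`, `scaled` and `bright` are illustrative assumptions, not course data.

```python
import numpy as np

# A hypothetical 4x5 greyscale "image": 4 rows, 5 columns, values 0 to 19
image = np.arange(20).reshape(4, 5)

print(image.shape)   # (rows, columns)
print(image.ndim)    # number of dimensions
print(image.dtype)   # every element shares one data type

# Slicing: take the top-left 2x2 corner
corner = image[:2, :2]
print(corner)

# Arithmetic applies to every element at once, no loop needed
scaled = image * 2

# A boolean mask selects the pixels that satisfy a condition
bright = image[image > 15]
print(bright)
```

The same indexing and arithmetic work for arrays with three or more dimensions, for example a stack of images.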

In the next notebook we will introduce the pandas DataFrame, another new data type that is built upon the `numpy.ndarray`. Many `numpy.ndarray` methods are also defined for pandas DataFrames.

## SciPy

SciPy (`scipy`) is another large and extremely well developed module, but it is focussed on mathematical, scientific and engineering functions and datatypes for Python. SciPy is also too large to cover in detail, so we will only introduce you to one key function right now.

### Importing SciPy

Importing `scipy` is a little bit unusual. `scipy` has several large submodules, and if you want to access functions in these submodules they must be imported as individual modules. For example, if you wanted to do some linear regression (which is in the `scipy.stats` submodule) and some image processing (using functions from `scipy.ndimage`), you would need to import both submodules, e.g.:

```python
import scipy.stats
import scipy.ndimage
```
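
An equivalent style, shown here as a sketch, imports the submodules by name with `from scipy import ...`. The functions are then reached through `stats` and `ndimage` rather than `scipy.stats` and `scipy.ndimage`.

```python
# Import only the submodules we need, by name
from scipy import stats, ndimage

# The functions are now reachable without the "scipy." prefix
print(stats.linregress)
print(ndimage.gaussian_filter)
```

Either style works; what matters is that each submodule you use is imported explicitly.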

<div style="background-color:#cdefff; border-radius: 5px; padding: 10pt"><strong>Task:</strong> Run the following cell. The error is a little unusual, but make a note that (for SciPy) this indicates that you've not imported that submodule. Correct the cell so that it runs without errors.</div>

In [None]:
import scipy

# Load data
x = np.arange(0,9,1)  # create an array of the numbers 0 to 8 (the stop value 9 is excluded)
y = np.arange(0,18,2)  # create an array of the numbers 0 to 16 in steps of 2
im = np.zeros([10,10])  # create a 10 by 10 array of zeros, i.e. an empty image

# Linear Regression of x and y
slope, intercept, r_value, p_value, std_err = scipy.stats.linregress(x, y)

# Apply Gaussian Filter to im
imBlurred = scipy.ndimage.gaussian_filter(im, sigma=5)
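
The image `im` in the cell above contains only zeros, so blurring it returns zeros. As a rough sketch of what the filter does, the cell below blurs a made-up image with a single bright pixel. The names `spot` and `spotBlurred`, the image size and the value of `sigma` are assumptions chosen for illustration.

```python
import numpy as np
import scipy.ndimage

# An 11x11 image of zeros with a single bright pixel in the centre
# (hypothetical test image, chosen so the effect of the blur is visible)
spot = np.zeros([11, 11])
spot[5, 5] = 1.0

# Blur with a small sigma so the spread stays inside the image
spotBlurred = scipy.ndimage.gaussian_filter(spot, sigma=1)

print(spotBlurred.shape)  # the shape is unchanged
print(spotBlurred.max())  # the peak is now lower than 1
print(spotBlurred.sum())  # the total intensity is still about 1
```

The bright pixel is spread over its neighbours: the peak drops while the overall intensity is preserved.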

### Linear Regression

The SciPy module contains a lot of useful stats functions including t-tests and linear regressions. Due to time constraints we will only explain the linear regression function (which we've already used above).

<div style="background-color:#cdefff; border-radius: 5px; padding: 10pt"><strong>Task:</strong> Load the documentation for <code style="background-color:#cdefff">scipy.stats.linregress</code>. Create a Markdown cell beneath this one and write, in simple English, what each of the two parameters and five outputs mean.</div>
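
As a hedged sketch of typical usage: `linregress` returns a result object, so besides unpacking five values by position, the outputs can also be read by name. The data below are made up for illustration and lie exactly on the line `y = 2x + 1`; the names `result`, `xNew` and `yPredicted` are assumptions.

```python
import numpy as np
import scipy.stats

# Made-up data lying exactly on the line y = 2x + 1
x = np.arange(0, 9, 1)   # 0, 1, ..., 8
y = 2 * x + 1

result = scipy.stats.linregress(x, y)

# Outputs can be read by name instead of by position
print('slope = {0:.2f}'.format(result.slope))
print('intercept = {0:.2f}'.format(result.intercept))
print('r-value = {0:.2f}'.format(result.rvalue))

# Use the fitted line to predict y at a new x value
xNew = 10
yPredicted = result.slope * xNew + result.intercept
print('predicted y at x = {0} is {1:.2f}'.format(xNew, yPredicted))
```

Real data will not lie exactly on a line, so expect an r-value with magnitude below 1 and a non-zero standard error.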

## Key Points

* NumPy and SciPy increase the functionality of Python significantly
* NumPy and SciPy provide mathematical, statistical, scientific and engineering functions
* Whilst NumPy and SciPy documentation can look overwhelming, it can easily be interpreted