# The NumPy Package

## What is it?

NumPy is a Python package used for scientific computing. It is often cited as a fundamental package in this area, and for good reason: it provides a high-performance multidimensional array object and a variety of routines for operating on arrays. NumPy is not part of a basic Python installation; it has to be installed separately once Python itself is installed.

The multidimensional arrays are very efficient because NumPy was mostly written in C; its precompiled mathematical and numerical functions give excellent execution speed.

At its core NumPy provides the ndarray (multidimensional/n-dimensional array) object. This is a grid of values of the same type indexed by a tuple of non-negative integers. This is explained in further detail in the basics section below.

## The Basics

In NumPy, dimensions are called axes. Let's take a hypothetical set of coordinates and see exactly what this means.

The coordinates of a point in 3D space, [1, 4, 5], have one axis that contains 3 elements, i.e. it has a length of 3.
Let's take another example to reinforce this idea.

Using the illustrated graph below we can see that the set of coordinates has 2 axes. Axis 1 has a length (shown in colour) of 2, and axis 2 has a length of 3.

<mark>FIGURE 1</mark>
![Graph displaying axes of coordinate](./images/pointInSpace.jpg)


One of NumPy's most powerful features is its array class, ```ndarray```. This is not to be confused with Python's own array class, ```array.array```, which has less functionality and only handles one-dimensional arrays.

##### Let's use some of ndarray's functions to prove and reinforce what we learnt above.

#### A Simple Example

Before we use NumPy we have to import it.
It is very common to see NumPy renamed to "np".



In [36]:
import numpy as np

From here let's create a list of values that represent distances, e.g. in metres.

In [37]:
mvalues = [45.26, 10.9, 26.3, 80.5, 24.1, 66.1, 19.8, 3.0, 8.8, 132.5]

Great, now let's do some NumPy work on that. We're going to turn our list of values into a one-dimensional NumPy array and print it to the screen.

In [38]:
M = np.array(mvalues)
print(M)

[ 45.26  10.9   26.3   80.5   24.1   66.1   19.8    3.     8.8  132.5 ]


Now let's say that we wanted to convert all of our values in metres to centimetres. This is easily achieved using a NumPy array and scalar multiplication.

In [39]:
print(M*100)

[ 4526.  1090.  2630.  8050.  2410.  6610.  1980.   300.   880. 13250.]


If we print out M again you can see that the values have not been changed.

In [40]:
print(M)

[ 45.26  10.9   26.3   80.5   24.1   66.1   19.8    3.     8.8  132.5 ]


Now if you wanted to do the same thing using standard Python, as shown below, then hopefully you can clearly see the advantage in using NumPy instead.

In [41]:
mvalues = [ i*100 for i in mvalues] 
print(mvalues)

[4526.0, 1090.0, 2630.0, 8050.0, 2410.0, 6609.999999999999, 1980.0, 300.0, 880.0000000000001, 13250.0]


Because the result was assigned back to mvalues, the original values have now been overwritten, whereas we didn't need to alter them when using NumPy.
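As a side note, the pure-Python version only overwrites the original list because the result is assigned back to `mvalues`. The sketch below (the names `metres` and `centimetres` are introduced here purely for illustration) binds the result to a new name instead. It also checks that NumPy and plain Python compute the same floating-point values; the tidier NumPy printout comes from how arrays are displayed, not from different arithmetic.

```python
import numpy as np

metres = [45.26, 10.9, 26.3, 80.5, 24.1, 66.1, 19.8, 3.0, 8.8, 132.5]

# Bind the converted values to a new name so the original list is untouched.
centimetres = [m * 100 for m in metres]

# The NumPy result holds the same floating-point numbers as the list version.
same_values = (np.array(metres) * 100).tolist() == centimetres

print(metres[0])     # still the original value in metres
print(same_values)
```

The convenience of NumPy here is the concise, element-wise `M * 100` syntax rather than a difference in the numbers produced.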

Earlier I explained that NumPy provides the ndarray object. "M" is an instance of the class ```numpy.ndarray```, proven below:

In [42]:
type(M)

numpy.ndarray
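Returning to the axes described at the start of this section, the attributes `ndim`, `shape` and `dtype` report the number of axes, the length of each axis and the common element type. A minimal sketch follows; the arrays `point` and `points` hold made-up values chosen only for illustration.

```python
import numpy as np

point = np.array([1, 4, 5])      # one axis of length 3
points = np.array([[1, 4, 5],
                   [2, 0, 7]])   # two axes, of lengths 2 and 3

print(point.ndim, point.shape)    # number of axes, then the length of each
print(points.ndim, points.shape)
print(points.dtype)               # every element shares this type
print(points[1, 2])               # indexed by a tuple of non-negative integers
```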

# The NumPy Random Package

In order to use the random package, the name of the package must be specified, followed by the function.

e.g. ```np.random.rand(3,3)``` (It's implied that NumPy has been imported as np)

As a precursor, you will see notation that looks like this **[num, num]** across this notebook.

This is known as interval notation. This brief explanation from Wikipedia nicely encapsulates the notations found in this notebook.

    "In mathematics, a (real) interval is a set of real numbers with the property that any number that lies between two numbers in the set is also included in the set."
    
    "For example, (0,1) means greater than 0 and less than 1. A closed interval is an interval which includes all its limit points, and is denoted with square brackets. For example, [0,1] means greater than or equal to 0 and less than or equal to 1. A half-open interval includes only one of its endpoints, and is denoted by mixing the notations for open and closed intervals. (0,1] means greater than 0 and less than or equal to 1, while [0,1) means greater than or equal to 0 and less than 1."


## Simple random data
Let's go over some of the functions that NumPy provides to help us deal with simple random data. 

### rand
```rand``` is used to generate random values in a given shape. The dimensions passed in must be non-negative integers; if no arguments are given, a single float is returned. The example below creates an array of the specified shape and populates it with random samples from a uniform distribution over [0, 1).

Let's use that example and construct a 2D array of random values. If you wanted a 3D array or greater you would simply add an additional parameter to the function.

In [237]:
np.random.rand(3,3)

array([[0.3926624 , 0.6571114 , 0.91638746],
       [0.49278316, 0.39118869, 0.66385803],
       [0.56017542, 0.41416327, 0.6876804 ]])
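As a sketch of the points above, the following passes three dimensions to get a 3D array, calls `rand` with no arguments to get a single float, and checks that every sample lies in the half-open interval [0, 1). The shape (2, 3, 4) is an arbitrary choice.

```python
import numpy as np

cube = np.random.rand(2, 3, 4)   # three arguments give three axes
single = np.random.rand()        # no arguments give one Python float

print(cube.shape)
print(type(single))
print(((cube >= 0.0) & (cube < 1.0)).all())  # all samples fall in [0, 1)
```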

### randn

```randn``` returns a sample (or samples) from the “standard normal” distribution.

A standard normal distribution is a normal distribution where the mean is 0 and the variance is 1.

When the function is given non-negative integer (or integer-convertible) arguments, randn generates an array of the specified shape (d0, d1, ..., dn), filled with random floats sampled from this distribution. Like ```rand```, a single float is returned if no argument is provided.

In [238]:
np.random.randn(3,2)

array([[ 1.28554165,  1.88706658],
       [ 1.24007967, -1.19765102],
       [-0.30822111,  0.32330164]])

Additional arithmetic can be applied to shift or scale the distribution. Adding 8, for example, moves the mean from 0 to 8:

In [239]:
np.random.randn(3,2) + 8

array([[10.22105041,  8.0167781 ],
       [ 9.42433697,  7.28535126],
       [ 9.34692905,  8.90351029]])
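More generally, samples from a normal distribution with mean `mu` and standard deviation `sigma` can be obtained as `sigma * np.random.randn(...) + mu`. The sketch below uses illustrative values (`mu = 8`, `sigma = 2`), a fixed seed and a sample size that are all chosen here as assumptions, and checks that the sample statistics land close to the targets.

```python
import numpy as np

np.random.seed(0)       # fixed seed so repeated runs give the same numbers

mu, sigma = 8.0, 2.0    # illustrative target mean and standard deviation
samples = sigma * np.random.randn(100_000) + mu

print(samples.mean())   # close to mu
print(samples.std())    # close to sigma
```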

### randint



```randint``` returns random integers from low (inclusive) to high (exclusive).

The parameters for ```randint``` are **(low, high=None, size=None, dtype='l')**

**low and high** set the range to draw from: low is the lowest number that can be drawn, and high is one above the highest. If high is omitted, values are drawn from [0, low) instead,

**size** refers to the output shape,

**dtype** is optional, but if you have a desired output type you may specify it here.

The example below generates a 4 x 4 array of ints between 0 and 4, inclusive:

In [240]:
np.random.randint(5, size=(4, 4))

array([[0, 0, 3, 3],
       [1, 2, 1, 1],
       [0, 0, 1, 3],
       [1, 1, 3, 2]])

randint is often used for simpler operations, e.g. to generate a random number between 1 and 10. Because high is exclusive, 11 is passed as the upper bound:

In [241]:
np.random.randint(1, 11)

1
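To see the half-open interval in practice, this sketch draws many integers and looks at the extremes. The bounds and the sample size are arbitrary choices made for illustration.

```python
import numpy as np

draws = np.random.randint(1, 10, size=10_000)        # high=10 is excluded
print(draws.min(), draws.max())                      # stays within 1..9

draws_incl = np.random.randint(1, 11, size=10_000)   # pass high + 1 to allow 10
print(draws_incl.min(), draws_incl.max())            # stays within 1..10
```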

### random_integers

`random_integers` returns random integers of type np.int from the “discrete uniform” distribution in the closed interval [low, high]. The np.int type translates to the C long type used by Python 2 for “short” integers.

`random_integers` is similar to ```randint```, except that it draws from the closed interval [low, high].

If high is omitted, 1 is the lowest value when using `random_integers` (the interval is [1, low]), whereas 0 is the lowest value when using `randint` (the interval is [0, low)).

However, this function has been deprecated, so it is advised that you use ```randint``` instead,

i.e. `np.random.random_integers(10) --> np.random.randint(1, 10 + 1)`
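A small wrapper makes the replacement explicit. This is only a sketch: `closed_randint` is a hypothetical helper name, not part of NumPy, and it mirrors the `random_integers` convention that a single argument means the interval [1, low].

```python
import numpy as np

def closed_randint(low, high=None, size=None):
    """Draw integers from the closed interval [low, high] using randint."""
    if high is None:
        # Mirror random_integers: a single argument means [1, low].
        low, high = 1, low
    return np.random.randint(low, high + 1, size=size)

rolls = closed_randint(6, size=1000)   # like rolling a six-sided die
print(rolls.min(), rolls.max())        # stays within 1..6
```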

### random_sample