<center> 
# R406: Using Python for data analysis and modelling

<br> <br> 

## Lecture 8: Introduction to NumPy

<br>

<center> **Andrey Vassilev**

<br> 

<center> **2016/2017**
 

# Outline

1. NumPy basics: arrays and array operations
2. Linear algebra with NumPy
3. Random sampling with NumPy

# What is NumPy?

From the [homepage](http://www.numpy.org/) of the NumPy project:

>NumPy is the fundamental package for scientific computing with Python. It contains among other things:
>   - a powerful N-dimensional array object
>   - sophisticated (broadcasting) functions
>   - tools for integrating C/C++ and Fortran code
>   - useful linear algebra, Fourier transform, and random number capabilities

You have briefly seen it before. By convention, it is imported as follows:

In [None]:
import numpy as np

# NumPy arrays

- The basic NumPy object is the *array*.
- An array in its simplest form is similar to a matrix, i.e. it is a rectangular table of numbers.
- An array can also have more dimensions (think of a "cube" of numbers, a stack of `n` equally sized "cubes", etc.).
- An array is an object of class `ndarray`.
- Arrays can hold various objects. We'll focus on the case of numeric values.

An array can be created from different objects:

In [None]:
# one-dimensional array from a list
x1 = np.array([1.0, 3.0, 5.15]) 
x1

In [None]:
print(type(x1))

In [None]:
# two-dimensional array from a list of lists
# (practically a 2 X 3 matrix)
x2 = np.array([[1.0, 3.0, 5.15],[7,6,5]]) # types are upcast as needed (the ints become floats)
x2

In [None]:
# one-dimensional array from a tuple
x3 = np.array((1.0, 3.0, 5.15))
x3

We shall cover more advanced tools for working with data sources later. As a first encounter, note that an array can also be loaded from a text file.

In [None]:
%%writefile A.csv
2.04796174,3.90837432,2.59414031,0.66654074,4.63299543,4.55432788,4.61540282
4.9486033,0.72658201,3.17077112,2.46879128,0.4254717,2.50250232,3.78406652
4.4470623,1.8189737,4.91585375,2.99834827,1.63081687,4.8331579,1.14999237

In [None]:
x = np.loadtxt('A.csv',delimiter=',')
x

# NumPy data types

- Unlike Python lists, for example, all elements of a NumPy array have the same type.
- Different arrays can hold different data types (aka `dtypes`). (See the [docs](https://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html) for a complete description.)
- Examples of `dtypes` include:
   - floats: `float`, `float16`, `float64` etc.
   - ints: `int`, `int32`, `int64` etc.
   - Booleans
- These (and others) can be explicitly specified when constructing an array via the `dtype` argument.

In [None]:
x = np.array([0,1,2,3],dtype = 'int')
x

In [None]:
x = np.array([0,1,2,3],dtype = 'float')
x

In [None]:
x = np.array([0,1,2,3],dtype = 'float32')
x

In [None]:
x = np.array([0,1,2,3],dtype = 'bool')
x
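
As a small illustration of how `dtypes` behave, the sketch below inspects the `dtype` attribute and converts between types with `astype()`. The variable names are illustrative and not part of the lecture.

```python
import numpy as np

# Mixed ints and floats are upcast to a common floating-point type
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)            # float64

# Casting to bool maps zero to False and any non-zero value to True
flags = np.array([0, 1, 2, 3], dtype='bool')
print(flags)

# astype() returns a new array with the requested dtype
x = np.array([0.7, 1.2, 2.9])
x_int = x.astype('int64')     # fractional parts are truncated towards zero
print(x_int)                  # [0 1 2]
```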

# Commonly used arrays

- It is also possible to create special kinds of arrays, e.g. arrays filled with ones, zeros or empty arrays of predefined dimensions.
- This is done using special functions from NumPy.

## Range-type arrays

The `arange()` function returns a range in the form of an array object:

In [None]:
np.arange(5)

In [None]:
np.arange(3,15)

In [None]:
np.arange(3,15,2)

## Zero arrays

These contain zeros, as the name suggests. The `zeros` function takes either an integer or a sequence of integers specifying the dimensions:

In [None]:
np.zeros(6)

In [None]:
np.zeros([2,3])

In [None]:
np.zeros((2,3,2))

## Arrays of ones

These are constructed through the function `ones`. It works similarly to `zeros`:

In [None]:
np.ones(3)

In [None]:
np.ones((5,5), dtype='int')

## The identity matrix/array

The function `eye` creates a 2-D identity array, i.e. with ones on the main diagonal and zeros everywhere else. Note that, unlike `zeros` and `ones`, `eye` takes the number of rows and columns as separate arguments rather than as a sequence.

In [None]:
np.eye(3)

In [None]:
np.eye(3,5) # It doesn't have to be square

In [None]:
np.eye(3,5,1) # We can specify upper and lower diagonals 
              # using positive or negative integers

## Empty arrays

We can also construct empty arrays for subsequent use. Their contents are uninitialized, i.e. they hold arbitrary values (whatever was in the allocated memory segment, often zeros), so they should be filled before being read. They are (marginally) faster to create than, for example, an array of zeros or ones of the same dimensions.

In [None]:
np.empty((3,4))

## Filling with a value

An array can be filled with a specific value:

In [None]:
np.full((4,5),3.14)

## Arrays of identical shapes

We can also instruct NumPy to create an array of zeros, ones, etc. that has the same dimensions as a pre-specified array.

In [None]:
a = np.array([[1,2,3,4],
              [5,6,7,8]])

In [None]:
np.ones_like(a)

In [None]:
np.zeros_like(a)

# Array operations

- We have seen different arrays but so far we have done little with them.
- Arrays are objects and therefore have attributes:
    - `ndim`, which contains the number of dimensions
    - `shape`, which contains the dimensions themselves as a tuple
    - `size`, which contains the number of elements of the array
    - `itemsize`, which contains the size (in bytes) of each array element, and 
    - `nbytes`, which lists the total size (in bytes) of the array

In [None]:
a = np.array([[1,2,3,4],
              [5,6,7,8]])

In [None]:
a.ndim

In [None]:
a.shape

In [None]:
a.size

In [None]:
a.itemsize

In [None]:
a.nbytes # size * itemsize

## Changing array dimensions

An array can be reshaped by modifying its `shape` attribute...

In [None]:
print(a)

In [None]:
a.shape = (4,2)
print(a)

In [None]:
a.shape = (2,2,2)
print(a)

...or by using the `reshape()` method.

In [None]:
a.reshape((2,4))
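
One difference between the two approaches is worth a short sketch: assigning to `shape` changes the array in place, whereas `reshape()` returns a new array object and leaves the original shape untouched. The example also uses `-1`, which asks NumPy to infer one dimension from the others. The names `b` and `c` are illustrative.

```python
import numpy as np

a = np.arange(1, 9)           # [1 2 3 4 5 6 7 8]
b = a.reshape((2, 4))         # new 2 x 4 array object; a keeps its shape
c = a.reshape((4, -1))        # -1 lets NumPy infer the missing dimension (2)

print(a.shape, b.shape, c.shape)   # (8,) (2, 4) (4, 2)
```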

## Transposing an array

We can access a transposed version of an array via the `.T` attribute...

In [None]:
a = np.array([[1,2,3,4],
              [5,6,7,8]])
a.T

...or via the `transpose()` method.

In [None]:
a.transpose()

## Array indexing

An array can be indexed similarly to a list, with one index per dimension inside a single pair of brackets:

In [None]:
a[0,0]

In [None]:
a[1,3]

In [None]:
a[0] # A single index refers to the first dimension (rows here)

## Array slicing

Arrays can also be sliced in the familiar manner:

In [None]:
a[1,0:3]

In [None]:
a[0,2:]

In [None]:
a[1,:]

In [None]:
a[:,2]

## Array assignment

Assignment works the same way as for lists, with one exception: the array's type is strictly observed, so assigned values are converted to it.

In [None]:
a = np.array([[1,2,3,4],
              [5,6,7,8]])
a[1,1] = 66.6 # This will be truncated to an integer
a

Assigning to a slice works the same way:

In [None]:
a[1,1:-1] = np.array([666,777])
a

However, unlike lists, array slices return views instead of copies, meaning that we can modify the original array through them:

In [None]:
print(a)
b = a[1,1:-1]
print(b)
b[0] = 5
print(b)
a

Compare with:

In [None]:
L1 = [1,2,3,4]
L2 = L1[2:]
L2[0] = 666
L1
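
When an independent copy of a slice is needed, the `copy()` method can be used. The following is a minimal sketch; `np.shares_memory` is used only to make the difference between a view and a copy visible, and the variable names are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])

view = a[1, 1:-1]                 # a view: shares memory with a
independent = a[1, 1:-1].copy()   # an independent copy

independent[0] = -1               # does not touch a
view[0] = 99                      # changes a[1, 1]

print(a)
print(np.shares_memory(a, view), np.shares_memory(a, independent))   # True False
```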

## Array concatenation

There are several ways of combining arrays, for example `np.concatenate`, `np.vstack` and `np.hstack`.
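
A minimal sketch of joining arrays along rows and along columns, assuming two small example arrays `a` and `b` that are not part of the lecture:

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[5, 6]])

rows = np.concatenate([a, b], axis=0)      # stack along rows -> shape (3, 2)
cols = np.concatenate([a, b.T], axis=1)    # stack along columns -> shape (2, 3)

# vstack and hstack are shorthands for the two cases above
same_rows = np.vstack([a, b])
same_cols = np.hstack([a, b.T])

print(rows)
print(cols)
```

The shapes must agree along every axis except the one being joined, which is why `b` is transposed before it is appended as a column.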

## Array splitting
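
Splitting is the reverse of concatenation. The sketch below shows `np.split`, `np.hsplit` and `np.vsplit` on illustrative arrays; the second argument is either a number of equal parts or a list of indices at which to cut.

```python
import numpy as np

x = np.arange(10)

# Split into 5 equal parts
parts = np.split(x, 5)

# Split at the given indices: x[:3], x[3:7], x[7:]
head, middle, tail = np.split(x, [3, 7])
print(head, middle, tail)        # [0 1 2] [3 4 5 6] [7 8 9]

m = np.arange(12).reshape((3, 4))
left, right = np.hsplit(m, 2)    # two 3 x 2 blocks
top, bottom = np.vsplit(m, [1])  # the first row and the remaining two rows
```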

# UFuncs and vectorized operations

Arithmetic operators such as `+`, `-`, `*`, `/` and `**` act elementwise on arrays.
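
A short sketch of elementwise arithmetic on two illustrative arrays. No explicit loop is needed, and each operator corresponds to a universal function (ufunc) such as `np.add` or `np.power`.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([10.0, 20.0, 30.0])

print(x + y)      # [11. 22. 33.]
print(x * y)      # [10. 40. 90.]
print(x ** 2)     # [1. 4. 9.]
print(2 * x)      # the scalar is applied to every element: [2. 4. 6.]

# The operators are shorthands for ufuncs
print(np.add(x, y))
print(np.power(x, 2))
```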

# Comparisons, masks and Boolean logic
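
Comparison operators are also applied elementwise and return Boolean arrays, which can then be used as masks to select elements. A minimal sketch on an illustrative array:

```python
import numpy as np

x = np.array([3, -1, 4, -1, 5, -9])

mask = x > 0                     # elementwise comparison gives a Boolean array
print(mask)
print(x[mask])                   # keep only the positive entries: [3 4 5]

# Combine conditions with & (and), | (or), ~ (not); the parentheses are required
in_range = (x > 0) & (x < 5)
print(x[in_range])               # [3 4]

# Summaries of Boolean arrays
print(np.any(x < -5), np.all(x < 10), np.sum(x > 0))   # True True 3
```

Note that the Python keywords `and`, `or` and `not` do not work elementwise on arrays, hence the use of `&`, `|` and `~`.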

# A selection of NumPy functions



## Absolute values
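
A minimal sketch using `np.abs` (an alias of `np.absolute`) on illustrative arrays. For complex input it returns the modulus.

```python
import numpy as np

x = np.array([-2.5, 0.0, 3.0])
abs_x = np.abs(x)              # elementwise absolute value
print(abs_x)

z = np.array([3 + 4j])
print(np.abs(z))               # modulus of a complex number: [5.]
```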

## Exponents and logarithms
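
A short sketch of the most common functions in this family, applied elementwise to an illustrative array:

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0])

print(np.exp(x))      # e ** x
print(np.log(x))      # natural logarithm
print(np.log2(x))     # base-2 logarithm: [0. 1. 2.]
print(np.log10(x))    # base-10 logarithm
print(np.sqrt(x))     # square root

# exp and log are inverses of each other (up to rounding error)
print(np.allclose(np.log(np.exp(x)), x))   # True
```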

## Trigonometric functions
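
A minimal sketch of the trigonometric functions. Angles are in radians; `np.linspace` is used here only to generate a few illustrative angles.

```python
import numpy as np

theta = np.linspace(0, np.pi, 3)    # the angles 0, pi/2 and pi

s = np.sin(theta)                   # approximately [0, 1, 0]
c = np.cos(theta)                   # approximately [1, 0, -1]
print(s)
print(c)

# The identity sin^2 + cos^2 = 1 holds elementwise (up to rounding error)
print(np.allclose(s ** 2 + c ** 2, 1.0))   # True

# Conversion between degrees and radians
print(np.degrees(theta))            # [  0.  90. 180.]
```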

# Vectorizing our own: `np.vectorize()`
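
A sketch of how `np.vectorize()` might be used, assuming a hypothetical scalar function `sign_of` that contains a branch and therefore cannot be applied to a whole array directly:

```python
import numpy as np

def sign_of(v):
    """Hypothetical scalar function: returns -1, 0 or 1."""
    if v > 0:
        return 1
    elif v < 0:
        return -1
    return 0

# Wrap the scalar function so that it accepts arrays
vsign = np.vectorize(sign_of)

x = np.array([-2.5, 0.0, 3.1])
result = vsign(x)
print(result)
```

`np.vectorize()` is provided mainly for convenience: internally it still loops over the elements, so it does not bring the speed of a true ufunc. For this particular task the built-in `np.sign` would be the natural choice.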



# Linear algebra with NumPy: an overview

- Matrices: defining matrices
- Matrix operations: addition, multiplication (`@`), inverses
- Linear systems
- Eigenvalues and eigenvectors
- Determinants

# Defining matrices
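
A minimal sketch, assuming that matrices are represented as ordinary 2-D arrays and vectors as 1-D arrays. The names `A`, `v`, `I` and `D` are illustrative.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])     # a 2 x 2 matrix
v = np.array([1.0, 2.0])       # a 1-D array used as a vector

I = np.eye(2)                  # the 2 x 2 identity matrix
D = np.diag([1.0, 2.0])        # a diagonal matrix with the given diagonal

print(A.shape, v.shape)        # (2, 2) (2,)
print(D)
```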

# Matrix operations

Matrix addition and subtraction are elementwise and use the ordinary `+` and `-` operators.
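
A short sketch with two illustrative matrices of the same shape:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[10, 20],
              [30, 40]])

print(A + B)     # elementwise sum
print(B - A)     # elementwise difference
print(3 * A)     # multiplication by a scalar
```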

## Matrix multiplication
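
The sketch below contrasts the matrix product, written with the `@` operator, with the elementwise product `*`. The matrices are illustrative.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])
v = np.array([1, 1])

print(A @ B)          # matrix product: [[2, 1], [4, 3]]
print(A * B)          # elementwise product: [[0, 2], [3, 0]]
print(A @ v)          # matrix-vector product: [3 7]

# np.dot gives the same result as @ for 2-D arrays
print(np.array_equal(A @ B, np.dot(A, B)))   # True
```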

## Matrix inversion
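
A minimal sketch using `np.linalg.inv` on an illustrative invertible matrix. Because of rounding error, the check against the identity matrix uses `np.allclose` rather than exact equality.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

A_inv = np.linalg.inv(A)
print(A_inv)

# A times its inverse should be the identity matrix (up to rounding error)
print(np.allclose(A @ A_inv, np.eye(2)))   # True
```

For a singular matrix `np.linalg.inv` raises `np.linalg.LinAlgError`.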

# Solving linear systems
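
A sketch of solving the system $Ax = b$ with `np.linalg.solve`, using an illustrative 2 x 2 system (2x + y = 3, x + 3y = 5):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

x = np.linalg.solve(A, b)
print(x)                        # approximately [0.8 1.4]

# Verify the solution by substituting it back
print(np.allclose(A @ x, b))    # True
```

Calling `solve` is generally preferable to computing the inverse explicitly and multiplying by it.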

# Matrix rank
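
A minimal sketch using `np.linalg.matrix_rank` on illustrative matrices. In the first matrix the second row is twice the first, so the rank is 1.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])      # linearly dependent rows
rank_A = np.linalg.matrix_rank(A)
rank_I = np.linalg.matrix_rank(np.eye(3))

print(rank_A, rank_I)           # 1 3
```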

# Computing eigenvalues and eigenvectors
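
A sketch using `np.linalg.eig` on an illustrative symmetric matrix whose eigenvalues are 1 and 3. The function returns the eigenvalues and a matrix whose columns are the corresponding eigenvectors; the order of the eigenvalues is not guaranteed.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

vals, vecs = np.linalg.eig(A)
print(vals)      # the eigenvalues 1 and 3, in some order
print(vecs)      # column i is the eigenvector for vals[i]

# Check the defining relation A v = lambda v for all columns at once
print(np.allclose(A @ vecs, vecs * vals))   # True
```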

# Determinants
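
A minimal sketch using `np.linalg.det` on illustrative matrices. The result is a floating-point number, so it is best compared with a tolerance.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])      # singular matrix

det_A = np.linalg.det(A)        # 2*3 - 1*1 = 5, up to rounding error
det_S = np.linalg.det(S)        # approximately 0

print(np.isclose(det_A, 5.0), np.isclose(det_S, 0.0))   # True True
```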

# Generating random values

`np.random.rand` draws samples from the uniform distribution on [0, 1).
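
A minimal sketch of uniform sampling. The seed value is an arbitrary choice made only so that the draws are reproducible. Recent NumPy versions also offer the `np.random.default_rng()` interface; the sketch sticks to the function named in this section.

```python
import numpy as np

np.random.seed(0)               # arbitrary seed, for reproducibility only

u = np.random.rand(3)           # three draws from [0, 1)
U = np.random.rand(2, 4)        # dimensions are passed as separate arguments

print(u)
print(U.shape)                  # (2, 4)

# Rescale to an arbitrary interval [low, high)
low, high = -1.0, 1.0
scaled = low + (high - low) * np.random.rand(5)
print(scaled)
```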


## Random normal sampling

`np.random.randn` draws samples from the standard normal distribution.
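
A sketch of standard normal sampling, with an arbitrary seed and sample size chosen for illustration. A normal variable with mean `mu` and standard deviation `sigma` can be obtained by shifting and scaling.

```python
import numpy as np

np.random.seed(0)               # arbitrary seed, for reproducibility only

z = np.random.randn(10000)      # draws from N(0, 1)
print(z.mean(), z.std())        # close to 0 and 1, respectively

mu, sigma = 5.0, 2.0
x = mu + sigma * np.random.randn(2, 3)   # a 2 x 3 array of draws from N(5, 2^2)
print(x)
```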

## Permuting an array

`np.random.permutation(arr)` returns a randomly permuted copy of `arr`.
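
A minimal sketch with an illustrative array and an arbitrary seed. It also shows `np.random.shuffle`, which permutes an array in place instead of returning a copy.

```python
import numpy as np

np.random.seed(0)               # arbitrary seed, for reproducibility only

arr = np.arange(5)
p = np.random.permutation(arr)  # shuffled copy; arr itself is unchanged
print(arr, p)

q = np.random.permutation(5)    # an integer n is treated as np.arange(n)
print(q)

other = np.arange(5)
np.random.shuffle(other)        # shuffles in place and returns None
print(other)
```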

## Exponentially distributed arrays

`np.random.exponential` draws samples from an exponential distribution.
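
A sketch of exponential sampling with an arbitrary seed and illustrative parameter values. Note that the `scale` argument is the mean of the distribution, i.e. the reciprocal of the rate parameter.

```python
import numpy as np

np.random.seed(0)               # arbitrary seed, for reproducibility only

# scale = 2.0 corresponds to a rate of 0.5
x = np.random.exponential(scale=2.0, size=10000)
print(x.min() >= 0)             # exponential draws are non-negative: True
print(x.mean())                 # close to the scale, 2.0

X = np.random.exponential(scale=1.0, size=(2, 3))   # a 2 x 3 array of draws
print(X)
```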

## The multivariate normal

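`np.random.multivariate_normal` draws samples from a multivariate normal distribution with a given mean vector and covariance matrix.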
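
A sketch of sampling from a bivariate normal distribution. The mean vector, covariance matrix, seed and sample size are illustrative assumptions. Each row of the result is one draw.

```python
import numpy as np

np.random.seed(0)               # arbitrary seed, for reproducibility only

mean = np.array([0.0, 1.0])
cov = np.array([[1.0, 0.5],
                [0.5, 2.0]])    # must be symmetric and positive semi-definite

draws = np.random.multivariate_normal(mean, cov, size=5000)
print(draws.shape)              # (5000, 2)

# Sample moments should be close to the parameters used above
sample_mean = draws.mean(axis=0)
sample_cov = np.cov(draws, rowvar=False)   # rowvar=False: columns are variables
print(sample_mean)
print(sample_cov)
```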