# METR3613: Meteorological Measurements, Dr. Scott Salesky 
## The Numerical Python (Numpy) package

The goal of this notebook is to introduce (or review) the Numerical Python package, otherwise known as Numpy. While Python is a very useful programming language in general (and can be used for many different applications), we are primarily interested in using it for scientific computing. Numpy adds functionality to Python, including additional data structures and functions, that makes it very useful for manipulating numerical data. We will use a number of different functions from Numpy in the METR3613 homework assignments. The examples below illustrate some of the syntax and capabilities of Numpy. 

## How to use this notebook

Because a Python notebook can be run in your browser, one cell at a time, you are encouraged to read through this notebook at your own pace and to make sure you understand the syntax of the examples below. One of the best ways to understand what the code does is to edit it: change the values of some of the variables, define your own variables and functions, and so on. So feel free to make this document your own and to try writing code that follows the examples given below. 
***

In [2]:
#Import some packages that will be useful later on...
%matplotlib inline
import matplotlib.pyplot as plt

#### Overview of Numpy
The `numpy` package is used to do numerical calculations in Python. It provides vector and matrix data structures for Python, as well as built-in numerical functions. 

To use `numpy`, you will need to import the module as seen below:

In [3]:
import numpy as np

This makes all of the Numpy functions available in our code as `np.functionname`. 

## Creating Numpy arrays
The main way we work with data in Numpy is through an array. An array is similar to a Python list, but it can have more than one dimension. We can define a 2-D array (think of a matrix), or an array with higher dimensions, e.g. a 4-D array for meteorological variables that are a function of $x$, $y$, $z$, and $t$. There are several ways that we can create Numpy arrays. 

### From lists
If we already have a list object, we can use the `np.array` function.

In [4]:
#A vector: the argument to the array function is a Python list
v = np.array([1,2,3,4])
v

array([1, 2, 3, 4])

In [5]:
#A matrix: the argument to the array function is a nested Python list
M = np.array([[1,2],[3,4]])
M

array([[1, 2],
       [3, 4]])

Both the `v` and `M` variables are of type `ndarray` provided by the `numpy` module. 

In [6]:
type(v), type(M)

(numpy.ndarray, numpy.ndarray)

The only difference between `v` and `M` is their shapes. We can get information on the shape of an array through the `ndarray.shape` property.

In [7]:
v.shape

(4,)

In [8]:
M.shape

(2, 2)

We can check how many elements are in an array through the `ndarray.size` property:

In [9]:
M.size

4

Or, we could use `np.shape` and `np.size`:

In [10]:
np.shape(M)

(2, 2)

In [11]:
np.size(M)

4
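
The same attributes work for arrays with more than two dimensions. The short sketch below uses a hypothetical 3-D array `T3` (not part of the course data) and also shows `ndim` and `len`, which are easy to confuse with `size`:

```python
import numpy as np

# A hypothetical 3-D array: 2 "levels", each a 3 x 4 grid of zeros
T3 = np.zeros((2, 3, 4))

print(T3.shape)  # (2, 3, 4): the length of each dimension
print(T3.ndim)   # 3: the number of dimensions
print(T3.size)   # 24: the total number of elements (2 * 3 * 4)
print(len(T3))   # 2: the length of the first dimension only
```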

### But why use a Numpy array instead of a list? 
#### Isn't this unnecessarily complicated? 
There are several reasons to use Numpy arrays rather than Python lists. 
- Python lists are very general and can contain any kind of object. They do not support mathematical operations such as matrix multiplications, etc. 
- Numpy arrays are **homogeneous** (containing only one type of variable) and **statically typed** (the type of the elements is determined when the array is created). 
- Numpy arrays are memory efficient. 
- Mathematical functions (such as multiplication and addition of Numpy arrays) can be implemented with much faster execution speed. 
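
A minimal sketch of the difference in behavior is given here. The variable names and temperature values are made up for illustration:

```python
import numpy as np

temps_list = [20.0, 21.5, 23.0]      # a plain Python list
temps_array = np.array(temps_list)   # the same values as a Numpy array

# Multiplying a list by an integer repeats the list
doubled_list = temps_list * 2        # 6 elements: the list twice over

# Multiplying an array by a number acts on every element
doubled_array = temps_array * 2      # values 40.0, 43.0, 46.0

# Convert Celsius to Kelvin in one step, with no explicit loop
temps_kelvin = temps_array + 273.15

# Matrix multiplication uses the @ operator (or np.matmul)
B = np.array([[1, 2], [3, 4]])
product = B @ B                      # [[7, 10], [15, 22]]

print(doubled_list)
print(doubled_array)
print(temps_kelvin)
print(product)
```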

Using the `dtype` (data type) property of an `ndarray`, we can see what type the data of an array has:

In [12]:
M.dtype

dtype('int64')

We get an error if we try to assign a value of the wrong type to an element in an array:

In [13]:
M[0,0] = "hello"

ValueError: invalid literal for long() with base 10: 'hello'
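
Not every mismatched type raises an error, which is worth keeping in mind. As a small sketch (the array name `M_int` is just for this example), a float assigned into an integer array is silently truncated, while a string that cannot be converted raises a `ValueError`:

```python
import numpy as np

M_int = np.array([[1, 2], [3, 4]])   # an integer array

# A float assigned into an integer array is truncated without any warning
M_int[0, 0] = 3.7
print(M_int[0, 0])   # 3

# A string that cannot be converted to an integer raises ValueError
try:
    M_int[0, 0] = "hello"
except ValueError as err:
    print("ValueError:", err)
```

The exact wording of the error message depends on the Python version in use.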

If we want, we can define the type of variables contained in an array when we define it using the `dtype` keyword:

In [None]:
M = np.array([[1,2],[3,4]], dtype='complex')
M

Common data types that can be used with `dtype` are `int`, `float`, `complex`, `bool`, `object`, etc. 

We can also define the precision of the data types, e.g. `int64`, `int16`, `float128`, `complex128`. Note that `float128` is only available on some platforms.
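
To see what the precision means in practice, the sketch below (with illustrative array names) compares the memory used by 16-bit and 64-bit integers and converts between types with `astype`:

```python
import numpy as np

a16 = np.array([1, 2, 3], dtype=np.int16)
a64 = np.array([1, 2, 3], dtype=np.int64)

print(a16.itemsize, a64.itemsize)   # bytes per element: 2 8
print(a16.nbytes, a64.nbytes)       # total bytes: 6 24

# astype returns a converted copy; the original array is unchanged
af = a16.astype(np.float64)
print(af.dtype)                     # float64
print(a16.dtype)                    # int16
```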

### Using array-generating functions

For large arrays, it's not practical to initialize the data manually using explicit Python lists. Instead, we can use some of the built-in functions in Numpy that generate arrays for us. 

#### arange
Creates an array of values from the start value up to (but not including) the stop value, with the increment specified by the step. 

In [None]:
#Create a range
x = np.arange(0,10,1) #Arguments: start, stop, step
x

In [None]:
x = np.arange(-1,1,0.1)
x
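
A short check of how `arange` treats its end points is sketched here (the variable names are illustrative). With a non-integer step, rounding can make the number of elements hard to predict, so `linspace` is often the safer choice in that case:

```python
import numpy as np

r = np.arange(0, 10, 1)
print(r[-1])    # 9: the stop value (10) is not included
print(r.size)   # 10

# With a single argument, start defaults to 0 and step defaults to 1
short = np.arange(5)
print(short)    # [0 1 2 3 4]

# With a non-integer step, inspect the result rather than assuming its length
frac = np.arange(-1, 1, 0.1)
print(frac.size, frac[0], frac[-1])
```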

#### linspace and logspace
Create an array whose values are equally spaced on a linear (`linspace`) or logarithmic (`logspace`) scale.

In [None]:
#Using linspace, both end points ARE included
np.linspace(0,10,25)

In [None]:
np.logspace(0,10,10,base=np.e)
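
The arguments of `logspace` are exponents, not the end values themselves. The sketch below (with illustrative variable names) makes this explicit by comparing `logspace` with `linspace`:

```python
import numpy as np

# linspace: 5 points from 0 to 10, both end points included
lin = np.linspace(0, 10, 5)
print(lin)          # values 0, 2.5, 5, 7.5, 10

# logspace: the exponents are equally spaced; the default base is 10
log10 = np.logspace(0, 2, 3)
print(log10)        # values 10**0, 10**1, 10**2

# With base=np.e the values run from e**0 to e**10, which is the same as
# taking the exponential of a linspace array
log_e = np.logspace(0, 10, 10, base=np.e)
same = np.exp(np.linspace(0, 10, 10))
print(np.allclose(log_e, same))   # True
```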

#### mgrid

In [None]:
x, y = np.mgrid[0:5, 0:5] #Similar to meshgrid in Matlab

In [None]:
x

In [None]:
y
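
To make the output of `mgrid` easier to read, here is a smaller sketch on a 3 x 4 grid (the names `xi`, `yj`, `xm`, `ym` are chosen for this example). It also shows what appears to be the closest `np.meshgrid` equivalent, which needs `indexing="ij"` to give the same orientation:

```python
import numpy as np

xi, yj = np.mgrid[0:3, 0:4]

print(xi.shape, yj.shape)   # (3, 4) (3, 4)

print(xi)   # the first index varies down the rows
# [[0 0 0 0]
#  [1 1 1 1]
#  [2 2 2 2]]

print(yj)   # the second index varies across the columns
# [[0 1 2 3]
#  [0 1 2 3]
#  [0 1 2 3]]

# The same grids from np.meshgrid, using matrix ("ij") indexing
xm, ym = np.meshgrid(np.arange(3), np.arange(4), indexing="ij")
print(np.array_equal(xi, xm) and np.array_equal(yj, ym))   # True
```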

#### diagonal matrix

In [None]:
#A diagonal matrix
np.diag([1,2,3])

#### zeros and ones

In [None]:
#Note: This is an incredibly useful way to create an array of a given shape and data type, filled with zeros. 
np.zeros((3,3),dtype='float')

In [None]:
np.ones((3,3))
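
One common use of `np.zeros` is to set aside an array of the right size first and then fill it in. A minimal sketch, with made-up names and values:

```python
import numpy as np

n = 5
squares = np.zeros(n)       # float64 by default
for i in range(n):
    squares[i] = i**2       # fill in one element at a time
print(squares)              # values 0, 1, 4, 9, 16

# zeros_like creates an array of zeros with the same shape and dtype
# as an existing array
T = np.array([[20.5, 21.0], [19.8, 22.3]])
anomaly = np.zeros_like(T)
print(anomaly.shape, anomaly.dtype)   # (2, 2) float64
```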

## File Input and Output
Numpy contains many functions for reading and writing different kinds of data files. You will not need to write code to read in data files yourself in this course; this code will be provided. For more information on file I/O with Numpy, see
[this link](https://docs.scipy.org/doc/numpy/reference/routines.io.html). 

## Manipulating arrays
Similar to Python lists, we can index the elements in a Numpy array. 

In [None]:
#v is a vector and only has one dimension, so it takes one index
v[0]

In [None]:
#M is a matrix or a 2 dimensional array and takes two indices
M = np.array([[1,2],[3,4]])
M[1,1]

If we omit the second index of a 2-D array, we get the whole row:

In [None]:
M[1]

The same thing can be achieved using `:` instead of an index:

In [None]:
M[1,:] #row 1

In [None]:
M[:,1] #column 1

We can assign new values to an array element:

In [None]:
M[0,0] = -5

In [None]:
M

In [None]:
#This also works for rows and columns
M[1,:] = 0
M

In [None]:
M[:,0] = -1
M

### Index slicing
Similar to Python lists, we can use the syntax `M[lower:upper:step]` to extract part of an array. 

In [None]:
A = np.array([1,2,3,4,5,6,7,8,9,10])
A

In [None]:
A[1:3]

We can omit any of the parameters in `M[lower:upper:step]`

In [None]:
A[::] #lower, upper, step take default values

In [None]:
A[::2] #step is 2, lower and upper default to beginning and end of array

In [None]:
A[:3] #first 3 elements

In [None]:
A[3:] #elements from index 3

Negative indices count from the end of the array.

In [None]:
A[-1] #last element of the array

In [None]:
A[-3:] #the last 3 elements
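
For reference, the slices above are collected here with the results they should give. The last line is an extra example, not in the cells above, that uses all three parameters at once:

```python
import numpy as np

A = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

print(A[1:3])     # [2 3]  (index 1 up to, but not including, index 3)
print(A[::2])     # [1 3 5 7 9]
print(A[:3])      # [1 2 3]
print(A[3:])      # values 4 through 10
print(A[-1])      # 10
print(A[-3:])     # values 8, 9, 10
print(A[1:8:3])   # [2 5 8]  (indices 1, 4, 7)
```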

Index slicing works the same for multidimensional arrays

In [None]:
A = np.array([[n+m for n in range(5)] for m in range(5)])
A

In [None]:
#A block from the original array
A[1:4,1:4]

In [None]:
#Strides of 2
A[::2,::2]
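
The sketch below repeats the two slices with their expected values. It also illustrates a point that often surprises new users: a slice is a view of the original array, so changing the slice changes the original unless we call `.copy()` first. The names `block`, `strided`, and `block_copy` are chosen for this example.

```python
import numpy as np

A = np.array([[n + m for n in range(5)] for m in range(5)])  # A[m, n] = m + n

block = A[1:4, 1:4]
print(block)
# [[2 3 4]
#  [3 4 5]
#  [4 5 6]]

strided = A[::2, ::2]
print(strided)
# [[0 2 4]
#  [2 4 6]
#  [4 6 8]]

# Changing a copy leaves the original array alone
block_copy = A[1:4, 1:4].copy()
block_copy[0, 0] = 99
print(A[1, 1])   # 2

# Changing the slice itself also changes the original array
block[0, 0] = 99
print(A[1, 1])   # 99
```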

## Basic data processing
Numpy arrays are very useful for storing real data (this is what we will use them for in METR3613). Numpy provides a number of functions to calculate statistics of datasets in arrays. 

In [15]:
#Air temperature data from Norman Mesonet station on August 1, 2018
Tdata = np.array([27.5, 27.5, 27.4, 27.3, 27.2, 27.3, 27.2, 27.1, 26.9, 26.6, 26.3, 25.9, 25.9, 25.7, 25.4, 25.6,
25.3, 25.2, 25.0, 24.8, 24.5, 24.5, 24.5, 24.4, 24.3, 24.2, 23.9, 23.9, 23.0, 22.9, 22.8, 22.9,
22.5, 21.5, 21.9, 22.5, 22.8, 22.1, 21.1, 21.3, 21.7, 21.7, 21.2, 21.1, 21.3, 21.1, 21.1, 20.6,
20.0, 20.4, 20.3, 19.6, 20.2, 20.2, 20.0, 19.8, 19.8, 20.4, 20.5, 19.7, 19.6, 19.7, 19.9, 19.7,
19.5, 19.2, 19.2, 19.2, 19.3, 19.1, 19.2, 19.1, 19.0, 18.9, 18.7, 18.9, 18.7, 18.7, 19.0, 18.6,
18.5, 18.4, 18.0, 17.9, 18.0, 18.1, 18.2, 18.4, 18.3, 18.1, 18.5, 18.7, 18.2, 18.0, 18.2, 18.3,
18.4, 18.0, 18.1, 18.0, 18.3, 18.0, 18.2, 18.2, 18.3, 17.5, 17.6, 17.6, 17.4, 17.8, 17.8, 17.6,
17.7, 17.7, 17.8, 17.8, 17.6, 17.3, 17.6, 18.0, 17.6, 17.5, 17.5, 17.4, 17.0, 16.9, 17.1, 17.1,
17.1, 17.0, 16.7, 16.9, 16.6, 16.5, 16.2, 16.7, 16.6, 16.6, 16.6, 16.5, 16.5, 16.5, 16.3, 16.6,
16.9, 17.0, 17.1, 17.2, 17.2, 17.6, 18.2, 18.3, 18.5, 18.8, 19.2, 19.5, 19.9, 20.0, 20.3, 20.5,
20.9, 21.3, 21.5, 21.6, 21.8, 22.0, 22.2, 22.3, 22.6, 22.7, 23.0, 23.4, 24.2, 24.5, 24.5, 24.9,
24.8, 24.9, 25.5, 25.7, 26.0, 26.4, 26.3, 26.3, 26.4, 26.3, 26.5, 26.7, 26.7, 27.1, 27.0, 26.9,
27.2, 27.4, 27.5, 27.6, 27.5, 28.1, 28.0, 28.1, 27.9, 28.1, 28.3, 28.6, 28.4, 28.3, 28.6, 28.5,
28.8, 28.8, 28.7, 28.7, 29.1, 29.0, 29.3, 29.0, 29.1, 28.8, 29.2, 29.3, 29.2, 29.8, 29.9, 29.9,
28.5, 28.7, 28.5, 29.6, 29.7, 29.2, 29.7, 29.9, 29.7, 29.9, 29.4, 29.8, 29.9, 29.6, 29.8, 30.0,
30.5, 30.1, 30.4, 30.0, 29.7, 29.4, 30.0, 29.8, 30.4, 30.4, 30.5, 30.7, 30.8, 30.8, 30.3, 30.5,
30.8, 30.4, 30.4, 30.2, 30.6, 30.4, 30.5, 30.7, 30.0, 29.6, 29.8, 30.1, 29.5, 29.2, 29.2, 29.7,
30.1, 29.8, 29.2, 29.1, 29.2, 29.0, 29.5, 29.6, 29.3, 29.5, 29.1, 29.2, 29.7, 29.8, 29.9, 29.6])

In [16]:
#Get the shape of the array
Tdata.shape

(288,)

#### mean

In [17]:
np.mean(Tdata)

23.643402777777776

#### standard deviation and variance

In [18]:
np.std(Tdata)

4.949336187700445

In [19]:
np.var(Tdata)

24.495928698881173

#### min and max values

In [20]:
Tdata.min()

16.2

In [21]:
Tdata.max()

30.8
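
A few related calculations are sketched below on a short, made-up sample (`T_sample` is hypothetical, not the Mesonet record). Note that `np.std` and `np.var` divide by $N$ by default; passing `ddof=1` divides by $N-1$ instead, which gives the sample standard deviation:

```python
import numpy as np

# Short hypothetical temperature sample (deg C) for illustration only
T_sample = np.array([27.5, 26.3, 24.5, 21.1, 18.0, 16.2, 19.9, 25.5, 30.8])

std_pop = np.std(T_sample)             # divides by N (default, ddof=0)
std_sample = np.std(T_sample, ddof=1)  # divides by N - 1

# The variance is the square of the standard deviation
print(np.isclose(np.var(T_sample), std_pop**2))   # True

# argmin and argmax return the index of the smallest and largest value
i_min = np.argmin(T_sample)
i_max = np.argmax(T_sample)
print(i_min, i_max)   # 5 8

# Range of the data: maximum minus minimum
T_range = T_sample.max() - T_sample.min()
print(T_range)
```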

## Further reading
- (http://www.numpy.org/) - Numpy homepage
- (https://docs.scipy.org/doc/numpy/user/quickstart.html) - The official Numpy tutorial
- (https://docs.scipy.org/doc/numpy/reference/index.html) - Numpy reference documentation (describes many additional functions we didn't cover here)
- (https://docs.scipy.org/doc/scipy/reference/) - SciPy (Scientific Python) documentation. SciPy contains additional functions beyond what is provided in NumPy alone (integration, optimization, signal processing, interpolation, linear algebra, ODEs, etc.)