# NumPy and Matplotlib #

These are two of the most fundamental parts of the scientific Python "ecosystem". Almost everything else is built on top of them.


In [1]:
# import numpy


What did we just do? We _imported_ a package. This brings new variables (mostly functions) into our interpreter. We access them as follows.
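
As a minimal sketch of what the next few cells might contain: the alias `np` is a widely used convention and is assumed here, and the names printed by `dir()` will differ from session to session.

```python
import numpy as np  # "np" is the conventional alias (an assumption here)

# names defined in the current namespace (the list varies by session)
print(dir())

# a few of the public names that the numpy package provides
print(dir(np)[:10])

# the installed version, as a string
print(np.__version__)
```

Everything the package provides is then reached through the `np.` prefix, for example `np.array` or `np.pi`.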

In [2]:
# find out what is in our namespace


In [3]:
# find out what's in numpy


In [4]:
# find out what version we have


The numpy documentation is crucial!

http://docs.scipy.org/doc/numpy/reference/

## NDArrays ##

The core class is the numpy ndarray (n-dimensional array).

In [29]:
from IPython.display import Image
Image(url='http://docs.scipy.org/doc/numpy/_images/threefundamental.png')

In [5]:
# create an array from a list


In [6]:
# find out the datatype


In [7]:
# find out the shape


In [8]:
# what is the shape


In [9]:
# another array with a different datatype and shape


In [10]:
# check dtype and shape


__Important Concept__: The fastest-varying dimension is the last dimension! The outermost level of the nesting is the first dimension. (This is called C-style, or row-major, ordering.)
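
A small sketch of the ideas above, using made-up values: we build arrays from lists, inspect `dtype` and `shape`, and then check that flattening walks along the last axis first.

```python
import numpy as np

# a 1-D array from a list of integers
a = np.array([9, 0, 2, 1, 0])
print(a.dtype, a.shape)   # an integer dtype (platform dependent) and (5,)

# nested lists of floats give a 2-D array; the outer list is the first axis
b = np.array([[5.0, 3.0, 1.0, 9.0],
              [9.0, 2.0, 3.0, 0.0]])
print(b.dtype, b.shape)   # float64 (2, 4)

# flattening visits the last axis fastest: all of row 0, then all of row 1
flat = b.ravel()
print(flat)

# strides give the number of bytes to step along each axis;
# the last axis has the smallest step (one 8-byte float)
print(b.strides)          # (32, 8)
```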

## More array creation ##

There are lots of ways to create arrays.
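
Here is one possible set of examples for the cells below. The sizes and ranges are arbitrary choices; the grid variables `x`, `y`, `xx` and `yy` are defined this way only for illustration.

```python
import numpy as np

# uniform arrays
z = np.zeros((3, 4))             # all zeros, float64 by default
o = np.ones((2, 2), dtype=int)   # all ones, integer dtype
p = np.full((2, 3), np.pi)       # a constant fill value

# ranges: arange includes the start and excludes the stop
r = np.arange(2, 10, 2)          # array([2, 4, 6, 8])

# linearly spaced: both endpoints are included by default
lin = np.linspace(0, 1, 5)       # array([0., 0.25, 0.5, 0.75, 1.])

# logarithmically spaced: 10**0 up to 10**3
log = np.logspace(0, 3, 4)       # array([1., 10., 100., 1000.])

# two-dimensional grids built from 1-D coordinate vectors
x = np.linspace(-2 * np.pi, 2 * np.pi, 100)
y = np.linspace(-np.pi, np.pi, 50)
xx, yy = np.meshgrid(x, y)
print(xx.shape, yy.shape)        # (50, 100) (50, 100)
```

Note that `meshgrid` returns arrays of shape `(len(y), len(x))` by default, so `y` runs along the first axis and `x` along the last.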

In [11]:
# create some uniform arrays


In [12]:
# create some ranges


In [13]:
# arange is left inclusive, right exclusive


In [14]:
# linearly spaced


In [15]:
# log spaced


In [16]:
# two dimensional grids


## Indexing ##

Basic indexing works much as it does for lists.
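
A short sketch of basic indexing, assuming `xx` is a 2-D grid built with `meshgrid` (the grid sizes here are arbitrary):

```python
import numpy as np

x = np.linspace(-2 * np.pi, 2 * np.pi, 100)
y = np.linspace(-np.pi, np.pi, 50)
xx, yy = np.meshgrid(x, y)   # both have shape (50, 100)

# individual elements: [row, column]; negative indices count from the end
first = xx[0, 0]
last = xx[-1, -1]

# whole rows and columns
row = xx[0]       # first row, shape (100,)
col = xx[:, 0]    # first column, shape (50,)

# ranges with slice notation (start inclusive, stop exclusive)
block = xx[3:10, 30:40]
print(row.shape, col.shape, block.shape)   # (100,) (50,) (7, 10)
```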

In [17]:
# get some individual elements of xx


In [18]:
# get some whole rows and columns


In [19]:
# get some ranges using slice notation


There are many advanced ways to index arrays. You can read about them in the manual. Here is one example.
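
As a sketch of boolean indexing, again assuming a `meshgrid`-style array `xx`: a comparison produces a boolean array of the same shape, and using it as an index returns only the selected elements as a 1-D array.

```python
import numpy as np

x = np.linspace(-2 * np.pi, 2 * np.pi, 100)
y = np.linspace(-np.pi, np.pi, 50)
xx, yy = np.meshgrid(x, y)

# a boolean array with the same shape as xx
idx = xx < 0
print(idx.dtype, idx.shape)   # bool (50, 100)

# indexing with it keeps only the True positions; the 2-D layout is lost
neg = xx[idx]
print(neg.shape)              # (2500,)
```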

In [21]:
# use a boolean array as an index


In [22]:
# the array got flattened


## Array Operations ##

There are a huge number of operations available on arrays. All the familiar arithmetic operators are applied on an element-by-element basis.
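
For instance, here is a sketch with a tiny array followed by a 2-D field. The function `f` is an arbitrary choice made for illustration.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
print(a + a)    # [2. 4. 6.]
print(a * a)    # [1. 4. 9.]   (element by element, not a dot product)
print(a ** 2)   # [1. 4. 9.]

# the same rule applies to 2-D arrays and to numpy's math functions
x = np.linspace(-2 * np.pi, 2 * np.pi, 100)
y = np.linspace(-np.pi, np.pi, 50)
xx, yy = np.meshgrid(x, y)
f = np.sin(xx) * np.cos(0.5 * yy)
print(f.shape)  # (50, 100)
```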

### Basic Math ###

In [20]:
# calc something


At this point you might be getting curious about what these arrays "look" like, so we need to introduce some visualization.
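
One way to look at a 2-D array is a pseudocolor plot. The sketch below assumes the alias `plt` for `matplotlib.pyplot` (a common convention) and an illustrative field `f`; in a notebook the figure is shown inline, while in a plain script you would call `plt.show()` at the end.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-2 * np.pi, 2 * np.pi, 100)
y = np.linspace(-np.pi, np.pi, 50)
xx, yy = np.meshgrid(x, y)
f = np.sin(xx) * np.cos(0.5 * yy)   # illustrative field, shape (50, 100)

fig, ax = plt.subplots()
mesh = ax.pcolormesh(x, y, f)       # colors encode the values of f
fig.colorbar(mesh, ax=ax)           # adds a second axes holding the color scale
ax.set_xlabel("x")
ax.set_ylabel("y")
```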

In [21]:
# import matplotlib


In [22]:
# pcolormesh our array


## Manipulating array dimensions ##

In [23]:
# transpose


In [24]:
# transpose


In [25]:
# reshape an array (wrong size)


In [26]:
# reshape an array (right size) and mess it up


In [27]:
# tile an array
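
A compact sketch of these manipulations on a small made-up array, so the effect of each step is easy to read off:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # [[0 1 2], [3 4 5]]

# transpose swaps the axes
print(a.T)                       # [[0 3], [1 4], [2 5]]

# reshape must keep the total number of elements
try:
    a.reshape(4, 2)              # 8 != 6, so this fails
except ValueError as err:
    print("reshape failed:", err)

# a valid reshape re-cuts the same flat sequence; it is NOT a transpose
b = a.reshape(3, 2)
print(b)                         # [[0 1], [2 3], [4 5]]

# tile repeats the array: 2 copies along axis 0, 3 copies along axis 1
t = np.tile(a, (2, 3))
print(t.shape)                   # (4, 9)
```

Reshaping a gridded field to a different shape with the same size scrambles its spatial layout, which is what "mess it up" refers to.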


## Broadcasting ##

Broadcasting lets NumPy combine arrays of different but compatible shapes in element-by-element operations, without making explicit copies of the smaller array.


In [30]:
Image(url='http://scipy-lectures.github.io/_images/numpy_broadcasting.png',
     width=720)

In [31]:
# multiply f by x


In [32]:
# multiply f by y (shouldn't work)


In [33]:
# use newaxis special syntax


In [34]:
# look at result
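
The cells above can be sketched as follows, assuming `f` has shape `(50, 100)` and that `x` and `y` are the 1-D coordinate vectors it was built from (all illustrative). Shapes are compared starting from the last axis, and two dimensions are compatible when they are equal or one of them is 1.

```python
import numpy as np

x = np.linspace(-2 * np.pi, 2 * np.pi, 100)   # shape (100,)
y = np.linspace(-np.pi, np.pi, 50)            # shape (50,)
xx, yy = np.meshgrid(x, y)
f = np.sin(xx) * np.cos(0.5 * yy)             # shape (50, 100)

# (50, 100) * (100,): the last dimensions match, so x is applied to every row
fx = f * x
print(fx.shape)                               # (50, 100)

# (50, 100) * (50,): the last dimensions are 100 and 50, so this fails
try:
    f * y
except ValueError as err:
    print("broadcast failed:", err)

# np.newaxis inserts a length-1 axis: y becomes (50, 1), which broadcasts
fy = f * y[:, np.newaxis]
print(fy.shape)                               # (50, 100)
```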


## Reduction Operations ##

In [35]:
# sum


In [36]:
# mean


In [37]:
# std


In [38]:
# apply on just one axis
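
A sketch of these reductions on a small made-up array. With no arguments a reduction collapses the whole array to one number; with `axis=` it collapses only that dimension.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # [[0 1 2], [3 4 5]]

print(a.sum())     # 15
print(a.mean())    # 2.5
print(a.std())     # standard deviation over all six elements

# axis=0 collapses the rows, leaving one value per column
print(a.sum(axis=0))   # [3 5 7]

# axis=1 collapses the columns, leaving one value per row
print(a.sum(axis=1))   # [ 3 12]
```

The result of an axis reduction on a 2-D array is 1-D, so it can be passed straight to a line plot.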


In [39]:
# plot


## Fancy Plotting ##

Enough lessons, let's have some fun.
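
As one possible version of the cells below, here is a sketch that places two axes by hand and combines several plot types. The layout numbers, colormap and field are all arbitrary choices for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-2 * np.pi, 2 * np.pi, 100)
y = np.linspace(-np.pi, np.pi, 50)
xx, yy = np.meshgrid(x, y)
f = np.sin(xx) * np.cos(0.5 * yy)

fig = plt.figure(figsize=(10, 6))

# add_axes takes [left, bottom, width, height] as fractions of the figure
ax1 = fig.add_axes([0.08, 0.1, 0.55, 0.8])
ax2 = fig.add_axes([0.75, 0.1, 0.2, 0.8])

# left panel: filled colors with contour lines drawn on top
pc = ax1.pcolormesh(x, y, f, cmap="RdBu_r")
ax1.contour(x, y, f, colors="k", linewidths=0.5)
fig.colorbar(pc, ax=ax1)
ax1.set_title("f(x, y)")

# right panel: the mean over x, plotted against y
ax2.plot(f.mean(axis=1), y)
ax2.set_title("mean over x")
```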

In [39]:
# create axes


In [40]:
# big plot


## Real Data ##

An ARGO float profile from the North Atlantic.

In [174]:
# download with curl
!curl -O http://www.ldeo.columbia.edu/~rpa/argo_float_4901412.npz

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  140k  100  140k    0     0  1468k      0 --:--:-- --:--:-- --:--:-- 1480k


In [41]:
# load numpy file and examine keys


In [42]:
# access some data


In [43]:
# there are "nans", missing data, which screw up our routines
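
The keys stored in the downloaded file are not shown here, so the sketch below writes a small synthetic `.npz` file and loads that instead. The names `T` and `P` and all of the values are hypothetical stand-ins.

```python
import os
import tempfile
import numpy as np

# synthetic stand-in data with two missing values (hypothetical names)
T = np.array([[20.0, np.nan],
              [18.5, 18.0],
              [np.nan, 15.2]])
P = np.array([10.0, 50.0, 100.0])

path = os.path.join(tempfile.mkdtemp(), "demo_profile.npz")
np.savez(path, T=T, P=P)

# np.load on an .npz file returns a dict-like object of named arrays
with np.load(path) as data:
    print(data.files)      # ['T', 'P']
    temp = data["T"]

# a single NaN makes an ordinary reduction return NaN
print(temp.mean())         # nan

# the nan-aware variants skip the missing values
print(np.nanmean(temp))
```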


## Masked Arrays ##

Masked arrays are one way to deal with missing data in NumPy.
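
A minimal sketch with made-up values: `np.ma.masked_invalid` masks every NaN (and infinity), and reductions on the masked array then ignore the masked entries.

```python
import numpy as np

# made-up data containing two missing values
T = np.array([[20.0, np.nan],
              [18.5, 18.0],
              [np.nan, 15.2]])

Tm = np.ma.masked_invalid(T)   # mask wherever T is NaN or inf

print(Tm.count())              # 4 valid (unmasked) values
print(Tm.max(), Tm.min())      # 20.0 15.2
print(Tm.mean(axis=1))         # row means computed from valid entries only
```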

In [44]:
# create masked array


In [45]:
# max and min


In [46]:
# load other data


In [47]:
# scatter plot


# Individual Exercise #

* Calculate and plot the depth mean profile of T, S, and P, with error bars to indicate the standard deviation.
* Calculate the timeseries of the depth-averaged profiles of T, S, and P.