In [1]:
# It is convention to import numpy as `np`
import numpy as np

# Arrays

You can make an array from a regular python list of numbers.

In [2]:
np.array([1, 7, 4, 2])

array([1, 7, 4, 2])

There are also functions for making specific arrays, such as a range of numbers. For example, make an array from 0 to 9:

In [3]:
a = np.arange(10)

a # Display a value in a jupyter notebook by just putting it on a line by itself.

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Get the length of the array (its total number of elements):


In [4]:
a.size

10

Get the number of dimensions:

In [5]:
a.ndim

1

The `shape` of the array is the length of each dimension. In this case, it's just 1 dimension of length 10.

In [6]:
a.shape

(10,)

Basic indexing of one-dimensional numpy arrays is similar to a regular python list.

Get the zeroth element:

In [7]:
a[0]

0

Get the last element:

In [8]:
a[-1]

9

Get the elements from [2,5):

In [9]:
a[2:5]

array([2, 3, 4])

Get every other element:

In [10]:
a[::2]

array([0, 2, 4, 6, 8])
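The slices above all follow the pattern `start:stop:step`, where any of the three parts may be left out. A short sketch of a few more combinations (the variable names here are illustrative and not part of the original notebook):

```python
import numpy as np

a = np.arange(10)

# start:stop:step, with stop excluded from the result
odd = a[1:8:2]          # elements at indices 1, 3, 5, 7

# A negative step walks through the array backwards
backwards = a[::-1]

# An omitted start means "from the beginning"
first_three = a[:3]

print(odd)
print(backwards)
print(first_three)
```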

# Matrices

In this tutorial, a matrix is just a numpy array with two dimensions.

Create a two dimensional array:

In [11]:
a = np.array([[1,2,3], [4,5,6], [7,8,9]])

a

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

Get the total number of elements in the array:

In [12]:
a.size

9

Get the number of dimensions:

In [13]:
a.ndim

2

Get the length of each of the dimensions:

In [14]:
a.shape

(3, 3)
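The three attributes are related: `ndim` is the number of entries in `shape`, and `size` is the product of those entries. A small sketch to check this, using a second, non-square array `b` that is an assumption added for illustration:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# ndim counts the entries in shape
ndim_matches = a.ndim == len(a.shape)

# size is the product of the lengths of all dimensions
size_matches = a.size == a.shape[0] * a.shape[1]

# A non-square example: 2 rows, 3 columns
b = np.array([[1, 2, 3], [4, 5, 6]])

print(ndim_matches, size_matches)
print(b.shape, b.ndim, b.size)
```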

Indexing two-dimensional arrays can be trickier.  Getting a single row works the same way as with a nested python list:

In [15]:
a[0]

array([1, 2, 3])

As is getting a single element:

In [16]:
a[0][0]

1

Numpy also allows you to get a single column, something more difficult with python lists.
To do this, numpy uses commas to separate the indices for each dimension.
So, for instance, to get the zeroth column you would do:

In [17]:
a[:,0]

array([1, 4, 7])

Let's break that down.  The first colon just says to select all the values, 
the same as if you used the selector on a list:

In [18]:
[1,2,3][:]

[1, 2, 3]

Then the comma is to separate the first dimension from the second.
If we just had `:,:` we would get the whole matrix:

In [19]:
a[:,:]

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

But instead, after the comma we put just a 0, meaning that
we want just the zeroth column.  We could have put a 1 to get the first column:

In [20]:
a[:,1]

array([2, 5, 8])

Or use a slice to get just the zeroth and first columns:

In [21]:
a[:,:2]

array([[1, 2],
       [4, 5],
       [7, 8]])

We can use the same notation for rows as well.  For instance, to get the first two rows:

In [22]:
a[:2,:]

array([[1, 2, 3],
       [4, 5, 6]])

And then combining both we can get just the top left corner:

In [23]:
a[:2,:2]

array([[1, 2],
       [4, 5]])
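To see why column selection is described as more difficult with python lists, here is a minimal sketch comparing the two approaches. The names `rows`, `list_column`, and `array_column` are made up for this example:

```python
import numpy as np

rows = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
a = np.array(rows)

# With nested lists, a column has to be assembled row by row.
list_column = [row[0] for row in rows]

# With an array, the comma syntax selects the column directly.
array_column = a[:, 0]

# The comma syntax also works for single elements:
# a[0, 0] picks the same element as a[0][0].
same_element = a[0, 0] == a[0][0]

print(list_column)
print(array_column)
print(same_element)
```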

# Array Creation Functions

We've already seen how to create an array from a python list, or from a range of numbers.  But numpy also has built-in functions to create specific arrays.  For instance, to create an array of all zeros:

In [24]:
np.zeros(10)

array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.])

You can specify the shape when creating the array.  For instance, to create a 3x3 matrix of ones:

In [25]:
np.ones((3,3))

array([[ 1.,  1.,  1.],
       [ 1.,  1.,  1.],
       [ 1.,  1.,  1.]])
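Both `np.zeros` and `np.ones` return floating-point arrays by default, which is why the outputs above show decimal points. A minimal sketch of asking for integers instead through the `dtype` argument:

```python
import numpy as np

float_zeros = np.zeros(3)                # default dtype is float64
int_ones = np.ones((2, 3), dtype=int)    # request integers explicitly

print(float_zeros.dtype)
print(int_ones)
```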

Besides `np.arange`, numpy can also create other number series.

Create an array of 10 elements evenly spaced between 10 and 100:

In [26]:
start = 10
stop = 100
n = 10

np.linspace(start,stop,n)

array([  10.,   20.,   30.,   40.,   50.,   60.,   70.,   80.,   90.,  100.])

Or create a series of 10 elements logarithmically spaced between 10 and 100.  This is useful if you want more numbers at the lower end of the range than at the higher.  Here the start and stop are given as exponents.

In [27]:
start = 1 # 10^1 = 10
stop = 2 # 10^2 = 100
n = 10

np.logspace(start,stop,n)

array([  10.        ,   12.91549665,   16.68100537,   21.5443469 ,
         27.82559402,   35.93813664,   46.41588834,   59.94842503,
         77.42636827,  100.        ])
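One way to read `np.logspace` is as `np.linspace` applied to the exponents, followed by raising 10 to each of them. The sketch below checks that equivalence for the values used above (base 10 is the default base of `np.logspace`):

```python
import numpy as np

exponents = np.linspace(1, 2, 10)    # evenly spaced exponents from 1 to 2
by_hand = 10 ** exponents            # raise 10 to each exponent
built_in = np.logspace(1, 2, 10)

print(np.allclose(by_hand, built_in))
```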

# Generating Random Arrays

[`numpy.random`](http://docs.scipy.org/doc/numpy/reference/routines.random.html) provides functions for generating arrays from a number of different distributions.

Seeding the random generator ensures that each time a program is run, the same sequence of random numbers is generated.  This is useful for making your analysis reproducible.

In [28]:
np.random.seed(0) # Seed the random generator with 0

Generate a 2x5 matrix of random numbers chosen uniformly from [0,1):

In [29]:
np.random.random((2,5))

array([[ 0.5488135 ,  0.71518937,  0.60276338,  0.54488318,  0.4236548 ],
       [ 0.64589411,  0.43758721,  0.891773  ,  0.96366276,  0.38344152]])

Take numbers from the normal distribution with mean 0 and standard deviation 1:

In [30]:
np.random.normal(size=(2,5))

array([[ 0.14404357,  1.45427351,  0.76103773,  0.12167502,  0.44386323],
       [ 0.33367433,  1.49407907, -0.20515826,  0.3130677 , -0.85409574]])

Take numbers from the normal distribution with mean 3 and standard deviation 2:

In [31]:
np.random.normal(loc=3, scale=2, size=(2,5))

array([[-2.10597963,  4.30723719,  4.7288724 ,  1.51566996,  7.53950925],
       [ 0.09126865,  3.09151703,  2.6256323 ,  6.06555843,  5.93871754]])

Take numbers from a Poisson distribution with lambda=5:

In [32]:
np.random.poisson(lam=5, size=(2,5))

array([[5, 4, 5, 4, 3],
       [3, 7, 3, 3, 4]])

There are many other built-in distributions, such as binomial and chi-square, detailed in the numpy documentation.
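The effect of seeding can be checked directly. In this sketch, resetting the seed to the same value reproduces the same numbers, while drawing again without reseeding continues the sequence. The seed value 0 matches the cell above; the variable names are illustrative:

```python
import numpy as np

np.random.seed(0)
first = np.random.random(3)

np.random.seed(0)                # reset the generator to the same seed
second = np.random.random(3)

third = np.random.random(3)      # no reseed: the sequence simply continues

print(np.array_equal(first, second))
print(np.array_equal(second, third))
```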

# Reshape

Often you will need to change the shape of an array.  For example, `np.arange` can only make flat arrays, so to turn a range into a 3x5 matrix you reshape it:

In [33]:
a = np.arange(15).reshape((3,5))

a

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

Or reshape it again into a 5x3 matrix:

In [34]:
a.reshape((5,3))

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11],
       [12, 13, 14]])

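Note that reshaping is not the same as transposing: `reshape` keeps the elements in the same row-by-row order and only changes where the rows break.  To swap rows and columns instead, use the attribute `T`: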

In [35]:
a.T

array([[ 0,  5, 10],
       [ 1,  6, 11],
       [ 2,  7, 12],
       [ 3,  8, 13],
       [ 4,  9, 14]])
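Reshaping to `(5,3)` and transposing both produce a 5x3 result, but the elements land in different places, as the two outputs above show. A small sketch that makes the difference explicit:

```python
import numpy as np

a = np.arange(15).reshape((3, 5))

reshaped = a.reshape((5, 3))    # refills a 5x3 grid in row order
transposed = a.T                # swaps rows and columns

print(reshaped.shape == transposed.shape)
print(np.array_equal(reshaped, transposed))
print(reshaped[0], transposed[0])
```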

Reshaping can't change the total number of elements, so asking for a shape of a different size raises an error:

In [36]:
a.reshape((5,5))

ValueError: total size of new array must be unchanged

To get a one-dimensional array again use `flatten`:

In [37]:
a.flatten()

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])
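`flatten` returns a copy, so changing the flattened array leaves the original alone. The sketch below checks this, and also shows `reshape(-1)` as an alternative way to get one dimension, where `-1` asks numpy to infer the length:

```python
import numpy as np

a = np.arange(15).reshape((3, 5))

flat = a.flatten()
flat[0] = 99                        # modify the flattened copy

inferred_shape = a.reshape(-1).shape

print(a[0, 0])                      # the original array is unchanged
print(inferred_shape)
```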

# Mathematical Operations

numpy makes it really easy to do mathematical operations on arrays and matrices. Here are a few examples:

Scalar addition:

In [38]:
np.arange(10) + 5

array([ 5,  6,  7,  8,  9, 10, 11, 12, 13, 14])

Scalar multiplication:

In [39]:
np.arange(10) * 5

array([ 0,  5, 10, 15, 20, 25, 30, 35, 40, 45])

Scalar comparisons:

In [40]:
np.arange(10) < 5

array([ True,  True,  True,  True,  True, False, False, False, False, False], dtype=bool)

Array addition:

In [41]:
np.arange(10) + np.arange(10)

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

Elementwise multiplication:

In [42]:
np.arange(10) * np.arange(10)

array([ 0,  1,  4,  9, 16, 25, 36, 49, 64, 81])

Dot product:

In [43]:
np.arange(10).dot(np.arange(10))

285
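For one-dimensional arrays, the dot product is the sum of the elementwise products, which ties together the previous two cells. For two-dimensional arrays, `dot` performs matrix multiplication. A short sketch, where the arrays `m` and `identity` are assumptions added for illustration:

```python
import numpy as np

x = np.arange(10)

dot = x.dot(x)
by_hand = (x * x).sum()      # multiply elementwise, then add everything up

# For two-dimensional arrays, dot is matrix multiplication.
m = np.array([[1, 2], [3, 4]])
identity = np.array([[1, 0], [0, 1]])
product = m.dot(identity)    # multiplying by the identity returns m

print(dot, by_hand)
print(product)
```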

Sum:

In [44]:
np.arange(10).sum()

45

Element-wise sine:

In [45]:
np.sin(np.arange(10))

array([ 0.        ,  0.84147098,  0.90929743,  0.14112001, -0.7568025 ,
       -0.95892427, -0.2794155 ,  0.6569866 ,  0.98935825,  0.41211849])

Check out the [documentation](http://docs.scipy.org/doc/numpy-1.10.1/reference/routines.math.html) for the full list of math functions.

# Statistics

numpy has several basic statistical functions built in, such as the mean:

In [46]:
np.arange(10).mean()

4.5

And standard deviation:

In [47]:
np.arange(10).std()

2.8722813232690143
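By default, `std` computes the population standard deviation, dividing by the number of elements `n`. Passing `ddof=1` divides by `n - 1` instead, giving the sample standard deviation. A sketch that reproduces the default by hand:

```python
import numpy as np

x = np.arange(10)

# Population standard deviation: square root of the mean squared deviation.
by_hand = np.sqrt(((x - x.mean()) ** 2).mean())

population = x.std()         # default ddof=0: divide by n
sample = x.std(ddof=1)       # divide by n - 1 instead

print(np.isclose(by_hand, population))
print(population < sample)
```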

Check out the [documentation](http://docs.scipy.org/doc/numpy-1.10.0/reference/routines.statistics.html) for the full list of statistics functions.

# Further Reading

* [NumPy Official Quickstart](https://docs.scipy.org/doc/numpy-dev/user/quickstart.html)
* [Full NumPy Documentation](http://docs.scipy.org/doc/numpy/)