This notebook is an introduction to Python and Numpy.
Hopefully it will be enough to get you started with manipulating arrays of numbers. 

The first thing is to tell Python that you are using Numpy. This is called *importing it into the namespace*. It is a good idea to give each module that you import a short name, since you will type them a lot. 

Python comments start with a #. The interpreter ignores them. 

To follow along with this notebook, press ctrl+enter in each of the code blocks (which start with In[ ]). This runs the commands in that notebook cell. Only the value of the last command in the cell is printed out; the output of the others is suppressed. To see all of the outputs, put the commands in separate cells or use the print function.

When using the notebook, pressing tab when partway through a command will show you the possible completion options.

In [2]:
# Import numpy and use np as its name
import numpy as np

You can get help by using a ? before the name of a command, e.g.

?np.ones

In [35]:
?np.ones
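
The ? syntax only works in the notebook (and in IPython). As a small sketch, assuming you are working in plain Python instead, the same documentation is stored on the function as its docstring:

```python
import numpy as np

# Every Numpy function carries its documentation in __doc__
doc = np.ones.__doc__

# Print just the first line, which is a one-sentence summary
print(doc.strip().splitlines()[0])

# In an interactive session, help(np.ones) shows the full text
```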

Making Arrays
==

Numpy is based on arrays of numbers. You specify the size of the matrix when you create it, by listing the size of each dimension in turn. Note that for more than one dimension the sizes go inside another set of brackets inside the function call, e.g. np.ones((3,5)).

The following cells make matrices with a variety of values, numbers of dimensions, and sizes. Note that we can make arbitrarily high-dimensional matrices, but then keeping track of the indices is hard.

All the numbers in an array have the same *type*. Functions such as np.ones and np.zeros make floating point entries by default, while np.array picks the type from the entries you list. To change it, specify what you want using *dtype*.
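
As a quick illustration of how the type is chosen (the variable names here are made up for this sketch), you can check the dtype attribute of each array:

```python
import numpy as np

# Integers in, integers out: the type is inferred from the entries
ints = np.array([3, 4, 2])

# A single float entry makes the whole array floating point
floats = np.array([3, 4.5, 2])

# Ask for a specific type explicitly with dtype
forced = np.array([3, 4, 2], dtype=float)

# np.ones makes floats unless told otherwise
ones_int = np.ones(3, dtype=int)

print(ints.dtype, floats.dtype, forced.dtype, ones_int.dtype)
```

The exact integer type that is printed (for example int64 or int32) depends on your platform.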

In [51]:
# The most basic way to make an array is just to list the entries. 
# Here is a 1D matrix with 3 elements
np.array([3,4,2])

array([3, 4, 2])

In [52]:
# And here is a 2D matrix, which is a bit more involved to write out
np.array([ [2,3], [1,2], [3,4]])

array([[2, 3],
       [1, 2],
       [3, 4]])

In [5]:
# A 1D array containing the value 1.0 (floating point)
np.ones(2)

array([1., 1.])

In [6]:
# A 2D array containing the value 1.0 in 3 rows and 5 columns
np.ones((3,5))

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

In [13]:
# A 3D array of shape (2,3,4): 2 blocks, each with 3 rows and 4 columns
# The zeros are integers, not floats
np.zeros((2,3,4),dtype='int')

array([[[0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]],

       [[0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]])

In [15]:
# Note that adding a single integer (a scalar) adds it to all the elements of the array
np.ones((2,3))+1

array([[2., 2., 2.],
       [2., 2., 2.]])

In [16]:
# As does multiplying by a scalar 
2*np.ones((2,3))

array([[2., 2., 2.],
       [2., 2., 2.]])

Some more interesting matrices
---

In [22]:
# An identity-like 2D matrix of size 3 by 4 (ones on the main diagonal, zeros elsewhere)
# Note that there is only 1 set of brackets here
np.eye(3,4)

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.]])

In [23]:
# Make an array of the integers from 0 to 4 
# (counting starts at 0 and stops just before the number you specify)
np.arange(5)

array([0, 1, 2, 3, 4])

In [28]:
# Make an array of integers starting at 3, stopping before 9, going up in steps of 2
np.arange(3,9,2)

array([3, 5, 7])

In [30]:
# And starting at 7, stopping before 2, and going down in steps of 1.5
np.arange(7,2,-1.5)

array([7. , 5.5, 4. , 2.5])

In [33]:
# Make a set of 10 linearly spaced numbers starting at 1 and finishing at 3
np.linspace(1,3,10)

array([1.        , 1.22222222, 1.44444444, 1.66666667, 1.88888889,
       2.11111111, 2.33333333, 2.55555556, 2.77777778, 3.        ])

In [39]:
# Make some random numbers -- a 2D (3 by 4) set of Gaussian (normal) random numbers with mean 2 and standard deviation 1
np.random.normal(2,1,size=(3,4))

array([[1.37229834, 3.32538849, 2.62763136, 4.81522918],
       [1.51991168, 2.6875834 , 2.40139893, 1.7055504 ],
       [2.19693255, 1.95243914, 3.11968971, 1.78742197]])

In [46]:
# Make a 1D array of 10 uniform random numbers between 2 and 3
np.random.uniform(2,3,size=10)

array([2.69318714, 2.44264367, 2.12679401, 2.07975169, 2.42927148,
       2.62700674, 2.19207368, 2.79184915, 2.37662232, 2.40302627])

In [49]:
# Make a 2D set (size 2 by 3) of random integers between 0 (inclusive) and 5 (exclusive)
np.random.randint(5,size=(2,3))

array([[3, 1, 0],
       [1, 3, 2]])

In [50]:
# Make a 2D set (size 2 by 3) of random integers between 4 (inclusive) and 6 (exclusive)
np.random.randint(4,6,size=(2,3))

array([[4, 4, 5],
       [4, 4, 4]])

Note that in the last few examples we were using the *random* subpackage of Numpy.
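
Random numbers change every time you run the cell, which makes results hard to reproduce. One option, sketched here on the assumption that your Numpy is version 1.17 or later, is a seeded generator from np.random.default_rng. The seed 42 is an arbitrary choice.

```python
import numpy as np

# Create a generator with a fixed seed (any integer works)
rng = np.random.default_rng(42)

normal_draws = rng.normal(2, 1, size=(3, 4))    # mean 2, standard deviation 1
uniform_draws = rng.uniform(2, 3, size=10)      # between 2 and 3
int_draws = rng.integers(0, 5, size=(2, 3))     # 0 (inclusive) to 5 (exclusive)

# Re-creating the generator with the same seed repeats the same numbers
rng_again = np.random.default_rng(42)
same_draws = rng_again.normal(2, 1, size=(3, 4))
print(np.array_equal(normal_draws, same_draws))  # True
```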

Indexing the elements of arrays
===

As mentioned before, Python indexes from 0. So if you want to get the first element of the array use index 0. Then the next is 1, etc.

You can also index from the end of the array, where the last element is -1. Then the last-but-one element is -2, etc. 

In [54]:
# Here we make a 1D array called A and get the various elements out of it (using the print function)
# And then change one of them, and check that the change worked
A = np.arange(5)
print(A)
print(A[0], A[-1], A[1], A[-2])
A[1] = 7
A[1]

[0 1 2 3 4]
0 4 1 3


7

In [57]:
# And here it is in 2D, where we need 2 indices
A = np.array([[1,2],[3,4],[5,6]])
print(A)
print(A[0,0], A[-1,1],A[0,-1])
A[1,1] = 0
print(A)

[[1 2]
 [3 4]
 [5 6]]
1 6 2
[[1 2]
 [3 0]
 [5 6]]


Since I used that example, it reminded me of a useful function, which is reshape. We give it a matrix, and the new shape that we would like it to have (the total number of elements has to stay the same, of course).

To get the size of a matrix, we use the np.shape command. There is also np.ndim, which gives the number of dimensions, and np.size, which gives the number of elements.

In [130]:
# Here is another way to make the matrix of the previous example
A = np.arange(6)
print(A, np.shape(A))
A = np.reshape(A,(3,2))
print(A, np.shape(A), np.ndim(A), np.size(A))

[0 1 2 3 4 5] (6,)
[[0 1]
 [2 3]
 [4 5]] (3, 2) 2 6
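
The same information is also available as attributes of the array, and reshape accepts -1 for one dimension so that Numpy works out that size for you. A small sketch (the name B is just for illustration):

```python
import numpy as np

A = np.arange(6)

# -1 asks Numpy to work out that dimension from the number of elements
B = A.reshape((3, -1))
print(B.shape, B.ndim, B.size)   # (3, 2) 2 6

# The number of elements must stay the same, so this fails
try:
    A.reshape((4, 2))
except ValueError as err:
    print("reshape failed:", err)
```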


Slicing
---

It is often useful to extract more than one element of an array. Numpy allows this with the : using a technique called *slicing*. You specify the start, end, and step, but if you miss one of them out, Numpy fills them in as 0, the end, and 1 respectively (so : is the same as 0::1).

In [69]:
# Make a 1D array of the integers 0 to 5
A = np.arange(6)
# Then print out the ones at indices 1 to 4 (the end index, 5, is not included)
print(A[1:5])
print(A[:], A[0::1])

[1 2 3 4]
[0 1 2 3 4 5] [0 1 2 3 4 5]
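
Here are a few more slices of the same array, as a sketch of what happens when the step is not 1 or when one end is left out (the variable names are made up for illustration):

```python
import numpy as np

A = np.arange(6)

stepped = A[1:5:2]    # start at 1, stop before 5, step 2
first_three = A[:3]   # start left out, so it begins at 0
from_three = A[3:]    # end left out, so it runs to the end
every_other = A[::2]  # every second element

print(stepped)        # [1 3]
print(first_three)    # [0 1 2]
print(from_three)     # [3 4 5]
print(every_other)    # [0 2 4]
```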


The same thing works in more dimensions, but there is 1 important difference: when you pick out a single row or column with a plain integer index, Numpy drops that dimension from the result. This can be a problem if you aren't careful.

In [80]:
# Make a 2D array of the integers again (note we can do it in 1 line)
A = np.arange(6).reshape((3,2))
# And get the middle row
A[1,:]

array([2, 3])

In [81]:
# Get the second column
A[:,1]

array([1, 3, 5])

Can you see the problem? It is no longer a column, but a 1D array (which prints like a row). This isn't normally what you want.

To fix it, it's better to specify the indices as a range, as below. 

In [82]:
# Now it's a column 
A[:,1:2]

array([[1],
       [3],
       [5]])
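
Checking the shapes makes the difference easy to see. This sketch also shows two other ways to get the column back, np.newaxis and reshape, which you may meet in other people's code:

```python
import numpy as np

A = np.arange(6).reshape((3, 2))

flat = A[:, 1]                            # integer index: 1D result
col_slice = A[:, 1:2]                     # slice: stays 2D
col_newaxis = A[:, 1][:, np.newaxis]      # add the lost dimension back
col_reshape = A[:, 1].reshape((-1, 1))    # or reshape into one column

print(flat.shape)         # (3,)
print(col_slice.shape)    # (3, 1)
print(col_newaxis.shape)  # (3, 1)
print(col_reshape.shape)  # (3, 1)
```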

In [85]:
# We can also slice in more than 1 dimension at once
# Here is another matrix and a submatrix of it
A = np.arange(9).reshape((3,3))
A[1:,1:]

array([[4, 5],
       [7, 8]])

To reverse an array, we can just read it backwards, using a step of -1:

In [78]:
A = np.arange(6)
print(A)
A = A[::-1]
print(A)

[0 1 2 3 4 5]
[5 4 3 2 1 0]


Adding, and multiplying matrices
===

Provided the matrices are the correct shapes, we can add matrices together, and multiply them in two different ways: element-wise multiplication, and true matrix multiplication. If the shapes of the matrices are wrong, we will get errors. The message tries to explain the problem. 

In [87]:
A = np.ones((2,3))
B = 2*np.ones((2,3))
C = 2*np.ones(6)
print(A+B)
print(A+C)

[[3. 3. 3.]
 [3. 3. 3.]]


ValueError: operands could not be broadcast together with shapes (2,3) (6,) 
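
The error message talks about broadcasting, which is the rule Numpy uses to combine arrays of different shapes. Shapes are compared from the last dimension backwards, and two sizes are compatible when they are equal or one of them is 1. A small sketch, where the arrays row and col are made up for illustration:

```python
import numpy as np

A = np.ones((2, 3))
row = np.array([10, 20, 30])      # shape (3,), matches the last dimension of A
col = np.array([[100], [200]])    # shape (2, 1), matches the first dimension of A

plus_row = A + row   # row is added to every row of A
plus_col = A + col   # col is added to every column of A

print(plus_row)
print(plus_col)
```

This is also why adding a scalar works: the scalar is stretched to match every element.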

In [88]:
print(A*B)

[[2. 2. 2.]
 [2. 2. 2.]]


Matrix multiplication is called *dot* in Numpy. It requires that the second dimension of the first (2D) matrix matches the first dimension of the second one. The shape of the output matrix is then the first dimension of the first matrix by the second dimension of the second one. 

In [94]:
# This gives an error -- shapes are wrong
np.dot(A,B)

ValueError: shapes (2,3) and (2,3) not aligned: 3 (dim 1) != 2 (dim 0)

In [96]:
# To fix it, we transpose the second matrix, so that it is 3*2 instead of 2*3
print(B.T, np.shape(B.T))
# And then the multiplication works
print(np.dot(A,B.T))

[[2. 2.]
 [2. 2.]
 [2. 2.]] (3, 2)
[[6. 6.]
 [6. 6.]]
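
You will also see matrix multiplication written with the @ operator. For 2D matrices it gives the same answer as np.dot, as this short check suggests:

```python
import numpy as np

A = np.ones((2, 3))
B = 2 * np.ones((2, 3))

via_dot = np.dot(A, B.T)
via_at = A @ B.T          # the @ operator does matrix multiplication

print(via_at)
print(np.array_equal(via_dot, via_at))   # True
# Shape check: (2,3) times (3,2) gives (2,2)
print(via_at.shape)                      # (2, 2)
```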


Concatenation and Split
===

The last thing we will see is how to combine and separate out matrices. 

The concatenate command puts matrices together. Note the square brackets for the set of matrices. The axis parameter specifies which dimension to join along (the default is 0). The sizes in the other dimensions have to match up, or you will get errors. These simple uses of concatenate can be replaced by hstack and vstack. 

The stack command more generally puts the matrices into a higher-dimensional array. 

And you can split a matrix into pieces using split. 

Examples follow.



In [99]:
np.concatenate([np.ones((2,3)),2*np.ones((2,3))])

array([[1., 1., 1.],
       [1., 1., 1.],
       [2., 2., 2.],
       [2., 2., 2.]])

In [107]:
np.concatenate([np.ones((2,3)),2*np.ones((2,4))],axis=1)

array([[1., 1., 1., 2., 2., 2., 2.],
       [1., 1., 1., 2., 2., 2., 2.]])

In [114]:
np.vstack([np.ones((2,3)),2*np.ones((2,3))])

array([[1., 1., 1.],
       [1., 1., 1.],
       [2., 2., 2.],
       [2., 2., 2.]])
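
For completeness, here is a sketch of hstack, which joins matrices side by side and matches concatenate with axis=1 for 2D matrices (the names left and right are made up for illustration):

```python
import numpy as np

left = np.ones((2, 3))
right = 2 * np.ones((2, 4))

# Both matrices have 2 rows, so they can sit side by side
side_by_side = np.hstack([left, right])
same_thing = np.concatenate([left, right], axis=1)

print(side_by_side.shape)                        # (2, 7)
print(np.array_equal(side_by_side, same_thing))  # True
```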

In [116]:
print(np.stack([np.ones((2,3)),2*np.ones((2,3))]))
np.shape(np.stack([np.ones((2,3)),2*np.ones((2,3))]))

[[[1. 1. 1.]
  [1. 1. 1.]]

 [[2. 2. 2.]
  [2. 2. 2.]]]


(2, 2, 3)

In [124]:
# In 1D splitting just means specifying where you want to split
A = np.arange(7)
print(np.split(A,[2]))
print(np.split(A,[2,5]))

[array([0, 1]), array([2, 3, 4, 5, 6])]
[array([0, 1]), array([2, 3, 4]), array([5, 6])]


In [127]:
# In higher dimensions, you also need to say which dimension you want to split along
A = np.ones((3,4))
print(np.split(A,[1],axis=0))
print(np.split(A,[1,3],axis=1))

[array([[1., 1., 1., 1.]]), array([[1., 1., 1., 1.],
       [1., 1., 1., 1.]])]
[array([[1.],
       [1.],
       [1.]]), array([[1., 1.],
       [1., 1.],
       [1., 1.]]), array([[1.],
       [1.],
       [1.]])]
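
Instead of a list of split points, split also accepts a single integer, meaning that many equal pieces. A short sketch, which also uses np.array_split for the case where the pieces cannot be equal:

```python
import numpy as np

A = np.arange(6)

# An integer asks for that many equal pieces
thirds = np.split(A, 3)
print(thirds)

# np.split raises an error if the pieces cannot be equal;
# np.array_split allows uneven pieces instead
uneven = np.array_split(np.arange(7), 3)
sizes = [len(piece) for piece in uneven]
print(sizes)   # [3, 2, 2]
```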


Finding elements
---

Another very useful command is np.where. With just a logical statement, it returns the indices of the elements where the statement is true. With two extra arguments, it returns a new matrix of the same size, taking the first value in the places where the statement is true and the second where it is false.

In [159]:
A = np.arange(6)
print(A)
print(np.where(A>3))

# Another syntax -- return a matrix the same size as A, but with 0 in the places where A>3 and 1 in the rest
np.where(A>3,0,1)

[0 1 2 3 4 5]
(array([4, 5]),)


array([1, 1, 1, 1, 0, 0])
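
If you want the elements themselves rather than their indices, you can index the array with the logical statement directly. This sketch also shows what np.where returns for a 2D matrix (the matrix B is made up for illustration):

```python
import numpy as np

A = np.arange(6)
mask = A > 3                     # True where the test passes

selected = A[mask]               # the elements themselves
via_where = A[np.where(mask)]    # same result, going via the indices
print(selected)                  # [4 5]

# In 2D, np.where returns one array of indices per dimension
B = np.array([[1, 5], [7, 2]])
rows, cols = np.where(B > 3)
print(rows, cols)                # [0 1] [1 0]
```

Reading the last line: the entries bigger than 3 sit at positions (0,1) and (1,0).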

Applying mathematical functions to matrices
===

Hopefully it is fairly obvious what np.sin(A) does: it applies the sine function to each element of A.

In [160]:
np.sin(A)

array([ 0.        ,  0.84147098,  0.90929743,  0.14112001, -0.7568025 ,
       -0.95892427])
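
Other mathematical functions behave the same way. As a small sketch with easy numbers, np.sqrt returns an array of the same shape with the square root of each entry:

```python
import numpy as np

A = np.array([0.0, 1.0, 4.0, 9.0])

# The function is applied to every element separately
roots = np.sqrt(A)
print(roots)                   # [0. 1. 2. 3.]
print(roots.shape == A.shape)  # True

# The results can be combined with ordinary arithmetic
shifted = np.exp(A) - 1
```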

The other thing we can do is compute sums, maxima and minima.

In [171]:
A = np.random.uniform(size=(3,4))
print(A)
# These give the sum, min, max over the whole matrix
print(np.sum(A), np.min(A), np.max(A))
# And these sums are along each axis: axis=0 gives one sum per column, axis=1 one sum per row
print(np.sum(A,axis=0),np.sum(A,axis=1))


[[0.88197024 0.46308556 0.34709009 0.44980294]
 [0.61405926 0.26881644 0.00780931 0.85117246]
 [0.29065664 0.62817137 0.51069397 0.41827628]]
5.731604571923615 0.00780931441241306 0.8819702445113302
[1.78668614 1.36007337 0.86559338 1.71925168] [2.14194884 1.74185747 1.84779826]
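
The axis argument is easier to check by hand on a matrix of known integers. In this sketch (the variable names are made up for illustration), axis=0 collapses the rows and axis=1 collapses the columns:

```python
import numpy as np

A = np.arange(12).reshape((3, 4))
print(A)

col_sums = np.sum(A, axis=0)   # one sum per column, 4 values
row_sums = np.sum(A, axis=1)   # one sum per row, 3 values
print(col_sums)                # [12 15 18 21]
print(row_sums)                # [ 6 22 38]

# min and max take the axis argument in the same way
col_max = np.max(A, axis=0)
row_min = np.min(A, axis=1)
print(col_max, row_min)
```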
