This notebook develops a basic understanding of how `numpy` works and how to use it properly. First of all, let's import numpy.

In [1]:
import numpy as np

Numpy has many convenient mathematical functions built in. Let's compute $\sin(\pi/2)$, $\cos(\pi/2)$, $\tan(\pi/2)$. Because `np.pi` is only a floating-point approximation of $\pi$, $\cos(\pi/2)$ comes out as approximately (not exactly) zero, and $\tan(\pi/2)$, which is mathematically undefined, comes out as a huge finite number. Be careful about this!

In [7]:
print(np.sin(np.pi/2), np.cos(np.pi/2), np.tan(np.pi/2))

1.0 6.123233995736766e-17 1.633123935319537e+16
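
One practical consequence is that an exact comparison with zero fails. A minimal sketch of a tolerance-based check with `np.isclose` follows; the tolerance `atol=1e-12` is an arbitrary choice for illustration, not a recommended value.

```python
import numpy as np

c = np.cos(np.pi / 2)

# Exact comparison fails because c is about 6e-17, not 0.
exact = (c == 0.0)

# np.isclose compares within a tolerance. When comparing against 0 only the
# absolute tolerance matters (rtol * 0 == 0), so atol is set explicitly.
close = np.isclose(c, 0.0, atol=1e-12)

print(exact, close)   # False True
```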


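The basic numpy data structure is the array. It is similar to the built-in Python list, but all of its elements share a single data type and are typically stored in one contiguous block of memory, which makes numerical operations much more efficient. Here `np.arange(start, stop, step)` creates an array of evenly spaced values that excludes `stop`.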

In [10]:
print(np.arange(1,100,1))

[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48
 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72
 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96
 97 98 99]
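
To make the difference from a list concrete, here is a small sketch (the names `py_list`, `arr`, and `mixed` are only for illustration). It shows that arithmetic on an array acts element by element, and that an array holds a single data type.

```python
import numpy as np

py_list = [1, 2, 3]
arr = np.array(py_list)

# Multiplying a list repeats it; multiplying an array scales each element.
print(py_list * 2)   # [1, 2, 3, 1, 2, 3]
print(arr * 2)       # [2 4 6]

# Every element of an array shares one dtype; mixing ints and floats
# upcasts everything to float.
mixed = np.array([1, 2.5])
print(arr.dtype, mixed.dtype)
```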


You can index a numpy array in many different ways, for example:
- Using single indices and slices, like you would a Python list
- Using an array or list of indices
- Using booleans (`True`, `False`) and conditions

In [19]:
a = np.arange(1,100,1)
print("Slicing and indexing:")
print(a[1], a[1:10])
# Using a list to index the array:
print("Indexing using a list:")
print(a[[0,1,2,3]])
# Now select only the elements where a < 10. There are two ways of doing this:
print("Condition a < 10. Where is it true?")
print(a < 10, np.where(a < 10))
print("Indexing using the condition:")
print(a[a<10], a[np.where(a < 10)])

Slicing and indexing:
2 [ 2  3  4  5  6  7  8  9 10]
Indexing using a list:
[1 2 3 4]
Condition a < 10. Where is it true?
[ True  True  True  True  True  True  True  True  True False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False] (array([0, 1, 2, 3, 4, 5, 6, 7, 8]),)
Indexing using the condition:
[1 2 3 4 5 6 7 8 9] [1 2 3 4 5 6 7 8 9]
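
Conditions can also be combined. The sketch below recreates the same array and shows a compound mask, the tuple returned by `np.where`, and assignment through a mask; the variable names `mask`, `idx`, and `c` are illustrative.

```python
import numpy as np

a = np.arange(1, 100, 1)

# Combine conditions with & (and), | (or), ~ (not). Each condition needs its
# own parentheses because these operators bind tighter than comparisons.
mask = (a > 5) & (a < 10)
print(a[mask])        # [6 7 8 9]

# np.where(condition) returns a tuple with one index array per dimension.
idx = np.where(mask)
print(idx[0])         # [5 6 7 8]

# A boolean mask can also be used for assignment.
c = a.copy()
c[c % 2 == 0] = 0     # zero out the even entries
print(c[:6])          # [1 0 3 0 5 0]
```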


Indexing with an array of integers (often called fancy indexing) also lets the result take on a new shape: the output has the shape of the index array. Let's first create a 2 by 2 array, then use it to index our original array!

In [23]:
b = np.array([[98,87],[45,3]])
print("b:")
print(b)
print("a[b]:")
print(a[b])

b:
[[98 87]
 [45  3]]
a[b]:
[[99 88]
 [46  4]]
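
A short sketch of two properties of this kind of indexing: the result takes the shape of the index array, and it is a copy rather than a view of the original. The names `idx` and `picked` are illustrative.

```python
import numpy as np

a = np.arange(1, 100, 1)
idx = np.array([[98, 87], [45, 3]])

picked = a[idx]
print(picked.shape == idx.shape)   # True

# Indexing with an integer array returns a copy, so changing the result
# leaves the original array untouched.
picked[0, 0] = -1
print(a[98])                       # 99
```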


We can also change the shape of our array using `reshape`. The new shape must hold the same number of elements, here $3 \times 33 = 99$:

In [25]:
b = a.reshape((3,33))
print(b)

[[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
  25 26 27 28 29 30 31 32 33]
 [34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57
  58 59 60 61 62 63 64 65 66]
 [67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90
  91 92 93 94 95 96 97 98 99]]
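
Two details of `reshape` are worth a quick sketch: one dimension can be given as `-1` so that numpy infers it, and for a freshly created contiguous array like this one the result shares memory with the original instead of copying it.

```python
import numpy as np

a = np.arange(1, 100, 1)

# -1 asks numpy to infer that dimension: 99 elements / 3 rows = 33 columns.
b = a.reshape((3, -1))
print(b.shape)                   # (3, 33)

# Here reshape returns a view, so b and a share the same memory.
print(np.shares_memory(a, b))    # True

# An incompatible shape raises a ValueError: 10 * 10 = 100 elements are
# needed, but a has only 99.
try:
    a.reshape((10, 10))
except ValueError as err:
    print("reshape failed:", err)
```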


One very important habit in numpy is to use its vectorization and broadcasting capabilities as often as possible. Under the hood, numpy is implemented largely in C, so a numpy function applied to a whole array runs its loop in compiled code rather than in the Python interpreter. Numpy functions accept arrays as inputs, and this is almost always faster than using, say, a Python for loop to compute each element. For example, let's make an array of 1000000 evenly spaced points from $-\pi$ to $\pi$ and compute the cosine of it.

In [36]:
a = np.linspace(-np.pi,np.pi,1000000)

In [53]:
%%timeit
b = []
for i in a:
    b.append(np.cos(i))
b = np.array(b)

115 ms ± 1.44 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [54]:
%%timeit
b = np.cos(a)

1.26 ms ± 15.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


Doing the vectorized operation in numpy is almost 100x faster! This works for arrays of any shape. Let's take, for example, a plane built with `np.meshgrid` (here we keep only its first output, the matrix of x-coordinates):

In [58]:
a = np.linspace(-np.pi, np.pi, 1000)
b = np.linspace(-np.pi, np.pi, 1000)

plane = np.meshgrid(a,b)[0]
print("Meshgrid makes this a matrix of size 1000x1000")
print(plane)


Meshgrid makes this a matrix of size 1000x1000
[[-3.14159265 -3.13530318 -3.1290137  ...  3.1290137   3.13530318
   3.14159265]
 [-3.14159265 -3.13530318 -3.1290137  ...  3.1290137   3.13530318
   3.14159265]
 [-3.14159265 -3.13530318 -3.1290137  ...  3.1290137   3.13530318
   3.14159265]
 ...
 [-3.14159265 -3.13530318 -3.1290137  ...  3.1290137   3.13530318
   3.14159265]
 [-3.14159265 -3.13530318 -3.1290137  ...  3.1290137   3.13530318
   3.14159265]
 [-3.14159265 -3.13530318 -3.1290137  ...  3.1290137   3.13530318
   3.14159265]]
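
As a sketch of vectorization on such a plane, the example below evaluates a function on every grid point at once and then reproduces the result with broadcasting. The function $\cos(x)\sin(y)$ and the names `X`, `Y`, `Z`, and `Z_bc` are arbitrary choices for illustration.

```python
import numpy as np

x = np.linspace(-np.pi, np.pi, 1000)
y = np.linspace(-np.pi, np.pi, 1000)

# meshgrid returns one coordinate matrix per input.
X, Y = np.meshgrid(x, y)
Z = np.cos(X) * np.sin(Y)    # evaluated on all 1000 x 1000 points at once

# Broadcasting gives the same result without building X and Y explicitly:
# a (1000, 1) column combined with a (1000,) row yields a (1000, 1000) array.
Z_bc = np.sin(y)[:, np.newaxis] * np.cos(x)

print(Z.shape, np.allclose(Z, Z_bc))   # (1000, 1000) True
```

The broadcasting version avoids allocating the two full coordinate matrices, which matters more as the grid grows.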
