# 2 Numpy

---

> Author: <font color='#f78c40'>Samuel Farrens</font>    
> Year: 2017  
> Email: [samuel.farrens@gmail.com](mailto:samuel.farrens@gmail.com)  
> Website: <a href="https://sfarrens.github.io" target="_blank">https://sfarrens.github.io</a>

---

<a href="http://www.numpy.org/" target="_blank">Numpy</a> is a numerical python package for scientific computing. If you only ever install one Python package this should be it.

The website contains extensive documentation on all the available modules, with examples. This notebook barely skims the surface of what you can do with numpy, but I have provided a few examples to help get you started.

---

## Contents

1. [Installation](#Installation)
1. [Importing Numpy](#Importing-Numpy)
1. [Arrays](#Arrays)
 * [The Basics](#The-Basics)
 * [Multidimensional Arrays](#Multidimensional-Arrays)
 * [Mathematical Operations](#Mathematical-Operations)
1. [Random Numbers](#Random-Numbers)
1. [Reading and Writing Files](#Reading-and-Writing-Files)
 * [ASCII Files](#ASCII-Files)
 * [Binary Files](#Binary-Files)
1. [Linear Algebra](#Linear-Algebra)
1. [Exercises](#Exercises)
 * [Exercise 2.1](#Exercise-2.1)
 * [Exercise 2.2](#Exercise-2.2)

---

## Installation

To install numpy simply run the following command in a terminal (providing you have already set up pip)

```bash

$ pip install numpy

```

---

## Importing Numpy

To import numpy use the **`import`** command as follows

In [1]:
import numpy

It is more convenient to assign the numpy package contents to an alias to avoid having longer expressions.

In [2]:
import numpy as np

In this example the **`as`** keyword binds the numpy package to the shorter name `np`.

---

## Arrays

### The Basics

The most essential numpy object is the numpy array (<a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.array.html" target="_blank">numpy.ndarray</a>).

In [3]:
# a is a list
a = [1, 2, 3, 4]
print('a is', type(a))

# b is a numpy array
b = np.array(a)
print('b is', type(b))

a is <class 'list'>
b is <class 'numpy.ndarray'>


The **`np.array()`** command converts the list `a` to a numpy array `b`.

Numpy arrays have several fantastic properties. Some basic examples are shown below.

For a given array:

In [4]:
# a is 1D numpy array
a = np.array([4, 5, 6, 8, 7, 5, 2, 1, 0, 4, 0, 6, 5, 4, 2, 4, 1, 7, 1, 4])

# print a
print('a =', a)

a = [4 5 6 8 7 5 2 1 0 4 0 6 5 4 2 4 1 7 1 4]


You can print a given element of the array.

In [5]:
# print the element of a at index 10 (the 11th element, since indexing starts at 0)
print('the element of a at index 10 is', a[10])

the element of a at index 10 is 0


You can also index from the end of the array.

In [6]:
# print the last element of a
print('the last element of a is', a[-1])

# print the second to last element of a
print('the second to last element of a is', a[-2])

the last element of a is 4
the second to last element of a is 1


You can specify every element in an array with a colon **`:`**. This can also be used for *slicing* (*i.e.* to specify every element before or after a certain point).

In [7]:
# print every element in a
print('a =', a[:])

# print every element before index 10
print('first half of elements in a are', a[:10])

# print every element from index 10 onwards
print('second half of elements in a are', a[10:])

# print every element from index 2 up to, but not including, index 6
print('the 3rd through 6th elements of a are', a[2:6])

a = [4 5 6 8 7 5 2 1 0 4 0 6 5 4 2 4 1 7 1 4]
first half of elements in a are [4 5 6 8 7 5 2 1 0 4]
second half of elements in a are [0 6 5 4 2 4 1 7 1 4]
the 3rd through 6th elements of a are [6 8 7 5]
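
As a minimal sketch of the slicing rule used above: a slice `a[start:stop]` includes index `start` and excludes index `stop`. The shortened array and the split index `k` below are chosen purely for illustration.

```python
import numpy as np

a = np.array([4, 5, 6, 8, 7, 5, 2, 1, 0, 4])

# a[start:stop] includes index `start` but excludes index `stop`
middle = a[2:6]
print('indices 2 to 5:', middle)

# splitting at the same index k gives two pieces that cover the whole array
k = 4  # illustrative split point
head, tail = a[:k], a[k:]
print('pieces cover a:', head.size + tail.size == a.size)
```

Because the stop index is excluded, `a[:k]` and `a[k:]` never share an element.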


You can print every $i^{th}$ element using a double set of colons **`::`**

In [8]:
# print every 5th element
print('every fifth element of a is', a[::5])

every fifth element of a is [4 5 0 4]


You can also use double colons to reverse the entire array.

In [9]:
# print a backwards
print('a backwards is', a[::-1])

a backwards is [4 1 7 1 4 2 4 5 6 0 4 0 1 2 5 7 8 6 5 4]


It is possible to subsample an array using a list of indices.

In [10]:
# extract elements 1 3 11 and 18 from a
print('elements at indices 1, 3, 11 and 18 of a are', a[[1, 3, 11, 18]])

elements at indices 1, 3, 11 and 18 of a are [5 8 6 1]


You can also select items using a boolean mask, *i.e.* an array of `True`/`False` values produced by a condition.

In [11]:
print('the elements of a greater than 3 are', a[a > 3])
print('the elements of a excluding 5 are', a[a != 5])

the elements of a greater than 3 are [4 5 6 8 7 5 4 6 5 4 4 7 4]
the elements of a excluding 5 are [4 6 8 7 2 1 0 4 0 6 4 2 4 1 7 1 4]
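
Conditions can also be combined. The sketch below assumes a shortened version of the array and shows the element-wise operators `&` (and), `|` (or) and `~` (not). Each condition needs its own parentheses.

```python
import numpy as np

a = np.array([4, 5, 6, 8, 7, 5, 2, 1, 0, 4])

# a comparison returns a boolean array with the same shape as a
mask = a > 3
print('mask =', mask)

# combine conditions element-wise with & (and); note the parentheses
between = a[(a > 3) & (a < 7)]
print('elements greater than 3 and less than 7 are', between)

# ~ inverts a mask, | keeps elements that satisfy either condition
print('elements not greater than 3 are', a[~mask])
print('elements equal to 0 or 8 are', a[(a == 0) | (a == 8)])

# True counts as 1, so summing a mask counts the matches
n_matches = int(np.sum(mask))
print('number of elements greater than 3:', n_matches)
```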


You can check if any or all elements in an array have certain properties using the **`any()`** and **`all()`** commands.

In [12]:
print('are there any elements in a greater than 10?', np.any(a > 10))
print('are all of the elements in a less than 10?', np.all(a < 10))

are there any elements in a greater than 10? False
are all of the elements in a less than 10? True


There are several built-in functions for calculating the properties of Numpy arrays.

In [13]:
print('a has size', a.size)
print('a has mean', a.mean())
print('a has standard deviation', a.std())

a has size 20
a has mean 3.8
a has standard deviation 2.3579652245103193


In this example the size of the numpy array is displayed using the **`size`** attribute. The mean and standard deviation of the array elements are calculated using the **`mean()`** and **`std()`** methods, respectively.
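
To make the two statistics concrete, the sketch below recomputes them by hand. By default **`std()`** divides by the number of elements $N$ (the population standard deviation); passing `ddof=1` divides by $N - 1$ instead.

```python
import numpy as np

a = np.array([4, 5, 6, 8, 7, 5, 2, 1, 0, 4, 0, 6, 5, 4, 2, 4, 1, 7, 1, 4])

# mean: sum of the elements divided by the number of elements
mean = a.sum() / a.size

# population standard deviation: root of the mean squared deviation
std = np.sqrt(((a - mean) ** 2).sum() / a.size)

# sample standard deviation: divide by N - 1 instead of N
sample_std = np.sqrt(((a - mean) ** 2).sum() / (a.size - 1))

print('manual mean matches a.mean():', np.isclose(mean, a.mean()))
print('manual std matches a.std():', np.isclose(std, a.std()))
print('sample std matches a.std(ddof=1):', np.isclose(sample_std, a.std(ddof=1)))
```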

The **`where()`** command can be used to find the indices of the elements that satisfy a given condition.

In [14]:
print('a = 8 at index', np.where(a == 8)[0])

a = 8 at index [3]
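
The `[0]` in the example above is needed because, when called with a single argument, **`where()`** returns a tuple containing one index array per dimension. A short sketch with a shortened array, which also shows the three-argument form that chooses between two values element by element:

```python
import numpy as np

a = np.array([4, 5, 6, 8, 7, 5, 2, 1, 0, 4])

# one argument: a tuple with one array of indices per dimension
indices = np.where(a == 5)
print('a = 5 at indices', indices[0])

# three arguments: take 5 where the condition holds, otherwise keep a
capped = np.where(a > 5, 5, a)
print('a capped at 5 is', capped)
```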


Numpy includes several nice ways to quickly generate arrays using the commands **`zeros()`**, **`ones()`** and **`arange()`**.

In [15]:
print('- Generate an array of zeros.')
print(' ', np.zeros(10))
print('')

print('- Generate an array of ones.')
print(' ', np.ones(10))
print('')

print('- Generate a range of integers from 0 to 9.')
print(' ', np.arange(10))
print('')

- Generate an array of zeros.
  [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]

- Generate an array of ones.
  [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

- Generate a range of integers from 0 to 9.
  [0 1 2 3 4 5 6 7 8 9]
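
These commands accept further arguments. The sketch below is not an exhaustive list: it shows a step size for **`arange()`**, a shape tuple and a `dtype` for **`zeros()`** and **`ones()`**, and the related **`linspace()`** command for evenly spaced floats.

```python
import numpy as np

# arange(start, stop, step): the stop value is excluded
evens = np.arange(0, 10, 2)
print('even integers below 10:', evens)

# linspace(start, stop, num): num evenly spaced values, stop included
grid = np.linspace(0.0, 1.0, 5)
print('5 values from 0 to 1:', grid)

# a tuple gives a multidimensional shape
zeros_2d = np.zeros((2, 3))
print('zeros with shape (2, 3):')
print(zeros_2d)

# dtype controls the element type (the default is float)
int_ones = np.ones(4, dtype=int)
print('integer ones:', int_ones)
```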



### Multidimensional Arrays

The **`ndim`** attribute gives the number of dimensions of a given array, while the **`shape`** attribute gives the size of each dimension.

In [16]:
# a is a 2D numpy array
a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print('a has', a.ndim, 'dimensions')
print('the shape of a is', a.shape)

a has 2 dimensions
the shape of a is (3, 3)


**<font color='red'>NOTE:</font>** notice the extra set of square brackets!

It is also possible to change the shape of an existing array using the **`reshape()`** command.

In [17]:
# x is a 1D array
x = np.arange(16)
print('x =', x)
print('')

# x is now a 2D array
x = x.reshape(4, 4)
print('x =', x)

x = [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15]

x = [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]]
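
Two details of **`reshape()`** are worth a quick sketch: one dimension can be given as `-1`, in which case numpy infers it from the array size, and the new shape must hold exactly the same number of elements as the original.

```python
import numpy as np

x = np.arange(12)

# -1 asks numpy to infer this dimension from the total size
y = x.reshape(3, -1)
print('inferred shape:', y.shape)

# the total number of elements must match, otherwise reshape fails
try:
    x.reshape(5, 5)
except ValueError as error:
    print('reshape failed:', error)

# ravel() flattens the array back to 1D
z = y.ravel()
print('z =', z)
```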


All of the indexing options shown for 1D arrays are equally applicable to multidimensional arrays. For example, for a 5x5 array

In [18]:
x = np.arange(25).reshape(5, 5)

print('x =', x)

x = [[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]]


You can specify whole rows or columns.

In [19]:
# first row of x
print('the first row of x is', x[0])

# 3rd column of x
print('the third column of x is', x[:, 2])

the first row of x is [0 1 2 3 4]
the third column of x is [ 2  7 12 17 22]


Or you can specify single elements

In [20]:
# 2nd element in the 4th row of x
print('the second element in the 4th row of x is', x[3, 1])

the second element in the 4th row of x is 16


Slicing also works.

In [21]:
# print every other element in the 1st two rows of x
print('every other element in the first two rows of x are')
print(x[:2, ::2])

every other element in the first two rows of x are
[[0 2 4]
 [5 7 9]]


There are several handy numpy commands for transforming multidimensional arrays. The command **`rot90()`** rotates a matrix by 90 degrees; its second argument sets the number of 90 degree rotations to apply.

In [22]:
# rotate x by 180 degrees
print('x rotated by 180 degrees is')
print(np.rot90(x, 2))

x rotated by 180 degrees is
[[24 23 22 21 20]
 [19 18 17 16 15]
 [14 13 12 11 10]
 [ 9  8  7  6  5]
 [ 4  3  2  1  0]]


The command **`transpose()`** will transpose an array.

In [23]:
# transpose x
print('x transpose is')
print(np.transpose(x))

x transpose is
[[ 0  5 10 15 20]
 [ 1  6 11 16 21]
 [ 2  7 12 17 22]
 [ 3  8 13 18 23]
 [ 4  9 14 19 24]]
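
A brief sketch of how these two commands relate to things already covered: the `.T` attribute is a shorthand for **`transpose()`**, and rotating by 180 degrees is the same as reversing both axes with the `::-1` slicing shown earlier. The small 2x3 array is chosen for illustration only.

```python
import numpy as np

x = np.arange(6).reshape(2, 3)

# .T is a shorthand for np.transpose()
print('x.T equals np.transpose(x):', np.array_equal(x.T, np.transpose(x)))
print('shape of x.T is', x.T.shape)

# the second argument of rot90 counts the number of 90 degree rotations
once = np.rot90(x)
twice = np.rot90(x, 2)
print('two single rotations equal one double rotation:',
      np.array_equal(twice, np.rot90(once)))

# a 180 degree rotation reverses both the rows and the columns
print('180 degree rotation equals x[::-1, ::-1]:',
      np.array_equal(twice, x[::-1, ::-1]))
```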


### Mathematical Operations

One of the most useful features of numpy is that it allows element-wise mathematical operations. It also has built-in <a href="https://docs.scipy.org/doc/numpy/reference/routines.math.html" target="_blank">mathematical operators</a>.

You can multiply every element in an array by a single number.

In [24]:
x = np.arange(5) + 1

print('x =', x)
print('2x =', 2 * x)
print('x^2 =', x ** 2)
print('sqrt(x) =', np.sqrt(x))
print('x * pi =', np.pi * x)
print('log(x) =', np.log(x))

x = [1 2 3 4 5]
2x = [ 2  4  6  8 10]
x^2 = [ 1  4  9 16 25]
sqrt(x) = [1.         1.41421356 1.73205081 2.         2.23606798]
x * pi = [ 3.14159265  6.28318531  9.42477796 12.56637061 15.70796327]
log(x) = [0.         0.69314718 1.09861229 1.38629436 1.60943791]
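
To see why element-wise behaviour matters, compare it with a plain Python list, where `*` repeats the list rather than multiplying its contents. A minimal sketch:

```python
import numpy as np

values = [1, 2, 3]

# for a list, * repeats the contents
doubled_list = values * 2
print('list * 2 =', doubled_list)

# for a numpy array, * multiplies every element
doubled_array = np.array(values) * 2
print('array * 2 =', doubled_array)
```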


You can also combine two arrays of the same size element by element, for example by adding each element in one array to the corresponding element in the other.

In [25]:
y = np.array([7, 9, 2, 3, 2])
print('y =', y)

# check if x and y have the same size
print('x is the same size as y:', x.size == y.size)

# add the elements of x to those of y
print('x + y =', x + y)

y = [7 9 2 3 2]
x is the same size as y: True
x + y = [ 8 11  5  7  7]


This also works for multidimensional arrays with the same shape.

In [26]:
a = np.ones((2, 2)).astype(int) * 2
b = np.arange(4).reshape(2, 2)
print('a =', a)
print('b =', b)
print('')

# check if a and b have the same shape
print('a has the same shape as b:', a.shape == b.shape)
print('')

# multiply the elements of a by those of b
print('a * b =')
print(a * b)

a = [[2 2]
 [2 2]]
b = [[0 1]
 [2 3]]

a has the same shape as b: True

a * b =
[[0 2]
 [4 6]]


---

## Random Numbers

The numpy **`random`** module (<a href="https://docs.scipy.org/doc/numpy/reference/routines.random.html" target="_blank">numpy.random</a>) contains various functions for generating random numbers.

In [27]:
print('- Generate an array of 10 random integers with values from 2 to 8 (the upper bound 9 is excluded).')
print(' ', np.random.randint(2, 9, 10))
print('')

print('- Generate an array of 5 random floats with values between 0 and 1.')
print(' ', np.random.ranf(5))
print('')

print('- Generate an array of 5 random floats drawn from a standard normal distribution (mean 0, standard deviation 1).')
print(' ', np.random.randn(5))
print('')

print('- Generate an array of 5 Poisson distributed random integers.')
print(' ', np.random.poisson(1, 5))
print('')

print('- Generate a 2D array of random numbers.')
print(' ', np.random.rand(5, 5))
print('')

- Generate an array of 10 random integers with values from 2 to 8 (the upper bound 9 is excluded).
  [4 4 3 2 2 5 4 8 6 6]

- Generate an array of 5 random floats with values between 0 and 1.
  [0.58909463 0.98591145 0.83243917 0.22774627 0.43334477]

- Generate an array of 5 random floats drawn from a standard normal distribution (mean 0, standard deviation 1).
  [-0.64805821 -1.55263507 -1.02854768  0.47775267 -1.1332579 ]

- Generate an array of 5 Poisson distributed random integers.
  [0 2 0 0 0]

- Generate a 2D array of random numbers.
  [[0.27137778 0.66224629 0.6642498  0.31155691 0.35842392]
 [0.84461998 0.9093611  0.5619712  0.5350885  0.82133144]
 [0.21377067 0.42419968 0.82904981 0.04087623 0.12682881]
 [0.83874192 0.86176975 0.16945293 0.4653733  0.6062168 ]
 [0.37718149 0.40608579 0.47345515 0.47165235 0.39217713]]
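
The numbers above change every time the cell is run. If you need reproducible results you can fix the random seed. The sketch below uses an arbitrary seed value of 0 and also shows the newer generator interface, **`default_rng()`**, which is available in numpy 1.17 and later.

```python
import numpy as np

# seeding the global generator makes the sequence repeatable
np.random.seed(0)
first = np.random.rand(3)

np.random.seed(0)
second = np.random.rand(3)

print('same seed gives the same numbers:', np.array_equal(first, second))

# newer interface: an independent generator object with its own seed
rng = np.random.default_rng(0)

# like randint, integers() excludes the upper bound by default
sample = rng.integers(2, 9, 10)
print('smallest and largest allowed values are 2 and 8:',
      sample.min() >= 2 and sample.max() <= 8)
```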



---

## Reading and Writing Files

Numpy provides some advantages over native Python in terms of reading and writing files.

> If you have not already done so, you will need to download the [Materials](https://minhaskamal.github.io/DownGit/#/home?url=https://github.com/sfarrens/notebooks/tree/master/Python/materials) directory and unzip it.  
> You can use the following command in a terminal:  
> ``` bash
> $ unzip materials.zip
> ```  
> **<font color='red'>NOTE:</font>** you need to place the `Materials` folder in the same directory as this notebook.

### ASCII Files

ASCII (plain text) files can be read using the **`genfromtxt()`** command (<a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.genfromtxt.html" target="_blank">numpy.genfromtxt</a>) as follows

In [28]:
# Read in the file
data = np.genfromtxt('materials/ninja_turtles.txt', names=True, dtype=None)

# Print the column names
print(data.dtype.names)
print('')

# Print the rows
for line in data:
    print(line)

print('')

# Calculate the average age
print('The average age is', data['Age'].mean())

('Name', 'Age', 'Colour', 'Weapon')

(b'Leonardo', 15, b'blue', b'Katana')
(b'Raphael', 15, b'Red', b'Sai')
(b'Donatello', 14, b'Purple', b'Bo-staff')
(b'Michelangelo', 13, b'Orange', b'Nunchucks')

The average age is 14.25


  


In the example above a file called `ninja_turtles.txt`, which is in the directory `materials`, is read in. Setting the **`names`** option to `True` tells the command to take the column names from the first row of the file and store them in `dtype.names`. Setting the **`dtype`** to `None` allows the command to assign the appropriate data type to each column in the file.
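
If you do not have the materials directory to hand, the same options can be tried on an in-memory file. In the sketch below the table contents are made up for illustration, and the `encoding` option (assumed to be available, numpy 1.14 or later) is set so that text columns are read as ordinary strings rather than bytes.

```python
import io
import numpy as np

# made-up file contents, used only for illustration
text = io.StringIO(
    'Name Age Colour\n'
    'Alice 40 brown\n'
    'Bob 25 yellow\n'
)

# names=True takes the column names from the first row,
# dtype=None lets numpy choose a data type for each column
data = np.genfromtxt(text, names=True, dtype=None, encoding='utf-8')

print(data.dtype.names)
print('The average age is', data['Age'].mean())

# columns can be combined with the masking shown earlier
print('Names with age above 30:', data['Name'][data['Age'] > 30])
```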

Data can be saved to an ASCII file using the **`savetxt()`** command (<a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.savetxt.html" target="_blank">numpy.savetxt</a>) as follows

In [29]:
# create a new string
string = 'average turtle age is ' + str(data['Age'].mean())

# save the string to a file
np.savetxt('materials/turtle_age.txt', [string], fmt='%s')

In this example we need to pass the string as a list (a numpy array would also work) to **`savetxt()`**. Also, we need to define the format of the data using the **`fmt`** option.
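
The **`savetxt()`** command is more commonly used for numerical data. The following sketch writes a small made-up table to a temporary directory (so that nothing is left behind) and reads it back with **`loadtxt()`**. The file name and values are illustrative only.

```python
import os
import tempfile

import numpy as np

# made-up numerical table
table = np.array([[1.0, 2.5], [3.0, 4.5], [5.0, 6.5]])

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'table.txt')

    # fmt sets the number format, header adds a commented first line
    np.savetxt(path, table, fmt='%.2f', header='x y')

    # loadtxt skips lines that start with '#'
    restored = np.loadtxt(path)

print('restored table matches the original:', np.allclose(table, restored))
```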

### Binary Files

Numpy arrays can also be read in or saved to special numpy binary files (`.npy` extension) using the **`load()`** and **`save()`** commands (<a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.save.html" target="_blank">numpy.save</a>, <a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.load.html" target="_blank">numpy.load</a>), respectively.

In [30]:
# Read in the numpy binary file
data = np.load('materials/ninja_turtles.npy')

# Print the array contents
print(data)
print('')

# Increase ages by 1
data['Age'] += 1
print(data)

# Save new data to a binary file
np.save('materials/ninja_turtles_new.npy', data)

[(b'Leonardo', 15, b'blue', b'Katana') (b'Raphael', 15, b'Red', b'Sai')
 (b'Donatello', 14, b'Purple', b'Bo-staff')
 (b'Michelangelo', 13, b'Orange', b'Nunchucks')]

[(b'Leonardo', 16, b'blue', b'Katana') (b'Raphael', 16, b'Red', b'Sai')
 (b'Donatello', 15, b'Purple', b'Bo-staff')
 (b'Michelangelo', 14, b'Orange', b'Nunchucks')]


Numpy binaries have the fantastic property of retaining formatting and data structures. In addition, they can be read and written much faster than text files.
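
A minimal round-trip sketch of this property, using a made-up integer array and a temporary directory: the array that comes back has the same data type and contents as the one that was saved.

```python
import os
import tempfile

import numpy as np

# made-up array with an explicit data type
ages = np.array([15, 15, 14, 13], dtype=np.int64)

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'ages.npy')
    np.save(path, ages)
    restored = np.load(path)

print('data type is preserved:', restored.dtype == ages.dtype)
print('contents are preserved:', np.array_equal(restored, ages))
```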

---

## Linear Algebra

Numpy also handles linear algebra (<a href="https://docs.scipy.org/doc/numpy/reference/routines.linalg.html">numpy.linalg</a>). For example, you can multiply two matrices using the **`dot()`** command.

In [31]:
x = np.random.randint(0, 9, (3, 3))
print('x =')
print(x)
print('')

y = np.random.randint(0, 9, (3, 3))
print('y =')
print(y)
print('')

print('xy =')
print(np.dot(x, y))

x =
[[2 4 6]
 [0 5 3]
 [3 8 4]]

y =
[[5 8 4]
 [3 0 6]
 [3 4 7]]

xy =
[[40 40 74]
 [24 12 51]
 [51 40 88]]
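
Note that the matrix product is not the same as the element-wise product from the previous section. The sketch below reuses the two matrices printed above and also shows the `@` operator (Python 3.5 or later), which gives the same result as **`dot()`** for 2D arrays.

```python
import numpy as np

x = np.array([[2, 4, 6], [0, 5, 3], [3, 8, 4]])
y = np.array([[5, 8, 4], [3, 0, 6], [3, 4, 7]])

matrix_product = np.dot(x, y)
elementwise_product = x * y

# @ is the matrix multiplication operator
print('x @ y equals np.dot(x, y):', np.array_equal(x @ y, matrix_product))

# * multiplies matching elements, which is a different operation
print('x * y equals np.dot(x, y):',
      np.array_equal(elementwise_product, matrix_product))
```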


You can also calculate the inverse of a matrix

In [32]:
a = np.array([[1., 2.], [3., 4.]])
print('a =')
print(a)
print('')

print('a^-1 =')
print(np.linalg.inv(a))

a =
[[1. 2.]
 [3. 4.]]

a^-1 =
[[-2.   1. ]
 [ 1.5 -0.5]]
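
A quick way to check an inverse is to multiply it by the original matrix, which should give the identity matrix up to rounding error. The sketch below does this and also uses **`np.linalg.solve()`** to solve a linear system directly. The right-hand side `b` is made up for illustration.

```python
import numpy as np

a = np.array([[1., 2.], [3., 4.]])
a_inv = np.linalg.inv(a)

# a multiplied by its inverse should be the identity matrix
print('a a^-1 = I:', np.allclose(a @ a_inv, np.eye(2)))

# solve a x = b without forming the inverse explicitly
b = np.array([5., 6.])  # illustrative right-hand side
solution = np.linalg.solve(a, b)
print('solution =', solution)
print('a x = b:', np.allclose(a @ solution, b))
```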


or you can calculate the Frobenius norm of a matrix (the $l_2$-norm of its elements) using the **`norm`** command

In [33]:
x = np.random.ranf((3, 3))
print('x =', x)
print('')
print('||x|| =', np.linalg.norm(x))

x = [[0.65653567 0.44407602 0.95817188]
 [0.21299093 0.97204924 0.45534021]
 [0.25526379 0.9636281  0.45501517]]

||x|| = 1.9861251285369725
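
For a 2D array, the default behaviour of **`norm`** is the square root of the sum of the squared elements. The sketch below checks this by hand on a small made-up matrix and contrasts it with the result of passing `2` as the second argument, which instead returns the largest singular value.

```python
import numpy as np

# made-up diagonal matrix, so the norms are easy to verify by hand
x = np.array([[3., 0.], [0., 4.]])

# default for a 2D array: square root of the sum of squared elements
default_norm = np.linalg.norm(x)
manual_norm = np.sqrt((x ** 2).sum())
print('default norm =', default_norm)
print('manual calculation agrees:', np.isclose(default_norm, manual_norm))

# second argument 2: the largest singular value of the matrix
spectral_norm = np.linalg.norm(x, 2)
print('norm with ord=2 =', spectral_norm)
```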


---

## Exercises

### Exercise 2.1

Read in the file called `nearest_stars.txt` in the directory `materials` and determine the standard deviation of the distances of the stars with spectral type `M`.

In [34]:
# Implement your code here

### Exercise 2.2

Generate a three dimensional array of random floats with shape (3, 4, 4) using a random seed of 8. For each 4 x 4 matrix in the array calculate the $l_1$-norm and then calculate the mean of these values.

**HINT:** Make sure you check the documentation pages for commands that are not mentioned in this notebook.

In [35]:
# Implement your code here

You can find some example answers to the exercises [here](./Answers-to-exercises.ipynb).

---

> **Continue to [next topic](./Matplotlib.ipynb)**