# 2 Numpy

---

> Author: <font color='#f78c40'>Samuel Farrens</font>    
> Year: 2017  
> Email: [samuel.farrens@gmail.com](mailto:samuel.farrens@gmail.com)  
> Website: <a href="https://sfarrens.github.io" target="_blank">https://sfarrens.github.io</a>

---

<a href="http://www.numpy.org/" target="_blank">Numpy</a> is a numerical Python package for scientific computing. The website contains extensive documentation on all the modules available. If you only ever install one Python package, this should be it.

---

## Contents

1. [Installation](#Installation)
1. [Importing Numpy](#Importing-Numpy)
1. [Arrays](#Arrays)
 * [The Basics](#The-Basics)
 * [Multidimensional Arrays](#Multidimensional-Arrays)
 * [Mathematical Operations](#Mathematical-Operations)
1. [Random Numbers](#Random-Numbers)
1. [Reading and Writing Files](#Reading-and-Writing-Files)
 * [ASCII Files](#ASCII-Files)
 * [Binary Files](#Binary-Files)
1. [Linear Algebra](#Linear-Algebra)
1. [Exercises](#Exercises)
 * [Exercise 2.1](#Exercise-2.1)
 * [Exercise 2.2](#Exercise-2.2)

---

## Installation

To install numpy simply run the following command in a terminal (providing you have already set up pip)

```bash

    $ pip install numpy

```

---

## Importing Numpy

To import numpy use the **`import`** command as follows

In [1]:
import numpy

It is more convenient to give the numpy package a shorter alias, which avoids long expressions.

In [2]:
import numpy as np

In this example the **`as`** keyword binds the numpy package to the name `np`.

---

## Arrays

### The Basics

The most essential numpy object is the numpy array (<a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.array.html" target="_blank">numpy.ndarray</a>).

In [3]:
a = [1, 2, 3, 4]
print 'a is', type(a)

b = np.array(a)
print 'b is', type(b)

a is <type 'list'>
b is <type 'numpy.ndarray'>


The **`np.array()`** command converts the list `a` to a numpy array `b`.

Numpy arrays have several fantastic properties. Some basic examples are the following

In [4]:
a = np.array([4, 5, 6, 8, 7, 5, 2, 1, 0, 4, 0, 6, 5, 4, 2, 1, 1, 7, 4, 4])
print 'a =', a
print 'a backwards is', a[::-1]
print 'a has size', a.size
print 'a has mean', a.mean()
print 'a has standard deviation', a.std()
print 'the 3rd through 6th elements of a are', a[2:6]
print 'the last element of a is', a[-1]
print 'the elements of a greater than 3 are', a[a > 3]
print 'the elements of a excluding 5 are', a[a != 5]

a = [4 5 6 8 7 5 2 1 0 4 0 6 5 4 2 1 1 7 4 4]
a backwards is [4 4 7 1 1 2 4 5 6 0 4 0 1 2 5 7 8 6 5 4]
a has size 20
a has mean 3.8
a has standard deviation 2.35796522451
the 3rd through 6th elements of a are [6 8 7 5]
the last element of a is 4
the elements of a greater than 3 are [4 5 6 8 7 5 4 6 5 4 7 4 4]
the elements of a excluding 5 are [4 6 8 7 2 1 0 4 0 6 4 2 1 1 7 4 4]


In this example the size of the numpy array is obtained from the **`size`** attribute. The mean and standard deviation of the array elements are calculated using the **`mean()`** and **`std()`** methods, respectively.
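The selections `a[a > 3]` and `a[a != 5]` above rely on boolean masks. The following is a minimal sketch, using a shortened copy of the array and the illustrative names `mask`, `selected` and `manual_std`, of what happens in two steps. It also checks that `std()` matches the population standard deviation computed by hand, which is the default behaviour.

```python
import numpy as np

a = np.array([4, 5, 6, 8, 7, 5, 2, 1, 0, 4])

# A comparison returns a boolean array with one entry per element
mask = a > 3
print(mask)

# Indexing with the mask keeps only the elements where it is True
selected = a[mask]
print(selected)

# By default std() is the population standard deviation (ddof=0)
manual_std = np.sqrt(((a - a.mean()) ** 2).mean())
print(np.isclose(manual_std, a.std()))
```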

Numpy includes several nice ways to quickly generate arrays using the commands **`zeros()`**, **`ones()`** and **`arange()`**.

In [5]:
print '- Generate an array of zeros.'
print ' ', np.zeros(10)
print ''

print '- Generate an array of ones.'
print ' ', np.ones(10)
print ''

print '- Generate a range of integers from 0 to 9.'
print ' ', np.arange(10)
print ''

- Generate an array of zeros.
  [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]

- Generate an array of ones.
  [ 1.  1.  1.  1.  1.  1.  1.  1.  1.  1.]

- Generate a range of integers from 0 to 9.
  [0 1 2 3 4 5 6 7 8 9]
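
These commands accept a few more arguments than shown above. As a brief sketch (the variable names `evens` and `grid` are only illustrative), `arange()` can take a start, stop and step, while `zeros()` and `ones()` can take a shape tuple and a `dtype`.

```python
import numpy as np

# arange(start, stop, step): the stop value itself is excluded
evens = np.arange(2, 11, 2)
print(evens)

# zeros() and ones() accept a shape tuple and an optional dtype
grid = np.zeros((2, 3), dtype=int)
print(grid)
```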



### Multidimensional Arrays

The **`ndim`** attribute gives the number of dimensions of an array, while the **`shape`** attribute gives the size of each dimension.

In [6]:
a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print 'a has', a.ndim, 'dimensions'
print 'the shape of a is', a.shape

a has 2 dimensions
the shape of a is (3, 3)


It is also possible to change the shape of an existing array using the **`reshape()`** command.

In [7]:
x = np.arange(16)
print 'x =', x
print ''

x = x.reshape(4, 4)
print 'x =', x

x = [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15]

x = [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]]
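
Two details of `reshape()` are worth knowing. The sketch below, with illustrative names `x` and `y`, shows that one dimension can be given as `-1` so that numpy infers it, and that a new shape which does not preserve the total number of elements raises a `ValueError`.

```python
import numpy as np

x = np.arange(12)

# -1 asks numpy to infer that dimension from the array size
y = x.reshape(3, -1)
print(y.shape)

# The new shape must contain the same number of elements
try:
    x.reshape(5, 5)
except ValueError as err:
    print('reshape failed: {0}'.format(err))
```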


### Mathematical Operations

One of the most useful features of numpy is that it allows element-wise mathematical operations.

In [8]:
x = np.arange(5)
print 'x =', x
print '2x =', 2 * x
print 'x^2 =', x ** 2
print 'sqrt(x) =', np.sqrt(x)
print 'x * pi =', np.pi * x
print ''

y = np.array([7, 9, 2, 3, 2])
print 'y =', y
print 'x + y =', x + y
print ''

a = np.ones((2, 2)).astype(int) * 2
b = np.arange(4).reshape(2, 2)
print 'a ='
print a
print 'b ='
print b
print 'a * b ='
print a * b

x = [0 1 2 3 4]
2x = [0 2 4 6 8]
x^2 = [ 0  1  4  9 16]
sqrt(x) = [ 0.          1.          1.41421356  1.73205081  2.        ]
x * pi = [  0.           3.14159265   6.28318531   9.42477796  12.56637061]

y = [7 9 2 3 2]
x + y = [ 7 10  4  6  6]

a =
[[2 2]
 [2 2]]
b =
[[0 1]
 [2 3]]
a * b =
[[0 2]
 [4 6]]
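
To see why element-wise operations matter, it helps to compare a numpy array with a native Python list. This is only a small illustration with made-up variable names: multiplying a list by 2 repeats the list, so an explicit loop is needed to double each value, whereas the array does this directly.

```python
import numpy as np

x_list = [0, 1, 2, 3, 4]
x_array = np.array(x_list)

# For a list, * repeats the sequence
doubled_list = 2 * x_list
print(doubled_list)

# For an array, * acts on every element
doubled_array = 2 * x_array
print(doubled_array)

# The list equivalent requires an explicit loop
looped = [2 * value for value in x_list]
print(looped)
```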


---

## Random Numbers

The numpy **`random`** module (<a href="https://docs.scipy.org/doc/numpy/reference/routines.random.html" target="_blank">numpy.random</a>) contains various functions for generating random numbers.

In [9]:
print '- Generate an array of 10 random integers with values from 2 to 8.'
print ' ', np.random.randint(2, 9, 10)
print ''

print '- Generate an array of 5 random floats with values between 0 and 1.'
print ' ', np.random.ranf(5)
print ''

print '- Generate an array of 5 Gaussian distributed random floats with mean 0 and standard deviation 1.'
print ' ', np.random.randn(5)
print ''

print '- Generate an array of 5 Poisson distributed random integers.'
print ' ', np.random.poisson(1, 5)
print ''


print '- Generate a 2D array of random numbers.'
print ' ', np.random.rand(5, 5)
print ''

- Generate an array of 10 random integers with values from 2 to 8.
  [2 4 7 8 2 3 7 2 4 6]

- Generate an array of 5 random floats with values between 0 and 1.
  [ 0.83763648  0.93169243  0.7588618   0.31003726  0.91353357]

- Generate an array of 5 Gaussian distributed random floats with mean 0 and standard deviation 1.
  [-0.75811755 -0.21855471 -0.5868432   0.29183421  0.81256019]

- Generate an array of 5 Poisson distributed random integers.
  [0 0 2 1 0]

- Generate a 2D array of random numbers.
  [[ 0.15594737  0.09846457  0.1642568   0.13904037  0.89445259]
 [ 0.58365273  0.397344    0.77478171  0.54908231  0.30587479]
 [ 0.35858792  0.52544045  0.5478804   0.79765923  0.01583088]
 [ 0.84271475  0.12152521  0.0023797   0.66132027  0.43793267]
 [ 0.92009158  0.81910957  0.78104538  0.7348352   0.91265858]]
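
The numbers printed above change on every run. As a minimal sketch (the seed value 42 and the variable names are arbitrary choices), fixing the seed with `np.random.seed()` makes the sequence reproducible. The same example checks two properties of the functions used above: the upper limit passed to `randint()` is excluded, and `randn()` samples are not confined to the range -1 to 1.

```python
import numpy as np

# Resetting the same seed reproduces the same random numbers
np.random.seed(42)
first = np.random.rand(3)
np.random.seed(42)
second = np.random.rand(3)
print(np.array_equal(first, second))

# randint(low, high, size) draws from low up to high - 1
draws = np.random.randint(2, 9, 1000)
print('{0} {1}'.format(draws.min(), draws.max()))

# randn() draws from a standard normal distribution, which is unbounded
samples = np.random.randn(10000)
print(np.abs(samples).max() > 1)
```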



---

## Reading and Writing Files

Numpy provides some advantages over native Python in terms of reading and writing files.

### ASCII Files

ASCII (plain text) files can be read using the **`genfromtxt()`** command (<a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.genfromtxt.html" target="_blank">numpy.genfromtxt</a>) as follows

In [73]:
# Read in the file
data = np.genfromtxt('materials/ninja_turtles.txt', names=True, dtype=None)

# Print the column names
print data.dtype.names
print ''

# Print the rows
for line in data:
    print line

print ''

# Calculate the average age
print 'The average age is', data['Age'].mean()

('Name', 'Age', 'Colour', 'Weapon')

('Leonardo', 15, 'blue', 'Katana')
('Raphael', 15, 'Red', 'Sai')
('Donatello', 14, 'Purple', 'Bo-staff')
('Michelangelo', 13, 'Orange', 'Nunchucks')

The average age is 14.25


In the example above, a file called `ninja_turtles.txt` in the directory `materials` is read in. Because the **`names`** option is set to `True`, the first row of the file is used for the column names, which are stored in `dtype.names`. Setting the **`dtype`** to `None` lets the command infer the appropriate data type for each column in the file.
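
For readers without the `materials` directory, the following sketch recreates a stand-in file in a temporary directory from the rows printed above and reads it back. The file name `turtles.txt` and the whitespace-separated layout are assumptions, as the original file is not shown. The `encoding` argument is also an addition here: it requires numpy 1.14 or later and makes the text columns come back as ordinary strings under Python 3.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for materials/ninja_turtles.txt
text = ('Name Age Colour Weapon\n'
        'Leonardo 15 blue Katana\n'
        'Raphael 15 Red Sai\n'
        'Donatello 14 Purple Bo-staff\n'
        'Michelangelo 13 Orange Nunchucks\n')

path = os.path.join(tempfile.mkdtemp(), 'turtles.txt')
with open(path, 'w') as handle:
    handle.write(text)

# names=True takes the column names from the first row,
# dtype=None infers a data type for each column
data = np.genfromtxt(path, names=True, dtype=None, encoding='utf-8')

print(data.dtype.names)
print(data['Age'].mean())
```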

Data can be saved to an ASCII file using the **`savetxt()`** command (<a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.savetxt.html" target="_blank">numpy.savetxt</a>) as follows

In [84]:
string = 'average turtle age is ' + str(data['Age'].mean())

np.savetxt('materials/turtle_age.txt', [string], fmt='%s')
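
`savetxt()` is more commonly used for numerical arrays than for a single string. A minimal round-trip sketch, assuming a temporary file and the illustrative names `ages` and `loaded`, could look like this. The `header` text is written as a comment line, which `genfromtxt()` skips when reading the file back.

```python
import os
import tempfile

import numpy as np

ages = np.array([15, 15, 14, 13])
path = os.path.join(tempfile.mkdtemp(), 'ages.txt')

# Write one integer per line, preceded by a commented header
np.savetxt(path, ages, fmt='%d', header='Age')

# Read the values back; without a dtype they are returned as floats
loaded = np.genfromtxt(path)
print(loaded)
```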

### Binary Files

Numpy arrays can also be read from or saved to numpy binary files (`.npy` extension) using the **`load()`** and **`save()`** commands (<a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.load.html" target="_blank">numpy.load</a>, <a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.save.html" target="_blank">numpy.save</a>), respectively.

In [89]:
# Read in the numpy binary file
data = np.load('materials/ninja_turtles.npy')

# Print the array contents
print data
print ''

# Increase ages by 1
data['Age'] += 1
print data

# Save new data to a binary file
np.save('materials/ninja_turtles_new.npy', data)

[('Leonardo', 15, 'blue', 'Katana') ('Raphael', 15, 'Red', 'Sai')
 ('Donatello', 14, 'Purple', 'Bo-staff')
 ('Michelangelo', 13, 'Orange', 'Nunchucks')]

[('Leonardo', 16, 'blue', 'Katana') ('Raphael', 16, 'Red', 'Sai')
 ('Donatello', 15, 'Purple', 'Bo-staff')
 ('Michelangelo', 14, 'Orange', 'Nunchucks')]
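
One practical difference from ASCII files is that a `.npy` file stores the shape and data type along with the values. The sketch below uses a temporary file and a small integer array (both illustrative choices) to show that an array survives a save and load cycle unchanged.

```python
import os
import tempfile

import numpy as np

original = np.arange(6).reshape(2, 3)
path = os.path.join(tempfile.mkdtemp(), 'example.npy')

# Save the array and read it back
np.save(path, original)
restored = np.load(path)

# Shape, data type and values are all preserved
print(restored.shape)
print(restored.dtype == original.dtype)
print(np.array_equal(original, restored))
```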


---

## Linear Algebra

Numpy also handles linear algebra (<a href="https://docs.scipy.org/doc/numpy/reference/routines.linalg.html">numpy.linalg</a>). For example, you can multiply two matrices using the **`dot()`** command.

In [10]:
x = np.random.randint(0, 9, (3, 3))
print 'x ='
print x
print ''

y = np.random.randint(0, 9, (3, 3))
print 'y ='
print y
print ''

print 'xy ='
print np.dot(x, y)

x =
[[1 6 3]
 [1 8 3]
 [4 0 0]]

y =
[[1 1 2]
 [2 8 8]
 [4 0 3]]

xy =
[[25 49 59]
 [29 65 75]
 [ 4  4  8]]
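
Note that `dot()` is not the same as the element-wise `*` operator shown earlier. As a sketch using the two matrices printed above (the names `product`, `manual` and `elementwise` are illustrative), each entry of the matrix product is the sum over `k` of `x[i, k] * y[k, j]`.

```python
import numpy as np

x = np.array([[1, 6, 3], [1, 8, 3], [4, 0, 0]])
y = np.array([[1, 1, 2], [2, 8, 8], [4, 0, 3]])

# Matrix product
product = np.dot(x, y)
print(product)

# The same result written out as explicit sums
manual = np.array([[sum(x[i, k] * y[k, j] for k in range(3))
                    for j in range(3)]
                   for i in range(3)])
print(np.array_equal(product, manual))

# The element-wise product is a different operation
elementwise = x * y
print(np.array_equal(product, elementwise))
```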


You can also calculate the inverse of a matrix

In [93]:
a = np.array([[1., 2.], [3., 4.]])

print 'a^-1 ='
print np.linalg.inv(a)

a^-1 =
[[-2.   1. ]
 [ 1.5 -0.5]]
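
A quick way to check an inverse is to multiply it by the original matrix, which should give the identity up to floating point rounding. The sketch below does this for the matrix above and also shows, with an illustrative singular matrix, that `inv()` raises `LinAlgError` when no inverse exists.

```python
import numpy as np

a = np.array([[1., 2.], [3., 4.]])
a_inv = np.linalg.inv(a)

# a multiplied by its inverse gives the identity matrix
print(np.allclose(np.dot(a, a_inv), np.eye(2)))

# The second row is twice the first, so this matrix has no inverse
singular = np.array([[1., 2.], [2., 4.]])
try:
    np.linalg.inv(singular)
    inverted = True
except np.linalg.LinAlgError as err:
    inverted = False
    print('inversion failed: {0}'.format(err))
```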


or you can calculate the $l_2$-norm of a matrix using the **`norm()`** command. For a 2D array the default is the Frobenius norm, i.e. the $l_2$-norm of the matrix elements.

In [11]:
x = np.random.ranf((3, 3))
print 'x =', x
print ''
print '||x|| =', np.linalg.norm(x)

x = [[ 0.64758246  0.69747508  0.39679684]
 [ 0.12249095  0.35390599  0.25149235]
 [ 0.91941756  0.92663294  0.33604258]]

||x|| = 1.75604271172
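
As a small check of what `norm()` returns by default for a 2D array, the sketch below compares it with the square root of the sum of the squared elements (the Frobenius norm). It also passes `2` as the second argument, which instead gives the largest singular value. The diagonal matrix is chosen only because both values are easy to verify by hand.

```python
import numpy as np

x = np.array([[3., 0.], [0., 4.]])

# Default for a 2D array: Frobenius norm, sqrt(3^2 + 4^2) = 5
frobenius = np.linalg.norm(x)
by_hand = np.sqrt((x ** 2).sum())
print(np.isclose(frobenius, by_hand))

# ord=2 gives the largest singular value, which is 4 here
spectral = np.linalg.norm(x, 2)
print(spectral)
```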


---

## Exercises

### Exercise 2.1

Read in the file called `nearest_stars.txt` in the directory `materials` and determine the standard deviation of the distances of the stars with spectral type `M`.

In [94]:
# Implement your code here

### Exercise 2.2

Generate a three dimensional array of random floats with shape (3, 4, 4) using a random seed of 8. For each 4 x 4 matrix in the array calculate the $l_1$-norm and then calculate the mean of these values.

**HINT:** Make sure you check the documentation pages for commands that are not mentioned in this notebook.

In [None]:
# Implement your code here