## Numpy is the main Python library for scientific computation
* Numpy provides a new data type, the `array`
* `arrays` are multi-dimensional collections of data of the same intrinsic type (int, float, etc.)
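
A minimal sketch of the "same intrinsic type" point. It assumes two names not introduced above: the `numpy.array` function, which builds an array from a Python list, and the `dtype` attribute, which reports the element type. When a list mixes integers and floats, the integers are converted so that every element shares one type.

```python
import numpy

# Build arrays from Python lists; every element of an array shares one dtype.
ints = numpy.array([1, 2, 3])
mixed = numpy.array([1, 2.5, 3])   # the integers are converted to floats

print(ints.dtype.kind)   # 'i' marks a signed integer type
print(mixed.dtype)       # float64
print(mixed)
```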

## Import numpy before using it
* `numpy` is **not** built in, but is often installed by default.
* Use `import numpy` to import the entire package.
* Use `from numpy import ...` to import some functions.
* Use `import numpy as np` to use the most common alias.

In [2]:
import numpy as np
import numpy
from numpy import cos

print(numpy.cos, np.cos, cos)

<ufunc 'cos'> <ufunc 'cos'> <ufunc 'cos'>


## Use `numpy.zeros` to create arrays filled with zeros

In [4]:
f10 = numpy.zeros(10)
i10 = numpy.zeros(10, dtype=int)
print("default array of zeros: ", f10)
print("integer array of zeros: ", i10)

default array of zeros:  [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
integer array of zeros:  [0 0 0 0 0 0 0 0 0 0]


## Use `numpy.ones` to create an array of ones.

In [5]:
print("Using numpy.ones    : ", numpy.ones(10))

Using numpy.ones    :  [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
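
Both functions also accept a tuple as the shape, which gives a multi-dimensional array. The sketch below is only an illustration; the variable names are arbitrary, and multiplying an array of ones by a constant is just one possible way to fill an array with another value.

```python
import numpy

z = numpy.zeros((2, 3))             # 2 rows, 3 columns of 0.0
o = numpy.ones((2, 3), dtype=int)   # same shape, integer ones
sevens = o * 7                      # scale the ones to get a constant array

print(z.shape, o.dtype.kind)
print(sevens)
```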


## Use `numpy.arange` to generate sequences of numbers
* `arange` takes from one to three arguments. By default it generates numbers starting from 0 with a step of 1.
* `arange(N)` generates numbers from 0 to N-1.
* `arange(M,N)` generates numbers from M to N-1.
* `arange(M,N,P)` generates every Pth number, starting at M and stopping before N.

Generate an array of numbers from 0 to 9

In [32]:
numpy.arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Generate an array of numbers from 1 to 9 (the stop value, 10, is excluded)

In [33]:
numpy.arange(1,10)

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

Generate an array of the odd numbers between 1 and 10

In [34]:
numpy.arange(1,10,2)

array([1, 3, 5, 7, 9])

**Incorrectly** generate an array of odd numbers from 1 to 10, backwards. With a negative step the start must be larger than the stop, so the result is empty.

In [7]:
numpy.arange(1,10,-2)

array([], dtype=int64)

Generate an array of even numbers from 10 to 2, backwards

In [8]:
numpy.arange(10,1,-2)

array([10,  8,  6,  4,  2])
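
One way to predict how many values `arange` returns, including the empty case above, is to compute `ceil((stop - start) / step)` and treat a negative result as zero. The helper `arange_length` below is a hypothetical name used only for this sketch; it is not part of numpy.

```python
import math
import numpy

def arange_length(start, stop, step):
    """Hypothetical helper: number of values arange(start, stop, step) yields."""
    return max(0, math.ceil((stop - start) / step))

# The three calls from the cells above: forwards, wrong direction, backwards.
for args in [(1, 10, 2), (1, 10, -2), (10, 1, -2)]:
    values = numpy.arange(*args)
    print(args, "->", values, "predicted length:", arange_length(*args))
```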

## Numpy arrays have a `size`
* Numpy arrays have a `size` attribute that holds the total number of elements


In [7]:
a = numpy.arange(10)
print("a.size is", a.size)

a.size is 10


## Numpy arrays have a `shape`
* Numpy arrays have a `shape` attribute, a tuple giving the length of each dimension
* The `reshape` method returns an array with the same data arranged in a new shape

In [9]:
a = numpy.arange(10)
print("a's shape is ",a.shape)

b=a.reshape(5,2)
print("b's shape is ",b.shape)

a's shape is  (10,)
b's shape is  (5, 2)
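
A short sketch of two details of `reshape` that the cell above does not show: passing `-1` asks numpy to work out one dimension from the others, and a target shape whose element count differs from `size` raises a `ValueError`. The variable names are reused from the cell above for illustration.

```python
import numpy

a = numpy.arange(10)
b = a.reshape(5, -1)     # -1 lets numpy infer this dimension (2 here)
print(b.shape, b.size)   # the total number of elements is unchanged

try:
    a.reshape(3, 4)      # 3 * 4 = 12, but a holds only 10 elements
except ValueError as err:
    print("reshape failed:", err)
```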


## Numpy arrays can be treated like single numbers in arithmetic
* Arithmetic using numpy arrays is *element-by-element*.
* Matrix operations are possible with functions or methods.
* The shapes of the arrays should match, or one operand can be a single number that is applied to every element.

In [10]:
a = numpy.arange(5)
b = numpy.arange(5)
print("a=",a)
print("b=",b)
print("a+b=",a+b)
print("a*b=",a*b)

a= [0 1 2 3 4]
b= [0 1 2 3 4]
a+b= [0 2 4 6 8]
a*b= [ 0  1  4  9 16]
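
To contrast element-by-element arithmetic with a matrix operation, the sketch below uses `numpy.dot` and the `@` operator, neither of which appears in the cells above. The 2 x 3 array `m` is made up for the illustration.

```python
import numpy

a = numpy.arange(5)
b = numpy.arange(5)

elementwise = a * b        # [ 0  1  4  9 16]
inner = numpy.dot(a, b)    # 0 + 1 + 4 + 9 + 16 = 30
same = a @ b               # the @ operator computes the same product

m = numpy.arange(6).reshape(2, 3)
product = m @ m.T          # (2, 3) times (3, 2) gives a (2, 2) matrix
print(elementwise, inner, same)
print(product)
```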


In [11]:
c = numpy.ones((5,2))
d = numpy.ones((5,2)) + 100
d

array([[101., 101.],
       [101., 101.],
       [101., 101.],
       [101., 101.],
       [101., 101.]])

In [13]:
c + d

array([[102., 102.],
       [102., 102.],
       [102., 102.],
       [102., 102.],
       [102., 102.]])

* Arrays need to have compatible shapes to be used together

In [15]:
e = numpy.ones((2,5))
c+e #c and e have different shapes

ValueError: operands could not be broadcast together with shapes (5,2) (2,5) 

In [16]:
print(e)

[[1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]]
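
Two possible ways around the error, sketched under the assumption that the intent was to combine `c` with the same values laid out differently. The `.T` attribute (transpose) turns the (2,5) array into a (5,2) one. Separately, numpy's broadcasting rules allow a one-dimensional array of length 2 to be added to every row of a (5,2) array. The array `row` is invented for this example.

```python
import numpy

c = numpy.ones((5, 2))
e = numpy.ones((2, 5))

fixed = c + e.T                    # e.T has shape (5, 2), matching c
row = numpy.array([10.0, 20.0])    # shape (2,)
broadcast = c + row                # row is added to each of the 5 rows

print(fixed.shape)
print(broadcast[0])                # [11. 21.]
```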


## The Numpy library has many functions that work on `arrays`
* Aggregation functions like `sum`, `mean`, `std`


In [17]:
a=numpy.arange(5)
print("a = ", a)

a =  [0 1 2 3 4]


Add all of the elements of the array together.

In [18]:
print("sum(a) = ", a.sum())

sum(a) =  10


Calculate the average value of the elements in the array.

In [19]:
print("mean(a) = ", a.mean())

mean(a) =  2.0


Calculate the standard deviation (`std`) of the array.

In [16]:
print("std(a) = ", a.std()) #what is this?

std(a) =  1.4142135623730951
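
The value printed above can be reproduced by hand: by default `std` is the square root of the mean squared deviation from the mean. The sketch below checks this, and also shows the `ddof=1` argument, which divides by N-1 instead of N. The names `by_hand` and `sample_std` are illustrative only.

```python
import numpy

a = numpy.arange(5)

# Square root of the mean squared deviation from the mean.
by_hand = numpy.sqrt(numpy.mean((a - a.mean()) ** 2))
print(by_hand, a.std())

# ddof=1 divides by N-1 rather than N (the sample standard deviation).
sample_std = a.std(ddof=1)
print(sample_std)
```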


Calculate the `sin` of each element in the array

In [20]:
print("np.sin(a) = ", np.sin(a))

np.sin(a) =  [ 0.          0.84147098  0.90929743  0.14112001 -0.7568025 ]


* Note that the functions in the `math` library work on single numbers, not on `numpy` arrays

In [21]:
import math
print("math.sin(a) = ", math.sin(a))

TypeError: only size-1 arrays can be converted to Python scalars

## Check the `numpy` help and webpage for more functions
https://docs.scipy.org/doc/numpy/reference/routines.html

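## Use the `axis` keyword to apply a function along one dimension of the data
* Many functions take the `axis` keyword to perform the aggregation along that dimension
* `axis=0` aggregates down the rows (one result per column); `axis=1` aggregates across the columns (one result per row)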

In [22]:
a = numpy.arange(10).reshape(5,2)
print("a=",a)
print("mean(a)="  ,numpy.mean(a))
print("mean(a,0)=",numpy.mean(a,axis=0))
print("mean(a,1)=",numpy.mean(a,axis=1))

a= [[0 1]
 [2 3]
 [4 5]
 [6 7]
 [8 9]]
mean(a)= 4.5
mean(a,0)= [4. 5.]
mean(a,1)= [0.5 2.5 4.5 6.5 8.5]
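
A quick way to keep track of `axis` is to look at the shape of the result: the dimension named by `axis` is the one that disappears. The sketch below uses the `sum` method on the same 5 x 2 array as an illustration.

```python
import numpy

a = numpy.arange(10).reshape(5, 2)

col_sums = a.sum(axis=0)   # collapses the 5 rows, one value per column
row_sums = a.sum(axis=1)   # collapses the 2 columns, one value per row

print(col_sums.shape, col_sums)   # (2,) [20 25]
print(row_sums.shape, row_sums)   # (5,) [ 1  5  9 13 17]
```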


## Use square brackets to access elements in the array
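* A single integer in square brackets returns one element
* Indices start at 0, so `a[0]` is the first element
* Ranges of data can be accessed with slices of the form `start:stop:step`, where `stop` is excluded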

In [23]:
a=numpy.arange(10)

Access the element at index 5 (the sixth element, because indexing starts at 0)

In [24]:
a[5]

5

Access the elements at indices 5 through 9 (the stop index, 10, is excluded)

In [25]:
a[5:10]

array([5, 6, 7, 8, 9])

Access elements from index 5 to the end of the array

In [26]:
a[5:]

array([5, 6, 7, 8, 9])

Access the first five elements (indices 0 through 4).

In [55]:
a[:5]

array([0, 1, 2, 3, 4])

Access every 2nd element from index 5 up to, but not including, index 10

In [28]:
a[5:10:2]

array([5, 7, 9])

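Access every 2nd element backwards, with start index 5 and stop index 10. (**incorrect**: with a negative step the start must be greater than the stop, so the result is empty)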


In [29]:
a[5:10:-2]

array([], dtype=int64)

Access every 2nd element backwards, from the end of the array down to, but not including, index 5. (**correct**)

In [30]:
a[10:5:-2]

array([9, 7])
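
A few further indexing forms, sketched here as an aside: negative indices count from the end of the array, an empty start and stop with step -1 reverses it, and a two-dimensional array takes one index or slice per dimension separated by a comma. The reshaped array `b` is introduced only for this example.

```python
import numpy

a = numpy.arange(10)
print(a[-1])       # last element: 9
print(a[::-1])     # the whole array reversed
print(a[9:4:-2])   # backwards from index 9, stopping before 4: [9 7 5]

b = a.reshape(5, 2)
print(b[1, 0])     # row 1, column 0: 2
print(b[:, 1])     # every row of column 1: [1 3 5 7 9]
```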

## Exercise 1
The `arange` and `linspace` functions take similar arguments. Explain the difference between them. For example, what does the following code do?

    print (numpy.arange(1.,9,3))
    print (numpy.linspace(1.,9,3))

In [27]:
print (numpy.arange(1.,9,3))
print (numpy.linspace(1.,9,3))

[1. 4. 7.]
[1. 5. 9.]


* `arange` takes the arguments *start, stop, step*, and generates numbers from *start* to *stop* (excluding *stop*) stepping by *step* each time.
* `linspace` takes the arguments *start, stop, number*, and generates *number* evenly spaced values from *start* to *stop* (including *stop*).
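
Two optional `linspace` arguments make the relationship to `arange` easier to see. This is a sketch using `retstep=True`, which also returns the spacing, and `endpoint=False`, which leaves out *stop* in the same way `arange` does. The variable names are illustrative.

```python
import numpy

# retstep=True returns the spacing between points as well as the points.
points, step = numpy.linspace(1.0, 9.0, 3, retstep=True)
print(points, step)          # [1. 5. 9.] 4.0

# endpoint=False excludes the stop value, like arange does.
half_open = numpy.linspace(1.0, 9.0, 4, endpoint=False)
stepped = numpy.arange(1.0, 9.0, 2.0)
print(half_open, stepped)    # both are [1. 3. 5. 7.]
```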

## Exercise 2
Generate a 10 x 3 array of random numbers (using `numpy.random.randn`). From each column, find the minimum absolute value. Make use of `numpy.abs` and `numpy.min` functions. The result should be a one-dimensional array.

In [35]:
a = numpy.random.randn(30).reshape(10,3)
print("a is ", a)

a is  [[-1.60521011 -0.34223924  0.83378038]
 [-1.35675697 -1.23955876  1.14066271]
 [-1.54904009  1.73652901 -0.3989485 ]
 [ 0.14233518 -0.30290938  0.78242142]
 [-2.47141284  0.1315992  -0.09231829]
 [-1.002909   -0.27133214 -0.38174639]
 [ 0.37217151  2.02437003 -0.35475461]
 [-1.3939088   1.05066409  0.9495233 ]
 [ 1.39678725  1.51510431  0.60302301]
 [ 1.05198176  2.68871235 -0.23692426]]


In [36]:
print("min(abs(a)) along each column is ", numpy.min( numpy.abs( a ), axis=0))

min(abs(a)) along each column is  [0.14233518 0.1315992  0.09231829]
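
Because `randn` draws new numbers on every run, the values above will differ each time. A possible variant, assuming reproducible output is wanted, uses a seeded generator from `numpy.random.default_rng`; the seed value 0 and the name `col_min_abs` are arbitrary choices for this sketch, and this is an alternative to the exercise's `numpy.random.randn`, not a replacement the exercise asks for.

```python
import numpy

rng = numpy.random.default_rng(seed=0)   # seeded, so each run gives the same numbers
a = rng.standard_normal((10, 3))         # 10 x 3 array of normally distributed values

col_min_abs = numpy.abs(a).min(axis=0)   # smallest absolute value in each column
print(col_min_abs.shape)                 # (3,)
print(col_min_abs)
```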


## Use the `scipy` library for common scientific and numerical methods
* `scipy` contains functions to work with random numbers and statistics, calculate Fourier transforms, integrate numerically, and more
* Check the `scipy` website for more help: https://docs.scipy.org/doc/scipy/reference/

## Example : integrate y=x^2 from 0 to 10

In [None]:
import scipy.integrate

x = numpy.arange(11) # 0, 1, ..., 10 (includes 10)
y = x**2
# with no x argument, trapezoid assumes the samples are spaced 1 apart (dx=1)
# (trapezoid is the current name for trapz, which recent scipy releases no longer provide)
int_x2 = scipy.integrate.trapezoid(y)
print("integral of x^2 from 0 to 10 = ", int_x2) # the exact value is 10**3/3 = 333.333

## Exercise 3
Why isn't the integral of $x^2$ above exactly 333.333?

In [42]:
x = numpy.linspace(0,10,1000) # finer grid
y = x**2
print("integral of x^2 from 0 to 10 = ", scipy.integrate.trapezoid(y)) # the exact value is 10**3/3 = 333.333

integral of x^2 from 0 to 10 =  33300.01668335002


## Exercise 4
Why is the integral about 100 times bigger than expected?

In [41]:
print("integral of x^2 from 0 to 10 = ", scipy.integrate.trapezoid(y, x)) # the exact value is 10**3/3 = 333.333

integral of x^2 from 0 to 10 =  333.333500333834
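
Both exercises can be explored with a hand-written version of the trapezoid rule. The sketch below is not scipy's implementation; `trapezoid_by_hand` is a hypothetical helper that multiplies the width of each interval by the average height of its two ends. It suggests two things: the result is an approximation that improves as the grid gets finer, because the curve is replaced by straight segments, and the interval widths matter, which is why leaving out `x` on a grid whose spacing is not 1 gives the wrong scale.

```python
import numpy

def trapezoid_by_hand(y, x):
    """Hypothetical helper: composite trapezoid rule for samples y at points x."""
    widths = numpy.diff(x)              # spacing between neighbouring x values
    heights = (y[:-1] + y[1:]) / 2.0    # average height over each interval
    return numpy.sum(widths * heights)

# The estimate approaches 10**3/3 = 333.333... as the number of points grows.
for n in (11, 101, 1001):
    x = numpy.linspace(0, 10, n)
    print(n, trapezoid_by_hand(x**2, x))
```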


We'll come back to `scipy.optimize` later, when we fit models to experimental data.

## Keypoints
* Use the numpy library to get basic statistics out of tabular data.
* Print numpy arrays.
* Use mean, sum, std to get summary statistics.
* Add numpy arrays together.
* Study the scipy website
* Use scipy to integrate tabular data.

More details: http://paris-swc.github.io/advanced-numpy-lesson/