# Scientific Programming in Python
Josh Dillon, [jsdillon@berkeley.edu](mailto:jsdillon@berkeley.edu)

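In this lesson, we're going to cover the basics of scientific programming in Python with numpy (arrays, slicing, fast array math, and file input and output) and matplotlib (basic plotting and customizing plots).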

Adapted from Josh Bloom's Python Bootcamp lectures on [numpy](https://github.com/profjsb/python-bootcamp/blob/master/Lectures/21_NumpyMatplotlib/IntroNumPy.ipynb) and [matplotlib](https://github.com/profjsb/python-bootcamp/blob/master/Lectures/21_NumpyMatplotlib/IntroMatplolib.ipynb).

In [60]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib notebook 

# Numpy

## What are numpy arrays and how do they work?
Numpy arrays are a lot like Python's built-in lists.

In [61]:
lst = [1,2,3]
npa = np.array([1,2,3])
print 'Python list:', lst
print 'Numpy array:', npa

Python list: [1, 2, 3]
Numpy array: [1 2 3]


Lists are more general: a single list can hold elements of any type, mixed freely.

In [62]:
lst = [1.0, 'dog', (1,2), [[5,5],[5,5]], 1.0+1.0j, {'a': 7}]
print lst

[1.0, 'dog', (1, 2), [[5, 5], [5, 5]], (1+1j), {'a': 7}]


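But lists have no notion of acting on every element at once, the way we would with a vector or a matrix. Multiplying a list by 2 repeats the list, while multiplying a numpy array by 2 doubles each element.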

In [63]:
lst = range(4)
print lst, lst*2
nparr = np.arange(4)
print nparr, nparr*2

[0, 1, 2, 3] [0, 1, 2, 3, 0, 1, 2, 3]
[0 1 2 3] [0 2 4 6]


You can always turn a numpy array back into a list. You can also make a numpy array from a list, but only if the list is homogeneous and rectangular.

In [64]:
print list(np.arange(4))

print np.array([[1,2], [3,4]])
print np.array([10, [10,10]])

[0, 1, 2, 3]
[[1 2]
 [3 4]]


ValueError: setting an array element with a sequence.
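
To see what "homogeneous" means in practice, here is a minimal sketch of how numpy settles on one type for the whole array when the input list mixes numeric types. The variable names are illustrative, and the code uses print() with a single argument so it behaves the same under Python 2 and Python 3.

```python
import numpy as np

# Mixing ints and a float: every element is stored as a float.
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)  # float64

# Mixing in a complex number: every element becomes complex.
with_complex = np.array([1, 2.0, 3j])
print(with_complex.dtype)  # complex128

# Rectangular nested lists become a 2D array with a well-defined shape.
grid = np.array([[1, 2, 3], [4, 5, 6]])
print(grid.shape)  # (2, 3)
```

The `dtype` attribute is a quick way to check which common type numpy chose.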

## Array slicing and multidimensional arrays
It's easy to initialize numpy arrays of ones or zeros, even in higher dimensions.

In [65]:
ones_1d = np.ones(5)
zeros_2d = np.zeros((3,5))
print ones_1d, '\n\n', zeros_2d

[ 1.  1.  1.  1.  1.] 

[[ 0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.]]


Numpy also supports all sorts of very useful array slicing (this takes some time to visualize and master).

In [66]:
nparr = np.arange(10)
print 'Full array:', nparr
print 'First element:', nparr[0]
print 'Second element:', nparr[1]
print 'Last element:', nparr[-1]
print 'Second-to-last element:', nparr[-2]
print 'First three elements:', nparr[0:3]
print 'Even elements:', nparr[0::2]
print 'Odd elements:', nparr[1::2]
print 'First three odd elements:',nparr[1:6:2] 

Full array: [0 1 2 3 4 5 6 7 8 9]
First element: 0
Second element: 1
Last element: 9
Second-to-last element: 8
First three elements: [0 1 2]
Even elements: [0 2 4 6 8]
Odd elements: [1 3 5 7 9]
First three odd elements: [1 3 5]


In [67]:
rand_2d = np.random.rand(4,3) # uniform random numbers in [0, 1)
print '2d random array:\n', rand_2d
print '\nFirst row:\n', rand_2d[0,:]
print '\nLast column:\n', rand_2d[:,-1]
print '\nTop left 2x2 square:\n', rand_2d[0:2,0:2]

2d random array:
[[ 0.13045476  0.8504469   0.70562777]
 [ 0.00988469  0.75964084  0.02530461]
 [ 0.09880245  0.34864079  0.08488974]
 [ 0.15384778  0.1220596   0.85174172]]

First row:
[ 0.13045476  0.8504469   0.70562777]

Last column:
[ 0.70562777  0.02530461  0.08488974  0.85174172]

Top left 2x2 square:
[[ 0.13045476  0.8504469 ]
 [ 0.00988469  0.75964084]]
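
One detail worth keeping in mind when slicing: a basic slice of a numpy array is a view onto the same memory, not a copy (unlike slicing a list). The following is a small sketch of that behavior; the names `original`, `view`, and `independent` are made up for illustration.

```python
import numpy as np

original = np.arange(5)

# A basic slice is a view: it shares memory with the original array.
view = original[1:4]
view[0] = 100
print(original[1])  # 100, because the change shows up in the original

# Call .copy() when an independent array is needed.
independent = original[1:4].copy()
independent[0] = -1
print(original[1])  # still 100, because the copy does not touch the original
```

If a function modifies a slice it was handed, the caller's array changes too, so copying explicitly is the safer choice whenever that is not intended.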


You can even index an array with an array of booleans or integer indices.

In [68]:
test1 = np.arange(10)
test2 = 2*np.arange(10)
print 'test1 =', test1
print 'test2 =', test2
print 'Are elements of test1 > 4?', test1 > 4
print 'Elements of test2 where corresponding element of test1 > 4:', test2[test1 > 4]
print 'Elements of test2 where corresponding element of test2 > 4:', test2[test2 > 4]
print 'Elements 3, 1, and 4 of test2:', test2[[3,1,4]]

test1 = [0 1 2 3 4 5 6 7 8 9]
test2 = [ 0  2  4  6  8 10 12 14 16 18]
Are elements of test1 > 4? [False False False False False  True  True  True  True  True]
Elements of test2 where corresponding element of test1 > 4: [10 12 14 16 18]
Elements of test2 where corresponding element of test2 > 4: [ 6  8 10 12 14 16 18]
Elements 3, 1, and 4 of test2: [6 2 8]
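
Boolean masks can also be combined and used for assignment. Here is a short sketch along the same lines as the cell above; the array `data` and the thresholds are arbitrary choices for illustration.

```python
import numpy as np

data = np.arange(10)

# Combine conditions element-wise with & (and) and | (or).
# The parentheses are required because & binds tighter than > and <.
middle = data[(data > 2) & (data < 7)]
print(middle)  # [3 4 5 6]

# A boolean mask can also be used to assign to the selected elements.
clipped = data.copy()
clipped[clipped > 6] = 6
print(clipped)  # [0 1 2 3 4 5 6 6 6 6]

# np.where with one argument returns the integer indices where the mask is True.
indices = np.where(data % 3 == 0)[0]
print(indices)  # [0 3 6 9]
```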


## Numpy arrays can be so much faster than lists for manipulating data
Under the hood, numpy runs compiled C code (which is much faster than Python, generally speaking) to perform common mathematical operations on whole arrays at once.

In [69]:
print 'Square each number in x and then append it to a list...'
arr = []
x = range(10000)
%timeit -n 100 for k in x: arr.append(k**2)

print '\nDo the same but with list comprehension...'
x = range(10000)
%timeit -n 100 [k**2 for k in x]

print '\nNow try with numpy...'
x = np.arange(10000)
%timeit -n 100 x**2

Square each number in x and then append it to a list...
100 loops, best of 3: 1.34 ms per loop

Do the same but with list comprehension...
100 loops, best of 3: 685 µs per loop

Now try with numpy...
100 loops, best of 3: 5.25 µs per loop
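
The `%timeit` magic only works inside IPython or a notebook. As a rough equivalent for a plain script, here is a sketch using the standard library's `timeit` module. The exact timings depend on the machine, so none are quoted here, and the variable names are illustrative.

```python
import timeit
import numpy as np

n = 10000
x_list = list(range(n))
x_arr = np.arange(n)

# Each callable is run 100 times; timeit.timeit returns the total time in seconds.
t_list = timeit.timeit(lambda: [k**2 for k in x_list], number=100)
t_numpy = timeit.timeit(lambda: x_arr**2, number=100)

print('list comprehension: %.1f us per loop' % (1e6 * t_list / 100))
print('numpy:              %.1f us per loop' % (1e6 * t_numpy / 100))

# Both approaches compute the same values.
same = [k**2 for k in x_list] == (x_arr**2).tolist()
print(same)
```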


## Numpy also lets you do other math on arrays 
(Numpy functions often accept plain lists too, though the output will generally be a numpy array.)

In [73]:
print np.pi
print np.e

sines = np.sin([np.pi/4 * n for n in range(4)])
print sines
print type(sines) 

3.14159265359
2.71828182846
[ 0.          0.70710678  1.          0.70710678]
<type 'numpy.ndarray'>


This is especially useful for doing statistics.

In [74]:
vec1 = np.arange(6)
print 'vec1: ', vec1
print 'sum:', np.sum(vec1)
print 'mean:', np.mean(vec1)
print 'median:', np.median(vec1)
print 'standard deviation:', np.std(vec1)


vec1:  [0 1 2 3 4 5]
sum: 15
mean: 2.5
median: 2.5
standard deviation: 1.70782512766
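
These functions also work on multidimensional arrays, where the `axis` argument selects the direction to reduce over. The sketch below uses a made-up 2x3 array and also shows the `ddof` argument of `np.std`, which switches between the population and sample standard deviation.

```python
import numpy as np

# A small 2D array: 2 rows, 3 columns.
table = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])

overall = np.mean(table)           # mean over every element: 3.5
per_column = np.mean(table, axis=0)  # one mean per column: 2.5, 3.5, 4.5
per_row = np.mean(table, axis=1)     # one mean per row: 2.0, 5.0
print(overall)
print(per_column)
print(per_row)

# np.std defaults to the population standard deviation (ddof=0);
# pass ddof=1 for the sample standard deviation.
vec = np.arange(6)
pop_std = np.std(vec)             # about 1.708
sample_std = np.std(vec, ddof=1)  # about 1.871
print(pop_std)
print(sample_std)
```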


More importantly, numpy is optimized for vector and matrix operations. In general, array operations are element-wise:

In [75]:
arr1 = np.arange(10)
arr2 = 2*np.arange(10)
print 'Multiplication arr1*arr2:\n', arr1 * arr2
print 'Complex scalar multiplication and vector subtraction:\n', 3.0*arr2 - 1.5j*arr1
mat1 = np.array([[1,2],[3,4]])
mat2 = np.array([[0,1],[2,3]])
print '\nmat1 * mat2 =\n', mat1 * mat2
print 'mat1 - mat2 =\n', mat1 - mat2

Multiplication arr1*arr2:
[  0   2   8  18  32  50  72  98 128 162]
Complex scalar multiplication and vector subtraction:
[  0. +0.j    6. -1.5j  12. -3.j   18. -4.5j  24. -6.j   30. -7.5j
  36. -9.j   42.-10.5j  48.-12.j   54.-13.5j]

mat1 * mat2 =
[[ 0  2]
 [ 6 12]]
mat1 - mat2 =
[[1 1]
 [1 1]]


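However, we'll often want to do proper vector-vector, matrix-vector, or matrix-matrix products in numpy. In particular, np.dot() is a very useful multipurpose tool. For two 1D vectors it returns the inner product, which is the same as summing their element-wise product: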

In [77]:
vec1 = np.array(range(5))
vec2 = np.array([-1,0,1,-1,0])
print vec1
print np.sum(vec1*vec1)
print np.dot(vec1,vec1)

#TODO: add matrix products

[0 1 2 3 4]
30
30
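
The same function handles matrix-vector and matrix-matrix products. What follows is a minimal sketch with small made-up matrices, offered as one way to illustrate this rather than as the lecture's own example.

```python
import numpy as np

mat = np.array([[1, 2], [3, 4]])
vec = np.array([1, -1])
identity = np.eye(2, dtype=int)

# Matrix-vector product: each output entry is a row of mat dotted with vec.
mat_vec = np.dot(mat, vec)  # gives [-1, -1]
print(mat_vec)

# Matrix-matrix product (not element-wise, unlike mat * mat).
mat_mat = np.dot(mat, mat)  # gives [[7, 10], [15, 22]]
print(mat_mat)

# Multiplying by the identity matrix leaves mat unchanged.
print(np.array_equal(np.dot(mat, identity), mat))  # True

# The transpose is available as .T; mat.T dotted with mat is symmetric.
gram = np.dot(mat.T, mat)  # gives [[10, 14], [14, 20]]
print(gram)
```

Note the contrast with `mat * mat`, which multiplies entry by entry and would give [[1, 4], [9, 16]].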


## Other cool and potentially useful numpy functions

In [None]:
#random numbers

## File input and output with Numpy
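
A minimal sketch of round-tripping an array through a file, assuming that `np.savetxt`/`np.loadtxt` (plain text) and `np.save`/`np.load` (numpy's binary format) are the tools of interest here. The file names and the temporary directory are illustrative choices.

```python
import os
import tempfile
import numpy as np

data = np.arange(12, dtype=float).reshape(4, 3)

# Write to a temporary directory so the example leaves nothing behind.
tmpdir = tempfile.mkdtemp()
txt_path = os.path.join(tmpdir, 'data.txt')
npy_path = os.path.join(tmpdir, 'data.npy')

# Human-readable text: one row per line, columns separated by spaces.
np.savetxt(txt_path, data)
from_txt = np.loadtxt(txt_path)

# Numpy's binary .npy format: compact, and it stores dtype and shape with the data.
np.save(npy_path, data)
from_npy = np.load(npy_path)

print(np.array_equal(from_txt, data))  # True
print(np.array_equal(from_npy, data))  # True

# Clean up the temporary files.
os.remove(txt_path)
os.remove(npy_path)
os.rmdir(tmpdir)
```

Text files are easy to inspect and share, while `.npy` files are usually the better choice for large arrays.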

# Matplotlib

## Basic plotting with Matplotlib

## Manipulating and customizing plots