# Numpy and Scipy

[Numpy](http://numpy.org) is the **fundamental package for scientific computing with Python**. It contains among other things:

* a powerful N-dimensional array object
* sophisticated (**broadcasting**) functions [what is *broadcasting*?]
* tools for integrating C/C++ and Fortran code
* useful linear algebra, Fourier transform, and random number capabilities
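
To answer the question in the list above: broadcasting is the rule NumPy uses to combine arrays of different shapes in element-wise operations, by virtually stretching the smaller one. A minimal sketch, assuming numpy is imported as `np` (the variable names are illustrative and not part of this notebook):

```python
import numpy as np

row = np.array([1, 2, 3])        # shape (3,)
col = np.array([[10], [20]])     # shape (2, 1)

# A scalar is stretched to match every element of the array.
scaled = row * 2

# Shapes (2, 1) and (3,) are stretched to the common shape (2, 3).
table = col + row

print(scaled)
print(table.shape)
print(table)
```

No data is actually copied during the stretching, which is one reason broadcasting is usually preferred over explicit loops.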

[Scipy](http://scipy.org) contains additional routines for optimization, special functions, and so on. Both contain modules written in C and Fortran so that they're as fast as possible. Together, they give Python roughly the same capability that the [Matlab](http://www.mathworks.com/products/matlab/) program offers. (In fact, if you're an experienced Matlab user, there is a [guide to Numpy for Matlab users](http://www.scipy.org/NumPy_for_Matlab_Users) just for you.)

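In IPython, the easiest way to import the numpy package is to call **%pylab inline**, which loads the numpy and matplotlib names directly into the namespace; this is why the cells below call **array** or **plot** without a prefix. A frequent alternative is to call **import numpy as np** and then write **np.array**, **np.zeros**, and so on.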

## Making vectors and matrices
Fundamental to both Numpy and Scipy is the ability to work with vectors and matrices. You can create vectors from lists using the **array** command:

In [None]:
%pylab inline

In [None]:
random.seed(1)

In [None]:
array([-2,0,1,2,3,4,5,6])

**Remember:** a [python list](https://docs.python.org/2/tutorial/datastructures.html) and a [numpy array](https://docs.scipy.org/doc/numpy/reference/generated/numpy.array.html) are different! For example, `+` concatenates lists but adds arrays element by element:

In [None]:
[1,2,3]+[4,5,8]

In [None]:
array([1,2,3])+array([4,5,8])
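
The same difference shows up with multiplication. A short sketch, assuming numpy is imported as `np` instead of through `%pylab` (the names `numbers` and `arr` are illustrative):

```python
import numpy as np

numbers = [1, 2, 3]
arr = np.array(numbers)

repeated = numbers * 2   # list repetition
doubled = arr * 2        # element-wise product

print(repeated)          # [1, 2, 3, 1, 2, 3]
print(doubled)           # [2 4 6]
```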

What is the type of the array?

In [None]:
array([-2,0,1,2,3,4,5,6]).dtype

You can pass in a second argument to **array** that gives the numeric type. There are a number of types [listed here](http://docs.scipy.org/doc/numpy/user/basics.types.html) that your array can be. The most common ones are float64 (double precision floating point number), and int64.

In [None]:
array([-2,0,1,2,3,4,5,6],float64)

Other examples

In [None]:
array([-2,0,1,2,3,4,5,6],bool)

An array with different types:

In [None]:
array([1.,2,'a',True])

Numpy tries to infer the most general type that can hold every element (here, a string), which may or may not be what you want. As an alternative, you can force the type to be **object** so that every element keeps its original type.

In [None]:
array([1.,2,'a',True],object)
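
To see what the inference does in practice, the sketch below inspects the resulting dtypes. It assumes numpy is imported as `np`; the variable names are illustrative.

```python
import numpy as np

# Integers mixed with floats are promoted to floats.
numeric = np.array([1, 2.5])
print(numeric.dtype)

# Mixing in a string turns every element into a string.
inferred = np.array([1., 2, 'a', True])
print(inferred.dtype.kind)       # 'U' means unicode string

# With dtype=object each element keeps its Python type.
as_objects = np.array([1., 2, 'a', True], object)
print([type(item).__name__ for item in as_objects])
```

Object arrays are flexible, but they lose most of the speed advantage of numeric arrays, so they are best kept for cases where the mixed types are really needed.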

To build matrices, you can use the array command with lists of lists:

In [None]:
array([[0,1], [1,0]], float64)

You can create arrays with any number of dimensions using lists of lists of .... of lists:

In [None]:
array([[[0,1],[0,1]], [[1,0],[1,0]]], float64)

### Array creation routines

You can also form zero-filled matrices of arbitrary shape (including vectors, which can be one-dimensional arrays or matrices with a single row or column), using the **[zeros](https://docs.scipy.org/doc/numpy/reference/generated/numpy.zeros.html)** command:

In [None]:
zeros((3,3), float64)

The first argument is a tuple containing the **shape** of the matrix, and the second is the data type (**dtype**) argument, which follows the same conventions as in the array command. The default dtype is float64. Thus, you can make row vectors:

In [None]:
zeros(3)

In [None]:
zeros((1, 3))

or column vectors:

In [None]:
zeros((3, 1))
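
The three calls above return arrays of different shapes, which matters later for indexing and broadcasting. A small check, assuming numpy is imported as `np` (the names `flat`, `row` and `col` are illustrative):

```python
import numpy as np

flat = np.zeros(3)        # 1-D array, shape (3,)
row = np.zeros((1, 3))    # 2-D array with one row
col = np.zeros((3, 1))    # 2-D array with one column

for name, arr in [("flat", flat), ("row", row), ("col", col)]:
    print(name, arr.shape, arr.ndim, arr.dtype)
```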

The **[identity](https://docs.scipy.org/doc/numpy/reference/generated/numpy.identity.html)** function creates an identity matrix:

In [None]:
identity(4)

And the **[ones](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ones.html)** function creates an array of ones:

In [None]:
ones((3, 3))

You can create arrays with any number of dimensions by adding dimensions to the **shape** argument:

In [None]:
ones((6, 3, 5))

Other array creation routines are:

**[empty](https://docs.scipy.org/doc/numpy/reference/generated/numpy.empty.html)**

In [None]:
empty((2, 2))

**[eye](https://docs.scipy.org/doc/numpy/reference/generated/numpy.eye.html)**

In [None]:
eye(3, 4, 1)

**[arange](https://docs.scipy.org/doc/numpy/reference/generated/numpy.arange.html)**

In [None]:
arange(4)

In [None]:
help(arange)

In [None]:
arange(2,6)

In [None]:
arange(-10,4)

In [None]:
arange(4,-10,-2)

**[diag](https://docs.scipy.org/doc/numpy/reference/generated/numpy.diag.html)**

In [None]:
diag(arange(4))

The **[linspace](https://docs.scipy.org/doc/numpy/reference/generated/numpy.linspace.html)** command makes a linear array of points from a starting to an ending value.

In [None]:
linspace(0, 2, 5)

The [logspace](https://docs.scipy.org/doc/numpy/reference/generated/numpy.logspace.html) command does the same on a logarithmic scale: the start and end arguments are exponents of 10.

In [None]:
logspace(0, 2, 9)
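
The relation between the two commands can be checked directly: with the default base of 10, `logspace` should return 10 raised to the points that `linspace` returns. A sketch, assuming numpy is imported as `np`:

```python
import numpy as np

exponents = np.linspace(0, 2, 9)   # evenly spaced from 0 to 2
points = np.logspace(0, 2, 9)      # evenly spaced in log10, from 10**0 to 10**2

print(points)
print(np.allclose(points, 10.0 ** exponents))
```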

### [reshape](https://docs.scipy.org/doc/numpy/reference/generated/numpy.reshape.html)

In [None]:
a = arange(64)
a

In [None]:
a.reshape(8,8)

### [transpose](https://docs.scipy.org/doc/numpy/reference/generated/numpy.transpose.html)

In [None]:
a.reshape(8,8).T

### [Indexing/Slicing](https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html)

You can index and slice numpy arrays in the same way you index/slice lists.

In [None]:
a3 = arange(30) 
a3

In [None]:
a3[0]

In [None]:
a3[::-1]

In [None]:
a3[2:5]
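
One difference from lists is worth keeping in mind: slicing a list makes a copy, while slicing a numpy array returns a view on the same data. A minimal sketch, assuming numpy is imported as `np` (variable names are illustrative):

```python
import numpy as np

values = list(range(5))
arr = np.arange(5)

list_slice = values[1:3]   # a new, independent list
arr_slice = arr[1:3]       # a view into arr

list_slice[0] = 99
arr_slice[0] = 99

print(values)              # the original list is unchanged
print(arr)                 # the original array is modified
```

Calling `arr[1:3].copy()` gives an independent array when the original must not change.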

### [Boolean array indexing](https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html): a very **useful** recipe

In [None]:
a3[a3>20]

In [None]:
x = array([[1., 2.], [nan, 3.], [nan, nan]])

In [None]:
x

In [None]:
x[~np.isnan(x)]
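
The recipe works in two steps: a comparison or test builds a boolean array (the mask), and indexing with the mask keeps the entries where it is `True`. The sketch below spells this out and also uses the mask for assignment; it assumes numpy is imported as `np`, and `mask`, `valid` and `cleaned` are illustrative names.

```python
import numpy as np

x = np.array([[1.0, 2.0], [np.nan, 3.0], [np.nan, np.nan]])

mask = ~np.isnan(x)      # True where the entry is a valid number
print(mask)

valid = x[mask]          # selected entries, always returned as a 1-D array
print(valid)

cleaned = x.copy()
cleaned[~mask] = 0.0     # a mask can also be used to assign values
print(cleaned)
```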

#### 2d, 3d slicing

In [None]:
a = arange(64).reshape(8,8)
a

In [None]:
a[0,:]

In [None]:
a[:,0]

In [None]:
a[:2,:2]

In [None]:
a[::2,::2]

In [None]:
b = arange(27).reshape(3,3,3)
b

In [None]:
b[0,0,0]

In [None]:
b[0,:,:]

In [None]:
b[:,0,:]

In [None]:
b[:,:,0]

### NumPy Functions

http://docs.scipy.org/doc/numpy/reference/routines.math.html

#### [randint](https://docs.scipy.org/doc/numpy/reference/generated/numpy.random.randint.html)

In [None]:
a=random.randint(0,10,10)

In [None]:
a

#### [min](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.min.html)

In [None]:
a.min()

#### [max](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.max.html)

In [None]:
a.max()

#### [mean](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.mean.html)

In [None]:
a.mean()

#### [std](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.std.html)

In [None]:
a.std()  # standard deviation

#### [sum](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.sum.html)

In [None]:
a.sum()

In [None]:
b=random.randint(0,10,(10,5))

In [None]:
b

In [None]:
b.shape

In [None]:
b.T

In [None]:
b.T.shape

#### [trace](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.trace.html)

In [None]:
b.trace()

#### [diag](https://docs.scipy.org/doc/numpy/reference/generated/numpy.diag.html)

In [None]:
diag(b)

In [None]:
b.min()

In [None]:
b.min(axis=0)

In [None]:
b.min(axis=1)
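
The `axis` argument names the dimension that is collapsed by the reduction. On a small fixed matrix this is easier to follow than on random data; the sketch assumes numpy is imported as `np`, and the matrix `m` is an illustrative example.

```python
import numpy as np

m = np.array([[3, 1, 4],
              [1, 5, 9]])    # shape (2, 3)

print(m.min())               # smallest entry of the whole array
print(m.min(axis=0))         # collapses the rows: one value per column
print(m.min(axis=1))         # collapses the columns: one value per row
```

The same convention applies to `max`, `mean`, `std` and `sum`.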

#### [ravel](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.ravel.html)

In [None]:
b.ravel()

# Matplotlib

[Matplotlib](http://matplotlib.org/) is the **fundamental package for scientific plotting with Python**. We suggest visiting the [gallery](http://matplotlib.org/gallery.html) to get an idea of the different kinds of plots that can be made with matplotlib.

In [None]:
x=array([1,2,4,10])
y=array([-2,6,7,2])

#### [plot](http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.plot)

In [None]:
plot(x,y)

In [None]:
plot(x,2+3*x,label='a line')
plot(linspace(1,10,100),10*sin(linspace(1,10,100)),label='sin(x)')
xlabel('x')
ylabel('y')
legend()

In [None]:
r=random.rand(100)

#### [hist](http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.hist)

In [None]:
hist(r)

In [None]:
rn=random.randn(1000)

In [None]:
rn.mean()
# rn.std()

In [None]:
hist(rn,bins=linspace(-4,4,10))

Be aware also of the numpy command [histogram](https://docs.scipy.org/doc/numpy/reference/generated/numpy.histogram.html), which returns the counts and the bin edges without plotting anything. For instance, the previous plot could also be obtained in the following way:

In [None]:
x_bin=linspace(-4,4,10)

In [None]:
h=histogram(rn,bins=x_bin)

In [None]:
plot(h[1][:-1],h[0],'-o')

or using the [bar](http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.bar) command:

In [None]:
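# h[1] holds the bin edges: draw each bar from its left edge with the bin width
bar(h[1][:-1], h[0], width=diff(h[1]), align='edge')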
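
The slicing `[:-1]` above is needed because `histogram` returns one more edge than counts. The sketch below makes the sizes explicit and computes bin centers, which are often a better x coordinate for a line plot than the left edges. It assumes numpy is imported as `np`; the seed and the variable names are illustrative choices, not part of this notebook.

```python
import numpy as np

rng = np.random.RandomState(1)      # fixed seed, only for reproducibility
samples = rng.randn(1000)

edges = np.linspace(-4, 4, 10)      # 10 edges define 9 bins
counts, returned_edges = np.histogram(samples, bins=edges)

centers = 0.5 * (returned_edges[:-1] + returned_edges[1:])
widths = np.diff(returned_edges)

print(len(counts), len(returned_edges))
print(centers)
print(widths)
```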

#### [imshow](http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.imshow)

In [None]:
r_matrix=random.rand(100,100)

In [None]:
imshow(r_matrix,interpolation='none')
colorbar()

# Scipy

[Scipy](http://scipy.org) contains additional routines for optimization, special functions, and so on.

Some examples:
* [do you want to maximize/minimize a function?](https://docs.scipy.org/doc/scipy/reference/tutorial/optimize.html)
* [some linear algebra (eigenvalues, matrix inversion, etc.)?](https://docs.scipy.org/doc/scipy/reference/tutorial/linalg.html)
* [integrate a function?](https://docs.scipy.org/doc/scipy/reference/tutorial/integrate.html)
* [some useful statistical functions?](https://docs.scipy.org/doc/scipy/reference/tutorial/stats.html)
* [further examples](https://docs.scipy.org/doc/scipy/reference/)


Consider the following example: we want to know whether the sample $r1$ and the sample $r2$ come from the same distribution.

In [None]:
r1=random.randn(2453)*3
r2=random.randn(5718)

In [None]:
h1=histogram(r1,linspace(-10,10,100))
h2=histogram(r2,linspace(-10,10,100))

In [None]:
plot(h1[1][:-1],h1[0]*1./sum(h1[0]),label='line1')
plot(h1[1][:-1],h2[0]*1./sum(h2[0]),label='line2')
legend()

In [None]:
plot(h1[1][:-1],cumsum(h1[0]*1./sum(h1[0])),label='line1')
plot(h1[1][:-1],cumsum(h2[0]*1./sum(h2[0])),label='line2')
legend()

## [Statistical Functions](https://docs.scipy.org/doc/scipy/reference/stats.html#) in Scipy

In [None]:
from scipy import stats

#### [Two-samples Kolmogorov-Smirnov test](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ks_2samp.html#scipy.stats.ks_2samp)

In [None]:
r,p=stats.ks_2samp(r1,r2)

the **Kolmogorov-Smirnov** statistic

In [None]:
r

the **p-value**

In [None]:
p
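
As a reading aid: the statistic is the largest distance between the two empirical cumulative distributions, and a small p-value is evidence against the hypothesis that both samples come from the same distribution. The sketch below compares a pair of samples with different widths and a pair drawn from the same distribution. It assumes numpy is imported as `np`; the seed, the variable names and the conventional 0.05 threshold are illustrative assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.RandomState(1)      # fixed seed, only for reproducibility
wide = rng.randn(2453) * 3          # standard deviation 3
narrow = rng.randn(5718)            # standard deviation 1
narrow_again = rng.randn(5718)      # same distribution as narrow

stat_diff, p_diff = stats.ks_2samp(wide, narrow)
stat_same, p_same = stats.ks_2samp(narrow, narrow_again)

alpha = 0.05                        # conventional threshold (an assumption)
print("different widths:", stat_diff, p_diff, p_diff < alpha)
print("same distribution:", stat_same, p_same, p_same < alpha)
```

A p-value above the threshold does not prove that the distributions are equal; it only means the test found no evidence of a difference.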

# Further Reading

In [None]:
import this