#Introduction to Scientific Python - `scipy` 
##Scipy is a software stack built to support efficient scientific computation in Python.
##The fundamental components of the ecosystem are `numpy` and `matplotlib`.

You can refer to the official [Scipy website](http://www.scipy.org) for updates and full documentation.

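###The basic components of the ecosystem are:
* Numpy        : provides the fundamental array data structure
* Matplotlib   : provides visualization functionality
* Scipy        : numerical algorithms (e.g. optimization, integration, linear algebra, statistics)
* IPython      : enhanced interactive shell, and the basis of the notebook interface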

####Additional, more specialized packages include
* Pandas         : convenient data structures for data analysis
* Sympy          : symbolic math
* Scikit-learn   : tools for machine learning and data mining
* PyTables       : tools for managing hierarchical datasets and large amounts of data 
* Numba          : decorators for enabling JIT compilation of computationally intensive code
* Cython         : interfacing C/C++ with Python 
* ... and many other tools.

In [None]:
from IPython.display import Image
Image(filename='Images/ecosystem.png')

In [None]:
# to ensure python2 - python3 portability :
from __future__ import print_function, division

### Let's observe some basic limits of the standard Python data structures

In [None]:
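## Lists are cool, but they are not really ideal for representing vectors in scientific calculations
## (list() is needed in Python 3, where range() returns a lazy range object rather than a list)
v1 = list(range(1,10,1))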
v1

In [None]:
## we cannot easily build an array of floats:
## range() only accepts integers, so this line raises a TypeError
v1 = range(1,20,0.5)

In [None]:
## the way python deals with summing two lists is not really what we want:
## + concatenates the lists instead of adding them element by element
v2 = list(range(10,20,1))
print ( v1, v2 )
print ( v1+v2 )  

In [None]:
## how do we compute a math operation element-by-element on all elements of a vector?
## for example we could use list comprehension
v_sqr = [i**2 for i in v1 ]
print (v_sqr)

## or a map (wrapped in list(), since map() returns a lazy iterator in Python 3)
v_sqr = list(map( lambda x : x**2 ,v1))
print (v_sqr)

These are good options, but the performance is not great: both loop over the elements one at a time in the Python interpreter.
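
The cells that follow measure this with IPython's `%timeit` magic. Outside IPython, a rough equivalent can be sketched with the standard `timeit` module. The helper names `square_comprehension` and `square_map` and the repetition count are illustrative choices, not part of the original notebook, and the absolute timings depend on the machine.

```python
import timeit

v = list(range(1, 1000))  # same values as range(1, 1000)

def square_comprehension(values):
    """Square every element with a list comprehension."""
    return [x**2 for x in values]

def square_map(values):
    """Square every element with map; list() forces the evaluation."""
    return list(map(lambda x: x**2, values))

# n_calls is an arbitrary choice, large enough to smooth out some noise
n_calls = 200
t_comp = timeit.timeit(lambda: square_comprehension(v), number=n_calls)
t_map = timeit.timeit(lambda: square_map(v), number=n_calls)

print("list comprehension: %.1f us per call" % (1e6 * t_comp / n_calls))
print("map + lambda      : %.1f us per call" % (1e6 * t_map / n_calls))
```

Both helpers return the same list, so any difference in the timings comes from the looping mechanism alone.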

In [None]:
v1 = range(1,1000)

In [None]:
%timeit -q -o [i**2 for i in v1]

In [None]:
%timeit -q -o list(map( lambda x : x**2 ,v1))

##`Numpy` addresses these problems by providing:

- a numerical array type, suitable for scientific data (vectors, matrices, etc.)
- vectorized operations for operating on arrays
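
As a minimal sketch of the difference, the snippet below contrasts `+` on plain lists with `+` on arrays, and shows that `np.arange` accepts a non-integer step. The variable names are illustrative only.

```python
import numpy as np

a_list = [0.0, 1.0, 2.0, 3.0]
b_list = [10.0, 11.0, 12.0, 13.0]

# For lists, + means concatenation: the result has 8 elements
concatenated = a_list + b_list

# For arrays, + means element-by-element sum: the result has 4 elements
a = np.array(a_list)
b = np.array(b_list)
summed = a + b

# Unlike range(), arange() accepts a non-integer step
half_steps = np.arange(1, 3, 0.5)

print(concatenated)
print(summed)
print(half_steps)
```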


In [None]:
## first import numpy
import numpy as np


In [None]:
## we can quickly create a float vector
v1 = np.arange(1,10,0.1)
v1

In [None]:
## this is the new object type introduced in numpy
type(v1)

In [None]:
## the data type for our array is 
v1.dtype

In [None]:
## it's designed to handle vector math the way we expect.
## sum
v1 = np.array([0.,1.,2.,3.])
v2 = np.array([10.,11.,12.,13.])
v1+v2

In [None]:
## multiplication by a scalar
-2* v1

In [None]:
## note that multiplication is elementwise
v1 * v2 

In [None]:
## so is **2 
v1**2

In [None]:
## is that faster than the list comprehension version we saw earlier?
v1 = np.arange(1,1000)

In [None]:
%timeit -q -o   v1**2

##Universal functions
###The vectorization of operations is built around the concept of "universal" functions

In [None]:
np.sin(np.pi/2)

In [None]:
v1 = np.linspace(-2*np.pi,2*np.pi,50)
v1

In [None]:
v2 = np.sin(v1)
v2

`np.sin()` is a "universal function" (`ufunc`) that can operate on both single numbers and arrays.

The idea of numpy is that we should think in a vectorized way, always trying to operate on complete arrays as a unit rather than looping over their elements.
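
To illustrate what "operating on the array as a unit" means, here is a small sketch comparing an explicit Python loop with a single `ufunc` call. The names `y_loop`, `y_vec` and `z` are illustrative.

```python
import math
import numpy as np

x = np.linspace(-2 * np.pi, 2 * np.pi, 50)

# Explicit Python loop: one math.sin call per element
y_loop = np.array([math.sin(xi) for xi in x])

# Vectorized: a single ufunc call on the whole array
y_vec = np.sin(x)

print(isinstance(np.sin, np.ufunc))   # np.sin is a ufunc object
print(np.allclose(y_loop, y_vec))     # both approaches give the same values

# ufuncs combine into array expressions without any explicit loop
z = np.sin(x) ** 2 + np.cos(x) ** 2
print(np.allclose(z, 1.0))
```

The vectorized form is shorter, and the loop over the elements runs in compiled code instead of the Python interpreter.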

##`Numpy` provides many functionalities to deal with n-dimensional arrays.
There are many methods we can apply to this new data structure, `np.ndarray`.
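
As a short sketch of a few of these methods and attributes on a two-dimensional array (the array `m` is just an illustrative example):

```python
import numpy as np

m = np.arange(12).reshape(3, 4)   # 3 rows, 4 columns, values 0..11

print(m.shape)        # (3, 4)
print(m.ndim)         # 2
print(m.sum())        # 66, the sum of all elements
print(m.sum(axis=0))  # column sums: 12, 15, 18, 21
print(m.max(axis=1))  # row maxima: 3, 7, 11
print(m.T.shape)      # (4, 3), the shape of the transpose
```

The `axis` argument selects the dimension that a reduction collapses: `axis=0` reduces along the rows, leaving one value per column.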

In [None]:
v1 = np.arange(1,1000)
dir(v1)

In [None]:
##for example
print ( v1.sum() , v1.min() , v1.mean() , v1.argmax() , v1.shape)

###For a few more details on numpy ndarrays check out [this other notebook](numpy_arrays.ipynb)

## Visualizing data

The other fundamental tool we need is an API to easily visualize data.

`matplotlib` is the plotting package of the `scipy` stack.

The API was originally inspired by MATLAB, and the syntax should look quite familiar to MATLAB users.

In [None]:
import matplotlib
import numpy as np
import matplotlib.pyplot as plt

In [None]:
print(matplotlib.__version__)
print(matplotlib.get_backend())

In [None]:
## let's create two simple arrays to plot 
v1 = np.linspace(-2*np.pi,2*np.pi,50)
v2 = np.sin(v1) 

In [None]:
## the most basic way of displaying data is to use the function plot.
plt.plot(v1,v2)

Unfortunately, you cannot see the plot yet.<br>
Matplotlib has created the plot object, but it has not been rendered.<br>
To display the plot you need to call the function `show()`.

In [None]:
plt.show()

.... look for the plot. It is probably in a small window in the background.....<br>
To proceed using the notebook you need to close the plot.
.... not extremely convenient ... but luckily 

###We can have the plot embedded in the notebook using the "magic" command
####%matplotlib inline

In [None]:
%matplotlib inline

In [None]:
## now as soon as the cell is executed, the plot will be displayed in the Output cell.
plt.plot(v1,v2)

In [None]:
## similarly to MATLAB you can tune global parameters, like the figure size
plt.rcParams['figure.figsize'] = 8,6

In [None]:
##try to replot 
plt.plot(v1,v2)

In [None]:
## if you want to plot multiple dataseries on the same plot, 
## just issue multiple plot commands before issuing the show() command.
## .... in our case, just put multiple plot commands in the same cell.

plt.plot(v1, np.sin(v1), "o" , label="sin(x)")
plt.plot(v1, np.cos(v1), "--x", label="cos(x)")
plt.xlabel("x", size=20)
plt.ylabel("circular functions", size=20)
plt.legend(loc=(1.1 ,0.7 ) , fontsize='xx-large')


Let's try to understand how the graphic output is structured. <br><br>

<img src="images/figure_axes_axis_labeled.png" width="600" align="center">

####The ``Figure`` is the main container. 
You can think of it as the window that is created when you call ``plt.show()``, or as a page if you save your figure to a PDF file. 
<br>
####The actual plotting happens in an ``Axes``, which is the effective plotting area.<br>
For example, if you want to create a page with multiple panels, each panel will typically be a different ``Axes`` in the same ``Figure``.
<br>
####There are many ways to create ``Axes``. The most convenient is usually the function ``plt.subplots``.
<br>
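
The containment relationship between a ``Figure`` and its ``Axes`` can be checked directly. The following is a minimal sketch with made-up data, where the names `left` and `right` are illustrative:

```python
import matplotlib.pyplot as plt

# One Figure containing a 1x2 grid of Axes
fig, axes = plt.subplots(nrows=1, ncols=2)

left, right = axes
left.plot([0, 1, 2], [0, 1, 4])
left.set_title("left panel")
right.plot([0, 1, 2], [4, 1, 0])
right.set_title("right panel")

# The Figure keeps a list of the Axes it contains,
# and each Axes knows which Figure it belongs to
print(len(fig.axes))
print(left.figure is fig)
```

Drawing commands such as `plot` and `set_title` are called on an individual ``Axes``, while figure-wide operations such as saving to a file are called on the ``Figure``.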



In [None]:
##create a figure 
fig = plt.figure()
## add one subplot .... MATLAB users should recognize this 
ax = fig.add_subplot(111) 
print (type(ax))
### add_subplot(ABC) adds subplot C to a grid of AxB subplots

##set some features, like title, axis ranges, labels....
ax.set(xlim=[0.5, 4.5], ylim=[-2, 8], title='An example empty plot', ylabel='Y-Axis', xlabel='X-Axis')


In [None]:
### let's plot sin(x) and cos(x) in two different subplots organized vertically
## create one figure
fig = plt.figure()

## create the first Axes using add_subplot
ax = fig.add_subplot(211)
ax.set_title('Plot number 1')
ax.set_ylabel('cos(x)')
ax.plot(v1,np.cos(v1))

## and now add the second one
ax = fig.add_subplot(212)
ax.set_title('Plot number 2', fontsize=24)
ax.set_ylabel('sin(x)')
ax.plot(v1,np.sin(v1))

## tight_layout() is a useful tool: when the figure gets rendered, 
## Matplotlib tries to rearrange the layout to avoid overlaps 
plt.tight_layout()

In [None]:
## similarly, we can make a grid of 2 subplots with a horizontal layout 

## we can tune the aspect ratio to make it look better.
fig = plt.figure(figsize=plt.figaspect(0.25))

ax = fig.add_subplot(121)
ax.set_title('Plot number 1')
ax.set_ylabel('cos(x)')
ax.plot(v1,np.cos(v1))

ax = fig.add_subplot(122)
ax.set_title('Plot number 2')
ax.set_ylabel('sin(x)')
ax.plot(v1,np.sin(v1))

In [None]:
## if we want to create many subplots, the best option is the function subplots, 
## which returns the figure and an ndarray holding the Axes objects  
fig, axes = plt.subplots(nrows=4)
print ( type(axes) )
axes

In [None]:
## create the grid
fig, axes = plt.subplots(nrows=4)

## draw a plot in each of the Axes
for i,ax in enumerate(axes):
    ax.plot(v1,np.sin(i * np.pi/4 + v1))
    ax.set_title('plot number %d'%i)

plt.tight_layout()


In [None]:
## we can extend the same concept to a 2d grid of plots 
fig, axes = plt.subplots(nrows=4,ncols=4,figsize=plt.figaspect(0.5))
for i,axs in enumerate(axes):
    for j,ax in enumerate(axs) : 
        ax.plot( v1, np.sin( ( i*4+j)* np.pi/16 + v1), 'r--o')
        ax.set_title('plot number %d , %d' % (i,j)) 
    
plt.tight_layout()

## we can export the current figure using the method savefig()
plt.savefig("multiplot.pdf")

### There are a variety of plotting methods for displaying data. 

The best source of information is the "gallery" on the [matplotlib website](http://matplotlib.org/gallery.html).

You can find many examples of plots generated along with the code used to generate them.
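
As a small taste of what is available, the sketch below draws three other common plot types side by side, using the same kind of data as above. The choice of plot types, the bin count and the bar values are illustrative only.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-2 * np.pi, 2 * np.pi, 50)
y = np.sin(x)

fig, axes = plt.subplots(nrows=1, ncols=3, figsize=plt.figaspect(0.3))

# scatter: individual markers, here colored by the y value
axes[0].scatter(x, y, c=y)
axes[0].set_title("scatter")

# hist: distribution of the sampled sin(x) values in 10 bins
counts, bin_edges, patches = axes[1].hist(y, bins=10)
axes[1].set_title("hist")

# bar: one bar per category (made-up values)
axes[2].bar(["a", "b", "c"], [3, 7, 5])
axes[2].set_title("bar")

fig.tight_layout()
```

Each of these is a method of an ``Axes`` object, so they combine freely with the subplot grids shown earlier.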

In [None]:
%reset