<p style="text-align: center; font-size: 300%"> Computational Finance </p>
<img src="img/ABSlogo.svg" alt="LOGO" style="display:block; margin-left: auto; margin-right: auto; width: 50%;">

# Dealing with Data
## More Datatypes
### NumPy Arrays
* The most fundamental data type in numerical Python is `ndarray`, provided by the NumPy package.
* An array is similar to a `list`, except that
  * it can have more than one dimension;
  * its elements are homogeneous (they all have the same type).
* NumPy provides a large number of functions (*ufuncs*) that operate elementwise on arrays. This allows *vectorized* code, which avoids explicit loops (these are slow in Python).
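
To make the idea of vectorized code concrete, here is a minimal sketch comparing an explicit loop with the equivalent elementwise expression. The variable names (`x`, `squares_loop`, `squares_vec`) are illustrative only and do not appear elsewhere in these notes.

```python
import numpy as np

x = np.arange(5)  # array([0, 1, 2, 3, 4])

# Loop version: square one element at a time in Python.
squares_loop = np.array([v ** 2 for v in x])

# Vectorized version: a single expression, applied elementwise by NumPy.
squares_vec = x ** 2

print(squares_vec)
print(np.array_equal(squares_loop, squares_vec))
```

Both versions give the same result; the vectorized form is shorter and, for large arrays, usually much faster.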

#### Constructing Arrays
* Arrays can be constructed using the `array` function, which takes sequences (e.g., lists) and converts them into arrays. The data type is inferred automatically or can be specified.

In [1]:
import numpy as np
a=np.array([1, 2, 3, 4])
a.dtype

dtype('int64')

In [2]:
a=np.array([1, 2, 3, 4],dtype='float64') #or np.array([1., 2., 3., 4.])
a.dtype

dtype('float64')
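
The `dtype` values shown above are fixed-width NumPy types. As a small sketch of what that means in practice, the following inspects the storage size and range of `int64` and checks how `float64` relates to Python's built-in `float`. The variable names are chosen for illustration.

```python
import numpy as np

ints = np.array([1, 2, 3, 4], dtype='int64')
floats = np.array([1., 2., 3., 4.])

print(ints.itemsize)             # bytes used per element
print(np.iinfo(np.int64).max)    # largest value an int64 can hold
print(floats.dtype)              # dtype inferred from the float literals

first = floats[0]                # a NumPy scalar of type np.float64
print(isinstance(first, float))  # np.float64 is a subclass of Python's float
```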

* NumPy uses C data types, which differ from Python's built-in types (though `float64` is equivalent to Python's `float`).

* Nested lists result in multidimensional arrays. We won't need anything beyond two-dimensional (i.e., a matrix or table).

In [3]:
b=np.array([[1., 2.], [3., 4.]])
print(b)

[[ 1.  2.]
 [ 3.  4.]]


In [4]:
b.ndim #Number of dimensions

2

In [24]:
b.shape #number of rows and columns

(2, 2)

* Other functions for creating arrays:

In [46]:
np.eye(3, dtype='float64') #identity matrix. float64 is the default dtype and can be omitted

array([[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.]])

In [47]:
np.ones([2,3]) #there's also np.zeros, and np.empty (which returns an uninitialized array)

array([[ 1.,  1.,  1.],
       [ 1.,  1.,  1.]])

In [88]:
np.arange(0,10,2) #like range, but creates an array instead of a list

array([0, 2, 4, 6, 8])

In [97]:
np.linspace(0,10,5) #5 equally spaced points between 0 and 10

array([  0. ,   2.5,   5. ,   7.5,  10. ])
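
The difference between `arange` and `linspace` is easy to miss, so here is a short sketch: `arange` takes a step size and excludes the stop value, while `linspace` takes the number of points and includes both endpoints by default. The names `by_step` and `by_count` are illustrative.

```python
import numpy as np

by_step = np.arange(0, 10, 2)     # step of 2; the stop value 10 is excluded
by_count = np.linspace(0, 10, 5)  # 5 points; both 0 and 10 are included

print(by_step)
print(by_count)
```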

#### Indexing
* Indexing and slicing operations are similar to lists:

In [6]:
b[0,0] #indexing [row, column]. Equivalent to b[0][0]

1.0

In [59]:
c=b[:,0]; c #First column. Note that this yields a 1-dimensional array, not a matrix 

array([ 1.,  3.])

* Slicing returns *views* into the original array (unlike slicing lists):

In [22]:
b[0]=42

In [48]:
c

array([ 42.,   3.])
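
If an independent copy is needed rather than a view, the `copy` method can be used. The sketch below repeats the steps above with both a view and a copy, so the difference is visible side by side. The names `view` and `snapshot` are illustrative.

```python
import numpy as np

b = np.array([[1., 2.], [3., 4.]])

view = b[:, 0]             # slice: shares memory with b
snapshot = b[:, 0].copy()  # explicit copy: independent of b

b[0] = 42                  # assigns 42 to every element of the first row

print(view)                # reflects the change made to b
print(snapshot)            # still holds the original values
print(np.shares_memory(b, view), np.shares_memory(b, snapshot))
```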

* Apart from indexing by row and column, arrays also support *Boolean* indexing:

In [67]:
d=np.arange(10); d[d<5]=10; d

array([10, 10, 10, 10, 10,  5,  6,  7,  8,  9])
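
The one-liner above combines several steps. As a sketch, it can be unpacked as follows: the comparison produces a Boolean array (a *mask*), which can then be used either to select elements or to assign to them. The names `mask` and `selected` are illustrative.

```python
import numpy as np

d = np.arange(10)

mask = d < 5        # Boolean array with the same shape as d
print(mask)

selected = d[mask]  # selecting with a mask returns a copy, not a view
print(selected)

d[mask] = 10        # assigning through the mask modifies d in place
print(d)
```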

#### Arithmetic and ufuncs
* NumPy ufuncs are functions that operate elementwise:

In [79]:
a=np.arange(1,5); np.sqrt(a)

array([ 1.        ,  1.41421356,  1.73205081,  2.        ])

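* Other useful ufuncs are `exp`, `log`, `abs`, and `sqrt`.
* Basic arithmetic on arrays works elementwise (the `a / b` result below reflects Python 2, where `/` on integer arrays performs integer division; Python 3 returns floats):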

In [85]:
a=np.arange(1,5); b=np.arange(5,9); a, b, a + b, a - b, a / b

(array([1, 2, 3, 4]),
 array([5, 6, 7, 8]),
 array([ 6,  8, 10, 12]),
 array([-4, -4, -4, -4]),
 array([0, 0, 0, 0]))
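
The zeros in the last output are what integer division produces. As a sketch of how the two kinds of division behave under Python 3 (an assumption about the reader's environment, not the one used to produce the output above), `/` performs true division and `//` performs floor division:

```python
import numpy as np

a = np.arange(1, 5)
b = np.arange(5, 9)

true_div = a / b    # Python 3: elementwise true division, float result
floor_div = a // b  # elementwise floor division, integer result

print(true_div)
print(floor_div)
```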

#### Broadcasting
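
As a minimal sketch of what broadcasting refers to in NumPy: when two arrays in an elementwise operation have different but compatible shapes, the smaller one is implicitly stretched to match the larger. The arrays below are made up for illustration.

```python
import numpy as np

m = np.array([[1., 2., 3.],
              [4., 5., 6.]])      # shape (2, 3)
row = np.array([10., 20., 30.])   # shape (3,)
col = np.array([[100.], [200.]])  # shape (2, 1)

print(m * 2)    # the scalar is applied to every element
print(m + row)  # row is added to each of the two rows of m
print(m + col)  # col is added to each of the three columns of m
```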

#### Useful Statistical Functions
* `mean`, `sum`, `cumsum`, `seed`
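
A brief sketch of the functions listed above, assuming that `seed` refers to `np.random.seed`. The small array `y` is made up for illustration.

```python
import numpy as np

y = np.array([[1., 2.],
              [3., 4.]])

total = y.sum()              # sum over all elements
col_means = y.mean(axis=0)   # mean of each column
running = y.cumsum()         # cumulative sum over the flattened array

print(total, col_means, running)

# Seeding the random number generator makes random draws reproducible.
np.random.seed(0)
first_draw = np.random.randn(3)
np.random.seed(0)
second_draw = np.random.randn(3)
print(np.array_equal(first_draw, second_draw))
```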

#### Storing Arrays

### Pandas Dataframes
#### Series

#### Dataframes

In [8]:
import pandas as pd

#### Fetching Data
* `pandas_datareader` is a package that makes it easy to fetch financial data from the web.
* It used to be included in pandas (and therefore Anaconda). In newer versions, install it with `conda install pandas-datareader`.
* See http://pandas-datareader.readthedocs.io/en/latest/remote_data.html for the documentation.

In [9]:
#!conda install -y pandas-datareader #uncomment to install. (Note ! executes shell commands)
import pandas_datareader.data as web

In [10]:
start = pd.Timestamp(2010, 1, 1) #pd.datetime has been removed from recent pandas versions
end = pd.Timestamp(2013, 1, 27)
f = web.DataReader("AAPL", 'yahoo', start, end)
f.index

DatetimeIndex(['2009-12-31', '2010-01-04', '2010-01-05', '2010-01-06',
               '2010-01-07', '2010-01-08', '2010-01-11', '2010-01-12',
               '2010-01-13', '2010-01-14',
               ...
               '2013-01-11', '2013-01-14', '2013-01-15', '2013-01-16',
               '2013-01-17', '2013-01-18', '2013-01-22', '2013-01-23',
               '2013-01-24', '2013-01-25'],
              dtype='datetime64[ns]', name=u'Date', length=772, freq=None)
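
The `DatetimeIndex` shown above allows rows to be selected by date. Since the download depends on a remote service, the sketch below uses a small synthetic stand-in instead: the dates are business days generated with `pd.date_range`, and the prices in the hypothetical `Close` column are made up, not real AAPL data.

```python
import pandas as pd

# Synthetic stand-in for downloaded data; the prices are invented.
dates = pd.date_range("2010-01-04", periods=5, freq="B")  # five business days
prices = pd.DataFrame({"Close": [100.0, 101.0, 99.5, 102.0, 103.5]},
                      index=dates)
prices.index.name = "Date"

# Label-based slicing on dates includes both endpoints.
window = prices.loc["2010-01-05":"2010-01-07"]
print(window)

# A single value can be looked up by date and column.
one_close = prices.loc[pd.Timestamp("2010-01-06"), "Close"]
print(one_close)
```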

## Regression Analysis

## Plotting with `matplotlib`