# pandas Basics
__[**pandas**](https://pandas.pydata.org/)__ is a library providing fast, flexible, and expressive data structures for data analysis in Python.

## Main Features
* pandas has two primary data structures, Series (1-dimensional) and DataFrame (2-dimensional). 
* For R users, DataFrame provides everything that R’s data.frame provides and much more. 
* pandas is built on top of NumPy and is intended to integrate well with the many third-party libraries of the scientific computing ecosystem.
* pandas includes robust I/O tools for loading data from flat files (CSV and other delimited formats), Excel files, and databases, and for saving and loading data in the fast HDF5 format.
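
As a rough illustration of the first and third points, the sketch below builds a Series and a DataFrame by hand and passes the result to NumPy. The tenor labels (`1W`, `2W`, `1M`) and the variable names are made up for this example and do not come from the notebook's data file.

```python
import numpy as np
import pandas as pd

# A Series is a 1-dimensional labelled array.
bid = pd.Series([3.99, 4.05, 4.13], index=["1W", "2W", "1M"], name="BID")

# A DataFrame is a 2-dimensional table; each column is a Series
# and all columns share the same row labels (hypothetical tenors here).
quotes = pd.DataFrame(
    {"BID": [3.99, 4.05, 4.13], "ASK": [4.03, 4.09, 4.18]},
    index=["1W", "2W", "1M"],
)

# Arithmetic is applied element by element, aligned on the row labels.
mid = (quotes["BID"] + quotes["ASK"]) / 2

# The data can be handed to NumPy as a plain ndarray.
values = quotes.to_numpy()

print(bid.ndim, quotes.ndim)
print(type(values))
print(np.round(mid.to_numpy(), 3))
```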

## Read Data
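Reading data from an Excel file takes a single call to `pd.read_excel`. Here `skiprows=9` skips the first nine rows of the sheet, `usecols='D:F'` keeps only columns D to F, and `index_col=0` uses the first of those columns as the row labels: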

In [34]:
import pandas as pd

deposits = pd.read_excel('MktData_CurveBootstrap.xls', index_col=0, skiprows=9, usecols='D:F')
# keep only the first eight rows
deposits = deposits[:8]

print(deposits)

                      BID   ASK
Depos                          
2008-02-20 00:00:00  3.99  4.03
2008-02-26 00:00:00  4.05  4.09
2008-03-19 00:00:00  4.13  4.18
2008-04-21 00:00:00  4.21  4.27
2008-05-19 00:00:00   4.3  4.36
2008-08-19 00:00:00  4.29  4.35
2008-11-19 00:00:00  4.29  4.35
2009-02-19 00:00:00  4.29  4.35
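
The Excel file itself is not reproduced here, so the following is only a sketch of the same idea using `pd.read_csv` on an in-memory string. The file contents, the extra `Comment` column, and the name `sample` are invented for illustration. Unlike the cell above, the index column is given by name rather than by position.

```python
import io

import pandas as pd

# Hypothetical file contents: two junk lines, a header row, then the data.
raw = """exported by some tool
(this line is ignored too)
Depos,BID,ASK,Comment
2008-02-20,3.99,4.03,a
2008-02-26,4.05,4.09,b
2008-03-19,4.13,4.18,c
"""

sample = pd.read_csv(
    io.StringIO(raw),
    skiprows=2,                       # skip the two junk lines
    usecols=["Depos", "BID", "ASK"],  # ignore the Comment column
    index_col="Depos",                # use the dates as row labels
    parse_dates=True,                 # try to parse the index as dates
)

print(sample)
print(sample.shape)
```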


In [41]:
# the result is a DataFrame
print(type(deposits))

<class 'pandas.core.frame.DataFrame'>


In [43]:
# the first column is used as the row labels, so it is not counted in the shape
print(deposits.shape)

(8, 2)


In [52]:
# the axes of the DataFrame are the row labels and the column labels
print(deposits.axes)

[Index([2008-02-20 00:00:00, 2008-02-26 00:00:00, 2008-03-19 00:00:00,
       2008-04-21 00:00:00, 2008-05-19 00:00:00, 2008-08-19 00:00:00,
       2008-11-19 00:00:00, 2009-02-19 00:00:00],
      dtype='object', name='Depos'), Index(['BID', 'ASK'], dtype='object')]


In [53]:
# the row labels (here the dates) are the first axis; deposits.index is equivalent
dates = deposits.axes[0].tolist()

print(dates)

[datetime.datetime(2008, 2, 20, 0, 0), datetime.datetime(2008, 2, 26, 0, 0), datetime.datetime(2008, 3, 19, 0, 0), datetime.datetime(2008, 4, 21, 0, 0), datetime.datetime(2008, 5, 19, 0, 0), datetime.datetime(2008, 8, 19, 0, 0), datetime.datetime(2008, 11, 19, 0, 0), datetime.datetime(2009, 2, 19, 0, 0)]
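
The outputs above report `dtype='object'` for the row labels, which means pandas is storing generic Python objects rather than a dedicated date type. One possible way to get typed data is sketched below on a small hand-built stand-in (`deposits_demo` is a hypothetical name, not the frame read from the Excel file), assuming every value really is a number or a date.

```python
import datetime as dt

import pandas as pd

# Small stand-in with object dtypes, mimicking the outputs shown above.
deposits_demo = pd.DataFrame(
    {"BID": [3.99, 4.05, 4.13], "ASK": [4.03, 4.09, 4.18]},
    index=pd.Index(
        [dt.datetime(2008, 2, 20), dt.datetime(2008, 2, 26), dt.datetime(2008, 3, 19)],
        dtype=object,
        name="Depos",
    ),
    dtype=object,
)

# Convert the columns to floats and the row labels to a DatetimeIndex.
typed = deposits_demo.astype(float)
typed.index = pd.to_datetime(typed.index)

print(deposits_demo.dtypes)
print(typed.dtypes)
print(type(typed.index))
```

With numeric columns and a `DatetimeIndex`, numerical methods such as `typed["BID"].mean()` and date-based selection behave as expected.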


## Access Data
There are many ways to access the data in a DataFrame; we'll look at only a few of them.

### Select a single column

In [54]:
bids = deposits['BID']
print(bids)

Depos
2008-02-20    3.99
2008-02-26    4.05
2008-03-19    4.13
2008-04-21    4.21
2008-05-19     4.3
2008-08-19    4.29
2008-11-19    4.29
2009-02-19    4.29
Name: BID, dtype: object


In [55]:
print(type(bids))

<class 'pandas.core.series.Series'>
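
A related detail worth a quick sketch: indexing with a single column name returns a Series, while indexing with a list of names returns a DataFrame, even when the list has only one element. The frame `quotes` below is a hand-built example, not the notebook's data.

```python
import pandas as pd

quotes = pd.DataFrame({"BID": [3.99, 4.05], "ASK": [4.03, 4.09]})

one = quotes["BID"]      # a single name gives a Series
many = quotes[["BID"]]   # a list of names gives a DataFrame

print(type(one))
print(type(many))
print(many.shape)
```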


### Select some rows (slicing)

In [56]:
print(deposits[3:])

                      BID   ASK
Depos                          
2008-04-21 00:00:00  4.21  4.27
2008-05-19 00:00:00   4.3  4.36
2008-08-19 00:00:00  4.29  4.35
2008-11-19 00:00:00  4.29  4.35
2009-02-19 00:00:00  4.29  4.35
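
An integer slice inside plain square brackets selects rows by position, and a boolean Series inside the brackets filters rows by a condition. The sketch below shows both on a hand-built frame; the row labels `a` to `d` are hypothetical.

```python
import pandas as pd

quotes = pd.DataFrame(
    {"BID": [3.99, 4.05, 4.13, 4.21], "ASK": [4.03, 4.09, 4.18, 4.27]},
    index=["a", "b", "c", "d"],  # hypothetical row labels
)

# Integer slice: rows at positions 2 and 3, same as quotes.iloc[2:].
tail = quotes[2:]
same_tail = quotes.iloc[2:]

# Boolean mask: keep only the rows where ASK is below 4.1.
cheap = quotes[quotes["ASK"] < 4.1]

print(tail.equals(same_tail))
print(cheap.index.tolist())
```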


### Select by Labels
pandas provides a dedicated indexer for selecting data by label: **loc**.

In [68]:
# select specific columns by label
print(deposits.loc[:,['BID', 'ASK']])

                      BID   ASK
Depos                          
2008-02-20 00:00:00  3.99  4.03
2008-02-26 00:00:00  4.05  4.09
2008-03-19 00:00:00  4.13  4.18
2008-04-21 00:00:00  4.21  4.27
2008-05-19 00:00:00   4.3  4.36
2008-08-19 00:00:00  4.29  4.35
2008-11-19 00:00:00  4.29  4.35
2009-02-19 00:00:00  4.29  4.35


In [69]:
print(type(deposits.loc[:,['BID', 'ASK']]))

<class 'pandas.core.frame.DataFrame'>


In [70]:
# select a specific value by row label and column label
print(deposits.loc[dates[0], 'ASK'])

4.03


In [71]:
print(type(deposits.loc[dates[0], 'ASK']))

<class 'float'>
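
Two further uses of `loc` are sketched below on a hand-built frame (not the notebook's data): slicing by label, where both endpoints are included, and combining a boolean condition on the rows with a column label.

```python
import pandas as pd

quotes = pd.DataFrame(
    {"BID": [3.99, 4.05, 4.13, 4.21], "ASK": [4.03, 4.09, 4.18, 4.27]},
    index=pd.to_datetime(["2008-02-20", "2008-02-26", "2008-03-19", "2008-04-21"]),
)

# Label slices include BOTH endpoints, for rows and for columns.
window = quotes.loc["2008-02-26":"2008-03-19", "BID":"ASK"]

# A boolean condition selects the rows; the label selects the column.
bids_wide = quotes.loc[quotes["ASK"] > 4.1, "BID"]

print(window)
print(bids_wide.tolist())
```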


### Select by Position
pandas provides a dedicated indexer for selecting data by integer position: **iloc**.

In [72]:
# select single row
print(deposits.iloc[3])

BID    4.21
ASK    4.27
Name: 2008-04-21 00:00:00, dtype: object


In [73]:
print(type(deposits.iloc[3]))

<class 'pandas.core.series.Series'>


In [74]:
# slicing by position (the end position is excluded, as in Python lists)
print(deposits.iloc[3:5, 1:])

                      ASK
Depos                    
2008-04-21 00:00:00  4.27
2008-05-19 00:00:00  4.36


In [75]:
print(type(deposits.iloc[3:5, 1:]))

<class 'pandas.core.frame.DataFrame'>


In [76]:
# select a specific value by row position and column position
print(deposits.iloc[1, 1])

4.09
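
`iloc` follows the usual Python conventions for positions: slices exclude the end position, negative numbers count from the end, and a list of positions picks arbitrary rows. A minimal sketch on a hand-built frame with hypothetical row labels:

```python
import pandas as pd

quotes = pd.DataFrame(
    {"BID": [3.99, 4.05, 4.13, 4.21], "ASK": [4.03, 4.09, 4.18, 4.27]},
    index=["a", "b", "c", "d"],  # hypothetical row labels
)

last_row = quotes.iloc[-1]         # negative positions count from the end
middle = quotes.iloc[1:3]          # positions 1 and 2; position 3 is excluded
picked = quotes.iloc[[0, 2], 0]    # rows 0 and 2 of the first column

print(last_row["ASK"])
print(middle.index.tolist())
print(picked.tolist())
```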


## Other Resources
* __[pandas Official Documentation](http://pandas.pydata.org/pandas-docs/stable/)__