# Agenda
* Numpy
* Pandas
* Lab


# Introduction


## Create a new notebook for your code-along:

From our submission directory, type:
    
    jupyter notebook

From the IPython Dashboard, open a new notebook.
Change the title to: "Numpy and Pandas"

In [6]:
my_matrix = [[1,2],[3,4]]
for i,row in enumerate(my_matrix):
    for j,col in enumerate(row):
        my_matrix[i][j] = col * 2
        
my_matrix

[[2, 4], [6, 8]]
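
For comparison, here is a minimal sketch of the same doubling written with NumPy instead of nested loops. The name `doubled` is introduced here only for illustration.

```python
import numpy as np

# Same starting data as the nested-loop cell above
my_matrix = [[1, 2], [3, 4]]

# Converting the list of lists to an ndarray lets a single expression
# double every element, with no explicit loops or index bookkeeping
doubled = np.array(my_matrix) * 2

print(doubled.tolist())
```

The rest of the NumPy section builds on this idea: operations are written once and applied to every element.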

# Introduction to Numpy

* Overview
* ndarray
* Indexing and Slicing

More info: [http://wiki.scipy.org/Tentative_NumPy_Tutorial](http://wiki.scipy.org/Tentative_NumPy_Tutorial)


## Numpy Overview

* Why Python for data? NumPy brings *decades* of C math into Python!
* NumPy wraps extensive C/C++/Fortran codebases that provide data analysis functionality
* The ndarray allows easy vectorized math and broadcasting (i.e. element-wise operations between arrays of different but compatible shapes)
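
As a small illustration of broadcasting, the sketch below adds a one-dimensional row and a single-column array to a 2x3 matrix. The array names are made up for this example.

```python
import numpy as np

matrix = np.arange(6).reshape(2, 3)   # shape (2, 3): [[0, 1, 2], [3, 4, 5]]
row = np.array([10, 20, 30])          # shape (3,)
col = np.array([[100], [200]])        # shape (2, 1)

# The (3,) row is stretched across both rows of the matrix
plus_row = matrix + row

# The (2, 1) column is stretched across all three columns
plus_col = matrix + col

print(plus_row)
print(plus_col)
```

Shapes are compatible when, compared from the last dimension backwards, each pair of sizes is equal or one of them is 1.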

In [7]:
import numpy as np

### Creating ndarrays

An array object represents a multidimensional, homogeneous array of fixed-size items. 

In [8]:
# Creating arrays
a = np.zeros((3))
b = np.ones((2,3))
c = np.random.randint(1,10,(2,3,4))
d = np.arange(0,11,2)

What are these functions?

    arange?

In [9]:
# Note the way each array is printed:
a

array([ 0.,  0.,  0.])

In [10]:
b

array([[ 1.,  1.,  1.],
       [ 1.,  1.,  1.]])

In [11]:
c

array([[[3, 6, 3, 7],
        [4, 1, 1, 1],
        [9, 1, 9, 5]],

       [[9, 7, 2, 1],
        [3, 1, 5, 7],
        [8, 1, 7, 9]]])

In [12]:
d

array([ 0,  2,  4,  6,  8, 10])

In [13]:
## Arithmetic on arrays is element-wise

In [14]:
a = np.array( [20,30,40,50] )
b = np.arange( 4 )
a,b

(array([20, 30, 40, 50]), array([0, 1, 2, 3]))

In [15]:
c = a-b
c

array([20, 29, 38, 47])

In [16]:
b**2

array([0, 1, 4, 9])
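
Because arithmetic is element-wise, `*` between two 2-D arrays is not a matrix product. The following sketch contrasts the two on small example arrays that are not part of the notebook.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[0, 1], [1, 0]])

# Element-wise: each entry of a is multiplied by the matching entry of b
elementwise = a * b

# Matrix product: rows of a combined with columns of b
matrix_product = np.dot(a, b)

print(elementwise)
print(matrix_product)
```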

## Indexing, Slicing and Iterating

In [17]:
# one-dimensional arrays work like lists:
a = np.arange(10)**2

In [18]:
a

array([ 0,  1,  4,  9, 16, 25, 36, 49, 64, 81])

In [19]:
a[-1]

81

In [20]:
# Multidimensional arrays are indexed with comma-separated tuples
# in (row, column) order, counting from 0 as always in Python

In [21]:
b = np.random.randint(1,100,(4,4))

In [22]:
b

array([[60, 27,  8, 30],
       [40, 39,  7, 50],
       [88, 10, 62, 91],
       [ 7,  1, 85, 69]])

In [23]:
len(b)

4

In [24]:
diag = []
for i in range(len(b)):
    # b[i, i] walks the main diagonal from top-left to bottom-right
    diag.append(b[i, i])
print(diag)

[60, 39, 62, 69]
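
NumPy also offers a helper for this, so the loop is not strictly needed. The sketch below rebuilds the same 4x4 array by hand (rather than randomly) and reads the main diagonal with `np.diag`, which returns it in top-left to bottom-right order. The anti-diagonal is included as an extra illustration.

```python
import numpy as np

# The same values as the random array b shown above, entered by hand
b = np.array([[60, 27,  8, 30],
              [40, 39,  7, 50],
              [88, 10, 62, 91],
              [ 7,  1, 85, 69]])

# Main diagonal: b[0, 0], b[1, 1], b[2, 2], b[3, 3]
main_diag = np.diag(b)

# Flipping the columns left-to-right turns the anti-diagonal into the main one
anti_diag = np.diag(np.fliplr(b))

print(main_diag)
print(anti_diag)
```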


In [25]:
b[1:3,1:3]

array([[39,  7],
       [10, 62]])

In [26]:
# Guess the output
print(b[2,3])
print(b[0,0])


91
60


In [27]:
b[0:3,1],b[:,1]

(array([27, 39, 10]), array([27, 39, 10,  1]))

In [28]:
b

array([[60, 27,  8, 30],
       [40, 39,  7, 50],
       [88, 10, 62, 91],
       [ 7,  1, 85, 69]])

In [29]:
b[1:3,:]

array([[40, 39,  7, 50],
       [88, 10, 62, 91]])

# Introduction to Pandas

* Object Creation
* Viewing data
* Selection
* Missing data
* Grouping
* Reshaping
* Time series
* Plotting
* i/o
 

_pandas.pydata.org_

## Pandas Overview

_Source: [pandas.pydata.org](http://pandas.pydata.org/pandas-docs/stable/10min.html)_

In [30]:
import pandas as pd
import numpy as np
import cufflinks as cf
cf.go_offline()



In [31]:
dates = pd.date_range('20170101',periods=6)
dates

DatetimeIndex(['2017-01-01', '2017-01-02', '2017-01-03', '2017-01-04',
               '2017-01-05', '2017-01-06'],
              dtype='datetime64[ns]', freq='D')

In [32]:
# np.random.randn(600,4)

In [33]:
df = pd.DataFrame(np.random.randn(6,4),index=dates,columns=list('ABCD'))
df

Unnamed: 0,A,B,C,D
2017-01-01,-0.57721,-0.032442,0.838743,-0.404661
2017-01-02,-0.652372,0.601515,-0.814833,-1.52699
2017-01-03,-0.899049,0.054533,0.226235,-1.118606
2017-01-04,-0.114459,-0.364977,0.126552,0.55304
2017-01-05,1.043032,-0.100636,0.934057,0.272236
2017-01-06,-1.022603,-0.439057,-0.915876,0.548895


In [34]:
# Transpose: rows become columns and vice versa
df = df.T
df

Unnamed: 0,2017-01-01 00:00:00,2017-01-02 00:00:00,2017-01-03 00:00:00,2017-01-04 00:00:00,2017-01-05 00:00:00,2017-01-06 00:00:00
A,-0.57721,-0.652372,-0.899049,-0.114459,1.043032,-1.022603
B,-0.032442,0.601515,0.054533,-0.364977,-0.100636,-0.439057
C,0.838743,-0.814833,0.226235,0.126552,0.934057,-0.915876
D,-0.404661,-1.52699,-1.118606,0.55304,0.272236,0.548895


In [35]:
# Transpose again to restore the original orientation
df = df.T
df

Unnamed: 0,A,B,C,D
2017-01-01,-0.57721,-0.032442,0.838743,-0.404661
2017-01-02,-0.652372,0.601515,-0.814833,-1.52699
2017-01-03,-0.899049,0.054533,0.226235,-1.118606
2017-01-04,-0.114459,-0.364977,0.126552,0.55304
2017-01-05,1.043032,-0.100636,0.934057,0.272236
2017-01-06,-1.022603,-0.439057,-0.915876,0.548895


In [36]:
df2 = pd.DataFrame({ 'A' : 1.,
                         'B' : pd.Timestamp('20130102'),
                         'C' : pd.Series(1,index=list(range(4)),dtype='float32'),
                         'D' : np.array([3] * 4,dtype='int32'),
                         'E' : 'foo' })
    

df2

Unnamed: 0,A,B,C,D,E
0,1.0,2013-01-02,1.0,3,foo
1,1.0,2013-01-02,1.0,3,foo
2,1.0,2013-01-02,1.0,3,foo
3,1.0,2013-01-02,1.0,3,foo


In [37]:
# With specific dtypes
df2.dtypes

A           float64
B    datetime64[ns]
C           float32
D             int32
E            object
dtype: object

### Viewing Data

In [38]:
df.head()

Unnamed: 0,A,B,C,D
2017-01-01,-0.57721,-0.032442,0.838743,-0.404661
2017-01-02,-0.652372,0.601515,-0.814833,-1.52699
2017-01-03,-0.899049,0.054533,0.226235,-1.118606
2017-01-04,-0.114459,-0.364977,0.126552,0.55304
2017-01-05,1.043032,-0.100636,0.934057,0.272236


In [39]:
df.tail()

Unnamed: 0,A,B,C,D
2017-01-02,-0.652372,0.601515,-0.814833,-1.52699
2017-01-03,-0.899049,0.054533,0.226235,-1.118606
2017-01-04,-0.114459,-0.364977,0.126552,0.55304
2017-01-05,1.043032,-0.100636,0.934057,0.272236
2017-01-06,-1.022603,-0.439057,-0.915876,0.548895


In [40]:
df.index

DatetimeIndex(['2017-01-01', '2017-01-02', '2017-01-03', '2017-01-04',
               '2017-01-05', '2017-01-06'],
              dtype='datetime64[ns]', freq='D')

In [41]:
df.describe()

Unnamed: 0,A,B,C,D
count,6.0,6.0,6.0,6.0
mean,-0.370444,-0.046844,0.065813,-0.279348
std,0.760317,0.371157,0.78993,0.889918
min,-1.022603,-0.439057,-0.915876,-1.52699
25%,-0.83738,-0.298892,-0.579487,-0.94012
50%,-0.614791,-0.066539,0.176394,-0.066212
75%,-0.230147,0.032789,0.685616,0.47973
max,1.043032,0.601515,0.934057,0.55304


In [42]:
df.sort_values(by='B', ascending=False)

Unnamed: 0,A,B,C,D
2017-01-02,-0.652372,0.601515,-0.814833,-1.52699
2017-01-03,-0.899049,0.054533,0.226235,-1.118606
2017-01-01,-0.57721,-0.032442,0.838743,-0.404661
2017-01-05,1.043032,-0.100636,0.934057,0.272236
2017-01-04,-0.114459,-0.364977,0.126552,0.55304
2017-01-06,-1.022603,-0.439057,-0.915876,0.548895


### Selection

In [43]:
df[ ['A','B'] ]

Unnamed: 0,A,B
2017-01-01,-0.57721,-0.032442
2017-01-02,-0.652372,0.601515
2017-01-03,-0.899049,0.054533
2017-01-04,-0.114459,-0.364977
2017-01-05,1.043032,-0.100636
2017-01-06,-1.022603,-0.439057


In [44]:
df[:]

Unnamed: 0,A,B,C,D
2017-01-01,-0.57721,-0.032442,0.838743,-0.404661
2017-01-02,-0.652372,0.601515,-0.814833,-1.52699
2017-01-03,-0.899049,0.054533,0.226235,-1.118606
2017-01-04,-0.114459,-0.364977,0.126552,0.55304
2017-01-05,1.043032,-0.100636,0.934057,0.272236
2017-01-06,-1.022603,-0.439057,-0.915876,0.548895


In [45]:
# By label
print(dates[0])
df.loc[dates[0]]

2017-01-01 00:00:00


A   -0.577210
B   -0.032442
C    0.838743
D   -0.404661
Name: 2017-01-01 00:00:00, dtype: float64

In [46]:
# multi-axis by label
df.loc[:,['A','B']]

Unnamed: 0,A,B
2017-01-01,-0.57721,-0.032442
2017-01-02,-0.652372,0.601515
2017-01-03,-0.899049,0.054533
2017-01-04,-0.114459,-0.364977
2017-01-05,1.043032,-0.100636
2017-01-06,-1.022603,-0.439057


In [47]:
# Date Range
df.loc['20170102':'20170104',['B']]

Unnamed: 0,B
2017-01-02,0.601515
2017-01-03,0.054533
2017-01-04,-0.364977


In [48]:
# Fast access to scalar
df.at[dates[1],'B']

0.60151452080654644

In [49]:
# iloc provides integer locations similar to np style
df.iloc[3:10, 1:3]

Unnamed: 0,B,C
2017-01-04,-0.364977,0.126552
2017-01-05,-0.100636,0.934057
2017-01-06,-0.439057,-0.915876
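
The two selectors treat slice endpoints differently, which is easy to miss. The sketch below uses a small integer-valued frame (an assumption made so the result is reproducible) to show that `.loc` label slices include both endpoints, while `.iloc` position slices exclude the end and silently clip out-of-range bounds.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20170101', periods=6)
df = pd.DataFrame(np.arange(24).reshape(6, 4), index=dates, columns=list('ABCD'))

# Label-based: both '20170104' and 'C' are included in the result
by_label = df.loc['20170102':'20170104', 'B':'C']

# Position-based: rows 1, 2, 3 and columns 1, 2 (the end positions are excluded)
by_position = df.iloc[1:4, 1:3]

# An end position past the last row is clipped rather than raising an error
overshoot = df.iloc[3:10, 1:3]

print(by_label.equals(by_position))
print(overshoot.shape)
```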


### Boolean Indexing

In [50]:
df.A

2017-01-01   -0.577210
2017-01-02   -0.652372
2017-01-03   -0.899049
2017-01-04   -0.114459
2017-01-05    1.043032
2017-01-06   -1.022603
Freq: D, Name: A, dtype: float64

In [51]:
df[df.A < 0] # Basically a 'where' operation

Unnamed: 0,A,B,C,D
2017-01-01,-0.57721,-0.032442,0.838743,-0.404661
2017-01-02,-0.652372,0.601515,-0.814833,-1.52699
2017-01-03,-0.899049,0.054533,0.226235,-1.118606
2017-01-04,-0.114459,-0.364977,0.126552,0.55304
2017-01-06,-1.022603,-0.439057,-0.915876,0.548895
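
Boolean masks can be combined. This is a minimal sketch on a hand-made frame (not the random `df` above); note that pandas uses `&`, `|` and `~` rather than `and`, `or` and `not`, and that each condition needs its own parentheses.

```python
import pandas as pd

df = pd.DataFrame({'A': [-1.0, 2.0, -3.0, 4.0],
                   'B': [5.0, -6.0, -7.0, 8.0]})

# Rows where both columns are negative
both_negative = df[(df.A < 0) & (df.B < 0)]

# Rows where at least one column is negative
either_negative = df[(df.A < 0) | (df.B < 0)]

# Rows where A is not negative
a_not_negative = df[~(df.A < 0)]

print(both_negative)
print(either_negative)
print(a_not_negative)
```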


### Setting

In [52]:
df_posA = df.copy() # Work on a copy so the assignment below leaves df unchanged

df_posA[df_posA.A < 0] = -1*df_posA

In [53]:
df_posA

Unnamed: 0,A,B,C,D
2017-01-01,0.57721,0.032442,-0.838743,0.404661
2017-01-02,0.652372,-0.601515,0.814833,1.52699
2017-01-03,0.899049,-0.054533,-0.226235,1.118606
2017-01-04,0.114459,0.364977,-0.126552,-0.55304
2017-01-05,1.043032,-0.100636,0.934057,0.272236
2017-01-06,1.022603,0.439057,0.915876,-0.548895


In [54]:
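# Setting a new column aligns the data by index.
# Note: this Series is indexed from 1717-01-02, not 2017, so none of its labels
# match df's index and the new column F comes out as all NaN (see below).
s1 = pd.Series([1,2,3,4,5,6],index=pd.date_range('17170102',periods=6))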

In [55]:
s1

1717-01-02 00:00:00    1
1717-01-03 00:00:00    2
1717-01-04 00:00:00    3
1717-01-05 00:00:00    4
1717-01-06 00:00:00    5
1717-01-07 00:00:00    6
Freq: D, dtype: int64

In [56]:
df['F'] = s1

In [57]:
df

Unnamed: 0,A,B,C,D,F
2017-01-01,-0.57721,-0.032442,0.838743,-0.404661,
2017-01-02,-0.652372,0.601515,-0.814833,-1.52699,
2017-01-03,-0.899049,0.054533,0.226235,-1.118606,
2017-01-04,-0.114459,-0.364977,0.126552,0.55304,
2017-01-05,1.043032,-0.100636,0.934057,0.272236,
2017-01-06,-1.022603,-0.439057,-0.915876,0.548895,
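
The empty column F follows from index alignment: values are matched by label, not by position. The sketch below contrasts a Series whose dates partly overlap the frame's index with one whose dates do not overlap at all. The names `s_overlap`, `s_disjoint` and the column `G` are illustrative only.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20170101', periods=6)
df = pd.DataFrame(np.zeros((6, 2)), index=dates, columns=list('AB'))

# Starts one day after df's index, so five of the six labels match
s_overlap = pd.Series([1, 2, 3, 4, 5, 6],
                      index=pd.date_range('2017-01-02', periods=6))
df['F'] = s_overlap   # 2017-01-01 gets NaN; the value for 2017-01-07 is dropped

# Dated three centuries earlier, so no label matches
s_disjoint = pd.Series([1, 2, 3, 4, 5, 6],
                       index=pd.date_range('1717-01-02', periods=6))
df['G'] = s_disjoint  # every row gets NaN

print(df)
```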


### Missing Data

In [58]:
# Add a column with missing data
df1 = df.reindex(index=dates[0:4],columns=list(df.columns) + ['E'])

In [59]:
df1.loc[dates[0]:dates[1],'E'] = 1

In [60]:
df1

Unnamed: 0,A,B,C,D,F,E
2017-01-01,-0.57721,-0.032442,0.838743,-0.404661,,1.0
2017-01-02,-0.652372,0.601515,-0.814833,-1.52699,,1.0
2017-01-03,-0.899049,0.054533,0.226235,-1.118606,,
2017-01-04,-0.114459,-0.364977,0.126552,0.55304,,


In [61]:
# find where values are null
pd.isnull(df1).sum()

A    0
B    0
C    0
D    0
F    4
E    2
dtype: int64
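
Once missing values are located, the two most common follow-ups are dropping them or filling them in. This is a minimal sketch on a small hand-made frame; the fill value 5 is arbitrary.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'A': [0.1, 0.2, 0.3, 0.4],
                    'E': [1.0, 1.0, np.nan, np.nan]})

# Keep only the rows that have no missing values
dropped = df1.dropna(how='any')

# Replace every missing value with a constant
filled = df1.fillna(value=5)

print(dropped)
print(filled)
```

Both calls return new frames and leave `df1` unchanged.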

### Operations

In [62]:
df.describe()

Unnamed: 0,A,B,C,D,F
count,6.0,6.0,6.0,6.0,0.0
mean,-0.370444,-0.046844,0.065813,-0.279348,
std,0.760317,0.371157,0.78993,0.889918,
min,-1.022603,-0.439057,-0.915876,-1.52699,
25%,-0.83738,-0.298892,-0.579487,-0.94012,
50%,-0.614791,-0.066539,0.176394,-0.066212,
75%,-0.230147,0.032789,0.685616,0.47973,
max,1.043032,0.601515,0.934057,0.55304,


In [63]:
df.min(), df.min(axis=1) # Minimum of each column, then minimum of each row

(A   -1.022603
 B   -0.439057
 C   -0.915876
 D   -1.526990
 F         NaN
 dtype: float64, 2017-01-01   -0.577210
 2017-01-02   -1.526990
 2017-01-03   -1.118606
 2017-01-04   -0.364977
 2017-01-05   -0.100636
 2017-01-06   -1.022603
 Freq: D, dtype: float64)

### Applying functions

In [64]:
df

Unnamed: 0,A,B,C,D,F
2017-01-01,-0.57721,-0.032442,0.838743,-0.404661,
2017-01-02,-0.652372,0.601515,-0.814833,-1.52699,
2017-01-03,-0.899049,0.054533,0.226235,-1.118606,
2017-01-04,-0.114459,-0.364977,0.126552,0.55304,
2017-01-05,1.043032,-0.100636,0.934057,0.272236,
2017-01-06,-1.022603,-0.439057,-0.915876,0.548895,


In [65]:
df.apply(np.cumsum)

Unnamed: 0,A,B,C,D,F
2017-01-01,-0.57721,-0.032442,0.838743,-0.404661,
2017-01-02,-1.229583,0.569073,0.02391,-1.93165,
2017-01-03,-2.128632,0.623605,0.250145,-3.050256,
2017-01-04,-2.243091,0.258628,0.376697,-2.497217,
2017-01-05,-1.200059,0.157992,1.310754,-2.22498,
2017-01-06,-2.222662,-0.281065,0.394877,-1.676086,


In [66]:
df.apply(lambda x: x.max() - x.min())

A    2.065635
B    1.040571
C    1.849933
D    2.080030
F         NaN
dtype: float64

In [67]:
def my_f(x):
    return x.max() - x.min()

df.apply(my_f)

A    2.065635
B    1.040571
C    1.849933
D    2.080030
F         NaN
dtype: float64
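
By default `apply` passes each column to the function. Passing `axis=1` applies it to each row instead. The sketch below uses small integer data so the results are easy to check by hand; `value_range` is a hypothetical name for the same max-minus-min function.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 5], 'B': [4, 2], 'C': [9, 3]})

def value_range(x):
    # Spread between the largest and smallest value in x
    return x.max() - x.min()

# One result per column (the default, axis=0)
per_column = df.apply(value_range)

# One result per row
per_row = df.apply(value_range, axis=1)

print(per_column)
print(per_row)
```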

In [68]:
# Built in string methods
s = pd.Series(['A', 'B', 'C', np.nan, 'CABA', 'dog', 'cat'])
s.str.lower()

0       a
1       b
2       c
3     NaN
4    caba
5     dog
6     cat
dtype: object

### Merge

In [69]:
np.random.randn(10,4)

array([[-0.36697239,  0.02370905,  0.31413366, -0.2926697 ],
       [-0.24751901, -0.92611002,  2.17438632,  1.06248401],
       [ 0.08475146, -0.16042049,  0.22905759, -0.39120867],
       [ 0.90831162, -1.40656349,  0.58236084,  0.7241905 ],
       [ 0.29988706,  0.65941356, -1.59458218,  0.31569369],
       [-0.29465763,  2.51498932, -1.76278393, -1.5809577 ],
       [ 0.78466561, -0.2661952 ,  0.16309843, -0.83288942],
       [ 0.66638713, -0.41471795, -1.95021694, -1.63594138],
       [ 1.06308337, -0.4820737 ,  1.90729906,  2.74274495],
       [ 0.28673394,  0.68194776,  1.52085292,  1.02975135]])

In [70]:
#Concatenating pandas objects together
df = pd.DataFrame(np.random.randn(10,4))
df

Unnamed: 0,0,1,2,3
0,-0.278333,0.195459,-0.401318,-1.610336
1,-1.789226,1.273203,2.477695,0.741275
2,-1.238613,0.114002,-0.728302,0.832313
3,0.314552,-0.058522,1.329424,0.911328
4,-0.51601,0.109561,0.554909,-0.89691
5,-0.072406,-0.321589,1.038541,-0.351587
6,0.085116,-0.722688,0.189906,-1.028109
7,0.456935,0.955018,0.036407,-0.335046
8,-0.838093,0.035093,-0.029559,0.17864
9,1.342146,-0.506482,-0.01334,-0.917982


In [71]:
# Break it into pieces
pieces = [df[:3], df[3:7],df[7:]]
pieces

[          0         1         2         3
 0 -0.278333  0.195459 -0.401318 -1.610336
 1 -1.789226  1.273203  2.477695  0.741275
 2 -1.238613  0.114002 -0.728302  0.832313,
           0         1         2         3
 3  0.314552 -0.058522  1.329424  0.911328
 4 -0.516010  0.109561  0.554909 -0.896910
 5 -0.072406 -0.321589  1.038541 -0.351587
 6  0.085116 -0.722688  0.189906 -1.028109,
           0         1         2         3
 7  0.456935  0.955018  0.036407 -0.335046
 8 -0.838093  0.035093 -0.029559  0.178640
 9  1.342146 -0.506482 -0.013340 -0.917982]

In [72]:
pd.concat(pieces)

Unnamed: 0,0,1,2,3
0,-0.278333,0.195459,-0.401318,-1.610336
1,-1.789226,1.273203,2.477695,0.741275
2,-1.238613,0.114002,-0.728302,0.832313
3,0.314552,-0.058522,1.329424,0.911328
4,-0.51601,0.109561,0.554909,-0.89691
5,-0.072406,-0.321589,1.038541,-0.351587
6,0.085116,-0.722688,0.189906,-1.028109
7,0.456935,0.955018,0.036407,-0.335046
8,-0.838093,0.035093,-0.029559,0.17864
9,1.342146,-0.506482,-0.01334,-0.917982


In [73]:
# Related tools: DataFrame.join and pd.merge (DataFrame.append was removed in pandas 2.0)
df

Unnamed: 0,0,1,2,3
0,-0.278333,0.195459,-0.401318,-1.610336
1,-1.789226,1.273203,2.477695,0.741275
2,-1.238613,0.114002,-0.728302,0.832313
3,0.314552,-0.058522,1.329424,0.911328
4,-0.51601,0.109561,0.554909,-0.89691
5,-0.072406,-0.321589,1.038541,-0.351587
6,0.085116,-0.722688,0.189906,-1.028109
7,0.456935,0.955018,0.036407,-0.335046
8,-0.838093,0.035093,-0.029559,0.17864
9,1.342146,-0.506482,-0.01334,-0.917982


### Grouping


In [74]:
df = pd.DataFrame({'Origin' : ['foo', 'bar', 'foo', 'bar',
                       'foo', 'bar', 'foo', 'foo'],
                       'Destination' : ['one', 'one', 'two', 'three',
                             'two', 'two', 'one', 'three'],
                       'C' : np.random.randn(8),
                       'D' : np.random.randn(8)})

In [75]:
df

Unnamed: 0,C,D,Destination,Origin
0,-1.178256,-0.215448,one,foo
1,0.825166,0.493228,one,bar
2,-0.032167,-1.206044,two,foo
3,0.145815,-1.868281,three,bar
4,0.78333,0.941726,two,foo
5,-1.973103,0.562442,two,bar
6,0.568068,-0.025025,one,foo
7,-0.476725,0.311091,three,foo


In [76]:
df.groupby(['Origin','Destination']).sum()

Origin,Destination,C,D
bar,one,0.825166,0.493228
bar,three,0.145815,-1.868281
bar,two,-1.973103,0.562442
foo,one,-0.610189,-0.240474
foo,three,-0.476725,0.311091
foo,two,0.751163,-0.264317


### Reshaping

In [77]:
# You can also stack or unstack levels

In [78]:
a = df.groupby(['Origin','Destination']).sum()

In [79]:
a

Origin,Destination,C,D
bar,one,0.825166,0.493228
bar,three,0.145815,-1.868281
bar,two,-1.973103,0.562442
foo,one,-0.610189,-0.240474
foo,three,-0.476725,0.311091
foo,two,0.751163,-0.264317
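
To make the stack/unstack remark concrete, here is a sketch on a smaller frame with fixed values (an assumption, so the result is reproducible). `unstack` moves the innermost index level into the columns, and `stack` moves it back.

```python
import pandas as pd

df = pd.DataFrame({'Origin': ['foo', 'bar', 'foo', 'bar'],
                   'Destination': ['one', 'one', 'two', 'two'],
                   'C': [1.0, 2.0, 3.0, 4.0]})

grouped = df.groupby(['Origin', 'Destination']).sum()

# Destination moves from the row index into the columns: one row per Origin
wide = grouped.unstack()

# Stacking moves Destination back into the row index
long_again = wide.stack()

print(wide)
print(long_again)
```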


In [80]:
# Pivot tables (the default aggregation is the mean, so foo's values differ from the sums above)
pd.pivot_table(df,values=['C','D'],index=['Origin'],columns=['Destination'])

,C,C,C,D,D,D
Destination,one,three,two,one,three,two
Origin,,,,,,
bar,0.825166,0.145815,-1.973103,0.493228,-1.868281,0.562442
foo,-0.305094,-0.476725,0.375582,-0.120237,0.311091,-0.132159


### Time Series


In [81]:
import pandas as pd
import numpy as np

In [82]:
# 1000 seconds starting on January 1st
rng = pd.date_range('1/1/2017', periods=1000, freq='S')

In [83]:
# Give each second a random value
ts = pd.Series(np.random.randint(0, 500, len(rng)), index=rng)

In [84]:
ts

2017-01-01 00:00:00     24
2017-01-01 00:00:01    105
2017-01-01 00:00:02    142
2017-01-01 00:00:03    440
2017-01-01 00:00:04    169
2017-01-01 00:00:05    419
2017-01-01 00:00:06     76
2017-01-01 00:00:07    226
2017-01-01 00:00:08    334
2017-01-01 00:00:09    339
2017-01-01 00:00:10    322
2017-01-01 00:00:11    263
2017-01-01 00:00:12    274
2017-01-01 00:00:13    259
2017-01-01 00:00:14    250
2017-01-01 00:00:15    449
2017-01-01 00:00:16    480
2017-01-01 00:00:17      7
2017-01-01 00:00:18     39
2017-01-01 00:00:19    298
2017-01-01 00:00:20    418
2017-01-01 00:00:21     85
2017-01-01 00:00:22    112
2017-01-01 00:00:23    380
2017-01-01 00:00:24    109
2017-01-01 00:00:25     11
2017-01-01 00:00:26    472
2017-01-01 00:00:27    411
2017-01-01 00:00:28     79
2017-01-01 00:00:29     76
                      ... 
2017-01-01 00:16:10    263
2017-01-01 00:16:11    315
2017-01-01 00:16:12     45
2017-01-01 00:16:13    369
2017-01-01 00:16:14    453
2017-01-01 00:16:15    236
                      ... 

In [85]:
# Built in resampling
ts.resample('30S').mean() # Resample from 1-second data to 30-second bins, averaging each bin

2017-01-01 00:00:00    235.600000
2017-01-01 00:00:30    288.433333
2017-01-01 00:01:00    216.200000
2017-01-01 00:01:30    287.100000
2017-01-01 00:02:00    227.900000
2017-01-01 00:02:30    242.000000
2017-01-01 00:03:00    198.533333
2017-01-01 00:03:30    195.766667
2017-01-01 00:04:00    263.100000
2017-01-01 00:04:30    284.966667
2017-01-01 00:05:00    251.033333
2017-01-01 00:05:30    233.900000
2017-01-01 00:06:00    277.033333
2017-01-01 00:06:30    227.600000
2017-01-01 00:07:00    272.933333
2017-01-01 00:07:30    215.766667
2017-01-01 00:08:00    226.100000
2017-01-01 00:08:30    236.333333
2017-01-01 00:09:00    224.133333
2017-01-01 00:09:30    249.733333
2017-01-01 00:10:00    258.800000
2017-01-01 00:10:30    211.133333
2017-01-01 00:11:00    225.600000
2017-01-01 00:11:30    224.866667
2017-01-01 00:12:00    266.966667
2017-01-01 00:12:30    225.833333
2017-01-01 00:13:00    203.766667
2017-01-01 00:13:30    278.900000
2017-01-01 00:14:00    219.600000
                      ... 
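
The aggregation after `resample` can be any reduction, not only the mean. The sketch below uses a deterministic series (0, 1, 2, ...) instead of random values so the bins can be checked by hand. It writes the frequency as lowercase `'s'`, which I believe newer pandas releases prefer over `'S'`.

```python
import numpy as np
import pandas as pd

# Two minutes of one-second data with values 0..119
rng = pd.date_range('1/1/2017', periods=120, freq='s')
ts = pd.Series(np.arange(120), index=rng)

# Four 30-second bins, each reduced to its mean
half_minute_mean = ts.resample('30s').mean()

# Two 1-minute bins, each reduced to its sum
minute_sum = ts.resample('1min').sum()

print(half_minute_mean)
print(minute_sum)
```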

In [86]:
# Many additional time series features:
# type "ts." and press Tab to list the available methods

### Plotting


In [87]:
ts.iplot()
# ts.plot()

In [None]:
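def randwalk(startdate, points):
    # Draw one standard-normal step per day, starting at startdate
    ts = pd.Series(np.random.randn(points), index=pd.date_range(startdate, periods=points))
    # The running total of the steps is the random walk
    ts = ts.cumsum()
    ts.iplot()
    return ts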

In [None]:
# Using pandas to make a simple random walker by repeatedly running:
a=randwalk('1/1/2012',1000)

### I/O
I/O is straightforward with, for example, pd.read_csv or df.to_csv
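
As a minimal sketch of a CSV round trip, the example below writes a small frame with `to_csv` and reads it back with `read_csv`. It uses an in-memory buffer in place of a file on disk, which is an assumption made to keep the example self-contained; a file path works the same way.

```python
import io
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': ['x', 'y']})

# Write the frame as CSV text; index=False leaves out the row labels
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Rewind the buffer and parse the CSV text back into a DataFrame
buffer.seek(0)
restored = pd.read_csv(buffer)

print(restored)
```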

#### The benefits of open source:

Because the code is open source, we can look under the hood, for example at the source of the plt plotting modules.

# Next Steps

**Recommended Resources**

Name | Description
--- | ---
[Official Pandas Tutorials](http://pandas.pydata.org/pandas-docs/stable/10min.html) | Wes & Company's selection of tutorials and lectures
[Julia Evans Pandas Cookbook](https://github.com/jvns/pandas-cookbook) | Great resource with examples from weather, bikes and 311 calls
[Learn Pandas Tutorials](https://bitbucket.org/hrojas/learn-pandas) | A great series of Pandas tutorials from Dave Rojas
[Research Computing Python Data PYNBs](https://github.com/ResearchComputing/Meetup-Fall-2013/tree/master/python) | A super awesome set of python notebooks from a meetup-based course exclusively devoted to pandas