# Part 2 - Other basics and Pandas

#### Loops
Loops are used when you want to repeat an action. In Python there are 2 types of loops: for and while.<br>
**For loops** are used when you have a block of code which you want to repeat a fixed number of times.<br> 
**While loops** are used for repeating sections of code - but unlike a for loop, the while loop will not run n times, but until a defined condition is no longer met. If the condition is initially false, the loop body will not be executed at all.

In [None]:
# The general format for a for loop in Python:
for item in object:
    statements to do stuff

The variable name used for the item is completely up to the coder, so use your best judgment to choose a name that makes sense and that you will be able to understand when revisiting your code. This item name can then be referenced inside your loop, for example if you wanted to use if statements to perform checks.
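As a small illustration of referencing the item name inside the loop body, here is a minimal sketch. The list `temperatures` and the threshold of 20 are made up for this example and are not part of the original notebook.

```python
# Hypothetical list of temperatures, used only for illustration
temperatures = [12, 25, 31, 18]

hot_days = []
for temperature in temperatures:
    # The loop variable can be used in any statement inside the loop body
    if temperature > 20:
        hot_days.append(temperature)

print(hot_days)
```

A descriptive name such as `temperature` makes the check inside the loop easy to read.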

In [1]:
my_list = [1,2,3]

In [2]:
# We can use a for loop to iterate through a list
for number in my_list:
    print(number) 

1
2
3


In [4]:
# We decided to name our variable "number" as it makes sense. We can pick any other name and it won't make a difference
for jelly in my_list:
    print(jelly)

1
2
3


It is better to choose a name that is meaningful in context and helps other programmers understand your code if they were to read it.

In [None]:
# The general format for a while loop in Python:
while test:
    code statement
    
# It is also possible to have the following format
while test:
    code statement
else:
    final code statements

A while loop will run as long as the condition is met. **IT IS VERY IMPORTANT THAT YOU USE AN EXIT CONDITION IN A WHILE LOOP. OTHERWISE, YOUR LOOP WILL RUN AN INFINITE NUMBER OF TIMES (you will get an infinite loop).**
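One way to protect yourself against an accidental infinite loop is to combine the normal exit condition with a safety limit. The sketch below is only an illustration: the starting value and the `max_steps` limit are assumptions chosen for the example.

```python
# Hypothetical example: halve a number until it drops below 1
value = 100.0
steps = 0
max_steps = 1000  # safety limit (an assumption, pick what suits your problem)

while value >= 1:
    value = value / 2   # this update is what eventually makes the condition False
    steps += 1
    if steps >= max_steps:
        break           # leave the loop even if the condition is still True

print(steps, value)
```

If the line that updates `value` were missing, the condition would never change, and only the safety limit would stop the loop.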

In [6]:
x = 0

while x < 10:
    print('x is currently: ',x)
    print(' x is still less than 10, adding 1 to x')
    x = x + 1

x is currently:  0
 x is still less than 10, adding 1 to x
x is currently:  1
 x is still less than 10, adding 1 to x
x is currently:  2
 x is still less than 10, adding 1 to x
x is currently:  3
 x is still less than 10, adding 1 to x
x is currently:  4
 x is still less than 10, adding 1 to x
x is currently:  5
 x is still less than 10, adding 1 to x
x is currently:  6
 x is still less than 10, adding 1 to x
x is currently:  7
 x is still less than 10, adding 1 to x
x is currently:  8
 x is still less than 10, adding 1 to x
x is currently:  9
 x is still less than 10, adding 1 to x


In [8]:
# Alternatively, you could have:
x = 0

while x < 10:
    print('x is currently: ',x)
    print(' x is still less than 10, adding 1 to x')
    x+=1
    
else:
    print('All Done!')
# In this case, when x reaches 10, it triggers the exit condition (since x = 10 and not x < 10)
# Which means that the condition for the while loop is not True anymore, so the else statement gets executed

x is currently:  0
 x is still less than 10, adding 1 to x
x is currently:  1
 x is still less than 10, adding 1 to x
x is currently:  2
 x is still less than 10, adding 1 to x
x is currently:  3
 x is still less than 10, adding 1 to x
x is currently:  4
 x is still less than 10, adding 1 to x
x is currently:  5
 x is still less than 10, adding 1 to x
x is currently:  6
 x is still less than 10, adding 1 to x
x is currently:  7
 x is still less than 10, adding 1 to x
x is currently:  8
 x is still less than 10, adding 1 to x
x is currently:  9
 x is still less than 10, adding 1 to x
All Done!
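One detail worth knowing: the else block of a while loop runs only when the loop ends because its condition became False. If the loop is left with `break`, the else block is skipped. A minimal sketch, using a made-up list of numbers:

```python
numbers = [3, 7, 11, 8, 5]   # hypothetical data, used only for illustration
index = 0
found_even = False

while index < len(numbers):
    if numbers[index] % 2 == 0:
        found_even = True
        break            # leaving the loop with break skips the else block
    index += 1
else:
    # Runs only if the loop finished without hitting break
    print('No even number found')

print(found_even, index)
```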


Most of the time, you will be using for loops to do iterations. While loops are useful in other cases, for example when you want to keep asking for the user's input until you get the desired input. In the example below, we keep asking the user to enter hello. The process is repeated until the user inputs hello.

In [9]:
n = input("Please enter 'hello':")
while n != 'hello':
    n = input("Please enter 'hello':")

Please enter 'hello':q
Please enter 'hello':e
Please enter 'hello':w
Please enter 'hello':2
Please enter 'hello':hello


#### Conditionals
The code inside a conditional statement is executed only when its condition is True. It is easier to understand using an example.

In [1]:
name = 'Mike'

if name == 'Mike':
    print('Hello Mike')

Hello Mike


In the example above, we created a simple if statement which prints 'Hello Mike' if the name equals 'Mike'. Note that the if statement will not print anything if we change the name to anything else, including lower case 'mike'.

In [2]:
# Prints nothing
name = 'Bob'

if name == 'Mike':
    print('Hello Mike')

In [3]:
# Also prints nothing
name = 'mike'

if name == 'Mike':
    print('Hello Mike')

In [None]:
# If we want to have more than one condition in the same if statement, we can do that by using elif (else if) and else keywords.
# The general format is:
if condition1 == True:
    execute code1
elif condition2 == True:
    execute code2
else:   # Notice there's no condition here because this statement is executed when no condition is met
    execute code3

In [4]:
# Using our previous example:
name = 'Bob'

if name == 'Mike':
    print('Hello Mike')
elif name == 'Bob':
    print('Hello Bob')
elif name == 'John':
    print('Hello John')
else:   
    print('Hello stranger')

Hello Bob


In [5]:
name = 'Mike'

if name == 'Mike':
    print('Hello Mike')
elif name == 'Bob':
    print('Hello Bob')
elif name == 'John':
    print('Hello John')
else:   
    print('Hello stranger')

Hello Mike


In [6]:
name = 'John'

if name == 'Mike':
    print('Hello Mike')
elif name == 'Bob':
    print('Hello Bob')
elif name == 'John':
    print('Hello John')
else:   
    print('Hello stranger')

Hello John


In [7]:
name = 'Chris'

if name == 'Mike':
    print('Hello Mike')
elif name == 'Bob':
    print('Hello Bob')
elif name == 'John':
    print('Hello John')
else:   
    print('Hello stranger')

Hello stranger


There is a difference between using one if statement with elif branches and using multiple separate if statements. In an if/elif chain, Python stops checking as soon as one condition is met, so at most one branch is executed. With multiple if statements, each condition is checked independently of the others.

In [8]:
# In this case, else statement is not executed because it is a part of a larger if statement. 
# When Python checks the conditions, it exits the statement as soon as the condition is met.
x = 4

if x == 1: # False, so continue
    print("One")
elif x == 2: # False, so continue
    print("Two")
elif x == 3: # False, so continue
    print("Three")
elif x == 4: # True, exit the statement
    print("Four")
elif x == 5: # Never checked because terminated earlier
    print("Five")
elif x == 6: # Never checked because terminated earlier
    print("Six")
else:        # Never checked because terminated earlier
    print("Else")

Four


In [9]:
# The statement above is different from the one below.
# Here, each if statement is checked separately to see if it's True. "Else" is printed because the else branch
# belongs only to the last if (x == 6). Because x is not equal to 6, the else branch is executed

x = 4

if x == 1:
    print("One")
if x == 2:
    print("Two")
if x == 3:
    print("Three")
if x == 4:
    print("Four")
if x == 5:
    print("Five")
if x == 6:
    print("Six")
else:
    print("Else")

Four
Else


In [10]:
# Perhaps a better example to illustrate how if statements work is:

if x > 1: # True, exit the statement
    print("One")
elif x > 2: # Never checked because terminated earlier
    print("Two")
elif x > 3: # Never checked because terminated earlier
    print("Three")
elif x > 4: # Never checked because terminated earlier
    print("Four")
elif x > 5: # Never checked because terminated earlier
    print("Five")
elif x > 6: # Never checked because terminated earlier
    print("Six")
else:        # Never checked because terminated earlier
    print("Else")

One


In [11]:
x = 4

# Each of these statements is checked against its condition because these are all separate if statements

if x > 1: 
    print("One")
if x > 2:
    print("Two")
if x > 3:
    print("Three")
if x > 4:
    print("Four")
if x > 5:
    print("Five")
if x > 6:
    print("Six")
else:
    print("Else")

One
Two
Three
Else


In [13]:
# You can also combine multiple conditions and if statements
x = 3

if x > 1 and x < 4:
    print('x is within 1 and 4')
else:
    print('x is not within 1 and 4')

x is within 1 and 4


In [14]:
x = 5

if x > 1 and x < 4:
    print('x is within 1 and 4')
else:
    print('x is not within 1 and 4')

x is not within 1 and 4
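Python also lets you write a range check like this as a chained comparison: `1 < x < 4` means the same as `x > 1 and x < 4`. A minimal sketch that reuses the two values of x from the cells above (the `results` list is added only for illustration):

```python
results = []

for x in [3, 5]:
    # 1 < x < 4 is equivalent to (x > 1 and x < 4)
    if 1 < x < 4:
        results.append('x is within 1 and 4')
    else:
        results.append('x is not within 1 and 4')

print(results)
```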


In [18]:
# if statements can also be nested inside each other
x = 4
y = 3
if x > 1:
    if y < 4:
        print('x is greater than 1 and y is less than 4')

x is greater than 1 and y is less than 4


In [20]:
# Here's how you can combine loops and if statements:
# Print only even numbers

for number in range(0,21):
    if number % 2 == 0:
        print(number)

0
2
4
6
8
10
12
14
16
18
20


# Pandas
Pandas is an open source library that provides easy-to-use data structures and data analysis tools for Python. It allows you to load data from different sources into Python and then use Python code to analyze that data. Pandas is not part of the Python standard library, so if your installation does not already include it, install it by running pip install pandas in your command prompt.

Probably the most widely used data structure in pandas is the data frame (DataFrame). You can think of a data frame as an Excel table.

In [1]:
# First, import pandas
import pandas as pd

In [22]:
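# Note: without parentheses, this assigns the DataFrame class itself to df, not an actual data frame
df = pd.DataFrame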

In [24]:
df

pandas.core.frame.DataFrame

In [72]:
# Perhaps the easiest way to instantiate a data frame (create a data frame with some values) is by using a dictionary.
# And then convert the dictionary to a data frame
d = {'col1': [1, 2], 'col2': [3, 4]}
df = pd.DataFrame(d)
df

Unnamed: 0,col1,col2
0,1,3
1,2,4


Functions from pandas_datareader.data and pandas_datareader.wb extract data from various Internet sources into a pandas DataFrame. Currently the following sources are supported:

* Yahoo! Finance
* Google Finance
* Enigma
* St.Louis FED (FRED)
* Kenneth French’s data library
* World Bank
* OECD
* Eurostat
* Thrift Savings Plan
* Oanda currency historical rate
* Nasdaq Trader symbol definitions (remote_data.nasdaq_symbols)

Note that different sources support different kinds of data, so not all sources implement the same methods, and the data elements returned might also differ.

In [2]:
# Pandas data reader allows us to download data from the web.
import pandas_datareader as web

# Datetime is a Python library used to work with date and time
import datetime as dt

In [4]:
# Format is Year, Month, Day

start = dt.datetime(2016,1,1) # Jan 1, 2016

end = dt.datetime(2017,1,1) # Jan 1, 2017

In [5]:
# Let's download stock data for Facebook from yahoo


facebook = web.DataReader("FB", 'yahoo', start, end)

# The general format is:
# web.DataReader('ticker', 'source', start date, end date)

In [11]:
facebook

Date,Open,High,Low,Close,Adj Close,Volume
2016-01-04,101.949997,102.239998,99.750000,102.220001,102.220001,37912400
2016-01-05,102.889999,103.709999,101.669998,102.730003,102.730003,23258200
2016-01-06,101.129997,103.769997,100.900002,102.970001,102.970001,25096200
2016-01-07,100.500000,101.430000,97.300003,97.919998,97.919998,45172900
2016-01-08,99.879997,100.500000,97.029999,97.330002,97.330002,35402300
2016-01-11,97.910004,98.599998,95.389999,97.510002,97.510002,29873100
2016-01-12,99.000000,99.959999,97.550003,99.370003,99.370003,28395400
2016-01-13,100.580002,100.580002,95.209999,95.440002,95.440002,33410600
2016-01-14,95.849998,98.870003,92.449997,98.370003,98.370003,48658600
2016-01-15,93.980003,96.379997,93.540001,94.970001,94.970001,46132800


In [7]:
facebook.head() # Shows the first n rows. Default is 5

Date,Open,High,Low,Close,Adj Close,Volume
2016-01-04,101.949997,102.239998,99.75,102.220001,102.220001,37912400
2016-01-05,102.889999,103.709999,101.669998,102.730003,102.730003,23258200
2016-01-06,101.129997,103.769997,100.900002,102.970001,102.970001,25096200
2016-01-07,100.5,101.43,97.300003,97.919998,97.919998,45172900
2016-01-08,99.879997,100.5,97.029999,97.330002,97.330002,35402300


In [8]:
facebook.tail() # Shows the last n rows. Default is 5

Date,Open,High,Low,Close,Adj Close,Volume
2016-12-23,117.0,117.559998,116.300003,117.269997,117.269997,10877300
2016-12-27,116.959999,118.68,116.860001,118.010002,118.010002,12051500
2016-12-28,118.190002,118.25,116.650002,116.919998,116.919998,12087400
2016-12-29,117.0,117.529999,116.059998,116.349998,116.349998,9921400
2016-12-30,116.599998,116.830002,114.769997,115.050003,115.050003,18684100


#### Writing to files

In [14]:
# We can also save facebook dataframe as a file

# NOTE: by default, your files will be saved in the working directory, e.g. the default Anaconda directory located at "C:\Users\YOURUSERNAME\Anaconda3"
# NOTE: you can change where the files are saved by providing the full path
# For example, if you want to save to your desktop, use r"C:\Users\YOURUSERNAME\Desktop\NAME_OF_THE_FILE.txt"
# (write it as a raw string, r"...", so that the backslashes are not treated as escape characters)

# As CSV
facebook.to_csv('facebook.csv')

In the example above, we are saving our facebook dataframe as a CSV (Comma Separated Values) file. By default, when saving as a CSV file, values will be separated with commas ',' (obviously), but we can change the separator if we need to. For example, we can use a semi-colon ';' to separate our values. Finally, the to_csv method allows us to save a file with different extensions (like .txt).
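To see what the separator actually changes, here is a minimal sketch that writes to an in-memory buffer instead of a file, so nothing is saved to disk. The small `prices` frame is a hypothetical stand-in for the facebook data.

```python
import io
import pandas as pd

# Small hypothetical frame standing in for the facebook data
prices = pd.DataFrame(
    {'Open': [101.5, 102.0], 'Close': [102.25, 102.75]},
    index=pd.Index(['2016-01-04', '2016-01-05'], name='Date'),
)

buffer = io.StringIO()          # in-memory "file"
prices.to_csv(buffer, sep=';')  # same call as for a real file, with ';' as separator
text = buffer.getvalue()
print(text)

# Reading the text back needs the same separator
restored = pd.read_csv(io.StringIO(text), sep=';', index_col='Date')
print(restored)
```

The file extension does not affect the contents. Only the `sep` argument decides which character separates the values.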

In [15]:
# As text file
facebook.to_csv('facebook.txt')

In [16]:
# As text file separated by ';'
facebook.to_csv('facebook_semi_colon.txt', sep=';')

In [17]:
# We can also save facebook as a csv file using ';' as a separator, but if you try to open it with Excel, you will get one big column
facebook.to_csv('facebook_semi_colon.csv', sep=';')

In [13]:
# As Excel file
facebook.to_excel('facebook.xlsx')

It is also possible to save into other formats like JSON and HTML, but we won't be working with those.<br>
Now let's talk about how we can read files into a dataframe. Let's read our newly created files into new dataframes.

#### Reading from files

In [19]:
# All we have to do is to use pandas to read the data and then assign our results (so we won't lose them) to a variable
df1 = pd.read_csv('facebook.csv')

# Print df1
df1

Unnamed: 0,Date,Open,High,Low,Close,Adj Close,Volume
0,2016-01-04,101.949997,102.239998,99.750000,102.220001,102.220001,37912400
1,2016-01-05,102.889999,103.709999,101.669998,102.730003,102.730003,23258200
2,2016-01-06,101.129997,103.769997,100.900002,102.970001,102.970001,25096200
3,2016-01-07,100.500000,101.430000,97.300003,97.919998,97.919998,45172900
4,2016-01-08,99.879997,100.500000,97.029999,97.330002,97.330002,35402300
5,2016-01-11,97.910004,98.599998,95.389999,97.510002,97.510002,29873100
6,2016-01-12,99.000000,99.959999,97.550003,99.370003,99.370003,28395400
7,2016-01-13,100.580002,100.580002,95.209999,95.440002,95.440002,33410600
8,2016-01-14,95.849998,98.870003,92.449997,98.370003,98.370003,48658600
9,2016-01-15,93.980003,96.379997,93.540001,94.970001,94.970001,46132800


As we can see, our Date column was imported as a regular column and not as the index. We can easily change this in 2 different ways.

In [61]:
# 1) Specify that Date should be used as an index while importing the data
df1 = pd.read_csv('facebook.csv', index_col='Date')

# Print df1
df1

Date,Open,High,Low,Close,Adj Close,Volume
2016-01-04,101.949997,102.239998,99.750000,102.220001,102.220001,37912400
2016-01-05,102.889999,103.709999,101.669998,102.730003,102.730003,23258200
2016-01-06,101.129997,103.769997,100.900002,102.970001,102.970001,25096200
2016-01-07,100.500000,101.430000,97.300003,97.919998,97.919998,45172900
2016-01-08,99.879997,100.500000,97.029999,97.330002,97.330002,35402300
2016-01-11,97.910004,98.599998,95.389999,97.510002,97.510002,29873100
2016-01-12,99.000000,99.959999,97.550003,99.370003,99.370003,28395400
2016-01-13,100.580002,100.580002,95.209999,95.440002,95.440002,33410600
2016-01-14,95.849998,98.870003,92.449997,98.370003,98.370003,48658600
2016-01-15,93.980003,96.379997,93.540001,94.970001,94.970001,46132800


In [62]:
# 2) Import the data as it is, and then set Date as an index
df1 = pd.read_csv('facebook.csv')
df1.set_index('Date', inplace=True)

# Print df1
df1

Date,Open,High,Low,Close,Adj Close,Volume
2016-01-04,101.949997,102.239998,99.750000,102.220001,102.220001,37912400
2016-01-05,102.889999,103.709999,101.669998,102.730003,102.730003,23258200
2016-01-06,101.129997,103.769997,100.900002,102.970001,102.970001,25096200
2016-01-07,100.500000,101.430000,97.300003,97.919998,97.919998,45172900
2016-01-08,99.879997,100.500000,97.029999,97.330002,97.330002,35402300
2016-01-11,97.910004,98.599998,95.389999,97.510002,97.510002,29873100
2016-01-12,99.000000,99.959999,97.550003,99.370003,99.370003,28395400
2016-01-13,100.580002,100.580002,95.209999,95.440002,95.440002,33410600
2016-01-14,95.849998,98.870003,92.449997,98.370003,98.370003,48658600
2016-01-15,93.980003,96.379997,93.540001,94.970001,94.970001,46132800


Look at the example above. While setting the index, we had to pass inplace=True. Most pandas methods return a new object and leave the original data unchanged unless you pass inplace=True or assign the result back to a variable. This is useful because you can play with your data without being afraid of losing it.
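A minimal sketch of this behaviour, using a small hypothetical frame (`df_demo` and its values are made up for the example). Assigning the result back to the variable is an alternative to inplace=True.

```python
import pandas as pd

# Hypothetical frame, used only for illustration
df_demo = pd.DataFrame({'Date': ['2016-01-04', '2016-01-05'],
                        'Close': [102.22, 102.73]})

indexed = df_demo.set_index('Date')        # returns a new frame
columns_before = df_demo.columns.tolist()  # the original still has 'Date' as a column
print(columns_before)

df_demo = df_demo.set_index('Date')        # reassigning keeps the change
columns_after = df_demo.columns.tolist()
print(columns_after)
```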

In [64]:
# Reading from CSV, separated by ';'
df2 = pd.read_csv('facebook_semi_colon.csv', index_col='Date', sep=';')
df2

Date,Open,High,Low,Close,Adj Close,Volume
2016-01-04,101.949997,102.239998,99.750000,102.220001,102.220001,37912400
2016-01-05,102.889999,103.709999,101.669998,102.730003,102.730003,23258200
2016-01-06,101.129997,103.769997,100.900002,102.970001,102.970001,25096200
2016-01-07,100.500000,101.430000,97.300003,97.919998,97.919998,45172900
2016-01-08,99.879997,100.500000,97.029999,97.330002,97.330002,35402300
2016-01-11,97.910004,98.599998,95.389999,97.510002,97.510002,29873100
2016-01-12,99.000000,99.959999,97.550003,99.370003,99.370003,28395400
2016-01-13,100.580002,100.580002,95.209999,95.440002,95.440002,33410600
2016-01-14,95.849998,98.870003,92.449997,98.370003,98.370003,48658600
2016-01-15,93.980003,96.379997,93.540001,94.970001,94.970001,46132800


In [66]:
# Reading from text file (.txt)
df3 = pd.read_csv('facebook.txt',index_col='Date')
df3

Date,Open,High,Low,Close,Adj Close,Volume
2016-01-04,101.949997,102.239998,99.750000,102.220001,102.220001,37912400
2016-01-05,102.889999,103.709999,101.669998,102.730003,102.730003,23258200
2016-01-06,101.129997,103.769997,100.900002,102.970001,102.970001,25096200
2016-01-07,100.500000,101.430000,97.300003,97.919998,97.919998,45172900
2016-01-08,99.879997,100.500000,97.029999,97.330002,97.330002,35402300
2016-01-11,97.910004,98.599998,95.389999,97.510002,97.510002,29873100
2016-01-12,99.000000,99.959999,97.550003,99.370003,99.370003,28395400
2016-01-13,100.580002,100.580002,95.209999,95.440002,95.440002,33410600
2016-01-14,95.849998,98.870003,92.449997,98.370003,98.370003,48658600
2016-01-15,93.980003,96.379997,93.540001,94.970001,94.970001,46132800


In [68]:
# Reading from text file (.txt), separated by ';'
df4 = pd.read_csv('facebook_semi_colon.txt',index_col='Date', sep=';')
df4

Date,Open,High,Low,Close,Adj Close,Volume
2016-01-04,101.949997,102.239998,99.750000,102.220001,102.220001,37912400
2016-01-05,102.889999,103.709999,101.669998,102.730003,102.730003,23258200
2016-01-06,101.129997,103.769997,100.900002,102.970001,102.970001,25096200
2016-01-07,100.500000,101.430000,97.300003,97.919998,97.919998,45172900
2016-01-08,99.879997,100.500000,97.029999,97.330002,97.330002,35402300
2016-01-11,97.910004,98.599998,95.389999,97.510002,97.510002,29873100
2016-01-12,99.000000,99.959999,97.550003,99.370003,99.370003,28395400
2016-01-13,100.580002,100.580002,95.209999,95.440002,95.440002,33410600
2016-01-14,95.849998,98.870003,92.449997,98.370003,98.370003,48658600
2016-01-15,93.980003,96.379997,93.540001,94.970001,94.970001,46132800
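One thing to keep in mind: when a CSV or text file is read with only index_col='Date', the dates in the index are stored as plain text rather than as dates. If you want a DatetimeIndex like the one in the downloaded data, `read_csv` accepts `parse_dates=True`, which asks pandas to try to parse the index as dates. A minimal sketch, where the `csv_text` string is a hypothetical stand-in for the contents of facebook.csv:

```python
import io
import pandas as pd

# Hypothetical CSV text standing in for the contents of facebook.csv
csv_text = "Date,Close\n2016-01-04,102.22\n2016-01-05,102.73\n"

# Index read as plain text
as_text = pd.read_csv(io.StringIO(csv_text), index_col='Date')

# Index parsed as dates
as_dates = pd.read_csv(io.StringIO(csv_text), index_col='Date', parse_dates=True)

print(type(as_text.index))
print(type(as_dates.index))
```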


In [71]:
# Reading from Excel file
df5 = pd.read_excel('facebook.xlsx', index_col='Date')
df5

Date,Open,High,Low,Close,Adj Close,Volume
2016-01-04,101.949997,102.239998,99.750000,102.220001,102.220001,37912400
2016-01-05,102.889999,103.709999,101.669998,102.730003,102.730003,23258200
2016-01-06,101.129997,103.769997,100.900002,102.970001,102.970001,25096200
2016-01-07,100.500000,101.430000,97.300003,97.919998,97.919998,45172900
2016-01-08,99.879997,100.500000,97.029999,97.330002,97.330002,35402300
2016-01-11,97.910004,98.599998,95.389999,97.510002,97.510002,29873100
2016-01-12,99.000000,99.959999,97.550003,99.370003,99.370003,28395400
2016-01-13,100.580002,100.580002,95.209999,95.440002,95.440002,33410600
2016-01-14,95.849998,98.870003,92.449997,98.370003,98.370003,48658600
2016-01-15,93.980003,96.379997,93.540001,94.970001,94.970001,46132800


In [78]:
# It is also possible to read data from the web (as we know)
# Go to http://pythonhow.com/supermarkets.json to see what kind of data it is (it is in JSON format)
df6 = pd.read_json('http://pythonhow.com/supermarkets.json')
df6

Unnamed: 0,Address,City,Country,Employees,ID,Name,State
0,3666 21st St,San Francisco,USA,8,1,Madeira,CA 94114
1,735 Dolores St,San Francisco,USA,15,2,Bready Shop,CA 94119
2,332 Hill St,San Francisco,USA,25,3,Super River,California 94114
3,3995 23rd St,San Francisco,USA,10,4,Ben's Shop,CA 94114
4,1056 Sanchez St,San Francisco,USA,12,5,Sanchez,California
5,551 Alvarado St,San Francisco,USA,20,6,Richvalley,CA 94114


#### Slicing and indexing

In [9]:
# Here we have 6 columns
facebook.columns

Index(['Open', 'High', 'Low', 'Close', 'Adj Close', 'Volume'], dtype='object')

In [11]:
# and an index in DateTime format
facebook.index

DatetimeIndex(['2016-01-04', '2016-01-05', '2016-01-06', '2016-01-07',
               '2016-01-08', '2016-01-11', '2016-01-12', '2016-01-13',
               '2016-01-14', '2016-01-15',
               ...
               '2016-12-16', '2016-12-19', '2016-12-20', '2016-12-21',
               '2016-12-22', '2016-12-23', '2016-12-27', '2016-12-28',
               '2016-12-29', '2016-12-30'],
              dtype='datetime64[ns]', name='Date', length=252, freq=None)

In [75]:
# A quick way to check how many rows and columns we have is by using the shape attribute
facebook.shape  # 252 rows, 6 columns (index and column names are not included)

(252, 6)

In [76]:
# To give you a better example of what I mean, let's use our very first dataframe
df

Unnamed: 0,col1,col2
0,1,3
1,2,4


In [77]:
df.shape # 2 rows, 2 columns (index and column names are not included)

(2, 2)

In [79]:
facebook[:10] # First 10 rows

Date,Open,High,Low,Close,Adj Close,Volume
2016-01-04,101.949997,102.239998,99.75,102.220001,102.220001,37912400
2016-01-05,102.889999,103.709999,101.669998,102.730003,102.730003,23258200
2016-01-06,101.129997,103.769997,100.900002,102.970001,102.970001,25096200
2016-01-07,100.5,101.43,97.300003,97.919998,97.919998,45172900
2016-01-08,99.879997,100.5,97.029999,97.330002,97.330002,35402300
2016-01-11,97.910004,98.599998,95.389999,97.510002,97.510002,29873100
2016-01-12,99.0,99.959999,97.550003,99.370003,99.370003,28395400
2016-01-13,100.580002,100.580002,95.209999,95.440002,95.440002,33410600
2016-01-14,95.849998,98.870003,92.449997,98.370003,98.370003,48658600
2016-01-15,93.980003,96.379997,93.540001,94.970001,94.970001,46132800


In [80]:
facebook['Open'] #only open prices

Date
2016-01-04    101.949997
2016-01-05    102.889999
2016-01-06    101.129997
2016-01-07    100.500000
2016-01-08     99.879997
2016-01-11     97.910004
2016-01-12     99.000000
2016-01-13    100.580002
2016-01-14     95.849998
2016-01-15     93.980003
2016-01-19     96.529999
2016-01-20     92.830002
2016-01-21     94.910004
2016-01-22     96.410004
2016-01-25     98.720001
2016-01-26     97.760002
2016-01-27     97.790001
2016-01-28    107.199997
2016-01-29    108.989998
2016-02-01    112.269997
2016-02-02    114.800003
2016-02-03    115.269997
2016-02-04    111.800003
2016-02-05    109.510002
2016-02-08    100.410004
2016-02-09     97.139999
2016-02-10    101.550003
2016-02-11     99.599998
2016-02-12    103.739998
2016-02-16    103.800003
                 ...    
2016-11-17    116.809998
2016-11-18    118.389999
2016-11-21    118.199997
2016-11-22    122.400002
2016-11-23    121.230003
2016-11-25    121.010002
2016-11-28    120.120003
2016-11-29    120.570000
2016-11-30    120.32

In [93]:
facebook[['Open','Close']] # Putting it in a list to get 2+ columns

Date,Open,Close
2016-01-04,101.949997,102.220001
2016-01-05,102.889999,102.730003
2016-01-06,101.129997,102.970001
2016-01-07,100.500000,97.919998
2016-01-08,99.879997,97.330002
2016-01-11,97.910004,97.510002
2016-01-12,99.000000,99.370003
2016-01-13,100.580002,95.440002
2016-01-14,95.849998,98.370003
2016-01-15,93.980003,94.970001


In [95]:
facebook['2016-01-04':'2016-01-15'] # Selecting between dates (last date included)

Date,Open,High,Low,Close,Adj Close,Volume
2016-01-04,101.949997,102.239998,99.75,102.220001,102.220001,37912400
2016-01-05,102.889999,103.709999,101.669998,102.730003,102.730003,23258200
2016-01-06,101.129997,103.769997,100.900002,102.970001,102.970001,25096200
2016-01-07,100.5,101.43,97.300003,97.919998,97.919998,45172900
2016-01-08,99.879997,100.5,97.029999,97.330002,97.330002,35402300
2016-01-11,97.910004,98.599998,95.389999,97.510002,97.510002,29873100
2016-01-12,99.0,99.959999,97.550003,99.370003,99.370003,28395400
2016-01-13,100.580002,100.580002,95.209999,95.440002,95.440002,33410600
2016-01-14,95.849998,98.870003,92.449997,98.370003,98.370003,48658600
2016-01-15,93.980003,96.379997,93.540001,94.970001,94.970001,46132800


In [100]:
facebook['2016-01-04':'2016-01-15'][['Close','Volume']] # Selecting Close and Volume between two dates

Date,Close,Volume
2016-01-04,102.220001,37912400
2016-01-05,102.730003,23258200
2016-01-06,102.970001,25096200
2016-01-07,97.919998,45172900
2016-01-08,97.330002,35402300
2016-01-11,97.510002,29873100
2016-01-12,99.370003,28395400
2016-01-13,95.440002,33410600
2016-01-14,98.370003,48658600
2016-01-15,94.970001,46132800
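The same kind of selection can also be written in a single step with `.loc`, which takes the rows first and the columns second. The sketch below uses a small hypothetical frame (`prices`, with rounded illustrative values) rather than the downloaded data.

```python
import pandas as pd

# Small hypothetical frame with a DatetimeIndex, used only for illustration
dates = pd.to_datetime(['2016-01-04', '2016-01-05', '2016-01-06', '2016-01-07'])
prices = pd.DataFrame(
    {'Open': [101.95, 102.89, 101.13, 100.50],
     'Close': [102.22, 102.73, 102.97, 97.92],
     'Volume': [37912400, 23258200, 25096200, 45172900]},
    index=dates,
)

# Rows between two dates (last date included) and two columns, in one call
subset = prices.loc['2016-01-04':'2016-01-06', ['Close', 'Volume']]
print(subset)
```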


#### Methods
Pandas has a lot of built-in methods. Let's explore some of them.


In [12]:
facebook['Volume'].sum() # Total volume for the whole period

2115453204

In [14]:
facebook['Close'].mean() # Average closing price for the whole period

117.03587302380944

In [96]:
# Calculating Simple Moving Average for 20 days. Rolling(20) basically specifies rolling period that we are gonna use.
# Then we call mean() method to calculate the average price on the rolling period.
# We create a new column to record all the values
# The first 19 values are NaN (Not a Number) because we can't calculate their MA
facebook['Simple MA 20'] = facebook['Close'].rolling(20).mean() 
facebook['Simple MA 20'] 

Date
2016-01-04           NaN
2016-01-05           NaN
2016-01-06           NaN
2016-01-07           NaN
2016-01-08           NaN
2016-01-11           NaN
2016-01-12           NaN
2016-01-13           NaN
2016-01-14           NaN
2016-01-15           NaN
2016-01-19           NaN
2016-01-20           NaN
2016-01-21           NaN
2016-01-22           NaN
2016-01-25           NaN
2016-01-26           NaN
2016-01-27           NaN
2016-01-28           NaN
2016-01-29           NaN
2016-02-01     99.787501
2016-02-02    100.407001
2016-02-03    100.905001
2016-02-04    101.281000
2016-02-05    101.588501
2016-02-08    101.709500
2016-02-09    101.811000
2016-02-10    101.892500
2016-02-11    102.216000
2016-02-12    102.398000
2016-02-16    102.730000
                 ...    
2016-11-17    124.692499
2016-11-18    123.939999
2016-11-21    123.364499
2016-11-22    122.823499
2016-11-23    122.313499
2016-11-25    121.847999
2016-11-28    121.304000
2016-11-29    120.798000
2016-11-30    120.24
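To make the NaN values at the start easier to see, here is a minimal sketch with a tiny made-up series and a window of 3 instead of 20. With a window of n, the first n - 1 results are NaN because there are not enough earlier values to fill the window.

```python
import pandas as pd

# Hypothetical closing prices, used only for illustration
closes = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0])

# Window of 3: each result is the average of the current value and the 2 before it
moving_average = closes.rolling(3).mean()
print(moving_average)
```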

In [21]:
facebook['Close'].cummax() # Cumulative maximum = for each row, the highest value seen so far (from the first row up to that row)
# There is also cummin (cumulative minimum), cumprod (cumulative product), and
# cumsum (cumulative sum)

Date
2016-01-04    102.220001
2016-01-05    102.730003
2016-01-06    102.970001
2016-01-07    102.970001
2016-01-08    102.970001
2016-01-11    102.970001
2016-01-12    102.970001
2016-01-13    102.970001
2016-01-14    102.970001
2016-01-15    102.970001
2016-01-19    102.970001
2016-01-20    102.970001
2016-01-21    102.970001
2016-01-22    102.970001
2016-01-25    102.970001
2016-01-26    102.970001
2016-01-27    102.970001
2016-01-28    109.110001
2016-01-29    112.209999
2016-02-01    115.089996
2016-02-02    115.089996
2016-02-03    115.089996
2016-02-04    115.089996
2016-02-05    115.089996
2016-02-08    115.089996
2016-02-09    115.089996
2016-02-10    115.089996
2016-02-11    115.089996
2016-02-12    115.089996
2016-02-16    115.089996
                 ...    
2016-11-17    133.279999
2016-11-18    133.279999
2016-11-21    133.279999
2016-11-22    133.279999
2016-11-23    133.279999
2016-11-25    133.279999
2016-11-28    133.279999
2016-11-29    133.279999
2016-11-30    133.27
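The cumulative methods are easier to follow on a tiny made-up series. Each one returns a value for every row, computed from the first row up to that row, while `max()` returns a single number for the whole column. A minimal sketch:

```python
import pandas as pd

# Hypothetical values, used only for illustration
values = pd.Series([3, 1, 4, 1, 5])

running_max = values.cummax().tolist()  # highest value seen so far
running_min = values.cummin().tolist()  # lowest value seen so far
running_sum = values.cumsum().tolist()  # running total
overall_max = values.max()              # a single number for the whole series

print(running_max)
print(running_min)
print(running_sum)
print(overall_max)
```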

In [25]:
facebook[['Open','Close']].corr() # Correlation

Unnamed: 0,Open,Close
Open,1.0,0.986879
Close,0.986879,1.0


In [27]:
facebook['Close'].cumsum() # cumulative sum of closing prices

Date
2016-01-04      102.220001
2016-01-05      204.950004
2016-01-06      307.920005
2016-01-07      405.840003
2016-01-08      503.170005
2016-01-11      600.680007
2016-01-12      700.050010
2016-01-13      795.490012
2016-01-14      893.860015
2016-01-15      988.830016
2016-01-19     1084.090018
2016-01-20     1178.440016
2016-01-21     1272.600020
2016-01-22     1370.540022
2016-01-25     1467.550024
2016-01-26     1564.890020
2016-01-27     1659.340017
2016-01-28     1768.450018
2016-01-29     1880.660017
2016-02-01     1995.750013
2016-02-02     2110.360014
2016-02-03     2223.050016
2016-02-04     2333.540014
2016-02-05     2437.610014
2016-02-08     2537.360014
2016-02-09     2636.900015
2016-02-10     2737.900015
2016-02-11     2839.810019
2016-02-12     2941.820021
2016-02-16     3043.430022
                  ...     
2016-11-17    26052.980018
2016-11-18    26170.000015
2016-11-21    26291.770012
2016-11-22    26413.240013
2016-11-23    26534.080009
2016-11-25    26654.460

In [30]:
facebook.dropna() # Drops all rows where data is missing
# In our example, column 'Simple MA 20' had NaN for the first 19 rows. By applying the dropna() method, we drop those rows
# Notice that because I do not specify facebook.dropna(inplace=True), our original data is not modified.

Date,Open,High,Low,Close,Adj Close,Volume,Simple MA 20
2016-02-01,112.269997,115.720001,112.010002,115.089996,115.089996,46132700,99.787501
2016-02-02,114.800003,117.589996,113.199997,114.610001,114.610001,59778600,100.407001
2016-02-03,115.269997,115.339996,109.750000,112.690002,112.690002,56919300,100.905001
2016-02-04,111.800003,111.940002,109.250000,110.489998,110.489998,38648500,101.281000
2016-02-05,109.510002,109.580002,103.180000,104.070000,104.070000,76894700,101.588501
2016-02-08,100.410004,102.680000,97.459999,99.750000,99.750000,71229700,101.709500
2016-02-09,97.139999,102.400002,96.820000,99.540001,99.540001,62580100,101.811000
2016-02-10,101.550003,103.250000,100.239998,101.000000,101.000000,45179400,101.892500
2016-02-11,99.599998,105.110001,98.879997,101.910004,101.910004,43670600,102.216000
2016-02-12,103.739998,104.239998,101.089996,102.010002,102.010002,36176800,102.398000


In [31]:
facebook

Date,Open,High,Low,Close,Adj Close,Volume,Simple MA 20
2016-01-04,101.949997,102.239998,99.750000,102.220001,102.220001,37912400,
2016-01-05,102.889999,103.709999,101.669998,102.730003,102.730003,23258200,
2016-01-06,101.129997,103.769997,100.900002,102.970001,102.970001,25096200,
2016-01-07,100.500000,101.430000,97.300003,97.919998,97.919998,45172900,
2016-01-08,99.879997,100.500000,97.029999,97.330002,97.330002,35402300,
2016-01-11,97.910004,98.599998,95.389999,97.510002,97.510002,29873100,
2016-01-12,99.000000,99.959999,97.550003,99.370003,99.370003,28395400,
2016-01-13,100.580002,100.580002,95.209999,95.440002,95.440002,33410600,
2016-01-14,95.849998,98.870003,92.449997,98.370003,98.370003,48658600,
2016-01-15,93.980003,96.379997,93.540001,94.970001,94.970001,46132800,
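
The two cells above show that `dropna()` returns a new DataFrame and leaves the original untouched. A minimal sketch of the same behaviour on a hypothetical three-row frame (the `df`, `cleaned` and `only_ma` names are made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing moving-average value
df = pd.DataFrame({'Close': [1.0, 2.0, 3.0], 'MA': [np.nan, 1.5, 2.5]})

# dropna() returns a new DataFrame; df itself is not changed
cleaned = df.dropna()
print(len(df))       # 3: the original still contains the NaN row
print(len(cleaned))  # 2

# subset= restricts the NaN check to the listed columns
only_ma = df.dropna(subset=['MA'])
```

Assigning the result to a name (or back to `df`) is a common alternative to `inplace=True`.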


In [34]:
facebook['Momentum'] = facebook['Close'].pct_change(10) # Percent change over 10 periods: Close[t] / Close[t-10] - 1
facebook

Date,Open,High,Low,Close,Adj Close,Volume,Simple MA 20,Momentum
2016-01-04,101.949997,102.239998,99.750000,102.220001,102.220001,37912400,,
2016-01-05,102.889999,103.709999,101.669998,102.730003,102.730003,23258200,,
2016-01-06,101.129997,103.769997,100.900002,102.970001,102.970001,25096200,,
2016-01-07,100.500000,101.430000,97.300003,97.919998,97.919998,45172900,,
2016-01-08,99.879997,100.500000,97.029999,97.330002,97.330002,35402300,,
2016-01-11,97.910004,98.599998,95.389999,97.510002,97.510002,29873100,,
2016-01-12,99.000000,99.959999,97.550003,99.370003,99.370003,28395400,,
2016-01-13,100.580002,100.580002,95.209999,95.440002,95.440002,33410600,,
2016-01-14,95.849998,98.870003,92.449997,98.370003,98.370003,48658600,,
2016-01-15,93.980003,96.379997,93.540001,94.970001,94.970001,46132800,,
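
The first 10 rows of Momentum are NaN because there is no price 10 periods earlier to compare against. The sketch below shows the arithmetic behind `pct_change(n)` on a hypothetical series, using 2 periods so it can be verified by hand:

```python
import pandas as pd

# Hypothetical closing prices
closes = pd.Series([100.0, 110.0, 121.0, 99.0, 133.1])

# Percent change relative to the value 2 rows earlier
momentum = closes.pct_change(2)

# The same calculation written out with shift()
manual = closes / closes.shift(2) - 1

print(momentum)
# Rows 0 and 1 are NaN; row 2 is 121 / 100 - 1, row 3 is 99 / 110 - 1
```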


### Most important methods

In [37]:
facebook.head() # See the first n rows. Default is 5

Date,Open,High,Low,Close,Adj Close,Volume,Simple MA 20,Momentum
2016-01-04,101.949997,102.239998,99.75,102.220001,102.220001,37912400,,
2016-01-05,102.889999,103.709999,101.669998,102.730003,102.730003,23258200,,
2016-01-06,101.129997,103.769997,100.900002,102.970001,102.970001,25096200,,
2016-01-07,100.5,101.43,97.300003,97.919998,97.919998,45172900,,
2016-01-08,99.879997,100.5,97.029999,97.330002,97.330002,35402300,,


In [38]:
facebook.tail() # See the last n rows. Default is 5

Date,Open,High,Low,Close,Adj Close,Volume,Simple MA 20,Momentum
2016-12-23,117.0,117.559998,116.300003,117.269997,117.269997,10877300,118.6125,-0.020137
2016-12-27,116.959999,118.68,116.860001,118.010002,118.010002,12051500,118.4925,0.002038
2016-12-28,118.190002,118.25,116.650002,116.919998,116.919998,12087400,118.294999,-0.028177
2016-12-29,117.0,117.529999,116.059998,116.349998,116.349998,9921400,118.191499,-0.03211
2016-12-30,116.599998,116.830002,114.769997,115.050003,115.050003,18684100,118.189,-0.045783
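
Both methods accept the number of rows as an argument. A short sketch on a hypothetical ten-row frame:

```python
import pandas as pd

# Hypothetical frame with values 0..9
df = pd.DataFrame({'value': range(10)})

first_three = df.head(3)  # rows at positions 0, 1, 2
last_two = df.tail(2)     # rows at positions 8, 9

print(first_three['value'].tolist())  # [0, 1, 2]
print(last_two['value'].tolist())     # [8, 9]
```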


In [39]:
facebook.dropna() # Drop all rows that contain NaN; useful for cleaning the data

Date,Open,High,Low,Close,Adj Close,Volume,Simple MA 20,Momentum
2016-02-01,112.269997,115.720001,112.010002,115.089996,115.089996,46132700,99.787501,0.211856
2016-02-02,114.800003,117.589996,113.199997,114.610001,114.610001,59778600,100.407001,0.203128
2016-02-03,115.269997,115.339996,109.750000,112.690002,112.690002,56919300,100.905001,0.194383
2016-02-04,111.800003,111.940002,109.250000,110.489998,110.489998,38648500,101.281000,0.173428
2016-02-05,109.510002,109.580002,103.180000,104.070000,104.070000,76894700,101.588501,0.062589
2016-02-08,100.410004,102.680000,97.459999,99.750000,99.750000,71229700,101.709500,0.028244
2016-02-09,97.139999,102.400002,96.820000,99.540001,99.540001,62580100,101.811000,0.022601
2016-02-10,101.550003,103.250000,100.239998,101.000000,101.000000,45179400,101.892500,0.069349
2016-02-11,99.599998,105.110001,98.879997,101.910004,101.910004,43670600,102.216000,-0.065988
2016-02-12,103.739998,104.239998,101.089996,102.010002,102.010002,36176800,102.398000,-0.090901


In [41]:
facebook['Volume'].rolling(20).mean() # Calculate moving average

Date
2016-01-04           NaN
2016-01-05           NaN
2016-01-06           NaN
2016-01-07           NaN
2016-01-08           NaN
2016-01-11           NaN
2016-01-12           NaN
2016-01-13           NaN
2016-01-14           NaN
2016-01-15           NaN
2016-01-19           NaN
2016-01-20           NaN
2016-01-21           NaN
2016-01-22           NaN
2016-01-25           NaN
2016-01-26           NaN
2016-01-27           NaN
2016-01-28           NaN
2016-01-29           NaN
2016-02-01    41927140.0
2016-02-02    43020450.0
2016-02-03    44703505.0
2016-02-04    45381120.0
2016-02-05    46967210.0
2016-02-08    48758580.0
2016-02-09    50393930.0
2016-02-10    51233130.0
2016-02-11    51746130.0
2016-02-12    51122040.0
2016-02-16    51099760.0
                 ...    
2016-11-17    29390920.0
2016-11-18    29580445.0
2016-11-21    30477705.0
2016-11-22    31115325.0
2016-11-23    31240710.0
2016-11-25    30836570.0
2016-11-28    30514360.0
2016-11-29    30675460.0
2016-11-30    310810
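
The leading NaN values appear because a 20-row window is not full until the 20th row. The sketch below shows the same effect with a window of 3 on a hypothetical series, plus the `min_periods` argument, which allows partial windows:

```python
import pandas as pd

# Hypothetical daily volumes
volume = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0])

# Full windows only: the first 2 rows are NaN
ma3 = volume.rolling(3).mean()
print(ma3.tolist())  # [nan, nan, 20.0, 30.0, 40.0]

# min_periods=1 averages whatever rows are available so far
ma3_partial = volume.rolling(3, min_periods=1).mean()
print(ma3_partial.tolist())  # [10.0, 15.0, 20.0, 30.0, 40.0]
```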

In [46]:
# iloc = position-based indexing: [from row:up to row, from column:up to column]; the end position is excluded
facebook.iloc[1:3,3:5] # Picks rows 1 and 2, and columns 3 and 4

Date,Close,Adj Close
2016-01-05,102.730003,102.730003
2016-01-06,102.970001,102.970001


In [48]:
facebook.iloc[:200,2:5] # Grab first 200 rows, columns 2, 3, 4

Date,Low,Close,Adj Close
2016-01-04,99.750000,102.220001,102.220001
2016-01-05,101.669998,102.730003,102.730003
2016-01-06,100.900002,102.970001,102.970001
2016-01-07,97.300003,97.919998,97.919998
2016-01-08,97.029999,97.330002,97.330002
2016-01-11,95.389999,97.510002,97.510002
2016-01-12,97.550003,99.370003,99.370003
2016-01-13,95.209999,95.440002,95.440002
2016-01-14,92.449997,98.370003,98.370003
2016-01-15,93.540001,94.970001,94.970001
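
Because `iloc` works on positions, it behaves like Python list slicing regardless of what the index labels are. A minimal sketch on a hypothetical frame with string labels:

```python
import pandas as pd

# Hypothetical frame; the index labels are deliberately not numbers
df = pd.DataFrame(
    {'a': [1, 2, 3, 4], 'b': [5, 6, 7, 8], 'c': [9, 10, 11, 12]},
    index=['w', 'x', 'y', 'z'],
)

# Rows at positions 1 and 2, columns at positions 0 and 1 (end excluded)
subset = df.iloc[1:3, 0:2]
print(subset)

# Negative positions count from the end, as with lists
single = df.iloc[-1, 2]
print(single)  # 12
```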


In [89]:
# Also possible: slice rows by position, then select columns by name
facebook.iloc[:24][['Open','Close']]

Date,Open,Close
2016-01-04,101.949997,102.220001
2016-01-05,102.889999,102.730003
2016-01-06,101.129997,102.970001
2016-01-07,100.5,97.919998
2016-01-08,99.879997,97.330002
2016-01-11,97.910004,97.510002
2016-01-12,99.0,99.370003
2016-01-13,100.580002,95.440002
2016-01-14,95.849998,98.370003
2016-01-15,93.980003,94.970001


In [78]:
x = pd.read_json('http://pythonhow.com/supermarkets.json')
x = x.set_index('Address')
x

Address,City,Country,Employees,ID,Name,State
3666 21st St,San Francisco,USA,8,1,Madeira,CA 94114
735 Dolores St,San Francisco,USA,15,2,Bready Shop,CA 94119
332 Hill St,San Francisco,USA,25,3,Super River,California 94114
3995 23rd St,San Francisco,USA,10,4,Ben's Shop,CA 94114
1056 Sanchez St,San Francisco,USA,12,5,Sanchez,California
551 Alvarado St,San Francisco,USA,20,6,Richvalley,CA 94114


In [79]:
# loc = label-based indexing: select by index names; unlike iloc, both ends of a slice are included
x.loc['332 Hill St':'1056 Sanchez St']

Address,City,Country,Employees,ID,Name,State
332 Hill St,San Francisco,USA,25,3,Super River,California 94114
3995 23rd St,San Francisco,USA,10,4,Ben's Shop,CA 94114
1056 Sanchez St,San Francisco,USA,12,5,Sanchez,California
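
Note that the `loc` slice above returned the end label as well. The sketch below repeats this on a hypothetical `stores` frame (made-up addresses and employee counts) and also shows that `loc` accepts a boolean condition:

```python
import pandas as pd

# Hypothetical store data indexed by address
stores = pd.DataFrame(
    {'Employees': [8, 15, 25, 10]},
    index=['A St', 'B St', 'C St', 'D St'],
)
stores.index.name = 'Address'

# Label slices include BOTH endpoints
by_label = stores.loc['B St':'C St']
print(by_label)

# Rows matching a condition, restricted to one column
large = stores.loc[stores['Employees'] > 10, 'Employees']
print(large.tolist())  # [15, 25]
```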


In [80]:
# ix = mixed label- and position-based indexing: [3:5,"Name":"Name"], [3:5,2:3], ["Name":"Name",4:6] or ["Name":"Name","Name":"Name"]
# Not recommended: .ix was deprecated in pandas 0.20 and removed in pandas 1.0; use .loc or .iloc instead
x.ix[1:4,'City':'Employees']

Address,City,Country,Employees
735 Dolores St,San Francisco,USA,15
332 Hill St,San Francisco,USA,25
3995 23rd St,San Francisco,USA,10


In [81]:
x.ix['332 Hill St':'1056 Sanchez St',2:5]

Address,Employees,ID,Name
332 Hill St,25,3,Super River
3995 23rd St,10,4,Ben's Shop
1056 Sanchez St,12,5,Sanchez
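
On pandas versions where `.ix` is no longer available, the same mixed selections can be written by chaining `.iloc` and `.loc`. This is one possible sketch, using a hypothetical `stores` frame rather than the supermarkets data:

```python
import pandas as pd

# Hypothetical store data indexed by address
stores = pd.DataFrame(
    {
        'City': ['SF', 'SF', 'SF', 'SF', 'SF'],
        'Country': ['USA'] * 5,
        'Employees': [8, 15, 25, 10, 12],
        'ID': [1, 2, 3, 4, 5],
    },
    index=['A St', 'B St', 'C St', 'D St', 'E St'],
)

# Rows by position, columns by label (like ix[1:4, 'City':'Employees'])
by_position_then_label = stores.iloc[1:4].loc[:, 'City':'Employees']

# Rows by label, columns by position (like ix['C St':'E St', 2:4])
by_label_then_position = stores.loc['C St':'E St'].iloc[:, 2:4]

print(by_position_then_label)
print(by_label_then_position)
```

Chaining keeps each step unambiguous: `.iloc` always means positions and `.loc` always means labels.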


In [97]:
# Finally, to drop a column, we use the drop method
facebook.drop('Simple MA 20', axis=1, inplace=True)
# NOTE: axis=1 means we want to drop a column; axis=0 means we want to drop a row
# NOTE: inplace=True modifies facebook directly; without it, drop returns a new DataFrame and facebook keeps the column

In [98]:
facebook

Date,Open,High,Low,Close,Adj Close,Volume,Momentum
2016-01-04,101.949997,102.239998,99.750000,102.220001,102.220001,37912400,
2016-01-05,102.889999,103.709999,101.669998,102.730003,102.730003,23258200,
2016-01-06,101.129997,103.769997,100.900002,102.970001,102.970001,25096200,
2016-01-07,100.500000,101.430000,97.300003,97.919998,97.919998,45172900,
2016-01-08,99.879997,100.500000,97.029999,97.330002,97.330002,35402300,
2016-01-11,97.910004,98.599998,95.389999,97.510002,97.510002,29873100,
2016-01-12,99.000000,99.959999,97.550003,99.370003,99.370003,28395400,
2016-01-13,100.580002,100.580002,95.209999,95.440002,95.440002,33410600,
2016-01-14,95.849998,98.870003,92.449997,98.370003,98.370003,48658600,
2016-01-15,93.980003,96.379997,93.540001,94.970001,94.970001,46132800,
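
For comparison, here is a sketch of `drop` without `inplace=True`, on a hypothetical two-row frame. Each call returns a new DataFrame and leaves `df` as it was:

```python
import pandas as pd

# Hypothetical frame with row labels r0 and r1
df = pd.DataFrame({'Close': [1.0, 2.0], 'MA': [0.5, 1.5]}, index=['r0', 'r1'])

# Drop a column: returns a new DataFrame, df is unchanged
no_ma = df.drop('MA', axis=1)

# Equivalent keyword form that avoids remembering the axis number
no_ma_kw = df.drop(columns='MA')

# Drop a row by its index label
no_r0 = df.drop('r0', axis=0)

print(df.columns.tolist())     # ['Close', 'MA']
print(no_ma.columns.tolist())  # ['Close']
print(no_r0.index.tolist())    # ['r1']
```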


In [106]:
y = facebook.resample(rule='M').last() # Resample to month-end frequency, keeping the last row of each month
# http://benalexkeen.com/resampling-time-series-data-with-pandas/ <- lists other resampling rules
y

Date,Open,High,Low,Close,Adj Close,Volume,Momentum
2016-01-31,108.989998,112.839996,108.839996,112.209999,112.209999,62739500,0.140693
2016-02-29,107.599998,108.910004,106.75,106.919998,106.919998,32779000,0.048132
2016-03-31,114.699997,115.010002,113.769997,114.099998,114.099998,21207500,0.017115
2016-04-30,116.82,117.839996,115.839996,117.580002,117.580002,37140600,0.072419
2016-05-31,119.459999,120.099998,118.120003,118.809998,118.809998,23547600,0.00118
2016-06-30,114.669998,115.18,113.669998,114.279999,114.279999,23192700,-0.000962
2016-07-31,124.650002,125.839996,123.709999,123.940002,123.940002,35058800,0.060585
2016-08-31,125.599998,126.220001,125.099998,126.120003,126.120003,14200600,0.014071
2016-09-30,128.029999,128.589996,127.449997,128.270004,128.270004,18402900,-0.006198
2016-10-31,132.009995,132.119995,130.880005,130.990005,130.990005,15669000,0.02705
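
A minimal sketch of resampling on hypothetical daily data. It uses the `'MS'` (month start) rule as an assumption, since the spelling of the month-end alias differs between pandas versions; the only visible difference is that each month is labelled by its first day rather than its last.

```python
import pandas as pd

# Hypothetical daily series: values 0, 1, 2, ... from 2016-01-01 to 2016-03-31
dates = pd.date_range('2016-01-01', '2016-03-31', freq='D')
prices = pd.Series(range(len(dates)), index=dates, dtype=float)

# Group the rows by calendar month, then aggregate each group
monthly_last = prices.resample('MS').last()  # last value in each month
monthly_mean = prices.resample('MS').mean()  # average value in each month

print(monthly_last.tolist())  # [30.0, 59.0, 90.0]
print(monthly_mean.iloc[0])   # 15.0 (mean of 0..30)
```

`resample` only defines the groups; the method called afterwards (`last`, `mean`, `sum`, and so on) decides how each group is reduced to one row.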


### The End