# DS-1 Project-1

# Visualizing the Madness of Crowds with Python



#### The fundamental reason why we visualize things in Data Science isn't just to show off pretty pictures, but to express ideas that do not lend themselves to easy comprehension.

#### An idea I've been obsessed with over the years is that of asset price bubbles: when the price of an asset such as a stock, gold, or real estate decouples from its fundamental drivers by a magnitude previously unseen.

#### It's important to note that asset bubbles aren't limited to financial instruments. One of the earliest examples of a large-scale bubble took place in the 17th-century Dutch market for [tulips](https://en.wikipedia.org/wiki/Tulip_mania):


<img src="TulipPriceIndex.png">

#### For this project we're going to look at some exemplary financial time series to get a handle on some of the salient characteristics of asset bubbles.

#### We're also going to use a new library in addition to the standard libraries we use for data analysis:


In [6]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# New library
import pandas_datareader.data as web

import datetime



#### When analyzing financial assets, it's standard industry practice to compare an investment's performance against a benchmark such as the S&P 500, an index of 500 large market-cap US companies, usually listed on the New York Stock Exchange or the Nasdaq.
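
#### One common way to make that comparison is to rebase each price series to 100 on a shared start date, so that assets trading at very different price levels sit on one scale. The following is a minimal sketch of that idea, not necessarily the method used later in this project. The numbers are the first five 'Adj Close' values for the S&P 500 and HMNY taken from the `head()` outputs further down in this notebook, and the names `prices`, `rebased` and `relative` are illustrative.

```python
import pandas as pd

# First five 'Adj Close' values for the S&P 500 and HMNY, copied from the
# head() outputs further down in this notebook.
dates = pd.to_datetime(
    ['2013-01-03', '2013-01-04', '2013-01-07', '2013-01-08', '2013-01-09']
)
prices = pd.DataFrame(
    {
        'sp500': [1459.369995, 1466.469971, 1461.890015, 1457.150024, 1461.02002],
        'hmny': [827.360352, 777.368713, 779.868347, 779.868347, 779.868347],
    },
    index=dates,
)

# Rebase every series to 100 on the first date so they share one scale.
rebased = prices / prices.iloc[0] * 100

# Performance relative to the benchmark, in index points:
# positive means the asset is ahead of the S&P 500, negative means behind.
relative = rebased['hmny'] - rebased['sp500']

print(rebased.round(2))
print(relative.round(2))
```

#### With the full downloads, the same two lines would apply to a DataFrame built from each asset's 'Adj Close' column instead of the hand-copied values.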

#### Here, we'll create a pandas DataFrame for the S&P 500, along with DataFrames for Bitcoin (the cryptocurrency) and Helios and Matheson, the public company behind MoviePass. The Yahoo tickers are `^GSPC` for the S&P 500, `BTC-USD` for Bitcoin, and `HMNY` for Helios and Matheson; the DataFrames themselves are named `sp500`, `btc`, and `hmny`. The main variable of interest in each DataFrame is 'Adj Close' (the adjusted close), which is the closing price adjusted for events like stock splits and is therefore a more accurate reflection of how the asset's value changed over time.
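
#### To see why the adjustment matters, here is a small sketch of a split adjustment. All of the numbers are hypothetical (a made-up stock with an assumed 2-for-1 split), and this is a simplification: it only illustrates the split part of the adjustment and is not the data provider's actual procedure.

```python
import pandas as pd

# Hypothetical raw closing prices around a 2-for-1 split (illustrative numbers only).
raw_close = pd.Series(
    [100.0, 102.0, 51.5, 52.0],
    index=pd.to_datetime(['2020-01-02', '2020-01-03', '2020-01-06', '2020-01-07']),
)
split_date = pd.Timestamp('2020-01-06')  # assumed first trading day after the split
split_ratio = 2.0                        # assumed: 2 new shares for each old share

# Divide every pre-split price by the ratio so the whole series
# is expressed on the same per-share basis.
adj_close = raw_close.copy()
pre_split = adj_close.index < split_date
adj_close.loc[pre_split] = adj_close.loc[pre_split] / split_ratio

# Daily returns: the raw series shows a large artificial drop on the split
# date, while the adjusted series shows only the genuine price move.
raw_return = raw_close.pct_change()
adj_return = adj_close.pct_change()
print(pd.DataFrame({'raw': raw_return, 'adjusted': adj_return}))
```

#### Without the adjustment, a chart of the raw series would show a sudden halving of the price that no investor actually experienced, which is exactly the kind of artifact we don't want to mistake for a bubble bursting.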

#### We're also making sure all the variables we examine cover the same time frame, in this case January 3, 2013 to December 10, 2018. Each DataFrame is indexed by date over that period, with one row per day on which the asset traded.

In [7]:
# S&P 500 DataFrame

sp500 = web.get_data_yahoo('^GSPC', start=datetime.datetime(2013, 1, 3), end=datetime.datetime(2018, 12, 10))
sp500.head()

| Date | High | Low | Open | Close | Volume | Adj Close |
|---|---|---|---|---|---|---|
| 2013-01-03 | 1465.469971 | 1455.530029 | 1462.420044 | 1459.369995 | 3829730000 | 1459.369995 |
| 2013-01-04 | 1467.939941 | 1458.98999 | 1459.369995 | 1466.469971 | 3424290000 | 1466.469971 |
| 2013-01-07 | 1466.469971 | 1456.619995 | 1466.469971 | 1461.890015 | 3304970000 | 1461.890015 |
| 2013-01-08 | 1461.890015 | 1451.640015 | 1461.890015 | 1457.150024 | 3601600000 | 1457.150024 |
| 2013-01-09 | 1464.72998 | 1457.150024 | 1457.150024 | 1461.02002 | 3674390000 | 1461.02002 |


In [8]:
# Bitcoin DataFrame

btc = web.get_data_yahoo('BTC-USD', start=datetime.datetime(2013, 1, 3), end=datetime.datetime(2018, 12, 10))
btc.head()

| Date | High | Low | Open | Close | Volume | Adj Close |
|---|---|---|---|---|---|---|
| 2013-01-03 | 13.46 | 13.25 | 13.28 | 13.4 | 240845 | 13.4 |
| 2013-01-04 | 13.52 | 13.27 | 13.4 | 13.5 | 397884 | 13.5 |
| 2013-01-05 | 13.55 | 13.31 | 13.5 | 13.44 | 286932 | 13.44 |
| 2013-01-06 | 13.52 | 13.36 | 13.44 | 13.45 | 171497 | 13.45 |
| 2013-01-07 | 13.59 | 13.4 | 13.45 | 13.59 | 344083 | 13.59 |


In [10]:
# Helios and Matheson (HMNY) DataFrame

hmny = web.get_data_yahoo('HMNY', start=datetime.datetime(2013, 1, 3), end=datetime.datetime(2018, 12, 10))
hmny.head()

| Date | High | Low | Open | Close | Volume | Adj Close |
|---|---|---|---|---|---|---|
| 2013-01-03 | 950.0 | 745.0 | 752.5 | 827.5 | 0.0 | 827.360352 |
| 2013-01-04 | 940.0 | 762.5 | 837.5 | 777.5 | 0.0 | 777.368713 |
| 2013-01-07 | 780.0 | 780.0 | 780.0 | 780.0 | 0.0 | 779.868347 |
| 2013-01-08 | 780.0 | 780.0 | 780.0 | 780.0 | 0.0 | 779.868347 |
| 2013-01-09 | 780.0 | 780.0 | 780.0 | 780.0 | 0.0 | 779.868347 |
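
#### The three outputs above hint at a wrinkle: the Bitcoin data has rows for January 5 and 6, 2013 (a weekend), while the S&P 500 and HMNY data skip from January 4 to January 7. So even with identical start and end dates, the date indexes don't line up exactly. Below is a minimal sketch of one way to put the three 'Adj Close' series on a shared index. The values are copied from the outputs above (only the rows through January 7), and the names `combined` and `aligned` are illustrative; dropping unmatched dates is just one option among several.

```python
import pandas as pd

# 'Adj Close' values copied from the three head() outputs above,
# restricted to dates up to 2013-01-07.
stock_dates = pd.to_datetime(['2013-01-03', '2013-01-04', '2013-01-07'])
sp500_adj = pd.Series(
    [1459.369995, 1466.469971, 1461.890015], index=stock_dates, name='sp500'
)
hmny_adj = pd.Series(
    [827.360352, 777.368713, 779.868347], index=stock_dates, name='hmny'
)
btc_adj = pd.Series(
    [13.4, 13.5, 13.44, 13.45, 13.59],
    index=pd.to_datetime(
        ['2013-01-03', '2013-01-04', '2013-01-05', '2013-01-06', '2013-01-07']
    ),
    name='btc',
)

# Column-wise concat keeps every date from every series (an outer join);
# the weekend rows end up with NaN for the two stocks.
combined = pd.concat([sp500_adj, btc_adj, hmny_adj], axis=1).sort_index()

# Keep only the dates on which all three series have a value.
aligned = combined.dropna()

print(combined)
print(aligned)
```

#### With the full downloads, the same approach would start from `sp500['Adj Close']`, `btc['Adj Close']` and `hmny['Adj Close']` (renamed so the columns stay distinguishable) rather than from hand-copied values.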
