In financial analysis, many different instruments and methods are used to decide whether a stock is worth investing in and which stocks to include in a portfolio. The S&P500 is one of the most widely followed indexes to invest in and has, as the name indicates, roughly 500 companies to analyse (slightly more than 500 tickers, since some companies list more than one share class). The purpose of this project is to collect data for each stock listed on the S&P500 index and look at the adjusted close prices for the companies with data available from 2016 until May 20th of this year. We collect the data from Yahoo Finance and compile it into a single dataframe for analysis. We use the adjusted close prices to calculate the returns of each stock, and we index each price series to its first observation (set to 100) so that the development of two different stocks can be compared directly.

In [62]:
"""
write this in the Terminal:
conda install -c anaconda pandas-datareader

When asked:
The following packages will be SUPERSEDED by a higher-priority channel:

  ca-certificates                                 pkgs/main --> anaconda
  certifi                                         pkgs/main --> anaconda
  openssl                                         pkgs/main --> anaconda
  qt                                              pkgs/main --> anaconda
Proceed ([y]/n)?

Press y

jupyter nbextension enable --py widgetsnbextension

jupyter labextension install @jupyter-widgets/jupyterlab-manager

"""

import datetime as dt
import os
import pickle

import bs4 as bs
import ipywidgets as widgets
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pandas_datareader as web
import requests
from ipywidgets import interact, interact_manual
from matplotlib import style
from scipy.stats import norm # normal distribution

style.use("ggplot")

First we need to get the tickers for each of the companies on the S&P500 index list, which we will later use to match all the historic data to the appropriate ticker. We use *requests.get* to fetch the Wikipedia page that lists the tickers, and then use *BeautifulSoup* to parse the HTML of the page. The *soup.find* command is used to specify exactly where in the HTML the table we want is located.

We make an empty list, and use a for loop to read all entries in the column of the tickers and append each of these to the list we made (called tickers). 

We save the list as a pickle file, called *"sp500tickers.pickle"*, so that we can load it again quickly later on.


In [63]:
# Automating S&P500 - From Yahoo Finance - Close price is adjusted for splits, and Adj. Close price is adjusted for both dividends and splits.
def save_sp500_tickers():
    resp = requests.get("https://en.wikipedia.org/wiki/List_of_S%26P_500_companies")
    soup = bs.BeautifulSoup(resp.text, "lxml")
    table = soup.find("table", {"class": "wikitable sortable"})
    tickers = []
    for row in table.find_all("tr")[1:]:  # skip the header row
        # The ticker is in the first cell; strip the trailing newline and use "-" instead of "."
        ticker = row.find_all("td")[0].text.strip().replace(".", "-")
        tickers.append(ticker)

    df_tickers = pd.DataFrame(tickers)
    df_tickers.to_csv("tickers.csv")

    with open("sp500tickers.pickle", "wb") as f:
        pickle.dump(tickers, f)

    print(tickers)
    return tickers
    
save_sp500_tickers()

df_tickers = pd.read_csv("tickers.csv", index_col= 0)
df_tickers.columns = ["Ticker"]
df_tickers.head()

['MMM', 'ABT', 'ABBV', 'ABMD', 'ACN', 'ATVI', 'ADBE', 'AMD', 'AAP', 'AES', 'AMG', 'AFL', 'A', 'APD', 'AKAM', 'ALK', 'ALB', 'ARE', 'ALXN', 'ALGN', 'ALLE', 'AGN', 'ADS', 'LNT', 'ALL', 'GOOGL', 'GOOG', 'MO', 'AMZN', 'AEE', 'AAL', 'AEP', 'AXP', 'AIG', 'AMT', 'AWK', 'AMP', 'ABC', 'AME', 'AMGN', 'APH', 'APC', 'ADI', 'ANSS', 'ANTM', 'AON', 'AOS', 'APA', 'AIV', 'AAPL', 'AMAT', 'APTV', 'ADM', 'ARNC', 'ANET', 'AJG', 'AIZ', 'ATO', 'T', 'ADSK', 'ADP', 'AZO', 'AVB', 'AVY', 'BHGE', 'BLL', 'BAC', 'BK', 'BAX', 'BBT', 'BDX', 'BRK-B', 'BBY', 'BIIB', 'BLK', 'HRB', 'BA', 'BKNG', 'BWA', 'BXP', 'BSX', 'BMY', 'AVGO', 'BR', 'BF-B', 'CHRW', 'COG', 'CDNS', 'CPB', 'COF', 'CPRI', 'CAH', 'KMX', 'CCL', 'CAT', 'CBOE', 'CBRE', 'CBS', 'CE', 'CELG', 'CNC', 'CNP', 'CTL', 'CERN', 'CF', 'SCHW', 'CHTR', 'CVX', 'CMG', 'CB', 'CHD', 'CI', 'XEC', 'CINF', 'CTAS', 'CSCO', 'C', 'CFG', 'CTXS', 'CLX', 'CME', 'CMS', 'KO', 'CTSH', 'CL', 'CMCSA', 'CMA', 'CAG', 'CXO', 'COP', 'ED', 'STZ', 'COO', 'CPRT', 'GLW', 'COST', 'COTY', 'CCI', 'CS

Unnamed: 0,Ticker
0,MMM
1,ABT
2,ABBV
3,ABMD
4,ACN
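
The ticker clean-up and the pickle round trip can be tried without touching Wikipedia. The following is only a sketch: `raw_cells` is a made-up sample of what the table cells might look like, and the file name `sample_tickers.pickle` is chosen here for illustration.

```python
import os
import pickle
import tempfile

# Hypothetical raw cell texts: note the trailing newline and the "." in
# share-class tickers.
raw_cells = ["MMM\n", "BRK.B\n", "BF.B\n"]

# Strip whitespace and swap "." for "-", the form seen in the printed
# ticker list (e.g. BRK-B, BF-B).
sample_tickers = [cell.strip().replace(".", "-") for cell in raw_cells]

# Round trip through a pickle file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample_tickers.pickle")
    with open(path, "wb") as f:
        pickle.dump(sample_tickers, f)
    with open(path, "rb") as f:
        loaded_tickers = pickle.load(f)

print(loaded_tickers)  # ['MMM', 'BRK-B', 'BF-B']
```

Using `strip()` rather than slicing off the last character means the clean-up still works if a cell happens to carry no trailing newline.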


Next, we need data for each ticker we have in the list. We get this from Yahoo, using the *DataReader* from the pandas-datareader package.

Remember that we saved the tickers as a pickle file, which we now open and loop over to fetch the data for each ticker. We also make a separate folder for all of the CSV files we will get. We suggest you download the stock_dfs folder from our repository, as this should save some time. Otherwise you can run the code, and a folder with all the stock data will be created for you. We define appropriate start and end dates for the data we want, which in this case are the start of 2016 and the latest date with available data.

We then create a for loop, and save each CSV file with price data in the stock_dfs folder. 

In [64]:
#Getting data from Yahoo
def data_yahoo(reload_sp500=False):
    if reload_sp500:
        tickers = save_sp500_tickers()
    else:
        with open("sp500tickers.pickle", "rb") as f:
            tickers = pickle.load(f)
    if not os.path.exists('stock_dfs'):
        os.makedirs('stock_dfs')

    start = dt.datetime(2016, 1, 1)
    end = dt.datetime.now()
    for ticker in tickers:
        # just in case your connection breaks, we'd like to save our progress!
        if not os.path.exists('stock_dfs/{}.csv'.format(ticker)):
            df = web.DataReader(ticker, 'yahoo', start, end)
            df.to_csv('stock_dfs/{}.csv'.format(ticker))
        else:
            print('Already have {}'.format(ticker))

data_yahoo()

Already have MMM
Already have ABT
Already have ABBV
Already have ABMD
Already have ACN
Already have ATVI
Already have ADBE
Already have AMD
Already have AAP
Already have AES
Already have AMG
Already have AFL
Already have A
Already have APD
Already have AKAM
Already have ALK
Already have ALB
Already have ARE
Already have ALXN
Already have ALGN
Already have ALLE
Already have AGN
Already have ADS
Already have LNT
Already have ALL
Already have GOOGL
Already have GOOG
Already have MO
Already have AMZN
Already have AEE
Already have AAL
Already have AEP
Already have AXP
Already have AIG
Already have AMT
Already have AWK
Already have AMP
Already have ABC
Already have AME
Already have AMGN
Already have APH
Already have APC
Already have ADI
Already have ANSS
Already have ANTM
Already have AON
Already have AOS
Already have APA
Already have AIV
Already have AAPL
Already have AMAT
Already have APTV
Already have ADM
Already have ARNC
Already have ANET
Already have AJG
Already have AIZ
Already have A
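
The "skip what we already have" logic in the loop can be sketched without a network connection. Everything here is an assumption made for illustration: `fetch_prices` is a hypothetical stand-in for the real download, and `download_all` is not part of the notebook.

```python
import os
import tempfile


def fetch_prices(ticker):
    """Hypothetical stand-in for the real download; returns CSV text."""
    return "Date,Adj Close\n2016-01-04,100.0\n"


def download_all(tickers, folder):
    """Write one CSV per ticker, skipping files that already exist."""
    os.makedirs(folder, exist_ok=True)
    downloaded, skipped = [], []
    for ticker in tickers:
        path = os.path.join(folder, "{}.csv".format(ticker))
        if os.path.exists(path):
            skipped.append(ticker)  # progress from an earlier run is kept
            continue
        with open(path, "w") as f:
            f.write(fetch_prices(ticker))
        downloaded.append(ticker)
    return downloaded, skipped


with tempfile.TemporaryDirectory() as tmp_dir:
    folder = os.path.join(tmp_dir, "stock_dfs")
    first_run = download_all(["MMM", "ABT"], folder)
    second_run = download_all(["MMM", "ABT", "ABBV"], folder)

print(first_run)   # (['MMM', 'ABT'], [])
print(second_run)  # (['ABBV'], ['MMM', 'ABT'])
```

On the second run only the missing ticker is fetched, which is why an interrupted download can simply be restarted.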

The data from Yahoo and the tickers are not very useful by themselves, so we compile them into one dataframe holding all tickers and their data. We open the pickle file again and make an empty dataframe.

We can then compile each ticker with its data. We drop all columns that are not *Adj Close* and rename *Adj Close* to the ticker name, since the adjusted close price is the only data we keep. We choose to include only this, since the adjusted close price takes into account dividend payments, any stock splits, and rights offerings.

For easier comparison, we later also index the data by dividing each column by its first observation and multiplying by 100.

We then use the empty data frame made before and join all the data together into a single data frame.

We also save the data frame as a CSV file for easy access.

Finally, we create a data frame from the CSV file just saved. We call this new data frame *data_df*; the date is set as its index further down, before the prices are indexed.

It is important to note that we drop those companies that do not have stock data available for the entire period, those being "BHGE", "DWDP", "TPR", "ARNC", "ZBH", "OKE", "EVRG", "COST", "EW", "BBT", "JNJ", "VMC", "LIN", "COTY", "DGX", "FTV", "LW".

In [89]:
def compile_data():
    with open("sp500tickers.pickle", "rb") as f:
        tickers = pickle.load(f)

    main_df = pd.DataFrame()

    #Iterating though all DFs

    for count, ticker in enumerate(tickers):
        df = pd.read_csv("stock_dfs/{}.csv".format(ticker))
        df.set_index("Date", inplace=True)
        df.rename(columns={"Adj Close": ticker}, inplace=True)  # Name the Adj Close column after the ticker
        df.drop(columns=["Open", "High", "Low", "Close", "Volume"], inplace=True)  # Keep only the adjusted close
    
        if main_df.empty:
            main_df = df
        else:
            main_df = main_df.join(df, how="outer")
        
        if count % 100 == 0:  # Report progress every 100 files (0, 100, 200, ...)
            print("I have now compiled", count, "stock files")
    print(main_df.head())
    main_df.to_csv("sp500_joined_adj_closes.csv")

compile_data()

data_df = pd.read_csv("sp500_joined_adj_closes.csv")
data_df = data_df.drop(["BHGE", "DWDP", "TPR", "ARNC", "ZBH", "OKE", "EVRG", "COST", "EW", "BBT", "JNJ", "VMC", "LIN", "COTY", "DGX", "FTV", "LW"], axis=1)
data_df.head()

I have now compiled 0 stock files
I have now compiled 100 stock files
I have now compiled 200 stock files
I have now compiled 300 stock files
I have now compiled 400 stock files
I have now compiled 500 stock files
                   MMM        ABT       ABBV       ABMD        ACN       ATVI  \
Date                                                                            
2016-01-04  135.163361  39.841354  50.214832  85.239998  95.316299  36.604977   
2016-01-05  135.752548  39.832066  50.005642  85.000000  95.812393  36.137928   
2016-01-06  133.018326  39.497967  50.014351  85.300003  95.625191  35.797367   
2016-01-07  129.777802  38.551361  49.866177  81.919998  92.817101  35.291401   
2016-01-08  129.335922  37.743946  48.506435  84.580002  91.918503  34.746506   

                 ADBE   AMD         AAP       AES    ...            WLTW  \
Date                                                 ...                   
2016-01-04  91.970001  2.77  151.373932  8.105658    ...      118.

Unnamed: 0,Date,MMM,ABT,ABBV,ABMD,ACN,ATVI,ADBE,AMD,AAP,...,WMB,WLTW,WYNN,XEL,XRX,XLNX,XYL,YUM,ZION,ZTS
0,2016-01-04,135.163361,39.841354,50.214832,85.239998,95.316299,36.604977,91.970001,2.77,151.373932,...,21.791628,118.837708,64.482521,32.163685,25.023848,42.507366,34.595341,48.128719,25.348028,46.09782
1,2016-01-05,135.752548,39.832066,50.005642,85.0,95.812393,36.137928,92.339996,2.75,150.339844,...,21.866369,119.655312,65.701477,32.488022,24.975262,43.138481,34.585754,48.008751,25.072823,46.819458
2,2016-01-06,133.018326,39.497967,50.014351,85.300003,95.625191,35.797367,91.019997,2.51,146.362579,...,19.017845,114.083321,62.297787,32.830379,24.659422,42.349586,34.154266,47.668831,24.342087,46.82922
3,2016-01-07,129.777802,38.551361,49.866177,81.919998,92.817101,35.291401,89.110001,2.28,147.983307,...,17.116064,109.300514,56.437443,32.956512,23.979164,40.678997,33.272129,46.042538,23.611343,45.405415
4,2016-01-08,129.335922,37.743946,48.506435,84.580002,91.918503,34.746506,87.849998,2.14,144.731918,...,16.866919,110.888451,54.140198,32.596134,23.298904,39.8437,32.955711,45.416023,23.345621,44.742283
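
The list of dropped tickers is hard-coded in the cell above. As a sketch of how such a list could be found programmatically (not necessarily how it was produced here), an outer join leaves `NaN` wherever a ticker has no observation, so columns with any `NaN` are the ones without full coverage. The tickers `FULL` and `NEW` and their prices are made up.

```python
import pandas as pd

# Two tiny, made-up price series; "NEW" starts trading one day later.
full = pd.DataFrame({"FULL": [10.0, 11.0, 12.0]},
                    index=["2016-01-04", "2016-01-05", "2016-01-06"])
late = pd.DataFrame({"NEW": [5.0, 6.0]},
                    index=["2016-01-05", "2016-01-06"])

# The outer join keeps every date and fills the gaps with NaN.
joined = full.join(late, how="outer")

# Columns with at least one missing value lack data for part of the period.
incomplete = joined.columns[joined.isna().any()].tolist()
complete_only = joined.drop(columns=incomplete)

print(incomplete)                   # ['NEW']
print(list(complete_only.columns))  # ['FULL']
```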


With all the data collected, we can now look at how the price of each stock has changed through time. The code below will make an interactive widget that shows the observed adjusted close price for each date available in each file.

The reader can choose between two individual stocks and compare the close price for each.

In [90]:
def plot_w(dataframe, ticker, benchmark):

    # a. Printing highest observed values and corresponding date
    max1 = dataframe[ticker].max()
    max2 = dataframe[benchmark].max()
    max1_date = dataframe.loc[dataframe[ticker].idxmax(), 'Date']  # Date of the first occurrence of the maximum
    max2_date = dataframe.loc[dataframe[benchmark].idxmax(), 'Date']
    print("The highest adjusted close price observed at: \n", ticker, ":", max1.round(2), "USD on the date", max1_date,
          "\n", benchmark, ":", max2.round(2), "USD on the date", max2_date)

    # b. Setting up plot based on dropdown input
    fig, ax = plt.subplots(figsize=(10, 5))
    mpl_figure = dataframe.loc[:, ['Date', ticker, benchmark]]
    mpl_figure.plot(x='Date', y=[ticker, benchmark], style=['-b', '-k'], ax=ax, fontsize=11, legend=True)
    ax.set_ylabel("USD", labelpad=5)
    ax.set_title("Adjusted close prices for " + ticker + " and " + benchmark)

    # Label every 52nd observation; ticks and labels are set together so they stay aligned
    tick_positions = list(range(0, len(dataframe), 52))
    ax.set_xticks(tick_positions)
    ax.set_xticklabels(dataframe['Date'].iloc[::52])
    fig.autofmt_xdate()
    
# c. Creating the widget for the plot
widgets.interact(plot_w,
    dataframe = widgets.fixed(data_df),
    ticker = widgets.Dropdown(
            options=list(data_df.columns.difference(['Date'])),
            value='ATVI',
            description='Company 1:',
            disabled=False,
        ),
    benchmark = widgets.Dropdown(
            options=list(data_df.columns.difference(['Date'])),
            value='AAPL',
            description='Company 2:',
            disabled=False,
        )
)

interactive(children=(Dropdown(description='Company 1:', index=51, options=('A', 'AAL', 'AAP', 'AAPL', 'ABBV',…

<function __main__.plot_w(dataframe, ticker, benchmark)>
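
The "highest price and its date" lookup can be checked on a few rows. This sketch uses the first five MMM observations from the table further up; `sample`, `peak_row`, `peak_price` and `peak_date` are names introduced here for illustration.

```python
import pandas as pd

# First five MMM observations, copied from the joined table.
sample = pd.DataFrame({
    "Date": ["2016-01-04", "2016-01-05", "2016-01-06", "2016-01-07", "2016-01-08"],
    "MMM": [135.163361, 135.752548, 133.018326, 129.777802, 129.335922],
})

peak_row = sample["MMM"].idxmax()         # row label of the highest price
peak_price = sample.loc[peak_row, "MMM"]  # the price itself
peak_date = sample.loc[peak_row, "Date"]  # the matching date

print(peak_price, peak_date)  # 135.752548 2016-01-05
```

`idxmax` returns the first row holding the maximum, so ties resolve to the earliest date.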

On the graph above, taking Activision Blizzard and Apple as our starting point, we see that the two stocks have recently followed a relatively positive trend, after which both dropped around early February of this year. The Apple stock has since managed to bounce back, whereas the Activision Blizzard stock has not.

A problem does occur, however, when comparing two stocks that have very different price levels. We can fix this by indexing the stock data so that the stocks are comparable with each other no matter the price level.

In [93]:
# Set "Date" as the index so that only price columns remain when we divide.
# With inplace=True the call changes data_df itself and returns None, so its result is not assigned.
data_df.set_index("Date", inplace=True)


In [94]:
data_df_indexed = data_df/data_df.iloc[0]*100 #We do the calculation. 

In [95]:
data_df_indexed.head() #We print just to make sure that this has been done right. 

Date,MMM,ABT,ABBV,ABMD,ACN,ATVI,ADBE,AMD,AAP,AES,...,WMB,WLTW,WYNN,XEL,XRX,XLNX,XYL,YUM,ZION,ZTS
2016-01-04,100.0,100.0,100.0,100.0,100.0,100.0,100.0,100.0,100.0,100.0,...,100.0,100.0,100.0,100.0,100.0,100.0,100.0,100.0,100.0,100.0
2016-01-05,100.435908,99.976686,99.583409,99.718445,100.520471,98.724084,100.4023,99.277979,99.316865,101.378639,...,100.342982,100.688001,101.890367,101.008395,99.805842,101.484719,99.97229,99.750734,98.914292,101.565449
2016-01-06,98.413006,99.138113,99.600753,100.070396,100.32407,97.793717,98.96705,90.613719,96.689422,97.879109,...,87.271338,95.999261,96.611897,102.072818,98.543687,99.628818,98.725047,99.044461,96.031481,101.586625
2016-01-07,96.015519,96.762175,99.305672,96.105115,97.377994,96.411483,96.89029,82.310469,97.7601,95.440101,...,78.54422,91.974607,87.523629,102.464978,95.825248,95.6987,96.175174,95.665412,93.148639,98.497964
2016-01-08,95.688596,94.7356,96.597824,99.22572,96.435241,94.922901,95.520275,77.256322,95.612181,96.818686,...,77.400911,93.310829,83.961044,101.344527,93.106803,93.733637,95.260549,94.363665,92.100344,97.059433
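
Written out, the indexing divides each price by the first observation of the same stock and scales by 100. The symbols are introduced here for illustration: $P_{i,t}$ is the adjusted close of stock $i$ on date $t$, $t_0$ is the first date in the sample, and $I_{i,t}$ is the indexed value. The AMD prices are taken from the joined table (2.77 on 2016-01-04 and 2.75 on 2016-01-05).

```latex
I_{i,t} = 100 \cdot \frac{P_{i,t}}{P_{i,t_0}},
\qquad
I_{\mathrm{AMD},\,2016\text{-}01\text{-}05} = 100 \cdot \frac{2.75}{2.77} \approx 99.28
```

This agrees with the AMD entry of 99.277979 in the second row of the indexed table.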


In [96]:
data_df_indexed = data_df_indexed.reset_index() #We then reset the index so that we can use the column "Date" on the x-axis in the graph.


In [97]:
def plot_index(dataframe_index, ticker, benchmark):
    # a. Printing highest observed index values and the dates they occurred
    max1 = dataframe_index[ticker].max()
    max2 = dataframe_index[benchmark].max()
    max1_date = dataframe_index.loc[dataframe_index[ticker].idxmax(), 'Date']
    max2_date = dataframe_index.loc[dataframe_index[benchmark].idxmax(), 'Date']
    print("The highest indexed price (first observation = 100) observed at: \n", ticker, ":", max1.round(2), "on the date", max1_date,
          "\n", benchmark, ":", max2.round(2), "on the date", max2_date)

    # b. Setting up plot based on dropdown input
    dataframe_index.plot(x='Date', y=[ticker, benchmark], style=['-b', '-k'], figsize=(8, 5), fontsize=11, legend=True)

# c. Creating the widget for the plot (data_df is indexed by date here, so its columns are the tickers only)
widgets.interact(plot_index,
                 dataframe_index = widgets.fixed(data_df_indexed),
                 ticker = widgets.Dropdown(
                          options=list(data_df.columns), value='ATVI',
                          description='Company 1:', disabled=False,),

                 benchmark = widgets.Dropdown(options=list(data_df.columns), value='AAPL',
                                 description='Company 2:', disabled=False,)
    )

interactive(children=(Dropdown(description='Company 1:', index=5, options=('MMM', 'ABT', 'ABBV', 'ABMD', 'ACN'…

<function __main__.plot_index(dataframe_index, ticker, benchmark)>

Using the same two stocks as reference, it is now much easier to compare their development. The conclusions are the same as when we looked at the raw adjusted close prices: the trend for the two stocks is roughly the same until early February, after which Apple turns around into a positive trend while ATVI remains at a new, lower level. The story is unchanged, but it is a lot easier to see now that the data has been indexed.

Next up we wish to analyse the stock returns, which we will do by looking at the percentage change for each day of each individual stock. 

In [98]:
# Percentage change in the adjusted close prices.
# For each day we subtract the previous day's price and divide by the previous day's price (times 100 to get per cent).
data_df_pct_change = data_df.apply(lambda x: (x - x.shift(1))/x.shift(1)*100)
# The first observed day has no previous day, so its row is all NaN. We fill these with zeros.
data_df_pct_change = data_df_pct_change.fillna(value=0)
data_df_pct_change.head() #We print to make sure the dataframe looks correct.

data_df_pct_change = data_df_pct_change.reset_index() #Again we reset the index in order to use "Date" on the x-axis.
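
As a quick check of the return calculation, the shift-based formula can be compared with pandas' built-in `pct_change` on a few values. This is a sketch using the first three AMD prices from the joined table; `prices`, `manual` and `builtin` are names introduced here.

```python
import pandas as pd

# First three AMD adjusted close prices from the joined table.
prices = pd.Series([2.77, 2.75, 2.51],
                   index=["2016-01-04", "2016-01-05", "2016-01-06"],
                   name="AMD")

# Same formula as in the cell above: (today - yesterday) / yesterday * 100.
manual = ((prices - prices.shift(1)) / prices.shift(1) * 100).fillna(0)

# pandas' built-in equivalent, scaled to per cent.
builtin = (prices.pct_change() * 100).fillna(0)

print(manual.round(2).tolist())
print(builtin.round(2).tolist())
```

The two series agree up to floating-point rounding, so `pct_change() * 100` could replace the lambda when the data has no gaps.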

In [99]:
def plot_pct_change(dataframe_pct_change, ticker, benchmark):
    # a. Printing the highest percentage change and the date it occurred for each company
    max1 = dataframe_pct_change[ticker].max()
    max2 = dataframe_pct_change[benchmark].max()
    max1_date = dataframe_pct_change.loc[dataframe_pct_change[ticker].idxmax(), 'Date']
    max2_date = dataframe_pct_change.loc[dataframe_pct_change[benchmark].idxmax(), 'Date']
    print("The highest percentage change: \n", ticker, ":", max1.round(2), "pct. on the date", max1_date,
          "\n", benchmark, ":", max2.round(2), "pct. on the date", max2_date)

    # b. Setting up plot based on dropdown input
    dataframe_pct_change.plot(x='Date', y=[ticker, benchmark], style=['-b', '-g'], figsize=(8, 5), fontsize=11, legend=True)

# c. Creating the widget for the plot (data_df is indexed by date here, so its columns are the tickers only)
widgets.interact(plot_pct_change,
                 dataframe_pct_change = widgets.fixed(data_df_pct_change),
                 ticker = widgets.Dropdown(
                          options=list(data_df.columns), value='ATVI',
                          description='Company 1:', disabled=False,),

                 benchmark = widgets.Dropdown(options=list(data_df.columns), value='AAPL',
                                 description='Company 2:', disabled=False)
    )

interactive(children=(Dropdown(description='Company 1:', index=5, options=('MMM', 'ABT', 'ABBV', 'ABMD', 'ACN'…

<function __main__.plot_pct_change(dataframe_pct_change, ticker, benchmark)>

We now have an instrument that will tell us if a stock has a tendency of yielding very positive and negative returns, or if the returns are somewhat minimal.

We can also see which stocks a single stock is most correlated with, least correlated with, and most negatively correlated with. In theory, this helps an investor whose portfolio consists of only a few stocks decide which additional stocks to add. Since we only have data for a few years, we base the correlation calculations on the percentage change of each stock, as most literature suggests basing short-run correlations between stocks on their returns.

In [100]:
data_df_corr = data_df.pct_change().corr()
df = data_df_corr
data = df
data.head()

Unnamed: 0,MMM,ABT,ABBV,ABMD,ACN,ATVI,ADBE,AMD,AAP,AES,...,WMB,WLTW,WYNN,XEL,XRX,XLNX,XYL,YUM,ZION,ZTS
MMM,1.0,0.434111,0.336578,0.264108,0.498707,0.280339,0.417762,0.206384,0.166423,0.277283,...,0.22397,0.405373,0.251276,0.132721,0.36273,0.488232,0.596775,0.40561,0.340739,0.443617
ABT,0.434111,1.0,0.45736,0.450732,0.48402,0.342146,0.493453,0.290314,0.21478,0.262426,...,0.202408,0.374291,0.226561,0.102206,0.325227,0.371582,0.439535,0.393401,0.343151,0.530334
ABBV,0.336578,0.45736,1.0,0.306029,0.334571,0.214254,0.328809,0.165315,0.151816,0.144665,...,0.208775,0.286081,0.172011,0.059425,0.215166,0.22991,0.307586,0.219717,0.269321,0.440411
ABMD,0.264108,0.450732,0.306029,1.0,0.407676,0.334404,0.486673,0.269574,0.148915,0.160628,...,0.256828,0.27683,0.278435,-0.014634,0.223936,0.281301,0.338265,0.325665,0.236613,0.412202
ACN,0.498707,0.48402,0.334571,0.407676,1.0,0.393214,0.550518,0.254905,0.218867,0.298163,...,0.259396,0.454752,0.267392,0.128098,0.361623,0.414756,0.471732,0.456129,0.385275,0.485583
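
One commonly cited reason for correlating returns rather than price levels is that two series sharing nothing but an upward trend look highly correlated in levels. The sketch below illustrates this with synthetic, deterministic data; the series `A` and `B`, the drift and the wiggle size are all made up and say nothing about the S&P500 stocks.

```python
import numpy as np
import pandas as pd

n = 400        # number of trading days (a multiple of 4 so the patterns balance)
drift = 0.001  # upward drift per day, shared by both series
wiggle = 0.01  # size of the daily deviation from the drift

# Two deterministic deviation patterns that are uncorrelated by construction.
pattern_a = np.tile([1, -1, 1, -1], n // 4)
pattern_b = np.tile([1, 1, -1, -1], n // 4)
returns_a = drift + wiggle * pattern_a
returns_b = drift + wiggle * pattern_b

# Turn the returns into price paths that both start at 100.
prices = pd.DataFrame({
    "A": np.concatenate([[100.0], 100.0 * np.cumprod(1 + returns_a)]),
    "B": np.concatenate([[100.0], 100.0 * np.cumprod(1 + returns_b)]),
})

price_corr = prices["A"].corr(prices["B"])
return_corr = prices["A"].pct_change().corr(prices["B"].pct_change())

print("correlation of price levels:", round(price_corr, 3))
print("correlation of daily returns:", round(return_corr, 3))
```

The price levels come out almost perfectly correlated because of the common drift, while the daily returns are essentially uncorrelated.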


In [101]:
def visualize_data(dataframe_corr, ticker):
    # Correlations of every other company with the chosen ticker (the ticker itself is left out)
    corr = dataframe_corr[ticker].drop(ticker)

    mostpos_corr = corr.nlargest(n=5)
    mostneg_corr = corr.nsmallest(n=5)

    # The least correlated company has the correlation closest to zero, whatever its sign
    least_corr = corr.abs().idxmin()
    no_corr = corr[least_corr]

    print("The 5 most positively correlated companies with", ticker, "are: \n", mostpos_corr, "\n")

    print("The company", ticker, "is most uncorrelated with is", least_corr, "with a correlation of:", no_corr.round(6), "\n")

    print("The 5 most negatively correlated companies with", ticker, "are: \n", mostneg_corr)


# c. Creating the widget for the plot
widgets.interact(visualize_data, 
                 dataframe_corr = widgets.fixed(data), 
                 ticker = widgets.Dropdown(
                          options=data_df_corr.columns, value='MMM', 
                          description='Company 1:', disabled=False,)
                 

                )
     
#least_corr = dataframe_corr[dataframe_corr[ticker] == no_corr][ticker]
#print(least_corr) 
    

interactive(children=(Dropdown(description='Company 1:', options=('MMM', 'ABT', 'ABBV', 'ABMD', 'ACN', 'ATVI',…

<function __main__.visualize_data(dataframe_corr, ticker)>
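
The ranking can be tried on the five-by-five corner of the correlation matrix printed above. This is a sketch only: `rank_correlations` is a helper introduced here, and because it sees just five tickers its answers differ from what the widget reports on the full matrix.

```python
import pandas as pd

tickers = ["MMM", "ABT", "ABBV", "ABMD", "ACN"]

# Top-left corner of the correlation matrix shown above.
corr_subset = pd.DataFrame(
    [[1.0,      0.434111, 0.336578, 0.264108, 0.498707],
     [0.434111, 1.0,      0.45736,  0.450732, 0.48402],
     [0.336578, 0.45736,  1.0,      0.306029, 0.334571],
     [0.264108, 0.450732, 0.306029, 1.0,      0.407676],
     [0.498707, 0.48402,  0.334571, 0.407676, 1.0]],
    index=tickers, columns=tickers)


def rank_correlations(corr_matrix, ticker, n=2):
    """Rank the other tickers by their correlation with `ticker`."""
    others = corr_matrix[ticker].drop(ticker)  # leave out the ticker itself
    return {
        "highest": others.nlargest(n).index.tolist(),
        "lowest": others.nsmallest(n).index.tolist(),
        "closest_to_zero": others.abs().idxmin(),
    }


result = rank_correlations(corr_subset, "MMM")
print(result)
```

For MMM this gives ACN and ABT as the highest, ABMD and ABBV as the lowest, and ABMD as the one closest to zero. All values in this subset are positive, so "lowest" does not mean negatively correlated here.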

We then want to graph a single stock and compare it to the growth of the index as a whole. We start by making a copy of the data we have, using only *"ATVI"* and the *"S&P500"* data. 

On the graph above, we see the growth in the Activision Blizzard stock compared to the S&P500 index. We see that the Activision Blizzard stock has experienced strong growth since its start in 2000. The spike around 2008 is due to the merger between Activision Entertainment and Blizzard Entertainment. The rise since ca. 2013 may be due to the rise in popularity and availability of video games. The drastic fall at the end of the series comes after BlizzCon 2018, where a very unpopular announcement was made, resulting in t