Before you turn this problem in, make sure everything runs as expected. First, **restart the kernel** (in the menu bar, select Kernel$\rightarrow$Restart) and then **run all cells** (in the menu bar, select Cell$\rightarrow$Run All).

Below, please fill in your name and collaborators:

In [None]:
NAME = ""
COLLABORATORS = ""

# Assignment 3 - Time Series Analysis
**(15 points total)**

## Assignment tasks:

In this assignment you will conduct a time series analysis of financial data.

1. Set up your environment to access and download the latest stock data. Please see the instructions below for the different tools you can use to get the data. You can use either of the options provided, Quandl or Yahoo Finance. If you know of another service for downloading the data, you may use it instead; explain your choice in the comments.

2. *(2 points)* Download the **adjusted** close prices for FB, MMM, IBM and AMZN for the last 60 months. If you run into any issues downloading the data from online sources, you can use the `.csv` files provided. This will not affect your grade for the assignment.

3. *(3 points)* Resample the data to get prices for the end of the **business** month. Select the **Adjusted Close** for each stock.

4. *(3 points)* Use the pandas `autocorrelation_plot()` function to plot the autocorrelation of the adjusted month-end close prices for each of the stocks.
    - Are they autocorrelated?
    - Provide a short explanation.

5. *(4 points)* 
    - Calculate the monthly returns for each stock using the "shift trick" explained in the lecture, using the `shift()` function.
    - Use the pandas `autocorrelation_plot()` function to plot the autocorrelation of the monthly returns.
    - Are the returns autocorrelated? Provide a short explanation.

6. *(3 points)*
    - Combine all 4 time series (returns) into a single DataFrame.
    - Visualize the correlation between the returns of all pairs of stocks using a scatter plot matrix (use `scatter_matrix()` function from `pandas.plotting`).
    - Explain the results. Is there any correlation?

**NOTES:** 
1. In this assignment, please make sure the DataFrame(s) do not contain any NAs before you plot the autocorrelations or the scatter matrix.
2. Both options explained below use the `pandas-datareader` package for remote data access. To install it, type the following in a command window: `conda install pandas-datareader`. You will also need to install one or both of the following packages: `fix_yahoo_finance` (since renamed `yfinance`) or `quandl`. See below.
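
For the `.csv` fallback mentioned in task 2, a minimal sketch of loading one file and checking it for NAs might look like the following. The column names `Date` and `Adj Close` and the in-memory sample data are assumptions for illustration; the provided files may be laid out differently.

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for one of the provided files.
# The real files may use different column names.
csv_text = """Date,Adj Close
2022-01-03,100.0
2022-01-04,101.5
2022-01-05,
2022-01-06,102.0
"""


def load_prices(source) -> pd.Series:
    """Read a price CSV, index it by date, and drop rows without a price."""
    df = pd.read_csv(source, parse_dates=["Date"], index_col="Date")
    return df["Adj Close"].dropna().sort_index()


prices = load_prices(io.StringIO(csv_text))
print(prices)
```

With a real file, a path string can be passed to `load_prices` in place of the `StringIO` object.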

---------

## Downloading Stock Prices

### Option 1 - Using QUANDL

To use the Quandl service, you need to create an account and get an API key. Here is a short description of the steps:

- Go to https://www.quandl.com/
- Click either `sign up` at the top right corner of the home page, or scroll all the way down and click the `Create Free Account` button at the bottom of the page.
- Create an account.
- You will receive an email at the address you used during registration. Confirm your email.

You are all set.

Now log in to your account, click the avatar icon at the top right corner of the page, and select `"Account Settings."`
On the next page, you will see the `Your API Key` field with a long string of numbers and characters underneath. You need this API key to call Quandl from the notebook. In the code below, replace `YOUR_API_KEY` with the actual API key from your account.

**NOTE**: You can remove this key before submitting the assignment.
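
One way to keep the key out of the notebook entirely is to read it from an environment variable. The sketch below assumes the variable is called `QUANDL_API_KEY`, the same name used in the `%env` line in the next cell; the helper `get_quandl_api_key` is a hypothetical name, not part of any library.

```python
import os


def get_quandl_api_key(env_var: str = "QUANDL_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    Raises RuntimeError if the variable is missing or empty, so a
    forgotten key fails early instead of during the download.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


# Example usage (commented out so the cell runs without a key):
# api_key = get_quandl_api_key()
```

The returned string can then be passed wherever the notebook currently expects the key.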

In [7]:
# all imports and env variables
import pandas as pd
pd.core.common.is_list_like = pd.api.types.is_list_like
import datetime
import pandas_datareader.data as web

# This line of code should work on Windows and Mac
#%env QUANDL_API_KEY = "YOUR_API_KEY"

# If the line above does not work on your system,
# set the key through the quandl package instead
import quandl
quandl.ApiConfig.api_key = "YOUR_API_KEY"

In [8]:
## TYPE YOUR CODE BELOW

from datetime import date
from dateutil.relativedelta import relativedelta
from typing import List

months_ago: int = 60

start = date.today() + relativedelta(months=-months_ago)
end = date.today()

all_stocks_list = ['FB', 'MMM', 'IBM', 'AMZN']
df_list: List = []

print(f"Start date is {start}")
print(f"End date is {end}")

for name in all_stocks_list:
    # Reuse the key configured in the setup cell instead of repeating it here
    df = web.DataReader(name, 'quandl', start, end, api_key=quandl.ApiConfig.api_key)
    df.name = name
    df_list.append(df)

Start date is 2017-07-24
End date is 2022-07-24


In [9]:
fb_df = df_list[0]

# Resampling data for AMAZON, FB, IBM, MMM and selecting Adjusted Close for each

In [None]:
adjusted_close_list: List = []

for df in df_list:
    name = df.name
    df = df.copy()  # work on a copy so the frames in df_list stay unchanged
    if 'Date' not in df.columns:
        df['Date'] = df.index
    df = df.reset_index(drop=True)
    df['Date'] = pd.to_datetime(df['Date'])  # make sure the Date column holds datetimes
    # Keep the row for the last trading day of each calendar month
    df = df.loc[df.groupby(df['Date'].dt.to_period('M'))['Date'].idxmax()]
    df = df.set_index('Date')  # index the month-end rows by date

    try:
        adjusted_close_series = df['AdjClose']   # column name in the Quandl data
    except KeyError:
        adjusted_close_series = df['Adj Close']  # column name in the Yahoo Finance data

    adjusted_close_series.name = name
    adjusted_close_list.append(adjusted_close_series)
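
An alternative to the `groupby` approach above is pandas' built-in resampling with a business-month-end offset. The following is a minimal sketch on synthetic data (the series `prices` and its values are made up for illustration), using `pd.offsets.BMonthEnd()` as the resampling rule.

```python
import numpy as np
import pandas as pd

# Synthetic daily "prices" on business days; values are just a counter.
days = pd.bdate_range("2022-01-03", "2022-04-29")
prices = pd.Series(np.arange(len(days), dtype=float), index=days, name="TOY")

# Take the last observation in each business month.
# April 30, 2022 is a Saturday, so the April label is Friday, April 29.
month_end = prices.resample(pd.offsets.BMonthEnd()).last()
print(month_end)
```

With real data, holidays can make the last trading day differ from the business-month-end label, which is one reason the two approaches may produce slightly different index dates.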

    

In [6]:
adjusted_close_list[0]

NameError: name 'adjusted_close_list' is not defined

In [None]:
adjusted_close_list[1]


In [None]:
adjusted_close_list[2]

In [None]:
adjusted_close_list[3]

# Autocorrelation of the adjusted month-end close price for each stock

In [None]:
import matplotlib.pyplot as plt

for closing in adjusted_close_list:
    # autocorrelation_plot returns the matplotlib Axes it drew on
    ax = pd.plotting.autocorrelation_plot(closing)
    ax.set_ylabel(f'Autocorrelation of {closing.name}')
    plt.show()
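
To back up the visual reading of these plots with numbers, one option is to compare the autocorrelation at each lag against the approximate 95% band of ±1.96/√n. The sketch below uses `Series.autocorr`, whose estimator differs slightly from the one `autocorrelation_plot` draws, so treat it as an approximation; `significant_lags` is a hypothetical helper and the trending series is synthetic.

```python
import numpy as np
import pandas as pd


def significant_lags(series: pd.Series, max_lag: int = 12) -> list:
    """Return the lags whose autocorrelation lies outside +/- 1.96 / sqrt(n)."""
    band = 1.96 / np.sqrt(len(series))
    return [
        lag
        for lag in range(1, max_lag + 1)
        if abs(series.autocorr(lag)) > band
    ]


# A steadily rising series, loosely imitating 60 month-end prices.
trend = pd.Series(np.linspace(100.0, 200.0, 60))
print(significant_lags(trend))
```

A trending series like this one is flagged at every lag, which is the typical pattern for price levels.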

# Using the shift trick to compute the monthly returns, then plotting the autocorrelation for each stock

In [None]:
stock_returns_list: List = []

# Monthly returns are computed from the month-end series, not from the daily data
for closing in adjusted_close_list:
    closing = closing.sort_index()  # shift(1) must refer to the previous month
    # The adjusted close already accounts for dividends and splits,
    # so no dividend term is added here
    stock_return = closing / closing.shift(1) - 1
    stock_return = stock_return.dropna()  # the first month has no previous price
    stock_return.name = closing.name
    stock_returns_list.append(stock_return)

    ax = pd.plotting.autocorrelation_plot(stock_return)
    ax.set_ylabel(f'Returns autocorrelation for {closing.name}')
    plt.show()
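
As a sanity check on the shift trick, the sketch below applies it to three made-up prices and compares the result with pandas' built-in `pct_change()`. The toy series is an assumption for illustration only.

```python
import numpy as np
import pandas as pd

# Three hypothetical month-end prices.
toy_prices = pd.Series(
    [100.0, 110.0, 99.0],
    index=pd.to_datetime(["2022-01-31", "2022-02-28", "2022-03-31"]),
)

# Shift trick: divide each price by the previous one and subtract 1.
shift_returns = (toy_prices / toy_prices.shift(1) - 1).dropna()

# Built-in equivalent for comparison.
builtin_returns = toy_prices.pct_change().dropna()

print(shift_returns)
print(np.allclose(shift_returns, builtin_returns))
```

The two returns are roughly +10% and -10%, and the first month is dropped because it has no previous price.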

# Combining returns into 1 dataframe and visualization using scatter plot

In [None]:
all_returns = pd.concat(stock_returns_list, axis=1).dropna()  # drop dates missing for any stock
scatter_matrix = pd.plotting.scatter_matrix(all_returns, figsize=(15, 15))

# for ax in scatter_matrix.ravel():
#     ax.set_xlabel(ax.get_xlabel(), fontsize = 12, rotation = 90);
#     ax.set_ylabel(ax.get_ylabel(), fontsize = 12, rotation = 0);
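
The scatter matrix shows correlation visually; `DataFrame.corr()` gives the matching Pearson coefficients as numbers. The sketch below uses a small made-up returns table (the columns `A`, `B`, `C` are hypothetical) rather than the downloaded data.

```python
import pandas as pd

# Hypothetical monthly returns for three made-up tickers.
toy_returns = pd.DataFrame(
    {
        "A": [0.01, 0.02, -0.01, 0.03],
        "B": [0.02, 0.04, -0.02, 0.06],   # exactly 2 * A, so perfectly correlated
        "C": [0.03, -0.01, 0.02, -0.02],  # tends to move opposite to A
    }
)

# Pairwise Pearson correlation between the columns.
corr = toy_returns.corr()
print(corr.round(2))
```

Applied to `all_returns`, values near 0 would suggest little linear relationship between two stocks, while values near +1 or -1 would suggest a strong one.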