# Data Analysis - Financial Time Series

**Author**: [Gabriele Pompa](https://www.linkedin.com/in/gabrielepompa/): gabriele.pompa@unisi.com

# Table of contents

[Executive Summary](#summary)

**TODO**

### **Resources**: 

**TODO**

# Executive Summary <a name="summary"></a>

**TODO**

These are the basic imports we need to work with NumPy and Pandas and to plot data with Matplotlib.

In [1]:
# for NumPy arrays
import numpy as np

# for Pandas Series and DataFrame
import pandas as pd

# for Matplotlib plotting
import matplotlib.pyplot as plt

# to do inline plots in the Notebook
%matplotlib inline

# for Operating System operations
import os

# 1. Introduction <a name="introduction"></a>

When you need to process data with a programming language, either your data is stored in a file or database (as we have seen in the previous lesson) or you get it from a data provider, such as Bloomberg, Reuters, etc.

Typically, data providers store their data securely on remote servers and expose interfaces to the public: these are called [Application Programming Interfaces](https://en.wikipedia.org/wiki/Application_programming_interface) (APIs). For our purposes, an API is a piece of code that allows you to get data from a data provider.

There are plenty of APIs that manage the interface between Python code and financial data, like [Reuters Eikon Data API](https://developers.refinitiv.com/eikon-apis/eikon-data-api). Since providing data is a business in itself, most APIs for retrieving financial data are not free.

Luckily for us, and thanks to people like [Ran Aroussi](https://aroussi.com/), we have a Python API, called [yfinance](https://github.com/ranaroussi/yfinance), which is 100% free:

- yfinance Github page: [https://github.com/ranaroussi/yfinance](https://github.com/ranaroussi/yfinance);
- Blog post from Ran Aroussi with a yfinance tutorial: [https://aroussi.com/post/python-yahoo-finance](https://aroussi.com/post/python-yahoo-finance).

In a nutshell, yfinance (named after the now decommissioned _Yahoo! Finance_ API) is a reliable Python API to retrieve market data.

The yfinance API comes in the form of a Python module: `yfinance`. We shall now see how to include it in our Conda installation.

## 1.1. Installing `yfinance` <a name="installing-yfinance"></a>

All you need to do to install the yfinance library is:

- (If not done already) In your Anaconda Navigator, switch to the class `ITForBusAndFin2020_env` conda environment (see [Figure 1](#anaconda_nav_and_env)). Mac users working under the `base (root)` environment can stay there.

| ![](../images/anaconda_nav_and_env.PNG) <a name="anaconda_nav_and_env"></a>| 
|:--:| 
| _**Figure 1**: in Anaconda Navigator, switch to the class conda environment_ |

- (If not done already) Open a terminal window (the usual black command line window) using the _CMD.exe Prompt_ app or the _console_shortcut_ app in Anaconda Navigator. Both apps are fine: which one you see depends on whether you have already updated Anaconda Navigator (see [Figure 2](#open_terminal)).

| ![](../images/CMD_exe.PNG) | 
|:--:| 
| ![](../images/console_shortcut.PNG) | 
| _**Figure 2**: Open a Terminal window using the CMD.exe Prompt app or console_shortcut app in Anaconda Navigator_ <a name="open_terminal"></a>|

- (If not done already) In the terminal window, change directory to your local class folder by typing `cd` followed by the complete path to the class folder, like `C:\Users\gabri\Projects\IT_For_Business_And_Finance_2019_20` on my local machine (see [Figure 3](#yfinance_png) below).


- In the terminal window type the command to install `yfinance` (see [Figure 3](#yfinance_png)):
  ```
  pip install yfinance --upgrade --no-cache-dir
  ```
  
| ![](../images/yfinance.png) <a name="yfinance_png"></a>| 
|:--:| 
| _**Figure 3**: change directory to the class folder and install `yfinance` package in conda_ |



- Always type `y` when asked for installation confirmation;


- You can check that `yfinance` is now among the packages available in your conda environment by typing
  ```
  conda list 
  ```
  which lists all the installed packages (see the output of the `conda list` command in [Figure 4](#conda_list_yfinance)).

| ![](../images/conda_list_yfinance.png) <a name="conda_list_yfinance"></a>| 
|:--:| 
| _**Figure 4**: check that `yfinance` is installed, typing `conda list`_ |
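
The same check can also be done from Python. The snippet below is a minimal sketch, not part of the original installation steps: the helper name `is_installed` is hypothetical, and it relies only on the standard library function `importlib.util.find_spec`, which returns `None` when a top-level module cannot be found.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the top-level module `module_name` can be found
    in the current Python environment (hypothetical helper)."""
    return importlib.util.find_spec(module_name) is not None


# True if yfinance is available in the active conda environment, False otherwise
print(is_installed("yfinance"))
```

If this prints `False`, the most likely cause is that the notebook is running in a different conda environment from the one where `pip install` was executed.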


## 1.2. `yfinance` basic usage <a name="yfinance-basic-usage"></a>

To use the yfinance library, we just import the corresponding `yfinance` Python module, giving it the `yf` alias.

In [2]:
import yfinance as yf

For details on `yfinance` usage, see the [dedicated blog post](https://aroussi.com/post/python-yahoo-finance) from Ran Aroussi. Broadly speaking, `yfinance` allows you to:

- get market and meta data for one security, using the `yf.Ticker()` class (or for several securities, using `yf.Tickers()`);
- do a mass download of market data, using the `yf.download()` function.

Let's reuse the utility function to delete files. Note that it relies on the `dataFolderPath` variable, which is defined further below.

In [3]:
def removeFile(fileName):
    """
    removeFile(fileName) removes the file 'fileName', if it exists, and prints a success/failure message.
    
    Parameters:
        fileName (str): name of the file (it is looked up in the folder given by the global variable 'dataFolderPath')
        
    Returns:
        None
    """

    if os.path.isfile(os.path.join(dataFolderPath, fileName)):
        os.remove(os.path.join(dataFolderPath, fileName))

        # double-check if file still exists
        fileStillExists = os.path.isfile(os.path.join(dataFolderPath, fileName))

        if fileStillExists:
            print("Failure: file {} still exists...".format(fileName))
        else:
            print("Success: file {} successfully removed!".format(fileName))
            
    else:
        print("File {} already removed.".format(fileName))
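
Since `removeFile()` reads the folder from a global variable, it can only be called after `dataFolderPath` has been defined. As an illustration only, here is a sketch of a variant that takes the folder as an argument and returns a boolean instead of printing. The name `remove_file_in` is hypothetical and the example runs on a temporary folder rather than on the class `Data` folder.

```python
import os
import tempfile


def remove_file_in(folder, file_name):
    """Delete folder/file_name if it exists.

    Returns True if a file was found and removed, False otherwise.
    (Hypothetical variant of removeFile, for illustration only.)
    """
    path = os.path.join(folder, file_name)
    if not os.path.isfile(path):
        return False
    os.remove(path)
    # double-check that the file is really gone
    return not os.path.isfile(path)


# usage on a throw-away folder
with tempfile.TemporaryDirectory() as tmp_dir:
    open(os.path.join(tmp_dir, "demo.txt"), "w").close()  # create an empty file
    first = remove_file_in(tmp_dir, "demo.txt")   # the file exists and is removed
    second = remove_file_in(tmp_dir, "demo.txt")  # nothing left to remove

print(first, second)
```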

### 1.2.1. How to look up the Yahoo! Finance ticker of a security <a name="how-to-lookup-for-a-ticker-of-a-security"></a>

If you want to get market data and information for a security, you need to use the `yf.Ticker()` class, which takes as input the appropriate security symbol:

```python
yf.Ticker(SymbolString)
```

where `SymbolString` is the Python string representing the ticker symbol of the desired security (like 'AAPL', 'MSFT', etc.).

Most symbols are well known from financial news (like 'AAPL' for Apple Inc. or 'MSFT' for Microsoft Corporation, etc.), but in case you know the public name of a company or security but don't remember the corresponding symbol, you can use the [Symbol Lookup from Yahoo Finance](https://finance.yahoo.com/lookup/). 

Suppose you want to look for the Fiat Chrysler Automobiles symbol. Start writing the public name of the company in the search bar and you get back the information that the symbol is **'FCAU'**.

| ![](../images/yahoo_symbol_lookup.png) <a name="yahoo_symbol_lookup"></a>| 
|:--:| 
| _**Figure 5**: Symbol Lookup from Yahoo Finance_ |

Here is the list of tickers that we will use in this notebook:

| Yahoo! Finance ticker | Name |
|:---:|:---|
| 'AAPL' | Apple Inc. Stock |
| 'GOOG' | Alphabet Inc. Stock |
| 'FB' | Facebook, Inc. Stock |
| 'MSFT' | Microsoft Corporation Stock |
| 'INTC' | Intel Corporation Stock |
| 'AMZN' | Amazon.com, Inc. Stock |
| 'BABA' | Alibaba Group Holding Limited Stock |
| 'NFLX' | Netflix, Inc. Stock |
| 'DIS' | The Walt Disney Company Stock |
| 'GS' | The Goldman Sachs Group, Inc. Stock |
| 'DB' | Deutsche Bank Aktiengesellschaft Stock |
| '^GSPC' | S&P 500 Index |
| '^VIX' | CBOE Volatility Index |
| 'EURUSD=X' | EUR/USD Exchange Rate |
| 'EURCHF=X' | EUR/CHF Exchange Rate |
| 'EURGBP=X' | EUR/GBP Exchange Rate |
| 'FCAU' | Fiat Chrysler Automobiles N.V. |
| 'E' | Eni S.p.A. Stock |
| 'ENIA' | Enel Americas S.A. Stock |

### 1.2.2. How to get market and meta data for a security: `yf.Ticker()` class <a name="get-market-and-meta-data:-yf.ticker()-module"></a>

In [None]:
# Ticker object for Apple Inc.
aapl = yf.Ticker("AAPL")
aapl

In [None]:
# dictionary of meta data for the security
aapl_info = aapl.info
aapl_info

In [None]:
aapl_info['longBusinessSummary']

In [None]:
aapl_info['regularMarketPreviousClose']

In [17]:
import json

In [18]:
dataFolderPath = "../Data"

In [None]:
filePath = os.path.join(dataFolderPath, "aapl_stock_info.json")

with open(filePath, 'w') as file:
    %time json.dump(aapl_info, file, indent="\t")
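
A file written with `json.dump()` can be read back into a dictionary with `json.load()`. The sketch below shows the round trip on a small made-up dictionary (`sample_info` and its values are illustrative placeholders, not real data from `aapl.info`) and writes to a temporary folder instead of the `Data` folder.

```python
import json
import os
import tempfile

# made-up stand-in for the dictionary returned by aapl.info
sample_info = {
    "symbol": "DEMO",
    "longName": "Demo Corp.",
    "regularMarketPreviousClose": 100.0,
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo_info.json")

    # write the dictionary to file, tab-indented as in the cell above
    with open(path, "w") as file:
        json.dump(sample_info, file, indent="\t")

    # read it back into a new dictionary
    with open(path) as file:
        reloaded = json.load(file)

print(reloaded["longName"])
```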

In [None]:
# removeFile(filePath)

In [None]:
# dividends and stock splits
aapl.actions

In [None]:
ax = aapl.actions.plot(secondary_y="Dividends")

ax.set_ylabel("Number of Stock Splits")
ax.right_ax.set_ylabel("Dividends (USD)")

In [None]:
# full available price history, sampled weekly
aapl_history = aapl.history(period="max", interval="1wk")
aapl_history

In [None]:
ax = aapl_history["Close"].plot()

ax.set_title("AAPL")
ax.set_ylabel("Close Price (USD)")

In [None]:
# daily price history (the default interval) between two dates
aapl_history_last_two_years = aapl.history(start="2018-04-06", end="2020-04-06")
aapl_history_last_two_years

In [None]:
ax = aapl_history_last_two_years["Close"].plot()

ax.set_title("AAPL")
ax.set_ylabel("Close Price (USD)")

In [None]:
ax = aapl_history_last_two_years.loc["2020-01-01":, "Close"].plot()

ax.set_title("AAPL")
ax.set_ylabel("Close Price (USD)")
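
The expression `.loc["2020-01-01":, "Close"]` used in the last plot selects all rows from a given date onwards, together with a single column. It works because the index of the DataFrame is a `DatetimeIndex`, which pandas allows you to slice with date strings. A minimal sketch on synthetic data (the prices are made-up numbers, not market data):

```python
import numpy as np
import pandas as pd

# synthetic daily "prices" on business days around the turn of the year
dates = pd.bdate_range(start="2019-12-27", end="2020-01-07")
prices = pd.DataFrame({"Close": np.arange(len(dates), dtype=float)}, index=dates)

# all rows from 2020-01-01 onwards, 'Close' column only
from_2020 = prices.loc["2020-01-01":, "Close"]

print(from_2020)
```

The slice is a Series, so it can be plotted directly with `.plot()` as in the cells above.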

#### 1.2.2.1. Multiple securities simultaneously: `yf.Tickers()` class <a name="multiple-securities-simultaneously:-yf.tickers()-module"></a>

In [None]:
# one Tickers object for several space-separated symbols
securities = yf.Tickers('FB AMZN NFLX GOOG')
securities

In [None]:
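# attribute access works when `securities.tickers` is a namedtuple;
# in yfinance releases where it is a dict, use securities.tickers['FB'] etc. instead
fb = securities.tickers.FB
amzn = securities.tickers.AMZN
nflx = securities.tickers.NFLX
goog = securities.tickers.GOOG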

In [None]:
goog_info = goog.info
goog_info

In [None]:
goog_info['longName']

In [None]:
info_dict = {'FB': fb.info,
             'AMZN': amzn.info, 
             'NFLX': nflx.info,
             'GOOG': goog_info}

filePath = os.path.join(dataFolderPath, "FANG_stocks_info.json")

with open(filePath, 'w') as file:
    %time json.dump(info_dict, file, indent="\t")

In [None]:
# removeFile(filePath)

In [None]:
nflx_history_last_two_years = nflx.history(start="2018-04-06", end="2020-04-06")
nflx_history_last_two_years

In [None]:
ax = nflx_history_last_two_years["Close"].plot()

ax.set_title("NFLX")
ax.set_ylabel("Close Price (USD)")

In [None]:
ax = nflx_history_last_two_years.loc["2020-01-01":, "Close"].plot()

ax.set_title("NFLX")
ax.set_ylabel("Close Price (USD)")

### 1.2.3. Mass download of market data: `yf.download()` function <a name="get-market-and-meta-data:-yf.download()-function"></a>

In [4]:
securityTickerToNameDict = {
    'AAPL':     "Apple Inc. Stock",
    'GOOG':     "Alphabet Inc. Stock",
    'FB':       "Facebook, Inc. Stock",
    'MSFT':     "Microsoft Corporation Stock",
    'INTC':     "Intel Corporation Stock",
    'AMZN':     "Amazon.com, Inc. Stock",
    'BABA':     "Alibaba Group Holding Limited Stock",
    'NFLX':     "Netflix, Inc. Stock",
    'DIS':      "The Walt Disney Company Stock",
    'GS':       "The Goldman Sachs Group, Inc. Stock",
    'DB':       "Deutsche Bank Aktiengesellschaft Stock",
    '^GSPC':    "S&P 500 Index",
    '^VIX':     "CBOE Volatility Index",
    'EURUSD=X': "EUR/USD Exchange Rate",
    'EURCHF=X': "EUR/CHF Exchange Rate",
    'EURGBP=X': "EUR/GBP Exchange Rate",
    'FCAU':     "Fiat Chrysler Automobiles N.V.",
    'E':        "Eni S.p.A. Stock",
    'ENIA':     "Enel Americas S.A. Stock"
}

In [5]:
tickerList = list(securityTickerToNameDict.keys())
tickerList

['AAPL',
 'GOOG',
 'FB',
 'MSFT',
 'INTC',
 'AMZN',
 'BABA',
 'NFLX',
 'DIS',
 'GS',
 'DB',
 '^GSPC',
 '^VIX',
 'EURUSD=X',
 'EURCHF=X',
 'EURGBP=X',
 'FCAU',
 'E',
 'ENIA']

In [6]:
tickerListString = ' '.join(tickerList)
tickerListString

'AAPL GOOG FB MSFT INTC AMZN BABA NFLX DIS GS DB ^GSPC ^VIX EURUSD=X EURCHF=X EURGBP=X FCAU E ENIA'

In [7]:
# download the full available history of all the tickers in a single call
data = yf.download(tickers=tickerListString, period="max")

[*********************100%***********************]  19 of 19 completed


In [25]:
data

| Date | Adj Close / AAPL | Adj Close / AMZN | Adj Close / BABA | Adj Close / DB | Adj Close / DIS | Adj Close / E | Adj Close / ENIA | Adj Close / EURCHF=X | Adj Close / EURGBP=X | Adj Close / EURUSD=X | ... | Volume / EURUSD=X | Volume / FB | Volume / FCAU | Volume / GOOG | Volume / GS | Volume / INTC | Volume / MSFT | Volume / NFLX | Volume / ^GSPC | Volume / ^VIX |
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
| 1927-12-30 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.000000e+00 | NaN |
| 1928-01-03 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.000000e+00 | NaN |
| 1928-01-04 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.000000e+00 | NaN |
| 1928-01-05 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.000000e+00 | NaN |
| 1928-01-06 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.000000e+00 | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2020-04-02 | 244.929993 | 1918.829956 | 188.899994 | 6.02 | 96.970001 | 21.570000 | 5.79 | 1.05821 | 0.88450 | 1.095362 | ... | 0.0 | 20886300.0 | 2838900.0 | 1964900.0 | 4446900.0 | 27810000.0 | 49630700.0 | 4592500.0 | 6.454990e+09 | 0.0 |
| 2020-04-03 | 241.410004 | 1906.589966 | 187.110001 | 5.90 | 93.879997 | 20.219999 | 6.82 | 1.05675 | 0.87503 | 1.084740 | ... | 0.0 | 25983300.0 | 1910500.0 | 2313400.0 | 2801600.0 | 23906100.0 | 41243300.0 | 4860800.0 | 6.087190e+09 | 0.0 |
| 2020-04-06 | 262.470001 | 1997.589966 | 196.449997 | 6.38 | 99.580002 | 20.969999 | 7.11 | 1.05594 | 0.88427 | 1.080696 | ... | 0.0 | 28453600.0 | 3663900.0 | 2664700.0 | 4689400.0 | 32323400.0 | 67021300.0 | 8183900.0 | 6.391860e+09 | 0.0 |
| 2020-04-07 | 259.429993 | NaN | 198.000000 | 6.64 | 101.239998 | 20.209999 | 7.29 | 1.05629 | 0.88236 | 1.080380 | ... | 0.0 | 31379400.0 | NaN | 2383000.0 | 4950700.0 | 41394700.0 | NaN | 7040000.0 | 7.040720e+09 | 0.0 |


In [26]:
data["Adj Close"].loc["2020-04-08", "AAPL"]

264.6300048828125
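
The DataFrame returned by `yf.download()` for several tickers has two levels of column labels: the price field (`Adj Close`, `Close`, `Volume`, ...) and the ticker. Selecting the first level with `data["Adj Close"]` returns a regular DataFrame with one column per ticker, which is why the `.loc[date, ticker]` lookup above works. The sketch below reproduces this structure on a tiny made-up DataFrame; the tickers `AAA` and `BBB` and all the numbers are placeholders.

```python
import numpy as np
import pandas as pd

dates = pd.to_datetime(["2020-04-06", "2020-04-07", "2020-04-08"])

# two-level column labels: (price field, ticker)
columns = pd.MultiIndex.from_product([["Adj Close", "Close"], ["AAA", "BBB"]])
toy = pd.DataFrame(np.arange(12, dtype=float).reshape(3, 4), index=dates, columns=columns)

# selecting the first level returns a DataFrame with one column per ticker
adj_close = toy["Adj Close"]

# single value: row label (date) and column label (ticker)
value = adj_close.loc["2020-04-08", "BBB"]

# the same value, addressed directly with a (field, ticker) tuple
same_value = toy.loc[pd.Timestamp("2020-04-08"), ("Adj Close", "BBB")]

print(value, same_value)
```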

In [15]:
dataClose = data["Close"]



In [12]:
dataClose.head()

| Date | AAPL | AMZN | BABA | DB | DIS | E | ENIA | EURCHF=X | EURGBP=X | EURUSD=X | FB | FCAU | GOOG | GS | INTC | MSFT | NFLX | ^GSPC | ^VIX |
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
| 1927-12-30 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 17.66 | NaN |
| 1928-01-03 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 17.76 | NaN |
| 1928-01-04 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 17.719999 | NaN |
| 1928-01-05 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 17.549999 | NaN |
| 1928-01-06 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 17.66 | NaN |


In [13]:
dataClose.tail()

| Date | AAPL | AMZN | BABA | DB | DIS | E | ENIA | EURCHF=X | EURGBP=X | EURUSD=X | FB | FCAU | GOOG | GS | INTC | MSFT | NFLX | ^GSPC | ^VIX |
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
| 2020-04-02 | 244.929993 | 1918.829956 | 188.899994 | 6.02 | 96.970001 | 21.57 | 5.79 | 1.05821 | 0.8845 | 1.095362 | 158.190002 | 6.81 | 1120.839966 | 149.929993 | 54.349998 | 155.259995 | 370.079987 | 2526.899902 | 50.91 |
| 2020-04-03 | 241.410004 | 1906.589966 | 187.110001 | 5.9 | 93.879997 | 20.219999 | 6.82 | 1.05675 | 0.87503 | 1.08474 | 154.179993 | 6.71 | 1097.880005 | 146.929993 | 54.130001 | 153.830002 | 361.76001 | 2488.649902 | 46.799999 |
| 2020-04-06 | 262.470001 | 1997.589966 | 196.449997 | 6.38 | 99.580002 | 20.969999 | 7.11 | 1.05594 | 0.88427 | 1.080696 | 165.550003 | 7.25 | 1186.920044 | 158.229996 | 58.43 | 165.270004 | 379.959991 | 2663.679932 | 45.240002 |
| 2020-04-07 | 259.429993 | NaN | 198.0 | 6.64 | 101.239998 | 20.209999 | 7.29 | 1.05629 | 0.88236 | 1.08038 | 168.830002 | NaN | 1186.51001 | 166.020004 | 58.400002 | NaN | 372.279999 | 2659.409912 | 46.700001 |
| 2020-04-08 | 264.630005 | 2027.190918 | 194.845001 | 6.55 | 101.470001 | 20.129999 | 7.68 | 1.05497 | 0.8752 | 1.08613 | 171.460007 | 7.775 | 1205.102661 | 173.75 | 58.799999 | 164.979095 | 371.399994 | 2720.389893 | 42.860001 |


In [19]:
filePath = os.path.join(dataFolderPath, "closing_price_dataset.csv")

%time dataClose.to_csv(path_or_buf = filePath)

Wall time: 620 ms


In [24]:
# %time dataClose_reloaded = pd.read_csv(filepath_or_buffer = filePath, index_col = 0, parse_dates = True)
# dataClose_reloaded.tail()
# dataClose.index

In [None]:
# removeFile(filePath)
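
When a DataFrame of prices is saved with `to_csv()`, the dates are written as plain text. To get a `DatetimeIndex` back when reloading, `read_csv()` can be told which column is the index (`index_col=0`) and that it contains dates (`parse_dates=True`), as in the commented-out cell above. A minimal sketch of the round trip, using a small made-up DataFrame and a temporary folder instead of the `Data` folder:

```python
import os
import tempfile

import numpy as np
import pandas as pd

# made-up closing prices, with one missing value
dates = pd.to_datetime(["2020-04-06", "2020-04-07", "2020-04-08"])
toy_close = pd.DataFrame({"AAA": [1.0, np.nan, 3.0], "BBB": [4.0, 5.0, 6.0]}, index=dates)
toy_close.index.name = "Date"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy_close.csv")

    # save to csv: missing values are written as empty fields
    toy_close.to_csv(path_or_buf=path)

    # reload: the first column becomes the index and is parsed as dates
    reloaded = pd.read_csv(filepath_or_buffer=path, index_col=0, parse_dates=True)

print(reloaded)
```

Without `parse_dates=True` the index would be reloaded as plain strings, and date-based slicing such as `.loc["2020-01-01":]` would no longer behave as expected.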

## 1.3. Data Exploration <a name="data-exploration"></a>

### 1.3.1. Summary Statistics <a name="summary-statistics"></a>

### 1.3.2. Returns <a name="returns"></a>

Make a stacked plot of the returns
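
As a possible starting point for this section, the sketch below computes simple and logarithmic returns from a series of closing prices. It is only an illustration on made-up numbers (the ticker `AAA` is a placeholder), not the analysis planned for the notebook. Simple returns are computed with the pandas method `pct_change()`, log returns as the logarithm of the ratio between consecutive prices.

```python
import numpy as np
import pandas as pd

# made-up closing prices
dates = pd.to_datetime(["2020-04-06", "2020-04-07", "2020-04-08"])
toy_close = pd.DataFrame({"AAA": [100.0, 110.0, 99.0]}, index=dates)

# simple returns: P_t / P_{t-1} - 1 (the first row has no previous price, hence NaN)
simple_returns = toy_close.pct_change()

# log returns: log(P_t / P_{t-1})
log_returns = np.log(toy_close / toy_close.shift(1))

print(simple_returns)
print(log_returns)
```

With one column per security, `simple_returns.plot(subplots=True)` would be one option to draw the returns in vertically stacked panels.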

### 1.3.3. Resampling <a name="resampling"></a>

### 1.3.4. Rolling Statistics <a name="rolling-statistics"></a>

- mean, std and other univariate statistics

#### 1.3.4.1. Rolling Correlation Matrix <a name="rolling-correlation-matrix"></a>
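
As a hedged illustration of what rolling mean and standard deviation look like in pandas, the sketch below applies a 3-observation window to a made-up price series. The window length and the numbers are arbitrary choices for the example.

```python
import pandas as pd

# made-up closing prices on five consecutive business days
dates = pd.bdate_range(start="2020-04-01", periods=5)
toy_close = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], index=dates, name="AAA")

# statistics over a moving window of 3 observations;
# the first two values are NaN because the window is not yet full
rolling_mean = toy_close.rolling(window=3).mean()
rolling_std = toy_close.rolling(window=3).std()

print(rolling_mean)
print(rolling_std)
```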

### 1.3.5. _Focus on:_ Covid-19 crisis <a name="focus-on:-covid-19-crisis"></a>

## 1.4. S&P500 - VIX correlation analysis <a name="s&p500-vix-correlation-analysis"></a>