# Notebook Instructions

1. All the <u>code and data files</u> used in this course are available in the downloadable unit of the <u>last section of this course</u>.
2. You can run the notebook document sequentially (one cell at a time) by pressing **Shift + Enter**. 
3. While a cell is running, a [*] is shown on the left. After the cell is run, the output will appear on the next line.

This course is based on specific versions of Python packages. You can find the details of the packages in <a href='https://quantra.quantinsti.com/quantra-notebook' target="_blank" >this manual</a>.

## Import Time Series Data

There are several ways to fetch financial time series data. You can either download the data using a Python package or read data stored in a CSV file.

In this notebook, you will learn the following ways:

1. [Import Data from Yahoo Finance](#y-finance)<br>
2. [Read Data from a CSV File](#read-data)


<a id='y-finance'></a> 
## Import Data from Yahoo Finance

To fetch data from Yahoo Finance, you first need to install the `yfinance` package. You can do this with `pip`, a tool for installing and managing Python packages.

```python
!pip install yfinance
```
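Because this course depends on specific package versions, it can help to confirm which version is actually installed. The following is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical and not part of the course material.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Print the versions of the packages used in this notebook
for name in ("yfinance", "pandas"):
    print(name, installed_version(name))
```

A result of `None` means the package is not installed in the current environment.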

You can fetch data with the `yfinance` package using its `download` function.

Syntax:
```python
import yfinance as yf
yf.download(ticker, start, end)
```
1. **ticker:** Ticker of the stock
2. **start:** Start date
3. **end:** End date

In [1]:
# Import yfinance
import yfinance as yf

# Fetch the Coca Cola price data
data = yf.download('KO', start="2019-01-01", end="2021-03-31")

# Display the data
data.head()

[*********************100%***********************]  1 of 1 completed


|Date|Open|High|Low|Close|Adj Close|Volume|
|---|---|---|---|---|---|---|
|2018-12-31|47.490002|47.540001|46.959999|47.349998|44.010803|10576300|
|2019-01-02|46.939999|47.220001|46.560001|46.93|43.620419|11603700|
|2019-01-03|46.82|47.369999|46.529999|46.639999|43.350868|14714400|
|2019-01-04|46.75|47.57|46.639999|47.57|44.215286|13013700|
|2019-01-07|47.57|47.75|46.900002|46.950001|43.639008|13135500|


From the above output, you can see that the imported data has both the close and the adjusted close price. The other columns are the open price, high price, low price and volume.

You can also fetch adjusted price data with the `yfinance` package by passing one more parameter, `auto_adjust`, to `download`.

Syntax:
```python
yf.download(ticker, start, end, auto_adjust=True)
```
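**auto_adjust:** Set this to `True` to download adjusted prices. In the `yfinance` version this course is based on, the default is `False`. The default may differ in other releases, so it is safer to pass the value explicitly.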

In [2]:
# Fetch the adjusted price data
data = yf.download('KO', start="2019-01-01", end="2021-03-31", auto_adjust=True)

# Display the data
data.head()

[*********************100%***********************]  1 of 1 completed


|Date|Open|High|Low|Close|Volume|
|---|---|---|---|---|---|
|2018-12-31|44.140933|44.187406|43.648307|44.010803|10576300|
|2019-01-02|43.629712|43.889968|43.276513|43.620419|11603700|
|2019-01-03|43.518175|44.029387|43.248625|43.350868|14714400|
|2019-01-04|43.453114|44.215286|43.350871|44.215286|13013700|
|2019-01-07|44.215283|44.38259|43.592534|43.639008|13135500|


Now you have the adjusted price data.
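The two outputs above are consistent with every price in a row being scaled by the ratio `Adj Close / Close` for that row. The sketch below checks this on two rows copied from the first output. It only illustrates the relationship visible in the numbers; that `auto_adjust=True` computes its values in exactly this way is an assumption.

```python
import pandas as pd

# Two rows copied from the unadjusted output above
raw = pd.DataFrame(
    {
        "Open": [47.490002, 46.939999],
        "High": [47.540001, 47.220001],
        "Low": [46.959999, 46.560001],
        "Close": [47.349998, 46.93],
        "Adj Close": [44.010803, 43.620419],
    },
    index=pd.to_datetime(["2018-12-31", "2019-01-02"]),
)

# Per-row adjustment factor implied by the two close columns
factor = raw["Adj Close"] / raw["Close"]

# Scale the open, high, low and close prices by that factor
adjusted = raw[["Open", "High", "Low", "Close"]].mul(factor, axis=0)

print(adjusted.round(6))
```

The scaled values can be compared with the first two rows of the adjusted output; they agree up to rounding.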

<a id='read-data'></a> 
## Read Data from a CSV File

A CSV (comma-separated values) file stores tabular data as plain text, with the values separated by commas. You can use `pandas.read_csv()` to read a CSV file. The stock price of Coca Cola from January 2019 to March 2021 is stored in a CSV file, `coca_cola_price.csv`. It has the following columns:

|Column No.|Column Name|
|---|---|
|1.|Date|
|2.|Open|
|3.|High|
|4.|Low|
|5.|Close|
|6.|Adj Close|
|7.|Volume|

Syntax:
```python
import pandas as pd
pd.read_csv(filename,index_col)
```
1. **filename**: Path and name of the file, given as a string
2. **index_col**: Position of the column to set as the index (counting starts at 0)

In [3]:
# Import pandas as alias 'pd'
import pandas as pd

# The data is stored in the directory 'data_modules'
path = '../data_modules/'

# Load the data from the CSV file and set 'Date' as the index
coca_cola = pd.read_csv(path + 'coca_cola_price.csv', index_col=0)

# Display the data
coca_cola.head()

|Date|Open|High|Low|Close|Adj Close|Volume|
|---|---|---|---|---|---|---|
|2019-01-02|46.939999|47.220001|46.560001|46.93|43.620419|11603700|
|2019-01-03|46.82|47.369999|46.529999|46.639999|43.350868|14714400|
|2019-01-04|46.75|47.57|46.639999|47.57|44.215286|13013700|
|2019-01-07|47.57|47.75|46.900002|46.950001|43.639008|13135500|
|2019-01-08|47.25|47.57|47.040001|47.48|44.13163|15420700|
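As a small illustration of the `index_col` parameter, the sketch below reads two rows copied from the output above out of an in-memory string instead of the course file, so it runs without `coca_cola_price.csv`. It shows that `index_col` also accepts a column name, and that the index is not yet a datetime index at this point. The variable names are chosen only for this example.

```python
import io

import pandas as pd

# Two rows copied from the output above, stored as in-memory CSV text
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
2019-01-02,46.939999,47.220001,46.560001,46.93,43.620419,11603700
2019-01-03,46.82,47.369999,46.529999,46.639999,43.350868,14714400
"""

# Select the index column by position
by_position = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Select the index column by name
by_name = pd.read_csv(io.StringIO(csv_text), index_col="Date")

# Check that both calls produce the same DataFrame
print(by_position.equals(by_name))

# Check whether the index has been parsed as dates
print(isinstance(by_position.index, pd.DatetimeIndex))
```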


This is time series data with the `Date` column as the index, but the index values are still strings. Converting the index to datetime format makes the data much easier to work with, because a datetime index supports many date-based operations that a string index does not. You can use the `pandas.to_datetime` function to convert the index to datetime.

Syntax:
```python
pandas.to_datetime(DataFrame.index)
```

In [4]:
# Convert the index to datetime format
coca_cola.index = pd.to_datetime(coca_cola.index)

# Display the data
coca_cola.head()

|Date|Open|High|Low|Close|Adj Close|Volume|
|---|---|---|---|---|---|---|
|2019-01-02|46.939999|47.220001|46.560001|46.93|43.620419|11603700|
|2019-01-03|46.82|47.369999|46.529999|46.639999|43.350868|14714400|
|2019-01-04|46.75|47.57|46.639999|47.57|44.215286|13013700|
|2019-01-07|47.57|47.75|46.900002|46.950001|43.639008|13135500|
|2019-01-08|47.25|47.57|47.040001|47.48|44.13163|15420700|
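Reading and converting can also be done in one step by passing `parse_dates=True` to `read_csv`, which parses the index column as dates. The sketch below does this on the five rows shown above, held in an in-memory string rather than the course file, and then uses two operations that rely on a datetime index. The variable names are chosen only for this example.

```python
import io

import pandas as pd

# The five rows shown above, stored as in-memory CSV text
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
2019-01-02,46.939999,47.220001,46.560001,46.93,43.620419,11603700
2019-01-03,46.82,47.369999,46.529999,46.639999,43.350868,14714400
2019-01-04,46.75,47.57,46.639999,47.57,44.215286,13013700
2019-01-07,47.57,47.75,46.900002,46.950001,43.639008,13135500
2019-01-08,47.25,47.57,47.040001,47.48,44.13163,15420700
"""

# parse_dates=True asks read_csv to parse the index column as dates
prices = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)

# Name of the weekday for each date in the index
weekdays = prices.index.day_name()

# Last close price of each calendar week
weekly_close = prices["Close"].resample("W").last()

print(list(weekdays))
print(weekly_close)
```

Neither `day_name` nor `resample` is available on a plain string index, which is the practical reason for the conversion.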


You can also explore other ways to get financial data. For details, read the blog on [stock market data and analysis](https://blog.quantinsti.com/stock-market-data-analysis-python/).