# Example 1: Yahoo Data Loader

This example demonstrates how to use `YahooDataLoader` to query daily price data from Yahoo Finance.

In [1]:
from grandma_stock_valuation import YahooDataLoader

The `YahooDataLoader` class is initialized with the following parameters:
* `ticker` (str): the ticker to be queried. I will use "IVV" (iShares Core S&P 500 ETF) as an example.
* `date_start` (date): the start date of the query. If None, there must be an existing data file from which the loader can derive the start date. I will demonstrate this in a later section.
* `date_end_ex` (date): the end date (exclusive) of the query. Exclusive means that, for example, to query through the end of 2021, you should use "2022-01-01".<br>If None, the loader will use tomorrow's date, so the query runs through the end of today.
* `verbose` (int): 2 to print detailed information; 1 to print high-level information; 0 to suppress printing.
* `printfunc` (function): the function used to print messages. Defaults to `print`.
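The exclusive end date can be computed instead of typed by hand. The following is a minimal sketch using only the standard library. The helpers `end_ex_for_year` and `default_end_ex` are hypothetical names for illustration and are not part of `grandma_stock_valuation`; the second one only mirrors the "tomorrow" default described above and is an assumption about how such a default could be computed.

```python
from datetime import date, timedelta

def end_ex_for_year(year):
    """Exclusive end date that covers everything through December 31 of `year`."""
    return date(year + 1, 1, 1)

def default_end_ex(today=None):
    """Exclusive end date when none is given: tomorrow, so that today is included."""
    today = today or date.today()
    return today + timedelta(days=1)

# To query through the end of 2021, the exclusive end date is the first day of 2022.
print(end_ex_for_year(2021).isoformat())  # 2022-01-01
```

The ISO string printed here has the same form as the `date_end_ex` value passed in the next cell.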

In [2]:
ticker = 'IVV'

yahoo = YahooDataLoader(ticker, date_start='2022-01-01', date_end_ex='2022-02-01', verbose=2)

The `YahooDataLoader` class comes with the `queryEOD()` method. 

<br>

`queryEOD()` queries and refreshes the daily price and volume data of the ticker. It takes the following parameters:
* `save` (bool): if True, save the queried data to the CSV file specified by `file_name`.<br>If the CSV file already exists, amend the existing file with the queried data.
* `file_name` (str): the CSV file in which to save the queried data.

If `save=False`, `queryEOD()` will return the queried data.

With `save=True`:
* If `file_name=None`, `queryEOD()` will save the queried data to "_data/<ticker>_EOD.csv.gz".
    * If "_data/" does not exist, it will be created.
* If the file already exists, `queryEOD()` will remove duplicated periods and append the new data to the file, then return the entire refreshed data (existing + new).
* If the file does not exist, `queryEOD()` will save to the file and return the queried data.
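The "remove duplicated periods and append" step can be pictured with a small standard-library sketch. This is not the loader's actual implementation: `amend_rows` is a hypothetical helper, the rows are toy values rather than real prices, and the rule that a newly queried row replaces an existing row on the same date is an assumption.

```python
def amend_rows(existing, queried):
    """Merge two lists of row dicts keyed by their 'date' field.

    Assumption: when a date appears in both lists, the queried row wins.
    """
    merged = {row["date"]: row for row in existing}
    merged.update({row["date"]: row for row in queried})
    # ISO-formatted date strings sort chronologically.
    return [merged[d] for d in sorted(merged)]

# Toy rows for illustration only (not real prices).
existing = [{"date": "2022-01-28", "close": 1.0}, {"date": "2022-01-31", "close": 2.0}]
queried = [{"date": "2022-01-31", "close": 2.5}, {"date": "2022-02-01", "close": 3.0}]

refreshed = amend_rows(existing, queried)
print([row["date"] for row in refreshed])  # ['2022-01-28', '2022-01-31', '2022-02-01']
```

The overlapping date appears only once in the result, which is the property that matters for the refreshed file.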

<br>
Suppose we are querying this data for the first time. Before running the example below, remove "IVV_EOD.csv.gz" from your "_data" folder if it is already there.


In [3]:
df = yahoo.queryEOD(save=True, file_name=None)

df.head()

IVV: Queried EOD data contains 20 rows over 20 dates from 2022-01-03 to 2022-01-31.
IVV: Save queried data to _data\IVV_EOD.csv.gz.


| | date | open | high | low | close | close_adj | volume |
|---|---|---|---|---|---|---|---|
| 0 | 2022-01-03 | 478.380005 | 479.899994 | 475.910004 | 479.839996 | 479.839996 | 5560300 |
| 1 | 2022-01-04 | 481.369995 | 482.070007 | 477.660004 | 479.679993 | 479.679993 | 5452000 |
| 2 | 2022-01-05 | 479.269989 | 480.029999 | 470.290009 | 470.329987 | 470.329987 | 7211400 |
| 3 | 2022-01-06 | 469.950012 | 472.850006 | 467.480011 | 469.980011 | 469.980011 | 5959100 |
| 4 | 2022-01-07 | 470.019989 | 471.25 | 466.670013 | 468.100006 | 468.100006 | 7673400 |


Now let's refresh this existing data to the most recent date.

Note that:
* In the code below, I set `date_start=None`: `queryEOD()` will look for the existing "_data/IVV_EOD.csv.gz" and use the latest date in it as the start date.
* The returned dataframe will contain both the existing data (saved in the previous step) and the newly queried data.
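Deriving a start date from an existing file amounts to reading the file and taking its latest date. The sketch below shows one way to do that with the standard library. `latest_date_in_file` is a hypothetical helper, not the loader's own code, and the `date` column name is an assumption taken from the dataframe shown earlier. The demo writes a tiny toy file to a temporary folder so it does not touch "_data/".

```python
import csv
import gzip
import tempfile
from datetime import date
from pathlib import Path

def latest_date_in_file(path):
    """Return the latest 'date' value in a gzipped CSV, or None if it has no rows."""
    with gzip.open(path, "rt", newline="") as f:
        dates = [date.fromisoformat(row["date"]) for row in csv.DictReader(f)]
    return max(dates) if dates else None

# Write a toy file (not real prices) and read its latest date back.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "TOY_EOD.csv.gz"
    with gzip.open(sample, "wt", newline="") as f:
        csv.writer(f).writerows(
            [["date", "close"], ["2022-01-28", "1.0"], ["2022-01-31", "2.0"]]
        )
    start = latest_date_in_file(sample)

print(start)  # 2022-01-31
```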

In [4]:
yahoo = YahooDataLoader(ticker, date_start=None, date_end_ex=None, verbose=2)

df = yahoo.queryEOD(save=True, file_name=None)

df.tail()

IVV: Existing EOD data file found at _data\IVV_EOD.csv.gz.
IVV: Existing EOD data file contains 20 rows over 20 dates from 2022-01-03 to 2022-01-31.
IVV: Queried EOD data contains 19 rows over 19 dates from 2022-02-01 to 2022-02-28.
IVV: Amended data file contains 39 rows over 39 dates from 2022-01-03 to 2022-02-28.


| | date | open | high | low | close | close_adj | volume |
|---|---|---|---|---|---|---|---|
| 0 | 2022-01-03 | 478.380005 | 479.899994 | 475.910004 | 479.839996 | 479.839996 | 5560300 |
| 1 | 2022-01-04 | 481.369995 | 482.070007 | 477.660004 | 479.679993 | 479.679993 | 5452000 |
| 2 | 2022-01-05 | 479.269989 | 480.029999 | 470.290009 | 470.329987 | 470.329987 | 7211400 |
| 3 | 2022-01-06 | 469.950012 | 472.850006 | 467.480011 | 469.980011 | 469.980011 | 5959100 |
| 4 | 2022-01-07 | 470.019989 | 471.25 | 466.670013 | 468.100006 | 468.100006 | 7673400 |
