# Yahoo Finance Exploration

This notebook is dedicated to exploring basic yfinance functionalities.

In [None]:
import yfinance as yf
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

## Searching

### Search

We can perform searches on Yahoo Finance using the [Search API](https://yfinance-python.org/reference/api/yfinance.Search.html#yfinance.Search). This allows simple queries by asset name or ticker. The focus is on market quotes, but query parameters make it possible to include optional results such as news, research, or company breakdown.

In [None]:
# By default, results include quotes, news, and corporate breakdown. 
# We disable news and increase max quote results to 10.
search_handler = yf.Search(
    "Apple Inc.", 
    max_results = 10,
    recommended = 10,
    news_count = 0, 
    include_cb = False
)

search_res = search_handler.search()

# Let's look at the raw response in dictionary form
search_res.response

In [None]:
# Single quotes inside the f-string keep this valid on Python versions before 3.12
print(f"There are {search_res.response['count']} results for this query")

quotes_info = [(exch_dict["symbol"], exch_dict["exchDisp"], exch_dict["exchange"]) 
               for exch_dict in search_res.quotes 
               if exch_dict["quoteType"] == "EQUITY"]
print("Apple Inc. is quoted on the following markets:", quotes_info)
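
A search response may not carry every field on every quote, in which case the list comprehension above would raise a `KeyError` on a partial record. A minimal sketch of a more tolerant version follows; `equity_listings` is a hypothetical helper name, and `sample_quotes` is hand-written data shaped like `search_res.quotes`, not a real API response.

```python
def equity_listings(quotes):
    """Return (symbol, exchange display name, exchange code) for equity quotes.

    Records without a symbol or with a different quote type are skipped;
    missing exchange fields become None instead of raising KeyError.
    """
    return [
        (q["symbol"], q.get("exchDisp"), q.get("exchange"))
        for q in quotes
        if q.get("quoteType") == "EQUITY" and "symbol" in q
    ]


# Hand-written sample shaped like search_res.quotes (illustrative only).
sample_quotes = [
    {"symbol": "AAPL", "exchDisp": "NASDAQ", "exchange": "NMS", "quoteType": "EQUITY"},
    {"symbol": "AAPL.MX", "exchange": "MEX", "quoteType": "EQUITY"},  # no exchDisp
    {"symbol": "APLY", "exchDisp": "NYSEArca", "exchange": "PCX", "quoteType": "ETF"},
    {"exchDisp": "NASDAQ", "quoteType": "EQUITY"},  # no symbol
]

listings = equity_listings(sample_quotes)
print(listings)
```

In the notebook, the same helper could be called as `equity_listings(search_res.quotes)`.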

### Lookup

The [Lookup API](https://yfinance-python.org/reference/api/yfinance.Lookup.html#yfinance.Lookup) queries Yahoo Finance for tickers. It focuses on instrument lookup, and presents data in a useful DataFrame format, with methods dedicated to quote types. Continuing with our _Apple Inc._ example, we obtain:

In [None]:
lookup_res = yf.Lookup("Apple Inc.")
lookup_res.all.head()

The first result is the primary market for Apple Inc. The query appears to match on the ```symbol``` (ticker) and ```shortName``` fields. Since we can't restrict it with additional parameters and the basic info returned is limited, selecting a specific instrument requires further investigation.

We can also look up directly by ticker. While we do find what we are looking for, the results are noisy: they include substring matches in both the ```symbol``` and ```shortName``` fields.

In [None]:
lookup_res = yf.Lookup("AAPL")
lookup_res.get_all().head()

In [None]:
# Resulting stocks
print("Stocks:")
lookup_res.get_stock().head()

In [None]:
print("ETFs:")
lookup_res.get_etf().head()

### Screen

To perform more complex (and general) searches, we have to provide ```*Query``` objects, like ```EquityQuery```, to the [Screen API](https://yfinance-python.org/reference/yfinance.screener.html). It lets us construct filters based on region, sector, exchange, performance, _etc._

In [None]:
nms_changers = yf.EquityQuery("and", [
    yf.EquityQuery("eq", ["exchange", "NMS"]),
    yf.EquityQuery("gt", ["percentchange", 3])
    ])

pd.DataFrame.from_dict(
    yf.screen(nms_changers,
              sortField = "percentchange", 
              sortAsc = False, 
              size = 10), 
    orient="index",
    columns = ["Results"])

There are some [predefined screeners](https://yfinance-python.org/reference/api/yfinance.screen.html#yfinance.screen) available:

In [None]:
query = yf.PREDEFINED_SCREENER_QUERIES['day_gainers']["query"]
yf.screen(query, size = 1)
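
The dictionary returned by `yf.screen` is easier to read once its list of matches is turned into a table. The sketch below assumes that the matches sit under a `"quotes"` key as a list of dictionaries; this is an assumption about the response layout that should be checked against the raw output. `quotes_frame` is a hypothetical helper, and `sample_response` is hand-written with placeholder symbols, not real market data.

```python
import pandas as pd


def quotes_frame(response, columns=("symbol", "shortName", "regularMarketChangePercent")):
    """Tabulate the matches of a screener response, keeping selected columns."""
    frame = pd.DataFrame(response.get("quotes", []))
    # reindex keeps the requested column order and fills absent columns with NaN
    return frame.reindex(columns=list(columns))


# Hand-written stand-in for a screener response (assumed layout, placeholder values).
sample_response = {
    "count": 2,
    "quotes": [
        {"symbol": "AAA", "shortName": "Example A", "regularMarketChangePercent": 5.2},
        {"symbol": "BBB", "regularMarketChangePercent": 3.4},  # no shortName
    ],
}

table = quotes_frame(sample_response)
print(table)
```

Using `reindex` rather than plain column selection means a field missing from every match yields a column of `NaN` instead of an error.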

## Downloading Ticker Info

To retrieve info on a single financial instrument, for example [Apple](https://it.finance.yahoo.com/quote/AAPL/):

In [None]:
ticker = yf.Ticker("AAPL")

We can obtain a lot of data from this ```Ticker``` object and its methods. Here are some personal notes on what caught my eye:

* ```ticker.get_info``` returns a dictionary with miscellaneous info, both descriptive (e.g. short name, market, exchange, quote type, quote currency, etc.) and market-related (e.g. day high, volume, bid/ask, etc.);
* ```ticker.get_history_metadata``` returns a dictionary with details on the security's trading operations, e.g. current trading period, recent past trading period hours, etc.;
* ```ticker.get_actions``` returns a DataFrame of past corporate actions, but only Dividends and Stock Splits;
* ```ticker.get_balance_sheet``` returns a DataFrame with a history of the company's balance sheets. Of course this only applies to stocks;
* ```ticker.get_calendar``` returns info on future events, like the next ex-dividend date;
* ```ticker.option_chain``` returns an ```Options``` object with the calls and puts quoted on this underlying for a given expiry date (see the [options page](https://finance.yahoo.com/quote/AAPL/options/));
* ```ticker.get_news``` returns a list of news on this security;
* ```ticker.funds_data``` returns info on fund-related data (when applicable).

There are many other methods that can be explored [here](https://yfinance-python.org/reference/api/yfinance.Ticker.html#yfinance.Ticker). Unfortunately, the API is poorly documented.

In [None]:
info = ticker.get_info()

name = info["shortName"]
symbol = info["symbol"]
isin = ticker.isin

# Single quotes inside the f-string keep this valid on Python versions before 3.12
print(f"Here is info on {name}, symbol {symbol}, of type {info['typeDisp']}, quoted on {info['fullExchangeName']} with ISIN code {isin}:")
print(">>", info["longBusinessSummary"])

The info dictionary is large, so we can load it into a ```DataFrame``` to handle it more easily:

In [None]:
info_df = pd.DataFrame.from_dict(ticker.get_info(), orient="index", columns = ["AAPL"])

info_df.head(10)
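
One reason the dictionary is awkward to handle is that it can mix scalar values with nested lists and dictionaries. A small sketch of separating the two before building a table follows; `split_info` is a hypothetical helper, and `sample_info` is a hand-written stand-in for the output of `ticker.get_info()` with placeholder values.

```python
def split_info(info):
    """Split an info dictionary into scalar fields and nested (list/dict) fields."""
    scalars, nested = {}, {}
    for key, value in info.items():
        target = nested if isinstance(value, (list, dict)) else scalars
        target[key] = value
    return scalars, nested


# Hand-written stand-in for ticker.get_info() (placeholder values).
sample_info = {
    "symbol": "AAPL",
    "quoteType": "EQUITY",
    "currency": "USD",
    "companyOfficers": [{"name": "Example Officer", "title": "CEO"}],
}

scalars, nested = split_info(sample_info)
print(sorted(scalars))  # scalar fields fit a one-column DataFrame
print(sorted(nested))   # nested fields are better inspected separately
```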

We can use the ```Ticker``` to obtain ```Sector``` and ```Industry``` specific info:

In [None]:
# Ticker to Sector and Industry
sector = yf.Sector(info.get("sectorKey"))
industry = yf.Industry(info.get("industryKey"))

# Sector and Industry to Ticker
sector_ticker = sector.ticker
industry_ticker = industry.ticker

# What do we obtain from these new tickers?
df = pd.concat([
    pd.DataFrame.from_dict(sector_ticker.get_info(), orient="index"), 
    pd.DataFrame.from_dict(industry_ticker.get_info(), orient="index")
    ], axis=1)
df.columns = ["Sector Ticker", "Industry Ticker"]
df.head(10)

## Downloading Historical Data

We can now download financial data. To import [S&P500](https://finance.yahoo.com/quote/%5EGSPC) daily data from Yahoo Finance, the ticker is "^GSPC":

In [None]:
file = "../data/gspc.csv"

try:
    # Restore Date as the index so the cached frame matches the downloaded one
    df_sp500 = pd.read_csv(file, parse_dates=["Date"], index_col="Date")
    print("Read historical data from", file) 
except FileNotFoundError:
    df_sp500 = yf.download("^GSPC", start="1900-01-01", multi_level_index=False, auto_adjust=True)
    df_sp500.to_csv(file)
    print("Downloaded historical data and wrote it to", file)

df_sp500.info()
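
The read-or-download pattern above can be wrapped in a small reusable helper. The sketch below is one possible shape and not part of yfinance: `load_or_fetch` and `fake_fetch` are hypothetical names, and the download step is replaced by a made-up two-row table so the example runs offline. It also creates the target folder first, since `to_csv` fails when the directory does not exist.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_or_fetch(path, fetch):
    """Read a cached CSV if present; otherwise call fetch() and cache the result."""
    path = Path(path)
    if path.exists():
        return pd.read_csv(path, parse_dates=["Date"], index_col="Date")
    frame = fetch()
    path.parent.mkdir(parents=True, exist_ok=True)  # create the data folder if missing
    frame.to_csv(path)
    return frame


def fake_fetch():
    """Stand-in for a download call; returns a tiny made-up price table."""
    index = pd.to_datetime(["2024-01-02", "2024-01-03"]).rename("Date")
    return pd.DataFrame({"Close": [100.0, 101.5]}, index=index)


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "data" / "prices.csv"
    first = load_or_fetch(cache, fake_fetch)   # cache miss: fetches and writes
    second = load_or_fetch(cache, fake_fetch)  # cache hit: reads the CSV

print(second)
```

In the notebook, `fetch` could be a `lambda` wrapping the `yf.download` call shown above.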

Preliminary data exploration:

In [None]:
df_sp500.head()

In [None]:
df_sp500.describe()

The _Open_ column presents some zero values, which is unusual for financial data and denotes missing values. Let's investigate:

In [None]:
zeros = (df_sp500["Open"] == 0).sum()
print(f"There are {zeros} zeros in this column:")
df_sp500["Open"].plot()
plt.yscale("log")
plt.ylabel("S&P 500 Open")
plt.show()

Notice that the historical record of opening values is incomplete; it becomes more reliable starting in the early 1980s, thanks to advancements in trading technology. Much of the pre-1980s data was reconstructed from newspapers, end-of-day reports, or monthly summaries, which often included only high, low, close, and volume. More accurate historical data exists, but it is not freely available on Yahoo Finance. Missing data is filled with **zero**.
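
Since zeros stand for missing values here, one option is to convert them to `NaN` before computing statistics, so that pandas skips them instead of averaging them in. A minimal sketch on made-up numbers (the `prices` table is illustrative, not S&P 500 data):

```python
import numpy as np
import pandas as pd

# Made-up prices: the first two opens are zero placeholders for missing data.
prices = pd.DataFrame(
    {"Open": [0.0, 0.0, 102.0, 103.5], "Close": [100.0, 101.0, 102.5, 104.0]},
    index=pd.date_range("2024-01-01", periods=4, freq="D"),
)

cleaned = prices.copy()
cleaned["Open"] = cleaned["Open"].replace(0, np.nan)  # 0 -> missing

print(prices["Open"].mean())   # distorted by the zero placeholders
print(cleaned["Open"].mean())  # NaN values are skipped by default
```

The same replacement also keeps log-scale plots from having to deal with zero values.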

Let's plot _Close_ values, which we expect to be more reliable:

In [None]:
df_sp500["Close"].plot()
plt.yscale("log")
plt.ylabel("S&P 500 Close")
plt.show()

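Because we passed ```auto_adjust=True```, the historical prices are adjusted for corporate events such as splits and dividends. For a price index like the S&P 500 this makes no practical difference: the index itself pays no dividends, so the series is a price return and does not include reinvested dividends.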
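
To make the idea of split adjustment concrete: after an N-for-1 split, prices before the split date are divided by N so that the series stays comparable across the event. A sketch on made-up numbers, assuming a single 4-for-1 split; the data and the `split_date` and `ratio` names are illustrative, and this is not yfinance's own adjustment code.

```python
import pandas as pd

# Made-up raw closes around a hypothetical 4-for-1 split effective on 2024-01-03.
raw = pd.Series(
    [400.0, 404.0, 101.5, 102.0],
    index=pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04"]),
    name="Close",
)
split_date = pd.Timestamp("2024-01-03")
ratio = 4.0

adjusted = raw.copy()
before_split = adjusted.index < split_date
# Rescale pre-split prices so they are comparable with post-split prices
adjusted.loc[before_split] = adjusted.loc[before_split] / ratio

print(adjusted)
```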

## Option Chains

To download option chains, we first build a ```Ticker```, then identify the list of quoted expiry dates. Finally, we can download the options for a given expiry date.

In [None]:
ticker = yf.Ticker("AAPL")

# Ticker.options returns a list of expiration dates
expiration_dates = ticker.options

# Ticker.option_chain returns an object representing the option chain
chain_dict = {expiry: ticker.option_chain(expiry) for expiry in expiration_dates}

In [None]:
date = expiration_dates[0]
opts = chain_dict[date].calls

print(f"Options expiring on {date}:")
opts.head()
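
A dictionary of per-expiry chains is handy for lookups, but comparing strikes across expiries is easier on a single long table. The sketch below stacks calls and puts with `expiry` and `type` columns. `flatten_chains` is a hypothetical helper; `sample_chains` is a hand-written stand-in for `chain_dict` that only assumes each chain exposes `.calls` and `.puts` DataFrames, and its prices are made up.

```python
from types import SimpleNamespace

import pandas as pd


def flatten_chains(chains):
    """Stack the calls and puts of every expiry into one long DataFrame."""
    parts = []
    for expiry, chain in chains.items():
        for kind in ("calls", "puts"):
            part = getattr(chain, kind).copy()
            part["expiry"] = expiry
            part["type"] = kind[:-1]  # "call" or "put"
            parts.append(part)
    return pd.concat(parts, ignore_index=True)


# Hand-written stand-in for chain_dict: one expiry, two calls and one put.
sample_chains = {
    "2025-01-17": SimpleNamespace(
        calls=pd.DataFrame({"strike": [100.0, 105.0], "lastPrice": [6.1, 3.2]}),
        puts=pd.DataFrame({"strike": [100.0], "lastPrice": [2.4]}),
    )
}

all_options = flatten_chains(sample_chains)
print(all_options)
```

In the notebook, the call would be `flatten_chains(chain_dict)`.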

## Final Example

Let us conclude this notebook with a complete workflow example using Apple stock.

In [None]:
lookup_res = yf.Lookup("Apple Inc.")

apple_stocks = lookup_res.get_stock()
apple_stocks.reset_index(inplace = True)
print(f"There are {len(apple_stocks)} stock results:")
apple_stocks.head()

From these results, we can build ```Ticker``` objects and collect the related info, using the fact that ```Ticker.get_info``` returns a dictionary:

In [None]:
infos = {}
for i, row in apple_stocks.iterrows():
    ticker = yf.Ticker(row["symbol"])
    info = ticker.get_info()
    infos[info["symbol"]] = info

ticker_info = pd.DataFrame.from_dict(infos, orient = "index")
ticker_info.head()

Let's visualize it better:

In [None]:
ticker_info.loc[:, ["shortName", "longName", "exchange", "fullExchangeName"]]

Let us download and plot recent historical data for the primary market:

In [None]:
history = yf.Ticker(ticker_info.index[0]).history()

history[["Open", "Close"]].plot()

To create more complex representations, we require all the columns to have valid data:

In [None]:
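# plotly is not imported at the top of the notebook, so import it here
import plotly.graph_objects as go

candlestick = history.reset_index()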

fig = go.Figure(
    data=[
        go.Candlestick(
            x=candlestick["Date"], 
            open=candlestick['Open'],
            high=candlestick['High'],
            low=candlestick['Low'],
            close=candlestick['Close']
        )
    ]
)

fig.update_layout(
    title=dict(text='Apple Inc. Candlestick Graph with Rangeslider'),
    yaxis=dict(title=dict(text='Apple Inc. Price (USD)'))
)

fig.show()
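
A candlestick needs open, high, low, and close values for every bar, so rows with missing values or zero placeholders are worth filtering out before plotting. A minimal sketch of such a check on made-up bars follows; `valid_ohlc` is a hypothetical helper, and the consistency rule (high is the largest value, low the smallest) is an assumption about what counts as valid here.

```python
import numpy as np
import pandas as pd


def valid_ohlc(frame):
    """Keep rows whose OHLC values are present, positive and mutually consistent."""
    ohlc = frame[["Open", "High", "Low", "Close"]]
    present = ohlc.notna().all(axis=1) & (ohlc > 0).all(axis=1)
    consistent = (frame["High"] >= ohlc.max(axis=1)) & (frame["Low"] <= ohlc.min(axis=1))
    return frame[present & consistent]


# Made-up bars: row 1 has a zero open, row 2 has a missing high.
bars = pd.DataFrame(
    {
        "Open": [100.0, 0.0, 102.0],
        "High": [101.0, 102.0, np.nan],
        "Low": [99.0, 100.0, 101.0],
        "Close": [100.5, 101.5, 102.5],
    }
)

clean_bars = valid_ohlc(bars)
print(clean_bars)
```

In the notebook, `valid_ohlc(history)` could be applied before `reset_index()`.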