# Yfinance, Option Chain, and Data

Focus: Gain familiarity with Yfinance, the option chain data, and how to manipulate that data for later computation of implied volatility surfaces and PCA.

Notes:
- Implied volatility calculation via Black-Scholes, simple samples, and root-finding methods has already been implemented at this point

- The focus of this notebook is to extend those implementations to real market data

- There will be conceptual notes throughout for both myself and others to follow along in this notebook

In [15]:
#Allow imports from the src directory
import sys
from pathlib import Path
project_root = Path().resolve().parents[0]
sys.path.append(str(project_root))

# Import self-made libraries to run checks
from src.black_scholes import black_scholes_price
from src.implied_vol import implied_volatility

# Standard imports
import yfinance as yf
import pandas as pd
import numpy as np

Cell 1 Focus/Notes:  
<br>
.option_chain()
- The expiration date must be passed as a string formatted as "YYYY-MM-DD"
- Each call returns a single expiration; pulling multiple dates requires a loop that adds each result to a DataFrame
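
The loop described above can be sketched roughly as follows. This is a minimal sketch, not the notebook's own implementation: `collect_chains` and the added `expiration` and `optionType` columns are hypothetical names introduced here, and the function only assumes that `tk.option_chain(date)` returns an object with `.calls` and `.puts` DataFrames.

```python
import pandas as pd


def collect_chains(tk, expirations):
    """Pull the option chain for each expiration and stack the results.

    `tk` is assumed to behave like a yfinance Ticker, i.e. its
    option_chain(date) returns an object with .calls and .puts DataFrames.
    """
    frames = []
    for exp in expirations:
        chain = tk.option_chain(exp)
        for side, df in (("call", chain.calls), ("put", chain.puts)):
            # Tag every row so the combined frame can be filtered later
            frames.append(df.assign(expiration=exp, optionType=side))
    if not frames:
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)


# Usage (requires network access):
# import yfinance as yf
# tk = yf.Ticker("SPY")
# all_chains = collect_chains(tk, tk.options[:3])
```

Keeping calls and puts in one long DataFrame with tag columns is only one possible layout; separate frames per expiration would work equally well for building a surface later.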

Option chain data
- Includes IV, reported in decimal form rather than percent
- Ex. Yfinance IV = 2.98 = 298%  
<br>
- Bid is the highest premium someone in the market is willing to pay for the option contract
- Ask is the lowest premium someone will accept to sell the contract
- Mid price = (bid+ask)/2, which we use as our fair market price estimate  
<br>
- Volume: How many contracts traded today
- Open interest: How many contracts still open and active
- Contract size: Shares per contract (REGULAR =  100 shares)
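
As a small illustration of the mid price and the decimal-to-percent IV conversion noted above, the sketch below adds both as columns. The helper name `add_mid_and_iv_pct` and the `mid` and `ivPct` column names are hypothetical; the sample rows reuse the first two call quotes shown in this notebook's output.

```python
import pandas as pd


def add_mid_and_iv_pct(chain_df):
    """Return a copy of a calls/puts DataFrame with mid price and IV in percent."""
    out = chain_df.copy()
    out["mid"] = 0.5 * (out["bid"] + out["ask"])        # (bid + ask) / 2
    out["ivPct"] = 100.0 * out["impliedVolatility"]     # decimal -> percent
    return out


# Two rows copied from the calls output in this notebook
sample = pd.DataFrame({
    "strike": [490.0, 500.0],
    "bid": [197.77, 187.66],
    "ask": [200.53, 190.45],
    "impliedVolatility": [2.980471, 2.736331],
})
enriched = add_mid_and_iv_pct(sample)
print(enriched[["strike", "mid", "ivPct"]])
```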



In [None]:
#Initialization 
ticker = "SPY"
tk = yf.Ticker(ticker)

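#Example of not using a specific expiration date
# The return value is an Options namedtuple holding the .calls and .puts DataFrames, not a DataFrame itself
SPYdf = tk.option_chain()
print("SPYdf data type:", type(SPYdf), "<- Notes that it does not return a DataFrame. Instead: <class 'tuple'>\n")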

#Pulling available expiration dates
expirations = tk.options
print("SPY Expirations:", expirations, "\n")

#Pull option chain for a specific date
SPYdf = tk.option_chain('2026-02-20')
print("SPYdf data type:", type(SPYdf), "<- Still an Options namedtuple; the DataFrames are in .calls and .puts\n")
print("View of option chain:\n")
print(SPYdf)

SPYdf data type: <class 'yfinance.ticker.Options'> <- Notes that it does not return a DataFrame. Instead: <class 'tuple'>

SPY Expirations: ('2026-02-06', '2026-02-09', '2026-02-10', '2026-02-11', '2026-02-12', '2026-02-13', '2026-02-20', '2026-02-27', '2026-03-06', '2026-03-13', '2026-03-20', '2026-03-31', '2026-04-17', '2026-04-30', '2026-05-15', '2026-05-29', '2026-06-18', '2026-06-30', '2026-07-31', '2026-09-18', '2026-09-30', '2026-12-18', '2026-12-31', '2027-01-15', '2027-03-19', '2027-06-17', '2027-12-17', '2028-01-21', '2028-06-16', '2028-12-15') 


Calls DataFrame:
        contractSymbol             lastTradeDate  strike  lastPrice     bid  \
0  SPY260206C00490000 2026-01-29 16:12:17+00:00   490.0     197.63  197.77   
1  SPY260206C00500000 2026-02-06 17:10:04+00:00   500.0     188.11  187.66   
2  SPY260206C00505000 2026-02-06 17:10:04+00:00   505.0     183.12  182.67   
3  SPY260206C00510000 2026-02-02 21:11:50+00:00   510.0     186.06  177.68   
4  SPY260206C00515000 2026-0

In [13]:
#Separating calls and puts into their own DataFrames & do small view
calls = SPYdf.calls
puts = SPYdf.puts
print("\nCalls DataFrame:\n", calls.head(5))
print("\nPuts DataFrame:\n", puts.head(5))

# View of columns in calls DataFrame
print("\nCalls DataFrame columns:\n", calls.columns)
print(f"There are {len(calls)} call options for SPY with expiration date 2026-02-20\n")
print(f"There are {len(puts)} put options for SPY with expiration date 2026-02-20\n")


Calls DataFrame:
        contractSymbol             lastTradeDate  strike  lastPrice     bid  \
0  SPY260206C00490000 2026-01-29 16:12:17+00:00   490.0     197.63  197.77   
1  SPY260206C00500000 2026-02-06 17:10:04+00:00   500.0     188.11  187.66   
2  SPY260206C00505000 2026-02-06 17:10:04+00:00   505.0     183.12  182.67   
3  SPY260206C00510000 2026-02-02 21:11:50+00:00   510.0     186.06  177.68   
4  SPY260206C00515000 2026-02-02 21:12:08+00:00   515.0     181.03  172.77   

      ask     change  percentChange  volume  openInterest  impliedVolatility  \
0  200.53   0.000000       0.000000     NaN             2           2.980471   
1  190.45  10.470001       5.893943    32.0            21           2.736331   
2  185.48  -8.750000      -4.560379    17.0             2           2.679691   
3  180.48   0.000000       0.000000     2.0             5           2.609378   
4  175.52   0.000000       0.000000     2.0             4           2.587894   

   inTheMoney contractSize curr

# Implied Volatility Check
Here I check that my implied volatility calculations match the values given by the Yfinance option chain.
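
Before comparing, it can help to confirm that a quoted price admits a Black-Scholes implied volatility at all. The sketch below checks the standard no-arbitrage bounds for a European call with no dividends, which is a simplification for SPY (American-style and dividend-paying). The helper names are hypothetical, and the spot price `S = 690.0` and rate `r = 0.04` are illustrative assumptions, not values taken from this notebook. A price at or below the lower bound has no solution, which is one possible reason a root-finder returns `nan` for deep in-the-money strikes.

```python
import math


def call_price_bounds(S, K, T, r):
    """No-arbitrage bounds for a European call on a non-dividend-paying asset."""
    lower = max(S - K * math.exp(-r * T), 0.0)  # discounted intrinsic value
    upper = S                                   # a call is never worth more than the stock
    return lower, upper


def admits_implied_vol(price, S, K, T, r):
    """True if the price lies strictly inside the bounds, so an IV exists."""
    lower, upper = call_price_bounds(S, K, T, r)
    return lower < price < upper


# Illustrative inputs only: S and r are assumed values
S, K, T, r = 690.0, 490.0, 0.033, 0.04
lower, upper = call_price_bounds(S, K, T, r)
print(f"bounds: ({lower:.3f}, {upper:.3f})")
print("mid=199.15 admits IV:", admits_implied_vol(199.15, S, K, T, r))
print("mid=205.00 admits IV:", admits_implied_vol(205.00, S, K, T, r))
```

With these assumed inputs the first quote falls below the lower bound and the second falls inside it, so only the second would have a well-defined implied volatility.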

In [27]:
#Imports
from src.risk_free_rate import calculate_risk_free_rate
from datetime import datetime, timezone

#Get current stock price
tk = yf.Ticker("SPY")
S = tk.info.get("regularMarketPrice", None)
if S is None:
    S = tk.history(period="1d")["Close"].iloc[-1]

#Calculate risk-free rate
r = calculate_risk_free_rate()

#Get current time in UTC
now = datetime.now(timezone.utc)
print(now)
print("now data type:", type(now))

#Calculate time to expiration
exp = "2026-02-20"
exp_dt = datetime.strptime(exp, "%Y-%m-%d").replace(tzinfo=timezone.utc)
print("exp_dt data type:", type(exp_dt), "\n")

T = ((exp_dt - now).total_seconds()) / ((365.0 * 24 * 60 * 60))  # Time to expiration in years
print("Time to expiration T (in years):", T, "\n")
print("T data type:", type(T), "\n")


# Sample of contracts to compare: first 100 calls and first 5 puts
sample_calls = calls.head(100)
sample_puts = puts.head(5)  # not used in this cell

#Compare Yfinance IV with my calculated IV for calls - using mid price as the option price for IV calculation
for row in sample_calls.itertuples(index=False):
    bid = row.bid
    ask = row.ask
    K = row.strike
    yahoo_iv = row.impliedVolatility  # already decimal 
    mid = 0.5 * (bid + ask)

    # ---- My implied vol function call ----
    # Adjust this line to match your function signature
    calc_iv = implied_volatility(price=mid, S=S, K=K, T=T, r=r, option_type="call")

    print(
        f"K={K:>7.2f} mid={mid:>8.3f} | YahooIV={yahoo_iv:>8.4f}  CalcIV={calc_iv:>8.4f}  Diff={calc_iv - yahoo_iv:+.4f}"
    )

# Visual gap between the two comparisons, then the heading for the second one
print("\n\n\n\n\n")
print("Now using last price as the option price for IV calculation\n")

#Compare Yfinance IV with my calculated IV for calls - using last price as the option price for IV calculation
for row in sample_calls.itertuples(index=False):
    K = row.strike
    yahoo_iv = row.impliedVolatility  # already decimal 
    price = row.lastPrice

    # ---- My implied vol function call ----
    # Adjust this line to match your function signature
    calc_iv = implied_volatility(price=price, S=S, K=K, T=T, r=r, option_type="call")

    # Report the last traded price here, not the stale mid from the previous loop
    print(
        f"K={K:>7.2f} last={price:>8.3f} | YahooIV={yahoo_iv:>8.4f}  CalcIV={calc_iv:>8.4f}  Diff={calc_iv - yahoo_iv:+.4f}"
    )

2026-02-07 23:04:47.710014+00:00
now data type: <class 'datetime.datetime'>
exp_dt data type: <class 'datetime.datetime'> 

Time to expiration T (in years): 0.03298174435521309 

T data type: <class 'float'> 

K= 490.00 mid= 199.150 | YahooIV=  2.9805  CalcIV=     nan  Diff=+nan
K= 500.00 mid= 189.055 | YahooIV=  2.7363  CalcIV=     nan  Diff=+nan
K= 505.00 mid= 184.075 | YahooIV=  2.6797  CalcIV=     nan  Diff=+nan
K= 510.00 mid= 179.080 | YahooIV=  2.6094  CalcIV=     nan  Diff=+nan
K= 515.00 mid= 174.145 | YahooIV=  2.5879  CalcIV=     nan  Diff=+nan
K= 520.00 mid= 169.135 | YahooIV=  2.5049  CalcIV=     nan  Diff=+nan
K= 525.00 mid= 164.070 | YahooIV=  2.3770  CalcIV=     nan  Diff=+nan
K= 530.00 mid= 159.125 | YahooIV=  2.3477  CalcIV=     nan  Diff=+nan
K= 535.00 mid= 154.135 | YahooIV=  2.2813  CalcIV=     nan  Diff=+nan
K= 540.00 mid= 149.065 | YahooIV=  2.1543  CalcIV=     nan  Diff=+nan
K= 545.00 mid= 144.050 | YahooIV=  2.0713  CalcIV=     nan  Diff=+nan
K= 550.00 mid= 139.0