# Financial Ratio Calculation for Quantile Strategy

This notebook calculates three financial ratios on a daily basis for the quantile-based long-short trading strategy:

1. **Debt to Market Cap** - Leverage indicator
2. **Return on Investment (ROI)** - Profitability relative to capital employed
3. **Price to Earnings (P/E)** - Valuation metric

**Key Methodology:**
- Fundamental data (debt, earnings, ROI) update at filing dates and are forward-filled until the next filing
- Market capitalization updates daily with stock prices
- Ratios recalculate daily as prices change



The full backtester and strategy code is available in the GitHub repository:

https://github.com/sidsahacodes/qts/tree/main/quantile_strats

## Setup and Configuration

In [1]:
import zipfile
import pandas as pd
import numpy as np
from pathlib import Path

BASE_DIR = Path(r"C:\Users\15126\Desktop\Chicago\Winter\qts\hw\quantile_strats\Data_export\ZacksFundamentalsB")
UNIVERSE_PATH = Path(r'C:\Users\15126\Desktop\Chicago\Winter\qts\hw\quantile_strats\investment_universe.csv')

PERIOD_START = pd.to_datetime("2018-01-01")
PERIOD_END = pd.to_datetime("2023-06-30")

## Load Investment Universe

Random sample of 500 tickers from the pre-filtered investment universe.
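
As a minimal sketch of why the sample is reproducible, the snippet below draws twice from a made-up universe (`toy_universe` and its tickers are hypothetical) with the same `random_state`. It assumes only that `DataFrame.sample` is seeded through `random_state`, so the draw does not depend on NumPy's global seed.

```python
import pandas as pd

# Hypothetical stand-in for the investment universe file.
toy_universe = pd.DataFrame({"ticker": [f"T{i:03d}" for i in range(50)]})

# The same random_state yields the same rows on every call.
sample_a = toy_universe.sample(n=10, random_state=42)
sample_b = toy_universe.sample(n=10, random_state=42)

same_draw = sample_a["ticker"].tolist() == sample_b["ticker"].tolist()
print(same_draw)
```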

In [2]:
universe_df = pd.read_csv(UNIVERSE_PATH)

# random_state alone makes the draw reproducible; a global NumPy seed is not needed.
universe_df = universe_df.sample(n=500, random_state=42)
universe_tickers = set(universe_df['ticker'])

print(f"Universe: {len(universe_tickers)} tickers")
print(f"Sample: {sorted(universe_tickers)[:10]}")

Universe: 500 tickers
Sample: ['ACEL', 'ACIW', 'AENZ', 'AGR', 'AGRO', 'AHCO', 'AIMC', 'ALGT', 'ALK', 'ALNT']


## Load Fundamental and Price Data

Load and filter Zacks Fundamentals B tables and QuoteMedia price data to the universe and analysis period.

In [3]:
def load_zacks_table(zip_name):
    with zipfile.ZipFile(BASE_DIR / zip_name) as z:
        csv_files = [f for f in z.namelist() if f.endswith(".csv")]
        with z.open(csv_files[0]) as f:
            return pd.read_csv(f, low_memory=False)

print("Loading fundamental data tables...")

mktv = load_zacks_table("MKTV_20240123.zip")
mktv['per_end_date'] = pd.to_datetime(mktv['per_end_date'])
mktv = mktv[(mktv['ticker'].isin(universe_tickers)) & 
            (mktv['per_end_date'] >= PERIOD_START) & 
            (mktv['per_end_date'] <= PERIOD_END)].copy()

fc = load_zacks_table("FC_20240123.zip")
fc['per_end_date'] = pd.to_datetime(fc['per_end_date'])
fc['filing_date'] = pd.to_datetime(fc['filing_date'])
fc = fc[(fc['ticker'].isin(universe_tickers)) & 
        (fc['per_end_date'] >= PERIOD_START) & 
        (fc['per_end_date'] <= PERIOD_END)].copy()

fr = load_zacks_table("FR_20240123.zip")
fr['per_end_date'] = pd.to_datetime(fr['per_end_date'])
fr = fr[(fr['ticker'].isin(universe_tickers)) & 
        (fr['per_end_date'] >= PERIOD_START) & 
        (fr['per_end_date'] <= PERIOD_END)].copy()

shrs = load_zacks_table("SHRS_20240123.zip")
shrs['per_end_date'] = pd.to_datetime(shrs['per_end_date'])
shrs = shrs[(shrs['ticker'].isin(universe_tickers)) & 
            (shrs['per_end_date'] >= PERIOD_START) & 
            (shrs['per_end_date'] <= PERIOD_END)].copy()

print(f"MKTV: {len(mktv):,} | FC: {len(fc):,} | FR: {len(fr):,} | SHRS: {len(shrs):,}")

Loading fundamental data tables...
MKTV: 11,000 | FC: 13,276 | FR: 13,276 | SHRS: 11,000
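
The four tables above go through the same ticker and date filter. One possible way to factor that out is sketched here on toy data; the helper name `filter_to_universe` and the sample rows are hypothetical and not part of the notebook.

```python
import pandas as pd

def filter_to_universe(df, tickers, start, end, date_col="per_end_date"):
    """Keep rows for the given tickers whose date_col lies in [start, end]."""
    out = df.copy()
    out[date_col] = pd.to_datetime(out[date_col])
    # Series.between is inclusive on both ends by default.
    mask = out["ticker"].isin(tickers) & out[date_col].between(start, end)
    return out[mask]

# Toy table: one row before the period, one ticker outside the universe.
toy = pd.DataFrame({
    "ticker": ["AAA", "AAA", "BBB", "CCC"],
    "per_end_date": ["2017-12-31", "2018-03-31", "2018-03-31", "2018-03-31"],
})
filtered = filter_to_universe(
    toy, {"AAA", "BBB"}, pd.Timestamp("2018-01-01"), pd.Timestamp("2023-06-30")
)
print(filtered)
```

Only the in-period rows for `AAA` and `BBB` remain.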


In [4]:
print("Loading price data...")

prices_list = []
PRICES_ZIP = BASE_DIR / "PRICES_20241105.zip"
PRICE_CSV = "QUOTEMEDIA_PRICES_247f636d651d8ef83d8ca1e756cf5ee4.csv"

with zipfile.ZipFile(PRICES_ZIP) as z:
    with z.open(PRICE_CSV) as f:
        for i, chunk in enumerate(pd.read_csv(f, usecols=["ticker", "date", "adj_close"],
                                               parse_dates=["date"], chunksize=5_000_000, 
                                               low_memory=False), 1):
            chunk = chunk[(chunk["ticker"].isin(universe_tickers)) &
                         (chunk["date"] >= PERIOD_START) &
                         (chunk["date"] <= PERIOD_END)]
            prices_list.append(chunk)
            if i % 5 == 0:
                print(f"  Chunk {i}")

prices = pd.concat(prices_list, ignore_index=True).sort_values(['ticker', 'date']).reset_index(drop=True)

print(f"\nPrices: {len(prices):,} rows | {prices['ticker'].nunique()} tickers | {prices['date'].min().date()} to {prices['date'].max().date()}")

Loading price data...
  Chunk 5
  Chunk 10

Prices: 689,181 rows | 500 tickers | 2018-01-02 to 2023-06-30
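
The chunked read keeps memory bounded by filtering each chunk before it is stored. A self-contained sketch of the same pattern on an in-memory CSV follows; the tickers, prices, and the tiny `chunksize` are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical price file: one unwanted ticker and one out-of-period row.
csv_text = (
    "ticker,date,adj_close\n"
    "AAA,2018-01-02,10.0\n"
    "ZZZ,2018-01-02,5.0\n"
    "AAA,2018-01-03,10.5\n"
    "AAA,2024-01-02,12.0\n"
)
wanted = {"AAA"}
start, end = pd.Timestamp("2018-01-01"), pd.Timestamp("2023-06-30")

kept = []
# chunksize turns read_csv into an iterator of DataFrames.
for chunk in pd.read_csv(io.StringIO(csv_text), parse_dates=["date"], chunksize=2):
    mask = chunk["ticker"].isin(wanted) & chunk["date"].between(start, end)
    kept.append(chunk[mask])

toy_prices = pd.concat(kept, ignore_index=True)
print(toy_prices)
```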


## Prepare Fundamental Data

Merge tables and calculate derived metrics:
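- **Debt**: Net long-term debt where reported; otherwise total long-term debt plus the current portion of debt
- **EPS**: Diluted EPS (or basic if unavailable); negative values are replaced with 0.001, which gives loss-making firms a very large positive P/E
- **Return**: ROI × Investment, where Investment = Market Cap + Debt at the report date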
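
To make the fallback rules concrete, here is a small sketch on invented numbers that mirrors the `fillna` logic of the cell below. Note that the net long-term debt figure is used on its own when it is present; the total plus current portion applies only when it is missing. All values are hypothetical.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "net_lterm_debt":    [100.0, np.nan, np.nan],
    "tot_lterm_debt":    [400.0, 250.0, np.nan],
    "curr_portion_debt": [20.0, 30.0, np.nan],
    "eps_diluted_net":   [1.5, np.nan, -0.4],
    "eps_basic_net":     [1.6, 0.8, -0.3],
})

# Debt: net figure if present, else total long-term + current portion.
fallback_debt = toy["tot_lterm_debt"].fillna(0) + toy["curr_portion_debt"].fillna(0)
toy["debt"] = toy["net_lterm_debt"].fillna(fallback_debt)

# EPS: diluted, else basic; negative values are replaced with 0.001.
toy["eps"] = toy["eps_diluted_net"].fillna(toy["eps_basic_net"])
toy.loc[toy["eps"] < 0, "eps"] = 0.001

print(toy[["debt", "eps"]])
```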

In [5]:
fc_data = fc[['ticker', 'per_end_date', 'filing_date', 'per_type',
              'tot_lterm_debt', 'net_lterm_debt', 'curr_portion_debt',
              'eps_diluted_net', 'eps_basic_net']].copy()

mktv_data = mktv[['ticker', 'per_end_date', 'mkt_val']].copy()
shrs_data = shrs[['ticker', 'per_end_date', 'shares_out']].copy()
fr_data = fr[['ticker', 'per_end_date', 'ret_invst']].copy()

fundamentals = (fc_data
    .merge(mktv_data, on=['ticker', 'per_end_date'], how='left')
    .merge(shrs_data, on=['ticker', 'per_end_date'], how='left')
    .merge(fr_data, on=['ticker', 'per_end_date'], how='left')
)

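# Drop records without a filing date; they cannot be placed on the timeline.
fundamentals = fundamentals[fundamentals['filing_date'].notna()].copy()
# Keep one record per ticker and period end: the first per_type in sort order.
fundamentals = fundamentals.sort_values(['ticker', 'per_end_date', 'per_type'])
fundamentals = fundamentals.drop_duplicates(['ticker', 'per_end_date'], keep='first')

# Debt: net long-term debt where reported, else total long-term + current portion.
fundamentals['debt'] = fundamentals['net_lterm_debt'].fillna(
    fundamentals['tot_lterm_debt'].fillna(0) + fundamentals['curr_portion_debt'].fillna(0)
)

# EPS: diluted, falling back to basic; negative values are replaced with 0.001.
fundamentals['eps'] = fundamentals['eps_diluted_net'].fillna(fundamentals['eps_basic_net'])
fundamentals.loc[fundamentals['eps'] < 0, 'eps'] = 0.001

# Back out the return implied by ROI and the investment base at the report date.
fundamentals['investment_at_report'] = fundamentals['mkt_val'] + fundamentals['debt']
fundamentals['return'] = fundamentals['ret_invst'] * fundamentals['investment_at_report']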

print(f"Prepared fundamentals: {len(fundamentals):,} records | {fundamentals['ticker'].nunique()} tickers")
print(f"Filing dates: {fundamentals['filing_date'].min().date()} to {fundamentals['filing_date'].max().date()}")

Prepared fundamentals: 9,436 records | 500 tickers
Filing dates: 2018-03-08 to 2023-12-29


## Create Daily Time Series

Each filing is treated as known from the calendar day after its `filing_date`, which avoids look-ahead bias, and is carried forward until the next filing. Every ticker gets one record per trading day, combining that day's price with the most recent known fundamentals.
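
A minimal sketch of the as-of join on toy data may help here. It assumes a single hypothetical filing on 2018-03-08 and shows that price rows on or before the filing date receive no fundamentals, while later rows carry the filed value forward.

```python
import pandas as pd

toy_prices = pd.DataFrame({
    "date": pd.to_datetime(["2018-03-07", "2018-03-08", "2018-03-09", "2018-03-12"]),
    "adj_close": [10.0, 10.2, 10.1, 10.4],
})
toy_filings = pd.DataFrame({
    "filing_date": pd.to_datetime(["2018-03-08"]),
    "eps": [1.25],
})
# The filing becomes usable one calendar day after it is filed.
toy_filings["known_date"] = toy_filings["filing_date"] + pd.Timedelta(days=1)

# direction="backward" picks the latest known_date that is <= each price date.
toy_daily = pd.merge_asof(
    toy_prices.sort_values("date"),
    toy_filings[["known_date", "eps"]].sort_values("known_date"),
    left_on="date",
    right_on="known_date",
    direction="backward",
)
print(toy_daily[["date", "eps"]])
```

The first two rows have no `eps`; the rows for 2018-03-09 and 2018-03-12 both carry 1.25.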

In [6]:
daily_data_list = []

for ticker in universe_tickers:
    ticker_fundamentals = fundamentals[fundamentals['ticker'] == ticker].copy()
    ticker_prices = prices[prices['ticker'] == ticker].copy()
    
    if len(ticker_fundamentals) == 0 or len(ticker_prices) == 0:
        continue
    
    ticker_fundamentals['known_date'] = ticker_fundamentals['filing_date'] + pd.Timedelta(days=1)
    
    ticker_daily = ticker_prices[['date', 'adj_close']].copy()
    ticker_daily['ticker'] = ticker
    
    ticker_daily = pd.merge_asof(
        ticker_daily.sort_values('date'),
        ticker_fundamentals[['known_date', 'per_end_date', 'mkt_val', 'debt',
                             'eps', 'shares_out', 'return', 'ret_invst']].sort_values('known_date'),
        left_on='date',
        right_on='known_date',
        direction='backward'
    )
    
    daily_data_list.append(ticker_daily)

daily_data = pd.concat(daily_data_list, ignore_index=True)
daily_data = daily_data.dropna(subset=['per_end_date'])

print(f"Daily records: {len(daily_data):,} | {daily_data['date'].min().date()} to {daily_data['date'].max().date()}")

Daily records: 626,571 | 2018-03-09 to 2023-06-30


## Calculate Daily Ratios

**Methodology:**
- Look up the last available price on or before each `per_end_date` (the report date), since period ends can fall on non-trading days
- Scale market cap daily: `Market Cap(t) = Market Cap(report) × Price(t) / Price(report)`
- Update ratios:
  - **Debt/Market Cap** = Debt / Market Cap(t)
  - **ROI** = Return / [Market Cap(t) + Debt]
  - **P/E** = Market Cap(t) / (Shares × EPS)
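
The arithmetic can be checked by hand with a small worked example. All inputs below are invented, and ROI is written as a decimal fraction purely for illustration; the units of the actual data fields are not assumed here.

```python
# Hypothetical values at the report date.
mkt_val_report = 1000.0
price_report = 50.0
debt = 200.0
roi_report = 0.10
shares = 20.0
eps = 2.5

# Hypothetical price on a later trading day.
price_today = 55.0

# Market cap scales with the price change since the report date.
mkt_val_today = mkt_val_report * price_today / price_report

# Return is fixed at the report date; only the denominator moves afterwards.
ret = roi_report * (mkt_val_report + debt)

debt_to_mktcap = debt / mkt_val_today
roi_today = ret / (mkt_val_today + debt)
pe_today = mkt_val_today / (shares * eps)

print(mkt_val_today, debt_to_mktcap, roi_today, pe_today)
```

With a 10% price rise, market cap moves from 1000 to 1100, so Debt/Market Cap and ROI both fall while P/E rises from 20 to 22.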

In [7]:
report_prices_list = []

for ticker in daily_data['ticker'].unique():
    ticker_prices = prices[prices['ticker'] == ticker].sort_values('date')
    ticker_reports = daily_data[daily_data['ticker'] == ticker][['per_end_date']].drop_duplicates()
    
    for per_end in ticker_reports['per_end_date'].unique():
        prior_prices = ticker_prices[ticker_prices['date'] <= per_end]
        if len(prior_prices) > 0:
            price_at_report = prior_prices.iloc[-1]['adj_close']
            report_prices_list.append({'ticker': ticker, 'per_end_date': per_end, 
                                      'price_at_report': price_at_report})

report_prices_df = pd.DataFrame(report_prices_list)
daily_data = daily_data.merge(report_prices_df, on=['ticker', 'per_end_date'], how='left')

daily_data['mkt_val_current'] = daily_data['mkt_val'] * (daily_data['adj_close'] / daily_data['price_at_report'])

daily_data['debt_to_mktcap'] = daily_data['debt'] / daily_data['mkt_val_current']

daily_data['investment_current'] = daily_data['mkt_val_current'] + daily_data['debt']
daily_data['roi'] = daily_data['return'] / daily_data['investment_current']

daily_data['pe_ratio'] = daily_data['mkt_val_current'] / (daily_data['shares_out'] * daily_data['eps'])

for col in ['debt_to_mktcap', 'roi', 'pe_ratio']:
    daily_data[col] = daily_data[col].replace([np.inf, -np.inf], np.nan)

print("✓ Ratios calculated\n")
print(daily_data[['ticker', 'date', 'debt_to_mktcap', 'roi', 'pe_ratio']].head(10))
print(f"\nNaN counts: Debt/MktCap={daily_data['debt_to_mktcap'].isna().sum():,} | "
      f"ROI={daily_data['roi'].isna().sum():,} | PE={daily_data['pe_ratio'].isna().sum():,}")

✓ Ratios calculated

  ticker       date  debt_to_mktcap       roi    pe_ratio
0    DOV 2018-04-30       -0.024410  1.552509  110.355693
1    DOV 2018-05-01       -0.024635  1.567239  109.343802
2    DOV 2018-05-02       -0.025206  1.604492  106.867644
3    DOV 2018-05-03       -0.025277  1.609089  106.570028
4    DOV 2018-05-04       -0.024727  1.573210  108.939045
5    DOV 2018-05-07       -0.024446  1.554916  110.189029
6    DOV 2018-05-08       -0.024139  1.534859  111.593772
7    DOV 2018-05-09       -0.023660  1.503676  113.852071
8    DOV 2018-05-10       -0.023725  1.507895  113.541121
9    DOV 2018-05-11       -0.023586  1.498884  114.207442

NaN counts: Debt/MktCap=1,754 | ROI=3,274 | PE=4,859
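
The report-date price lookup above loops over every ticker and period. As an alternative sketch, not the notebook's method, the same "last price on or before `per_end_date`" rule can be expressed with one `pd.merge_asof` call grouped by ticker. The toy frames below are hypothetical. One difference: reports with no earlier price appear with a missing `price_at_report` rather than being skipped, which gives the same result once the lookup is left-merged back.

```python
import pandas as pd

toy_prices = pd.DataFrame({
    "ticker": ["AAA"] * 3 + ["BBB"] * 2,
    "date": pd.to_datetime(["2018-03-28", "2018-03-29", "2018-04-02",
                            "2018-03-29", "2018-04-02"]),
    "adj_close": [10.0, 10.5, 11.0, 20.0, 21.0],
})
# 2018-03-31 is a Saturday, so no price exists on the period end itself.
toy_reports = pd.DataFrame({
    "ticker": ["AAA", "BBB"],
    "per_end_date": pd.to_datetime(["2018-03-31", "2018-03-31"]),
})

# Both inputs must be sorted by their join key; `by` matches within each ticker.
report_prices = pd.merge_asof(
    toy_reports.sort_values("per_end_date"),
    toy_prices.sort_values("date").rename(columns={"adj_close": "price_at_report"}),
    left_on="per_end_date",
    right_on="date",
    by="ticker",
    direction="backward",
)[["ticker", "per_end_date", "price_at_report"]]
print(report_prices)
```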


## Summary Statistics

In [8]:
print("Ratio Statistics:\n")
print(daily_data[['debt_to_mktcap', 'roi', 'pe_ratio']].describe())

print(f"\nTotal observations: {len(daily_data):,}")
print(f"Tickers: {daily_data['ticker'].nunique()}")
print(f"Date range: {daily_data['date'].min().date()} to {daily_data['date'].max().date()}")
print(f"Average days per ticker: {len(daily_data) / daily_data['ticker'].nunique():.0f}")

Ratio Statistics:

       debt_to_mktcap            roi      pe_ratio
count   624817.000000  623297.000000  6.217120e+05
mean         0.066763       1.991143  1.140419e+04
std          0.285017     159.817336  8.232533e+04
min         -2.629277  -11777.560214  5.026296e-01
25%         -0.009670      -0.246668  3.483581e+01
50%          0.003314       1.688442  9.044871e+01
75%          0.071976       4.583504  3.759990e+03
max          7.951443   20521.023094  4.364350e+06

Total observations: 626,571
Tickers: 500
Date range: 2018-03-09 to 2023-06-30
Average days per ticker: 1253


In [9]:
# Save daily ratios to Excel
print("Saving daily ratios to Excel...")

output_path = r'C:\Users\15126\Desktop\Chicago\Winter\qts\hw\quantile_strats\daily_ratios.xlsx'

# Select key columns for export
columns_to_save = [
    'ticker', 'date', 'adj_close', 'per_end_date',
    'mkt_val', 'debt', 'eps', 'shares_out',
    'debt_to_mktcap', 'roi', 'pe_ratio'
]

daily_data[columns_to_save].to_excel(output_path, index=False, engine='openpyxl')

print(f"✓ Saved to: {output_path}")
print(f"Rows: {len(daily_data):,} | Columns: {len(columns_to_save)}")

Saving daily ratios to Excel...
✓ Saved to: C:\Users\15126\Desktop\Chicago\Winter\qts\hw\quantile_strats\daily_ratios.xlsx
Rows: 626,571 | Columns: 11
