# Greenblatt Magic Formula

## The Magic Formula: Explanation and Formula

The **Magic Formula** is an investment strategy developed by **Joel Greenblatt** to identify high-quality companies that are also undervalued. It ranks companies on two financial ratios: earnings yield and return on capital.

---

## 📌 Core Idea

> **Buy good companies at cheap prices.**

The strategy identifies:
- **"Good companies"** → those with high returns on capital (efficient use of capital).
- **"Cheap companies"** → those with high earnings yield (undervalued based on operating profits).

---

## 🧮 The Formula

1. **Earnings Yield (EY):**
$$
\text{Earnings Yield} = \frac{\text{EBIT}}{\text{Enterprise Value}}
$$
- Measures how cheap the stock is.
- EBIT = Earnings Before Interest and Taxes
- Enterprise Value = Market Cap + Debt - Cash

2. **Return on Capital (ROC):**
$$
\text{Return on Capital} = \frac{\text{EBIT}}{\text{Net Working Capital} + \text{Net Fixed Assets}}
$$
- Measures the quality of the business (how efficiently it uses its capital).
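
As a quick illustration of the two ratios, here is a minimal sketch that computes them from the definitions above. The function names and all figures are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def earnings_yield(ebit, market_cap, debt, cash):
    """EBIT / Enterprise Value, with Enterprise Value = Market Cap + Debt - Cash."""
    enterprise_value = market_cap + debt - cash
    return ebit / enterprise_value


def return_on_capital(ebit, net_working_capital, net_fixed_assets):
    """EBIT / (Net Working Capital + Net Fixed Assets)."""
    return ebit / (net_working_capital + net_fixed_assets)


# Hypothetical company, all figures in millions (not real data)
ey = earnings_yield(ebit=150, market_cap=1000, debt=300, cash=100)
roc = return_on_capital(ebit=150, net_working_capital=200, net_fixed_assets=400)

print(f"Earnings yield:    {ey:.1%}")   # 150 / 1200 -> 12.5%
print(f"Return on capital: {roc:.1%}")  # 150 / 600  -> 25.0%
```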

---

## 🔍 Implementation Steps

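1. **Filter the universe**: Remove financials, utilities, and companies with a very small market cap. The code below excludes the Financial Services and Real Estate sectors and applies minimum market-cap and revenue thresholds.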
2. **Rank all remaining stocks** by:
   - Earnings Yield (high = better)
   - Return on Capital (high = better)
3. **Compute combined rank**:
   $$
   \text{Combined Rank} = \text{Rank}_{EY} + \text{Rank}_{ROC}
   $$
4. **Sort by Combined Rank** (lowest = best overall).
5. **Pick top N stocks** (e.g., top 20–30).
6. **Hold for 1 year**, rebalance annually.
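
The ranking steps can be sketched in plain Python on a toy universe. The tickers and metric values are made up, `rank_descending` is a hypothetical helper, and ties are not handled here (pandas' `rank`, used later in this notebook, averages tied ranks by default).

```python
# Hypothetical metrics for four made-up tickers (not real data)
stocks = {
    "AAA": {"earnings_yield": 0.15, "return_on_capital": 0.30},
    "BBB": {"earnings_yield": 0.08, "return_on_capital": 0.45},
    "CCC": {"earnings_yield": 0.12, "return_on_capital": 0.10},
    "DDD": {"earnings_yield": 0.05, "return_on_capital": 0.20},
}


def rank_descending(values):
    """Map each symbol to its rank, where rank 1 is the highest value."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {symbol: position for position, symbol in enumerate(ordered, start=1)}


ey_rank = rank_descending({s: m["earnings_yield"] for s, m in stocks.items()})
roc_rank = rank_descending({s: m["return_on_capital"] for s, m in stocks.items()})

# Combined rank = EY rank + ROC rank; the lowest sum is the best overall
combined_rank = {s: ey_rank[s] + roc_rank[s] for s in stocks}

top_n = 2
portfolio = sorted(combined_rank, key=combined_rank.get)[:top_n]
print(combined_rank)  # {'AAA': 3, 'BBB': 4, 'CCC': 6, 'DDD': 7}
print(portfolio)      # ['AAA', 'BBB']
```

Note that CCC has the second-best earnings yield but the worst return on capital, so it drops out of the top two: a stock has to score reasonably well on both measures.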

---

## ✅ Why It Works

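- Avoids paying too much for popular stocks: a high earnings yield means a low price relative to operating profit.
- Focuses on operationally efficient, profitable companies: a high return on capital means more operating profit per unit of capital employed.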
- Enforces a disciplined, rules-based approach.

---

## ⚠️ Notes and Caveats

- Avoids subjective judgement; however, **screening accuracy** depends on **quality of financial data**.
- May underperform over short periods or in irrational markets.
- Works best over a multi-year horizon (3–5+ years).
- I have added a minimum EBIT margin restriction to the original formula (`minimum_margin`, default 40%; the runs below pass 5%).
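
The margin restriction can be sketched as a small predicate. `passes_margin_filter` is a hypothetical helper, not part of the notebook code. The 200% upper bound mirrors the sanity check used in the screening function further down, and the guard against non-positive revenue is an added assumption.

```python
def passes_margin_filter(ebit, revenue, minimum_margin=0.4, maximum_margin=2.0):
    """Return True when the EBIT margin lies within [minimum_margin, maximum_margin]."""
    if revenue <= 0:
        return False  # assumption: the margin is treated as undefined without positive revenue
    ebit_margin = ebit / revenue
    return minimum_margin <= ebit_margin <= maximum_margin


print(passes_margin_filter(ebit=45, revenue=100))                       # 45% margin, default 40% floor
print(passes_margin_filter(ebit=30, revenue=100))                       # 30% margin, default 40% floor
print(passes_margin_filter(ebit=30, revenue=100, minimum_margin=0.05))  # 30% margin, 5% floor
```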

---


In [None]:
import os
from tenacity import retry, stop_after_attempt, wait_exponential
import requests
import pandas as pd
from tqdm import tqdm

api_key = os.getenv('financial_modeling_prep_api_key')
assert api_key is not None

In [None]:
# Retry settings: 5 attempts with exponential backoff
@retry(stop=stop_after_attempt(5), wait=wait_exponential(multiplier=1, min=10, max=60))
def fetch_json(url):
    # A timeout keeps a stalled connection from hanging the whole run
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.json()

def get_magic_formula(api_key, symbols, minimum_margin=0.4, minimum_marketcap=500_000_000, minimum_revenue=100_000_000):
    """Screen `symbols` with the filters below, then rank the survivors by earnings yield and return on capital."""

    base_url = "https://financialmodelingprep.com/stable"
    results = []

    for symbol in tqdm(symbols, desc="Processing symbols"):
        try:
            # Step 1: Check market cap
            ev_url = f"{base_url}/enterprise-values/?symbol={symbol}&apikey={api_key}"
            ev_list = fetch_json(ev_url)
            if not ev_list:
                continue
            ev_data = ev_list[0]
            market_cap = ev_data.get('marketCapitalization')

            # Skip symbols with a missing or too-small market cap
            if market_cap is None or market_cap < minimum_marketcap:
                continue

            # Step 2: Exclude financials and real estate sectors
            profile_url = f"{base_url}/profile?symbol={symbol}&apikey={api_key}"
            profile_data = fetch_json(profile_url)
            if not profile_data:
                continue
            sector = profile_data[0].get('sector')
            company_name = profile_data[0].get('companyName')

            if sector in ['Financial Services', 'Real Estate']:
                continue

            # Step 3: Fetch income statement
            income_url = f"{base_url}/income-statement/?symbol={symbol}&apikey={api_key}"
            income_data = fetch_json(income_url)[0]
            net_income = income_data.get('netIncome')
            revenue = income_data.get('revenue')

            # Filter: minimum revenue (skip if revenue is missing)
            if revenue is None or revenue < minimum_revenue:
                continue

            # Skip if net income is non-positive
            if net_income is None or net_income <= 0:
                continue

            # Check that net income is backed by operating cash flow
            cash_url = f"{base_url}/cash-flow-statement/?symbol={symbol}&apikey={api_key}"
            cash_data = fetch_json(cash_url)[0]
            cashflow = cash_data['operatingCashFlow']

            # net_income is already known to be positive here, so only the sign of the cash flow matters
            if cashflow < 0:
                print(f"Cashflow mismatch for {symbol}")
                continue  # net income not supported by cash flow

            # Step 4: Fetch balance sheet
            balance_url = f"{base_url}/balance-sheet-statement/?symbol={symbol}&apikey={api_key}"
            balance_data = fetch_json(balance_url)[0]

            # Step 5: Fetch dividend
            dividend_url = f"{base_url}/dividends/?symbol={symbol}&apikey={api_key}"
            dividend_data = fetch_json(dividend_url)
            dividend = 0
            div_date = None  # reset per symbol so a date from a previous symbol is never reused
            if dividend_data:
                dividend_data = dividend_data[0]
                dividend = dividend_data.get('yield')
                div_date = dividend_data.get('date')

            # Extract relevant fields
            ebit = income_data.get('ebit')
            
            total_assets = balance_data.get('totalAssets')
            current_liabilities = balance_data.get('totalCurrentLiabilities')
            enterprise_value = ev_data.get('enterpriseValue')
            operating_expenses = income_data.get('operatingExpenses')
            other_expenses = income_data.get('otherExpenses')

            # Ensure all required fields are present
            if None in (ebit, revenue, total_assets, current_liabilities, enterprise_value):
                print(f"missing data for {symbol}")
                continue

            # Filter: operating expenses must be non-negative
            if operating_expenses is not None and operating_expenses < 0:
                continue

            # Filter: exclude extreme accounting adjustments
            if other_expenses is not None and other_expenses < -revenue:
                continue

            # Filter: enforce the minimum EBIT margin and remove absurdly high ones (> 200%)
            ebit_margin = ebit / revenue
            if ebit_margin < minimum_margin or ebit_margin > 2:
                continue

            # Compute Greenblatt metrics
            earnings_yield = ebit / enterprise_value
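            # Capital is approximated as total assets minus current liabilities,
            # not net working capital + net fixed assets as in the formula above
            capital = total_assets - current_liabilities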
            if capital <= 0:
                continue
            return_on_capital = ebit / capital

            # Append result
            results.append({
                'symbol': symbol,
                'company_name': company_name,
                'dividend': dividend,
                'div date': div_date,
                'sector' : sector,
                'ebit_margin': ebit_margin,
                'earnings_yield': earnings_yield,
                'return_on_capital': return_on_capital
            })

        except Exception as e:
            # Log and move on so one bad symbol does not abort the whole run
            print(f"Error processing {symbol}: {e}")

    # Step 6: Compile DataFrame and rank
    df = pd.DataFrame(results)
    if df.empty:
        return df

    df['ey_rank'] = df['earnings_yield'].rank(ascending=False)
    df['roc_rank'] = df['return_on_capital'].rank(ascending=False)
    df['combined_rank'] = df['ey_rank'] + df['roc_rank']
    df_sorted = df.sort_values(by='combined_rank')

    return df_sorted


In [None]:
symbols = pd.read_csv("russel1000.csv")
symbols = symbols['Ticker'].to_list()

top_stocks = get_magic_formula(api_key, symbols, minimum_margin=0.05)
print(top_stocks)
russel1000_filtered = top_stocks
russel1000_filtered.to_csv('russel1000_magic_formula.csv')

In [100]:
russel1000_filtered.head(20)

| index | symbol | company_name | dividend | sector | ebit_margin | earnings_yield | return_on_capital | ey_rank | roc_rank | combined_rank |
|---|---|---|---|---|---|---|---|---|---|---|
| 77 | MO | Altria Group, Inc. | 6.731562 | Consumer Defensive | 0.723048 | 0.132416 | 0.560009 | 16.0 | 10.0 | 26.0 |
| 274 | JBL | Jabil Inc. | 0.192158 | Technology | 0.066614 | 0.133316 | 0.345484 | 15.0 | 44.0 | 59.0 |
| 437 | BBWI | Bath & Body Works, Inc. | 2.84495 | Consumer Cyclical | 0.183386 | 0.106695 | 0.368031 | 35.0 | 38.0 | 73.0 |
| 460 | HRB | H&R Block, Inc. | 2.633889 | Consumer Cyclical | 0.233053 | 0.102539 | 0.375378 | 41.0 | 35.0 | 76.0 |
| 248 | PHM | PulteGroup, Inc. | 0.877282 | Consumer Cyclical | 0.223233 | 0.170498 | 0.252484 | 5.0 | 82.0 | 87.0 |
| 516 | WFRD | Weatherford International plc | 1.715462 | Energy | 0.162706 | 0.147025 | 0.259024 | 8.0 | 80.0 | 88.0 |
| 337 | THC | Tenet Healthcare Corporation | 0.049383 | Healthcare | 0.293927 | 0.271328 | 0.24665 | 2.0 | 91.0 | 93.0 |
| 256 | NVR | NVR, Inc. | 18.244444 | Consumer Cyclical | 0.203279 | 0.089892 | 0.409763 | 70.0 | 28.0 | 98.0 |
| 481 | CROX | Crocs, Inc. | 0.0 | Consumer Cyclical | 0.248616 | 0.127087 | 0.250457 | 19.0 | 84.0 | 103.0 |
| 580 | DDS | Dillard's, Inc. | 6.565325 | Consumer Cyclical | 0.118713 | 0.10423 | 0.285435 | 37.0 | 68.0 | 105.0 |


In [None]:
symbols = pd.read_csv("all_listed_companies.csv")
symbols = symbols['Symbol'].to_list()

top_stocks = get_magic_formula(api_key, symbols, minimum_margin=0.05)
print(top_stocks)
world_filtered = top_stocks
world_filtered.to_csv('world_magic_formula.csv')

In [None]:
world_filtered.head(40)

## Tests

In [None]:
base_url = "https://financialmodelingprep.com/stable"
profile_url = f"{base_url}/price-target-consensus?symbol=NVDA&apikey={api_key}"
response = requests.get(profile_url)
response.raise_for_status()
profile_data = response.json()

In [None]:
# Left-skewed: the median is larger than the average
profile_data[0]
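
To see why a median above the average points to a left skew, here is a small sketch with made-up price targets (hypothetical values, not taken from the API response). A single low outlier pulls the mean down while the median stays put.

```python
from statistics import mean, median

# Hypothetical analyst price targets with one low outlier (not real data)
targets = [100, 180, 190, 200, 210]

average = mean(targets)
middle = median(targets)

print(f"mean={average}, median={middle}")
print("left-skewed" if middle > average else "not left-skewed")
```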