# S&P 500 Top 50 — Stock Probability Analyzer
**finance2** | Combined Technical + Fundamental Analysis

This notebook uses the `lib/` package to:
1. Fetch OHLCV data for the top 50 S&P 500 stocks (via `yfinance`)
2. Compute 7 technical indicators per stock
3. Fetch 10 fundamental metrics per stock (P/E, ROE, growth, …)
4. Score each component from 0 to 1 and blend the scores into a single composite probability score
5. Visualise results with ranked charts and heatmaps

> ⚠️ **Disclaimer:** For educational purposes only. Not financial advice.

## 1 — Install dependencies
Run once; skip on subsequent runs if packages are already installed.

In [None]:
# Uncomment and run if packages are not yet installed
# %pip install yfinance pandas numpy matplotlib requests lxml html5lib --quiet

## 2 — Imports

In [None]:
import sys, os
# Notebooks do not define __file__, so add the working directory (where lib/ lives) to the path.
sys.path.insert(0, os.getcwd())

from lib import data, indicators, scoring, fundamentals, display
import pandas as pd
import matplotlib.pyplot as plt

print("Imports OK")

## 3 — Configuration
Adjust the parameters below before running the full analysis.

In [None]:
# ── Stock universe ──────────────────────────────────────────────────────
# Option A (recommended): fetch live top-50 by market cap (~90 s, requires internet)
TICKERS = data.get_top_n_sp500(n=50, verbose=True)
# Option B: instant hardcoded fallback (accurate as of early 2025)
# TICKERS = data.TOP_50_SP500
# Option C: custom subset for quick testing
# TICKERS = ["AAPL", "MSFT", "NVDA", "AMZN", "GOOGL"]

# ── Technical indicator weights (must sum to 1.0) ──────────────────────
from lib.scoring import WEIGHTS
print("\nCurrent technical weights:")
for k, v in WEIGHTS.items():
    bar = "█" * round(v * 40)
    print(f"  {k:<16} {v:.2f}  {bar}")

# ── Technical / Fundamental blend ──────────────────────────────────
# Proportion of the final composite score that comes from each analysis type.
TECH_WEIGHT = scoring.TECH_WEIGHT   # default 0.60
FUND_WEIGHT = scoring.FUND_WEIGHT   # default 0.40
print(f"\nBlend: {TECH_WEIGHT:.0%} technical  +  {FUND_WEIGHT:.0%} fundamental")

# ── Data parameters ─────────────────────────────────────────────────────
LOOKBACK_DAYS = 400    # calendar days of OHLCV history to fetch

print(f"\nUniverse: {len(TICKERS)} stocks  |  Lookback: {LOOKBACK_DAYS} days")
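The comment in the configuration cell says the technical weights must sum to 1.0, but nothing in that cell enforces it. Below is a minimal sketch of such a check, assuming `WEIGHTS` is a plain dict of floats (as the print loop suggests). The helper `check_weights` and the sample values are hypothetical and not part of `lib/`.

```python
import math

def check_weights(weights, tol=1e-9):
    """Raise ValueError unless the weights are non-negative and sum to 1.0."""
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    total = math.fsum(weights.values())
    if not math.isclose(total, 1.0, abs_tol=tol):
        raise ValueError(f"weights sum to {total:.6f}, expected 1.0")
    return total

# Illustrative values only; the real defaults live in lib.scoring.WEIGHTS.
sample_weights = {
    "rsi": 0.20, "macd": 0.20, "bollinger": 0.15, "moving_averages": 0.15,
    "stochastic": 0.10, "momentum": 0.10, "volume": 0.10,
}
check_weights(sample_weights)
```

In the notebook this could be called as `check_weights(WEIGHTS)` right after the import, and the same idea applies to `TECH_WEIGHT + FUND_WEIGHT`.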

## 4 — Fetch OHLCV data & compute technical indicators

In [None]:
print("Fetching OHLCV data and computing technical scores…\n")
tech_results = scoring.analyze_universe(
    TICKERS,
    lookback_days=LOOKBACK_DAYS,
    verbose=True,
)
print(f"\n✓ Technical analysis complete for {len(tech_results)} stocks.")
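The notebook does not show how `lib/` turns a raw indicator into a 0–1 score. As an illustration only, the sketch below computes a simple-average RSI and maps it linearly to a score where 1 is bullish and 0 is bearish. The contrarian mapping (oversold scores high), the 30/70 cut-offs, and both function names are assumptions, not the library's method.

```python
def simple_rsi(closes, period=14):
    """RSI from simple averages of gains and losses over the last `period` changes."""
    if len(closes) < period + 1:
        raise ValueError("need at least period + 1 closing prices")
    # Pair each of the last `period` closes with the close before it.
    changes = [b - a for a, b in zip(closes[-period - 1:-1], closes[-period:])]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        # No losses: RSI is 100 if there were gains, neutral 50 if prices were flat.
        return 100.0 if avg_gain > 0 else 50.0
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)

def rsi_to_score(rsi, low=30.0, high=70.0):
    """Map RSI to [0, 1]: 1.0 at or below `low` (oversold), 0.0 at or above `high`."""
    if rsi <= low:
        return 1.0
    if rsi >= high:
        return 0.0
    return (high - rsi) / (high - low)

# Fifteen steadily rising closes give an RSI of 100, which this mapping scores as 0.
rising = list(range(100, 115))
score = rsi_to_score(simple_rsi(rising))
```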

## 5 — Fetch fundamental data

> This step calls `yfinance` for each ticker individually (`.info` endpoint).
> Expect **1–2 minutes** for 50 stocks.

Metrics fetched:

| Category | Metrics |
|---|---|
| Valuation | Trailing P/E, Price-to-Book, PEG ratio |
| Growth | Revenue growth (YoY), Earnings growth (YoY) |
| Quality | Return on equity, Net profit margin, FCF yield |
| Risk | Debt-to-equity ratio |
| Sentiment | Analyst consensus recommendation |

In [None]:
print("Fetching fundamental data…\n")
fund_results = fundamentals.analyze_fundamentals(
    list(tech_results.keys()),
    verbose=True,
)
print(f"\n✓ Fundamental analysis complete for {len(fund_results)} stocks.")
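To make one of the quality metrics concrete, here is a sketch of how FCF yield could be derived from the raw `freeCashflow` and `marketCap` fields and rescaled to a 0–1 score. The helper names, the 0%–8% scoring range, and the neutral 0.5 used for missing data are assumptions chosen for illustration. The notebook does not show how `fundamentals.analyze_fundamentals` actually scores this metric.

```python
def fcf_yield(info):
    """Free cash flow divided by market cap, or None when either field is unusable."""
    fcf = info.get("freeCashflow")
    cap = info.get("marketCap")
    if fcf is None or cap is None or cap <= 0:
        return None
    return fcf / cap

def clamp_score(value, lo, hi, missing=0.5):
    """Linearly rescale `value` from [lo, hi] to [0, 1]; return `missing` for None."""
    if value is None:
        return missing
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))

# Made-up figures for illustration: $5B of free cash flow on a $100B market cap.
example_info = {"freeCashflow": 5e9, "marketCap": 100e9}
example_score = clamp_score(fcf_yield(example_info), lo=0.0, hi=0.08)

# A ticker with no reported free cash flow falls back to the neutral score.
neutral_score = clamp_score(fcf_yield({"marketCap": 100e9}), lo=0.0, hi=0.08)
```

Returning a neutral score for missing fields keeps a ticker in the ranking without rewarding or penalising it for incomplete data.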

## 6 — Combined scoring
Blend the technical and fundamental composites using the configured weights.

In [None]:
results = scoring.merge_fundamental(
    tech_results,
    fund_results,
    tech_weight=TECH_WEIGHT,
    fund_weight=FUND_WEIGHT,
)

# Sort by combined composite score
results = dict(sorted(results.items(), key=lambda x: x[1]["composite"], reverse=True))

print(f"✓ Combined scores computed for {len(results)} stocks.")
print(f"\n{'Rank':<5} {'Ticker':<8} {'Tech':>6} {'Fund':>6} {'Combined':>9} {'Signal'}")
print("-" * 50)
for rank, (ticker, r) in enumerate(results.items(), 1):
    print(f"{rank:<5} {ticker:<8} {r['tech_composite']:>6.3f} {r['fund_composite']:>6.3f} "
          f"{r['composite']:>9.3f}  {r['signal']}")
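How `scoring.merge_fundamental` combines the two composites is not shown in this notebook. The simplest reading of the configured weights is a weighted average. The sketch below works under that assumption, and `blend` is a hypothetical helper rather than the library function.

```python
def blend(tech, fund, tech_weight=0.60, fund_weight=0.40):
    """Weighted average of two 0-1 scores, normalised by the weight total."""
    total = tech_weight + fund_weight
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return (tech_weight * tech + fund_weight * fund) / total

# With the default 60/40 split, a technical score of 0.80 and a fundamental
# score of 0.50 give 0.6 * 0.80 + 0.4 * 0.50.
combined = blend(0.80, 0.50)
```

Dividing by the weight total means the result stays in the 0–1 range even if the two weights do not sum exactly to 1.0.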

## 7 — Summary table

In [None]:
rows = []
for rank, (ticker, r) in enumerate(results.items(), 1):
    rows.append({
        "Rank":     rank,
        "Ticker":   ticker,
        "Price":    f"${r['price']:.2f}" if r.get('price') else "N/A",
        "Tech":     f"{r['tech_composite']:.3f}",
        "Fund":     f"{r['fund_composite']:.3f}",
        "Combined": f"{r['composite']:.3f}",
        "Signal":   r["signal"],
    })

df_summary = pd.DataFrame(rows).set_index("Rank")
df_summary

## 8 — Score visualisations

In [None]:
tickers_sorted = list(results.keys())
tech_scores    = [results[t]["tech_composite"]  for t in tickers_sorted]
fund_scores    = [results[t]["fund_composite"]  for t in tickers_sorted]
comb_scores    = [results[t]["composite"]        for t in tickers_sorted]
signals        = [results[t]["signal"]           for t in tickers_sorted]

SIGNAL_COLORS = {
    "Strong Buy":  "#2e7d32",
    "Buy":         "#66bb6a",
    "Neutral":     "#ffa726",
    "Sell":        "#ef5350",
    "Strong Sell": "#b71c1c",
}

fig, axes = plt.subplots(3, 1, figsize=(16, 14))

# — Combined composite
ax = axes[0]
colors = [SIGNAL_COLORS[s] for s in signals]
bars = ax.barh(tickers_sorted[::-1], comb_scores[::-1], color=colors[::-1])
ax.axvline(x=0.75, color="#2e7d32", linestyle="--", alpha=0.5, label="Strong Buy")
ax.axvline(x=0.60, color="#66bb6a", linestyle="--", alpha=0.5, label="Buy")
ax.axvline(x=0.40, color="#ffa726", linestyle="--", alpha=0.5, label="Neutral")
ax.set_xlabel("Combined Composite Score")
ax.set_title(f"Combined Score ({TECH_WEIGHT:.0%} Technical + {FUND_WEIGHT:.0%} Fundamental)",
             fontsize=13, fontweight="bold")
ax.set_xlim(0, 1)
ax.legend(loc="lower right", fontsize=8)
for bar, score in zip(bars, comb_scores[::-1]):
    ax.text(min(score + 0.01, 0.97), bar.get_y() + bar.get_height() / 2,
            f"{score:.3f}", va="center", fontsize=7)

# — Technical vs Fundamental side-by-side
ax2 = axes[1]
import numpy as np
x = np.arange(len(tickers_sorted))
w = 0.35
ax2.bar(x - w/2, tech_scores, w, label="Technical", color="#1565c0", alpha=0.8)
ax2.bar(x + w/2, fund_scores, w, label="Fundamental", color="#e65100", alpha=0.8)
ax2.set_xticks(x)
ax2.set_xticklabels(tickers_sorted, rotation=45, ha="right", fontsize=8)
ax2.set_ylabel("Score")
ax2.set_title("Technical vs Fundamental Component Scores", fontsize=13, fontweight="bold")
ax2.set_ylim(0, 1)
ax2.legend()
ax2.axhline(0.6, color="gray", linestyle="--", alpha=0.4)

# — Signal distribution
ax3 = axes[2]
from collections import Counter
sig_counts = Counter(signals)   # a Counter returns 0 for signals that never occur
keys   = ["Strong Buy", "Buy", "Neutral", "Sell", "Strong Sell"]
values = [sig_counts[k] for k in keys]
bar_colors = [SIGNAL_COLORS[k] for k in keys]
ax3.bar(keys, values, color=bar_colors)
ax3.set_ylabel("Number of Stocks")
ax3.set_title("Signal Distribution (Combined Score)", fontsize=13, fontweight="bold")
for i, v in enumerate(values):
    if v > 0:
        ax3.text(i, v + 0.1, str(v), ha="center", fontweight="bold")

plt.tight_layout(pad=2.0)
plt.show()
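The dashed lines in the first chart sit at 0.75, 0.60 and 0.40 and are labelled Strong Buy, Buy and Neutral. A sketch of a score-to-signal mapping consistent with those lines follows. The 0.25 boundary between Sell and Strong Sell and the use of inclusive lower bounds are assumptions, since the chart draws no line for them and the actual logic lives in `lib/scoring`.

```python
def classify(score, strong_sell_below=0.25):
    """Map a 0-1 composite score to one of the five signal labels."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    if score >= 0.75:
        return "Strong Buy"
    if score >= 0.60:
        return "Buy"
    if score >= 0.40:
        return "Neutral"
    # The Sell / Strong Sell boundary is an assumption; the chart draws no line for it.
    if score >= strong_sell_below:
        return "Sell"
    return "Strong Sell"

labels = [classify(s) for s in (0.81, 0.60, 0.45, 0.30, 0.10)]
```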

## 9 — Technical indicator heatmap

In [None]:
import numpy as np

tech_components = ["rsi", "macd", "bollinger", "moving_averages", "stochastic", "momentum", "volume"]

rows_data, row_labels = [], []
for ticker in tickers_sorted:
    ts = results[ticker].get("tech_scores", {})
    rows_data.append([ts.get(c, 0.5) for c in tech_components])
    row_labels.append(ticker)

matrix = np.array(rows_data)

fig, ax = plt.subplots(figsize=(12, max(8, len(tickers_sorted) * 0.35)))
im = ax.imshow(matrix, cmap="RdYlGn", vmin=0, vmax=1, aspect="auto")

ax.set_xticks(range(len(tech_components)))
ax.set_xticklabels([c.replace("_", "\n") for c in tech_components], fontsize=9)
ax.set_yticks(range(len(row_labels)))
ax.set_yticklabels(row_labels, fontsize=8)

for i in range(len(row_labels)):
    for j in range(len(tech_components)):
        val = matrix[i, j]
        ax.text(j, i, f"{val:.2f}", ha="center", va="center",
                fontsize=7, color="black" if 0.25 < val < 0.75 else "white")

plt.colorbar(im, ax=ax, shrink=0.6, label="Score (0=Bearish, 1=Bullish)")
ax.set_title("Technical Indicator Heatmap", fontsize=13, fontweight="bold", pad=12)
plt.tight_layout()
plt.show()

## 10 — Fundamental indicator heatmap

In [None]:
fund_components = [
    "pe", "pb", "peg", "revenue_growth", "earnings_growth",
    "roe", "profit_margin", "debt_equity", "fcf_yield", "analyst"
]
fund_labels = [
    "P/E", "P/B", "PEG", "Rev\nGrowth", "EPS\nGrowth",
    "ROE", "Profit\nMargin", "D/E", "FCF\nYield", "Analyst"
]

rows_data, row_labels = [], []
for ticker in tickers_sorted:
    fs = results[ticker].get("fund_scores", {})
    rows_data.append([fs.get(c, 0.5) for c in fund_components])
    row_labels.append(ticker)

matrix = np.array(rows_data)

fig, ax = plt.subplots(figsize=(14, max(8, len(tickers_sorted) * 0.35)))
im = ax.imshow(matrix, cmap="RdYlGn", vmin=0, vmax=1, aspect="auto")

ax.set_xticks(range(len(fund_labels)))
ax.set_xticklabels(fund_labels, fontsize=9)
ax.set_yticks(range(len(row_labels)))
ax.set_yticklabels(row_labels, fontsize=8)

for i in range(len(row_labels)):
    for j in range(len(fund_components)):
        val = matrix[i, j]
        ax.text(j, i, f"{val:.2f}", ha="center", va="center",
                fontsize=7, color="black" if 0.25 < val < 0.75 else "white")

plt.colorbar(im, ax=ax, shrink=0.6, label="Score (0=Weak, 1=Strong)")
ax.set_title("Fundamental Indicator Heatmap", fontsize=13, fontweight="bold", pad=12)
plt.tight_layout()
plt.show()

## 11 — Deep dive on a single stock
Set `DEEP_TICKER` to any stock in the universe for a detailed breakdown.

In [None]:
DEEP_TICKER = tickers_sorted[0]   # default: highest-ranked stock

r = results[DEEP_TICKER]
print(f"\n{'='*55}")
print(f"  Deep Dive: {DEEP_TICKER}")
print(f"{'='*55}")
price = r.get('price')
price_str = f"${price:.2f}" if price else "N/A"   # same fallback as the summary table
print(f"  Price            : {price_str}")
print(f"  Tech composite   : {r['tech_composite']:.3f}")
print(f"  Fund composite   : {r['fund_composite']:.3f}")
print(f"  Combined score   : {r['composite']:.3f}")
print(f"  Signal           : {r['signal']}")

print(f"\n--- Technical Component Scores ---")
for k, v in r.get('tech_scores', {}).items():
    bar = '\u2588' * round(v * 30)
    print(f"  {k:<18} {v:.3f}  {bar}")

print(f"\n--- Fundamental Component Scores ---")
fund_names = {
    'pe': 'P/E ratio', 'pb': 'Price/Book', 'peg': 'PEG ratio',
    'revenue_growth': 'Revenue growth', 'earnings_growth': 'Earnings growth',
    'roe': 'Return on equity', 'profit_margin': 'Profit margin',
    'debt_equity': 'Debt/Equity', 'fcf_yield': 'FCF yield', 'analyst': 'Analyst rating'
}
for k, v in r.get('fund_scores', {}).items():
    bar = '\u2588' * round(v * 30)
    label = fund_names.get(k, k)
    print(f"  {label:<22} {v:.3f}  {bar}")

print(f"\n--- Raw Fundamental Data ---")
raw = r.get('fund_raw', {})
for field, val in raw.items():
    if val is not None:
        if field in ('marketCap', 'freeCashflow'):
            print(f"  {field:<28} ${val/1e9:.1f}B")
        elif field in ('trailingPE', 'forwardPE', 'priceToBook', 'pegRatio',
                       'debtToEquity', 'currentRatio', 'recommendationMean'):
            print(f"  {field:<28} {val:.2f}")
        else:
            print(f"  {field:<28} {val*100:.1f}%")

## 12 — Save results to CSV

In [None]:
rows_out = []
for rank, (ticker, r) in enumerate(results.items(), 1):
    row = {
        "Rank":            rank,
        "Ticker":          ticker,
        "Price":           r.get("price"),
        "Tech_Composite":  round(r["tech_composite"], 4),
        "Fund_Composite":  round(r["fund_composite"], 4),
        "Combined":        round(r["composite"], 4),
        "Signal":          r["signal"],
    }
    # Technical component scores
    for k, v in r.get("tech_scores", {}).items():
        row[f"tech_{k}"] = round(v, 4)
    # Fundamental component scores
    for k, v in r.get("fund_scores", {}).items():
        row[f"fund_{k}"] = round(v, 4)
    # Key raw fundamental data
    raw = r.get("fund_raw", {})
    for field in ["trailingPE", "priceToBook", "pegRatio", "revenueGrowth",
                  "earningsGrowth", "returnOnEquity", "profitMargins",
                  "debtToEquity", "recommendationMean"]:
        row[field] = raw.get(field)
    rows_out.append(row)

df_out = pd.DataFrame(rows_out)
df_out.to_csv("sp500_analysis_results.csv", index=False)
print(f"Results saved to sp500_analysis_results.csv  ({len(df_out)} rows, {len(df_out.columns)} columns)")
df_out.head()