<a href="https://colab.research.google.com/github/aminaalavi/AI-Projects/blob/main/FinAlphaAgents/FinAlphaAgents.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Alpha Agents - Autogen/AG2 GroupChat - Round Robin Debate with Tools

###LLM based Multi-Agents for Equity Portfolio Selection
This notebook is an inspired by and simpler version of the brilliant paper: AlphaAgents: Large Language Model based Multi-Agents for Equity Portfolio Constructions. https://arxiv.org/abs/2508.11152

It sets up GroupChat with round‑robin, coordinator + three agents and their tools that helps in the contruction of Equity Portfolios


In [1]:
# %%capture
!pip -q install autogen==0.9.9 yfinance feedparser sec-edgar-downloader beautifulsoup4 lxml scikit-learn matplotlib pandas

  Preparing metadata (setup.py) ... [?25l[?25hdone
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m834.0/834.0 kB[0m [31m28.7 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m81.3/81.3 kB[0m [31m5.7 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m45.5/45.5 kB[0m [31m3.5 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m147.8/147.8 kB[0m [31m10.6 MB/s[0m eta [36m0:00:00[0m
[?25h  Building wheel for sgmllib3k (setup.py) ... [?25l[?25hdone


### Imports and basic data helper
What happens here:
- Imports all Python libs used throughout (NumPy, Pandas, matplotlib, AutoGen, etc.).
- Defines `AS_OF_DATE` (today) and a `DEFAULT_TICKER` for quick tests.
- Implements `fetch_closes(ticker, as_of, …)` to download adjusted **daily Close** prices via `yfinance` and return a clean `list[float]` for the last ~1 trading year.
- Later cells compute valuation metrics from these close prices.


In [2]:
import os, re, json, math, glob, html, urllib.parse, datetime as dt
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import yfinance as yf
import feedparser
from bs4 import BeautifulSoup
from sec_edgar_downloader import Downloader as SecDownloader
from sklearn.feature_extraction.text import TfidfVectorizer

from autogen import ConversableAgent, GroupChat, GroupChatManager

import datetime as dt
AS_OF_DATE = dt.date.today()
DEFAULT_TICKER = 'AAPL'  # quick single-ticker smoke test

def fetch_closes(ticker: str, as_of: dt.date, lookback_days: int = 365*2, n_keep: int = 260):
    raw = yf.download(ticker, start=as_of - dt.timedelta(days=lookback_days), end=as_of + dt.timedelta(days=1), auto_adjust=True, progress=False)
    px = raw.loc[:pd.Timestamp(as_of)]
    close = px['Close'].dropna().astype(float)
    return close.tolist()[-n_keep:]


### LLM configuration for AutoGen
What this sets up:
- Chooses the model from `AG2_MODEL` (defaults to `gpt-4o-mini`).
- Reads `OPENAI_API_KEY` from Colab `userdata` or environment variables. Make sure you have your key in the google secrets
- Builds a **flat** `llm_config` expected by AutoGen 0.9.9 with `temperature=0.0` for deterministic outputs.
Safety checks:
- Asserts that an API key is present and prints the selected model.


In [3]:
# Build a robust legacy-flat llm_config for AutoGen 0.9.9
AG2_MODEL = os.environ.get('AG2_MODEL', 'gpt-4o-mini')
OPENAI_API_KEY = None
try:
    from google.colab import userdata
    OPENAI_API_KEY = userdata.get('OPENAI_API_KEY')
except Exception:
    pass
OPENAI_API_KEY = OPENAI_API_KEY or os.environ.get('OPENAI_API_KEY')
assert OPENAI_API_KEY, 'Missing OPENAI_API_KEY (set in Colab userdata or env var)'

llm_config = {
    'config_list': [
        {'model': AG2_MODEL, 'api_key': OPENAI_API_KEY, 'api_type': 'openai'}
    ],
    'temperature': 0.0
}
print('llm_config ready for model=', AG2_MODEL)

llm_config ready for model= gpt-4o-mini


### Core tools: valuation metrics, news/filings fetchers, and mini-RAG
What’s defined:
- `compute_valuation_metrics_from_closes(closes)`: computes **annualized return** and **annualized volatility** from a list of closes (uses daily returns and 252 trading-day convention).
- `news_rss_tool(ticker, as_of_date, limit)`: queries **Google News RSS** for the ticker up to the given date and returns a list of `{title, publisher, link, date}`.
- `edgar_fetch_tool(ticker, as_of_date, forms)`: downloads recent **SEC 10-K/10-Q** text snippets for the company (best-effort).
- `rag_search_tool(query, docs, k)`: TF-IDF ranks the provided `docs` against `query` and returns the top-k snippets.
Why it’s here:
- These functions act as the “ground truth” tools that agents will reference.


In [4]:
def compute_valuation_metrics_from_closes(closes: list):
    """Annualized return/volatility from close prices."""
    arr = np.asarray(closes, dtype=float)
    arr = arr[~np.isnan(arr)]
    if arr.size < 2:
        return {'ann_return': None, 'ann_vol': None, 'n': int(arr.size) - 1}
    daily = np.diff(arr) / arr[:-1]
    n = daily.size
    r_cum = float(np.prod(1.0 + daily) - 1.0)
    r_ann = float((1.0 + r_cum) ** (252.0 / n) - 1.0)
    sigma_daily = float(np.std(daily, ddof=1)) if n > 1 else 0.0
    sigma_ann = float(sigma_daily * math.sqrt(252.0))
    return {'ann_return': r_ann, 'ann_vol': sigma_ann, 'n': int(n)}

def news_rss_tool(ticker: str, as_of_date: str, limit: int = 10):
    """Simple Google News RSS up to as_of_date (fast + robust)."""
    company = ''
    try:
        info = yf.Ticker(ticker).get_info()
        company = info.get('shortName') or info.get('longName') or ''
    except Exception:
        pass
    q = ' '.join(x for x in [ticker, 'stock', company] if x).strip()
    url = f"https://news.google.com/rss/search?q={urllib.parse.quote_plus(q)}&hl=en-US&gl=US&ceid=US:en"
    feed = feedparser.parse(url)
    asof = dt.date.fromisoformat(as_of_date)
    items = []
    for e in feed.entries:
        pub = getattr(e, 'published_parsed', None) or getattr(e, 'updated_parsed', None)
        if not pub:
            continue
        d = dt.date(pub.tm_year, pub.tm_mon, pub.tm_mday)
        if d <= asof:
            title = getattr(e, 'title', '') or ''
            link = getattr(e, 'link', '') or ''
            src = getattr(getattr(e, 'source', None), 'title', '') if hasattr(e, 'source') else ''
            if title:
                items.append({'title': title, 'publisher': src, 'link': link, 'date': str(d)})
        if len(items) >= limit:
            break
    return items

def edgar_fetch_tool(ticker: str, as_of_date: str, forms=("10-K","10-Q")):
    """Best-effort text snippets from latest filings <= as_of_date."""
    os.environ.setdefault('SEC_COMPANY', 'AlphaAgentsDemo')
    os.environ.setdefault('SEC_EMAIL', 'demo@example.com')
    dl = SecDownloader(company_name=os.environ['SEC_COMPANY'], email_address=os.environ['SEC_EMAIL'], download_folder='/content/sec-filings')
    # Download a few copies and pick the latest <= as_of
    for form in forms:
        try:
            dl.get(form, ticker, amount=3, download_details=False)
        except Exception:
            pass

    roots = [getattr(dl, 'download_folder', '/content/sec-filings')]
    roots += [os.path.join(roots[0], 'sec-edgar-filings')]
    roots = [r for r in roots if os.path.isdir(r)]
    candidates = []
    for r in roots:
        for form in forms:
            candidates += glob.glob(os.path.join(r, ticker, form, '**', '*.htm*'), recursive=True)
            candidates += glob.glob(os.path.join(r, 'sec-edgar-filings', ticker, form, '**', 'full-submission.*'), recursive=True)
    candidates = sorted(set(candidates))
    asof = dt.date.fromisoformat(as_of_date)

    def _decode(b):
        try: return b.decode('utf-8', errors='ignore')
        except: return b.decode('latin-1', errors='ignore')

    def _infer_date(s):
        m = re.search(r"FILED AS OF DATE:\s*(\d{8})", s)
        if m:
            ymd = m.group(1); return dt.date(int(ymd[:4]), int(ymd[4:6]), int(ymd[6:8]))
        m = re.search(r"\b(20\d{2})(\d{2})(\d{2})\b", s[:60000])
        if m:
            y,mn,d = map(int, m.groups()); return dt.date(y,mn,d)
        return None

    latest = {}
    for p in candidates:
        try:
            with open(p, 'rb') as f: blob = f.read(600_000)
            s = _decode(blob)
        except Exception:
            continue
        d = _infer_date(s)
        if not d or d > asof:
            continue
        form = '10-K' if '/10-K/' in p or '\\10-K\\' in p else ('10-Q' if '/10-Q/' in p or '\\10-Q\\' in p else None)
        if form not in forms:
            continue
        prev = latest.get(form)
        if (prev is None) or (d > prev[0]):
            latest[form] = (d, p, s)

    texts = []
    for form, tup in latest.items():
        raw = tup[2]
        try:
            soup = BeautifulSoup(raw, 'lxml'); text = re.sub(r"\s+", ' ', soup.get_text('\n')).strip()
            if len(text) < 500: text = re.sub(r"\s+", ' ', raw).strip()
        except Exception:
            text = re.sub(r"\s+", ' ', raw).strip()
        texts.append(text[:20000])
    return texts

def rag_search_tool(query: str, docs: list, k: int = 3):
    if not docs:
        return []
    vec = TfidfVectorizer(stop_words='english', max_df=0.9)
    X = vec.fit_transform(docs + [query])
    sims = (X[-1] @ X[:-1].T).toarray().ravel()
    idx = np.argsort(-sims)[:k]
    return [docs[i] for i in idx]

### Robust data prep and JSON seeds for agents
What this does:
- Re-implements a more robust `fetch_closes` to always return a clean list of floats (handles odd `yfinance` DataFrame shapes).
- Sets `ticker`/`AS_OF_DATE`, fetches the close series, and computes valuation JSON.
- Pulls **news** via RSS and **filings** snippets via SEC downloader.
- Runs a tiny **RAG** over filings to extract a few targeted snippets (e.g., cash flow, margins, risks).
- Prints quick summaries so you can verify the inputs.
These JSON snippets will be **seeded** into the debate so Round-2 agents talk about the same facts.


In [5]:
# Patch fetch_closes to avoid DataFrame.tolist() errors, then prep all inputs

import pandas as pd, datetime as dt, yfinance as yf

def fetch_closes(ticker, as_of_date, lookback_days=365*2, max_points=260):
    """
    Robust: downloads up to as_of_date, picks the Close series even if yfinance
    returns MultiIndex columns, and returns a float list (last ~1 year).
    """
    start = as_of_date - dt.timedelta(days=lookback_days)
    raw = yf.download(
        ticker,
        start=start,
        end=as_of_date + dt.timedelta(days=1),
        auto_adjust=True,
        progress=False,
    )
    if raw is None or len(raw) == 0:
        return []

    # Select the Close column robustly
    if isinstance(raw.columns, pd.MultiIndex):
        if ('Close', ticker) in raw.columns:
            ser = raw[('Close', ticker)]
        else:
            close_cols = [c for c in raw.columns if c[0] == 'Close']
            ser = raw[close_cols[0]] if close_cols else raw.droplevel(0, axis=1).iloc[:, 0]
    else:
        ser = raw['Close'] if 'Close' in raw.columns else raw.iloc[:, 0]

    ser = pd.to_numeric(pd.Series(ser).dropna(), errors='coerce').dropna()
    return ser.astype(float).tolist()[-max_points:]

# === Use the patched fetch_closes ===
ticker = DEFAULT_TICKER
close_list = fetch_closes(ticker, AS_OF_DATE)

# Run your tools
val_json = compute_valuation_metrics_from_closes(close_list)
news_json = news_rss_tool(ticker, str(AS_OF_DATE), limit=10)
filings_snippets = edgar_fetch_tool(ticker, str(AS_OF_DATE), forms=("10-K","10-Q"))

# Tiny RAG picks
rag_samples = {}
try:
    if filings_snippets:
        rag_samples = {
            "cash_flow": [s[:400] for s in (rag_search_tool("cash flow from operations", filings_snippets, k=2) or [])],
            "margins":   [s[:400] for s in (rag_search_tool("gross margin trend", filings_snippets, k=2) or [])],
            "risks":     [s[:400] for s in (rag_search_tool("risk factors", filings_snippets, k=2) or [])],
        }
except Exception:
    rag_samples = {}

print("Prepared:", ticker)
print("close_list points:", len(close_list))
print("Valuation JSON:", val_json)
print("News items:", len(news_json) if news_json else 0)
print("Filings snippets:", len(filings_snippets) if filings_snippets else 0)


Prepared: AAPL
close_list points: 260
Valuation JSON: {'ann_return': 0.07046056523464173, 'ann_vol': 0.31865808132579876, 'n': 259}
News items: 10
Filings snippets: 0


### Sentiment source: SEC + Google News (deduped + date-bounded)
What’s included:
- Utility functions to normalize dates and robustly parse RSS entries.
- Fetches company name via `yfinance` (best-effort) to enrich the Google News query.
- Merges **SEC filings** items with **Google News RSS** items, de-duplicates by link/title+date, and sorts by recency up to `AS_OF_DATE`.
- Registers a `news_combined_from_env(...)` tool on the Sentiment agent (if available) and prints a small preview.
Why it’s useful:
- Gives the Sentiment agent a broader, more realistic news surface while staying within a reproducible, lightweight pipeline.


In [6]:
# === Add Google RSS to Sentiment: merge SEC + Google News (robust) ===
import os, re, html, json, datetime as dt, urllib.parse
from bs4 import BeautifulSoup

try:
    import feedparser
except Exception as e:
    raise ImportError("feedparser is required: `pip install feedparser`") from e

try:
    import yfinance as yf
except Exception as e:
    raise ImportError("yfinance is required: `pip install yfinance`") from e

def _to_date(obj):
    if isinstance(obj, dt.date):
        return obj
    if isinstance(obj, str):
        return dt.date.fromisoformat(obj.strip())
    raise ValueError("as_of_date must be ISO string 'YYYY-MM-DD' or datetime.date")

def _domain_from_url(url: str) -> str:
    try:
        from urllib.parse import urlparse
        netloc = urlparse(url or "").netloc.lower()
        return netloc.replace("www.","") if netloc else ""
    except Exception:
        return ""

# --- Google News RSS (company-aware) ---
def google_rss_news(ticker: str, as_of_date, limit: int = 10, days_window: int = 120):
    """
    Fetch Google News RSS items mentioning the ticker/company within [as_of_date - window, as_of_date].
    Returns list[ {title, publisher, link, date} ] sorted by date desc.
    """
    asof = _to_date(as_of_date)
    start = asof - dt.timedelta(days=days_window)

    # Try to enrich query with company name
    company = ""
    try:
        info = yf.Ticker(ticker).get_info()
        company = (info.get("shortName") or info.get("longName") or "").strip()
    except Exception:
        pass

    q = " ".join(x for x in [ticker, "stock", company] if x).strip()
    rss_url = f"https://news.google.com/rss/search?q={urllib.parse.quote_plus(q)}&hl=en-US&gl=US&ceid=US:en"
    feed = feedparser.parse(rss_url)

    items = []
    for e in getattr(feed, "entries", []):
        # date
        tt = getattr(e, "published_parsed", None) or getattr(e, "updated_parsed", None)
        if not tt:
            continue
        d = dt.date(tt.tm_year, tt.tm_mon, tt.tm_mday)
        if not (start <= d <= asof):
            continue

        # fields
        title = html.unescape((getattr(e, "title", "") or "").strip())
        link  = (getattr(e, "link", "") or "").strip()
        src   = ""
        if hasattr(e, "source") and getattr(e, "source"):
            src = getattr(e.source, "title", "") or ""
        if not src:
            src = _domain_from_url(link)
        if not title:
            continue

        items.append({
            "title": title,
            "publisher": (src or "Google RSS").strip(),
            "link": link,
            "date": str(d),
        })
        if len(items) >= limit:
            break

    # sort desc by date
    items.sort(key=lambda x: x["date"], reverse=True)
    return items

# --- Combine SEC items + Google RSS, dedup, sort ---
def news_combined_tool(ticker: str, as_of_date, sec_limit: int = 10, rss_limit: int = 10, days_window: int = 120):
    """
    Merge SEC 8-K/6-K 'news' with Google RSS, deduplicate by link/title+date, sort by date desc.
    Returns list[ {title, publisher, link, date, source} ].
    """
    # SEC items: prefer sec_news_tool if present, else news_rss_tool
    sec_items = []
    try:
        if "sec_news_tool" in globals():
            sec_items = sec_news_tool(ticker, str(_to_date(as_of_date)), limit=sec_limit, days_window=days_window) or []
        elif "news_rss_tool" in globals():
            sec_items = news_rss_tool(ticker, str(_to_date(as_of_date)), limit=sec_limit) or []
        for it in sec_items:
            it["source"] = it.get("source") or "SEC"
    except Exception:
        sec_items = []

    # Google RSS
    rss_items = []
    try:
        rss_items = google_rss_news(ticker, _to_date(as_of_date), limit=rss_limit, days_window=days_window) or []
        for it in rss_items:
            it["source"] = it.get("source") or "GoogleRSS"
    except Exception:
        rss_items = []

    # Dedup by link (preferred) or normalized title+date
    def _key(it):
        link = (it.get("link") or "").strip().lower()
        if link:
            return ("link", link)
        title = re.sub(r"\s+", " ", (it.get("title") or "").strip().lower())
        d = it.get("date") or ""
        return ("title_date", f"{title}|{d}")

    seen, merged = set(), []
    for it in (sec_items + rss_items):
        k = _key(it)
        if k in seen:
            continue
        seen.add(k)
        merged.append(it)

    merged.sort(key=lambda x: x.get("date", ""), reverse=True)
    return merged

# --- Wrapper for SentimentAgent (accepts either `limit` or per-source limits) ---
def news_combined_from_env(limit: int = None, sec_limit: int = 8, rss_limit: int = 8, days_window: int = 120):
    """
    Wrapper bound to globals: ticker, AS_OF_DATE.
    You may pass either `limit` (applied to both sources) or specific sec_limit/rss_limit.
    """
    if limit is not None:
        sec_limit = rss_limit = int(limit)
    return news_combined_tool(ticker, AS_OF_DATE, sec_limit=sec_limit, rss_limit=rss_limit, days_window=days_window)

# Register on SentimentAgent (adds a new callable; won't remove the old one)
try:
    sentiment_agent.register_function(function_map={"news_combined_from_env": news_combined_from_env})
    print("Registered: news_combined_from_env() on SentimentAgent ✅")
except Exception as e:
    print("Warning: could not register on sentiment_agent (define it first). Error:", e)

# --- Quick smoke test (optional) ---
try:
    preview = news_combined_from_env(limit=4)
    print(f"Combined news preview: {len(preview)} items "
          f"(SEC+RSS). Sample:")
    print(json.dumps(preview[:4], indent=2))
except Exception as e:
    print("Preview error:", e)


Combined news preview: 4 items (SEC+RSS). Sample:
[
  {
    "title": "Jim Cramer Says Apple Inc. (AAPL)\u2019s $600 Billion Investment In US Could Make Trump Ask Others To Do The Same - Yahoo Finance",
    "publisher": "Yahoo Finance",
    "link": "https://news.google.com/rss/articles/CBMie0FVX3lxTE9ud3YxNlpEQjlsVk5yZUNzeXdwcm5tY1cxNlNJQllVS3dpaHFkVWpDYloyNVNfOWQxTnN6UXlfUDF6OU9mZVQzZDk4RGZJUS1XZTJrdkhkaVdncW1oY2pCdXhaejg4Nm9QWXotOGQ2WXgzbElubk14TngzMA?oc=5",
    "date": "2025-09-06",
    "source": "SEC"
  },
  {
    "title": "Can Apple stock recover post Google\u2019s court win (AAPL:NASDAQ) - Seeking Alpha",
    "publisher": "Seeking Alpha",
    "link": "https://news.google.com/rss/articles/CBMijAFBVV95cUxQUDBIcWlQbU1vTTdQcUhCLXhBbjhDR2RMamE3VlJUVktpcFRKQTVYUjB5S2JSZ1VnYjFKRTEzSmFCOFJTb3J2T25EbkZYMk9zaEMtdmozWmdRNHZLVVk0X2ZmMVd3U1BkbXZ2U1BYRkZLU3RhNFRtbzYyOHA3ay1YVkJaeExUYnBLazh0Nw?oc=5",
    "date": "2025-09-06",
    "source": "SEC"
  },
  {
    "title": "Apple's Big Event On Septem

### Build the four agents and wire up tools
What’s configured:
- Four AutoGen agents with focused system prompts:
  - **ValuationAgent**: only interprets the seeded valuation JSON (annualized return/vol only).
  - **SentimentAgent**: summarizes tone/risks from the combined SEC+news list.
  - **FundamentalAgent**: pulls 10-K/10-Q snippets and runs mini-RAG on them.
  - **Coordinator**: orchestrates the debate and synthesizes the final decision.
- Registers tool wrappers so agents can call:
  - valuation metrics from the precomputed environment,
  - combined news (`news_combined_from_env`) and/or SEC news,
  - filings fetch + `rag_search_tool`.
Result:
- A clean, tool-aligned agent team ready for a seeded Round-2 debate.


In [7]:
# === Build agents (clean + tool-aligned) ===

VAL_SYS = (
    "You are the ValuationAgent. Use ONLY the seeded JSON above (no recalculation, no new tools). "
    "Echo the exact numeric values from your JSON. If 'ann_return_str'/'ann_vol_str' exist, echo those EXACT strings; "
    "otherwise echo 'ann_return' and 'ann_vol' formatted to 3 dp. If a value is missing, say 'N/A'. "
    "Report/interpret annualized return and annualized volatility only; do NOT mention P/E, P/B, or EV/EBITDA. "
    "Formulas (for reference only): R_annualized = (1+R_cumulative)^(252/n) - 1; sigma_annualized = sigma_daily*sqrt(252)."
)

SENT_SYS = (
    "You are the SentimentAgent. Use the provided news tool(s) to summarize tone and risks from SEC + news. "
    "Do not invent items—only use the returned list."
)
FUND_SYS = (
    "You are the FundamentalAgent. Use filings_from_env to fetch 10-K/10-Q snippets, and rag_search_tool over those "
    "snippets to discuss cash flow [CF], margins [GM], and risks [R]. Do not make up values."
)
COORD_SYS = (
    "You are the Coordinator. You orchestrate the debate and later synthesize the final report. "
    "Do not call tools yourself."
)

valuation_agent   = ConversableAgent(name="ValuationAgent",   system_message=VAL_SYS,  llm_config=llm_config, human_input_mode="NEVER")
sentiment_agent   = ConversableAgent(name="SentimentAgent",   system_message=SENT_SYS,  llm_config=llm_config, human_input_mode="NEVER")
fundamental_agent = ConversableAgent(name="FundamentalAgent", system_message=FUND_SYS,  llm_config=llm_config, human_input_mode="NEVER")
coordinator       = ConversableAgent(name="Coordinator",      system_message=COORD_SYS, llm_config=llm_config, human_input_mode="NEVER")

# === Tool wrappers (no globals(); use ticker/AS_OF_DATE already defined in your notebook) ===

def val_metrics_from_env():
    """Compute ann_return/ann_vol from market data for the global `ticker` as of `AS_OF_DATE`."""
    closes = fetch_closes(ticker, AS_OF_DATE)  # your helper returns a Series/DataFrame/list
    return compute_valuation_metrics_from_closes(closes)

# Prefer the combined SEC+Google tool; also expose an alias so prompts that say 'sec_news_from_env' still work.
def news_from_env(limit: int = 10, days_window: int = 120):
    return news_combined_tool(ticker, AS_OF_DATE, sec_limit=limit, rss_limit=limit, days_window=days_window)

def sec_news_from_env(limit: int = 10):
    # alias so older prompts still function; routes to combined feed
    return news_from_env(limit=limit)

def filings_from_env(forms=("10-K","10-Q")):
    return edgar_fetch_tool(ticker, str(AS_OF_DATE), forms=forms)

# === Register tools on the right agents ===
valuation_agent.register_function(function_map={"val_metrics_from_env": val_metrics_from_env})

# Register the combined news tool (and keep the legacy name, in case your prompt calls it)
sentiment_agent.register_function(function_map={
    "news_combined_from_env": lambda limit=10, days_window=120: news_from_env(limit=limit, days_window=days_window),
    "sec_news_from_env": sec_news_from_env,
})

# Fundamentals: filings + RAG
fundamental_agent.register_function(function_map={
    "filings_from_env": filings_from_env,
    "rag_search_tool": rag_search_tool,   # you already defined this helper earlier
})

print("Agents and tools ready.")


Agents and tools ready.


In [8]:
import datetime as dt
AS_OF_DATE = dt.date.today()
ticker = 'AAPL'

### Run a single-ticker debate with **seeded facts** and **no in-chat tool calls**
Highlights:
- Re-defines minimal fallbacks (valuation, Google RSS, etc.) so the cell is self-contained even if you edited earlier helpers.
- **Seeds Turn-1** with JSON from tools run **outside** the chat (ground truth).
- **Suspends tools** for all agents during Round-2 to prevent new/hallucinated calls.
- Builds a `GroupChat` and runs up to a few rounds with round-robin speakers.
- Extracts a strict `JSON_DECISION` of the form `{"decision":"BUY|SELL|HOLD","confidence":0.5–0.9}`.
- Restores the agents’ tool maps afterward.
Why this pattern:
- Keeps the debate on the same factual base and makes the output easy to parse for downstream use.


In [9]:
# === Pre-seeded Round-2 debate with tool SUSPENSION (drop-in) ===
# - Restores robust helpers (fetch_closes -> list, valuation metrics)
# - Falls back for news/filings if you accidentally deleted those funcs
# - Seeds Turn-1 JSON from tools run in Python
# - Suspends in-chat tools so Round-2 cannot hallucinate new calls
# - Extracts a clean JSON_DECISION (BUY/SELL/HOLD only)

import re, json, math, datetime as dt
import numpy as np
import pandas as pd
import yfinance as yf

# ---------- Robust helpers (safe to redefine) ----------
def fetch_closes(ticker: str, as_of: dt.date, lookback_days: int = 365*2, n_keep: int = 260):
    """
    Returns a 1-D list[float] of Close prices up to (and including) as_of.
    Works even if yfinance returns a DataFrame or Series.
    """
    raw = yf.download(
        ticker,
        start=as_of - dt.timedelta(days=lookback_days),
        end=as_of + dt.timedelta(days=1),
        auto_adjust=True,
        progress=False,
    )
    if not isinstance(raw, pd.DataFrame) or 'Close' not in raw.columns:
        raise ValueError("yfinance returned no 'Close' column for ticker=%s" % ticker)
    px = raw.loc[:pd.Timestamp(as_of), 'Close'].dropna()
    # Normalize to list[float]
    if isinstance(px, pd.Series):
        data = px.astype(float).tolist()
    elif isinstance(px, pd.DataFrame):
        # squeeze 1-col df -> series -> list
        data = px.squeeze().astype(float).tolist()
    else:
        data = list(map(float, list(px)))
    return data[-n_keep:]

def compute_valuation_metrics_from_closes(close_list):
    """
    Given list[float] of prices (chronological), compute annualized return/volatility.
    """
    arr = pd.Series(close_list, dtype=float).dropna()
    if len(arr) < 2:
        return {"ann_return": 0.0, "ann_vol": 0.0, "n": int(arr.size)}
    rets = arr.pct_change().dropna()
    n = rets.size
    mu = rets.mean()
    sigma = rets.std(ddof=1) if n > 1 else 0.0
    ann_return = (1 + mu) ** 252 - 1
    ann_vol = sigma * math.sqrt(252)
    return {"ann_return": float(ann_return), "ann_vol": float(ann_vol), "n": int(n)}

# ---------- Minimal fallbacks if you deleted your news/filings tools ----------
# Google News RSS (fallback) - used if you don't have your SEC/combined tools anymore
try:
    import feedparser, urllib.parse
except Exception:
    feedparser = None

def _google_rss_fallback(ticker: str, as_of_date: str, limit: int = 10, days_window: int = 120):
    if feedparser is None:
        return []
    # Try to enrich with company name (optional)
    try:
        info = yf.Ticker(ticker).get_info()
        company = info.get("shortName") or info.get("longName") or ""
    except Exception:
        company = ""
    q = " ".join(x for x in [ticker, "stock", company] if x).strip()
    rss_url = f"https://news.google.com/rss/search?q={urllib.parse.quote_plus(q)}&hl=en-US&gl=US&ceid=US:en"
    feed = feedparser.parse(rss_url)
    asof = dt.date.fromisoformat(as_of_date)
    start = asof - dt.timedelta(days=days_window)
    out = []
    for e in getattr(feed, "entries", []):
        pub = getattr(e, "published_parsed", None) or getattr(e, "updated_parsed", None)
        if not pub:
            continue
        d = dt.date(pub.tm_year, pub.tm_mon, pub.tm_mday)
        if start <= d <= asof:
            title = getattr(e, "title", "") or ""
            link  = getattr(e, "link", "") or ""
            src = getattr(getattr(e, "source", None), "title", "") if hasattr(e, "source") else ""
            if title:
                out.append({"title": title, "publisher": src or "Google RSS", "link": link, "date": str(d)})
        if len(out) >= limit:
            break
    out.sort(key=lambda x: x["date"], reverse=True)
    return out

# Provide a simple, safe wrapper name the rest of your notebook expects
if 'news_rss_tool' not in globals():
    def news_rss_tool(ticker: str, as_of_date: str, limit: int = 10):
        return _google_rss_fallback(ticker, as_of_date, limit=limit, days_window=120)

# Filings fallback (no-op) if you removed your SEC parser; returns [] but keeps pipeline running
if 'edgar_fetch_tool' not in globals():
    def edgar_fetch_tool(ticker: str, as_of_date: str, forms=("10-K","10-Q")):
        return []

# rag_search_tool fallback (identity) if missing
if 'rag_search_tool' not in globals():
    def rag_search_tool(query: str, docs: list, k: int = 2):
        return (docs or [])[:k]

# ---------- Sanity checks (agents + llm) ----------
assert 'valuation_agent'   in globals(), "valuation_agent not initialized."
assert 'sentiment_agent'   in globals(), "sentiment_agent not initialized."
assert 'fundamental_agent' in globals(), "fundamental_agent not initialized."
assert 'coordinator'       in globals(), "coordinator not initialized."
assert 'llm_config'        in globals(), "llm_config not set."

# Ticker/as-of (use your existing globals if set)
ticker = globals().get('ticker', 'AAPL')
AS_OF_DATE = globals().get('AS_OF_DATE', dt.date.today())

# ---------- 0) Run tools OUTSIDE the chat (ground truth) ----------
closes = fetch_closes(ticker, AS_OF_DATE)
val_json_raw = compute_valuation_metrics_from_closes(closes)

# >>> Added: format the seeded valuation JSON with fixed strings the agent can echo
def _fmt3(x):
    return "N/A" if x is None else f"{x:.3f}"

val_json = {
    **val_json_raw,
    "ann_return_str": _fmt3(val_json_raw.get("ann_return")),
    "ann_vol_str":    _fmt3(val_json_raw.get("ann_vol")),
}

# Prefer combined tool if you have it; else pure RSS fallback above
if 'news_combined_tool' in globals():
    news_json = news_combined_tool(ticker, str(AS_OF_DATE), sec_limit=8, rss_limit=8, days_window=120)
else:
    news_json = news_rss_tool(ticker, str(AS_OF_DATE), limit=10)

filings_snips = edgar_fetch_tool(ticker, str(AS_OF_DATE), forms=("10-K","10-Q"))

rag_samples = {}
try:
    if filings_snips:
        rag_samples = {
            "cash_flow": [s[:400] for s in rag_search_tool("cash flow from operations", filings_snips, k=2)],
            "margins":   [s[:400] for s in rag_search_tool("gross margin trend", filings_snips, k=2)],
            "risks":     [s[:400] for s in rag_search_tool("risk factors", filings_snips, k=2)],
        }
except Exception:
    pass

print("Seed data —")
print("val_json=", val_json)
print("news_items=", len(news_json))
print("filing_snips=", len(filings_snips))

# ---------- 1) Seed Turn-1 JSON from each agent ----------
seed_msgs = [
    {"name": "ValuationAgent",   "content": json.dumps(val_json)},
    {"name": "SentimentAgent",   "content": json.dumps(news_json)},
    {"name": "FundamentalAgent", "content": json.dumps({"filings_snippets": filings_snips, "rag_samples": rag_samples})},
]

# ---------- 2) Temporarily DISABLE in-chat tools for Round-2 ----------
def _suspend_tools(agent):
    old = getattr(agent, "_function_map", {}).copy() if hasattr(agent, "_function_map") else {}
    if hasattr(agent, "_function_map"):
        agent._function_map.clear()
    return old

def _restore_tools(agent, old):
    if hasattr(agent, "_function_map"):
        agent._function_map = old

_sav_val = _suspend_tools(valuation_agent)
_sav_sen = _suspend_tools(sentiment_agent)
_sav_fun = _suspend_tools(fundamental_agent)

# ---------- 3) Build GroupChat & run Round-2 only (interpretations + decisions) ----------
from autogen import GroupChat, GroupChatManager

groupchat = GroupChat(
    agents=[valuation_agent, sentiment_agent, fundamental_agent, coordinator],
    messages=seed_msgs,
    max_round=5,
    speaker_selection_method="round_robin",
)
manager = GroupChatManager(groupchat=groupchat, llm_config=llm_config)

task = (
f"Round 2 only for {ticker} as of {AS_OF_DATE}.\n"
"Each agent must interpret ONLY the JSON provided for them below. Do NOT call tools.\n"
"IMPORTANT ENFORCEMENT:\n"
"- Output ONLY content for your own role. Do not include any section headers or text for other agents.\n"
"- If a required format element is missing, write 'N/A' exactly as instructed.\n"
"- Confidence must be between 0.5 and 0.9 (not 0.0).\n\n"

"--- ValuationAgent JSON ---\n"
f"```json\n{json.dumps(val_json, ensure_ascii=False)}\n```\n"
"ValuationAgent:\n"
"- Echo the exact strings if present: ann_return_str and ann_vol_str (verbatim). If they are absent, echo ann_return and ann_vol rounded to exactly 3 dp.\n"
"- Then 1 short paragraph (≤120 words) interpreting risk/return.\n"
'- End with exactly one line: DECISION_JSON: {"stance":"BUY|SELL|HOLD","confidence":0.5-0.9}\n'
"- Do not include content labeled for SentimentAgent or FundamentalAgent.\n\n"

"--- SentimentAgent JSON ---\n"
f"```json\n{json.dumps(news_json, ensure_ascii=False)[:4000]}\n```\n"
"SentimentAgent:\n"
"- Bullet the top 3 items *copied from the JSON above* as: (date — publisher — short title). If fewer than 3 exist, bullet what you have.\n"
"- Then 1 short paragraph (≤120 words) summarizing tone/risks.\n"
'- End with exactly one line: DECISION_JSON: {"stance":"BUY|SELL|HOLD","confidence":0.5-0.9}\n'
"- Do not include content labeled for ValuationAgent or FundamentalAgent.\n\n"

"--- FundamentalAgent JSON ---\n"
f"```json\n{json.dumps({'filings_snippets': filings_snippets, 'rag_samples': rag_samples}, ensure_ascii=False)[:4000]}\n```\n"
"FundamentalAgent:\n"
"- Use ONLY the JSON above. If filings_snippets is empty, your first line MUST be exactly: [CF]=N/A, [GM]=N/A, [R]=N/A\n"
"- Otherwise discuss [CF],[GM],[R] in ≤120 words using only those snippets/samples (no invented values).\n"
'- End with exactly one line: DECISION_JSON: {"stance":"BUY|SELL|HOLD","confidence":0.5-0.9}\n'
"- Do not include content labeled for ValuationAgent or SentimentAgent.\n\n"

"Coordinator:\n"
"- After all three have spoken, synthesize a concise final report (≤120 words) and end with exactly one line:\n"
'  JSON_DECISION: {"decision":"BUY|SELL|HOLD","confidence":0.5-0.9}\n'
"- Do NOT call tools."
)



_ = coordinator.initiate_chat(manager, message=task)

# ---------- 4) Restore tools ----------
_restore_tools(valuation_agent, _sav_val)
_restore_tools(sentiment_agent, _sav_sen)
_restore_tools(fundamental_agent, _sav_fun)

# ---------- 5) (Optional) Show transcript ----------
print("\n--- Transcript ---")
for i, m in enumerate(manager.groupchat.messages, 1):
    who = m.get("name") or m.get("role")
    print(f"[{i:02d}] {who}\n{str(m.get('content'))[:1000]}\n")

# ---------- 6) Robust final JSON extraction ----------
ALLOWED = {"BUY","SELL","HOLD"}

def _clean_json(s):
    return (s or "").strip().replace("“", '"').replace("”", '"').replace("’", "'").replace("‘", "'")

def _parse_json(s):
    try: return json.loads(_clean_json(s))
    except Exception: return None

def _extract_json_decision_from_text(text):
    if not text: return None
    # Prefer Coordinator JSON_DECISION
    m = re.search(r'JSON_DECISION:\s*(?:```json\s*)?({.*?})(?:\s*```)?', text, flags=re.I|re.S)
    if m:
        j = _parse_json(m.group(1))
        if j:
            dec = str(j.get("decision") or j.get("stance") or "").upper()
            if dec in ALLOWED and "|" not in dec:
                try: conf = float(j.get("confidence", 0.0))
                except Exception: conf = 0.5
                return {"decision": dec, "confidence": conf}
    # Fallback: any json with decision/stance
    for mm in re.finditer(r'(?:```json\s*)?({[^{}]*"(?:decision|stance)"[^{}]*})(?:\s*```)?', text, flags=re.I|re.S):
        j = _parse_json(mm.group(1));
        if not j: continue
        dec = str(j.get("decision") or j.get("stance") or "").upper()
        if dec in ALLOWED and "|" not in dec:
            try: conf = float(j.get("confidence", 0.0))
            except Exception: conf = 0.5
            return {"decision": dec, "confidence": conf}
    return None

def _votes_consensus(msgs):
    votes = []
    agent_re = re.compile(r'DECISION_JSON:\s*(?:```json\s*)?({.*?})(?:\s*```)?', re.I|re.S)
    for m in msgs:
        who = m.get("name") or ""
        if who not in ("ValuationAgent","SentimentAgent","FundamentalAgent"):
            continue
        mm = agent_re.search(str(m.get("content") or ""))
        if not mm: continue
        j = _parse_json(mm.group(1));
        if not j: continue
        dec = str(j.get("stance") or j.get("decision") or "").upper()
        if dec in ALLOWED and "|" not in dec:
            try: conf = float(j.get("confidence", 0.0))
            except Exception: conf = 0.0
            votes.append((dec, conf))
    if not votes: return None
    from collections import Counter
    counts = Counter([d for d,_ in votes])
    top = counts.most_common()
    decision = "HOLD" if len(top)>1 and top[0][1]==top[1][1] else top[0][0]
    confs = [c for d,c in votes if d==decision] or [c for _,c in votes]
    return {"decision": decision, "confidence": round(sum(confs)/len(confs), 2)}

msgs = list(manager.groupchat.messages)
final_obj = {"decision":"HOLD","confidence":0.5}

# Prefer last Coordinator JSON_DECISION
coord_msgs = [m for m in msgs if m.get("name")=="Coordinator"]
picked = None
for m in reversed(coord_msgs):
    picked = _extract_json_decision_from_text(str(m.get("content") or ""))
    if picked: final_obj = picked; break

# Else scan all; else derive from votes
if not picked:
    for m in reversed(msgs):
        picked = _extract_json_decision_from_text(str(m.get("content") or ""))
        if picked: final_obj = picked; break
if not picked:
    voted = _votes_consensus(msgs)
    if voted: final_obj = voted

# Fallback if confidence ended up as 0.0 → derive from agent votes
if float(final_obj.get("confidence", 0.0)) == 0.0:
    ALLOWED = {"BUY","SELL","HOLD"}
    agent_pattern = re.compile(r'DECISION_JSON:\s*(?:```json\s*)?({.*?})(?:\s*```)?', re.I|re.S)
    votes = []
    for m in msgs:
        who = m.get("name") or ""
        if who not in ("ValuationAgent","SentimentAgent","FundamentalAgent"):
            continue
        mm = agent_pattern.search(str(m.get("content") or ""))
        if not mm:
            continue
        try:
            j = json.loads(mm.group(1))
        except Exception:
            continue
        stance = str(j.get("stance") or j.get("decision") or "").upper()
        conf   = float(j.get("confidence") or 0.0)
        if stance in ALLOWED and conf > 0.0:
            votes.append((stance, conf))
    if votes:
        from collections import Counter
        counts = Counter([s for s,_ in votes]).most_common()
        decision = counts[0][0]
        confs = [c for s,c in votes if s == decision] or [c for _,c in votes]
        final_obj = {"decision": decision, "confidence": round(sum(confs)/len(confs), 2)}
    else:
        # Sensible defaults if everyone gave 0.0:
        # unanimous stance → 0.65, mixed → 0.55
        stances = []
        for m in msgs:
            mm = agent_pattern.search(str(m.get("content") or ""))
            if not mm:
                continue
            try:
                j = json.loads(mm.group(1))
                stances.append(str(j.get("stance") or j.get("decision") or "").upper())
            except Exception:
                pass
        uniq = set([s for s in stances if s in ALLOWED])
        final_obj["confidence"] = 0.65 if len(uniq) == 1 and uniq else 0.55

# Sanity checks after the run
msgs = list(manager.groupchat.messages)

# 1) Valuation must echo the seeded strings if present
val_msg_texts = [str(m.get("content") or "") for m in msgs if m.get("name")=="ValuationAgent"]
valuation_echo_ok = True
if "ann_return_str" in val_json and "ann_vol_str" in val_json:
    want_r, want_v = val_json["ann_return_str"], val_json["ann_vol_str"]
    valuation_echo_ok = any((want_r in t and want_v in t) for t in val_msg_texts)

# 2) Fundamental must output explicit N/A line when no filings
fundamental_ok = True
if not filings_snippets:
    import re
    na_line = re.compile(r"\[CF\]\s*=\s*N/?A\s*,\s*\[GM\]\s*=\s*N/?A\s*,\s*\[R\]\s*=\s*N/?A", re.I)
    fund_texts = [str(m.get("content") or "") for m in msgs if m.get("name")=="FundamentalAgent"]
    fundamental_ok = any(na_line.search(t) for t in fund_texts)

print("CHECKS — valuation_echo_ok:", valuation_echo_ok, "| fundamentals_respected_empty:", fundamental_ok)


print("FINAL:", final_obj)


Seed data —
val_json= {'ann_return': 0.12528546739537183, 'ann_vol': 0.318658153567929, 'n': 259, 'ann_return_str': '0.125', 'ann_vol_str': '0.319'}
news_items= 8
filing_snips= 0
Coordinator (to chat_manager):

Round 2 only for AAPL as of 2025-09-07.
Each agent must interpret ONLY the JSON provided for them below. Do NOT call tools.
IMPORTANT ENFORCEMENT:
- Output ONLY content for your own role. Do not include any section headers or text for other agents.
- If a required format element is missing, write 'N/A' exactly as instructed.
- Confidence must be between 0.5 and 0.9 (not 0.0).

--- ValuationAgent JSON ---
```json
{"ann_return": 0.12528546739537183, "ann_vol": 0.318658153567929, "n": 259, "ann_return_str": "0.125", "ann_vol_str": "0.319"}
```
ValuationAgent:
- Echo the exact strings if present: ann_return_str and ann_vol_str (verbatim). If they are absent, echo ann_return and ann_vol rounded to exactly 3 dp.
- Then 1 short paragraph (≤120 words) interpreting risk/return.
- End wit

### Debate a small ticker universe and collect votes
What happens here:
- Defines a tiny `UNIVERSE` (e.g., `['INTC','NFLX','PANW']`) to keep runtime reasonable.
- For each ticker:
  - Prepares ground-truth data (valuation, news, filings + RAG),
  - Seeds Round-1 messages,
  - **Suspends tools** and runs the Round-2 debate,
  - Parses agent votes and coordinator decision into a compact record.
- Builds a `DataFrame` of `{ticker, decision, confidence}` and displays it.
Output:
- A quick summary table you can persist to CSV or feed into later analysis.


In [11]:
# === Debate a small universe (pre-seeded Round-2, no in-chat tools) ===
from tqdm.auto import tqdm
import json, re, math, datetime as dt
import pandas as pd

# --- tiny helpers reused locally so this cell is self-contained ---

ALLOWED = {"BUY","SELL","HOLD"}

def _suspend_tools(agent):
    old = getattr(agent, "_function_map", {}).copy() if hasattr(agent, "_function_map") else {}
    if hasattr(agent, "_function_map"):
        agent._function_map.clear()
    return old

def _restore_tools(agent, old):
    if hasattr(agent, "_function_map"):
        agent._function_map = old

def _clean_json(s):
    return (s or "").strip().replace("“", '"').replace("”", '"').replace("’", "'").replace("‘", "'")

def _parse_json(s):
    try: return json.loads(_clean_json(s))
    except Exception: return None

def _extract_json_decision_from_text(text):
    if not text: return None
    m = re.search(r'JSON_DECISION:\s*(?:```json\s*)?({.*?})(?:\s*```)?', text, flags=re.I|re.S)
    if m:
        j = _parse_json(m.group(1))
        if j:
            dec = str(j.get("decision") or j.get("stance") or "").upper()
            if dec in ALLOWED and "|" not in dec:
                try: conf = float(j.get("confidence", 0.0))
                except Exception: conf = 0.5
                return {"decision": dec, "confidence": conf}
    for mm in re.finditer(r'(?:```json\s*)?({[^{}]*"(?:decision|stance)"[^{}]*})(?:\s*```)?', text, flags=re.I|re.S):
        j = _parse_json(mm.group(1))
        if not j: continue
        dec = str(j.get("decision") or j.get("stance") or "").upper()
        if dec in ALLOWED and "|" not in dec:
            try: conf = float(j.get("confidence", 0.0))
            except Exception: conf = 0.5
            return {"decision": dec, "confidence": conf}
    return None

def _votes_consensus(msgs):
    votes = []
    agent_re = re.compile(r'DECISION_JSON:\s*(?:```json\s*)?({.*?})(?:\s*```)?', re.I|re.S)
    for m in msgs:
        who = m.get("name") or ""
        if who not in ("ValuationAgent","SentimentAgent","FundamentalAgent"):
            continue
        mm = agent_re.search(str(m.get("content") or ""))
        if not mm: continue
        j = _parse_json(mm.group(1))
        if not j: continue
        dec = str(j.get("stance") or j.get("decision") or "").upper()
        if dec in ALLOWED and "|" not in dec:
            try: conf = float(j.get("confidence", 0.0))
            except Exception: conf = 0.0
            votes.append((dec, conf))
    if not votes: return None
    from collections import Counter
    counts = Counter([d for d,_ in votes]).most_common()
    decision = "HOLD" if (len(counts) > 1 and counts[0][1] == counts[1][1]) else counts[0][0]
    confs = [c for d,c in votes if d == decision] or [c for _,c in votes]
    return {"decision": decision, "confidence": round(sum(confs)/len(confs), 2)}

def _format_val_strings(val_json):
    # add string echoes for 3dp, used by ValuationAgent
    if isinstance(val_json.get("ann_return"), (int, float)):
        val_json["ann_return_str"] = f"{float(val_json['ann_return']):.3f}"
    if isinstance(val_json.get("ann_vol"), (int, float)):
        val_json["ann_vol_str"] = f"{float(val_json['ann_vol']):.3f}"
    return val_json

def debate_one_ticker(tk: str, as_of=AS_OF_DATE):
    # 0) Run tools OUTSIDE the chat
    closes = fetch_closes(tk, as_of)
    val_json = compute_valuation_metrics_from_closes(closes)
    val_json = _format_val_strings(val_json)

    if "news_combined_tool" in globals():
        news_json = news_combined_tool(tk, as_of, sec_limit=8, rss_limit=8, days_window=120)
    else:
        news_json = news_rss_tool(tk, str(as_of), limit=10)

    filings_snips = edgar_fetch_tool(tk, str(as_of), forms=("10-K","10-Q"))
    rag_samples = {}
    try:
        if filings_snips:
            rag_samples = {
                "cash_flow": [s[:400] for s in (rag_search_tool("cash flow from operations", filings_snips, k=2) or [])],
                "margins":   [s[:400] for s in (rag_search_tool("gross margin trend", filings_snips, k=2) or [])],
                "risks":     [s[:400] for s in (rag_search_tool("risk factors", filings_snips, k=2) or [])],
            }
    except Exception:
        rag_samples = {}

    # 1) Seed Turn-1 JSON
    seed_msgs = [
        {"name": "ValuationAgent",   "content": json.dumps(val_json)},
        {"name": "SentimentAgent",   "content": json.dumps(news_json)},
        {"name": "FundamentalAgent", "content": json.dumps({"filings_snippets": filings_snips, "rag_samples": rag_samples})},
    ]

    # 2) Suspend tools
    _sav_val = _suspend_tools(valuation_agent)
    _sav_sen = _suspend_tools(sentiment_agent)
    _sav_fun = _suspend_tools(fundamental_agent)

    # 3) Debate (Round-2 only)
    gc = GroupChat(
        agents=[valuation_agent, sentiment_agent, fundamental_agent, coordinator],
        messages=seed_msgs,
        max_round=5,
        speaker_selection_method="round_robin",
    )
    mgr = GroupChatManager(groupchat=gc, llm_config=llm_config)

    task = (
        f"Round 2 only for {tk} as of {as_of}.\n"
        "Each agent must interpret ONLY the JSON provided for them below. Do NOT call tools.\n"
        "IMPORTANT:\n"
        "- Output ONLY your own role's content. Confidence must be 0.5–0.9 (not 0.0).\n\n"
        "--- ValuationAgent JSON ---\n```json\n" + json.dumps(val_json) + "\n```\n"
        "ValuationAgent:\n- Echo ann_return_str and ann_vol_str verbatim if present; else echo ann_return/ann_vol to 3 dp.\n"
        "- ≤120 words. End with: DECISION_JSON: {\"stance\":\"BUY|SELL|HOLD\",\"confidence\":0.5-0.9}\n\n"
        "--- SentimentAgent JSON ---\n```json\n" + json.dumps(news_json) + "\n```\n"
        "SentimentAgent:\n- Bullet top 3 as (date — publisher — short title). ≤120 words summary.\n"
        "- End with: DECISION_JSON: {\"stance\":\"BUY|SELL|HOLD\",\"confidence\":0.5-0.9}\n\n"
        "--- FundamentalAgent JSON ---\n```json\n" + json.dumps({"filings_snippets": filings_snips, "rag_samples": rag_samples}) + "\n```\n"
        "FundamentalAgent:\n- If filings_snippets is empty, first line must be: [CF]=N/A, [GM]=N/A, [R]=N/A\n"
        "- Else discuss [CF],[GM],[R] using only snippets. End with DECISION_JSON (0.5–0.9).\n\n"
        "Coordinator:\n- Synthesize (≤120 words). End with exactly one line:\n"
        '  JSON_DECISION: {"decision":"BUY|SELL|HOLD","confidence":0.5-0.9}\n'
        "- Do NOT call tools."
    )
    _ = coordinator.initiate_chat(mgr, message=task)

    # 4) Restore tools
    _restore_tools(valuation_agent, _sav_val)
    _restore_tools(sentiment_agent, _sav_sen)
    _restore_tools(fundamental_agent, _sav_fun)

    # 5) Extract final decision (prefer Coordinator; fall back to votes)
    msgs = list(mgr.groupchat.messages)
    final_obj = {"decision": "HOLD", "confidence": 0.5}
    coord_msgs = [m for m in msgs if m.get("name") == "Coordinator"]
    picked = None
    for m in reversed(coord_msgs):
        picked = _extract_json_decision_from_text(str(m.get("content") or ""))
        if picked: final_obj = picked; break
    if not picked:
        for m in reversed(msgs):
            picked = _extract_json_decision_from_text(str(m.get("content") or ""))
            if picked: final_obj = picked; break
    if (not picked) or final_obj.get("confidence", 0.0) == 0.0:
        voted = _votes_consensus(msgs)
        if voted: final_obj = voted

    # Ensure numeric
    try:
        final_obj["confidence"] = float(final_obj.get("confidence", 0.6))
    except Exception:
        final_obj["confidence"] = 0.6

    return final_obj, msgs

# === Run over your universe ===
UNIVERSE = ['INTC','NFLX','PANW']  # keep small for speed

records = []
for tk in tqdm(UNIVERSE, desc='Debating universe'):
    final, _msgs = debate_one_ticker(tk, AS_OF_DATE)
    records.append({
        "ticker": tk,
        "decision": final["decision"],
        "confidence": round(float(final["confidence"]), 2),
    })

decisions_df = pd.DataFrame(records)
display(decisions_df)


Debating universe:   0%|          | 0/3 [00:00<?, ?it/s]

Coordinator (to chat_manager):

Round 2 only for INTC as of 2025-09-07.
Each agent must interpret ONLY the JSON provided for them below. Do NOT call tools.
IMPORTANT:
- Output ONLY your own role's content. Confidence must be 0.5–0.9 (not 0.0).

--- ValuationAgent JSON ---
```json
{"ann_return": 0.4313782233713377, "ann_vol": 0.5845589180522366, "n": 259, "ann_return_str": "0.431", "ann_vol_str": "0.585"}
```
ValuationAgent:
- Echo ann_return_str and ann_vol_str verbatim if present; else echo ann_return/ann_vol to 3 dp.
- ≤120 words. End with: DECISION_JSON: {"stance":"BUY|SELL|HOLD","confidence":0.5-0.9}

--- SentimentAgent JSON ---
```json
[{"title": "Intel Corporation Stock (INTC) Opinions on U.S. Government Stake Acquisition - Quiver Quantitative", "publisher": "Quiver Quantitative", "link": "https://news.google.com/rss/articles/CBMiswFBVV95cUxQMS1IbXhETnpLLXV0ZGp3WV9qcmxnNndOUGVxS0RyQ2UybkxkeDlRTUJrWTVqUUpJNGtzNkZzNmJaSXFMYW9YcnJsSTNndV84bDhXbEF4N1BUYUlyN25mdENlYUx5akFPNXF4UFVJcGJu

Unnamed: 0,ticker,decision,confidence
0,INTC,HOLD,0.7
1,NFLX,HOLD,0.6
2,PANW,HOLD,0.7


In [None]:
# # Forward 4 months performance plot. Evaluation. To test this it is best to use an as-of-date in the past
# start_forward = AS_OF_DATE + dt.timedelta(days=1)
# EVAL_END_DATE = AS_OF_DATE + dt.timedelta(days=120)
# px_universe = yf.download(UNIVERSE + ['SPY'], start=start_forward, end=EVAL_END_DATE, auto_adjust=True, progress=False)['Close']
# rets = px_universe.pct_change().dropna()

# eq_universe = rets[UNIVERSE].mean(axis=1)
# spy = rets['SPY']
# buy_list = decisions_df[decisions_df['decision']=='BUY']['ticker'].tolist()
# strat = rets[buy_list].mean(axis=1) if buy_list else eq_universe*0.0

# def cum(x):
#     return (1+x).cumprod()

# ax = cum(strat).plot(label='Strategy BUYs')
# cum(eq_universe).plot(ax=ax, label='Equal-weight Universe')
# cum(spy).plot(ax=ax, label='SPY')
# ax.set_title('Cumulative Return 4 months forward from as-of date')
# ax.legend()
# plt.show()

## Citation
> Tianjiao Zhao, Jingrao Lyu, Stokes Jones, Harrison Garber, Stefano Pasquali, Dhagash Mehta.  
> **AlphaAgents: Large Language Model based Multi-Agents for Equity Portfolio Constructions.** *arXiv* (2025).  
> arXiv:2508.11152 [q-fin.ST]. https://arxiv.org/abs/2508.11152

[![arXiv](https://img.shields.io/badge/arXiv-2508.11152-b31b1b.svg)](https://arxiv.org/abs/2508.11152)
