# `webai.ticker` — User Guide

`webai.ticker` fans out multiple Tavily search queries, then uses an LLM to synthesise the results into validated Pydantic research objects for **individual tickers** and **market sectors**.

```
webai/ticker.py
│
├── Models
│   ├── TickerResearch        validated snapshot for one stock
│   └── SectorResearch        validated snapshot for one sector
│
├── TickerResearcher(model, ...)
│   ├── research_ticker(symbol, company_name=None)  → TickerResearch
│   └── research_sector(sector)                     → SectorResearch
│
└── TRUSTED_DOMAINS           module-level domain allowlist
```

| Method | Returns | Needs API keys |
|---|---|---|
| `research_ticker` | `TickerResearch` | TAVILY_API_KEY + OPENAI_API_KEY |
| `research_sector` | `SectorResearch` | TAVILY_API_KEY + OPENAI_API_KEY |

**Prerequisites:**
- Sections 1–2 (setup, model schemas) run **without** API keys.
- The live cells in Sections 3–8 require both `TAVILY_API_KEY` and `OPENAI_API_KEY` in `.env`. Each is guarded by `_HAS_KEYS` and prints a skip message when a key is missing.

## Sections
1. [Setup](#1-setup)
2. [Output model schemas](#2-output-model-schemas)
3. [Initialization](#3-initialization)
4. [research_ticker](#4-research_ticker)
5. [research_sector](#5-research_sector)
6. [Customization](#6-customization)
7. [End-to-end pattern](#7-end-to-end-pattern)
8. [Error handling reference](#8-error-handling-reference)

## 1 — Setup <a id="1-setup"></a>

In [None]:
import json
import logging
import os

from dotenv import load_dotenv

from webai.ticker import TRUSTED_DOMAINS, SectorResearch, TickerResearch, TickerResearcher

load_dotenv()

logging.basicConfig(
    level=logging.WARNING,  # flip to DEBUG to see the full pipeline trace
    format="%(asctime)s [%(levelname)s] %(name)s — %(message)s",
    force=True,
)

_HAS_TAVILY = bool(os.environ.get("TAVILY_API_KEY"))
_HAS_OPENAI = bool(os.environ.get("OPENAI_API_KEY"))
_HAS_KEYS   = _HAS_TAVILY and _HAS_OPENAI

print(f"TAVILY_API_KEY : {'set' if _HAS_TAVILY else 'MISSING'}")
print(f"OPENAI_API_KEY : {'set' if _HAS_OPENAI else 'MISSING'}")
print(f"Live sections  : {'enabled' if _HAS_KEYS else 'SKIPPED — set both keys in .env'}")

## 2 — Output model schemas <a id="2-output-model-schemas"></a>

Both research methods return validated Pydantic models. You can inspect the schema without any API keys.

### 2a — `TickerResearch`

| Field | Type | Required | Description |
|---|---|---|---|
| `ticker` | `str` | yes | Uppercase symbol, e.g. `"NVDA"` |
| `company_name` | `str` | yes | Full name inferred by LLM |
| `sentiment` | `Literal[...]` | yes | `bullish / bearish / neutral / mixed` |
| `confidence` | `float [0–1]` | yes | Evidence consistency score |
| `key_catalyst` | `str` | yes | Single most important near-term catalyst |
| `risk_factors` | `list[str]` | no | 3–5 specific downside risks |
| `analyst_consensus` | `str` | no | Ratings / price targets if mentioned |
| `macro_context` | `str` | no | Macro / sector context from step-back queries |
| `sources` | `list[str]` | no | Source URLs used in synthesis |
| `freshness_warning` | `bool` | no | `True` if sources appear >30 days old |

In [None]:
# Pretty-print the JSON schema — no API key required
schema = TickerResearch.model_json_schema()
print(json.dumps(schema, indent=2))

### 2b — `SectorResearch`

| Field | Type | Required | Description |
|---|---|---|---|
| `sector` | `str` | yes | Exactly the string the caller passed in |
| `overall_health` | `Literal[...]` | yes | `strong / weak / stable / deteriorating / mixed` |
| `key_trends` | `list[str]` | no | 3–5 structural or cyclical trends |
| `tailwinds` | `list[str]` | no | 2–4 positive near-term drivers |
| `headwinds` | `list[str]` | no | 2–4 negative pressures / risks |
| `valuation_note` | `str` | no | P/E / multiples commentary; empty if absent |
| `leading_companies` | `list[str]` | no | Top 3–5 names mentioned in sources |
| `macro_sensitivity` | `str` | no | Sensitivity to rates, inflation, credit cycles |
| `outlook` | `str` | no | 1–2 sentence forward-looking statement |
| `sources` | `list[str]` | no | Source URLs used in synthesis |
| `freshness_warning` | `bool` | no | `True` if sources appear >30 days old |

In [None]:
schema = SectorResearch.model_json_schema()
print(json.dumps(schema, indent=2))
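
The raw JSON schema can be long. As a minimal sketch, the helper below condenses a schema dict into `(field, type, required)` rows similar to the tables above. `field_type`, `summarise_schema`, and `EXAMPLE_SCHEMA` are hypothetical names introduced for illustration; `EXAMPLE_SCHEMA` is a hand-written excerpt, not the actual output of `TickerResearch.model_json_schema()`.

```python
def field_type(spec: dict) -> str:
    """Best-effort type label for one JSON Schema property."""
    if "type" in spec:
        return spec["type"]
    if "anyOf" in spec:  # union types, e.g. string | null
        return " | ".join(field_type(option) for option in spec["anyOf"])
    if "$ref" in spec:  # nested model referenced by name
        return spec["$ref"].rsplit("/", 1)[-1]
    return "unknown"


def summarise_schema(schema: dict) -> list:
    """Return (field, type, required) rows in declaration order."""
    required = set(schema.get("required", []))
    return [
        (name, field_type(spec), name in required)
        for name, spec in schema.get("properties", {}).items()
    ]


# Hand-written stand-in for a model schema (illustrative only).
EXAMPLE_SCHEMA = {
    "title": "TickerResearch",
    "type": "object",
    "properties": {
        "ticker": {"type": "string"},
        "confidence": {"type": "number"},
        "risk_factors": {"type": "array", "items": {"type": "string"}},
        "freshness_warning": {"type": "boolean"},
    },
    "required": ["ticker", "confidence"],
}

for name, type_name, is_required in summarise_schema(EXAMPLE_SCHEMA):
    print(f"{name:<18} {type_name:<8} {'yes' if is_required else 'no'}")
```

In the notebook, you would pass `TickerResearch.model_json_schema()` or `SectorResearch.model_json_schema()` instead of the stand-in dict.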

## 3 — Initialization <a id="3-initialization"></a>

`TickerResearcher` wraps both research pipelines. Build it once and reuse it for many calls — the two structured-output chains (`_synthesis_chain` and `_sector_synthesis_chain`) are constructed at init time.

### Constructor parameters

| Parameter | Default | Notes |
|---|---|---|
| `model` | *(required)* | Any LangChain `BaseChatModel` with `with_structured_output` support |
| `tavily_api_key` | `$TAVILY_API_KEY` | Falls back to env var |
| `openai_api_key` | `$OPENAI_API_KEY` | Passed to the underlying `WebSearcher` |
| `max_results_per_query` | `5` | Tavily results per fan-out query |
| `trusted_domains` | `TRUSTED_DOMAINS` | Override the allowlist for domain filtering |
| `filter_by_domain` | `True` | Disable to skip domain filtering entirely |
| `max_workers` | `8` | Thread pool size for parallel searches |
| `debug` | `False` | Sets logger to DEBUG + adds StreamHandler |

> **Invariant:** Initialization raises `ValueError` immediately if no Tavily key is available, either through `tavily_api_key` or the `TAVILY_API_KEY` environment variable. It never continues silently without a Tavily key.
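
The fail-fast behaviour described above can be sketched as follows. This is not the library's code: `resolve_api_key` is a hypothetical helper that only illustrates the pattern of preferring an explicit argument, falling back to an environment variable, and raising `ValueError` when neither is set.

```python
import os
from typing import Optional


def resolve_api_key(explicit: Optional[str], env_var: str) -> str:
    """Return the explicit key if given, else the env var; raise if both are empty."""
    key = explicit or os.environ.get(env_var)
    if not key:
        raise ValueError(
            f"{env_var} is missing: pass the key explicitly or set it in the environment"
        )
    return key


# An explicit argument wins over the environment.
print(resolve_api_key("explicit-key", "HYPOTHETICAL_API_KEY"))
```

Failing at construction time means a misconfigured environment surfaces before any search or LLM call is attempted.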

In [None]:
if not _HAS_KEYS:
    print("Skipping — requires TAVILY_API_KEY and OPENAI_API_KEY.")
else:
    from langchain_openai import ChatOpenAI

    model = ChatOpenAI(model="gpt-4o-mini", temperature=0)

    researcher = TickerResearcher(
        model=model,
        max_results_per_query=3,  # keep calls fast in the guide
        debug=False,
    )

    print(f"filter_by_domain  : {researcher.filter_by_domain}")
    print(f"max_workers       : {researcher.max_workers}")
    print(f"trusted_domains   : {len(researcher.trusted_domains)} entries")
    print(f"First few domains : {researcher.trusted_domains[:5]}")

## 4 — `research_ticker` <a id="4-research_ticker"></a>

Runs the full equity research pipeline for a single stock symbol.

**Pipeline:**
1. Normalise symbol to uppercase
2. Build anchor query: `"{SYMBOL} {company_name} stock analysis news"`
3. Fan-out via LLM query translation (expand → decompose → step-back), using finance-domain few-shot examples
4. Execute all queries in parallel against Tavily `topic="finance"`
5. Deduplicate by URL; filter to `TRUSTED_DOMAINS` (with unfiltered fallback)
6. Synthesise a `TickerResearch` via `model.with_structured_output(TickerResearch)`
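
Steps 1, 2, 4, and the deduplication half of step 5 can be sketched with the standard library alone. Everything here is an assumption made for illustration: the function names, the `{"url": ...}` result shape, the handling of a missing `company_name`, and `fake_search`, which stands in for a real Tavily call. The LLM query translation and synthesis steps are omitted.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional


def build_anchor_query(symbol: str, company_name: Optional[str] = None) -> str:
    """Steps 1-2: normalise the symbol and build the anchor query."""
    symbol = symbol.strip().upper()
    if not symbol:
        raise ValueError("symbol must be a non-empty string")
    # Assumption: the company name is simply left out when it is not provided.
    parts = [symbol, company_name or "", "stock analysis news"]
    return " ".join(part for part in parts if part)


def search_all(queries: list, search_fn: Callable, max_workers: int = 8) -> list:
    """Step 4: run every query in parallel and flatten results in query order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        batches = list(pool.map(search_fn, queries))
    return [hit for batch in batches for hit in batch]


def dedupe_by_url(results: list) -> list:
    """Step 5 (first half): keep only the first result seen for each URL."""
    seen = set()
    unique = []
    for hit in results:
        url = hit.get("url")
        if url and url not in seen:
            seen.add(url)
            unique.append(hit)
    return unique


def fake_search(query: str) -> list:
    """Stand-in for a search call: one shared URL plus one query-specific URL."""
    return [
        {"url": "https://www.reuters.com/markets/shared-story", "title": query},
        {"url": f"https://example.com/{query}", "title": query},
    ]


anchor = build_anchor_query(" nvda ", company_name="Nvidia")
hits = search_all([anchor, "step-back-query"], fake_search, max_workers=2)
print(anchor)
print(len(hits), "raw results,", len(dedupe_by_url(hits)), "after deduplication")
```

`ThreadPoolExecutor.map` returns batches in the same order as the input queries, so deduplication keeps the result from the earliest query that produced a given URL.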

**When to pass `company_name`:** Always pass it when the ticker is ambiguous or not widely known — it anchors both the Tavily queries and the LLM prompt, improving result relevance.

In [None]:
if not _HAS_KEYS:
    print("Skipping — requires TAVILY_API_KEY and OPENAI_API_KEY.")
else:
    ticker_result = researcher.research_ticker("NVDA", company_name="Nvidia")

    print(f"ticker            : {ticker_result.ticker}")
    print(f"company_name      : {ticker_result.company_name}")
    print(f"sentiment         : {ticker_result.sentiment}")
    print(f"confidence        : {ticker_result.confidence:.2f}")
    print(f"key_catalyst      : {ticker_result.key_catalyst}")
    print(f"freshness_warning : {ticker_result.freshness_warning}")
    print(f"sources           : {len(ticker_result.sources)} URLs")

In [None]:
# Full JSON export — useful for downstream serialisation or logging
if not _HAS_KEYS:
    print("Skipping — requires TAVILY_API_KEY and OPENAI_API_KEY.")
else:
    print(ticker_result.model_dump_json(indent=2))

In [None]:
# Iterate over list fields for downstream use
if not _HAS_KEYS:
    print("Skipping — requires TAVILY_API_KEY and OPENAI_API_KEY.")
else:
    print("Risk factors:")
    for i, risk in enumerate(ticker_result.risk_factors, 1):
        print(f"  {i}. {risk}")

    if ticker_result.analyst_consensus:
        print(f"\nAnalyst consensus : {ticker_result.analyst_consensus}")

    if ticker_result.macro_context:
        print(f"\nMacro context     : {ticker_result.macro_context}")

## 5 — `research_sector` <a id="5-research_sector"></a>

Runs the same pipeline as `research_ticker` but targets macro-level sector content instead of individual company news.

**Key differences from `research_ticker`:**

| Aspect | `research_ticker` | `research_sector` |
|---|---|---|
| Base query phrase | `"{SYMBOL} {company_name} stock analysis news"` | `"{sector} sector financial health outlook"` |
| Few-shot examples | Finance ticker examples (AAPL, TSLA, MSFT) | Sector examples (Semiconductors, Healthcare, Energy) |
| Synthesis chain | `TickerResearch` schema | `SectorResearch` schema |
| Output anchor | `ticker` field locked to symbol | `sector` field locked to caller's string |

The sector name you pass is stored verbatim in `result.sector`, so you can safely group or key on it downstream.

In [None]:
if not _HAS_KEYS:
    print("Skipping — requires TAVILY_API_KEY and OPENAI_API_KEY.")
else:
    sector_result = researcher.research_sector("Semiconductors")

    print(f"sector            : {sector_result.sector}")
    print(f"overall_health    : {sector_result.overall_health}")
    print(f"freshness_warning : {sector_result.freshness_warning}")
    print(f"sources           : {len(sector_result.sources)} URLs")

    print("\nKey trends:")
    for trend in sector_result.key_trends:
        print(f"  • {trend}")

    print("\nTailwinds:")
    for tw in sector_result.tailwinds:
        print(f"  + {tw}")

    print("\nHeadwinds:")
    for hw in sector_result.headwinds:
        print(f"  - {hw}")

In [None]:
if not _HAS_KEYS:
    print("Skipping — requires TAVILY_API_KEY and OPENAI_API_KEY.")
else:
    print(f"Outlook           : {sector_result.outlook}")

    if sector_result.valuation_note:
        print(f"Valuation note    : {sector_result.valuation_note}")

    if sector_result.macro_sensitivity:
        print(f"Macro sensitivity : {sector_result.macro_sensitivity}")

    if sector_result.leading_companies:
        print(f"Leading companies : {', '.join(sector_result.leading_companies)}")

    print("\nFull JSON:")
    print(sector_result.model_dump_json(indent=2))

## 6 — Customization <a id="6-customization"></a>

### 6a — Trusted domain allowlist

`TRUSTED_DOMAINS` is a module-level list you can read or extend. Pass `trusted_domains=` to `TickerResearcher` to override it for a specific instance without mutating the module-level constant.

> **Domain filtering invariant:** If every result is excluded by the allowlist, the pipeline automatically falls back to the full unfiltered set rather than raising. You'll see a `WARNING` log line when this happens. Set `filter_by_domain=False` to skip filtering entirely.
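
The fallback rule can be illustrated with a small standalone sketch. It is not the library's implementation: `is_trusted` and `filter_with_fallback` are hypothetical names, the `{"url": ...}` result shape is assumed, and treating subdomains such as `www.reuters.com` as matches for `reuters.com` is a design choice made here, not something the guide confirms.

```python
import logging
from urllib.parse import urlparse

logger = logging.getLogger("domain_filter_sketch")


def is_trusted(url: str, trusted_domains: list) -> bool:
    """True if the URL's host equals a trusted domain or is a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in trusted_domains)


def filter_with_fallback(results: list, trusted_domains: list) -> list:
    """Keep trusted results; if none survive, warn and return the full set."""
    kept = [r for r in results if is_trusted(r["url"], trusted_domains)]
    if not kept and results:
        logger.warning(
            "Domain filter removed all %d results; falling back to unfiltered set",
            len(results),
        )
        return list(results)
    return kept


results = [
    {"url": "https://www.reuters.com/markets/story"},
    {"url": "https://some-blog.example.net/post"},
]
print(len(filter_with_fallback(results, ["reuters.com"])))
print(len(filter_with_fallback(results, ["bloomberg.com"])))
```

The first call keeps only the Reuters result. The second finds no trusted match, so it logs a warning and returns both results rather than an empty list.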

In [None]:
# Inspect the default allowlist — no API key needed
print(f"Default trusted domains ({len(TRUSTED_DOMAINS)} entries):")
for domain in TRUSTED_DOMAINS:
    print(f"  {domain}")

In [None]:
if not _HAS_KEYS:
    print("Skipping — requires TAVILY_API_KEY and OPENAI_API_KEY.")
else:
    from langchain_openai import ChatOpenAI

    # Add custom domains on top of the defaults (builds a new list; TRUSTED_DOMAINS is not mutated)
    custom_domains = TRUSTED_DOMAINS + ["techcrunch.com", "arstechnica.com"]

    researcher_custom = TickerResearcher(
        model=ChatOpenAI(model="gpt-4o-mini", temperature=0),
        max_results_per_query=3,
        trusted_domains=custom_domains,
    )
    print(f"Custom researcher has {len(researcher_custom.trusted_domains)} trusted domains")

    # Or disable domain filtering entirely (accept any Tavily result)
    researcher_open = TickerResearcher(
        model=ChatOpenAI(model="gpt-4o-mini", temperature=0),
        max_results_per_query=3,
        filter_by_domain=False,
    )
    print(f"Open researcher   filter_by_domain={researcher_open.filter_by_domain}")

### 6b — Debug logging

Pass `debug=True` to route the full pipeline trace to stderr via Python's `logging` module. Each stage logs the number of queries, raw results, deduped results, and domain-filtered results.

```python
researcher_debug = TickerResearcher(model=model, debug=True)
# Now all webai.ticker DEBUG lines stream to stderr
result = researcher_debug.research_ticker("AAPL", company_name="Apple")
```

To suppress debug output in production, pass `debug=False` (the default) and configure your application's root logger instead.
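
As a sketch of application-side configuration, the helper below sets the level on a named logger and attaches a single stderr handler. The logger name `"webai.ticker"` is an assumption based on the common `logging.getLogger(__name__)` convention, and `configure_pipeline_logging` is a hypothetical helper, not part of the library.

```python
import logging


def configure_pipeline_logging(level: int, logger_name: str = "webai.ticker") -> logging.Logger:
    """Set the level on a named logger and attach one stderr handler (idempotent)."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(level)
    if not logger.handlers:
        handler = logging.StreamHandler()  # writes to stderr by default
        handler.setFormatter(
            logging.Formatter("%(asctime)s [%(levelname)s] %(name)s: %(message)s")
        )
        logger.addHandler(handler)
        # Avoid duplicate lines if the root logger also has a handler.
        logger.propagate = False
    return logger


# Verbose while developing, quiet in production.
pipeline_logger = configure_pipeline_logging(logging.DEBUG)
pipeline_logger.debug("pipeline logging configured")
configure_pipeline_logging(logging.WARNING)
```

Calling the helper again only changes the level; it does not stack additional handlers.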

## 7 — End-to-end pattern <a id="7-end-to-end-pattern"></a>

A common use case: given a list of holdings, produce a per-stock snapshot **and** the health of each stock's parent sector, then combine them into a single report dict. The helper below routes every call through one shared `TickerResearcher` instance, so the LLM chains are built only once.

In [None]:
def research_portfolio(
    researcher: TickerResearcher,
    holdings: list[dict],  # [{"symbol": str, "company": str, "sector": str}]
) -> dict:
    """
    For each holding, fetch a TickerResearch snapshot and its sector's
    SectorResearch snapshot (deduplicating sector calls).

    Returns a dict keyed by symbol with nested ticker and sector results.
    """
    sector_cache: dict[str, SectorResearch] = {}
    report: dict[str, dict] = {}

    for h in holdings:
        symbol  = h["symbol"]
        company = h.get("company")
        sector  = h["sector"]

        # --- ticker ---
        ticker_res = researcher.research_ticker(symbol, company_name=company)

        # --- sector (cached so each unique sector is researched only once) ---
        if sector not in sector_cache:
            sector_cache[sector] = researcher.research_sector(sector)
        sector_res = sector_cache[sector]

        report[symbol] = {
            "ticker": ticker_res,
            "sector": sector_res,
        }

    return report


# --- run the helper ---
if not _HAS_KEYS:
    print("Skipping — requires TAVILY_API_KEY and OPENAI_API_KEY.")
else:
    HOLDINGS = [
        {"symbol": "NVDA", "company": "Nvidia",    "sector": "Semiconductors"},
        {"symbol": "INTC", "company": "Intel",     "sector": "Semiconductors"},
        {"symbol": "JNJ",  "company": "Johnson & Johnson", "sector": "Healthcare"},
    ]

    portfolio = research_portfolio(researcher, HOLDINGS)

    for symbol, data in portfolio.items():
        t = data["ticker"]
        s = data["sector"]
        print(f"\n{'='*60}")
        print(f"  {symbol} — {t.company_name}")
        print(f"  Sentiment      : {t.sentiment}  (confidence {t.confidence:.2f})")
        print(f"  Key catalyst   : {t.key_catalyst}")
        print(f"  Sector         : {s.sector}  [{s.overall_health}]")
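        # Append an ellipsis only when the outlook was actually truncated
        print(f"  Sector outlook : {s.outlook[:120]}{'...' if len(s.outlook) > 120 else ''}")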

## 8 — Error handling reference <a id="8-error-handling-reference"></a>

| Method | Guard condition | Exception raised |
|---|---|---|
| `TickerResearcher.__init__` | `TAVILY_API_KEY` missing | `ValueError` |
| `research_ticker` | `symbol` is empty string | `ValueError` |
| `research_ticker` | no results survive pipeline | `RuntimeError` |
| `research_ticker` | LLM synthesis fails | `RuntimeError` |
| `research_sector` | `sector` is empty string | `ValueError` |
| `research_sector` | no results survive pipeline | `RuntimeError` |
| `research_sector` | LLM synthesis fails | `RuntimeError` |

> **Fallback behaviour (not an error):** If domain filtering removes every result, the pipeline falls back to the full unfiltered set without raising and logs a `WARNING`. The "no results survive pipeline" `RuntimeError` occurs only when *zero* results came back from Tavily in the first place; a failed LLM synthesis raises `RuntimeError` separately.
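
When researching many symbols in a loop, one failure should usually not abort the whole batch. The wrapper below is a minimal sketch of that idea: `try_research` is a hypothetical helper that maps the two exception types from the table to a `(result, error)` pair, and the stub functions stand in for real `research_ticker` calls.

```python
from typing import Any, Callable, Optional, Tuple


def try_research(fn: Callable, *args: Any, **kwargs: Any) -> Tuple[Optional[Any], Optional[str]]:
    """Call fn and return (result, None) on success or (None, message) on failure."""
    try:
        return fn(*args, **kwargs), None
    except ValueError as exc:  # bad input, e.g. an empty symbol
        return None, f"invalid input: {exc}"
    except RuntimeError as exc:  # no search results, or LLM synthesis failed
        return None, f"pipeline failure: {exc}"


def stub_ok(symbol: str) -> dict:
    """Stand-in for a successful research call."""
    return {"ticker": symbol.upper()}


def stub_empty(symbol: str) -> dict:
    """Stand-in for a call that finds no results."""
    raise RuntimeError(f"no results for {symbol}")


for fn in (stub_ok, stub_empty):
    result, error = try_research(fn, "nvda")
    print(result, error)
```

With a live instance, the call would look like `try_research(researcher.research_ticker, "NVDA", company_name="Nvidia")`. Other exception types are deliberately left uncaught so that unexpected failures remain visible.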

In [None]:
# ValueError: empty symbol. The guard fires before any API call, but the cell needs the `researcher` built in Section 3.
if not _HAS_KEYS:
    print("Skipping — requires TAVILY_API_KEY and OPENAI_API_KEY.")
else:
    try:
        researcher.research_ticker("   ")  # whitespace-only
    except ValueError as e:
        print(f"research_ticker('   ') -> ValueError: {e}")

In [None]:
# ValueError: empty sector. The guard fires before any API call, but the cell needs the `researcher` built in Section 3.
if not _HAS_KEYS:
    print("Skipping — requires TAVILY_API_KEY and OPENAI_API_KEY.")
else:
    try:
        researcher.research_sector("")
    except ValueError as e:
        print(f"research_sector('')    -> ValueError: {e}")

In [None]:
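# ValueError: missing Tavily key at construction time
# (No network call is made. Constructing ChatOpenAI may still require
# OPENAI_API_KEY to be set, depending on the langchain_openai version.)
import os as _os

# Temporarily remove the key so the constructor's env-var fallback finds nothing
_saved = _os.environ.pop("TAVILY_API_KEY", None)
try:
    from langchain_openai import ChatOpenAI as _ChatOpenAI
    _m = _ChatOpenAI(model="gpt-4o-mini")
    TickerResearcher(model=_m)
except ValueError as e:
    print(f"TickerResearcher(no key) -> ValueError: {e}")
finally:
    # Restore the original value, even if it was an empty string
    if _saved is not None:
        _os.environ["TAVILY_API_KEY"] = _saved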