v1.2.0

@xang1234 released this 03 Jun 16:10
· 175 commits to main since this release
b966326

Stock Scanner v1.2.0

Stock Scanner v1.2.0 introduces automated IBD industry-group classification, run as a weekly pipeline that publishes its results as release assets. It also expands coverage from seven markets to twelve (adding Singapore, Canada, Germany/XETRA, Australia, and Malaysia) and hardens the static-site and release-asset workflows that back the multi-market data feeds.

Highlights

IBD industry-group classification (new)

  • Classifies the universe into IBD's industry-group taxonomy through a deterministic cascade: strict crosswalk → confident embedding match → paid LLM tiebreaker on a top-K shortlist → deterministic soft-attach fallbacks (relaxed-crosswalk plurality, then nearest centroid). Every symbol gets a tier-labelled assignment with a confidence score.
  • Runs as a weekly GitHub Actions pipeline that publishes a gzipped classification bundle plus a sha256-verified -latest manifest as release assets. The static-site build consumes them directly, with no Celery/runtime dependency.
  • Classification data-health scorecard + week-over-week regression gate: each build emits coverage %, tier mix, a confidence histogram, and an embedding-model fingerprint, then diffs these against the prior week's values. The result gates the publish in one of three modes (warn, enforce, or off).
  • Runtime safety for foreign markets: per-market wall-clock deadline, an LLM call budget, and request timeouts, so a slow market degrades to a graceful partial publish instead of timing out the job.
  • Foreign-market taxonomy support: a shared canonical taxonomy across markets with deterministic soft-attach, so non-US markets classify without an exhaustive per-market crosswalk.
  • Universe sector/industry backfill: yfinance sector/industry now persists onto StockUniverse at ingest (foreign rows previously landed without it), which feeds the crosswalk and embedding tiers and sharply reduces wasted LLM calls.
  • Robust bundle import: both the snapshot build and bundle import now collapse rows that re-canonicalize to the same symbol (e.g. Taiwan .TWO/.TW board variants), so a market's classification can no longer be aborted by a duplicate-symbol constraint violation.
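
The classification cascade described above can be pictured as an ordered list of tiers, where the first tier that returns a match wins and labels the assignment. The following is a minimal sketch under stated assumptions, not the project's actual code: every name here (Assignment, classify, TIERS, the toy lookup tables, the 0.80 threshold, and the group labels) is hypothetical, and the LLM tier is stubbed out so the example runs offline.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Assignment:
    symbol: str
    group: str
    tier: str          # which tier produced the assignment
    confidence: float


# A tier returns (group, confidence), or None to defer to the next tier.
Match = Optional[Tuple[str, float]]
Tier = Callable[[str], Match]

# Toy data standing in for the real crosswalk and embedding index (hypothetical).
CROSSWALK = {"AAA": "Group A"}
EMBEDDING_HITS = {"BBB": ("Group B", 0.91), "CCC": ("Group C", 0.40)}
RELAXED_CANDIDATES = {"DDD": ["Group C", "Group C", "Group D"]}
EMBEDDING_THRESHOLD = 0.80  # assumed cut-off for a "confident" match


def strict_crosswalk(symbol: str) -> Match:
    group = CROSSWALK.get(symbol)
    return (group, 1.0) if group is not None else None


def confident_embedding(symbol: str) -> Match:
    hit = EMBEDDING_HITS.get(symbol)
    return hit if hit is not None and hit[1] >= EMBEDDING_THRESHOLD else None


def llm_tiebreaker(symbol: str) -> Match:
    # Stub: a real tier would send a top-K shortlist to an LLM. Always defers here.
    return None


def relaxed_crosswalk_plurality(symbol: str) -> Match:
    candidates = RELAXED_CANDIDATES.get(symbol)
    if not candidates:
        return None
    group, count = Counter(candidates).most_common(1)[0]
    return group, count / len(candidates)


def nearest_centroid(symbol: str) -> Match:
    # Final fallback: always answers, with low confidence.
    return "Group Z", 0.10


TIERS: Sequence[Tuple[str, Tier]] = (
    ("strict_crosswalk", strict_crosswalk),
    ("embedding", confident_embedding),
    ("llm_tiebreaker", llm_tiebreaker),
    ("relaxed_crosswalk", relaxed_crosswalk_plurality),
    ("nearest_centroid", nearest_centroid),
)


def classify(symbol: str, tiers: Sequence[Tuple[str, Tier]] = TIERS) -> Assignment:
    """Walk the tiers in order and return the first match, labelled with its tier."""
    for name, tier in tiers:
        match = tier(symbol)
        if match is not None:
            group, confidence = match
            return Assignment(symbol, group, name, confidence)
    raise LookupError(f"no tier classified {symbol!r}")
```

Because the last tier always answers, every symbol ends up with a tier-labelled assignment and a confidence score, which matches the guarantee stated in the notes. In this toy setup, classify("CCC") falls through the low-confidence embedding hit and lands on the nearest-centroid fallback.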

Twelve-market coverage

  • Adds Singapore (SGX), Canada (TSX/TSXV), Germany (XETRA), Australia (ASX), and Malaysia (Bursa) — extending the seven-market set (US, HK, IN, JP, KR, TW, CN) to twelve.
  • Each new market is wired through universe listing sources, price/fundamental fields, currency formatting and FX, market badges and selectors, breadth and group-ranking views, benchmark registry and calendars, and the static-site and weekly-reference market matrices.

Static-site & release-asset pipeline

  • Per-market static fallbacks and "skip closed market" handling so a single market's gap or off-hours state no longer blocks the static export.
  • Release-asset retention/cleanup so accumulated weekly bundles are pruned.
  • Weekly-reference and static-site workflows extended across the full market matrix, with clearer missing-cache diagnostics.
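
As an illustration of the retention/cleanup item above, one simple policy is to keep the newest N weekly bundles and prune the rest. This is a sketch of that policy only: the function name, the asset-name pattern, and the keep count are assumptions, and the notes do not say which retention rule the workflow actually applies.

```python
from datetime import date
from typing import List, Tuple


def split_for_retention(
    assets: List[Tuple[str, date]], keep: int
) -> Tuple[List[str], List[str]]:
    """Return (kept, pruned) asset names, keeping the `keep` newest by date."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    ordered = sorted(assets, key=lambda asset: asset[1], reverse=True)
    kept = [name for name, _ in ordered[:keep]]
    pruned = [name for name, _ in ordered[keep:]]
    return kept, pruned


# Hypothetical asset names; the real naming scheme is not given in these notes.
weekly_bundles = [
    ("bundle-2024-05-06.json.gz", date(2024, 5, 6)),
    ("bundle-2024-05-20.json.gz", date(2024, 5, 20)),
    ("bundle-2024-05-13.json.gz", date(2024, 5, 13)),
]
kept, pruned = split_for_retention(weekly_bundles, keep=2)
```

A real cleanup step would then delete each pruned asset through the release API and would likely exempt the -latest manifest from pruning.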

Bootstrap, scan & breadth reliability

  • CN bootstrap coverage thresholds, checkpointing, and combined-coverage handling for more reliable first-run population.
  • yfinance retry/backoff widening and bounded provider timeouts so slow upstreams fail over instead of hanging workers.
  • Group-ranking column/data fixes with benchmark-cache fallbacks, breadth-scan improvements, predefined filters, and a TradingView charts popup.
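
The retry/backoff and bounded-timeout item above follows a common pattern: retry a slow provider a fixed number of times with capped exponential delays, then give up so the caller can fail over. The sketch below shows that pattern in general terms. The helper name, attempt count, and delay values are assumptions rather than the project's settings, and the fetch callable is assumed to enforce its own request timeout and raise TimeoutError or ConnectionError.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fetch: Callable[[], T],
    *,
    attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fetch`, retrying transient failures with capped exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Optional[Exception] = None
    for attempt in range(attempts):
        try:
            return fetch()
        except (TimeoutError, ConnectionError) as exc:
            last_error = exc
            if attempt == attempts - 1:
                break  # out of retries; do not sleep after the final attempt
            # Delay doubles each retry (0.5, 1.0, 2.0, ...) up to max_delay.
            sleep(min(max_delay, base_delay * 2 ** attempt))
    # Raising lets the caller fail over to another provider instead of hanging.
    raise RuntimeError(f"provider failed after {attempts} attempts") from last_error
```

Injecting the sleep function keeps the helper easy to test without real waiting, and the bounded attempt count means the total time spent on one provider has a known ceiling.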

Architecture & docs

  • Architecture and data-pipeline documentation refresh covering the multi-market and release-asset model.