feat(kalshi): add market_details and market_trades gold layer#9549

Merged
jeff-dude merged 20 commits into main from feat/kalshi-market-details-trades on Apr 17, 2026

Conversation

@los-xyz
Contributor

@los-xyz los-xyz commented Apr 10, 2026

Summary

Adds two new Kalshi prediction market spells, transforming API-sourced bronze tables into a clean gold layer:

  • kalshi.market_details (TABLE): Market reference table joining markets_raw with event metadata from market_details_raw. Filtered to markets with >= 100 contracts traded.
  • kalshi.market_trades (VIEW): Trade-level table enriched with market metadata via inner join to market_details.

Design choices (bronze → gold)

  • >= 100 contracts filter: drops 85% of markets (dust/empty), keeps 99.7% of volume
  • 12 columns dropped: universally null (functional_strike, mve_*, is_provisional, fee_waiver_expiration_time), constant (response_price_units, notional_value_dollars, price_ranges), always zero (liquidity_dollars), borderline sparse (rules_secondary, primary_participant_key, settlement_timer_seconds), internal (created_hour)
  • INNER join on trades: ensures only trades for meaningful markets flow to gold layer
  • Pricing snapshot kept: despite being latest-state-only, the orderbook/OI columns are well-populated on active markets and useful for current-state analysis
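The threshold filter and the inner join described above can be sketched in a few lines of Python. This is only an illustration of the logic, not the dbt models themselves: the sample rows, the `MIN_CONTRACTS` constant, and both helper functions are hypothetical, and only the column names `ticker`, `volume_fp`, `trade_id`, and `count_fp` come from the PR.

```python
# Hypothetical sample rows; column names follow the PR, values are invented.
markets = [
    {"ticker": "MKT-A", "volume_fp": 2500},
    {"ticker": "MKT-B", "volume_fp": 40},   # dust market, below the threshold
    {"ticker": "MKT-C", "volume_fp": 100},  # exactly at the threshold, kept
]
trades = [
    {"trade_id": "t1", "ticker": "MKT-A", "count_fp": 10},
    {"trade_id": "t2", "ticker": "MKT-B", "count_fp": 1},
    {"trade_id": "t3", "ticker": "MKT-C", "count_fp": 5},
]

MIN_CONTRACTS = 100  # assumed name for the ">= 100 contracts" cutoff


def build_market_details(markets, min_contracts=MIN_CONTRACTS):
    """Keep only markets with at least `min_contracts` traded, keyed by ticker."""
    return {m["ticker"]: m for m in markets if m["volume_fp"] >= min_contracts}


def build_market_trades(trades, market_details):
    """Inner join: a trade survives only if its market is in market_details."""
    return [
        {**t, "market_volume_fp": market_details[t["ticker"]]["volume_fp"]}
        for t in trades
        if t["ticker"] in market_details
    ]


market_details = build_market_details(markets)
market_trades = build_market_trades(trades, market_details)

print(sorted(market_details))                  # tickers that reach gold
print([t["trade_id"] for t in market_trades])  # trades that reach gold
```

Because the join is inner rather than left, the trade on the filtered-out market is dropped instead of being carried forward with NULL metadata.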

Test plan

  • dbt compile passes for both models
  • CI builds and tests pass
  • Verify market_details unique on ticker
  • Verify market_trades unique on trade_id
  • Spot-check join completeness (NULL rates on enriched columns)
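As a rough sketch of what the uniqueness and NULL-rate checks in the test plan amount to, the helpers below work on rows already pulled into memory. The helper names and sample data are hypothetical; in the PR itself these checks are expressed as dbt schema tests and ad hoc queries, not Python.

```python
from collections import Counter


def duplicate_keys(rows, key):
    """Return the sorted key values that appear more than once."""
    counts = Counter(row[key] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)


def null_rate(rows, column):
    """Fraction of rows where `column` is missing or None (0.0 for no rows)."""
    if not rows:
        return 0.0
    return sum(1 for row in rows if row.get(column) is None) / len(rows)


# Invented sample of enriched trades; "series_ticker" is an enriched column.
sample_trades = [
    {"trade_id": "t1", "series_ticker": "SER-1"},
    {"trade_id": "t2", "series_ticker": None},
    {"trade_id": "t2", "series_ticker": "SER-2"},  # duplicate trade_id
    {"trade_id": "t3", "series_ticker": "SER-2"},
]

print(duplicate_keys(sample_trades, "trade_id"))
print(null_rate(sample_trades, "series_ticker"))
```

An empty list from `duplicate_keys` corresponds to the "unique on trade_id" expectation, and a non-zero `null_rate` on an enriched column would point at incomplete join coverage.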

🤖 Generated with Claude Code

Add two new Kalshi prediction market spells built from API bronze tables:

- kalshi.market_details: market reference table joining markets_0003
  with event metadata from market_details_0003. Filtered to markets
  with >= 100 contracts traded (6.5M of 39.8M markets, 99.7% of volume).
  Drops 12 universally null/constant columns (55 → 43).

- kalshi.market_trades: trade-level view enriched with market metadata
  via inner join to market_details, filtering out dust market trades.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
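The retained share implied by the commit message can be checked with quick arithmetic. The inputs are the rounded counts quoted above (6.5M of 39.8M markets), so the result is approximate.

```python
# Rounded counts quoted in the commit message.
total_markets = 39.8e6
kept_markets = 6.5e6

kept_share = kept_markets / total_markets
dropped_share = 1 - kept_share

print(f"kept {kept_share:.1%}, dropped {dropped_share:.1%}")
```

With these rounded inputs roughly 16% of markets are kept and roughly 84% dropped, in line with the approximate 85% figure in the summary.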
@github-actions github-actions Bot added the labels WIP (work in progress) and dbt: daily (covers the Daily dbt subproject) Apr 10, 2026
los-xyz and others added 2 commits April 10, 2026 11:39
- Rename sources from _0003 to _raw (market_trades_raw, markets_raw, market_details_raw)
- Update contributor from dpettas to allelosi

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@los-xyz los-xyz requested a review from jeff-dude April 10, 2026 10:44
los-xyz and others added 4 commits April 10, 2026 13:45
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… in details

- market_trades: add amount_usd (yes_price_dollars * count_fp) and _updated_at
- market_details: extract category from product_metadata JSON

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jeff-dude
Member

Pushed changes to make the models incremental, since these are larger tables.
It would be good if you could test the output build in CI here. Note that CI tables get dropped over the weekend, so you may need to rerun the actions to rebuild them if you look later.
Please ensure these outputs still match what you expect.

@los-xyz los-xyz marked this pull request as ready for review April 12, 2026 16:07
@cursor

cursor Bot commented Apr 12, 2026

PR Summary

Medium Risk
Introduces new incremental/merge models with non-trivial refresh predicates, which can impact warehouse cost and data correctness (e.g., missed/duplicated rows) if predicates or keys are wrong; changes are isolated to new Kalshi datasets.

Overview
Adds a new Kalshi gold layer: kalshi.market_details (incremental merge on ticker) joins markets_raw with the latest per-event_ticker record from market_details_raw, filters to markets with volume_fp >= 100, and tracks source_updated_at to capture event-only metadata changes.

Adds kalshi.market_trades (incremental merge partitioned by block_month) that pulls from market_trades_raw, enriches trades via an inner join to kalshi_market_details, and reprocesses historical trades when market_details.source_updated_at changes to keep dimension columns current. Also registers the new Kalshi bronze sources (market_trades_raw, markets_raw, market_details_raw) and adds schema tests/docs for uniqueness and non-null keys.

Reviewed by Cursor Bugbot for commit 90f1a88.
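To make the risk note concrete, here is a minimal sketch of a key-based incremental merge in plain Python. It is not dbt's merge strategy or the spellbook's implementation: `merge_incremental` and its `skip_unchanged` flag are hypothetical names, and treating "unchanged" as full-row equality is an assumption made only for illustration.

```python
def merge_incremental(target, incoming, key="ticker", skip_unchanged=True):
    """Upsert `incoming` rows into `target` (a dict keyed by `key`).

    Returns counts of inserted, updated, and skipped rows.
    """
    stats = {"inserted": 0, "updated": 0, "skipped": 0}
    for row in incoming:
        k = row[key]
        existing = target.get(k)
        if existing is None:
            target[k] = dict(row)
            stats["inserted"] += 1
        elif skip_unchanged and existing == row:
            # Assumed meaning of "unchanged": every column matches.
            stats["skipped"] += 1
        else:
            target[k] = dict(row)
            stats["updated"] += 1
    return stats


target = {}
first_run = merge_incremental(target, [
    {"ticker": "MKT-A", "source_updated_at": 1},
    {"ticker": "MKT-B", "source_updated_at": 1},
])
second_run = merge_incremental(target, [
    {"ticker": "MKT-A", "source_updated_at": 1},  # identical, skipped
    {"ticker": "MKT-B", "source_updated_at": 2},  # newer metadata, updated
])
print(first_run, second_run)
```

The sketch also shows why the key matters: if `key` were not unique in the incoming batch, later rows would silently overwrite earlier ones, which is the kind of correctness issue the risk note warns about.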

@github-actions github-actions Bot added the label ready-for-review (this PR development is complete, please review) and removed WIP (work in progress) Apr 12, 2026

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Autofix Details

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Missing deduplication on event_details CTE causes row fan-out
    • Added QUALIFY with ROW_NUMBER() to event_details CTE to keep only the latest row per event_ticker, preventing join fan-out and duplicate ticker rows.

Push these changes by commenting:

@cursor push 491ff46c63
Preview (491ff46c63)
diff --git a/dbt_subprojects/daily_spellbook/models/_projects/kalshi/kalshi_market_details.sql b/dbt_subprojects/daily_spellbook/models/_projects/kalshi/kalshi_market_details.sql
--- a/dbt_subprojects/daily_spellbook/models/_projects/kalshi/kalshi_market_details.sql
+++ b/dbt_subprojects/daily_spellbook/models/_projects/kalshi/kalshi_market_details.sql
@@ -54,6 +54,7 @@
 		strike_period,
 		last_updated_ts
 	from {{ source('kalshi', 'market_details_raw') }}
+	qualify row_number() over (partition by event_ticker order by last_updated_ts desc) = 1
 )
 
 select

los-xyz and others added 9 commits April 12, 2026 18:17
Adds QUALIFY ROW_NUMBER() to keep only the latest row per event_ticker
from market_details_raw, preventing potential duplicate ticker rows if
the raw source ever contains multiple snapshots per event.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…_details

category (from $.category) is NULL until ingestion adds the field.
competition and competition_scope are available now from product_metadata
and give useful values (e.g., "Pro Football", "College Basketball (M)", "Game").

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
QUALIFY is not supported in Trino/DuneSQL. Rewrote event_details
deduplication as a subquery with ROW_NUMBER() + WHERE rn = 1.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
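The subquery form of the dedupe described in this commit can be sketched as follows. SQLite (via Python's `sqlite3`, which needs SQLite 3.25+ for window functions) stands in for Trino/DuneSQL purely so the query can be run locally. The `title` column and the sample rows are invented; the actual model selects a different column list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table market_details_raw (
        event_ticker text,
        title text,              -- hypothetical metadata column
        last_updated_ts integer
    );
    insert into market_details_raw values
        ('EVT-1', 'old title', 1),
        ('EVT-1', 'new title', 2),
        ('EVT-2', 'only row', 5);
""")

# ROW_NUMBER() in a subquery plus an outer WHERE rn = 1 replaces QUALIFY.
DEDUPE_SQL = """
select
    event_ticker
    , title
    , last_updated_ts
from (
    select
        event_ticker
        , title
        , last_updated_ts
        , row_number() over (
            partition by event_ticker
            order by last_updated_ts desc
        ) as rn
    from market_details_raw
) as ranked
where rn = 1
order by event_ticker
"""

latest_rows = conn.execute(DEDUPE_SQL).fetchall()
print(latest_rows)
```

Each `event_ticker` comes back exactly once with its most recent row, so a later join on `event_ticker` cannot fan out.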
- Replace JSON extraction `try(json_extract_scalar(product_metadata, '$.category'))`
  with the native `category` column now available in market_details_raw
  (100% populated, 19 distinct values).

- Add `mve_collection_ticker` from markets_raw to the gold layer. 81% of
  Kalshi markets are multivariate events (MVE); this column links sub-event
  markets to their parent MVE collection (e.g., KXMVECBCHAMPIONSHIP-R),
  enabling downstream grouping and filtering by collection.

- Update _schema.yml descriptions accordingly.
- kalshi_market_details: scope event_details dedupe via inner join to markets CTE
  (avoids full market_details_raw scan on incremental runs); switch to explicit
  column projection; rename watermark_ts -> source_updated_at to avoid confusion
  with pipeline-oriented _updated_at; add _updated_at = now() for operational
  freshness; drop post_hook.
- kalshi_market_trades: explicit column projection; leading commas; consistent
  inner-join pre-scoping; update reference to source_updated_at; add
  merge_skip_unchanged = true to skip no-op dimension refreshes (matches
  polymarket analog); drop post_hook.
- _schema.yml: rename watermark_ts column, add _updated_at to market_details,
  refresh descriptions.

Made-with: Cursor
los-xyz added a commit that referenced this pull request Apr 17, 2026
…atic tag

- Drop kalshi_market_details.sql and kalshi_market_trades.sql — those
  land in PR #9549 instead; this PR depends on #9549 merging first.
- Revert sources/kalshi/_sources.yml (raw source declarations come from #9549).
- Trim _schema.yml to only the kalshi_ohlcv_hourly entry.
- Remove `static` tag from kalshi_ohlcv_hourly (config + schema) so the
  model refreshes with new data.
- Also remove `static` tag from polymarket_polygon.ohlcv_hourly for
  consistency.
jeff-dude and others added 2 commits April 17, 2026 13:35
Follow-up to 2e6d394 which landed only a partial snapshot. This commit
completes the review:

- kalshi_market_details: drop ci-stamp and post_hook; switch event_details
  pre-filter from `in (select ...)` to inner join (consistent with trades);
  add `now() as _updated_at` for pipeline-time freshness.
- kalshi_market_trades: drop ci-stamp and post_hook; reformat config block
  to single-line style; add merge_skip_unchanged = true; explicit column
  projection; leading commas throughout; newline after SQL keywords.
- _schema.yml: document _updated_at column on kalshi_market_details.

Made-with: Cursor
@jeff-dude
Member

@cursor review

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: amount_usd ignores taker side, wrong for No trades
    • Changed amount_usd calculation from always using yes_price_dollars to conditionally using yes_price_dollars or no_price_dollars based on taker_side.

Push these changes by commenting:

@cursor push 8872215acd
Preview (8872215acd)
diff --git a/dbt_subprojects/daily_spellbook/models/_projects/kalshi/_schema.yml b/dbt_subprojects/daily_spellbook/models/_projects/kalshi/_schema.yml
--- a/dbt_subprojects/daily_spellbook/models/_projects/kalshi/_schema.yml
+++ b/dbt_subprojects/daily_spellbook/models/_projects/kalshi/_schema.yml
@@ -186,7 +186,7 @@
       - name: no_price_dollars
         description: "Price paid for the No side"
       - name: amount_usd
-        description: "USD notional value of the trade (yes_price_dollars * count_fp)"
+        description: "USD notional value of the trade (taker price * count_fp, where taker price depends on taker_side)"
       - name: event_ticker
         description: "Parent event identifier"
       - name: series_ticker

diff --git a/dbt_subprojects/daily_spellbook/models/_projects/kalshi/kalshi_market_trades.sql b/dbt_subprojects/daily_spellbook/models/_projects/kalshi/kalshi_market_trades.sql
--- a/dbt_subprojects/daily_spellbook/models/_projects/kalshi/kalshi_market_trades.sql
+++ b/dbt_subprojects/daily_spellbook/models/_projects/kalshi/kalshi_market_trades.sql
@@ -69,7 +69,7 @@
 	, t.count_fp
 	, t.yes_price_dollars
 	, t.no_price_dollars
-	, t.yes_price_dollars * t.count_fp as amount_usd
+	, case when t.taker_side = 'yes' then t.yes_price_dollars else t.no_price_dollars end * t.count_fp as amount_usd
 	, md.event_ticker
 	, md.series_ticker
 	, md.market_type

Comment thread dbt_subprojects/daily_spellbook/models/_projects/kalshi/kalshi_market_trades.sql Outdated
The amount_usd field now correctly uses the taker's actual price based on taker_side. When taker_side is 'yes', it uses yes_price_dollars; when 'no', it uses no_price_dollars. This fixes incorrect USD values for all No-side trades.

Applied via @cursor push command
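The corrected calculation can be illustrated with a small Python function that mirrors the `case` expression in the diff. The function name and the sample prices are hypothetical. As in the SQL, any `taker_side` other than 'yes' falls through to the No price.

```python
from decimal import Decimal


def amount_usd(taker_side, yes_price_dollars, no_price_dollars, count_fp):
    """Taker-side notional: the price the taker paid times contracts traded."""
    # Mirrors the SQL CASE: 'yes' uses the Yes price, anything else the No price.
    price = yes_price_dollars if taker_side == "yes" else no_price_dollars
    return price * count_fp


# Invented example: 10 contracts with Yes at $0.30 and No at $0.70.
yes_price = Decimal("0.30")
no_price = Decimal("0.70")
contracts = Decimal("10")

yes_taker_amount = amount_usd("yes", yes_price, no_price, contracts)
no_taker_amount = amount_usd("no", yes_price, no_price, contracts)
old_formula_amount = yes_price * contracts  # previous behaviour for every trade

print(yes_taker_amount, no_taker_amount, old_formula_amount)
```

In this example the earlier formula would have reported $3.00 for the No-side trade where the taker actually paid $7.00, which is the discrepancy the fix addresses.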
@jeff-dude jeff-dude merged commit cdd0daf into main Apr 17, 2026
3 checks passed
@jeff-dude jeff-dude deleted the feat/kalshi-market-details-trades branch April 17, 2026 18:54
@github-actions github-actions Bot locked and limited conversation to collaborators Apr 17, 2026
Labels

dbt: daily (covers the Daily dbt subproject), ready-for-review (this PR development is complete, please review)

3 participants