feat(kalshi): add market_details and market_trades gold layer #9549
Add two new Kalshi prediction market spells built from API bronze tables:
- kalshi.market_details: market reference table joining markets_0003 with event metadata from market_details_0003. Filtered to markets with >= 100 contracts traded (6.5M of 39.8M markets, 99.7% of volume). Drops 12 universally null/constant columns (55 → 43).
- kalshi.market_trades: trade-level view enriched with market metadata via inner join to market_details, filtering out dust market trades.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Rename sources from _0003 to _raw (market_trades_raw, markets_raw, market_details_raw)
- Update contributor from dpettas to allelosi
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… in details
- market_trades: add amount_usd (yes_price_dollars * count_fp) and _updated_at
- market_details: extract category from product_metadata JSON
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…sion refresh Made-with: Cursor
Pushed changes to make the models incremental, since the tables are large.
PR Summary: Medium Risk
Reviewed by Cursor Bugbot for commit 90f1a88.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Missing deduplication on event_details CTE causes row fan-out
- Added QUALIFY with ROW_NUMBER() to event_details CTE to keep only the latest row per event_ticker, preventing join fan-out and duplicate ticker rows.
Or push these changes by commenting:
@cursor push 491ff46c63
Preview (491ff46c63)
diff --git a/dbt_subprojects/daily_spellbook/models/_projects/kalshi/kalshi_market_details.sql b/dbt_subprojects/daily_spellbook/models/_projects/kalshi/kalshi_market_details.sql
--- a/dbt_subprojects/daily_spellbook/models/_projects/kalshi/kalshi_market_details.sql
+++ b/dbt_subprojects/daily_spellbook/models/_projects/kalshi/kalshi_market_details.sql
@@ -54,6 +54,7 @@
strike_period,
last_updated_ts
from {{ source('kalshi', 'market_details_raw') }}
+ qualify row_number() over (partition by event_ticker order by last_updated_ts desc) = 1
)
select
Adds QUALIFY ROW_NUMBER() to keep only the latest row per event_ticker from market_details_raw, preventing potential duplicate ticker rows if the raw source ever contains multiple snapshots per event. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…_details category (from $.category) is NULL until ingestion adds the field. competition and competition_scope are available now from product_metadata and give useful values (e.g., "Pro Football", "College Basketball (M)", "Game"). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
QUALIFY is not supported in Trino/DuneSQL. Rewrote event_details deduplication as a subquery with ROW_NUMBER() + WHERE rn = 1. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
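The QUALIFY-free rewrite mentioned above might look roughly like the sketch below. This is an illustration under assumptions, not the model's actual SQL: Python's built-in sqlite3 stands in for Trino/DuneSQL (SQLite supports window functions from version 3.25), the table rows and the `title` column are invented, and only `event_ticker` and `last_updated_ts` are taken from the diff.

```python
import sqlite3

# Toy stand-in for market_details_raw; the rows and the title column are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table market_details_raw "
    "(event_ticker text, title text, last_updated_ts integer)"
)
conn.executemany(
    "insert into market_details_raw values (?, ?, ?)",
    [
        ("EVT-A", "old title", 1),
        ("EVT-A", "new title", 2),  # later snapshot of the same event
        ("EVT-B", "only title", 5),
    ],
)

# Deduplication without QUALIFY: rank rows per event in a subquery,
# then keep rank 1 (the latest snapshot) in the outer WHERE clause.
DEDUPE_SQL = """
select event_ticker, title, last_updated_ts
from (
    select
        event_ticker
        , title
        , last_updated_ts
        , row_number() over (
            partition by event_ticker
            order by last_updated_ts desc
        ) as rn
    from market_details_raw
) ranked
where rn = 1
order by event_ticker
"""

latest_rows = conn.execute(DEDUPE_SQL).fetchall()
print(latest_rows)
```

The inner query is the same window expression the QUALIFY version used; moving the `rn = 1` filter to an outer WHERE is what makes it portable to engines without QUALIFY.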
- Replace JSON extraction `try(json_extract_scalar(product_metadata, '$.category'))` with the native `category` column now available in market_details_raw (100% populated, 19 distinct values).
- Add `mve_collection_ticker` from markets_raw to the gold layer. 81% of Kalshi markets are multivariate events (MVE); this column links sub-event markets to their parent MVE collection (e.g., KXMVECBCHAMPIONSHIP-R), enabling downstream grouping and filtering by collection.
- Update _schema.yml descriptions accordingly.
- kalshi_market_details: scope event_details dedupe via inner join to markets CTE (avoids full market_details_raw scan on incremental runs); switch to explicit column projection; rename watermark_ts -> source_updated_at to avoid confusion with pipeline-oriented _updated_at; add _updated_at = now() for operational freshness; drop post_hook.
- kalshi_market_trades: explicit column projection; leading commas; consistent inner-join pre-scoping; update reference to source_updated_at; add merge_skip_unchanged = true to skip no-op dimension refreshes (matches polymarket analog); drop post_hook.
- _schema.yml: rename watermark_ts column, add _updated_at to market_details, refresh descriptions.
Made-with: Cursor
…atic tag
- Drop kalshi_market_details.sql and kalshi_market_trades.sql; those land in PR #9549 instead, and this PR depends on #9549 merging first.
- Revert sources/kalshi/_sources.yml (raw source declarations come from #9549).
- Trim _schema.yml to only the kalshi_ohlcv_hourly entry.
- Remove `static` tag from kalshi_ohlcv_hourly (config + schema) so the model refreshes with new data.
- Also remove `static` tag from polymarket_polygon.ohlcv_hourly for consistency.
Follow-up to 2e6d394, which landed only a partial snapshot. This commit completes the review:
- kalshi_market_details: drop ci-stamp and post_hook; switch event_details pre-filter from `in (select ...)` to inner join (consistent with trades); add `now() as _updated_at` for pipeline-time freshness.
- kalshi_market_trades: drop ci-stamp and post_hook; reformat config block to single-line style; add merge_skip_unchanged = true; explicit column projection; leading commas throughout; newline after SQL keywords.
- _schema.yml: document _updated_at column on kalshi_market_details.
Made-with: Cursor
@cursor review
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: amount_usd ignores taker side, wrong for No trades
- Changed amount_usd calculation from always using yes_price_dollars to conditionally using yes_price_dollars or no_price_dollars based on taker_side.
Or push these changes by commenting:
@cursor push 8872215acd
Preview (8872215acd)
diff --git a/dbt_subprojects/daily_spellbook/models/_projects/kalshi/_schema.yml b/dbt_subprojects/daily_spellbook/models/_projects/kalshi/_schema.yml
--- a/dbt_subprojects/daily_spellbook/models/_projects/kalshi/_schema.yml
+++ b/dbt_subprojects/daily_spellbook/models/_projects/kalshi/_schema.yml
@@ -186,7 +186,7 @@
- name: no_price_dollars
description: "Price paid for the No side"
- name: amount_usd
- description: "USD notional value of the trade (yes_price_dollars * count_fp)"
+ description: "USD notional value of the trade (taker price * count_fp, where taker price depends on taker_side)"
- name: event_ticker
description: "Parent event identifier"
- name: series_ticker
diff --git a/dbt_subprojects/daily_spellbook/models/_projects/kalshi/kalshi_market_trades.sql b/dbt_subprojects/daily_spellbook/models/_projects/kalshi/kalshi_market_trades.sql
--- a/dbt_subprojects/daily_spellbook/models/_projects/kalshi/kalshi_market_trades.sql
+++ b/dbt_subprojects/daily_spellbook/models/_projects/kalshi/kalshi_market_trades.sql
@@ -69,7 +69,7 @@
, t.count_fp
, t.yes_price_dollars
, t.no_price_dollars
- , t.yes_price_dollars * t.count_fp as amount_usd
+ , case when t.taker_side = 'yes' then t.yes_price_dollars else t.no_price_dollars end * t.count_fp as amount_usd
, md.event_ticker
, md.series_ticker
, md.market_type
The amount_usd field now correctly uses the taker's actual price based on taker_side. When taker_side is 'yes', it uses yes_price_dollars; when 'no', it uses no_price_dollars. This fixes incorrect USD values for all No-side trades. Applied via @cursor push command
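To make the effect of the fix concrete, here is a small sketch that mirrors the CASE expression from the diff in plain Python. The helper name `amount_usd` and the example prices and contract count are invented for illustration; only the column names and the branching logic come from the PR.

```python
from decimal import Decimal


def amount_usd(
    taker_side: str,
    yes_price_dollars: Decimal,
    no_price_dollars: Decimal,
    count_fp: Decimal,
) -> Decimal:
    """Mirror the SQL CASE: 'yes' uses the Yes price, anything else the No price."""
    taker_price = yes_price_dollars if taker_side == "yes" else no_price_dollars
    return taker_price * count_fp


# Hypothetical trade: Yes priced at $0.30, No at $0.70, 10 contracts.
yes_trade = amount_usd("yes", Decimal("0.30"), Decimal("0.70"), Decimal("10"))
no_trade = amount_usd("no", Decimal("0.30"), Decimal("0.70"), Decimal("10"))

# What the pre-fix expression (yes_price_dollars * count_fp) gave for the No trade.
no_trade_before_fix = Decimal("0.30") * Decimal("10")

print(yes_trade, no_trade, no_trade_before_fix)
```

With these made-up numbers the No-side trade is valued at the No price rather than the Yes price, which is the discrepancy the Bugbot finding describes. As in the SQL, any `taker_side` other than `'yes'` falls through to the No price.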


Summary
Adds two new Kalshi prediction market spells, transforming API-sourced bronze tables into a clean gold layer:
- kalshi.market_details (TABLE): Market reference table joining markets_raw with event metadata from market_details_raw. Filtered to markets with >= 100 contracts traded.
- kalshi.market_trades (VIEW): Trade-level table enriched with market metadata via inner join to market_details.
Design choices (bronze → gold)
… functional_strike, mve_*, is_provisional, fee_waiver_expiration_time), constant (response_price_units, notional_value_dollars, price_ranges), always zero (liquidity_dollars), borderline sparse (rules_secondary, primary_participant_key, settlement_timer_seconds), internal (created_hour)
Test plan
- dbt compile passes for both models
- market_details unique on ticker
- market_trades unique on trade_id
🤖 Generated with Claude Code