PR #577: Fix XGB training query filters + dispatch dedup return value by jaayslaughter-cpu · Pull Request #447 · jaayslaughter-cpu/mework

jaayslaughter-cpu · 2026-05-18T02:22:28Z

PR #577 — Critical Fixes Before Tonight's 2:30 AM Retrain

Problem 1: XGB training query missing model_prob filters

Training included 75 rows where model_prob < 59% (historical gate bypass) — causes biased models
Training included rows with model_prob = NULL (crash orphans from May 15-16) — causes null errors

Fix: Added AND model_prob IS NOT NULL AND model_prob >= 0.59 to the training query.

Problem 2: `dispatch_date_log` written even when Discord send fails

run_agent_tasklet() returned True unconditionally → dedup blocked same-day retries
May 15-16: pipeline crashed mid-dispatch, dedup blocked retries all day

Fix:

Added _any_discord_sent: bool = False before parlay loop
Set _any_discord_sent = True ONLY when Discord webhook confirmed (_flipped > 0)
Return _any_discord_sent — orchestrator's existing if result is True: _record_dispatch_ran_today() guard now works correctly

Summary by cubic

Fixes XGB training data selection and dispatch dedup logic to prevent biased models and unblock same‑day retries after failures. Enforces model_prob filters and only records a dispatch when a Discord send actually succeeds.

Bug Fixes
- Training query: added AND model_prob IS NOT NULL AND model_prob >= 0.59 to exclude nulls and low‑confidence rows.
- Dispatch dedup: track _any_discord_sent and return it; only mark dispatch as run when Discord confirms at least one send.

^{Written for commit fff78e8. Summary will update on new commits. Review in cubic}

Summary by CodeRabbit

Release Notes

Bug Fixes
- Discord dispatch confirmations now accurately reflect whether picks were successfully sent to ensure reliable status reporting.
Improvements
- Model training now applies stricter quality filters to historical data, improving the robustness of retraining cycles.

…e fix

coderabbitai · 2026-05-18T02:22:38Z

📝 Walkthrough

Walkthrough

run_agent_tasklet() now returns an explicit boolean flag indicating whether Discord dispatch was actually confirmed in the database, replacing a hardcoded True return. Separately, run_xgboost_tasklet() tightens its training data query to exclude NULL model probabilities and require minimum confidence of 0.59.

Changes

Discord Dispatch Confirmation Signal

Layer / File(s)	Summary
Discord dispatch confirmation flag lifecycle `tasklets.py`	`_any_discord_sent` boolean is initialized before the send loop, set to `True` when the DB update confirms at least one row's `discord_sent` field flipped, and returned at the end of the function in place of the previous constant `True`.

XGBoost Training Data Quality Gates

Layer / File(s)	Summary
Model probability filtering in training query `tasklets.py`	Training data query now filters out rows where `model_prob IS NULL` and only includes rows where `model_prob >= 0.59`, raising the quality threshold for historical data used in model retraining.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related PRs

jaayslaughter-cpu/mework#303: Uses the new run_agent_tasklet() return value to guard _record_dispatch_ran_today() logic based on confirmed Discord sends.
jaayslaughter-cpu/mework#416: Modifies run_agent_tasklet() dispatch/send flow dedup logic to prevent opposite-direction double dispatch, directly adjacent to this PR's confirmation signal changes.
jaayslaughter-cpu/mework#217: Adjusts how discord_sent flag is initialized/validated in season_record.py, affecting the same DB state that this PR now explicitly tracks.

Poem

🐰 The tasklet now tells the truth at last—
Did Discord truly send, or was it past?
And models train on confidence so high,
No whispers in the dark, no NULL goodbye! ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title directly and clearly describes the two main changes: XGB training query filters and dispatch dedup return value, matching the core objectives.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch pr-577-xgb-training-dispatch-dedup-fix

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

deepsource-io · 2026-05-18T02:22:43Z

DeepSource Code Review

We reviewed changes in 65d509d...fff78e8 on this pull request. Below is the summary for the review, and you can see the individual issues we found as inline review comments.

See full review on DeepSource ↗

PR Report Card

Overall Grade	Security Reliability Complexity Hygiene

Code Review Summary

Analyzer	Updated (UTC)	Details
Docker	May 18, 2026 2:22a.m.	Review ↗
JavaScript	May 18, 2026 2:22a.m.	Review ↗
Python	May 18, 2026 2:22a.m.	Review ↗
SQL	May 18, 2026 2:22a.m.	Review ↗
Secrets	May 18, 2026 2:22a.m.	Review ↗

Important

AI Review is run only on demand for your team. We're only showing results of static analysis review right now. To trigger AI Review, comment @deepsourcebot review on this thread.

codacy-production · 2026-05-18T02:23:08Z

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

_{NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer}
_{TIP This summary will be updated as you push new changes.}

deepsource-io · 2026-05-18T02:23:37Z

                            agent_name, _flipped,
                            sum(1 for v in _send_clv_map.values() if v is not None),
                            len(_send_clv_map))
+                    if _flipped > 0:


IndentationError: unexpected indent

A Python syntax error is detected.

gemini-code-assist

Code Review

This pull request updates tasklets.py to track Discord confirmation status, ensuring the tasklet returns True only when at least one message is sent. It also introduces a model_prob filter in the run_xgboost_tasklet SQL query. A critical issue was identified regarding the scale of the model_prob filter; the current threshold of 0.59 should be updated to 59 to correctly reflect the percentage-based storage used elsewhere in the codebase.

gemini-code-assist · 2026-05-18T02:24:16Z

                  AND (lookahead_safe IS NULL OR lookahead_safe = TRUE)
                  AND agent_name NOT ILIKE '%seed%'
+                  AND model_prob IS NOT NULL
+                  AND model_prob >= 0.59


The filter model_prob >= 0.59 appears to have a scale mismatch. Throughout the codebase (e.g., in _BaseAgent._model_prob and _american_to_implied), model_prob is calculated and stored as a percentage in the 0-100 range (e.g., 60.0 for 60%). Using 0.59 in the SQL query will include almost all rows (including those with 55% probability), failing to exclude the sub-59% rows mentioned in the PR description. This should be changed to 59 to correctly filter for the 59% threshold.

AND model_prob >= 59

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tasklets.py`:
- Around line 7871-7872: The training filter currently compares model_prob using
the wrong scale (model_prob >= 0.59) so the 59% cutoff isn't applied; update the
SQL/condition that references model_prob (the training filter clause containing
"AND model_prob IS NOT NULL" / "AND model_prob >= 0.59") to use the percent
scale (e.g., model_prob >= 59 or model_prob >= 59.0) and ensure the column is
treated as a numeric type so the comparison behaves as intended.
- Around line 6466-6467: The if-statement checking _flipped is over-indented,
causing a syntax error; unindent the line "if _flipped > 0:" so it aligns with
the surrounding block that contains "_pg3.close()" and "logger.info()", then
ensure the following line setting "_any_discord_sent = True" remains inside that
if-block. Locate the occurrences of "_flipped", "_any_discord_sent",
"_pg3.close()", and "logger.info()" in the same function and adjust the
indentation to match the current block level.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6b083747-ef52-4569-8128-41ffd3ce3c54

📥 Commits

Reviewing files that changed from the base of the PR and between 65d509d and fff78e8.

📒 Files selected for processing (1)

tasklets.py

coderabbitai · 2026-05-18T02:24:19Z

+                    if _flipped > 0:
+                        _any_discord_sent = True   # PR #577


⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Verify indentation around the failing block nl -ba tasklets.py | sed -n '6458,6472p'

Repository: jaayslaughter-cpu/mework

Length of output: 109

🏁 Script executed:

#!/bin/bash # View lines 6458-6475 with visible whitespace to check indentation sed -n '6458,6475p' tasklets.py | cat -A

Repository: jaayslaughter-cpu/mework

Length of output: 1296

🏁 Script executed:

#!/bin/bash # Also try to check Python syntax using Python's ast module python3 <<'EOF' import ast try: with open('tasklets.py', 'r') as f: code = f.read() ast.parse(code) print("✓ tasklets.py has valid Python syntax") except SyntaxError as e: print(f"✗ Syntax error found:") print(f" Line {e.lineno}: {e.msg}") if e.text: print(f" Code: {e.text.rstrip()}") print(f" {' ' * (e.offset - 1)}^") EOF

Repository: jaayslaughter-cpu/mework

Length of output: 189

🏁 Script executed:

#!/bin/bash # Get exact context around lines 6466-6467 using sed with line numbers sed -n '6460,6475p' tasklets.py | head -20

Repository: jaayslaughter-cpu/mework

Length of output: 978

Fix the indentation error causing parse failure.

Line 6466 is over-indented, which makes the module invalid Python syntax. The if _flipped > 0: statement should be aligned with the preceding _pg3.close() and logger.info() calls at the same block level, not indented further.

Proposed fix

- if _flipped > 0: - _any_discord_sent = True # PR `#577` + if _flipped > 0: + _any_discord_sent = True # PR `#577`

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if _flipped > 0:

_any_discord_sent = True # PR #577

if _flipped > 0:

_any_discord_sent = True # PR `#577`

🧰 Tools

🪛 Ruff (0.15.12)

[warning] 6466-6466: Unexpected indentation

(invalid-syntax)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tasklets.py` around lines 6466 - 6467, The if-statement checking _flipped is over-indented, causing a syntax error; unindent the line "if _flipped > 0:" so it aligns with the surrounding block that contains "_pg3.close()" and "logger.info()", then ensure the following line setting "_any_discord_sent = True" remains inside that if-block. Locate the occurrences of "_flipped", "_any_discord_sent", "_pg3.close()", and "logger.info()" in the same function and adjust the indentation to match the current block level.

coderabbitai · 2026-05-18T02:24:19Z

+                  AND model_prob IS NOT NULL
+                  AND model_prob >= 0.59


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use the correct model_prob scale in the training filter.

model_prob is stored as percent values (e.g., 59.0), so model_prob >= 0.59 is effectively always true for normal rows and won’t enforce the 59% gate.

Proposed fix

- AND model_prob >= 0.59 + AND model_prob >= 59.0

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

AND model_prob IS NOT NULL

AND model_prob >= 0.59

AND model_prob IS NOT NULL

AND model_prob >= 59.0

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tasklets.py` around lines 7871 - 7872, The training filter currently compares model_prob using the wrong scale (model_prob >= 0.59) so the 59% cutoff isn't applied; update the SQL/condition that references model_prob (the training filter clause containing "AND model_prob IS NOT NULL" / "AND model_prob >= 0.59") to use the percent scale (e.g., model_prob >= 59 or model_prob >= 59.0) and ensure the column is treated as a numeric type so the comparison behaves as intended.

cubic-dev-ai

1 issue found across 1 file

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="tasklets.py">

<violation number="1" location="tasklets.py:7872">
P1: `model_prob >= 0.59` uses the wrong unit scale; with percent-based `model_prob`, this filter does not remove sub-59% training rows.</violation>
</file>

_{Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
Re-trigger cubic}

cubic-dev-ai · 2026-05-18T02:25:28Z

                  AND (lookahead_safe IS NULL OR lookahead_safe = TRUE)
                  AND agent_name NOT ILIKE '%seed%'
+                  AND model_prob IS NOT NULL
+                  AND model_prob >= 0.59


P1: model_prob >= 0.59 uses the wrong unit scale; with percent-based model_prob, this filter does not remove sub-59% training rows.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At tasklets.py, line 7872: <comment>`model_prob >= 0.59` uses the wrong unit scale; with percent-based `model_prob`, this filter does not remove sub-59% training rows.</comment> <file context> @@ -7865,6 +7868,8 @@ def run_xgboost_tasklet() -> None: AND (lookahead_safe IS NULL OR lookahead_safe = TRUE) AND agent_name NOT ILIKE '%seed%' + AND model_prob IS NOT NULL + AND model_prob >= 0.59 AND prop_type NOT IN ( 'fantasy_score', 'fantasy_hitter', 'fantasy_pitcher', </file context>

Suggested change

AND model_prob >= 0.59

AND model_prob >= 59

PR #577: XGB training model_prob filters + dispatch dedup return valu…

fff78e8

…e fix

deepsource-io Bot reviewed May 18, 2026

View reviewed changes

gemini-code-assist Bot reviewed May 18, 2026

View reviewed changes

coderabbitai Bot reviewed May 18, 2026

View reviewed changes

cubic-dev-ai Bot reviewed May 18, 2026

View reviewed changes

jaayslaughter-cpu merged commit 10c5b95 into main May 18, 2026
8 of 9 checks passed

coderabbitai Bot mentioned this pull request May 18, 2026

PR #582: QA Audit — Fix all 5 critical/warning issues #453

Merged

Conversation

jaayslaughter-cpu commented May 18, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR #577 — Critical Fixes Before Tonight's 2:30 AM Retrain

Problem 1: XGB training query missing model_prob filters

Problem 2: dispatch_date_log written even when Discord send fails

Summary by cubic

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai Bot commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

deepsource-io Bot commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

DeepSource Code Review

PR Report Card

Code Review Summary

Uh oh!

codacy-production Bot commented May 18, 2026

Up to standards ✅

Uh oh!

deepsource-io Bot May 18, 2026

Choose a reason for hiding this comment

IndentationError: unexpected indent

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

jaayslaughter-cpu commented May 18, 2026 •

edited by coderabbitai Bot

Loading

Problem 2: `dispatch_date_log` written even when Discord send fails

coderabbitai Bot commented May 18, 2026 •

edited

Loading

deepsource-io Bot commented May 18, 2026 •

edited

Loading