You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
PR #577 — Critical Fixes Before Tonight's 2:30 AM Retrain
Problem 1: XGB training query missing model_prob filters
Training included 75 rows where model_prob < 59% (historical gate bypass) — causes biased models
Training included rows with model_prob = NULL (crash orphans from May 15-16) — causes null errors
Fix: Added AND model_prob IS NOT NULL AND model_prob >= 0.59 to the training query.
Problem 2: dispatch_date_log written even when Discord send fails
run_agent_tasklet() returned True unconditionally → dedup blocked same-day retries
May 15-16: pipeline crashed mid-dispatch, dedup blocked retries all day
Fix:
Added _any_discord_sent: bool = False before parlay loop
Set _any_discord_sent = True ONLY when Discord webhook confirmed (_flipped > 0)
Return _any_discord_sent — orchestrator's existing if result is True: _record_dispatch_ran_today() guard now works correctly
Summary by cubic
Fixes XGB training data selection and dispatch dedup logic to prevent biased models and unblock same‑day retries after failures. Enforces model_prob filters and only records a dispatch when a Discord send actually succeeds.
Bug Fixes
Training query: added AND model_prob IS NOT NULL AND model_prob >= 0.59 to exclude nulls and low‑confidence rows.
Dispatch dedup: track _any_discord_sent and return it; only mark dispatch as run when Discord confirms at least one send.
run_agent_tasklet() now returns an explicit boolean flag indicating whether Discord dispatch was actually confirmed in the database, replacing a hardcoded True return. Separately, run_xgboost_tasklet() tightens its training data query to exclude NULL model probabilities and require minimum confidence of 0.59.
Changes
Discord Dispatch Confirmation Signal
Layer / File(s)
Summary
Discord dispatch confirmation flag lifecycle tasklets.py
_any_discord_sent boolean is initialized before the send loop, set to True when the DB update confirms at least one row's discord_sent field flipped, and returned at the end of the function in place of the previous constant True.
XGBoost Training Data Quality Gates
Layer / File(s)
Summary
Model probability filtering in training query tasklets.py
Training data query now filters out rows where model_prob IS NULL and only includes rows where model_prob >= 0.59, raising the quality threshold for historical data used in model retraining.
Estimated code review effort
🎯 2 (Simple) | ⏱️ ~8 minutes
Possibly related PRs
jaayslaughter-cpu/mework#303: Uses the new run_agent_tasklet() return value to guard _record_dispatch_ran_today() logic based on confirmed Discord sends.
jaayslaughter-cpu/mework#416: Modifies run_agent_tasklet() dispatch/send flow dedup logic to prevent opposite-direction double dispatch, directly adjacent to this PR's confirmation signal changes.
jaayslaughter-cpu/mework#217: Adjusts how discord_sent flag is initialized/validated in season_record.py, affecting the same DB state that this PR now explicitly tracks.
Poem
🐰 The tasklet now tells the truth at last—
Did Discord truly send, or was it past?
And models train on confidence so high,
No whispers in the dark, no NULL goodbye! ✨
Check skipped - CodeRabbit’s high-level summary is enabled.
Title check
✅ Passed
The title directly and clearly describes the two main changes: XGB training query filters and dispatch dedup return value, matching the core objectives.
Docstring Coverage
✅ Passed
Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check
✅ Passed
Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check
✅ Passed
Check skipped because no linked issues were found for this pull request.
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing Touches📝 Generate docstrings
Create stacked PR
Commit on current branch
🧪 Generate unit tests (beta)
Create PR with unit tests
Commit unit tests in branch pr-577-xgb-training-dispatch-dedup-fix
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
We reviewed changes in 65d509d...fff78e8 on this pull request. Below is the summary for the review, and you can see the individual issues we found as inline review comments.
AI Review is run only on demand for your team. We're only showing results of static analysis review right now. To trigger AI Review, comment @deepsourcebot review on this thread.
NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer TIP This summary will be updated as you push new changes.
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request updates tasklets.py to track Discord confirmation status, ensuring the tasklet returns True only when at least one message is sent. It also introduces a model_prob filter in the run_xgboost_tasklet SQL query. A critical issue was identified regarding the scale of the model_prob filter; the current threshold of 0.59 should be updated to 59 to correctly reflect the percentage-based storage used elsewhere in the codebase.
The reason will be displayed to describe this comment to others. Learn more.
The filter model_prob >= 0.59 appears to have a scale mismatch. Throughout the codebase (e.g., in _BaseAgent._model_prob and _american_to_implied), model_prob is calculated and stored as a percentage in the 0-100 range (e.g., 60.0 for 60%). Using 0.59 in the SQL query will include almost all rows (including those with 55% probability), failing to exclude the sub-59% rows mentioned in the PR description. This should be changed to 59 to correctly filter for the 59% threshold.
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tasklets.py`:
- Around line 7871-7872: The training filter currently compares model_prob using
the wrong scale (model_prob >= 0.59) so the 59% cutoff isn't applied; update the
SQL/condition that references model_prob (the training filter clause containing
"AND model_prob IS NOT NULL" / "AND model_prob >= 0.59") to use the percent
scale (e.g., model_prob >= 59 or model_prob >= 59.0) and ensure the column is
treated as a numeric type so the comparison behaves as intended.
- Around line 6466-6467: The if-statement checking _flipped is over-indented,
causing a syntax error; unindent the line "if _flipped > 0:" so it aligns with
the surrounding block that contains "_pg3.close()" and "logger.info()", then
ensure the following line setting "_any_discord_sent = True" remains inside that
if-block. Locate the occurrences of "_flipped", "_any_discord_sent",
"_pg3.close()", and "logger.info()" in the same function and adjust the
indentation to match the current block level.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
Push a commit to this branch (recommended)
Create a new PR with the fixes
ℹ️ Review info⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6b083747-ef52-4569-8128-41ffd3ce3c54
📥 Commits
Reviewing files that changed from the base of the PR and between 65d509d and fff78e8.
The reason will be displayed to describe this comment to others. Learn more.
⚠️ Potential issue | 🔴 Critical | ⚡ Quick win
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash# Verify indentation around the failing block
nl -ba tasklets.py | sed -n '6458,6472p'
Repository: jaayslaughter-cpu/mework
Length of output: 109
🏁 Script executed:
#!/bin/bash# View lines 6458-6475 with visible whitespace to check indentation
sed -n '6458,6475p' tasklets.py | cat -A
Repository: jaayslaughter-cpu/mework
Length of output: 1296
🏁 Script executed:
#!/bin/bash# Also try to check Python syntax using Python's ast module
python3 <<'EOF'import asttry: with open('tasklets.py', 'r') as f: code = f.read() ast.parse(code) print("✓ tasklets.py has valid Python syntax")except SyntaxError as e: print(f"✗ Syntax error found:") print(f" Line {e.lineno}: {e.msg}") if e.text: print(f" Code: {e.text.rstrip()}") print(f" {' ' * (e.offset - 1)}^")EOF
Repository: jaayslaughter-cpu/mework
Length of output: 189
🏁 Script executed:
#!/bin/bash# Get exact context around lines 6466-6467 using sed with line numbers
sed -n '6460,6475p' tasklets.py | head -20
Repository: jaayslaughter-cpu/mework
Length of output: 978
Fix the indentation error causing parse failure.
Line 6466 is over-indented, which makes the module invalid Python syntax. The if _flipped > 0: statement should be aligned with the preceding _pg3.close() and logger.info() calls at the same block level, not indented further.
‼️IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Suggested change
if_flipped>0:
_any_discord_sent=True# PR #577
if_flipped>0:
_any_discord_sent=True# PR `#577`
🧰 Tools🪛 Ruff (0.15.12)
[warning] 6466-6466: Unexpected indentation
(invalid-syntax)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tasklets.py` around lines 6466 - 6467, The if-statement checking _flipped is
over-indented, causing a syntax error; unindent the line "if _flipped > 0:" so
it aligns with the surrounding block that contains "_pg3.close()" and
"logger.info()", then ensure the following line setting "_any_discord_sent =
True" remains inside that if-block. Locate the occurrences of "_flipped",
"_any_discord_sent", "_pg3.close()", and "logger.info()" in the same function
and adjust the indentation to match the current block level.
The reason will be displayed to describe this comment to others. Learn more.
⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Use the correct model_prob scale in the training filter.
model_prob is stored as percent values (e.g., 59.0), so model_prob >= 0.59 is effectively always true for normal rows and won’t enforce the 59% gate.
Proposed fix
- AND model_prob >= 0.59+ AND model_prob >= 59.0
📝 Committable suggestion
‼️IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Suggested change
ANDmodel_probISNOTNULL
ANDmodel_prob>=0.59
ANDmodel_probISNOTNULL
ANDmodel_prob>=59.0
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tasklets.py` around lines 7871 - 7872, The training filter currently compares
model_prob using the wrong scale (model_prob >= 0.59) so the 59% cutoff isn't
applied; update the SQL/condition that references model_prob (the training
filter clause containing "AND model_prob IS NOT NULL" / "AND model_prob >=
0.59") to use the percent scale (e.g., model_prob >= 59 or model_prob >= 59.0)
and ensure the column is treated as a numeric type so the comparison behaves as
intended.
The reason will be displayed to describe this comment to others. Learn more.
1 issue found across 1 file
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="tasklets.py">
<violation number="1" location="tasklets.py:7872">
P1: `model_prob >= 0.59` uses the wrong unit scale; with percent-based `model_prob`, this filter does not remove sub-59% training rows.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review. Re-trigger cubic
The reason will be displayed to describe this comment to others. Learn more.
P1: model_prob >= 0.59 uses the wrong unit scale; with percent-based model_prob, this filter does not remove sub-59% training rows.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At tasklets.py, line 7872:
<comment>`model_prob >= 0.59` uses the wrong unit scale; with percent-based `model_prob`, this filter does not remove sub-59% training rows.</comment>
<file context>
@@ -7865,6 +7868,8 @@ def run_xgboost_tasklet() -> None:
AND (lookahead_safe IS NULL OR lookahead_safe = TRUE)
AND agent_name NOT ILIKE '%seed%'
+ AND model_prob IS NOT NULL
+ AND model_prob >= 0.59
AND prop_type NOT IN (
'fantasy_score', 'fantasy_hitter', 'fantasy_pitcher',
</file context>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
PR #577 — Critical Fixes Before Tonight's 2:30 AM Retrain
Problem 1: XGB training query missing model_prob filters
model_prob < 59%(historical gate bypass) — causes biased modelsmodel_prob = NULL(crash orphans from May 15-16) — causes null errorsFix: Added
AND model_prob IS NOT NULL AND model_prob >= 0.59to the training query.Problem 2:
dispatch_date_logwritten even when Discord send failsrun_agent_tasklet()returnedTrueunconditionally → dedup blocked same-day retriesFix:
_any_discord_sent: bool = Falsebefore parlay loop_any_discord_sent = TrueONLY when Discord webhook confirmed (_flipped > 0)_any_discord_sent— orchestrator's existingif result is True: _record_dispatch_ran_today()guard now works correctlySummary by cubic
Fixes XGB training data selection and dispatch dedup logic to prevent biased models and unblock same‑day retries after failures. Enforces
model_probfilters and only records a dispatch when a Discord send actually succeeds.AND model_prob IS NOT NULL AND model_prob >= 0.59to exclude nulls and low‑confidence rows._any_discord_sentand return it; only mark dispatch as run when Discord confirms at least one send.Written for commit fff78e8. Summary will update on new commits. Review in cubic
Summary by CodeRabbit
Release Notes
Bug Fixes
Improvements