PR #587 — Fix XGBoost retrain crash: tz-aware vs naive datetime TypeError in sample_weights #458

Merged
jaayslaughter-cpu merged 1 commit into main from fix/xgb-sample-weights-tz
May 18, 2026

Conversation

@jaayslaughter-cpu
Owner

@jaayslaughter-cpu jaayslaughter-cpu commented May 18, 2026

Root Cause

run_xgboost_tasklet() has been crashing on every nightly 2:30 AM retrain since PR #575 merged. The crash is a TypeError: can't subtract offset-naive and offset-aware datetimes in the sample_weights comprehension.

# BUG (PR #575 introduced this):
now_utc = datetime.datetime.now(datetime.timezone.utc)  # tz-AWARE
sample_weights = np.array([
    np.exp(-0.01 * max((now_utc - _parse_graded_at(r[2]).replace(tzinfo=None)).days, 0))
    #                   ^^^^^^^^ tz-aware   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ tz-naive
    #                   TypeError on EVERY row
    for r in rows
], dtype=np.float32)

_parse_graded_at(r[2]).replace(tzinfo=None) strips the timezone, yielding a naive datetime.
now_utc - naive_datetime → TypeError on every row, so the function crashes before model.fit().

This is why xgb_model_store has been empty since day 1. No model → base rate ~52-58% → MIN_PROB gate (60%) blocks all legs → eval_none=2182 every dispatch → zero picks sent.
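
The failure mode can be reproduced outside the tasklet. The following is a minimal sketch using only the standard library; the `graded_at` value is a made-up stand-in for what `_parse_graded_at(r[2]).replace(tzinfo=None)` would return, not real data from the pipeline.

```python
import datetime

# tz-aware "now", built the same way as in the buggy code path.
now_utc = datetime.datetime.now(datetime.timezone.utc)

# Hypothetical naive timestamp standing in for a parsed graded_at value.
graded_at = datetime.datetime(2026, 5, 1, 12, 0, 0)

try:
    age = now_utc - graded_at  # aware minus naive is rejected by datetime
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

Because the subtraction sits inside a list comprehension, the first row is enough to abort the whole weight computation.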

Fix

_now_naive = now_utc.replace(tzinfo=None)  # PR #587: fix tz-aware vs naive TypeError
sample_weights = np.array([
    np.exp(-0.01 * max((_now_naive - _parse_graded_at(r[2]).replace(tzinfo=None)).days, 0))
    for r in rows
], dtype=np.float32)

One-line fix: strip timezone from now_utc too, so both operands are naive.
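
As a rough, self-contained illustration of the fixed computation, the sketch below wraps the same expression in a function. The name `recency_weights`, the `DECAY_PER_DAY` constant, and the sample timestamps are assumptions for this example; it also uses `math.exp` and a plain list in place of the PR's `np.exp` and `np.array` so that it runs without NumPy.

```python
import datetime
import math

DECAY_PER_DAY = 0.01  # same rate as the 0.01 in the PR's comprehension


def recency_weights(graded_ats, now_utc):
    """Return exp(-0.01 * age_in_days) per timestamp, clamping negative ages to 0."""
    # Strip tzinfo from "now" so both operands of the subtraction are naive.
    now_naive = now_utc.replace(tzinfo=None)
    return [
        math.exp(-DECAY_PER_DAY * max((now_naive - ts.replace(tzinfo=None)).days, 0))
        for ts in graded_ats
    ]


now_utc = datetime.datetime(2026, 5, 18, 2, 30, tzinfo=datetime.timezone.utc)
graded = [
    datetime.datetime(2026, 5, 8, 2, 30),  # exactly 10 days old
    datetime.datetime(2026, 5, 20, 0, 0),  # in the future, so the age clamps to 0
]
weights = recency_weights(graded, now_utc)
print(weights)
```

With a decay rate of 0.01 per day, a row's weight halves roughly every 69 days (ln 2 / 0.01 ≈ 69.3).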

Impact

After this PR merges and Railway redeploys, tonight's 2:30 AM retrain will:

  1. Query 72 qualifying training rows
  2. Compute sample weights without crashing
  3. Train XGBoost on real features
  4. Save model to xgb_model_store (INSERT works — verified manually)
  5. Tomorrow's 8:30 AM dispatch loads the model → predictions > 60% → picks fire

Note

A manually-trained model was inserted into xgb_model_store (id=2) as a bridge fix for today's dispatch window. This PR ensures the automated 2:30 AM retrain works correctly going forward.


Summary by cubic

Fixes a TypeError in the XGBoost retrain by aligning datetime types in sample_weights. The nightly 2:30 AM retrain now completes and saves a model to xgb_model_store, enabling dispatch to use it.

  • Bug Fixes
    • Convert now_utc to a naive datetime before subtraction (_now_naive = now_utc.replace(tzinfo=None)) to prevent tz-aware vs naive errors in weight calculation.

Written for commit 03648dc.

Summary by CodeRabbit

  • Bug Fixes
    • Resolved timezone handling errors in the weekly model retraining process that previously caused weight calculation failures.


@coderabbitai

coderabbitai Bot commented May 18, 2026

Walkthrough

The PR fixes a timezone handling bug in XGBoost weekly retraining. The sample-weight recency decay now normalizes both the current timestamp and parsed graded_at timestamp to timezone-naive datetimes before computing days elapsed, preventing type errors from mixing aware and naive datetime values.

Changes

Sample-weight recency calculation

| Layer / File(s) | Summary |
|---|---|
| Timezone-naive datetime normalization (tasklets.py) | The recency decay computation derives _now_naive from now_utc by removing tzinfo and forces the parsed graded_at timestamp to be naive before computing the time delta in days, preventing tz-aware vs tz-naive TypeError. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Poem

🐰 A rabbit hops through timezones with glee,
Stripping the tzinfo from UTC's spree,
Now aware and naive play along,
No type errors in the weight song! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title directly and specifically describes the main change: fixing a timezone-aware vs naive datetime TypeError in XGBoost sample_weights calculation that was causing retrain crashes. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@secure-code-warrior-for-github

Micro-Learning Topic: Security Misconfiguration (Detected by phrase)

Matched on "security misconfiguration"


@deepsource-io

deepsource-io Bot commented May 18, 2026

DeepSource Code Review

We reviewed changes in b606d71...03648dc on this pull request. Below is the summary for the review, and you can see the individual issues we found as inline review comments.


PR Report Card

Overall Grade   Security  

Reliability  

Complexity  

Hygiene  

Code Review Summary

| Analyzer | Status | Updated (UTC) | Details |
|---|---|---|---|
| Docker | | May 18, 2026 5:42 p.m. | Review ↗ |
| JavaScript | | May 18, 2026 5:42 p.m. | Review ↗ |
| Python | | May 18, 2026 5:42 p.m. | Review ↗ |
| SQL | | May 18, 2026 5:42 p.m. | Review ↗ |
| Secrets | | May 18, 2026 5:42 p.m. | Review ↗ |

Important

AI Review is run only on demand for your team. We're only showing results of static analysis review right now. To trigger AI Review, comment @deepsourcebot review on this thread.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tasklets.py`:
- Around line 7997-8000: The current code drops tzinfo in place, which can skew
day differences; convert both timestamps to UTC before removing tzinfo: call
_parse_graded_at(r[2]) and convert it to UTC (e.g., via
astimezone(datetime.timezone.utc)) then replace(tzinfo=None) and do the same for
now_utc (or simply use
now_utc.astimezone(datetime.timezone.utc).replace(tzinfo=None)) so that the
computation inside sample_weights (uses _now_naive, _parse_graded_at, rows)
compares timestamps in UTC rather than dropping offsets in place.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 70cf0117-d653-48d1-9b74-d914523f1db4

📥 Commits

Reviewing files that changed from the base of the PR and between b606d71 and 03648dc.

📒 Files selected for processing (1)
  • tasklets.py

Comment thread tasklets.py
Comment on lines +7997 to 8000
+    _now_naive = now_utc.replace(tzinfo=None)  # PR #587: fix tz-aware vs naive TypeError
     sample_weights = np.array([
-        np.exp(-0.01 * max((now_utc - _parse_graded_at(r[2]).replace(tzinfo=None)).days, 0))
+        np.exp(-0.01 * max((_now_naive - _parse_graded_at(r[2]).replace(tzinfo=None)).days, 0))
         for r in rows

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Normalize aware timestamps to UTC before dropping tzinfo.

This avoids the TypeError, but replace(tzinfo=None) on an offset-aware graded_at drops the offset without conversion, which can skew .days and sample weights for non-UTC offsets.

Suggested fix
     _now_naive = now_utc.replace(tzinfo=None)  # PR `#587`: fix tz-aware vs naive TypeError
+    def _to_utc_naive(dt: datetime.datetime) -> datetime.datetime:
+        if dt.tzinfo is not None:
+            return dt.astimezone(datetime.timezone.utc).replace(tzinfo=None)
+        return dt
+
     sample_weights = np.array([
-        np.exp(-0.01 * max((_now_naive - _parse_graded_at(r[2]).replace(tzinfo=None)).days, 0))
+        np.exp(-0.01 * max((_now_naive - _to_utc_naive(_parse_graded_at(r[2]))).days, 0))
         for r in rows
     ], dtype=np.float32)
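
To make the skew concrete, here is a small sketch comparing the two ways of dropping tzinfo. The timestamps are invented for illustration, and `to_utc_naive` mirrors the `_to_utc_naive` helper suggested in this review rather than code that exists in tasklets.py.

```python
import datetime

UTC = datetime.timezone.utc


def to_utc_naive(dt):
    """Convert aware datetimes to UTC, then drop tzinfo; pass naive ones through."""
    if dt.tzinfo is not None:
        return dt.astimezone(UTC).replace(tzinfo=None)
    return dt


# Hypothetical graded_at stored with a -05:00 offset (04:30 UTC on May 18).
graded_at = datetime.datetime(
    2026, 5, 17, 23, 30,
    tzinfo=datetime.timezone(datetime.timedelta(hours=-5)),
)
now_naive = datetime.datetime(2026, 5, 19, 2, 0)  # naive, interpreted as UTC

# Dropping the offset in place keeps the local wall-clock time.
days_stripped = (now_naive - graded_at.replace(tzinfo=None)).days
# Converting to UTC first compares the true instants.
days_converted = (now_naive - to_utc_naive(graded_at)).days

print(days_stripped, days_converted)  # 1 0
```

The same row is counted as one day old when the offset is simply dropped and zero days old when it is converted first, so its weight differs by a factor of exp(-0.01). If every graded_at value is already stored in UTC, the two approaches agree.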

@codacy-production

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues


Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment


No issues found across 1 file


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request fixes a TypeError in tasklets.py by converting the timezone-aware now_utc variable to a naive datetime before calculating sample weights. The reviewer suggests a more robust long-term approach of maintaining timezone awareness throughout the calculation to prevent potential bugs with mixed timezones, providing a code suggestion for a helper function to ensure UTC awareness.

Comment thread tasklets.py
Comment on lines +7997 to 8001
+    _now_naive = now_utc.replace(tzinfo=None)  # PR #587: fix tz-aware vs naive TypeError
     sample_weights = np.array([
-        np.exp(-0.01 * max((now_utc - _parse_graded_at(r[2]).replace(tzinfo=None)).days, 0))
+        np.exp(-0.01 * max((_now_naive - _parse_graded_at(r[2]).replace(tzinfo=None)).days, 0))
         for r in rows
     ], dtype=np.float32)

medium

While this fix is correct and solves the immediate TypeError, a more robust long-term solution is to work with timezone-aware datetimes consistently, rather than stripping timezone information. This prevents subtle bugs if data from different timezones is introduced in the future.

A better approach would be to ensure the graded_at datetime is timezone-aware (in UTC) before subtracting it from the aware now_utc. This makes the code's intent more explicit and safer.

Here's an alternative implementation for consideration, which could be placed inside run_xgboost_tasklet:

def _ensure_utc_aware(dt):
    """Helper to ensure a datetime is timezone-aware and in UTC."""
    if dt.tzinfo is None:
        # Assume naive datetimes from DB are in UTC.
        return dt.replace(tzinfo=datetime.timezone.utc)
    # Convert other aware datetimes to UTC.
    return dt.astimezone(datetime.timezone.utc)

sample_weights = np.array([
    np.exp(-0.01 * max((now_utc - _ensure_utc_aware(_parse_graded_at(r[2]))).days, 0))
    for r in rows
], dtype=np.float32)

This pattern of ensuring datetimes are aware before use is a good practice for robustness.
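
A short usage sketch of that aware-first pattern follows, under the same assumption the helper makes (naive values are already in UTC). The function name `ensure_utc_aware` and both sample timestamps are hypothetical; the two inputs were chosen to represent the same instant.

```python
import datetime

UTC = datetime.timezone.utc


def ensure_utc_aware(dt):
    """Return a tz-aware UTC datetime for either a naive or an aware input."""
    if dt.tzinfo is None:
        # Assumption: naive values coming from storage are already in UTC.
        return dt.replace(tzinfo=UTC)
    return dt.astimezone(UTC)


now_utc = datetime.datetime(2026, 5, 18, 2, 30, tzinfo=UTC)

naive_ts = datetime.datetime(2026, 5, 15, 2, 30)  # treated as 02:30 UTC
offset_ts = datetime.datetime(
    2026, 5, 14, 21, 30,
    tzinfo=datetime.timezone(datetime.timedelta(hours=-5)),
)  # the same instant, written with a -05:00 offset

ages = [(now_utc - ensure_utc_aware(ts)).days for ts in (naive_ts, offset_ts)]
print(ages)  # [3, 3]
```

Both representations yield the same age in days, and no subtraction mixes aware and naive operands.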

@jaayslaughter-cpu jaayslaughter-cpu merged commit e3afc09 into main May 18, 2026
8 of 9 checks passed