PR #587 — Fix XGBoost retrain crash: tz-aware vs naive datetime TypeError in sample_weights by jaayslaughter-cpu · Pull Request #458 · jaayslaughter-cpu/mework

jaayslaughter-cpu · 2026-05-18T17:42:45Z

Root Cause

run_xgboost_tasklet() has been crashing on every nightly 2:30 AM retrain since PR #575 merged. The sample_weights comprehension raises TypeError: can't subtract offset-naive and offset-aware datetimes.

# BUG (PR #575 introduced this):
now_utc = datetime.datetime.now(datetime.timezone.utc)  # tz-AWARE
sample_weights = np.array([
    np.exp(-0.01 * max((now_utc - _parse_graded_at(r[2]).replace(tzinfo=None)).days, 0))
    #                   ^^^^^^^^ tz-aware   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ tz-naive
    #                   TypeError on EVERY row
    for r in rows
], dtype=np.float32)

_parse_graded_at(r[2]).replace(tzinfo=None) strips timezone → naive datetime.
now_utc - naive_datetime → TypeError on every row, function crashes before model.fit().
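
The failure can be reproduced outside the project with the standard library alone. This is a minimal sketch: the graded_at value is made up, and `subtract_mixed` is a hypothetical helper that exists only to capture the error message.

```python
import datetime

# tz-aware "now", built the same way as in the buggy code
now_utc = datetime.datetime.now(datetime.timezone.utc)

# Hypothetical graded_at value, stripped to a naive datetime as the bug does
graded_naive = datetime.datetime(
    2026, 5, 1, 12, 0, tzinfo=datetime.timezone.utc
).replace(tzinfo=None)


def subtract_mixed():
    """Return the message raised by aware - naive subtraction, or None."""
    try:
        _ = now_utc - graded_naive
    except TypeError as exc:
        return str(exc)
    return None


error_message = subtract_mixed()
print(error_message)  # can't subtract offset-naive and offset-aware datetimes
```

Because the exception is raised by the subtraction itself, it fires on the first row, regardless of what the row contains.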

This is why xgb_model_store has been empty since day 1. No model → base rate ~52-58% → MIN_PROB gate (60%) blocks all legs → eval_none=2182 every dispatch → zero picks sent.

Fix

_now_naive = now_utc.replace(tzinfo=None)  # PR #587: fix tz-aware vs naive TypeError
sample_weights = np.array([
    np.exp(-0.01 * max((_now_naive - _parse_graded_at(r[2]).replace(tzinfo=None)).days, 0))
    for r in rows
], dtype=np.float32)

One-line fix: strip timezone from now_utc too, so both operands are naive.
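
To make the recency decay concrete, here is a self-contained sketch of the corrected computation. The rows, the fixed now_utc value, and the ISO-string parser standing in for _parse_graded_at are all assumptions for illustration; the real parser is not shown in this PR. math.exp replaces np.exp only to keep the sketch dependency-free.

```python
import datetime
import math


def parse_graded_at(value: str) -> datetime.datetime:
    """Hypothetical stand-in for _parse_graded_at: parse an ISO 8601 string."""
    return datetime.datetime.fromisoformat(value)


def recency_weight(now_naive: datetime.datetime, graded_at: datetime.datetime) -> float:
    """exp(-0.01 * age_in_days), with the age clamped at zero."""
    age_days = max((now_naive - graded_at.replace(tzinfo=None)).days, 0)
    return math.exp(-0.01 * age_days)


# Fixed "now" so the result is reproducible (assumed value).
now_utc = datetime.datetime(2026, 5, 18, 2, 30, tzinfo=datetime.timezone.utc)
now_naive = now_utc.replace(tzinfo=None)  # both operands are now naive

# Made-up rows; index 2 holds graded_at, mirroring r[2] in the PR.
rows = [
    (1, "leg_a", "2026-05-18T01:00:00+00:00"),  # same day -> age 0
    (2, "leg_b", "2026-05-08T02:30:00+00:00"),  # 10 days old
    (3, "leg_c", "2026-02-07T02:30:00+00:00"),  # 100 days old
]

weights = [recency_weight(now_naive, parse_graded_at(r[2])) for r in rows]
print(weights)
```

With a decay rate of 0.01 per day, a row's weight halves roughly every 69 days (ln 2 / 0.01), so the 100-day-old row above contributes about 37% as much as a row graded today.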

Impact

After this PR merges and Railway redeploys, tonight's 2:30 AM retrain will:

1. Query 72 qualifying training rows
2. Compute sample weights without crashing
3. Train XGBoost on real features
4. Save model to xgb_model_store (INSERT works, verified manually)
5. Tomorrow's 8:30 AM dispatch loads the model → predictions > 60% → picks fire

Note

A manually-trained model was inserted into xgb_model_store (id=2) as a bridge fix for today's dispatch window. This PR ensures the automated 2:30 AM retrain works correctly going forward.

Summary by cubic

Fixes a TypeError in the XGBoost retrain by aligning datetime types in sample_weights. The nightly 2:30 AM retrain now completes and saves a model to xgb_model_store, enabling dispatch to use it.

Bug Fixes
- Convert now_utc to a naive datetime before subtraction (_now_naive = now_utc.replace(tzinfo=None)) to prevent tz-aware vs naive errors in weight calculation.

Written for commit 03648dc. Summary will update on new commits.

Summary by CodeRabbit

Bug Fixes
- Resolved timezone handling errors in the weekly model retraining process that previously caused weight calculation failures.

coderabbitai · 2026-05-18T17:42:56Z

📝 Walkthrough

The PR fixes a timezone handling bug in XGBoost weekly retraining. The sample-weight recency decay now normalizes both the current timestamp and parsed graded_at timestamp to timezone-naive datetimes before computing days elapsed, preventing type errors from mixing aware and naive datetime values.

Changes

Sample-weight recency calculation

Layer / File(s)	Summary
Timezone-naive datetime normalization `tasklets.py`	The recency decay computation derives `_now_naive` from `now_utc` by removing `tzinfo` and forces the parsed `graded_at` timestamp to be naive before computing the time delta in days, preventing tz-aware vs tz-naive `TypeError`.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Poem

🐰 A rabbit hops through timezones with glee,
Stripping the tzinfo from UTC's spree,
Now aware and naive play along,
No type errors in the weight song! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title directly and specifically describes the main change: fixing a timezone-aware vs naive datetime TypeError in XGBoost sample_weights calculation that was causing retrain crashes.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

secure-code-warrior-for-github · 2026-05-18T17:42:59Z

Micro-Learning Topic: Security Misconfiguration (Detected by phrase)

Matched on "security misconfiguration"

deepsource-io · 2026-05-18T17:43:03Z

DeepSource Code Review

We reviewed changes in b606d71...03648dc on this pull request. Below is the summary for the review, and you can see the individual issues we found as inline review comments.

See full review on DeepSource ↗

PR Report Card

Overall Grade	Security Reliability Complexity Hygiene

Code Review Summary

Analyzer	Updated (UTC)	Details
Docker	May 18, 2026 5:42p.m.	Review ↗
JavaScript	May 18, 2026 5:42p.m.	Review ↗
Python	May 18, 2026 5:42p.m.	Review ↗
SQL	May 18, 2026 5:42p.m.	Review ↗
Secrets	May 18, 2026 5:42p.m.	Review ↗

Important

AI Review is run only on demand for your team. We're only showing results of static analysis review right now. To trigger AI Review, comment @deepsourcebot review on this thread.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tasklets.py`:
- Around line 7997-8000: The current code drops tzinfo naively which can skew
day differences; convert both timestamps to UTC before removing tzinfo: call
_parse_graded_at(row[2]) and convert it to UTC (e.g., via
astimezone(datetime.timezone.utc)) then replace(tzinfo=None) and do the same for
now_utc (or simply use
now_utc.astimezone(datetime.timezone.utc).replace(tzinfo=None)) so that the
computation inside sample_weights (uses _now_naive, _parse_graded_at, rows)
compares timestamps in UTC rather than dropping offsets in place.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 70cf0117-d653-48d1-9b74-d914523f1db4

📥 Commits

Reviewing files that changed from the base of the PR and between b606d71 and 03648dc.

📒 Files selected for processing (1)

tasklets.py

coderabbitai · 2026-05-18T17:44:19Z

+    _now_naive = now_utc.replace(tzinfo=None)  # PR #587: fix tz-aware vs naive TypeError
    sample_weights = np.array([
-        np.exp(-0.01 * max((now_utc - _parse_graded_at(r[2]).replace(tzinfo=None)).days, 0))
+        np.exp(-0.01 * max((_now_naive - _parse_graded_at(r[2]).replace(tzinfo=None)).days, 0))
        for r in rows


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Normalize aware timestamps to UTC before dropping tzinfo.

This avoids the TypeError, but replace(tzinfo=None) on an offset-aware graded_at drops the offset without conversion, which can skew .days and sample weights for non-UTC offsets.

Suggested fix

     _now_naive = now_utc.replace(tzinfo=None)  # PR #587: fix tz-aware vs naive TypeError
+    def _to_utc_naive(dt: datetime.datetime) -> datetime.datetime:
+        if dt.tzinfo is not None:
+            return dt.astimezone(datetime.timezone.utc).replace(tzinfo=None)
+        return dt
+
     sample_weights = np.array([
-        np.exp(-0.01 * max((_now_naive - _parse_graded_at(r[2]).replace(tzinfo=None)).days, 0))
+        np.exp(-0.01 * max((_now_naive - _to_utc_naive(_parse_graded_at(r[2]))).days, 0))
         for r in rows
     ], dtype=np.float32)
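
The skew described in this comment only appears if _parse_graded_at can return an aware datetime with a non-UTC offset, which this PR does not confirm. Under that assumption, the following sketch compares dropping the offset in place with converting to UTC first. The timestamps are made up, and to_utc_naive mirrors the helper suggested above.

```python
import datetime


def to_utc_naive(dt: datetime.datetime) -> datetime.datetime:
    """Convert aware datetimes to UTC, then drop tzinfo; pass naive ones through."""
    if dt.tzinfo is not None:
        return dt.astimezone(datetime.timezone.utc).replace(tzinfo=None)
    return dt


# Assumed "now" in UTC, already naive (as _now_naive is in the PR).
now_naive = datetime.datetime(2026, 5, 19, 2, 30)

# Hypothetical graded_at recorded with a -05:00 offset (04:30 UTC on May 18).
graded_at = datetime.datetime.fromisoformat("2026-05-17T23:30:00-05:00")

# Dropping the offset keeps the local wall-clock time: 27 hours before "now".
days_dropped = (now_naive - graded_at.replace(tzinfo=None)).days

# Converting first compares true UTC instants: 22 hours before "now".
days_converted = (now_naive - to_utc_naive(graded_at)).days

print(days_dropped, days_converted)  # 1 0
```

A one-day difference changes a weight by about 1% at this decay rate, which is consistent with the Minor severity assigned to the finding.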

codacy-production · 2026-05-18T17:44:49Z

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

cubic-dev-ai

No issues found across 1 file

gemini-code-assist

Code Review

This pull request fixes a TypeError in tasklets.py by converting the timezone-aware now_utc variable to a naive datetime before calculating sample weights. The reviewer suggests a more robust long-term approach of maintaining timezone awareness throughout the calculation to prevent potential bugs with mixed timezones, providing a code suggestion for a helper function to ensure UTC awareness.

gemini-code-assist · 2026-05-18T17:50:05Z

+    _now_naive = now_utc.replace(tzinfo=None)  # PR #587: fix tz-aware vs naive TypeError
    sample_weights = np.array([
-        np.exp(-0.01 * max((now_utc - _parse_graded_at(r[2]).replace(tzinfo=None)).days, 0))
+        np.exp(-0.01 * max((_now_naive - _parse_graded_at(r[2]).replace(tzinfo=None)).days, 0))
        for r in rows
    ], dtype=np.float32)


While this fix is correct and solves the immediate TypeError, a more robust long-term solution is to work with timezone-aware datetimes consistently, rather than stripping timezone information. This prevents subtle bugs if data from different timezones is introduced in the future.

A better approach would be to ensure the graded_at datetime is timezone-aware (in UTC) before subtracting it from the aware now_utc. This makes the code's intent more explicit and safer.

Here's an alternative implementation for consideration, which could be placed inside run_xgboost_tasklet:

def _ensure_utc_aware(dt):
    """Helper to ensure a datetime is timezone-aware and in UTC."""
    if dt.tzinfo is None:
        # Assume naive datetimes from DB are in UTC.
        return dt.replace(tzinfo=datetime.timezone.utc)
    # Convert other aware datetimes to UTC.
    return dt.astimezone(datetime.timezone.utc)

sample_weights = np.array([
    np.exp(-0.01 * max((now_utc - _ensure_utc_aware(_parse_graded_at(r[2]))).days, 0))
    for r in rows
], dtype=np.float32)

This pattern of ensuring datetimes are aware before use is a good practice for robustness.
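
As a quick check of the aware-throughout approach, the sketch below feeds one instant to the helper in three representations (naive, aware UTC, and aware with a -05:00 offset) and confirms that all three give the same age. The timestamps and the fixed now_utc are assumptions for illustration, and the naive value is assumed to be UTC, as in the suggestion above.

```python
import datetime


def ensure_utc_aware(dt: datetime.datetime) -> datetime.datetime:
    """Return dt as an aware UTC datetime; naive inputs are assumed to be UTC."""
    if dt.tzinfo is None:
        return dt.replace(tzinfo=datetime.timezone.utc)
    return dt.astimezone(datetime.timezone.utc)


# Fixed, aware "now" so the result is reproducible (assumed value).
now_utc = datetime.datetime(2026, 5, 18, 2, 30, tzinfo=datetime.timezone.utc)

minus_five = datetime.timezone(datetime.timedelta(hours=-5))

# Three representations of the same instant: 2026-05-08 02:30 UTC.
same_instant = [
    datetime.datetime(2026, 5, 8, 2, 30),                               # naive
    datetime.datetime(2026, 5, 8, 2, 30, tzinfo=datetime.timezone.utc),  # aware UTC
    datetime.datetime(2026, 5, 7, 21, 30, tzinfo=minus_five),            # aware -05:00
]

ages = [(now_utc - ensure_utc_aware(dt)).days for dt in same_instant]
print(ages)  # [10, 10, 10]
```

Since both operands stay aware, the subtraction never mixes naive and aware values, and any offset is resolved before the day count is taken.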

Fix XGBoost retrain crash: tz-aware vs naive datetime in sample_weights

03648dc

coderabbitai Bot reviewed May 18, 2026

cubic-dev-ai Bot reviewed May 18, 2026

gemini-code-assist Bot reviewed May 18, 2026

jaayslaughter-cpu merged commit e3afc09 into main May 18, 2026
8 of 9 checks passed

jaayslaughter-cpu merged 1 commit into main from fix/xgb-sample-weights-tz
