Skip to content

feat(tesseract): native Rust time-series generation#10980

Merged
waralexrom merged 7 commits into
masterfrom
tesseract-native-time-series-generation
Jun 1, 2026
Merged

feat(tesseract): native Rust time-series generation#10980
waralexrom merged 7 commits into
masterfrom
tesseract-native-time-series-generation

Conversation

@waralexrom
Copy link
Copy Markdown
Member

Summary

Move time-series bucket generation for the Tesseract (Rust) planner from the JS bridge into native Rust. The planner no longer calls back into JavaScript to build time series, and the dead bridge methods are removed.

Changes

  • Add QueryTimeSeries (planner/time_dimension/time_series.rs) with generate_predefined and generate_custom, mirroring the established bucket/alignment/padding semantics; both paths are capped at a shared module-level MAX_BUCKETS.
  • Route physical_plan::TimeSeries and the mock through the Rust impl for drivers without generated-series support; delete the old mock-only helper.
  • Fix week-only custom granularity origin: snap the default origin to ISO Monday (matches the JS fixOriginForWeeksIfNeeded), via SqlInterval::is_week_only + QueryDateTime::start_of_iso_week.
  • Drop the now-dead generate_time_series / generate_custom_time_series from the BaseTools bridge trait and its consumers in cubejs-backend-native (StubBaseTools, invoke_base_tools, JS bridge fixtures). The JS implementations stay for the legacy planner.
  • Enable the previously-ignored custom-granularity-as-query-granularity tests (half_year / bi_weekly / to_date); pin an explicit origin on the bi_weekly fixture so its snapshot is year-stable.

Testing

  • cargo test -p cubesqlplanner — time_series unit tests (predefined/custom shapes, bucket cap, calendar intervals, precision, alignment) and rolling_window integration tests (snapshots verified against seed data) pass.
  • cargo build of cubejs-backend-native compiles after the bridge change.
  • cargo fmt --check clean.

Add QueryTimeSeries to planner/time_dimension with two entry points:
  * generate_predefined(granularity, range, precision) — snaps the
    start to the bucket boundary (day / week ISO-Mon / month / quarter
    / year / hour / minute / second) and emits buckets until past the
    end. Sub-second part padded with '0' / '9' to the requested precision.
  * generate_custom(interval, range, origin, precision) — aligns to
    origin, walks by parsed SqlInterval (calendar-aware for month /
    quarter / year via chrono::Months, fixed Duration for sub-day),
    end = next - 1 second. Iteration capped at 50k buckets.

physical_plan::TimeSeries and MockBaseTools now route through the Rust
impl for drivers without generated-series support; the bridge
generate_time_series / generate_custom_time_series calls disappear from
the planner path. The mock-only helper test_fixtures/cube_bridge/
time_series.rs is deleted — the new module covers it with 15 inline
tests for predefined / custom shapes.
A custom granularity with a week-only interval (e.g. "2 weeks") and no
explicit origin/offset fell back to the default origin (start of the
current year) without snapping to the start of the week. That produced
bucket boundaries offset from the expected Monday-aligned grid.

Snap the default origin to the ISO Monday when the interval is week-only,
applied in both the default and offset branches of Granularity. Adds
SqlInterval::is_week_only and QueryDateTime::start_of_iso_week, covered
by deterministic unit tests in granularity.rs.
Enable the previously-ignored half_year / bi_weekly / to_date rolling
window tests now that the Rust time-series path generates the buckets.
Pin an explicit origin on the bi_weekly fixture granularity so its
bucket alignment no longer depends on the current year, keeping the
snapshot stable. Commit the reviewed insta baselines (verified against
the seed data).
The planner now generates time series via QueryTimeSeries, so the
BaseTools bridge no longer calls generate_time_series /
generate_custom_time_series. Remove them from the native_bridge trait
and the mock. The JavaScript implementations stay in place for the
legacy planner.
- Drop the now-removed generate_time_series / generate_custom_time_series
  from the cubejs-backend-native bridge test exports (StubBaseTools impl
  and invoke_base_tools) and the matching JS bridge fixtures / coverage
  list, so the native crate compiles after the BaseTools trait change.
- Cap generate_predefined at the shared MAX_BUCKETS (hoisted to module
  level) so a wide range at second/minute granularity errors instead of
  looping unbounded.
- add_interval_to_dt: return the combined datetime directly and note the
  fixed unit-application order.
Add unit tests for previously-uncovered QueryTimeSeries paths:
- custom MAX_BUCKETS cap
- months+days and sub-day intervals in add_interval_to_dt
- the non-advancing (net-negative) interval guard
- precision-6 padding on the custom path
- quarter and year custom intervals
@waralexrom waralexrom requested a review from a team as a code owner June 1, 2026 11:02
@github-actions github-actions Bot added rust Pull requests that update Rust code javascript Pull requests that update Javascript code labels Jun 1, 2026
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Jun 1, 2026

Claude finished @waralexrom's task in 1m 46s —— View job


Claude PR Review

Verdict: Approve with nits — 1 medium (latent infinite-loop in align_to_origin), 2 low (range-end boundary inconsistency, interval-order contract).

Details

Todos

  • Read new time_series.rs module
  • Read granularity / date_time / sql_interval changes
  • Read physical_plan and bridge changes
  • Review tests
  • Post inline findings

Summary

This is a clean, well-scoped move: the JS bridge methods generate_time_series / generate_custom_time_series are deleted along with the StubBaseTools / fixture plumbing, and the planner now calls a native QueryTimeSeries impl. Semantics match the documented JS behavior (bucket alignment, week-Monday snapping via fixOriginForWeeksIfNeeded, sub-second '0'/'9' padding). The shared MAX_BUCKETS = 50_000 cap is a sensible guard.

Test coverage is good: predefined paths cover every granularity unit + precision-6 padding + the cap; custom paths cover origin-before-start, origin-after-start (back-walk), calendar months, mixed month+day, sub-day, quarter, year, zero/garbage intervals, the bucket cap, and the non-advancing-interval guard. The new granularity.rs unit tests pin the week-snap rule across the four interesting cases (default, with offset, non-week interval, explicit origin).

Inline findings

  • Medium — time_series.rs:341–365 align_to_origin: no iteration cap. If start < origin and the interval is net-negative, sub_interval_from_dt actually advances offset further from start (negation of a negative) and spins forever. The main-loop guard and the next <= aligned check don't cover this prologue. Suggest reusing MAX_BUCKETS here with the same diagnostic.
  • Low — time_series.rs:43 vs 81: range-end boundary is inclusive in the predefined path (bucket_start > range_end break) but exclusive in the custom path (aligned < range_end). If a dateRange upper bound lands exactly on a bucket boundary, the two paths emit different counts. Either unify or add a comment confirming this mirrors legacy JS.
  • Low — time_series.rs:286: comment already flags that non-canonical intervals (e.g. "5 days 1 month") can differ from written-order application due to month-end clamping. Either normalize SqlInterval at parse time or assert canonical order, so the contract isn't implicit.

Praise

  • The week-only origin snap (SqlInterval::is_week_only + QueryDateTime::start_of_iso_week) is the right place for that fix; the granularity tests pin it cleanly.
  • Dropping the query_tools field from physical_plan::TimeSeries is a nice incidental cleanup — the node no longer needs the query context just to produce buckets.
  • Pinning an explicit origin on the bi_weekly fixture so the snapshot stays year-stable is exactly right.
· branch: `tesseract-native-time-series-generation`

@codecov
Copy link
Copy Markdown

codecov Bot commented Jun 1, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 78.84%. Comparing base (8265a64) to head (d169333).
⚠️ Report is 4 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master   #10980   +/-   ##
=======================================
  Coverage   78.84%   78.84%           
=======================================
  Files         470      470           
  Lines       93390    93390           
  Branches     3466     3466           
=======================================
  Hits        73637    73637           
  Misses      19250    19250           
  Partials      503      503           
Flag Coverage Δ
cube-backend 58.43% <ø> (-0.01%) ⬇️
cubesql 83.41% <ø> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

- Guard both align_to_origin loops with a MAX_BUCKETS cap so a
  net-negative or net-zero interval errors instead of spinning the
  planner thread; add a regression test for start < origin.
- Document the intentional inclusive (predefined) vs exclusive (custom)
  range-end handling.
- Tighten the add_interval_to_dt comment: SqlInterval is field-based, so
  month-end clamping happens before days are added.
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Jun 1, 2026

Claude finished @waralexrom's task in 39s —— View job


Claude PR Review (re-review)

Verdict: Approve — all 3 prior findings addressed in d169333. No new issues.

Details

Todos

  • List existing review threads
  • Diff latest commit against prior findings
  • Resolve stale threads
  • Confirm no new issues introduced

Resolution of prior review

Prior finding Status
Medium — align_to_origin no iteration cap (infinite loop on net-negative interval with start < origin) Fixed. converge_guard caps both alignment loops at MAX_BUCKETS and errors with "did not converge". Regression test custom_alignment_non_convergent_errors covers the previously-unreachable path.
Low — range-end boundary inclusive (predefined) vs exclusive (custom) Documented. Comment at time_series.rs:48 calls out the asymmetry as intentional.
Low — interval-unit-order contract implicit Clarified. Comment at add_interval_to_dt now explains that SqlInterval is field-based — written order is already lost at parse time, so months → days → sub-day clamping is the contract, not a latent edge case.

All three corresponding inline threads resolved.

Re-review pass

Walked the rest of the diff for anything the prior pass might have missed. Nothing new surfaced:

  • Bridge removal (BaseTools, StubBaseTools, fixtures, JS bridge tests) is symmetric — no orphan callers grep'd up.
  • physical_plan::TimeSeries route correctly switches between native QueryTimeSeries and driver-supplied generated series; query_tools field drop is clean.
  • Snapshot fixtures (half_year, bi_weekly, to_date) look year-stable thanks to the explicit origin pin on bi_weekly.
· branch: `tesseract-native-time-series-generation`

@waralexrom waralexrom merged commit 48a3e41 into master Jun 1, 2026
251 of 257 checks passed
@waralexrom waralexrom deleted the tesseract-native-time-series-generation branch June 1, 2026 13:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

javascript Pull requests that update Javascript code rust Pull requests that update Rust code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants