fix(clp-s): Simplify timestamp range index evaluation code; Fix conversion utility used to compare float AST literals to integer values (fixes #1375). by gibber9809 · Pull Request #1369 · y-scope/clp

gibber9809 · 2025-10-02T16:16:29Z

Description

This PR fixes a bug in the `double_as_int` utility, which the AST code uses to convert floating-point literals into integers so that they can be compared against integer values.

Nominally this code is meant to do the following sorts of conversions:

1 < 1.1 -> 1 < 2
1 > 0.9 -> 1 > 0

Depending on the kind of operation being performed, we may have to take either the floor or the ceiling of a floating-point value to ensure a correct comparison against an integer value.

Unfortunately, this conversion code had a bug causing the returned integer to always be the floor of the floating-point number for all operations besides == and !=.
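To make the intended rounding concrete, here is a minimal sketch of the conversion described above. It is not the `double_as_int` implementation from the repository: the `Op` enum and the function name `double_to_int_for_comparison` are hypothetical, and the sketch assumes the input is finite and fits in an `int64_t`.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for the comparison operators used by the search AST.
enum class Op { Lt, Lte, Gt, Gte, Eq, Neq };

// Rewrites `x OP value` (x an integer) into an equivalent `x OP out`.
// Returns false when `value` has no exact integer form under == or !=,
// leaving it to the caller to decide what the comparison evaluates to.
// Assumption: `value` is finite and within the range of int64_t.
bool double_to_int_for_comparison(double value, Op op, int64_t& out) {
    switch (op) {
        case Op::Lt:
        case Op::Gte:
            // x < 1.1 <=> x < 2, and x >= 1.1 <=> x >= 2
            out = static_cast<int64_t>(std::ceil(value));
            break;
        case Op::Gt:
        case Op::Lte:
            // x > 0.9 <=> x > 0, and x <= 0.9 <=> x <= 0
            out = static_cast<int64_t>(std::floor(value));
            break;
        case Op::Eq:
        case Op::Neq:
            out = static_cast<int64_t>(value);
            return static_cast<double>(out) == value;
    }
    return true;
}

// Small self-check mirroring the two conversions listed in the description.
void check_conversion_examples() {
    int64_t out = 0;
    assert(double_to_int_for_comparison(1.1, Op::Lt, out) && 2 == out);
    assert(double_to_int_for_comparison(0.9, Op::Gt, out) && 0 == out);
}
```

Each group of cases ends in an explicit `break` or `return`, so the value chosen for one operator cannot be overwritten by the next group.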

We discovered this bug while investigating a case where the timestamp range index was incorrectly not being matched for certain queries. For archives with a double-encoded epoch range `[a, b]`, we always tried to convert a literal `c` in a query like `timestamp < c` into an integer before evaluating the timestamp range index. As a result, even if `a < c` was true, the bug in the AST code would turn the comparison into `a < floor(c)`, which may not be true.
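A small worked example of that failure, as a sketch only: `DoubleRange` and both helper functions are hypothetical names, and the numbers are made up for illustration rather than taken from the test data.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical double-encoded epoch range [begin, end] for one archive.
struct DoubleRange {
    double begin;
    double end;
};

// `timestamp < literal` can match something in the range only if the
// smallest timestamp in the range is below the literal.
bool may_match_less_than(DoubleRange const& range, double literal) {
    return range.begin < literal;
}

// Models the behaviour described above: the literal is first converted to an
// integer, and the conversion bug makes that integer floor(literal).
bool may_match_less_than_floored(DoubleRange const& range, double literal) {
    auto const as_int = static_cast<int64_t>(std::floor(literal));
    return range.begin < static_cast<double>(as_int);
}

void check_range_example() {
    DoubleRange const range{10.25, 10.75};
    // 10.25 < 10.5, so the archive may contain matches.
    assert(may_match_less_than(range, 10.5));
    // floor(10.5) == 10 and 10.25 < 10 is false, so the archive is skipped.
    assert(false == may_match_less_than_floored(range, 10.5));
}
```

With `a = 10.25` and `c = 10.5`, the direct comparison keeps the archive while the floored one wrongly rules it out.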

Without the AST bug the timestamp range index evaluation code would technically be correct, but the way it was written (to allow float and integer literals to be compared with double-encoded and integer-encoded timestamp ranges interchangeably) was unnecessarily complex.

As a result, this PR:

- Fixes the AST float conversion bug
- Adds a search unit test covering the float conversion bug
- Simplifies the timestamp range index evaluation code by always comparing integer-encoded ranges with integer interpretations of literals and double-encoded ranges with double interpretations of literals
- Adds search unit tests dedicated to timestamp filtering

Checklist

- The PR satisfies the contribution guidelines.
- This is a breaking change and that has been indicated in the PR title, OR this isn't a breaking change.
- Necessary docs have been updated, OR no docs need to be updated.

Validation performed

- Added dedicated unit tests for search against the timestamp column, some of which fail before this change
- Added a test case exercising floating-point conversions, which fails before this change

Summary by CodeRabbit

New Features
- Improved timestamp search with support for both floating-point and epoch-based formats.
- Stricter parsing aligned to data encoding for more accurate comparisons and range queries.
Bug Fixes
- Fixed potential mis-evaluation in timestamp comparisons caused by switch fall-through.
- Standardized handling when timestamp literals don’t match the expected format, yielding consistent results.
Tests
- Added comprehensive tests for float and integer timestamp searches with new datasets.
- Expanded general search dataset to cover additional scenarios.

coderabbitai · 2025-10-02T16:16:36Z

Walkthrough

Unifies timestamp filter evaluation into a single encoding-guarded switch in TimestampEntry, adds a public timestamp encoding accessor, changes search parsing to depend on the entry's encoding (float vs int), fixes switch fall-through in SearchUtils, and adds/initializes timestamp tests and test data for float and epoch timestamps.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| TimestampEntry refactor: `components/core/src/clp_s/TimestampEntry.cpp`, `components/core/src/clp_s/TimestampEntry.hpp` | Consolidates per-encoding branching into one encoding-guarded switch for evaluate_filter; adds inline accessor `auto get_timestamp_encoding() const -> TimestampEncoding`. |
| Search timestamp evaluation: `components/core/src/clp_s/search/EvaluateTimestampIndex.cpp` | Parses timestamp literals according to the range entry's encoding: `DoubleEpoch` → require float literal, `Epoch` → require integer literal; non-conforming literals return Unknown. |
| AST switch fall-through fix: `components/core/src/clp_s/search/ast/SearchUtils.cpp` | Adds explicit `break` statements to prevent unintended fall-through in `double_as_int` switch cases (GT/LTE, LT/GTE, default). |
| Test harness init: `components/core/tests/clp_s_test_utils.cpp` | Adds `#include` for `clp_s/TimestampPattern.hpp` and calls `clp_s::TimestampPattern::init()` before constructing `clp_s::JsonParser`. |
| Tests, new timestamp cases: `components/core/tests/test-clp_s-search.cpp` | Adds float-timestamp and epoch-timestamp TEST_CASEs; introduces constants for new input files and timestamp key; adds an extra numeric query to an existing test. |
| Test data additions/changes: `components/core/tests/test_log_files/test_search.jsonl`, `components/core/tests/test_log_files/test_search_float_timestamp.jsonl`, `components/core/tests/test_log_files/test_search_int_timestamp.jsonl` | Appends one record to `test_search.jsonl`; adds `test_search_float_timestamp.jsonl` (float timestamps) and `test_search_int_timestamp.jsonl` (epoch-millisecond timestamps). |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant CLI as clp-s-search
  participant Engine as Search Engine
  participant Index as TimestampIndex
  participant Entry as TimestampEntry
  participant Parser as Literal Parser

  User->>CLI: submit timestamp query
  CLI->>Engine: execute(query)
  Engine->>Index: evaluate(range, expr)
  Index->>Entry: get_timestamp_encoding()
  alt Entry.encoding == DoubleEpoch
    Index->>Parser: parse literal as float
    Parser-->>Index: float or Unknown
    alt float
      Index->>Entry: evaluate_filter(op, float)
      Entry-->>Index: True/False/Unknown
    else Unknown
      Note over Index: non-float literal → Unknown
    end
  else Entry.encoding == Epoch
    Index->>Parser: parse literal as int
    Parser-->>Index: int or Unknown
    alt int
      Index->>Entry: evaluate_filter(op, int)
      Entry-->>Index: True/False/Unknown
    else Unknown
      Note over Index: non-int literal → Unknown
    end
  else Other
    Note over Index: unsupported encoding → Unknown
  end
  Index-->>Engine: evaluation result
  Engine-->>CLI: filtered results
  CLI-->>User: output
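The flow in the diagram can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `EvaluateTimestampIndex.cpp` or `TimestampEntry.cpp`: `Literal`, `RangeEntry`, `as_int`, `as_double`, `evaluate_range`, and `evaluate` are hypothetical names, only `<` and `>` are modelled, and the three-valued result is assumed to mean "every timestamp in the range matches" (True), "none can match" (False), or "some might" (Unknown).

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <optional>
#include <string>
#include <variant>

enum class Encoding { Epoch, DoubleEpoch, Other };
enum class Op { Lt, Gt };
enum class Result { True, False, Unknown };

// Hypothetical query literal: an integer, a float, or something non-numeric.
using Literal = std::variant<int64_t, double, std::string>;

// Double interpretation of a literal; empty for non-numeric literals.
std::optional<double> as_double(Literal const& lit) {
    if (auto const* i = std::get_if<int64_t>(&lit)) { return static_cast<double>(*i); }
    if (auto const* d = std::get_if<double>(&lit)) { return *d; }
    return std::nullopt;
}

// Integer interpretation of a literal: ceil for `<`, floor for `>`.
std::optional<int64_t> as_int(Literal const& lit, Op op) {
    if (auto const* i = std::get_if<int64_t>(&lit)) { return *i; }
    if (auto const* d = std::get_if<double>(&lit)) {
        return static_cast<int64_t>(Op::Lt == op ? std::ceil(*d) : std::floor(*d));
    }
    return std::nullopt;
}

// Compares a closed range [begin, end] against a value of the same type.
template <typename T>
Result evaluate_range(Op op, T begin, T end, T value) {
    if (Op::Lt == op) {
        if (end < value) { return Result::True; }
        if (begin >= value) { return Result::False; }
        return Result::Unknown;
    }
    if (begin > value) { return Result::True; }
    if (end <= value) { return Result::False; }
    return Result::Unknown;
}

struct RangeEntry {
    Encoding encoding;
    int64_t epoch_begin;
    int64_t epoch_end;
    double double_begin;
    double double_end;
};

// One encoding-guarded switch: each encoding only ever sees its own type.
Result evaluate(RangeEntry const& entry, Op op, Literal const& lit) {
    switch (entry.encoding) {
        case Encoding::DoubleEpoch:
            if (auto v = as_double(lit)) {
                return evaluate_range(op, entry.double_begin, entry.double_end, *v);
            }
            return Result::Unknown;
        case Encoding::Epoch:
            if (auto v = as_int(lit, op)) {
                return evaluate_range(op, entry.epoch_begin, entry.epoch_end, *v);
            }
            return Result::Unknown;
        case Encoding::Other:
            break;
    }
    return Result::Unknown;
}
```

For example, an integer-encoded range `[100, 200]` queried with `timestamp < 100.5` uses the integer interpretation 101 and yields Unknown (the range may contain matches), whereas flooring the literal to 100 would have produced False.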

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |
| Title Check | ✅ Passed | The title clearly summarizes the key changes by indicating both the simplification of timestamp range index evaluation and the correction to the float-to-integer conversion utility, while also specifying the affected component and related issue, making it informative and directly relevant to the pull request. |

wraymo

Thanks for the PR! Just two questions

wraymo · 2025-10-02T19:55:32Z

components/core/tests/test-clp_s-search.cpp

+    std::vector<std::pair<std::string, std::vector<int64_t>>> queries_and_results{
+            {R"aa(timestamp < 1759417024400)aa", {0, 1, 2}},
+            {R"aa(timestamp > 1759417023100)aa", {0, 1, 2}},
+            {R"aa(timestamp > 1759417024000)aa", {0, 1, 2}},


Should we add a test for comparing timestamp with a floating point value?

gibber9809 · 2025-10-02

Sure, I can add one.

wraymo · 2025-10-02T19:56:11Z

components/core/tests/test_log_files/test_search.jsonl

 {"idx": 10, "ambiguous_varstring": "abcde"}
 {"idx": 11, "ambiguous_varstring": "ae"}
 {"idx": 12, "ambiguous_varstring": "a*e"}
+{"idx": 13, "one": 1}


What's this case for?

gibber9809 · 2025-10-02

The AST bug I mentioned. It is exercised by the `one < 1.1 AND one > 0.9 AND one: 1.0` test case.
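To illustrate why the record `{"idx": 13, "one": 1}` catches the bug, here is a hypothetical reconstruction of a switch whose cases fall through, next to a version with `break` statements. Neither function is the repository's code; the names are made up, and the equality clause `one: 1.0` is left out because the description says `==` was unaffected. The `[[fallthrough]]` attribute stands in for the missing `break` only so that the sketch compiles cleanly under strict warnings.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

enum class Op { Lt, Gt };

// Without a `break`, the ceiling computed for `<` is overwritten by the floor.
int64_t convert_with_fall_through(double value, Op op) {
    int64_t out = 0;
    switch (op) {
        case Op::Lt:
            out = static_cast<int64_t>(std::ceil(value));
            [[fallthrough]];  // models the missing `break`
        case Op::Gt:
            out = static_cast<int64_t>(std::floor(value));
            break;
    }
    return out;
}

// Same switch with each case terminated explicitly.
int64_t convert_with_breaks(double value, Op op) {
    int64_t out = 0;
    switch (op) {
        case Op::Lt:
            out = static_cast<int64_t>(std::ceil(value));
            break;
        case Op::Gt:
            out = static_cast<int64_t>(std::floor(value));
            break;
    }
    return out;
}

// Evaluates `one < 1.1 AND one > 0.9` using the given conversion.
bool record_matches(int64_t one, int64_t (*convert)(double, Op)) {
    return one < convert(1.1, Op::Lt) && one > convert(0.9, Op::Gt);
}

void check_test_record() {
    // Fall-through: 1 < floor(1.1) == 1 is false, so the record is missed.
    assert(false == record_matches(1, convert_with_fall_through));
    // With breaks: 1 < 2 and 1 > 0, so the record is returned.
    assert(record_matches(1, convert_with_breaks));
}
```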

kirkrodrigues · 2025-10-03T13:09:44Z

Nice PR description. Can we file an issue for the bug (it's user-facing, right?) and refer to it in the title? Otherwise, it's a bit awkward to refer users to the PR, and the title itself doesn't (or can't) indicate what the actual bug was from a user perspective (which, in turn, makes writing release notes a bit harder).

kirkrodrigues

Deferring to @wraymo's review.

gibber9809 added 6 commits October 2, 2025 15:26

Avoid incorrectly truncating double literal when evaluating against double-encoded-epoch timestamp index.

9a31e19

Reduce unnecessary complexity in timestamp index evaluation.

dafaba5

Add dedicated tests for timestamp filtering.

4cb56c1

Fix bug in AST code that rounds double literals to appropriate integers for the sake of comparison against integers.

2d38111

Lint fix

ca65412

Add search tests for float literal conversion bug

1054dbc

gibber9809 requested review from a team and wraymo as code owners October 2, 2025 16:16

wraymo reviewed Oct 2, 2025

gibber9809 added 2 commits October 2, 2025 20:50

Add test case for comparing float literal against int timestamp.

b52b60c

Merge remote-tracking branch 'upstream/main' into fix-timestamp-index-eval

4492bef

gibber9809 requested a review from wraymo October 2, 2025 20:53

wraymo approved these changes Oct 2, 2025

Merge branch 'main' into fix-timestamp-index-eval

bdc51fa

Merge branch 'main' into fix-timestamp-index-eval

9fe29a8

gibber9809 linked an issue Oct 3, 2025 that may be closed by this pull request

clp-s: Incorrect rounding in double_as_int when converting floating-point numbers into integers for comparison against integer values. #1375

Closed

kirkrodrigues approved these changes Oct 3, 2025

gibber9809 merged commit 67276c0 into y-scope:main Oct 3, 2025
31 checks passed

gibber9809 merged 10 commits into y-scope:main from gibber9809:fix-timestamp-index-eval
