fix(clp-s): Simplify timestamp range index evaluation code; Fix conversion utility used to compare float AST literals to integer values (fixes #1375).#1369
Conversation
…ouble-encoded-epoch timestamp index.
…rs for the sake of comparison against integers.
WalkthroughUnifies timestamp filter evaluation into a single encoding-guarded switch in TimestampEntry, adds a public timestamp encoding accessor, changes search parsing to depend on the entry's encoding (float vs int), fixes switch fall-through in SearchUtils, and adds/initializes timestamp tests and test data for float and epoch timestamps. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor User
participant CLI as clp-s-search
participant Engine as Search Engine
participant Index as TimestampIndex
participant Entry as TimestampEntry
participant Parser as Literal Parser
User->>CLI: submit timestamp query
CLI->>Engine: execute(query)
Engine->>Index: evaluate(range, expr)
Index->>Entry: get_timestamp_encoding()
alt Entry.encoding == DoubleEpoch
Index->>Parser: parse literal as float
Parser-->>Index: float or Unknown
alt float
Index->>Entry: evaluate_filter(op, float)
Entry-->>Index: True/False/Unknown
else Unknown
Note over Index: non-float literal → Unknown
end
else Entry.encoding == Epoch
Index->>Parser: parse literal as int
Parser-->>Index: int or Unknown
alt int
Index->>Entry: evaluate_filter(op, int)
Entry-->>Index: True/False/Unknown
else Unknown
Note over Index: non-int literal → Unknown
end
else Other
Note over Index: unsupported encoding → Unknown
end
Index-->>Engine: evaluation result
Engine-->>CLI: filtered results
CLI-->>User: output
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
wraymo
left a comment
There was a problem hiding this comment.
Thanks for the PR! Just two questions
| std::vector<std::pair<std::string, std::vector<int64_t>>> queries_and_results{ | ||
| {R"aa(timestamp < 1759417024400)aa", {0, 1, 2}}, | ||
| {R"aa(timestamp > 1759417023100)aa", {0, 1, 2}}, | ||
| {R"aa(timestamp > 1759417024000)aa", {0, 1, 2}}, |
There was a problem hiding this comment.
Should we add a test for comparing timestamp with a floating point value?
There was a problem hiding this comment.
Sure, I can add one.
| {"idx": 10, "ambiguous_varstring": "abcde"} | ||
| {"idx": 11, "ambiguous_varstring": "ae"} | ||
| {"idx": 12, "ambiguous_varstring": "a*e"} | ||
| {"idx": 13, "one": 1} |
There was a problem hiding this comment.
The AST bug I mentioned -- it is exercised with the one < 1.1 AND one > 0.9 AND one: 1.0 test case.
|
Nice PR description. Can we file an issue for the bug (it's user-facing, right?) and refer to it in the title? Otherwise, it's a bit awkward to refer users to the PR, and the title itself doesn't (or can't) indicate what the actual bug was from a user perspective (which, in turn, makes writing release notes a bit harder). |
kirkrodrigues
left a comment
There was a problem hiding this comment.
Deferring to @wraymo's review.
Description
This PR fixes a bug in the
double_as_intutility that the AST code uses to convert floating point literals into integers such that they can be compared against integer values.Nominally this code is meant to do the following sorts of conversions:
1 < 1.1->1 < 21 > 0.9->1 > 0Where depending on the kind of operation being performed, we may have to take the floor or ceiling of a given floating-point value in order to ensure correct comparison against an integer value.
Unfortunately, this conversion code had a bug causing the returned integer to always be the floor of the floating-point number for all operations besides
==and!=.We discovered this bug while investigating a case where the timestamp range index was incorrectly not being matched for certain queries -- the issue was that for archives with double encoded epoch range [a, b] we always tried to convert a literal
cin a query liketimestamp < cinto an integer before evaluating the timestamp range index, so even ifa < cwas true the bug in the AST code would end up turning this comparison intoa < floor(c)which may not be true.Without the AST bug the timestamp range index evaluation code would technically be correct, but the way it was written (to allow float and integer literals to be compared with double-encoded and integer-encoded timestamp ranges interchangeably) was unnecessarily complex.
As a result, this PR:
Checklist
breaking change.
Validation performed
Summary by CodeRabbit
New Features
Bug Fixes
Tests