Skip to content

fix issue with projection __time equivalent column mapping#18215

Merged
capistrant merged 3 commits intoapache:masterfrom
clintropolis:fix-projection-time-equivalent-column-remapping
Jul 9, 2025
Merged

fix issue with projection __time equivalent column mapping#18215
capistrant merged 3 commits intoapache:masterfrom
clintropolis:fix-projection-time-equivalent-column-remapping

Conversation

@clintropolis
Copy link
Member

@clintropolis clintropolis commented Jul 8, 2025

Description

Fixes some issues with project __time column remapping. The first issue is that a projection with finer granularity would incorrectly replace a coarser granularity time_floor function with the projection __time column, instead of retaining the time_floor operation to further rollup the projection at query time, producing incorrect results. The added test covers this. Fixing this stumbled into another issue with remapping the __time column when it was the same as the projection granularity, which would incorrectly map the matching query virtual column to the internal projection time column name instead of to __time itself. idk what i was thinking with the previous logic 🤷

.setDataSource("test")
.setInterval(Intervals.ETERNITY)
.addAggregator(new LongSumAggregatorFactory("c_sum", "c"));
if (sortByDim) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i thought this meant query has a sortBy at first, but i guess here's just we want to test out time as granularity & virtual column? might worth some comments.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, some projections are named as daily, e.x. abfoo_daily but it's not grouping by DAY right?

Copy link
Member Author

@clintropolis clintropolis Jul 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i thought this meant query has a sortBy at first, but i guess here's just we want to test out time as granularity & virtual column? might worth some comments.

yea, sortByDim is whether or not the segment we are querying is stored sorted first by __time or not, its one of the test parameters which controls if we set the forceSegmentSortByTime parameter on DImensionsSpec added in #16849. When this is set, using a granularity on a query doesn't work because the assumptions the engines make about the segments when using granularity (which acts like an implicit group by time_floor(__time, granularity)) are that the rows of the cursor are ordered by __time and so it can just scan and collect the time bucket group of the granularity as it goes instead of grouping on it properly as if it were a dimension. So for group by, we have to explicitly make a time_floor and group on it instead of using the query granularity. Timeseries queries don't work either if the segment isn't first sorted by __time, so some of the tests also just skip if the segment isn't time sorted. I can rename this parameter to segmentSortedByTime or something and it might be clearer?

also, some projections are named as daily, e.x. abfoo_daily but it's not grouping by DAY right?

ah I guess I could rename them to leave the granularity off of the implicit ones, some of those are projections without a __time equivalent column, but they still live within the constraints of the segment granularity, so its an implicit time_floor of day in this case since that is when the segment interval starts.

@capistrant capistrant added this to the 34.0.0 milestone Jul 9, 2025
@capistrant capistrant merged commit 5a3fa99 into apache:master Jul 9, 2025
76 checks passed
@clintropolis clintropolis deleted the fix-projection-time-equivalent-column-remapping branch July 9, 2025 19:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants