Support grouping multiple rows of first match in an interval #347

metalmatze · 2023-02-15T12:11:23Z

Right now Parca queries FrostDB for all timestamps and then, on the server side, ignores metric samples once a sample has already been found in the step's bucket.

https://github.com/parca-dev/parca/blob/ed90dbeb684186e9cdb295bc0f62c723ed3c5a9f/pkg/parcacol/querier.go#L571-L579

This isn't ideal since the data is still queried...

FrostDB should support finding all rows that fall into an interval's bucket and only then start aggregating.

The row data would look like this:

timestamp	stacktrace	value
1	stack1	3
1	stack2	5
1	stack3	8
2	stack1	2
2	stack2	3

What we want is to return only the sum(value) for the first timestamp we find in the bucket that ranges from 0-9.
So the result should be:

timestamp	sum(value)
1	16

What we currently get, however, is the very first row that falls within that bucket (1, stack1, 3) and then the sum of its value, which is basically a no-op:

timestamp	sum(value)
1	3
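To make the desired semantics concrete, here is a minimal Go sketch that operates on plain in-memory rows. It does not use FrostDB's API; the `Row` and `Result` types and the `sumFirstTimestampPerBucket` function are hypothetical names for illustration only. It also assumes that "first timestamp" means the smallest timestamp in the bucket and that timestamps are non-negative.

```go
package main

import (
	"fmt"
	"sort"
)

// Row mirrors the example table above (hypothetical type, not a FrostDB type).
type Row struct {
	Timestamp  int64
	Stacktrace string
	Value      int64
}

// Result is one output row per bucket (hypothetical type).
type Result struct {
	Timestamp int64
	Sum       int64
}

// sumFirstTimestampPerBucket assigns each row to a bucket of width interval,
// tracks the smallest timestamp seen in each bucket, and sums the values of
// all rows that carry that timestamp. Assumes non-negative timestamps.
func sumFirstTimestampPerBucket(rows []Row, interval int64) []Result {
	first := map[int64]*Result{} // bucket index -> running result
	for _, r := range rows {
		bucket := r.Timestamp / interval
		res, ok := first[bucket]
		switch {
		case !ok || r.Timestamp < res.Timestamp:
			// New bucket, or an earlier timestamp: restart the sum.
			first[bucket] = &Result{Timestamp: r.Timestamp, Sum: r.Value}
		case r.Timestamp == res.Timestamp:
			// Another row of the first timestamp: include it in the sum.
			res.Sum += r.Value
		}
		// Rows with a later timestamp in the same bucket are ignored.
	}

	out := make([]Result, 0, len(first))
	for _, res := range first {
		out = append(out, *res)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Timestamp < out[j].Timestamp })
	return out
}

func main() {
	rows := []Row{
		{1, "stack1", 3},
		{1, "stack2", 5},
		{1, "stack3", 8},
		{2, "stack1", 2},
		{2, "stack2", 3},
	}
	// An interval of 10 yields the bucket 0-9 from the example.
	fmt.Println(sumFirstTimestampPerBucket(rows, 10)) // [{1 16}]
}
```

The current behavior corresponds to stopping after the first matching row, which is why the sum only ever covers a single value.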


metalmatze · 2023-02-15T12:14:28Z

This issue is especially bad for Parca itself. On our cloud with the aggregate_view table, it shouldn't be such a bad problem for now.

albertlockett · 2023-08-28T21:41:52Z

@metalmatze As far as I can tell, the Duration aggregation does what we want. I added a test to demonstrate.
https://github.com/polarsignals/frostdb/pull/503/files

That said, I think this is different from what we do here in Parca, where we take the first result in the interval.
parca-dev/parca#2598

I'm wondering if maybe what we want is a First aggregation instead like this (or something like it)?
#504

github-actions · 2024-01-04T01:45:33Z

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions · 2024-02-09T01:43:36Z

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

metalmatze added the enhancement New feature or request label Feb 15, 2023

metalmatze changed the title ~~Support~~ Support grouping multiple rows of first match in an interval Feb 15, 2023

metalmatze mentioned this issue Feb 15, 2023

Support to query QueryRange non-delta profiles buckets from FrostDB parca-dev/parca#2598

Open

albertlockett mentioned this issue Aug 29, 2023

Added first aggregation #504

Closed

github-actions bot added the Stale label Jan 4, 2024

metalmatze removed the Stale label Jan 9, 2024

github-actions bot added the Stale label Feb 9, 2024

github-actions bot closed this as not planned Feb 14, 2024
