Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add rate aggFn support for sum metrics #77

Merged
merged 3 commits into from
Oct 31, 2023
Merged

Conversation

MikeShi42
Copy link
Contributor

image

Found a few things that we'd likely need to clean up for #76, but I think this makes a huge improvement in making sum metrics usable.

@changeset-bot
Copy link

changeset-bot bot commented Oct 29, 2023

🦋 Changeset detected

Latest commit: 38215d1

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages
Name Type
@hyperdx/api Patch
@hyperdx/app Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

SELECT
if(
runningDifference(value) < 0
OR neighbor(_string_attributes, -1, _string_attributes) != _string_attributes,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this handles the edge cases. I wonder if this map comparison is strict enough or we need to sort keys first

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah that's a good point, do you think it's something we should do in the query or at ingestion time?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like sorting is done on the SDK side today. Prometheus looks to operate under the same assumption so I think we're okay here.

AND data_type = ?
AND (?)
GROUP BY
_string_attributes,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we also group by the name here ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since name is fixed, it didn't seem necessary so I just did min(name), though now I wonder if group by name is actually more performant than min(name). I think right now the query is correct - but I realize that there's possibly another option where we just inject the name as a fixed value in the query itself so Clickhouse doesn't have to do any additional processing on the name column as well.

On the perf side - I'm realizing that the indexing order has an effect here on the grouping as well (and it seems like a load of other things), and we can probably come back and do some benchmarking/see if there's an optimization to be done here later as CH's grouping optimization logic is quite complex.

wrn14897
wrn14897 previously approved these changes Oct 29, 2023
Copy link
Contributor

@wrn14897 wrn14897 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@MikeShi42 MikeShi42 merged commit 8cb0eac into main Oct 31, 2023
3 checks passed
@MikeShi42 MikeShi42 deleted the mikeshi/add-metric-rate branch October 31, 2023 07:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants