Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(performance): implement sampling for lifecycle queries #14283

Merged
merged 9 commits into from Feb 17, 2023

Conversation

yakkomajuri
Copy link
Collaborator

Problem

#12908

Changes

Adds end-to-end sampling support for lifecycle queries. Mostly to get feedback at this point.

Why lifecycle? It was just the easiest query to work with.

Screenshot 2023-02-16 at 15 46 18

How did you test this code?

Manually

@yakkomajuri
Copy link
Collaborator Author

Not a WIP in the sense that this works and could safely get merged but certainly an initial approach to gather feedback

@cached_property
def sample_factor(self) -> Optional[float]:
factor = None
if self._data.get("sample_results", False):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: Why not send the sample factor from the FE?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still unsure what the approach should be with sampling. Should we let the client decide or should we make decisions for them?

@mariusandra seems to like the idea of the client deciding too. From my side, I think I still have non-technical users in mind (i.e. building for PMs), but maybe I should drop that and think about devs first instead. If that's the case, then yeah I'd send this from the frontend and maybe make it a slider-like thing like Marius mentioned

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note we can hard-code the value from the frontend if we want and keep api flexibility up :)

@@ -85,6 +85,8 @@ def parse_response(stats: Dict, filter: Filter, additional_values: Dict = {}) ->
counts = stats[1]
labels = [item.strftime("%-d-%b-%Y{}".format(" %H:%M" if filter.interval == "hour" else "")) for item in stats[0]]
days = [item.strftime("%Y-%m-%d{}".format(" %H:%M:%S" if filter.interval == "hour" else "")) for item in stats[0]]
if filter.sample_factor:
counts = [c * (1 / filter.sample_factor) for c in counts]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Probably worth creating a correct_for_sampling function.

@yakkomajuri
Copy link
Collaborator Author

yakkomajuri commented Feb 17, 2023

Will move fast/iteratively with this project and evolve the API & UI as I go along

@yakkomajuri yakkomajuri enabled auto-merge (squash) February 17, 2023 13:59
@yakkomajuri yakkomajuri merged commit 4c2a3fb into master Feb 17, 2023
@yakkomajuri yakkomajuri deleted the sampling branch February 17, 2023 14:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants