Proposal: Include histogram in JSON output #394
Comments
fxkr added a commit to fxkr/vegeta that referenced this issue (Apr 24, 2019)
I went ahead and did it the way I think it's best; it was pretty easy. Let me know if you'd prefer it to be done differently.
I'd LOVE this, really!
fxkr added a commit to fxkr/vegeta that referenced this issue (May 28, 2019)
fxkr added a commit to fxkr/vegeta that referenced this issue (May 28, 2019)
fxkr added a commit to fxkr/vegeta that referenced this issue (Jun 24, 2019)
Proposal
Add histogram data to the vegeta report -type json output.
I'd be happy to write the code, but I'd like to run some design decisions by you first:
Under what key: don't care. hist matches the CLI.
List (preferred) or map. A list gives guaranteed ordering and raises no questions about the format of the key ("min", "max", "min-max"); a map is more self-descriptive.
Values in msec.
Bin config, one of:
    -type json -hist [0,5ms,10ms,15ms]
    -type json'{"hist":["0","5ms","10ms","15ms"]}'
    -type json'[0,5ms,10ms,15ms]'. Similar to hist, but I don't like this approach because it's not extensible at all, and I don't want to have to deal with breaking changes later.
Either way I'd default to leaving out histogram data if no config is given for it.
Automatic logarithmic binning would be kinda neat (I love this in bcc), but I see it as out of scope.
Do you think that would be useful?
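To make the list-based option concrete, here is a minimal Python sketch (not vegeta code) that buckets latencies using edges such as 0, 5ms, 10ms, 15ms and emits them as a list under a hist key. The field names min_ms, max_ms and count, the parse_duration_ms helper, and the open-ended last bucket are all assumptions made for illustration, not a settled format.

```python
import json
from bisect import bisect_right

# Hypothetical unit table: multipliers that convert a suffix to milliseconds.
_UNITS_MS = {"us": 0.001, "ms": 1.0, "s": 1000.0}

def parse_duration_ms(text):
    """Convert strings like "0", "5ms" or "2s" to milliseconds (assumed syntax)."""
    text = text.strip()
    # Check the longer suffixes first so "ms" is not mistaken for "s".
    for suffix in ("us", "ms", "s"):
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * _UNITS_MS[suffix]
    return float(text)  # bare number such as "0", taken as milliseconds

def histogram(latencies_ms, edges):
    """Count latencies per bucket [edges[i], edges[i+1]); the last bucket is open-ended."""
    bounds = [parse_duration_ms(e) for e in edges]
    counts = [0] * len(bounds)
    for value in latencies_ms:
        index = bisect_right(bounds, value) - 1
        if index >= 0:  # values below the first edge are ignored
            counts[index] += 1
    buckets = []
    for i, low in enumerate(bounds):
        high = bounds[i + 1] if i + 1 < len(bounds) else None
        buckets.append({"min_ms": low, "max_ms": high, "count": counts[i]})
    return buckets

# Made-up latencies in milliseconds, bucketed with the edges from the examples above.
report = {"hist": histogram([1.2, 4.9, 5.0, 7.3, 12.0, 40.0], ["0", "5ms", "10ms", "15ms"])}
print(json.dumps(report, indent=2))
```

Because the buckets are list entries that carry their own bounds, ordering is preserved and no key format has to be invented.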
The main concern I have is that I am not 100% sure if recording the histogram is the best idea. Having just the three percentiles that vegeta puts in the JSON right now is too coarse, but we could just output more of those instead. I wonder if you have any thoughts on recording histograms vs (many points of the) CDF?
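One way to weigh the two: at the bin edges, a histogram and a sampled CDF carry the same information, because the running sum of the bucket counts is the CDF at those edges. A small Python sketch with made-up latencies shows this; the function names are hypothetical and none of it is vegeta code.

```python
from bisect import bisect_left
from itertools import accumulate

def bucket_counts(latencies_ms, edges_ms):
    """Counts per bucket [edges[i], edges[i+1]); samples outside the edges are dropped."""
    counts = [0] * (len(edges_ms) - 1)
    for value in latencies_ms:
        for i in range(len(counts)):
            if edges_ms[i] <= value < edges_ms[i + 1]:
                counts[i] += 1
                break
    return counts

def cdf_at(latencies_ms, points_ms):
    """Fraction of samples strictly below each point."""
    ordered = sorted(latencies_ms)
    return [bisect_left(ordered, p) / len(ordered) for p in points_ms]

def cdf_from_counts(counts, total):
    """Rebuild the CDF at the upper bucket edges from the histogram counts."""
    return [c / total for c in accumulate(counts)]

latencies = [1, 2, 2, 6, 7, 11, 12, 13]  # made-up sample, in milliseconds
edges = [0, 5, 10, 15]

counts = bucket_counts(latencies, edges)
print(counts)                                   # histogram view
print(cdf_at(latencies, edges[1:]))             # CDF sampled at the upper edges
print(cdf_from_counts(counts, len(latencies)))  # same values, derived from the histogram
```

The equivalence only holds at the chosen edges and only when every sample falls inside them, so the real design question is which points to record, not which of the two representations to use.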
Background
I want to automatically benchmark a piece of software as part of a CI/CD pipeline and record the results to track how it changes over time. I want to record more than just the percentiles because the latency distribution can be strongly multimodal.
For example, depending on the concrete test we can hit (and possibly thrash) or bypass certain caches (at different layers in our software), and having the histogram lets us see that very clearly visually.
Workarounds
Parse the report -type hist human-readable output.
Thanks for writing vegeta!