Proposal: Include histogram in JSON output #394

Closed
fxkr opened this issue Apr 23, 2019 · 2 comments

@fxkr
Contributor

fxkr commented Apr 23, 2019

Proposal

Add histogram data to the vegeta report -type json output, e.g.:

# vegeta report -type json -hist '[0,5ms,10ms,15ms]'
  ...
  "hist": [123, 125, 40, 80],
  ...

I'd be happy to write the code, but I'd like to run some design decisions by you first:

  • Key name: no strong preference. hist matches the CLI.

  • List (preferred) or map. A list guarantees ordering and raises no questions about the format of the key ("min", "max", "min-max"); a map would be more self-descriptive.

  • Values in msec

  • Bin config:

    • (preferred) Separate command line argument (-type json -hist [0,5ms,10ms,15ms])
    • As a subsequent map (-type json'{"hist":["0","5ms","10ms","15ms"]}')
    • As a subsequent array (-type json'[0,5ms,10ms,15ms]'). Similar to hist, but I don't like this approach because it's not extensible at all, and I don't want to have to deal with breaking changes later.

    Either way I'd default to leaving out histogram data if no config is given for it.

    Automatic logarithmic binning would be kinda neat (I love this in bcc), but I see it as out of scope.

Do you think that would be useful?

The main concern I have is that I am not 100% sure if recording the histogram is the best idea. Having just the three percentiles that vegeta puts in the JSON right now is too coarse, but we could just output more of those instead. I wonder if you have any thoughts on recording histograms vs (many points of the) CDF?

Background

I want to automatically benchmark a piece of software as part of a CI/CD pipeline and record the results to track how it changes over time. I want to record more than just the percentiles because the latency distribution can be strongly multimodal.

For example, depending on the concrete test, we can hit (and possibly thrash) or bypass certain caches at different layers in our software, and having the histogram lets us see that very clearly.

Workarounds

  • Parse report -type hist human-readable output
  • Use vegeta as a library

Thanks for writing vegeta!

fxkr added a commit to fxkr/vegeta that referenced this issue Apr 24, 2019
@fxkr
Contributor Author

fxkr commented Apr 24, 2019

I went ahead and did it the way I think it's best; it was pretty easy. Let me know if you'd prefer it to be done differently.

@tsenart
Owner

tsenart commented Apr 28, 2019

> Automatic logarithmic binning would be kinda neat (I love this in bcc), but I see it as out of scope.

I'd LOVE this, really!

fxkr added a commit to fxkr/vegeta that referenced this issue May 28, 2019
fxkr added a commit to fxkr/vegeta that referenced this issue May 28, 2019
fxkr added a commit to fxkr/vegeta that referenced this issue Jun 24, 2019