Speed up JSON output #1114

Merged: 4 commits merged into master on Sep 2, 2019
Conversation

mstoykov (Collaborator) commented:

The commit includes three main changes:

  1. Don't write on Collect; instead (like all other outputs), append to a
     buffer and write the buffer out in a separate goroutine every 100
     milliseconds (see the sketch after this list).
  2. Use json.Encoder for a speedup.
  3. If the output file ends in `.gz`, gzip the output, which in my testing
     decreases the file size by around 30x. I couldn't measure any performance
     degradation from having to compress; if anything, there is some
     performance gain because of the smaller I/O writes.
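To make these three changes concrete, here is a minimal Go sketch of a buffered collector. The names (`Sample`, `Collector`, `NewCollector`) and the channel-based shutdown are assumptions for the example; this illustrates the approach, not k6's actual collector code:

```go
// Package jsonoutput is an illustrative sketch, not k6's real implementation.
package jsonoutput

import (
	"compress/gzip"
	"encoding/json"
	"io"
	"os"
	"strings"
	"sync"
	"time"
)

// Sample stands in for a k6 metric sample; the fields are illustrative.
type Sample struct {
	Metric string  `json:"metric"`
	Time   int64   `json:"time"`
	Value  float64 `json:"value"`
}

// Collector buffers samples in memory instead of writing on every Collect.
type Collector struct {
	mu      sync.Mutex
	buffer  []Sample
	encoder *json.Encoder
	closeFn func() error
}

// NewCollector opens fname and, per change 3, wraps it in a gzip.Writer
// when the name ends in ".gz".
func NewCollector(fname string) (*Collector, error) {
	f, err := os.Create(fname)
	if err != nil {
		return nil, err
	}
	var w io.Writer = f
	closeFn := f.Close
	if strings.HasSuffix(fname, ".gz") {
		gz := gzip.NewWriter(f)
		w = gz
		closeFn = func() error {
			if err := gz.Close(); err != nil {
				return err
			}
			return f.Close()
		}
	}
	// Change 2: one long-lived json.Encoder instead of marshalling
	// each sample separately.
	return &Collector{encoder: json.NewEncoder(w), closeFn: closeFn}, nil
}

// Collect only appends to the in-memory buffer (change 1);
// no I/O happens on this hot path.
func (c *Collector) Collect(samples []Sample) {
	c.mu.Lock()
	c.buffer = append(c.buffer, samples...)
	c.mu.Unlock()
}

// Run flushes the buffer every 100 milliseconds until done is closed.
func (c *Collector) Run(done <-chan struct{}) {
	ticker := time.NewTicker(100 * time.Millisecond)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			c.flush()
		case <-done:
			c.flush()
			c.closeFn() // error handling elided in this sketch
			return
		}
	}
}

func (c *Collector) flush() {
	c.mu.Lock()
	buf := c.buffer
	c.buffer = nil
	c.mu.Unlock()
	for i := range buf {
		// Encode writes each sample as one newline-delimited JSON line.
		c.encoder.Encode(buf[i]) // error handling elided in this sketch
	}
}
```

The key property is that Collect stays I/O-free, while all encoding and writing happens off the hot path in the flushing goroutine.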

I was testing with a simple script doing batch requests for a local zero-length
file with 40 VUs, which makes around 8k RPS using around 600-700MB of memory.
Before these changes I was getting 5.8-6k RPS; after them I am getting
6.5-6.9k RPS with around 3GB of memory usage.

At this point my CPU and memory profiling shows that the problem is the JSON
encoding (upwards of 10 seconds), so the next possible optimization is to use
some kind of faster JSON encoding.
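As a point of reference for "some kind of faster JSON encoding": a drop-in replacement such as jsoniter keeps the same Encoder API as encoding/json. The snippet below only illustrates the direction; it is not something this PR adopts:

```go
package jsonoutput

import (
	"io"

	jsoniter "github.com/json-iterator/go"
)

// cfg behaves like encoding/json but encodes considerably faster.
var cfg = jsoniter.ConfigCompatibleWithStandardLibrary

// newEncoder could replace json.NewEncoder in the sketch above; jsoniter's
// Encoder exposes the same Encode method, so nothing else needs to change.
func newEncoder(w io.Writer) *jsoniter.Encoder {
	return cfg.NewEncoder(w)
}
```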

mstoykov requested a review from na-- on August 14, 2019 at 09:25
mstoykov (Collaborator, Author) commented:

I would like to add that the test here and in #1113 is basically using all available CPU either way, so there is essentially no headroom; adding an output WILL decrease performance in all cases.

In both PRs I am basically making that hit smaller and hopefully less noticeable in the cases where CPU usage is at 100%, while still saving all of the metric data.

In this PR, for example, the fact that gzip decreases the JSON output by 30x makes the JSON output viable as a metric output even for large and long tests. The example above with 6.5k+ RPS for 4 minutes was producing 3GB files, which got reduced to 100-120MB with gzip. While in the example above we would probably fill our memory with unwritten samples before a 2-hour test ends (back-of-envelope: 3GB per 4 minutes is roughly 12.5MB/s of encoded samples, on the order of 90GB over 2 hours, so any sustained lag in writing accumulates in RAM), it would probably work at some lower RPS.

I did some testing, and at around 5.5-5.6k RPS all the JSON writes finish within 100ms or less, which makes it viable for that load IMO.
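For completeness, assuming the usual k6 `--out` syntax, the gzip behavior described above is triggered purely by the file extension (file names here are illustrative):

```sh
# plain JSON output
k6 run --out json=results.json script.js

# gzip-compressed output, because the name ends in .gz
k6 run --out json=results.json.gz script.js
```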

codecov-io commented Aug 14, 2019:

Codecov Report

Merging #1114 into master will decrease coverage by 0.15%.
The diff coverage is 13.46%.


@@            Coverage Diff             @@
##           master    #1114      +/-   ##
==========================================
- Coverage   73.25%   73.09%   -0.16%     
==========================================
  Files         141      141              
  Lines       10260    10282      +22     
==========================================
  Hits         7516     7516              
- Misses       2302     2322      +20     
- Partials      442      444       +2
| Impacted Files | Coverage Δ |
| --- | --- |
| stats/json/collector.go | 12.34% <13.46%> (-1.22%) ⬇️ |
| core/engine.go | 92.99% <0%> (-0.94%) ⬇️ |

Continue to review the full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e3c279f...67cede7.

na-- changed the title from "perf(jsonOutput): speed up json output" to "Speed up json output" on Aug 14, 2019
na-- changed the title from "Speed up json output" to "Speed up JSON output" on Aug 14, 2019
imiric (Contributor) left a comment:

👍

mstoykov merged commit 2dbf695 into master on Sep 2, 2019
mstoykov deleted the fix/jsonOutputBeingSlow branch on September 2, 2019 at 05:39
srguglielmo pushed a commit to srguglielmo/k6 that referenced this pull request Nov 3, 2019
* perf(jsonOutput): speed up json output
