Add detailed CPU benchmark #1188

Merged
merged 4 commits into from
Oct 7, 2020

marcotc
Member

@marcotc marcotc commented Sep 28, 2020

This PR adds CPU profiling using ruby-prof.

All our existing benchmarks only provide deep analysis for memory behavior. This PR introduces a detailed analysis tool to measure application timing.

By default, the results are output in Cachegrind format, and can be analyzed with tools like KCachegrind or QCachegrind.
Here's an example:
[Screenshot: Screen Shot 2020-09-28 at 3 04 25 PM]

When the benchmark runs, instructions are printed on how to get to this colorful screen I posted above.
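
For readers unfamiliar with ruby-prof, a profiling run that produces call-tree output might look roughly like the sketch below. This is not the benchmark code from this PR: `profile_to_call_tree` and its `output_dir` argument are hypothetical names, and the sketch assumes the `ruby-prof` gem is installed and that the output directory already exists.

```ruby
# Hypothetical helper, not the code added in this PR.
# Profiles the given block and writes call-tree output files that
# viewers such as KCachegrind or QCachegrind can open.
def profile_to_call_tree(output_dir)
  # Third-party gem, required lazily so this file loads even without it.
  require 'ruby-prof'

  # Run the caller's block under the profiler and collect the result.
  result = RubyProf.profile { yield }

  # CallTreePrinter writes its files into the directory given by :path.
  RubyProf::CallTreePrinter.new(result).print(path: output_dir)
  output_dir
end
```

A call such as `profile_to_call_tree('tmp') { run_workload }`, where `run_workload` stands in for the code under test, would then leave the profile files in `tmp`.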

@marcotc marcotc added the performance Involves performance (e.g. CPU, memory, etc) label Sep 28, 2020
@marcotc marcotc requested a review from a team September 28, 2020 19:08
@marcotc marcotc self-assigned this Sep 28, 2020
after { tracer.shutdown! }

let(:writer) { Datadog::Writer.new(buffer_size: 1000, flush_interval: 0) }
Member Author


This ensures that spans are actually consumed quickly by the writer, instead of mostly being dropped by the buffer because of the default 1000-span limit.

Our benchmarks aim to simulate a realistic user scenario, but as quickly as possible. Having the worker flush spans more frequently helps us accomplish that goal.
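
The effect of flushing more often can be illustrated with a toy model. The sketch below is an assumption-laden simplification, not Datadog's buffer: `BoundedBuffer` and `simulate` are hypothetical names, and this buffer simply discards new items once it is full.

```ruby
# Hypothetical, simplified model of a bounded span buffer.
# Not Datadog's implementation: items pushed onto a full buffer are dropped.
class BoundedBuffer
  attr_reader :dropped

  def initialize(max_size)
    @max_size = max_size
    @items = []
    @dropped = 0
  end

  # Stores the item, or counts it as dropped when the buffer is full.
  def push(item)
    if @items.size >= @max_size
      @dropped += 1
    else
      @items << item
    end
  end

  # Hands all buffered items to the caller and empties the buffer.
  def flush
    flushed = @items
    @items = []
    flushed
  end
end

# Pushes `count` items, flushing every `flush_every` pushes when given,
# and once more at the end.
def simulate(count:, max_size:, flush_every: nil)
  buffer = BoundedBuffer.new(max_size)
  delivered = 0
  count.times do |i|
    buffer.push(i)
    delivered += buffer.flush.size if flush_every && ((i + 1) % flush_every).zero?
  end
  delivered += buffer.flush.size
  { delivered: delivered, dropped: buffer.dropped }
end

slow = simulate(count: 5_000, max_size: 1_000)
fast = simulate(count: 5_000, max_size: 1_000, flush_every: 100)
```

In this model, the run that only flushes at the end delivers 1000 items and drops the other 4000, while the run that flushes every 100 pushes delivers all 5000.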


# Warm up
def warm_up
Member Author


Warm-up was previously done in an RSpec before block, which could run at any point during the setup phase. That causes issues when a benchmark's setup must complete before warm-up starts.

We have now moved warm_up inside the test run block, which ensures that it runs after all setup is done.
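
The ordering guarantee can be sketched with the standard library alone. `run_benchmark` below is a hypothetical runner, not the helper from this PR; it only shows the idea of performing setup, warm-up, and measurement in one place so that hook scheduling cannot reorder them.

```ruby
require 'benchmark'

# Hypothetical runner, not the helper from this PR.
# Setup runs first, then warm-up, then the timed iterations.
def run_benchmark(setup:, warm_up: 10, iterations: 100, log: [])
  log << :setup
  context = setup.call

  # Warm-up iterations are executed but not timed.
  log << :warm_up
  warm_up.times { yield context }

  # Only these iterations contribute to the reported time.
  log << :measure
  elapsed = Benchmark.realtime { iterations.times { yield context } }

  { elapsed: elapsed, log: log }
end

outcome = run_benchmark(setup: -> { Array.new(1_000) { |i| i } }) do |numbers|
  numbers.sum
end
```

Here `outcome[:log]` records the phases in the order they ran, and `outcome[:elapsed]` holds the wall-clock seconds of the measured iterations only.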

# Read HTTP request to allow other side to have enough
# buffer write room. If we don't, the client won't be
# able to send the full request until the buffer is cleared.
conn.read(1 << 31 - 1)
Member Author


This read is needed because, in the end-to-end test, we send very large payloads: the buffer constantly holds close to 1000 items.

The network write buffer was filling up on the worker side, so the worker would block and stop flushing until it timed out.
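
The behavior described here can be reproduced in isolation with Ruby's standard `socket` library. The sketch below is not the test server from this PR: the loopback server, the 4 MiB payload, and the minimal HTTP response are assumptions chosen for illustration. The server drains everything the client sends before replying, so the client's large write can complete.

```ruby
require 'socket'

# Listen on an ephemeral loopback port (port 0 lets the OS pick one).
server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]

# Larger than typical socket buffers, so the write depends on the reader.
payload = 'x' * (4 * 1024 * 1024)

client_thread = Thread.new do
  socket = TCPSocket.new('127.0.0.1', port)
  socket.write(payload) # can only finish if the server keeps reading
  socket.close_write    # signals EOF to the server
  reply = socket.read   # read until the server closes the connection
  socket.close
  reply
end

conn = server.accept
# With no length argument, IO#read reads until EOF, draining the request.
received = conn.read
conn.write("HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n")
conn.close
server.close

response = client_thread.value
```

Incidentally, Ruby parses `1 << 31 - 1` as `1 << (31 - 1)`, because `-` binds more tightly than `<<`, so the length passed to `conn.read` in the diff is an upper bound of 2**30 bytes.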

@marcotc marcotc changed the title Add CPU benchmark Add detailed CPU benchmark Sep 28, 2020
ericmustin
ericmustin previously approved these changes Sep 29, 2020
Contributor

@ericmustin ericmustin left a comment


LGTM. Maybe we want to include a line or small section in the development docs about using it, so third-party contributors know to use tools like KCachegrind or QCachegrind to view the results, or more generally anything we want a contributor to pay attention to in the results?

@marcotc
Member Author

marcotc commented Sep 29, 2020

@ericmustin I added a section to our developer guide about benchmarks.
I believe these instructions should be available in our Pull Request templates, when we implement them.

Benchmark-specific instructions, like how to process the results in this PR, are printed in each benchmark run, as they are potentially different for each benchmark. It would be hard to keep our markdown file and the benchmark instructions in sync, while printing them with the benchmarks makes that part easier.
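
As a rough illustration of that approach, a benchmark could build its own post-run message from the path it just wrote. `result_instructions` and the example path below are hypothetical and are not the text printed by the benchmark in this PR.

```ruby
# Hypothetical helper: builds the post-run message for one result file,
# so the instructions always match the file that was actually produced.
def result_instructions(path)
  <<~TEXT
    CPU profile written to: #{path}
    Open it with a call-graph viewer, for example:
      qcachegrind #{path}
      kcachegrind #{path}
  TEXT
end

puts result_instructions('tmp/callgrind.out.example')
```

Because the message is generated next to the code that writes the file, a change to the output location updates the instructions at the same time.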

@marcotc marcotc merged commit fbb068a into master Oct 7, 2020
@marcotc marcotc added this to the 0.42.0 milestone Oct 7, 2020
michaelkl pushed a commit to michaelkl/dd-trace-rb that referenced this pull request Oct 23, 2020
* Add CPU benchmark

* Remove incompatible ruby-prof for JRuby runs

* Remove incompatible 'ruby-prof' for Ruby < 2.4 runs

* Add benchmarks to developer guide
@ivoanjo ivoanjo deleted the perf/cpu-bench branch July 16, 2021 09:13
Labels
performance Involves performance (e.g. CPU, memory, etc)

2 participants