Skip to content

test(common): tradeoff running time vs. repro power#12378

Merged
coryan merged 3 commits intogoogleapis:mainfrom
drfloob:reduce-grpc-Alarm-synthetic-test-load
Aug 14, 2023
Merged

test(common): tradeoff running time vs. repro power#12378
coryan merged 3 commits intogoogleapis:mainfrom
drfloob:reduce-grpc-Alarm-synthetic-test-load

Conversation

@drfloob
Copy link
Contributor

@drfloob drfloob commented Aug 14, 2023

This is a gRPC test that relied on the high performance of an uncommon (synthetic) gRPC Alarm usage pattern. Per the test comments, this change will reduce the likelihood of seeing a regression in gRPC from 99% of runs, to 10% of runs. I think this should still be sufficient to identify problems. Running this test with the next generation of gRPC Alarm implementations, in 100 runs, this TimerCancel test never exceeded 10 seconds in the asan-pr docker environment (most commonly: 5 to 7s).


This change is Reviewable

@drfloob drfloob requested a review from a team August 14, 2023 21:39
@drfloob drfloob temporarily deployed to external August 14, 2023 21:40 — with GitHub Actions Inactive
@conventional-commit-lint-gcf
Copy link

conventional-commit-lint-gcf bot commented Aug 14, 2023

🤖 I detect that the PR title and the commit message differ and there's only one commit. To use the PR title for the commit history, you can use Github's automerge feature with squashing, or use automerge label. Good luck human!

-- conventional-commit-lint bot
https://conventionalcommits.org/

@coryan coryan changed the title Reduce TimerCancel gRPC test load to 1000 iterations test(common): tradeoff running time vs. repro power Aug 14, 2023
@coryan
Copy link
Contributor

coryan commented Aug 14, 2023

SGTM. Let's see what the builds say.

@coryan
Copy link
Contributor

coryan commented Aug 14, 2023

/gcbrun

using TimerFuture = future<StatusOr<std::chrono::system_clock::time_point>>;
auto worker = [&](CompletionQueue cq) {
for (int i = 0; i != 10000; ++i) {
for (int i = 0; i != 1000; ++i) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should change the comments on line 81 to reflect the new expected values. Something like:

// The magic numbers in this test (number of iterations, thread, etc.) were originally tuned
// to repro #5141 in 99 out of 100 runs using the workstations available at the time.  Since
// then, the number of iterations was reduced. The higher number was making the test too
// long with a the new, more deterministic albeit slower implementation of gRPC alarms.
// The new implementation expires the alarm in a I/O thread vs. sometimes using the thread
// scheduling the alarm.  We expect a repro in 10 out of 100 runs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, PTAL.

@codecov
Copy link

codecov bot commented Aug 14, 2023

Codecov Report

Patch coverage: 100.00% and no project coverage change.

Comparison is base (a674bd7) 93.61% compared to head (b7ec042) 93.61%.
Report is 12 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main   #12378   +/-   ##
=======================================
  Coverage   93.61%   93.61%           
=======================================
  Files        2029     2029           
  Lines      179856   179856           
=======================================
+ Hits       168377   168379    +2     
+ Misses      11479    11477    -2     
Files Changed Coverage Δ
google/cloud/completion_queue_test.cc 97.65% <100.00%> (ø)

... and 8 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@drfloob drfloob temporarily deployed to external August 14, 2023 22:39 — with GitHub Actions Inactive
@coryan
Copy link
Contributor

coryan commented Aug 14, 2023

/gcbrun

@coryan coryan merged commit 1f08806 into googleapis:main Aug 14, 2023
drfloob added a commit to grpc/grpc that referenced this pull request Aug 16, 2023
…hancements (#34056)

This PR is mainly a set of improvements that allow the C++ Alarm to be
migrated away from legacy iomgr. It cannot be landed without significant
speedup, due to third-parties relying on a fast path for immediate timer
execution with deadlines <= now.

Previous EventEngine performance of bm_alarm, compared to baseline iomgr
timers: *0.014%*
This PR: *2.5%*

Regarding previous failures to land this change: The cloud libraries
team agreed to reduce the amount of stress in their alarm stress test
googleapis/google-cloud-cpp#12378
eugeneo pushed a commit to eugeneo/grpc that referenced this pull request Aug 21, 2023
…hancements (grpc#34056)

This PR is mainly a set of improvements that allow the C++ Alarm to be
migrated away from legacy iomgr. It cannot be landed without significant
speedup, due to third-parties relying on a fast path for immediate timer
execution with deadlines <= now.

Previous EventEngine performance of bm_alarm, compared to baseline iomgr
timers: *0.014%*
This PR: *2.5%*

Regarding previous failures to land this change: The cloud libraries
team agreed to reduce the amount of stress in their alarm stress test
googleapis/google-cloud-cpp#12378
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants