Investigate arm64 robustness performance #17595
Merged
In October last year we switched our arm64 CI infrastructure from self-managed Equinix Metal to managed arm64 runners provided via the CNCF and actuated.dev.
Since then we have completed some right-sizing of memory requirements for all our `arm64` workflows; however, we are still hitting teething issues with CPU performance, notably in robustness testing.

Performance issues were tracked in:
One mitigation to improve CPU performance was to disable lazyfs for `arm64`, which was completed in #17323.

Even with lazyfs disabled we are still seeing failures. Recent examples are:
The above failures indicate CPU and/or disk IOPS performance bottlenecks, so this pull request increases the robustness CPU cores from 8 to 12 and also enables vmmeter so we can more closely introspect the performance of the `arm64` runners versus the standard GitHub `amd64` runners.

cc @serathius, @alexellis