set crash kernel memory based on host memory #3650
Closed
Merge Checklist
All boxes should be checked before merging the PR (just tick any boxes which don't apply to this PR)
*-static subpackages, etc.) have had their Release tag incremented.
(./cgmanifest.json, ./toolkit/tools/cgmanifest.json, ./toolkit/scripts/toolchain/cgmanifest.json, .github/workflows/cgmanifest.json)
(./SPECS/LICENSES-AND-NOTICES/data/licenses.json, ./SPECS/LICENSES-AND-NOTICES/LICENSES-MAP.md, ./SPECS/LICENSES-AND-NOTICES/LICENSE-EXCEPTIONS.PHOTON)
*.signatures.json files
sudo make go-tidy-all and sudo make go-test-coverage pass
Summary
What does the PR accomplish, why was it needed?
Set the crashkernel parameter in kernel.spec based on host memory. This makes kernel crash recovery work consistently. Reserving less memory than needed causes kdump to take longer to create the vmcore file (the memory dump).
Based on a few sources (1, 2), the memory allocated to the crash kernel ought to scale with the total memory available to the host. The crashkernel=auto option tries to do this, and ideally we would use it. However, on a Hyper-V VM with 2 GB of RAM, the auto option still assigns only 128 MB to the crash kernel, which causes kdump issues. So we want something similar to the auto option, but with more memory allocated to the crash kernel. The range I use is based on a recommended config for RHEL 6.0 and RHEL 6.1, with one lower bound slightly modified (2 GB-6 GB -> 1 GB-6 GB) so that kernel recovery is consistent on my Hyper-V instance (128 MB wasn't enough with 1.8 GB of RAM available). Using this range, I was also able to recover consistently with 6 GB and 9 GB of RAM, which suggests the range works for higher RAM values.
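To illustrate how a range-based crashkernel value maps host memory to a reservation, here is a minimal Python sketch. It assumes the kernel's `start-[end]:size` range syntax, with the end of each range treated as exclusive. The spec string `EXAMPLE_SPEC` and the helper names are hypothetical: the sizes are placeholders chosen to show the 1G-6G lower bound discussed above, not necessarily the values committed in kernel.spec.

```python
import re

# Binary unit suffixes accepted in the size fields.
_UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Convert a size such as '256M' or '1G' to bytes; a bare number is bytes."""
    m = re.fullmatch(r"(\d+)([KMG]?)", text.strip().upper())
    if not m:
        raise ValueError(f"bad size: {text!r}")
    return int(m.group(1)) * _UNITS.get(m.group(2), 1)


def pick_crashkernel(spec, total_ram_bytes):
    """Return the reservation (bytes) for the first range containing total RAM.

    Each entry is 'start-[end]:size'; a missing end means 'no upper bound'.
    Returns 0 when no range matches.
    """
    for entry in spec.split(","):
        rng, size = entry.split(":")
        start, _, end = rng.partition("-")
        lo = parse_size(start)
        hi = parse_size(end) if end else float("inf")
        if lo <= total_ram_bytes < hi:
            return parse_size(size)
    return 0


# Hypothetical spec for illustration only.
EXAMPLE_SPEC = "0M-1G:128M,1G-6G:256M,6G-8G:512M,8G-:768M"

if __name__ == "__main__":
    for gib in (0.5, 1.8, 6, 9):
        reserved = pick_crashkernel(EXAMPLE_SPEC, int(gib * (1 << 30)))
        print(f"{gib} GiB host -> {reserved >> 20} MiB crash kernel")
```

With this placeholder spec, a host with 1.8 GB of RAM falls into the 1G-6G range rather than the lowest one, which is the effect the modified lower bound is meant to have.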
Follow up:
Why does Mariner's crash kernel require more than 128 MB (the default recommended by Canonical and assigned by the auto option) for kernel crashes to recover consistently?
Do higher RAM values really require a bigger crash kernel? Maybe we can use crashkernel=256M (or lower) for higher values as well.
A: Further testing on Hyper-V suggests we could stick to 256M for everything. I was able to crash and recover logs quickly and repeatedly using 256M on a VM with 10 GB of RAM.
Change Log
Does this affect the toolchain?
YES
Associated issues
Test Methodology
Ran echo c > /proc/sysrq-trigger to trigger a kernel crash and noticed inconsistent results (sometimes it would recover after a few minutes, sometimes it would hang and not recover).
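Before triggering a crash, it can help to confirm that the crashkernel setting actually took effect on the running system. The sketch below is one way to check, assuming the standard Linux paths /proc/cmdline and /sys/kernel/kexec_crash_size; the function names are hypothetical and not part of this PR.

```python
from pathlib import Path


def crashkernel_arg(cmdline):
    """Return the value of the last crashkernel= token in a kernel command line.

    Returns None when the parameter is absent.
    """
    value = None
    for token in cmdline.split():
        if token.startswith("crashkernel="):
            value = token.split("=", 1)[1]
    return value


def reserved_bytes(path="/sys/kernel/kexec_crash_size"):
    """Return the crash kernel reservation in bytes, or None if the file is missing."""
    p = Path(path)
    return int(p.read_text().strip()) if p.exists() else None


if __name__ == "__main__":
    cmdline_file = Path("/proc/cmdline")
    if cmdline_file.exists():
        print("crashkernel argument:", crashkernel_arg(cmdline_file.read_text()))
    print("reserved bytes:", reserved_bytes())
```

A reported reservation of 0 bytes would mean no memory was set aside, in which case kdump has nothing to boot into after the crash.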