Allow innodb to use multiple large page sizes #257
Closed
Linux has supported multiple large page sizes since kernel ~2.6.32.
The Linux 3.8 kernel added mmap flags to request a specific large page
size.
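As a rough illustration of the mmap interface mentioned above, the sketch below encodes a page size into the mmap flags by placing log2 of the size at `MAP_HUGE_SHIFT`. The helper names `log2_exact`, `huge_mmap_flags` and `map_huge` are hypothetical and are not taken from this patch; the fallback value 26 for `MAP_HUGE_SHIFT` is the one used by the Linux headers.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* exposes MAP_ANONYMOUS and MAP_HUGETLB in glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>

#ifndef MAP_HUGE_SHIFT
#define MAP_HUGE_SHIFT 26    /* bit offset of the encoded page size */
#endif

/* Return log2(size) when size is a power of two, otherwise -1. */
int log2_exact(size_t size)
{
    int shift = 0;
    if (size == 0 || (size & (size - 1)) != 0)
        return -1;
    while ((size >> shift) != 1)
        shift++;
    return shift;
}

/* Store in *flags the mmap flags that request exactly page_size huge pages.
   Returns 0 on success, -1 if page_size is not a power of two. */
int huge_mmap_flags(size_t page_size, int *flags)
{
    int shift = log2_exact(page_size);
    if (shift < 0 || shift > 63)
        return -1;
    *flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB
           | (int)((unsigned)shift << MAP_HUGE_SHIFT);
    return 0;
}

/* Try to map len bytes backed by page_size huge pages.
   Returns NULL when the kernel cannot satisfy the request. */
void *map_huge(size_t len, size_t page_size)
{
    int flags;
    void *p;
    if (huge_mmap_flags(page_size, &flags) != 0)
        return NULL;
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, flags, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

A block obtained this way is released with munmap, so no shmdt call is involved.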
Currently innodb falls back to conventional mmap if shmget SHM_HUGETLB fails,
meaning the deallocation attempts to use shmdt on an mmapped segment.
Using shared memory also means that the kernel limits kernel.shmall and
kernel.shmmax need to be adjusted and a hugetlbfs mount has to exist.
For all these reasons mmap is the easier function to use. The sysadmin can
change the number of huge pages available without rebooting, mounting
filesystems, adjusting sysctls or changing large-page-size settings, like:
echo 4 > /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages
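To see which page sizes a running kernel exposes, one option is to scan the `hugepages-<N>kB` entries under `/sys/kernel/mm/hugepages`. The sketch below is only an illustration of that idea, not the code in this patch; `parse_hugepage_dir` and `list_hugepage_sizes` are hypothetical names.

```c
#include <dirent.h>
#include <stddef.h>
#include <stdio.h>

/* Convert a sysfs entry name such as "hugepages-2048kB" to a size in bytes.
   Returns 0 when the name does not have that exact form. */
size_t parse_hugepage_dir(const char *name)
{
    unsigned long long kb = 0;
    int consumed = 0;
    if (sscanf(name, "hugepages-%llukB%n", &kb, &consumed) != 1)
        return 0;
    if (consumed == 0 || name[consumed] != '\0')
        return 0;               /* missing "kB" suffix or trailing characters */
    return (size_t)kb * 1024;
}

/* Collect up to max page sizes found in dir_path into sizes[].
   Returns the number of entries stored; 0 if the directory is unreadable. */
size_t list_hugepage_sizes(const char *dir_path, size_t *sizes, size_t max)
{
    DIR *d = opendir(dir_path);
    struct dirent *e;
    size_t n = 0;
    if (d == NULL)
        return 0;
    while (n < max && (e = readdir(d)) != NULL) {
        size_t sz = parse_hugepage_dir(e->d_name);
        if (sz != 0)
            sizes[n++] = sz;
    }
    closedir(d);
    return n;
}
```

Calling `list_hugepage_sizes("/sys/kernel/mm/hugepages", sizes, 8)` would fill `sizes` with whatever the kernel reports; the order follows readdir and is not sorted.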
The innodb large page allocator will choose a large page size smaller than
or equal to the requested size and allocate a block of memory. This means 1G
pages will be used for a large innodb buffer pool while log buffers can
use 2M pages. If a large page size is unavailable it will fall back to
a smaller page size before reverting to conventional memory.
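The selection and fallback order described above could look roughly like the following. This is a sketch under the assumption that the supported sizes are already known as an array of byte counts; `pick_page_size`, `fallback_order` and `round_up_to_page` are hypothetical helpers, not names from the patch.

```c
#include <stddef.h>

/* Return the largest entry of sizes[0..n) that is <= limit, or 0 if none. */
size_t pick_page_size(const size_t *sizes, size_t n, size_t limit)
{
    size_t best = 0;
    size_t i;
    for (i = 0; i < n; i++)
        if (sizes[i] <= limit && sizes[i] > best)
            best = sizes[i];
    return best;
}

/* Write to out[] the page sizes to try for a request, largest first.
   out must have room for n entries. Returns the number of candidates;
   0 means only conventional memory is left. */
size_t fallback_order(const size_t *sizes, size_t n, size_t request,
                      size_t *out)
{
    size_t count = 0;
    size_t limit = request;
    while (count < n) {
        size_t ps = pick_page_size(sizes, n, limit);
        if (ps == 0)
            break;
        out[count++] = ps;
        limit = ps - 1;          /* next candidate must be strictly smaller */
    }
    return count;
}

/* Round len up to a whole number of pages of size page_size. */
size_t round_up_to_page(size_t len, size_t page_size)
{
    return (len + page_size - 1) / page_size * page_size;
}
```

A caller would try each candidate in turn and use conventional memory only after every one of them has failed.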
The meaning of the large-page-size system variable has changed for 3.8
kernels that support multiple page sizes. 0 means choose the most
appropriate size for the allocation; otherwise it is the largest page size
that will be used. This is only a compatibility issue if large-pages=1
with large-page-size=0 was relied on as a valid disabling mechanism.
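One way to read the new large-page-size semantics is as an upper bound on the page size considered for each allocation. The helper below is a hypothetical illustration of that reading, not the patch's code; both arguments are byte counts.

```c
#include <stddef.h>

/* Upper bound on the page size to consider for one allocation.
   large_page_size == 0 means "no cap": the request alone decides.
   Otherwise the setting caps the page size, never exceeding the request. */
size_t effective_page_limit(size_t large_page_size, size_t request)
{
    if (large_page_size == 0 || large_page_size > request)
        return request;
    return large_page_size;
}
```

With large-page-size set to 2M, a multi-gigabyte buffer pool would then be limited to 2M pages, while a setting of 0 leaves 1G pages available to it.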