add Learn how to enable libhugetlbfs to increase performance #611
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,39 @@ | ||
--- | ||
title: Learn how to enable libhugetlbfs to increase performance on Arm Server | ||
|
||
minutes_to_complete: 60 | ||
|
||
who_is_this_for: This is an advanced topic for performance engineers who want to tune performance on Arm servers. | ||
|
||
learning_objectives: | ||
- Enable libhugetlbfs to increase performance | ||
- Measure how much performance improves on workloads such as MySQL and Redis | ||
|
||
prerequisites: | ||
- A system with Ubuntu 20 installed | ||
- Knowledge of how to build MySQL server and run the sysbench benchmark | ||
- Knowledge of how to build Redis server and run the memtier benchmark | ||
|
||
author_primary: Bolt Liu | ||
|
||
skilllevels: Advanced | ||
subjects: Databases | ||
armips: | ||
- Neoverse | ||
operatingsystems: | ||
- Linux | ||
tools_software_languages: | ||
- C | ||
- C++ | ||
|
||
test_images: | ||
- ubuntu:latest | ||
test_link: null | ||
test_maintenance: true | ||
test_status: | ||
- passed | ||
|
||
weight: 1 | ||
layout: learningpathall | ||
learning_path_main_page: 'yes' | ||
--- |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
--- | ||
# ================================================================================ | ||
# Edit | ||
# ================================================================================ | ||
|
||
next_step_guidance: > | ||
You can continue learning about migrating applications to Arm. | ||
# 1-3 sentence recommendation outlining how the reader can generally keep learning about these topics, and a specific explanation of why the next step is being recommended. | ||
|
||
recommended_path: "/learning-paths/servers-and-cloud-computing/mysql_tune/" | ||
# Link to the next learning path being recommended(For example this could be /learning-paths/servers-and-cloud-computing/mongodb). | ||
|
||
# further_reading links to references related to this path. Can be: | ||
# Manuals for a tool / software mentioned (type: documentation) | ||
# Blog about related topics (type: blog) | ||
# General online references (type: website) | ||
|
||
further_reading: | ||
- resource: | ||
title: libhugetlbfs manpage | ||
link: https://linux.die.net/man/7/libhugetlbfs | ||
type: documentation | ||
|
||
# ================================================================================ | ||
# FIXED, DO NOT MODIFY | ||
# ================================================================================ | ||
weight: 21 # set to always be larger than the content in this path, and one more than 'review' | ||
title: "Next Steps" # Always the same | ||
layout: "learningpathall" # All files under learning paths have this same wrapper | ||
--- |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,55 @@ | ||
--- | ||
# ================================================================================ | ||
# Edit | ||
# ================================================================================ | ||
|
||
# Always 3 questions. Should try to test the reader's knowledge, and reinforce the key points you want them to remember. | ||
# question: A one sentence question | ||
# answers: The correct answers (from 2-4 answer options only). Should be surrounded by quotes. | ||
# correct_answer: An integer indicating what answer is correct (index starts from 0) | ||
# explanation: A short (1-3 sentence) explanation of why the correct answer is correct. Can add additional context if desired | ||
|
||
|
||
review: | ||
- questions: | ||
question: > | ||
In which build stage does libhugetlbfs take effect? | ||
answers: | ||
- preprocessing | ||
- compilation | ||
- assembly | ||
- linking | ||
correct_answer: 4 | ||
explanation: > | ||
libhugetlbfs takes effect during the linking stage, where it places program sections into huge pages. | ||
|
||
- questions: | ||
question: > | ||
libhugetlbfs can only place the code section of a program in huge pages. Is this true? | ||
answers: | ||
- Yes | ||
- No | ||
correct_answer: 2 | ||
explanation: > | ||
Although the code section is the typical section to place in huge pages, other sections such as data can also be placed in huge pages. | ||
nit: typo: s/sectition/section/ |
||
|
||
- questions: | ||
question: > | ||
After enabling libhugetlbfs on MySQL, which perf event decreases dramatically? | ||
nit: typo: s/decresed/decreased/ |
||
answers: | ||
- l1d_tlb_refill | ||
- l1i_tlb_refill | ||
- l2d_tlb_refill | ||
correct_answer: 2 | ||
explanation: > | ||
After enabling libhugetlbfs on MySQL, l1i_tlb_refill decreases dramatically, from 490,265,467 to 70,741,621. | ||
|
||
|
||
|
||
# ================================================================================ | ||
# FIXED, DO NOT MODIFY | ||
# ================================================================================ | ||
title: "Review" # Always the same title | ||
weight: 20 # Set to always be larger than the content in this path | ||
layout: "learningpathall" # All files under learning paths have this same wrapper | ||
--- |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,88 @@ | ||
--- | ||
title: "General enablement method of libhugetlbfs" | ||
weight: 2 | ||
layout: "learningpathall" | ||
--- | ||
|
||
## Introduction to libhugetlbfs | ||
libhugetlbfs is a library that can back application text, data, malloc() and shared memory with huge pages. This benefits applications that use large amounts of address space and suffer a performance hit due to TLB misses. With libhugetlbfs enabled, workloads with large code, data, or heap sections can see a significant performance improvement. | ||
|
||
|
||
## Install necessary packages | ||
On Ubuntu 20, install the necessary packages and create a symbolic link: | ||
``` | ||
$ sudo apt-get install libhugetlbfs-dev libhugetlbfs-bin | ||
$ sudo ln -s /usr/bin/ld.hugetlbfs /usr/share/libhugetlbfs/ld | ||
``` | ||
## Add compile option to enable libhugetlbfs | ||
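Add the following GCC options to the build script and rebuild the workload. These options act on the linker (`-B` points GCC at the `ld` symbolic link created above, and `-Wl,` passes options through to the linker), so libhugetlbfs takes effect at the linking stage: | ||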
``` | ||
-B /usr/share/libhugetlbfs -Wl,--hugetlbfs-align -no-pie -Wl,--no-as-needed | ||
|
||
``` | ||
|
||
## Enable system hugepage | ||
|
||
Enable huge pages on the Linux system. For example, reserve 1000 huge pages; with 2 MB huge pages, that is about 2 GB: | ||
``` | ||
# echo 1000 > /proc/sys/vm/nr_hugepages | ||
# cat /proc/meminfo |grep HugePages_Total | ||
HugePages_Total: 1000 | ||
|
||
``` | ||
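As a rough check on the sizing arithmetic above, the following sketch computes how much memory a huge page pool reserves and how many pages a given amount of memory needs. The helper names `hugepage_pool_kb` and `pages_needed` are hypothetical, and the 2048 kB page size is an assumption taken from this example; check `Hugepagesize` in /proc/meminfo on your own system.

```shell
# Hypothetical helpers for illustration; they are not part of libhugetlbfs.

# Total memory (kB) reserved by a pool of huge pages.
# $1 = number of pages, $2 = huge page size in kB
hugepage_pool_kb() {
    echo $(( $1 * $2 ))
}

# Number of huge pages needed to cover a memory size, rounded up.
# $1 = memory size in kB, $2 = huge page size in kB
pages_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

hugepage_pool_kb 1000 2048     # pool size for 1000 pages of 2048 kB
pages_needed 2097152 2048      # pages needed to cover 2 GiB (2097152 kB)
```

With these inputs, 1000 pages of 2048 kB reserve 2,048,000 kB, slightly less than 2 GiB, while covering a full 2 GiB takes 1024 pages.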
|
||
|
||
## Add HUGETLB_ELFMAP=RW before starting workload | ||
Prefix the command that starts the workload with HUGETLB_ELFMAP=RW, which places both read-only segments (such as code) and writable segments (such as data) in huge pages: | ||
``` | ||
$ HUGETLB_ELFMAP=RW [workload] | ||
``` | ||
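The prefix form relies on standard shell behaviour: a `NAME=value` assignment written in front of a command is placed only in that command's environment. The sketch below illustrates the scoping, with `sh -c` standing in for the real workload (an assumption for demonstration only).

```shell
# Start from a clean state so the demonstration is repeatable.
unset HUGETLB_ELFMAP

# Per-command assignment: only the child process sees the variable.
HUGETLB_ELFMAP=RW sh -c 'echo "child: ${HUGETLB_ELFMAP:-unset}"'

# The invoking shell itself is unaffected.
echo "parent: ${HUGETLB_ELFMAP:-unset}"
```

Scoping the variable to one command keeps the setting away from other programs started from the same shell.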
|
||
|
||
## Check that huge pages are used | ||
|
||
Make sure huge pages are in use by checking meminfo: | ||
|
||
``` | ||
cat /proc/meminfo | ||
|
||
HugePages_Total: 1000 | ||
HugePages_Free: 994 | ||
|
||
``` | ||
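The number of huge pages in use is the difference between `HugePages_Total` and `HugePages_Free`. A minimal sketch of that calculation follows; the helper name `hugepages_in_use` is hypothetical, and the sample input is copied from the output above.

```shell
# Hypothetical helper: huge pages in use = HugePages_Total - HugePages_Free.
# Reads meminfo-formatted text on standard input.
hugepages_in_use() {
    awk '/^HugePages_Total:/ { total = $2 }
         /^HugePages_Free:/  { free = $2 }
         END { print total - free }'
}

# Sample input copied from the output above.
printf 'HugePages_Total:    1000\nHugePages_Free:      994\n' | hugepages_in_use

# On a live system: hugepages_in_use < /proc/meminfo
```

For the sample values, 1000 total and 994 free, six huge pages are in use.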
|
||
Also check that the process has huge pages mapped: | ||
``` | ||
$ cat /proc/<pid>/smaps | less | ||
|
||
00200000-00400000 r-xp 00000000 00:25 48337 /dev/hugepages/libhugetlbfs.tmp.0D0D7x (deleted) | ||
Size: 2048 kB | ||
KernelPageSize: 2048 kB | ||
MMUPageSize: 2048 kB | ||
Rss: 0 kB | ||
Pss: 0 kB | ||
Shared_Clean: 0 kB | ||
Shared_Dirty: 0 kB | ||
Private_Clean: 0 kB | ||
Private_Dirty: 0 kB | ||
Referenced: 0 kB | ||
Anonymous: 0 kB | ||
LazyFree: 0 kB | ||
AnonHugePages: 0 kB | ||
ShmemPmdMapped: 0 kB | ||
FilePmdMapped: 0 kB | ||
Shared_Hugetlb: 0 kB | ||
Private_Hugetlb: 2048 kB | ||
Swap: 0 kB | ||
SwapPss: 0 kB | ||
Locked: 0 kB | ||
THPeligible: 0 | ||
``` | ||
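Rather than paging through every mapping, the hugetlb-backed memory of a process can be totalled from the `Private_Hugetlb` and `Shared_Hugetlb` fields. This is only a sketch: the helper name `hugetlb_kb` is hypothetical, and the sample input is copied from the mapping shown above.

```shell
# Hypothetical helper: sum Private_Hugetlb and Shared_Hugetlb (kB)
# over smaps-formatted text on standard input.
hugetlb_kb() {
    awk '/^(Private|Shared)_Hugetlb:/ { kb += $2 } END { print kb + 0 }'
}

# Sample input copied from the mapping shown above.
printf 'Shared_Hugetlb:        0 kB\nPrivate_Hugetlb:    2048 kB\n' | hugetlb_kb

# On a live system (replace <pid>): hugetlb_kb < /proc/<pid>/smaps
```

A result of 0 kB for a running workload would suggest that no hugetlb-backed mappings were created for it.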
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,159 @@ | ||
--- | ||
title: "Enable libhugetlbfs on MySQL" | ||
weight: 3 | ||
layout: "learningpathall" | ||
--- | ||
## Overview | ||
This page illustrates the steps to enable libhugetlbfs on MySQL and test results after enabling it. | ||
|
||
|
||
## Commands to build | ||
To build MySQL with libhugetlbfs, add the following options to both -DCMAKE_C_FLAGS and -DCMAKE_CXX_FLAGS: | ||
``` | ||
-B /usr/share/libhugetlbfs -Wl,--hugetlbfs-align -no-pie -Wl,--no-as-needed | ||
``` | ||
|
||
For example: | ||
``` | ||
$ cmake -DCMAKE_C_FLAGS="-g -mcpu=native -B /usr/share/libhugetlbfs -Wl,--hugetlbfs-align -no-pie -Wl,--no-as-needed" -DCMAKE_CXX_FLAGS="-g -mcpu=native -B /usr/share/libhugetlbfs -Wl,--hugetlbfs-align -no-pie -Wl,--no-as-needed" -DCMAKE_INSTALL_PREFIX=/home/mysql/mysql_install/1-install_8.0.33_huge -DWITH_BOOST=/home/mysql/boost_1_77_0/ .. | ||
$ make -j $(nproc) | ||
$ make install | ||
``` | ||
|
||
After the build, check that the program is linked against libhugetlbfs.so, for example: | ||
``` | ||
root@bolt-ecs:/data# ldd mysqld |grep huge | ||
libhugetlbfs.so.0 => /lib/aarch64-linux-gnu/libhugetlbfs.so.0 (0x0000ffffac690000) | ||
``` | ||
|
||
## Commands to run | ||
After rebuilding MySQL with libhugetlbfs, add HUGETLB_ELFMAP=RW at the beginning of the command that starts MySQL. For example: | ||
``` | ||
$ HUGETLB_ELFMAP=RW /home/mysql/mysql_install/1-install_8.0.33_huge/bin/mysqld ... | ||
``` | ||
|
||
Note: do not export HUGETLB_ELFMAP=RW as an environment variable. It has to be specified right before the mysqld executable. | ||
nit: typo: s/exectuable/executable/ |
||
|
||
|
||
## Test Results | ||
|
||
Testing MySQL without and with libhugetlbfs shows that throughput increased by about 11.9% when matching rounds are compared (8604 to 9627 and 8524 to 9538 events/s). | ||
|
||
### Without libhugetlbfs | ||
Reboot the server and run two rounds of tests. | ||
|
||
#### First round | ||
|
||
``` | ||
Throughput: | ||
|
||
events/s (eps): 8604.3221 | ||
|
||
time elapsed: 300.0912s | ||
|
||
total number of events: 2582080 | ||
|
||
``` | ||
|
||
#### Second round | ||
TPS is 8524. The perf stat output was also collected during the run: | ||
|
||
``` | ||
root@bolt-ecs:~# perf stat -e l1d_tlb_refill,l1i_tlb_refill,l2d_tlb_refill -a -- sleep 10 | ||
|
||
|
||
|
||
Performance counter stats for 'system wide': | ||
|
||
|
||
|
||
815,254,864 l1d_tlb_refill | ||
|
||
490,265,467 l1i_tlb_refill | ||
|
||
422,887,362 l2d_tlb_refill | ||
|
||
|
||
|
||
10.003289183 seconds time elapsed | ||
|
||
Throughput: | ||
|
||
events/s (eps): 8524.8307 | ||
|
||
time elapsed: 300.0878s | ||
|
||
total number of events: 2558197 | ||
|
||
``` | ||
|
||
### With libhugetlbfs | ||
|
||
Reboot and run two rounds of tests. To enable huge pages on the server, do the following: | ||
|
||
``` | ||
# chown -R mysql:mysql /dev/hugepages | ||
|
||
# echo 40 > /proc/sys/vm/nr_hugepages | ||
|
||
# cat /proc/meminfo | ||
|
||
HugePages_Total: 40 | ||
|
||
HugePages_Free: 4 | ||
|
||
HugePages_Rsvd: 1 | ||
|
||
HugePages_Surp: 0 | ||
|
||
Hugepagesize: 2048 kB | ||
|
||
Hugetlb: 81920 kB | ||
|
||
``` | ||
MySQL used 36 huge pages (36 × 2 MB = 72 MB) in this case. | ||
|
||
|
||
#### First round | ||
TPS is 9627, an 11.9% increase compared to the first round without huge pages (8604): | ||
``` | ||
Throughput: | ||
|
||
events/s (eps): 9627.5017 | ||
|
||
time elapsed: 300.0855s | ||
|
||
total number of events: 2889073 | ||
|
||
``` | ||
#### Second round | ||
TPS is 9538, an 11.9% increase compared to the second round without huge pages. The perf stat output shows that TLB misses are significantly reduced: | ||
|
||
``` | ||
root@bolt-ecs:~# perf stat -e l1d_tlb_refill,l1i_tlb_refill,l2d_tlb_refill -a -- sleep 10 | ||
|
||
|
||
|
||
Performance counter stats for 'system wide': | ||
|
||
|
||
|
||
688,157,786 l1d_tlb_refill | ||
|
||
70,741,621 l1i_tlb_refill | ||
|
||
254,054,393 l2d_tlb_refill | ||
|
||
|
||
|
||
10.002128509 seconds time elapsed | ||
|
||
Throughput: | ||
|
||
events/s (eps): 9538.9346 | ||
|
||
time elapsed: 300.0847s | ||
|
||
total number of events: 2862487 | ||
|
||
``` |
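The relative changes can be recomputed directly from the events/s and perf stat figures reported above. The sketch below uses a hypothetical helper, `pct_change`, and relies on `awk` for floating-point arithmetic, since POSIX shell arithmetic is integer-only.

```shell
# Hypothetical helper: percentage change from a baseline to a new value.
# $1 = baseline, $2 = new value
pct_change() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (new - base) / base * 100 }'
}

pct_change 8604.3221 9627.5017    # first round, events/s
pct_change 8524.8307 9538.9346    # second round, events/s
pct_change 490265467 70741621     # l1i_tlb_refill, second rounds
```

A negative result, as for l1i_tlb_refill, indicates a reduction relative to the run without libhugetlbfs.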
nit: typo: s/eanble/enable/