add Learn how to enable libhugetlbfs to increase performance #611

Merged
merged 1 commit into from Jan 3, 2024
@@ -0,0 +1,39 @@
---
title: Learn how to enable libhugetlbfs to increase performance on Arm Server

minutes_to_complete: 60

who_is_this_for: This is an advanced topic for performance engineers who want to tune performance on Arm servers.

learning_objectives:
- Enable libhugetlbfs to increase performance
- Measure how much performance improves on workloads such as MySQL and Redis

prerequisites:
- A system with Ubuntu 20 installed
- Knowledge of how to build MySQL server and run the sysbench benchmark
- Knowledge of how to build Redis server and run the memtier benchmark

author_primary: Bolt Liu

skilllevels: Advanced
subjects: Databases
armips:
- Neoverse
operatingsystems:
- Linux
tools_software_languages:
- C
- C++

test_images:
- ubuntu:latest
test_link: null
test_maintenance: true
test_status:
- passed

weight: 1
layout: learningpathall
learning_path_main_page: 'yes'
---
@@ -0,0 +1,30 @@
---
# ================================================================================
# Edit
# ================================================================================

next_step_guidance: >
You can continue learning about migrating applications to Arm.
# 1-3 sentence recommendation outlining how the reader can generally keep learning about these topics, and a specific explanation of why the next step is being recommended.

recommended_path: "/learning-paths/servers-and-cloud-computing/mysql_tune/"
# Link to the next learning path being recommended(For example this could be /learning-paths/servers-and-cloud-computing/mongodb).

# further_reading links to references related to this path. Can be:
# Manuals for a tool / software mentioned (type: documentation)
# Blog about related topics (type: blog)
# General online references (type: website)

further_reading:
- resource:
title: libhugetlbfs manpage
link: https://linux.die.net/man/7/libhugetlbfs
type: documentation

# ================================================================================
# FIXED, DO NOT MODIFY
# ================================================================================
weight: 21 # set to always be larger than the content in this path, and one more than 'review'
title: "Next Steps" # Always the same
layout: "learningpathall" # All files under learning paths have this same wrapper
---
@@ -0,0 +1,55 @@
---
# ================================================================================
# Edit
# ================================================================================

# Always 3 questions. Should try to test the reader's knowledge, and reinforce the key points you want them to remember.
# question: A one sentence question
# answers: The correct answers (from 2-4 answer options only). Should be surrounded by quotes.
# correct_answer: An integer indicating what answer is correct (index starts from 0)
# explanation: A short (1-3 sentence) explanation of why the correct answer is correct. Can add additional context if desired


review:
- questions:
question: >
In which build stage does libhugetlbfs take effect?
answers:
- preprocessing
- compilation
- assembly
- linking
correct_answer: 4
explanation: >
libhugetlbfs takes effect during the linking stage, where program sections are prepared for placement in hugepages.

- questions:
question: >
libhugetlbfs can only place the code section of a program in hugepages. Is this true?
answers:
- Yes
- No
correct_answer: 2
explanation: >
Although the code section is the typical section to place in hugepages, other sections such as data can also be placed in hugepages.

- questions:
question: >
After enabling libhugetlbfs on MySQL, which perf event decreases dramatically?
answers:
- l1d_tlb_refill
- l1i_tlb_refill
- l2d_tlb_refill
correct_answer: 2
explanation: >
After enabling libhugetlbfs on MySQL, l1i_tlb_refill decreases dramatically, from 490,265,467 to 70,741,621.



# ================================================================================
# FIXED, DO NOT MODIFY
# ================================================================================
title: "Review" # Always the same title
weight: 20 # Set to always be larger than the content in this path
layout: "learningpathall" # All files under learning paths have this same wrapper
---
@@ -0,0 +1,88 @@
---
title: "General method for enabling libhugetlbfs"
weight: 2
layout: "learningpathall"
---

## Introduction to libhugetlbfs
libhugetlbfs is a library that can back application text, data, malloc() and shared memory with hugepages. This benefits applications that use large amounts of address space and suffer a performance hit due to TLB misses. By enabling libhugetlbfs, workloads with large code, data, or heap sections can see a significant performance improvement.


## Install necessary packages
On Ubuntu 20, install the necessary packages and create a symbolic link:
```
$ sudo apt-get install libhugetlbfs-dev libhugetlbfs-bin
$ sudo ln -s /usr/bin/ld.hugetlbfs /usr/share/libhugetlbfs/ld
```
## Add compile option to enable libhugetlbfs
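Add the following GCC options to the build script and rebuild the workload. `-B /usr/share/libhugetlbfs` tells GCC to look in that directory for the linker (the `ld` symbolic link created above), and the `-Wl,` options are passed through to the linker, so libhugetlbfs is enabled at the linking stage: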
```
-B /usr/share/libhugetlbfs -Wl,--hugetlbfs-align -no-pie -Wl,--no-as-needed

```

## Enable system hugepage

Enable hugepages on the Linux system. For example, reserve 1000 huge pages; with 2 MB huge pages, that is about 2 GB:
```
# echo 1000 > /proc/sys/vm/nr_hugepages
# cat /proc/meminfo |grep HugePages_Total
HugePages_Total: 1000

```
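
The page count follows from the target size and the huge page size. A minimal sketch of that arithmetic, assuming a hypothetical helper name `pages_needed`, with sizes passed in MB and kB (the huge page size of a given system is reported as `Hugepagesize` in `/proc/meminfo`):

```shell
# Hypothetical helper: number of huge pages needed to cover a target size.
# $1 = target size in MB, $2 = huge page size in kB (2048 for 2 MB pages).
pages_needed() {
    target_kb=$(( $1 * 1024 ))
    # Round up so that a partially filled page still counts as a full page.
    echo $(( (target_kb + $2 - 1) / $2 ))
}

pages_needed 2000 2048   # 2000 MB backed by 2 MB pages
pages_needed 72 2048     # 72 MB backed by 2 MB pages
```

The result is the value to write to `/proc/sys/vm/nr_hugepages`.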


## Add HUGETLB_ELFMAP=RW before starting the workload
Prefix the workload command with HUGETLB_ELFMAP=RW, which places both read-only segments (such as code) and writable segments (such as data) in hugepages. For example:
```
$ HUGETLB_ELFMAP=RW [workload]
```
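
A variable assignment written as a command prefix is visible only to that one command, not to the rest of the shell session. A minimal sketch of that behaviour, using `sh -c` as a stand-in for the workload (the stand-in command is an assumption for illustration and is not linked with libhugetlbfs):

```shell
# Start from a clean state so the demonstration is unambiguous.
unset HUGETLB_ELFMAP

# The prefix form sets the variable only in the environment of this one command.
# `sh -c ...` stands in for the real workload binary here.
seen_by_workload=$(HUGETLB_ELFMAP=RW sh -c 'echo "${HUGETLB_ELFMAP:-unset}"')

# The calling shell itself is left untouched.
seen_by_shell=${HUGETLB_ELFMAP:-unset}

echo "workload sees: $seen_by_workload"
echo "shell sees:    $seen_by_shell"
```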


## Check that hugepages are used

Make sure hugepages are in use by checking meminfo. HugePages_Free should be lower than HugePages_Total:

```
cat /proc/meminfo

HugePages_Total: 1000
HugePages_Free: 994

```
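
The number of huge pages in use is `HugePages_Total` minus `HugePages_Free`. A small sketch that computes it from meminfo-formatted text on standard input. The function name `hugepages_used` is hypothetical, and the sample values are the ones shown above:

```shell
# Hypothetical helper: read /proc/meminfo-style text on stdin and
# print HugePages_Total - HugePages_Free.
hugepages_used() {
    awk '/^HugePages_Total:/ { total = $2 }
         /^HugePages_Free:/  { free  = $2 }
         END { print total - free }'
}

# Sample input taken from the output above.
printf 'HugePages_Total: 1000\nHugePages_Free: 994\n' | hugepages_used
```

On a live system, feed it the real file with `hugepages_used < /proc/meminfo`.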

Also check whether the process has huge pages mapped:
```
$ cat /proc/<pid>/smaps | less

00200000-00400000 r-xp 00000000 00:25 48337 /dev/hugepages/libhugetlbfs.tmp.0D0D7x (deleted)
Size: 2048 kB
KernelPageSize: 2048 kB
MMUPageSize: 2048 kB
Rss: 0 kB
Pss: 0 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 0 kB
Referenced: 0 kB
Anonymous: 0 kB
LazyFree: 0 kB
AnonHugePages: 0 kB
ShmemPmdMapped: 0 kB
FilePmdMapped: 0 kB
Shared_Hugetlb: 0 kB
Private_Hugetlb: 2048 kB
Swap: 0 kB
SwapPss: 0 kB
Locked: 0 kB
THPeligible: 0
```
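
A long smaps listing is tedious to scan by eye, so the hugetlb fields can be totalled instead. A minimal sketch, assuming a hypothetical helper `hugetlb_kb` that reads smaps-formatted text on standard input:

```shell
# Hypothetical helper: sum Private_Hugetlb and Shared_Hugetlb (in kB)
# over all mappings in /proc/<pid>/smaps-style text read from stdin.
hugetlb_kb() {
    awk '/^(Private|Shared)_Hugetlb:/ { sum += $2 }
         END { print sum + 0 }'
}

# Sample input: the two hugetlb lines from the mapping shown above.
printf 'Shared_Hugetlb: 0 kB\nPrivate_Hugetlb: 2048 kB\n' | hugetlb_kb
```

A non-zero total indicates that at least one mapping of the process is backed by huge pages.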








@@ -0,0 +1,159 @@
---
title: "Enable libhugetlbfs on MySQL"
weight: 3
layout: "learningpathall"
---
## Overview
This page shows the steps to enable libhugetlbfs on MySQL and the test results after enabling it.


## Commands to build
To build MySQL with libhugetlbfs, add the following options to both -DCMAKE_C_FLAGS and -DCMAKE_CXX_FLAGS:
```
-B /usr/share/libhugetlbfs -Wl,--hugetlbfs-align -no-pie -Wl,--no-as-needed
```

For example:
```
$ cmake -DCMAKE_C_FLAGS="-g -mcpu=native -B /usr/share/libhugetlbfs -Wl,--hugetlbfs-align -no-pie -Wl,--no-as-needed" -DCMAKE_CXX_FLAGS="-g -mcpu=native -B /usr/share/libhugetlbfs -Wl,--hugetlbfs-align -no-pie -Wl,--no-as-needed" -DCMAKE_INSTALL_PREFIX=/home/mysql/mysql_install/1-install_8.0.33_huge -DWITH_BOOST=/home/mysql/boost_1_77_0/ ..
$ make -j $(nproc)
$ make install
```
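
Because the same option string has to appear in both flag variables, keeping it in one shell variable avoids the two copies drifting apart. A minimal sketch that only assembles and prints the two cmake arguments. The variable names and the `cmake_flag_args` helper are hypothetical, and the remaining cmake options from the example above still have to be supplied:

```shell
# Options from this page, defined once.
HUGETLB_FLAGS="-B /usr/share/libhugetlbfs -Wl,--hugetlbfs-align -no-pie -Wl,--no-as-needed"
# Base flags used in the example above.
BASE_FLAGS="-g -mcpu=native"

# Hypothetical helper: print the two -D arguments, one per line.
cmake_flag_args() {
    printf '%s\n' "-DCMAKE_C_FLAGS=$BASE_FLAGS $HUGETLB_FLAGS"
    printf '%s\n' "-DCMAKE_CXX_FLAGS=$BASE_FLAGS $HUGETLB_FLAGS"
}

cmake_flag_args
```

Each printed line is meant to be passed to cmake as a single quoted argument.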

After the build, check that the program is linked to libhugetlbfs.so. For example:
```
root@bolt-ecs:/data# ldd mysqld |grep huge
libhugetlbfs.so.0 => /lib/aarch64-linux-gnu/libhugetlbfs.so.0 (0x0000ffffac690000)
```

## Commands to run
After rebuilding MySQL with libhugetlbfs, add HUGETLB_ELFMAP=RW at the beginning of the command that starts MySQL. For example:
```
$ HUGETLB_ELFMAP=RW /home/mysql/mysql_install/1-install_8.0.33_huge/bin/mysqld ...
```

Please note that HUGETLB_ELFMAP=RW should not be exported as an environment variable. It has to be specified right before the mysqld executable.

## Test Results

Testing MySQL without and with libhugetlbfs shows a performance increase of about 11.9% in both rounds.
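
The improvement can be recomputed from the sysbench events/s figures reported in the rounds below. A minimal sketch, with a hypothetical helper name `pct_gain`:

```shell
# Hypothetical helper: percentage gain of $2 (after) over $1 (before).
pct_gain() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (after - before) / before * 100 }'
}

pct_gain 8604.3221 9627.5017   # first round, without vs. with libhugetlbfs
pct_gain 8524.8307 9538.9346   # second round, without vs. with libhugetlbfs
```

Each call compares the same round number across the two configurations, so the first round is only compared with the first round and the second with the second.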

### Without libhugetlbfs
Reboot the server and run two rounds of tests.

#### First round

```
Throughput:
    events/s (eps): 8604.3221
    time elapsed: 300.0912s
    total number of events: 2582080
```

#### Second round
TPS is 8524. The perf stat output was also captured during the run:

```
root@bolt-ecs:~# perf stat -e l1d_tlb_refill,l1i_tlb_refill,l2d_tlb_refill -a -- sleep 10

 Performance counter stats for 'system wide':

       815,254,864      l1d_tlb_refill
       490,265,467      l1i_tlb_refill
       422,887,362      l2d_tlb_refill

      10.003289183 seconds time elapsed

Throughput:
    events/s (eps): 8524.8307
    time elapsed: 300.0878s
    total number of events: 2558197
```

### With libhugetlbfs

Reboot and run two rounds of tests. To enable hugepages on the server, do the following:

```
# chown -R mysql:mysql /dev/hugepages
# echo 40 > /proc/sys/vm/nr_hugepages
# cat /proc/meminfo
HugePages_Total: 40
HugePages_Free: 4
HugePages_Rsvd: 1
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 81920 kB
```
MySQL used 36 huge pages (36 * 2 MB = 72 MB) in this case.


#### First round
TPS is 9627, an 11.9% increase over the first round without hugepages:
```
Throughput:
    events/s (eps): 9627.5017
    time elapsed: 300.0855s
    total number of events: 2889073
```
#### Second round
TPS is 9538, an 11.9% increase over the second round without hugepages. perf stat shows that TLB misses are significantly reduced:

```
root@bolt-ecs:~# perf stat -e l1d_tlb_refill,l1i_tlb_refill,l2d_tlb_refill -a -- sleep 10

 Performance counter stats for 'system wide':

       688,157,786      l1d_tlb_refill
        70,741,621      l1i_tlb_refill
       254,054,393      l2d_tlb_refill

      10.002128509 seconds time elapsed

Throughput:
    events/s (eps): 9538.9346
    time elapsed: 300.0847s
    total number of events: 2862487
```
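
To put the counter changes in proportion, the relative drop of each TLB refill event can be computed from the two second-round perf stat outputs. A minimal sketch, with a hypothetical helper name `pct_drop`:

```shell
# Hypothetical helper: percentage reduction from $1 (before) to $2 (after).
pct_drop() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

# Counter values from the second-round perf stat runs, with commas removed.
echo "l1d_tlb_refill: $(pct_drop 815254864 688157786)%"
echo "l1i_tlb_refill: $(pct_drop 490265467 70741621)%"
echo "l2d_tlb_refill: $(pct_drop 422887362 254054393)%"
```

With these inputs the instruction-side counter, l1i_tlb_refill, shows by far the largest drop, at roughly 86%.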