-
Notifications
You must be signed in to change notification settings - Fork 184
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[common] Add encrypted-file-node-size slab into slab allocator #1763
Conversation
3dd08ab
to
12664eb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r2, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @kailun-qin)
-- commits
line 9 at r2:
I obviously don't like this, as this is special-casing the encrypted file logic into the generic slab allocator.
Moreover, objects in Gramine that take e.g. 2KB or 4KB will occupy slab slots of ~8KB, so the fragmentation of memory will be terrible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @kailun-qin)
a discussion (no related file):
CI fails on LibOS regression tests:
__________________ TC_30_Syscall.test_055_mmap_emulated_tmpfs __________________
...
E subprocess.CalledProcessError: Command '['/home/jenkins/workspace/graphene-20.04/usr/lib/x86_64-linux-gnu/gramine/direct/loader', '/home/jenkins/workspace/graphene-20.04/usr/lib/x86_64-linux-gnu/gramine/direct/libpal.so', 'init', 'mmap_file_emulated', '/mnt/tmpfs/test_mmap']' returned non-zero exit status 1.
----------------------------- Captured stdout call -----------------------------
[0.002] CREATE OK
[0.002] mmap_file_emulated: unexpected non-zero byte at addr_shared[16]
@kailun-qin Could you take a look?
ec02135
to
0800855
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r4.
Reviewable status: all files reviewed, 3 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @kailun-qin)
libos/src/fs/libos_fs_mem.c
line 21 at r4 (raw file):
size_t min_size = MIN(buf_size, (size_t)mem->size); memcpy(buf, mem->buf, MIN(buf_size, min_size));
Why do you have this MIN()
here? This should be just min_size
0800855
to
aa4c2b4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r5.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @kailun-qin)
-- commits
line 9 at r5:
I'm confused by all these numbers. How did you arrive at 8256 bytes?
common/include/slabmgr.h
line 97 at r5 (raw file):
#define SLAB_LEVEL_SIZES \ 16, 32, 64, 128 - SLAB_HDR_SIZE, 256 - SLAB_HDR_SIZE, 512 - SLAB_HDR_SIZE, \ 1024 - SLAB_HDR_SIZE, 2048 - SLAB_HDR_SIZE, 8288 - SLAB_HDR_SIZE
We may want to add one more level, that will cover 4KB objects, otherwise we have a huge gap between ~2KB and ~8KB, and memory fragmentation can be significant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @dimakuv)
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I'm confused by all these numbers. How did you arrive at 8256 bytes?
hmm... the size of file_node_t
is 8235B, the SLAB_HDR_SIZE
is 32B, so we need at least 8267B for a slab object for our case. While drafting this code, I tried to pick a value that is aligned to 32B (a multiple of 32) for the size of slab object , so this was where 8288B came from. And 8256B (8288 - 32) is thus the user buffer size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @kailun-qin)
a discussion (no related file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
CI fails on LibOS regression tests:
__________________ TC_30_Syscall.test_055_mmap_emulated_tmpfs __________________ ... E subprocess.CalledProcessError: Command '['/home/jenkins/workspace/graphene-20.04/usr/lib/x86_64-linux-gnu/gramine/direct/loader', '/home/jenkins/workspace/graphene-20.04/usr/lib/x86_64-linux-gnu/gramine/direct/libpal.so', 'init', 'mmap_file_emulated', '/mnt/tmpfs/test_mmap']' returned non-zero exit status 1. ----------------------------- Captured stdout call ----------------------------- [0.002] CREATE OK [0.002] mmap_file_emulated: unexpected non-zero byte at addr_shared[16]
@kailun-qin Could you take a look?
This was fixed, see change in libos_fs_mem.c
file. (I would prefer to move it as a separate commit or even a PR, but can decide later when we have enough approvals.)
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I obviously don't like this, as this is special-casing the encrypted file logic into the generic slab allocator.
Moreover, objects in Gramine that take e.g. 2KB or 4KB will occupy slab slots of ~8KB, so the fragmentation of memory will be terrible.
The other reviewers are leaning towards this PR instead of the previous attempt in #1723.
Since both patches are supposed to be only a temporary solution (the final solution will be using a new proper memory allocator), I'm fine with merging this PR.
Previously, kailun-qin (Kailun Qin) wrote…
hmm... the size of
file_node_t
is 8235B, theSLAB_HDR_SIZE
is 32B, so we need at least 8267B for a slab object for our case. While drafting this code, I tried to pick a value that is aligned to 32B (a multiple of 32) for the size of slab object , so this was where 8288B came from. And 8256B (8288 - 32) is thus the user buffer size.
Thanks, I understand the reasoning now.
common/include/slabmgr.h
line 97 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
We may want to add one more level, that will cover 4KB objects, otherwise we have a huge gap between ~2KB and ~8KB, and memory fragmentation can be significant.
We should probably do a couple experiments, whether we want to add yet another slab level.
I'm not blocking here though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @kailun-qin)
a discussion (no related file):
This PR was tested independently by two Intel teams: validation team and a team that found this perf bug & required this.
The PR improves performance of Encrypted Files and is even better than #1723 in terms of resulting latency/throughput.
The PR also doesn't result in any significant performance changes (neither perf degradation nor perf improvement) in our other CI workloads.
The PR looks good to go.
common/include/slabmgr.h
line 97 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
We should probably do a couple experiments, whether we want to add yet another slab level.
I'm not blocking here though.
@mkow What do you think? We can merge the PR as-is, as it was validated by two Intel teams, and no perf issues were detected. See my other comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r2, 1 of 1 files at r5, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @dimakuv)
common/include/slabmgr.h
line 97 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
@mkow What do you think? We can merge the PR as-is, as it was validated by two Intel teams, and no perf issues were detected. See my other comment.
// we talked about this in private and Dmitrii will do some stats to check whether this may be worth adding.
libos/src/fs/libos_fs_mem.c
line 20 at r5 (raw file):
return -ENOMEM; memcpy(buf, mem->buf, MIN(buf_size, (size_t)mem->size));
ouch 😬
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @dimakuv)
common/include/slabmgr.h
line 97 at r5 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
// we talked about this in private and Dmitrii will do some stats to check whether this may be worth adding.
Ok, based on results reported in #1768, we are adding one additional (intermediate) slab level -- around 4KB.
aa4c2b4
to
0b44435
Compare
Allocating the file nodes (`file_node_t`) of Gramine's encrypted files can be one source of overhead for certain workloads (e.g., RocksDB compaction). To address this performance bottleneck, this commit introduces a new slab size (8288B, considering the alignment) tailored for the size of the file node (8235B) and therefore a new free list for 8256B blocks managed by the slab allocator. Additionally, one more slab size (~4KB) is added, to have proper powers-of-2 slab levels. Note that while this change can improve performance for certain memory allocation patterns, it introduces internal fragmentation and may also increase memory overhead and external fragmentation if the application rarely makes allocations of the file node size. However, our limited experiments did not show any issues. In addition, this commit fixes an issue where in-memory files are not zeroed out when resized. Signed-off-by: Kailun Qin <kailun.qin@intel.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r6, all commit messages.
Reviewable status: all files reviewed, all discussions resolved, not enough approvals from different teams (1 more required, approved so far: Intel)
common/include/slabmgr.h
line 97 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Ok, based on results reported in #1768, we are adding one additional (intermediate) slab level -- around 4KB.
Done, also adjusted the commit message, please take a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r6, all commit messages.
Reviewable status:complete! all files reviewed, all discussions resolved
0b44435
to
9288b49
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed all commit messages.
Reviewable status:complete! all files reviewed, all discussions resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @kailun-qin)
a discussion (no related file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
This PR was tested independently by two Intel teams: validation team and a team that found this perf bug & required this.
The PR improves performance of Encrypted Files and is even better than #1723 in terms of resulting latency/throughput.
The PR also doesn't result in any significant performance changes (neither perf degradation nor perf improvement) in our other CI workloads.
The PR looks good to go.
I'm blocking for now. I asked the two Intel teams to re-do the validation again (on the new version of the PR -- finalized and rebased). Let's wait for their replies, just to be sure we didn't run into any perf regressions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r5, 1 of 1 files at r6, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! all files reviewed, all discussions resolved
a discussion (no related file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I'm blocking for now. I asked the two Intel teams to re-do the validation again (on the new version of the PR -- finalized and rebased). Let's wait for their replies, just to be sure we didn't run into any perf regressions.
Done. Both teams reported no issues with the latest revision of the PR.
Description of the changes
Allocating the file nodes (
file_node_t
) of Gramine's encrypted files can be one source of overhead for certain workloads (e.g., RocksDB compaction). To address this performance bottleneck, this commit introduces a new slab size (8288B, considering the alignment) tailored for the size of the file node (8235B) and therefore a new free list for 8256B blocks managed by the slab allocator.Note that while this change can improve performance for certain memory allocation patterns, it introduces internal fragmentation and may also increase memory overhead and external fragmentation if the application rarely makes allocations of the file node size.
In addition, this commit fixes an issue where in-memory files are not zeroed out when resized.
Fixes #1714.
Closes #1723.
How to test this PR?
CI
This change is![Reviewable](https://camo.githubusercontent.com/23b05f5fb48215c989e92cc44cf6512512d083132bd3daf689867c8d9d386888/68747470733a2f2f72657669657761626c652e696f2f7265766965775f627574746f6e2e737667)