
Limit in-memory use for inodes #58

Merged: 6 commits into containers:main on Aug 23, 2022
Conversation

alexlarsson (Collaborator)

This limits the xattr data size to 4k and splits the dentry data into chunks of at most 4k each, which we read on demand. This drops the in-memory requirement for inodes a lot.

@rhvgoyal does this take care of your issues wrt memory use?

Note: For memory use I will also look at re-using xattr chunks in memory between inodes as they are already shared on disk. Should be rather easy.

This matches what ext4 does and saves some space.

Signed-off-by: Alexander Larsson <alexl@redhat.com>
This avoids the module being unloaded while the fs is mounted.

Signed-off-by: Alexander Larsson <alexl@redhat.com>
@alexlarsson (Collaborator, Author)

Wth is up with armv7:

  dump.c:254:24: error: format '%ld' expects argument of type 'long int', but argument 11 has type 'long long unsigned int' [-Werror=format=]
    254 |                 printf("name:%.*s|ino:%" PRIu64
        |                        ^~~~~~~~~~~~~~~~~
  ......
    259 |                        ino.st_size, (uint64_t)0,
        |                                     ~~~~~~~~~~~
        |                                     |
        |                                     long long unsigned int

Surely the printf format macro PRIu64 (defined in the system headers) should match uint64_t (also defined in the system headers).
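
On 32-bit armv7, long is 32 bits while uint64_t is unsigned long long, so a %ld directive that happens to line up with a 64-bit value compiles cleanly on x86_64 (where long is 64-bit) but trips -Werror=format there. A minimal sketch of the usual portable pattern, not the actual dump.c code: route every 64-bit argument through an explicit cast so it always matches the corresponding PRI* macro.

  /* Illustration only: print a stat size portably across 32- and 64-bit
   * targets by casting to uint64_t and formatting with PRIu64. */
  #include <inttypes.h>
  #include <stdio.h>
  #include <sys/stat.h>

  static void print_size(const struct stat *st)
  {
      /* The explicit cast keeps the argument type in sync with PRIu64
       * even where off_t is "long long" (e.g. 32-bit armv7). */
      printf("size:%" PRIu64 "\n", (uint64_t)st->st_size);
  }

  int main(void)
  {
      struct stat st;
      if (stat(".", &st) == 0)
          print_size(&st);
      return 0;
  }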

This changes the way a directory inode stores its dentry list.
Instead of a single list of all dentries followed by all the
filenames, we store a list of chunks, where each chunk holds at most
4k of dentries and names. Additionally there is a header at the start
with the size and dentry count of these chunks.

On the reader side we don't read (and allocate) all the dentries along
with the inode; instead we read the chunks as needed, meaning we use a
lot less memory per in-memory inode.

We preload the chunk headers for typically sized dirs (at most 4
chunks) as a cheap way to avoid having to re-read them. For larger
dirs we read the table as needed.

Since we don't read all dentries when creating the inode, we now also
use the st_nlink from the file rather than computing it from the
dentries.

Signed-off-by: Alexander Larsson <alexl@redhat.com>
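
A hypothetical sketch of the chunked layout described in the commit message above; the struct and field names are illustrative, not the actual composefs definitions.

  #include <stdint.h>

  #define DIR_CHUNK_MAX_SIZE 4096 /* each chunk holds at most 4k of dentries + names */

  struct dir_chunk {
      uint16_t n_dentries;   /* dentries stored in this chunk */
      uint16_t chunk_size;   /* bytes used, <= DIR_CHUNK_MAX_SIZE */
      uint32_t chunk_offset; /* where this chunk's dentries and names live */
  };

  struct dir_header {
      uint32_t n_chunks;         /* number of chunk descriptors that follow */
      struct dir_chunk chunks[]; /* preloaded when <= 4 chunks, read on demand otherwise */
  };

With a layout along these lines, the reader only needs the descriptor table plus the single chunk covering the directory position being iterated, rather than the whole dentry list.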
This means we're no longer using potentially unbounded kernel memory
for an inode's xattrs. In practice I don't think we'll see such large
xattrs anyway; they are mainly used to store things like ACLs, file
caps, or SELinux contexts.

Signed-off-by: Alexander Larsson <alexl@redhat.com>
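
A hypothetical illustration of the 4k cap described above; the constant name and error code are assumptions, not the exact code in this PR.

  #include <errno.h>
  #include <stddef.h>

  #define MAX_XATTRS_SIZE 4096

  /* Reject per-inode xattr data larger than the cap so a malformed image
   * cannot pin unbounded kernel memory for a single inode. */
  static int check_xattrs_size(size_t xattrs_len)
  {
      if (xattrs_len > MAX_XATTRS_SIZE)
          return -EINVAL;
      return 0;
  }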
@giuseppe (Member) left a comment:


LGTM

@alexlarsson merged commit 361f373 into containers:main on Aug 23, 2022