feature: copy or reflink cpio data directly from source instead of staging in tmpdir #1662

Open
ddiss opened this issue Dec 6, 2021 · 5 comments

Comments

@ddiss
Contributor

ddiss commented Dec 6, 2021

When generating initramfs images, dracut currently recreates the filesystem tree in a staging area under tmpdir. This is mostly needed because of GNU cpio's unfortunate source-path-is-dest-path behaviour: each file is archived under the same path it is read from.
As of #1531, dracut has its own (optional) cpio archiver, dracut-cpio. dracut-cpio currently replicates the GNU cpio source-path-is-dest-path behaviour, but it could be modified to process a content manifest similar to the kernel's gen_init_cpio utility, which uses separate source and destination file paths.
Eliminating tmpdir staging for cpio data should provide a not-insignificant performance boost for both reflinked (via copy_file_range) and non-reflinked cpio archives.
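
To illustrate the data path this would enable, here is a minimal Python sketch (not dracut-cpio code, which is written in Rust) that appends a source file's contents straight into an open archive file. It tries `os.copy_file_range()` first, so the kernel can reflink or copy without staging, and falls back to a plain read/write loop. The function name `append_file_data` and the fallback policy are assumptions made for this example.

```python
import os

CHUNK = 1 << 20  # read size for the fallback path


def append_file_data(archive_fd: int, src_path: str) -> int:
    """Copy all of src_path to the current offset of archive_fd.

    archive_fd must not be opened with O_APPEND, because the kernel
    rejects copy_file_range() on append-only descriptors.
    Returns the number of bytes written.
    """
    src_fd = os.open(src_path, os.O_RDONLY)
    try:
        remaining = os.fstat(src_fd).st_size
        total = 0
        if hasattr(os, "copy_file_range"):  # Linux, Python >= 3.8
            try:
                while remaining > 0:
                    n = os.copy_file_range(src_fd, archive_fd, remaining)
                    if n == 0:
                        break
                    total += n
                    remaining -= n
            except OSError:
                # Unsupported here (e.g. EXDEV, ENOSYS, EINVAL):
                # continue below from the current source offset.
                pass
        while remaining > 0:
            buf = os.read(src_fd, min(CHUNK, remaining))
            if not buf:
                break
            view = memoryview(buf)
            while len(view) > 0:
                written = os.write(archive_fd, view)
                view = view[written:]
            total += len(buf)
            remaining -= len(buf)
        return total
    finally:
        os.close(src_fd)
```

Whether the kernel actually shares extents depends on the filesystem; on filesystems without reflink support the call still avoids copying the data through user space.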

@ddiss ddiss added the enhancement Issue adding new functionality label Dec 6, 2021
@ddiss
Contributor Author

ddiss commented Dec 6, 2021

For reference, Linux's gen_init_cpio supports the following archive manifest:

<cpio_list> is a file containing newline separated entries that
describe the files to be included in the initramfs archive:

# a comment
file <name> <location> <mode> <uid> <gid> [<hard links>]
dir <name> <mode> <uid> <gid>
nod <name> <mode> <uid> <gid> <dev_type> <maj> <min>
slink <name> <target> <mode> <uid> <gid>
pipe <name> <mode> <uid> <gid>
sock <name> <mode> <uid> <gid>

<name>       name of the file/dir/nod/etc in the archive
<location>   location of the file in the current filesystem
             expands shell variables quoted with ${}
<target>     link target
<mode>       mode/permissions of the file
<uid>        user id (0=root)
<gid>        group id (0=root)
<dev_type>   device type (b=block, c=character)
<maj>        major number of nod
<min>        minor number of nod
<hard links> space separated list of other links to file

example:
# A simple initramfs
dir /dev 0755 0 0
nod /dev/console 0600 0 0 c 5 1
dir /root 0700 0 0
dir /sbin 0755 0 0
file /sbin/kinit /usr/src/klibc/kinit/kinit 0755 0 0
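
As a rough illustration of how such a manifest separates archive name from source location, the following Python sketch parses the format described above into dictionaries. It is not the kernel's parser: it does not expand `${}` variables, and `parse_manifest` and the dictionary layout are names chosen for this example only.

```python
# Field names per entry type, in the order given by the usage text.
FIELDS = {
    "file": ["name", "location", "mode", "uid", "gid"],
    "dir": ["name", "mode", "uid", "gid"],
    "nod": ["name", "mode", "uid", "gid", "dev_type", "maj", "min"],
    "slink": ["name", "target", "mode", "uid", "gid"],
    "pipe": ["name", "mode", "uid", "gid"],
    "sock": ["name", "mode", "uid", "gid"],
}


def parse_manifest(text):
    """Return a list of dicts, one per non-comment manifest line."""
    entries = []
    for lineno, line in enumerate(text.splitlines(), 1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        kind, *args = line.split()
        names = FIELDS.get(kind)
        if names is None:
            raise ValueError(f"line {lineno}: unknown entry type {kind!r}")
        extra = args[len(names):]
        if len(args) < len(names) or (extra and kind != "file"):
            raise ValueError(f"line {lineno}: wrong field count for {kind}")
        entry = {"type": kind}
        entry.update(zip(names, args))
        entry["mode"] = int(entry["mode"], 8)  # modes are octal
        for key in ("uid", "gid", "maj", "min"):
            if key in entry:
                entry[key] = int(entry[key])
        if kind == "file":
            entry["hard_links"] = extra  # optional trailing names
        entries.append(entry)
    return entries


EXAMPLE = """\
# A simple initramfs
dir /dev 0755 0 0
nod /dev/console 0600 0 0 c 5 1
dir /root 0700 0 0
dir /sbin 0755 0 0
file /sbin/kinit /usr/src/klibc/kinit/kinit 0755 0 0
"""

entries = parse_manifest(EXAMPLE)
```

Because fields are split on whitespace and entries on newlines, this format cannot represent a path that contains a space or a newline.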

dracut currently supports file paths containing spaces and newlines, so I think we'll instead need a manifest format that allows for this (e.g. '\0' separators). One other manifest entry type I'd like to add is something like:

file-chained <name> <mode> <uid> <gid> <location_0> [<location_1>...]

where the contents of each <location_N> file are appended, in order, to form the archived <name>. This should allow CPU microcode to be archived in place as well.
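
One possible shape for such a manifest, sketched in Python purely for discussion: every field is terminated by '\0', and an extra '\0' ends the record, so a variable number of <location_N> fields needs no count prefix. This encoding is an assumption of the example, not a format dracut defines; it also forbids empty fields. The paths used below are illustrative.

```python
from typing import Dict, List


def encode_record(fields: List[bytes]) -> bytes:
    """Encode one record: NUL-terminated fields plus a closing NUL."""
    for f in fields:
        if not f or b"\0" in f:
            raise ValueError("fields must be non-empty and NUL-free")
    return b"".join(f + b"\0" for f in fields) + b"\0"


def decode_records(data: bytes) -> List[List[bytes]]:
    """Split a manifest blob back into records of raw byte fields."""
    records, current, pos = [], [], 0
    while pos < len(data):
        end = data.index(b"\0", pos)  # ValueError if a field is unterminated
        field = data[pos:end]
        pos = end + 1
        if field:
            current.append(field)
        elif current:
            records.append(current)  # empty field closes the record
            current = []
        else:
            raise ValueError("empty record")
    if current:
        raise ValueError("unterminated record")
    return records


def parse_file_chained(record: List[bytes]) -> Dict[str, object]:
    """Interpret: file-chained <name> <mode> <uid> <gid> <location_0> ..."""
    kind, name, mode, uid, gid, *locations = record
    if kind != b"file-chained" or not locations:
        raise ValueError("not a valid file-chained record")
    return {
        "name": name,
        "mode": int(mode.decode("ascii"), 8),
        "uid": int(uid.decode("ascii")),
        "gid": int(gid.decode("ascii")),
        "locations": locations,  # contents are concatenated in this order
    }


# Illustrative record: one archived name built from two source files,
# the second of which has a space and a newline in its path.
chained = [
    b"file-chained", b"kernel/x86/microcode/GenuineIntel.bin",
    b"0644", b"0", b"0",
    b"/lib/firmware/intel-ucode/06-8e-09",
    b"/tmp/odd name\nwith newline",
]
blob = encode_record(chained) + encode_record([b"dir", b"/dev", b"0755", b"0", b"0"])
records = decode_records(blob)
```

Keeping fields as bytes avoids any assumption about path encoding.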

@johannbg
Collaborator

johannbg commented Dec 6, 2021

Makes sense.

@ddiss
Contributor Author

ddiss commented Dec 7, 2021

My rough task breakdown for this ticket is:

  • add archive manifest generation support to dracut-install
  • add manifest parsing support to dracut-cpio
    • remove old GNU cpio style source-path-is-dest-path behaviour
  • add special file-chained case for CPU microcode handling
  • add extensive test coverage
  • benchmark with reflink (XFS and Btrfs) as well as non-reflink (e.g. EFI FAT initramfs destination)
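
For the manifest-parsing and file-chained items, the core change is that an archive entry's name no longer has to match the path its data is read from. A minimal Python sketch of writing a "newc" cpio entry that way follows. The function name, zero mtime, and leading-slash stripping are choices made for this example, and the data copy uses ordinary buffered I/O where a real implementation could use copy_file_range.

```python
import io
import os
import shutil
import stat


def _align4(out):
    """Pad with NUL bytes to the next 4-byte boundary of the archive."""
    out.write(b"\0" * (-out.tell() % 4))


def write_newc_entry(out, name, mode, uid=0, gid=0, sources=(), ino=0):
    """Write one newc entry called `name`, with data read from `sources`.

    `sources` holds one path for a plain file, several for a chained
    file, and is empty for entries without data (e.g. the trailer).
    `out` must be seekable and start at archive offset 0.
    """
    name_bytes = name.lstrip("/").encode() + b"\0"
    size = sum(os.path.getsize(p) for p in sources)
    fields = (
        ino, mode, uid, gid,
        1,      # nlink
        0,      # mtime (fixed at 0 in this sketch)
        size,
        0, 0,   # devmajor, devminor
        0, 0,   # rdevmajor, rdevminor
        len(name_bytes),
        0,      # check (unused for magic 070701)
    )
    out.write(b"070701" + b"".join(b"%08X" % v for v in fields))
    out.write(name_bytes)
    _align4(out)
    for path in sources:
        with open(path, "rb") as src:
            shutil.copyfileobj(src, out)
    _align4(out)


def write_trailer(out):
    """Terminate the archive with the conventional trailer entry."""
    write_newc_entry(out, "TRAILER!!!", 0)
```

A call such as `write_newc_entry(out, "/sbin/kinit", stat.S_IFREG | 0o755, sources=["/usr/src/klibc/kinit/kinit"])` mirrors the `file` line of the gen_init_cpio example: the archived name and the source location are passed separately, so no staging copy is needed.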

This will take some time, but I'll try to post an RFC branch once I have something worthy of early feedback. Others are welcome to join the party ;-)

(Possibly in future, depending on benchmarks, feedback, Rust availability, etc.)

  • switch to dracut-cpio exclusively for archive creation
  • remove dracut-install I/O functionality, leaving it as a simple manifest archive generator

@mwilck given your helpful scrutiny of the dracut-cpio changes, I'd be interested in hearing your thoughts here too :-)

@mwilck
Contributor

mwilck commented Dec 7, 2021

> (possibly in future, depending on benchmarks, feedback, rust availability, etc.)
>
> * switch to `dracut-cpio` exclusively for archive creation
> * remove `dracut-install` I/O functionality, leaving it as a simple manifest archive generator

It looks as if dracut-cpio would grow towards being an almost feature-complete cpio replacement. Perhaps you should create a separate project for it rather than hiding it here in dracut?

> @mwilck given your helpful scrutiny of the dracut-cpio changes, I'd be interested in hearing your thoughts here too :-)

I wouldn't call what I did previously "scrutiny" 😉 I wouldn't even think about scrutinizing it, as my Rust knowledge is close to zero. I experimented with your code, and had pleasant results and no issues, that much I can say.

@ddiss
Contributor Author

ddiss commented Dec 7, 2021

...

> It looks as if dracut-cpio would grow towards being an almost feature-complete cpio replacement. Perhaps you should create a separate project for it rather than hiding it here in dracut?

It's still only for archive creation, so I think it's fine to stay where it is for the moment. Besides, having it embedded allows for the source-path-is-dest-path -> manifest conversion without needing to deal with ugly version dependencies, etc.

> > @mwilck given your helpful scrutiny of the dracut-cpio changes, I'd be interested in hearing your thoughts here too :-)
>
> I wouldn't call what I did previously a "scrutiny" I wouldn't even think about scrutinizing it, as my rust knowledge is close to zero. I experimented with your code, and had pleasant results and no issues, that much I can say.

Okay, no worries - thanks for the feedback!


3 participants