Reproducible image building #498

Open
wants to merge 4 commits into
base: master
from

marmarek commented Oct 6, 2018

As part of the Reproducible Builds effort, this makes the images produced by lorax reproducible given the same set of inputs (packages, configuration, SOURCE_DATE_EPOCH variable).
See the individual commits for an explanation of the specific changes.

The last commit requires a matching anaconda change, as it changes the image layout. It should be possible to make anaconda support both the old and the new format. Is there any actual reason why the ext4 image is packaged into squashfs instead of using squashfs directly? I imagine that historically it could have been the lack of overlayfs in the vanilla kernel and the need to use dm-snapshot, but that is no longer the case. Anything else?

One thing not solved here is efiboot.img, because I didn't manage to build a FAT filesystem reproducibly. Even after eliminating the obvious metadata differences (volume ID, file mtimes, file order), there are still some differences near the end of the image, which I haven't identified.
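
One way to narrow down the remaining efiboot.img differences is to compare two builds block by block and look at which offsets differ. The following is only a minimal sketch; the helper name `diff_ranges` and the 512-byte block size are assumptions made for illustration, not part of lorax.

```python
def diff_ranges(a: bytes, b: bytes, block: int = 512):
    """Return (start, end) byte ranges where two images differ.

    The images are compared in fixed-size blocks, and adjacent
    differing blocks are merged into a single range.
    """
    ranges = []
    length = max(len(a), len(b))
    for off in range(0, length, block):
        end = min(off + block, length)
        if a[off:end] != b[off:end]:
            if ranges and ranges[-1][1] == off:
                # Extend the previous range instead of starting a new one.
                ranges[-1] = (ranges[-1][0], end)
            else:
                ranges.append((off, end))
    return ranges
```

Calling it on the contents of two image files (read with `open(path, "rb").read()`) gives the offsets to inspect with a hex viewer.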

marmarek added some commits Oct 4, 2018

Use SOURCE_DATE_EPOCH for metadata timestamps
This includes .buildinfo, .treeinfo and .discinfo.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
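
As a rough illustration of the idea in this commit (not the actual patch), a timestamp helper can prefer SOURCE_DATE_EPOCH and fall back to the current time. The function name `build_timestamp` is hypothetical.

```python
import os
import time


def build_timestamp(environ=os.environ):
    """Return the timestamp to embed in build metadata.

    If SOURCE_DATE_EPOCH is set, use it so that repeated builds
    produce identical metadata; otherwise use the current time.
    """
    value = environ.get("SOURCE_DATE_EPOCH")
    if value is not None:
        return int(value)
    return int(time.time())
```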
Preserve timestamps when building fs image
Even when the filesystem does not support owner/modes, preserve timestamps.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
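
A minimal sketch of what preserving timestamps means when ownership and modes cannot be carried over: copy only the data, then restore the access and modification times. This is an illustration in plain Python with a hypothetical helper name, not the code lorax uses.

```python
import os
import shutil


def copy_preserving_times(src, dst):
    """Copy file contents, then restore atime and mtime on the copy.

    Ownership and permission bits are deliberately not copied, which
    mirrors a target filesystem that cannot store them.
    """
    shutil.copyfile(src, dst)
    st = os.stat(src)
    os.utime(dst, ns=(st.st_atime_ns, st.st_mtime_ns))
```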
Use SOURCE_DATE_EPOCH for volumeid of efi boot image
By default mkfs.msdos chooses the volume ID based on the current time. If
SOURCE_DATE_EPOCH is set, use that instead.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
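
A sketch of how this could look, assuming the volume ID is passed through the `-i` option of mkfs.msdos, which takes a 32-bit hexadecimal number. The helper names are hypothetical and the real patch may derive the ID differently.

```python
def fat_volume_id(epoch: int) -> str:
    """Format an epoch value as an 8-digit hex FAT volume ID (32 bits)."""
    return "%08x" % (epoch & 0xFFFFFFFF)


def mkfs_args(image, environ):
    """Build the mkfs.msdos command line for the given image path.

    The volume ID is only fixed when SOURCE_DATE_EPOCH is set;
    otherwise mkfs.msdos keeps its time-based default.
    """
    args = ["mkfs.msdos"]
    sde = environ.get("SOURCE_DATE_EPOCH")
    if sde is not None:
        args += ["-i", fat_volume_id(int(sde))]
    args.append(image)
    return args
```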
Drop non-determinism from default templates
Some files are created in a non-reproducible way, explicitly including
random data (/etc/machine-id), timestamps (fontconfig cache,
ldconfig aux-cache, certs cache), or entries in random order (groups,
systemd catalog, package list).
Fix this by either making the files reproducible or removing them.
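
For files whose entries merely appear in random order, one possible fix is to rewrite them with the entries sorted. A minimal sketch with a hypothetical helper (the actual templates may handle each file differently):

```python
def sort_lines(text: str) -> str:
    """Return the text with its non-empty lines in sorted order.

    Sorting removes dependence on the order in which entries
    happened to be written during the build.
    """
    lines = [line for line in text.splitlines() if line]
    return "".join(line + "\n" for line in sorted(lines))
```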

coveralls commented Oct 6, 2018

Pull Request Test Coverage Report for Build 728

  • 5 of 20 (25.0%) changed or added relevant lines in 5 files are covered.
  • 2 unchanged lines in 2 files lost coverage.
  • Overall coverage decreased (-0.05%) to 43.583%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| src/pylorax/buildstamp.py | 3 | 4 | 75.0% |
| src/pylorax/treebuilder.py | 0 | 2 | 0.0% |
| src/pylorax/discinfo.py | 1 | 5 | 20.0% |
| src/pylorax/imgutils.py | 0 | 4 | 0.0% |
| src/pylorax/treeinfo.py | 1 | 5 | 20.0% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| src/pylorax/imgutils.py | 1 | 14.11% |
| src/pylorax/treebuilder.py | 1 | 12.73% |

| Totals | Coverage Status |
|---|---|
| Change from base Build 723: | -0.05% |
| Covered Lines: | 2162 |
| Relevant Lines: | 4631 |

💛 - Coveralls

bcl self-assigned this Oct 9, 2018

Contributor

bcl commented Oct 9, 2018

Thanks for posting these, I'll try to take a look at them sometime this week.

Contributor

bcl commented Oct 11, 2018

Overall this looks pretty simple. ISTR there was a good reason we didn't switch to plain squashfs for the install.img but cannot remember exactly why. @wgwoods may remember though.
Changing the on-disk layout needs to be coordinated with Fedora and especially Anaconda so you may want to join the fedora-devel or anaconda-devel mailing lists and introduce yourself there.
I'm not convinced that reproducible builds are worth the extra effort, in general, compared with good gpg signatures on everything, but these patches are pretty clean so I'm not opposed to merging them in the future if you can get the needed changes into Anaconda.

marmarek commented Oct 12, 2018

Apart from making the build reproducible, this also reduces the size and complexity of the image. Not requiring the dmsquash-live dracut module (which includes dmsetup etc.) makes the initramfs noticeably smaller. Normally that isn't a big deal, but it matters if you need to put the initramfs plus the kernel into efiboot.img, because a boot image on ISO9660 is limited to 32MB.
I'll write on fedora-devel about this.
If you want, I can extract this one commit to a separate PR.

Contributor

bcl commented Oct 12, 2018

> If you want, I can extract this one commit to a separate PR.

Yeah, I think the others can be taken as-is so that would be good.

Contributor

wgwoods commented Oct 12, 2018

> Is there any actual reason why the ext4 image is packaged into squashfs instead of using squashfs directly? I imagine that historically it could have been the lack of overlayfs in the vanilla kernel and the need to use dm-snapshot, but that is no longer the case. Anything else?

Correct, the problem with bare squashfs images was that anaconda still needs its root filesystem to be writeable (for updates.img mainly), and squashfs doesn't support the write() system call at all, so there's no way to make a squashfs filesystem writeable using device-mapper overlays. And overlayfs wasn't in the mainstream kernel at the time (Fedora 15-16 / kernel 2.6-3.1) - that only became an option in Fedora 22 / kernel 4.0.

I've always thought the ext4-inside-squashfs image payload was needlessly complex and would be happy to see a simpler solution.
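
For reference, a writable root on top of a read-only squashfs can be assembled with overlayfs along these lines. This is only a sketch of the mount invocation; the helper name and all paths in the usage note are hypothetical and are not taken from anaconda or dracut.

```python
def overlay_mount_cmd(lower, upper, work, target):
    """Build a mount command for an overlayfs root.

    lower  - read-only layer (for example, a mounted squashfs image)
    upper  - writable layer that receives all modifications
    work   - empty scratch directory on the same filesystem as upper
    target - mount point for the merged, writable view
    """
    opts = "lowerdir=%s,upperdir=%s,workdir=%s" % (lower, upper, work)
    return ["mount", "-t", "overlay", "overlay", "-o", opts, target]
```

With hypothetical paths, `overlay_mount_cmd("/run/rootfs-ro", "/run/overlay/upper", "/run/overlay/work", "/sysroot")` yields the argument list to pass to `subprocess.run`.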

marmarek commented Oct 12, 2018

> If you want, I can extract this one commit to a separate PR.

Done: #507
