use symlink to hashes to avoid mirror failures #26

Closed

JanZerebecki
Member

@JanZerebecki JanZerebecki commented May 17, 2023

Mirrors may use rsync with the skip on same mtime feature, which would
skip files that are different in content but have the same mtime. This
results in an inconsistent mirror.

Avoid this by creating symlinks to files with the real content named
after the content hash.

When the rpm macro %clamp_mtime_to_source_date_epoch is set to Y to
enable reproducible builds, the mtime of files in the rpm will be set to
the date of the last changes entry, but build dependencies that affect
the content may be newer. This is relevant when extracting such an rpm
for a repo that is used by the installer. Some mirrors may fail to
sync to the newest content as they skipped them. This would make an
installer using that mirror fail.

Fixes: https://bugzilla.opensuse.org/show_bug.cgi?id=1148824
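
A minimal sketch of the idea in this description, not the code from this PR: the file with the real content is stored under a name derived from its content hash, and the original name becomes a symlink to it. The `<sha256>-<name>` naming scheme and the function names `link_to_content_hash` and `demo` are assumptions made for illustration. A POSIX file system with symlink support is assumed.

```python
import hashlib
import os
import tempfile


def link_to_content_hash(path: str) -> str:
    """Move `path` to a sibling named after its SHA-256 and leave a symlink behind."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    directory, name = os.path.split(path)
    # Hypothetical naming scheme: "<sha256>-<original name>".
    hashed_name = f"{digest.hexdigest()}-{name}"
    os.replace(path, os.path.join(directory, hashed_name))
    # Relative symlink, so the tree can be moved or mirrored as a whole.
    os.symlink(hashed_name, path)
    return hashed_name


def demo() -> tuple[str, str]:
    """Publish two builds of the same file with identical size and mtime."""
    names = []
    with tempfile.TemporaryDirectory() as root:
        path = os.path.join(root, "content")
        for payload in (b"old build\n", b"new build\n"):
            if os.path.lexists(path):
                os.remove(path)  # removes only the symlink, the old hashed file stays
            with open(path, "wb") as fh:
                fh.write(payload)
            os.utime(path, (0, 0))  # same clamped mtime for both builds
            names.append(link_to_content_hash(path))
    return names[0], names[1]
```

Because the hashed file name changes whenever the content changes, a mirror that compares only names, sizes and mtimes still sees a new file, while the stable name keeps resolving through the symlink.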

@adrianschroeter
Member

I am not really a fan of placing this workaround here in that way:

  1. It means that any small change (e.g. updating a 1 KB rpm) would always push 0.5 TB to all mirrors again.

  2. It should not be limited to REPO_ONLY; we also have situations where we publish REPO and iso files from the same build. It could be disabled when "drop_repo" is set, though.

Will follow up in Bugzilla.

@bmwiedemann
Member

If we could use the max of

rpm -qa --qf "%{BUILDTIME}\n"|sort -nr|head -1

and changelog date as SOURCE_DATE_EPOCH for building installation-images, we would also be good, because then timestamps will change if and only if build inputs changed.
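
As a rough sketch of this suggestion, the value could be derived from the output of the rpm query above plus the changelog date. The function name `source_date_epoch` and its parameters are hypothetical, and the input is assumed to be one integer BUILDTIME per line.

```python
def source_date_epoch(rpm_buildtimes: str, changelog_epoch: int) -> int:
    """Return the newest of all package build times and the changelog date."""
    # rpm_buildtimes is assumed to be the text printed by the rpm -qa query,
    # one Unix timestamp per line.
    build_times = [int(line) for line in rpm_buildtimes.split()]
    return max(build_times + [changelog_epoch])
```

For example, `source_date_epoch("1685000000\n1686000000\n", 1684000000)` picks the newest build time, since it is later than the changelog date.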

@JanZerebecki JanZerebecki force-pushed the fix-clamp-mtime branch 2 times, most recently from f522d60 to 7200786 Compare June 6, 2023 07:23
@JanZerebecki
Member Author

If we could use the max of [...]

That causes even more changes, as it would prevent optimising away those rebuilds that resulted in the same build output.

I am not really a fan of placing this workaround here in that way:

1. It means that any small change (e.g. updating a 1 KB rpm) would always push 0.5 TB to all mirrors again.

No, that is only disk IO. Because the content is the same, the actual transfer is much smaller, probably a KB. But I implemented a way to reuse mtimes when the content does not change and an old build is available.

However, I am not sure yet whether that works, as I have problems finding $BUILD_ROOT: that env variable is not set.

2. It should not be limited to REPO_ONLY; we also have situations where we publish REPO and iso files from the same build. It could be disabled when "drop_repo" is set, though.

That is a problem as the iso would become non-reproducible if it were to embed these timestamps. Or maybe that needs to be fixed by honoring SOURCE_DATE_EPOCH in the iso generation step. Can you point to an example kiwi file?

@JanZerebecki
Member Author

I tried /usr/src/packages/KIWIROOT/main/openSUSE-20230605-x86_64-Build3648.3-Media1/temp/../../../.build.oldpackages and that does not exist in my testbuild: https://build.opensuse.org/package/live_build_log/home:jzerebecki:branches:openSUSE:Factory/000product:openSUSE-ftp-ftp-x86_64/containerfile/x86_64

@JanZerebecki
Member Author

I guess .build.oldpackages is in the BUILD_ROOT that is outside the kvm, so I don't have access to that. We could optimize away a few mtime changes in https://github.com/openSUSE/openSUSE-release-tools/blob/master/publish_distro . What do you think?

@JanZerebecki
Member Author

I think with #27 we can always apply it even for iso images as mkiso will then fix it again, instead of only for REPO_ONLY.

@JanZerebecki
Member Author

OK, mkiso cannot fix the file mtimes (only the time in the volume info). That will be a reproducibility regression in the iso, but the iso is not fully reproducible anyway, because the included .asc file is regenerated on each iso build. So that is fine and we can fix it later; see the comment in the code of #27.

So with the mtimes now being touched in all cases, including for iso, I think this PR is ready.

@adrianschroeter
Member

.build.oldpackages is inside the VM, but not at product build time.

The idea was that the meta packages themselves would check at their build time whether they produce new files with different content. In that case they should avoid setting the same time stamp as the old ones.

That is something that could be done in a generic way in the %clamp_mtime_to_source_date_epoch rpm macro and would IMHO be much nicer than putting a workaround on top of a workaround.

It would also work for all build types and not depend on the tool that is using these rpms.
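
A sketch of what such a build-time check could look like, assuming the previous build's file content and mtime are available. The function `pick_mtime` and the rule of bumping the timestamp by one second are illustrative assumptions, not existing behaviour of the rpm macro.

```python
from typing import Optional


def pick_mtime(
    new_content: bytes,
    clamped_mtime: int,
    old_content: Optional[bytes] = None,
    old_mtime: Optional[int] = None,
) -> int:
    """Choose an mtime for a freshly built file, given the previous build if any."""
    if old_content is None or old_mtime is None:
        # No previous build to compare against: keep the clamped value.
        return clamped_mtime
    if new_content == old_content:
        # Unchanged content: reuse the old mtime so mirrors see no change.
        return old_mtime
    if clamped_mtime > old_mtime:
        # Changed content and the clamped value already differs.
        return clamped_mtime
    # Changed content but the clamped mtime would not move forward:
    # bump it so a size+mtime comparison notices (assumed policy).
    return old_mtime + 1
```

Note that the result depends on the previous build output, so the old build becomes an input of the new one.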

@JanZerebecki
Member Author

The idea was that the meta packages themselves would check at their build time whether they produce new files with different content. In that case they should avoid setting the same time stamp as the old ones.

This would make packages harder to reproduce as each build would require the previous build output to reproduce the timestamps. In effect fully reproducing a distribution would require redoing all builds ever done for that distribution in the exact same sequence (the sequence is currently not an effective build input). While technically still reproducible, it would in practice require too much build time for a rolling distribution that exists for years.

Am I missing some solution? Do you have a better idea?

Ideally we would fix mirroring to also consider the file content even when the metadata is equal, either by using a sync solution that can keep that information as state, so it does not need to rehash files all the time, or by accepting the disk IO cost. If we ever do, we can remove this change again.

It seems to me this PR's solution is the best next step:

  • All content is reproducible without requiring prior build output as input.
  • Some file meta data in repos as seen in HTTP responses will be difficult to reproduce, but easy to ignore when doing a diff, as it is not inside a file like an rpm.
  • We don't need to fix any mirrors nor change how they sync.
  • We can keep the rsync disk IO cost smaller by dropping some mtime changes on repos when build-compare runs on product build output.
  • The rsync network IO cost from mtime changes is small either way.

@adrianschroeter
Member

Okay, but always touching the meta files is actually the opposite of reproducible builds: each product run would deliver different results, even when the file content has not changed.

We wouldn't have an issue either if the meta package is reproducible (not sure if that is the case). However, we are dealing here with a situation where a change actually happens and is wanted. In that case it is IMHO wrong to keep the old mtime of the files via the %clamp_* macro.

Putting a workaround into product-builder that always leads to non-reproducible results still sounds like the wrong way to me.

Another option would be to rename the files and include e.g. a hash sum of their content in the file name. product-builder could then create symlinks, and the download redirector should be able to handle that.

@JanZerebecki
Member Author

No, we have an issue purely because the mirrors will not copy files with same stat but different content. The meta package is reproducible.

No, it is not wrong to keep the same mtime despite a changed content, that is how SOURCE_DATE_EPOCH is defined to work.

Yes, it is a bit wrong, but seemed more practical than fixing how all mirrors sync.

Yes, adding a hash to all extracted file names is an option, if nothing trips over that. I'll implement that.
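
To illustrate the first point above: a comparison that looks only at size and mtime, which is roughly what rsync's default quick check does, cannot tell apart two files whose content differs. The helper names and the sample payloads below are made up for the demonstration.

```python
import os
import tempfile


def quick_check_equal(a: str, b: str) -> bool:
    """Compare size and mtime only, ignoring content."""
    sa, sb = os.stat(a), os.stat(b)
    return sa.st_size == sb.st_size and int(sa.st_mtime) == int(sb.st_mtime)


def content_equal(a: str, b: str) -> bool:
    """Compare the actual bytes of both files."""
    with open(a, "rb") as fa, open(b, "rb") as fb:
        return fa.read() == fb.read()


def demo() -> tuple[bool, bool]:
    with tempfile.TemporaryDirectory() as root:
        origin = os.path.join(root, "origin")
        mirror = os.path.join(root, "mirror")
        # Same length, different content, e.g. after a compiler update.
        with open(origin, "wb") as fh:
            fh.write(b"built with compiler B\n")
        with open(mirror, "wb") as fh:
            fh.write(b"built with compiler A\n")
        # Both files get the same clamped mtime.
        for path in (origin, mirror):
            os.utime(path, (1686000000, 1686000000))
        return quick_check_equal(origin, mirror), content_equal(origin, mirror)
```

Here `demo()` reports that the quick check considers the files equal while their content is not, which is the situation in which a mirror keeps the stale file.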

@JanZerebecki JanZerebecki changed the title update mtime to avoid mirror failures use symlink to hashes to avoid mirror failures Aug 9, 2023
@JanZerebecki
Member Author

Done, please rereview.

@JanZerebecki
Member Author

I have noticed a problem with using symlinks: it won't work for the files the iso uses for booting. And it won't work if the iso is mounted as Joliet instead of Rock Ridge. So ideally the symlinks are only in the repo and not in the iso.

I could replace them with hardlinks before making the iso and redo them as symlinks after the iso is produced, but I am not sure that is the best way. What do you think?

@JanZerebecki
Member Author

Converting them works: https://github.com/openSUSE/product-builder/pull/27/files#diff-d21f610fbd9c740ba827ceed296d338963b19adcb81545df05a36a73374601e2R1059

Then that will need to be merged first for this one to work.
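
The conversion described above could look roughly like the following sketch. It is not the code from #27; the function names are hypothetical, and it assumes that link and target live on the same file system so that hard links are possible.

```python
import os
import tempfile


def symlinks_to_hardlinks(root: str) -> dict[str, str]:
    """Replace file symlinks under `root` with hard links; return {path: link target}."""
    replaced = {}
    for directory, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(directory, name)
            if not os.path.islink(path):
                continue
            target = os.readlink(path)
            resolved = os.path.join(directory, target)
            if not os.path.isfile(resolved):
                continue  # skip dangling links
            os.remove(path)
            os.link(resolved, path)
            replaced[path] = target
    return replaced


def hardlinks_to_symlinks(replaced: dict[str, str]) -> None:
    """Undo symlinks_to_hardlinks using the mapping it returned."""
    for path, target in replaced.items():
        os.remove(path)
        os.symlink(target, path)


def demo() -> tuple[bool, bool]:
    with tempfile.TemporaryDirectory() as root:
        real = os.path.join(root, "0123abcd-content")
        link = os.path.join(root, "content")
        with open(real, "wb") as fh:
            fh.write(b"payload\n")
        os.symlink("0123abcd-content", link)

        replaced = symlinks_to_hardlinks(root)  # before building the iso
        is_hardlink = not os.path.islink(link) and os.path.samefile(real, link)

        hardlinks_to_symlinks(replaced)  # after the iso is produced
        is_symlink_again = os.readlink(link) == "0123abcd-content"
        return is_hardlink, is_symlink_again
```

Keeping the mapping returned by the first step is what makes the second step possible, since a hard link no longer records which name was the original.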

@adrianschroeter
Member

I still do not see any reason to work around issues created in the meta packages with additional code in product-builder.

Either do the proper checking via .oldpackages as mentioned before in the meta packages or just disable the rpm functionality in the meta package spec files there.

You should be able to do so by either setting

%define source_date_epoch_from_changelog 0

or

%define clamp_mtime_to_source_date_epoch 0

in the spec file of the meta packages.

That will have more or less the same effect: every new build will have new mtimes inside the rpm package. No need to add code in product-builder then.

Sorry, going to close this request now.

@JanZerebecki
Member Author

In consequence, that means you refuse to allow us to implement https://reproducible-builds.org/ for any openSUSE or SUSE distribution?

@adrianschroeter
Member

adrianschroeter commented Sep 7, 2023 via email

@JanZerebecki
Member Author

You have a totally different definition of reproducible builds than https://reproducible-builds.org/ here.
But I do not think you want your definition of reproducible builds either, as it would mean unnecessarily rebuilding packages and massively increasing the mirror bandwidth used.

@adrianschroeter
Member

adrianschroeter commented Sep 7, 2023 via email

@JanZerebecki
Member Author

We had a discussion elsewhere and you proposed something. I thought it might work, but I missed a corner case, so it doesn't work.

You proposed: Only while building meta rpm packages that will be extracted during the product builder: Set the rpm macro to clamp mtime to false or unset it. Then ensure in the spec build script the mtime of files is either taken from input binary rpms or manually set to SOURCE_DATE_EPOCH.

I thought that might work, but I forgot a case. Example: there is a file that is built with a compiler. That file is included in a meta rpm package, never changes its name, and is extracted at product-builder time with its mtime kept from the rpm where it was compiled, and this mtime is exposed to rsync for mirrors. Then the compiler changes in such a way that the compiled file content changes. The compiler's changelog changes, but the changelog of the package containing the file doesn't. Thus the file doesn't change its mtime. So the suggestion doesn't work.

In general: even in a build system that always uses the current time for its output files, we cannot normally rely on unix time or mtime to be monotonically increasing (it is explicitly declared not to be) or otherwise derive causal relationships from it. The Google Spanner paper https://research.google/pubs/pub39966/ describes what is necessary to do this. OBS doesn't do anything to offer such a guarantee. The problem is just rare enough, and rebuilds that correct the problem again happen often enough, that in practice it doesn't become visible. So instead of trying to detect causality from our flawed recording of time, we need to detect causality from the cause, the content change, because fixing our recording of time is prohibitively expensive in terms of complexity and performance. See also https://reproducible-builds.org/docs/source-date-epoch/#more-detailed-discussion and the following headings for some background about the relationship between time and reproducible builds.

So what do you suggest we should do instead?
