Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ostree-metadata commit API #1946

Merged
merged 7 commits into from
Feb 9, 2023
Merged

Conversation

dbnicholson
Copy link
Member

@dbnicholson dbnicholson commented Oct 16, 2019

There are 2 pieces of this PR:

  • Add an ostree_repo_regenerate_metadata API that rebuilds the summary and the ostree-metadata commit and signs them both. Currently the ostree-metadata commit is only created by the ostree summary CLI. This adds a way for callers to reliably generate it.

  • Add the local ref map to the ostree-metadata commit. This spawned out of discussion in https://lists.freedesktop.org/archives/flatpak/2019-October/001784.html and WIP: Work on authenticated downloads flatpak/flatpak#3167 (review). Essentially, it seems that since you can't trust the ref map in the peer's summary file, it should be included in the mirrored ostree-metadata commit like the additional summary metadata so that all you need from the peer's summary is a list of what mirrored refs it has.

I can split out the 2nd part, but I figured I'd put them both here for discussion.

@openshift-ci-robot
Copy link
Collaborator

Hi @dbnicholson. Thanks for your PR.

I'm waiting for a ostreedev or openshift member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@cgwalters
Copy link
Member

/ok-to-test

@cgwalters
Copy link
Member

/approve

@openshift-ci-robot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cgwalters, dbnicholson

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@alexlarsson
Copy link
Member

alexlarsson commented Oct 17, 2019

I don't really understand the local ref map. Is this only about the flatpak xa.cache issue, or do you actually expect ostree to use this? If so, how?

In a situation where you're pulling N refs from M different p2p remotes you'll end up pulling the ostree-metadata ref from one of those M different remotes. And the rest of the N refs from some "random" subset depending on what refs each node has in their respective summary file. If the ostree-metadata you end up with is more recent than some of the other peers, do you expect to fail pulling those refs? That seems like a bad idea to me.

@dbnicholson
Copy link
Member Author

I don't really understand the local ref map. Is this only about the flatpak xa.cache issue, or do you actually expect ostree to use this? If so, how?

In a situation where you're pulling N refs from M different p2p remotes you'll end up pulling the ostree-metadata ref from one of those M different remotes. And the rest of the N refs from some "random" subset depending on what refs each node has in their respective summary file. If the ostree-metadata you end up with is more recent than some of the other peers, do you expect to fail pulling those refs? That seems like a bad idea to me.

The way I see it, you pull the ostree-metadata ref and the summary file from all of the p2p peers. That's how you reliably determine what subset of the remote each of them have. As I said in flatpak/flatpak#3167 (comment), since you can't verify the peer's summary file, the only source of verification it provides is the mirrored ostree-metadata file from the time when it mirrored from the remote. Having a copy of the remote's ref map in the ostree-metadata file allows you to verify that the commits in the peer are actually from the remote without fetching the commit objects from the peer.

However, I'm not the P2P expert.

@alexlarsson
Copy link
Member

The way I see it, you pull the ostree-metadata ref and the summary file from all of the p2p peers. That's how you reliably determine what subset of the remote each of them have. As I said in flatpak/flatpak#3167 (comment), since you can't verify the peer's summary file, the only source of verification it provides is the mirrored ostree-metadata file from the time when it mirrored from the remote. Having a copy of the remote's ref map in the ostree-metadata file allows you to verify that the commits in the peer are actually from the remote without fetching the commit objects from the peer.

However, I'm not the P2P expert.

Well, that is not how any ostree code is written today. The unique identified for the ref (i.e. the ref name + the collection id) can with the current repo layout only be stored once (as refs/mirrors/$collection_id/$ref), and it is pulled like any other ref, ie. from the highest priority available peer. Now, in theory one could pull it from all, but that is not how any code today works.

But, the summary isn't really necessary to verify the the ref, because the commit itself is also signed, and it includes a timestamp so that we can know if the update is a downgrade or not. So, I don't see the point of verifying it with the original summary.

The reason that flapak needs this information is because it stores info related to the refs inside the ostree-metadata itself (rather than in the commit itself), and it then needs to know what commit this data is about. As such, having the commit map in the summary could help (although the form this has uses much more space than the solution i did for flatpak only, as it duplicates every ref in the table, as well as using asciified commit ids instead of raw and including all refs rather than the required ones only.

@dbnicholson
Copy link
Member Author

Given that the last commit does not have good agreement, I'll break that out into a separate PR and leave this just for the new API to handle updating the summary and ostree-metadata commit in one shot.

@rh-atomic-bot
Copy link

☔ The latest upstream changes (presumably #1968) made this pull request unmergeable. Please resolve the merge conflicts.

@openshift-ci-robot
Copy link
Collaborator

@dbnicholson: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dbnicholson
Copy link
Member Author

I rebased this on master now and dropped the "ref-map in ostree-metadata" commit. But I did add on an additional change I originally did in #490 to make the summary and summary.sig in a tmpdir before renaming. Although this doesn't solve a race (as #490 was trying to do), it does mean that you won't publish a summary without a matched summary.sig file.

*/
gboolean
ostree_repo_regenerate_metadata (OstreeRepo *self,
GVariant *additional_metadata,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe summary_additional_metadata to make it clear this goes in the summary file?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think since the metadata goes in both the summary and the ostree-metadata commit then that's not entirely true. But I don't mind if it changes name.

const gchar **key_ids,
const gchar *homedir,
GCancellable *cancellable,
GError **error)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whenever we add new extensible APIs, I lean towards adding a GVariant *options arg to help future proof it a bit more. WDYT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent suggestion. I did that and dropped the explicit GPG parameters. Instead, you specify them in options with or without non-GPG signing keys. There are other things I think could be added in the future. For example, flatpak has an option to only list a subset of refs in the summary. I think that was mostly done because flathub's summary file was getting enormous.

{
g_debug ("Deleting existing unmatched summary.sig file");
if (!ot_ensure_unlinked_at (self->repo_dir_fd, "summary.sig", error))
return glnx_prefix_error (error, "Unable to delete summary signature file: ");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A small improvement is hoisting this out of the if and up to before doing the summary rename. That way, if we're able to rename the summary but not the signature, we don't have to hope that we can remove the old one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But if you remove the old summary before writing the new one, then you're left with no summary signature if there are errors. I think it's preferable to leave it the old signature in place until you know the new summary has been written.

I suppose if you've specified that you're going to create the new summary without a signature, then you might as well remove the old one, though. Like:

if (!signing)
  remove_old_signature()
rename_new_summary()
if (signing)
  if (!rename_new_signature())
    remove_old_signature()

Is that what you meant? I think it makes sense either way, but I can see the advantage of doing it that way.

@rh-atomic-bot
Copy link

☔ The latest upstream changes (presumably #1957) made this pull request unmergeable. Please resolve the merge conflicts.

@dbnicholson
Copy link
Member Author

I decided to dust off this PR this week. It's mostly the same as before (I'll respond to the original comments above). The main difference is that I changed the API to take a GVariant *options where the GPG keys and/or sign keys can be specified. I don't think non-GPG signing existed when I first started this.

The other thing I did was make the ostree-metadata commit generation an explicit internal API. Back when flatpak was in the ostree-metadata business, @pwithnall suggested that there should be public API in ostree to create the commit. Several years later, flatpak has dropped it and I don't think there's any intention of bringing it back. I don't know if there are any other users of that commit that would benefit from a public API now.

Copy link
Member

@cgwalters cgwalters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only gave this a really superficial skim so far...having to context switch in a lot here 😄

Very superficially, seems sane.

';', NULL, &read_values, error))
return glnx_throw(error, "Unable to parse bls-append-except-default");

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could probably do this trailing-whitespace trimming in a prep patch.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I only noticed after the fact that my editor is now following the .editorconfig setting to do that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seemed weird to only strip the trailing whitespace in the files I was touching, so I went all the way and added a prep commit that does all the C files. I can split that out to a separate PR if you prefer.

dbnicholson and others added 7 commits February 7, 2023 14:50
My editor started following the configuration in .editorconfig and is
applying this rule to many files I'm editing. Let's just get this over
with and strip everything. This was done like so:

  git ls-files | grep '\.[ch]$' | xargs sed -ri 's/\s+$//'
Currently this is just a wrapper around regenerating and signing the
summary in one call, but later it will be used to also generate the
`ostree-metadata` commit if the repo has a collection ID.
Call `ostree_repo_regenerate_metadata` with the provided GPG and sign
keys to generate and sign the summary in one call. This changes the
handling when GPGME support is disabled but GPG keys are supplied.
Instead of silently ignoring the option, it will now error. IMO that's
better as callers would otherwise not know that the summary is not GPG
signed.
If a commit is being made during summary generation, then it would
trigger the summary to be generated again. That's either unwanted busy
work or could result in an infinite loop. Add a boolean in
`OstreeRepoTxn` to disable automatic summary generation as seen fit.
Rather than creating the `ostree-metadata` commit in the summary
builtin, do it in the new `ostree_repo_regenerate_metadata` API. The
commit contents are unchanged and the commit is generated before the
summary as before. To keep from triggering an extra summary update,
automatic summary updating is disabled in the transaction.

Since the summary builtin was already using the new API, it will
continue to generate the `ostree-metadata` commit when the repo has a
collection ID. However, the `ostree_repo_regenerate_summary` API will
still only generate the summary file as before.
Refactor the summary signing APIs to use internal versions where the
directory fd containing the summary can be found. The existing signing
APIs still uses the repo directory fd, but this will allow using a
temporary directory for the summary and signature in the new metadata
generating API.
Use a temporary directory for the summary and signature file in
`ostree_repo_regenerate_metadata` so that the summary file isn't
published if signing fails. This prevents publishing a summary without a
signature file or leaving a mismatched signature file in place.
Copy link
Member

@cgwalters cgwalters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@cgwalters
Copy link
Member

/override continuous-integration/jenkins/pr-merge

@openshift-ci
Copy link

openshift-ci bot commented Feb 9, 2023

@cgwalters: Overrode contexts on behalf of cgwalters: continuous-integration/jenkins/pr-merge

In response to this:

/override continuous-integration/jenkins/pr-merge

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@cgwalters cgwalters merged commit 8025a83 into ostreedev:main Feb 9, 2023
@dbnicholson dbnicholson deleted the ostree-metadata-api branch February 13, 2023 19:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants