Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Narrow summary file race #490

Closed
wants to merge 4 commits into from

Conversation

dbnicholson
Copy link
Member

Provide a new function, ostree_repo_regenerate_summary_ext, that creates and signs the summary file in a tmpdir before renaming them into place. This minimizes the race between summary generation and signing without going fully atomic. Hopefully this is good enough for now since being atomic on the client side is quite a bit more complicated.

Closes: #487

This is for signing the summary file, not a static delta.

Closes: ostreedev#487
Provide internal versions that take a directory fd where the summary and
signature are handled. The existing functions continue to operate in the
top repository directory, but this allows another variant to manage the
summary file in a temporary directory.

Closes: ostreedev#487
The ostree_repo_regenerate_summary_ext function allows regenerating and
signing the summary in one call. The implementation tries to reduce the
race between these operations as much as possible by using a temporary
directory and renaming the files.

Closes: ostreedev#487
This minimizes the race between generation and signing of the summary
file. It also means that if the new summary file can't be signed, the
old summary file is retained.

Closes: ostreedev#487
@dbnicholson dbnicholson mentioned this pull request Sep 1, 2016
/* Remove comment when first new symbol is added
LIBOSTREE_2016.10
LIBOSTREE_2016.11 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the patch is just weirdly rendered, but it looks like the new function gets called .10 and the old that .11.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just a little weirdness in the ostree symbols file plus weirdness in symbol version files in general. This part is in a comment with a fake symbol waiting for the next release. My addition is above in the .10 section that chains to the .8 section. There's a little extra noise in this patch because the commented out .10 section prior to this patch (there weren't any new symbols yet) was missing the trailing {.

@alexlarsson
Copy link
Member

Seems like a good first step to me. If at some point we make it atomic on the client side we need this function anyway.

@cgwalters
Copy link
Member

Debian's bug here is similar:

https://bugs.launchpad.net/launchpad/+bug/1430011
http://www.chiark.greenend.org.uk/~cjwatson/blog/no-more-hash-sum-mismatch-errors.html

A few other ostree bugs related here:

https://bugzilla.gnome.org/show_bug.cgi?id=739143
https://bugzilla.gnome.org/show_bug.cgi?id=729388

So I see two things here. First, we could introduce summary-$summaryhash.sigs which is named by the hash of the summary. We'd need garbage collection for older ones (probably want to keep a few around to avoid issues with proxies that give stale results).

Second, we could support inline signatures for the summary file. I suspect most users would be happy enough with that. The problem is though it'd be a hard format change unless we basically doubled the size and had an inline-signed copy?

Related to all of this...where I'd like to go is separating metadata fetches from content fetches. The idea here is you fetch metadata atomically from more centralized servers (with e.g. TLS pinning), then you can get content from anywhere. One could pretty easily keep the ostree metadata here (refs/summary/signatures) entirely in memory and update it atomically. The client side of this is #469

@cgwalters
Copy link
Member

Finally related to all of this...I'd like to consider backing away from storing the static delta data in the summary file. It can get expensive quickly. Flatpak is relying on signed commits in the deltas anyways, and I'd like to push people towards pinned TLS + signed commits as simplest. Metadata is hard, particularly when one involves signing.

@alexlarsson
Copy link
Member

@cgwalters summary-$summaryhash.sigs seems the wrong direction. Then you have to download the potentially large summary before the signature, which breaks the "same summary => no need to redownload" optimization. I.e. since we download the summary.sig first, we should have a mapping from that (could be a simple name of the summary file stored in the summary) to the full summary file.

@alexlarsson
Copy link
Member

I mean "same summary sig => no need to redownload"

@cgwalters
Copy link
Member

So...I am not opposed to this at all, and sorry for introducing the race in the first place (metadata is hard). But, if we're making changes here I'd like to think a little bit harder about what we can do to fix the race vs narrowing the window.

I think what has me hesitating a bit is adding new API without solving the race.

@dbnicholson
Copy link
Member Author

I'd been meaning to follow up here, too. Here's my other idea that removes the race (I think), but would require new client code. Let me see if I can get a reasonable tree representation:

├── summaries
│   ├── 41af286dc0b172ed2f1ca934fd2278de4a1192302ffa07087cea2682e7d372e3
│   │   ├── summary
│   │   └── summary.sig
│   ├── be53d2ef39bda0f458c5cad8590e162a54a44f7c9a45a5017e539fa7b90192b1
│   │   ├── summary
│   │   └── summary.sig
│   └── current -> be53d2ef39bda0f458c5cad8590e162a54a44f7c9a45a5017e539fa7b90192b1
├── summary -> summaries/current/summary
├── summary.sig -> summaries/current/summary.sig

When creating a new summary, use a temporary directory under summaries. Optionally sign in the temporary directory. Rename the temporary directory to the checksum of the summary file. Update the current symlink to point to checksum directory. This removes the server side race.

On new clients, they'd download summaries/current/summary, take a checksum of the summary file, and then download summaries/<checksum>/summary.sig. They might have an old summary at that point, but they'll get the right signature.

Old clients would still race trying to download the toplevel summary.sig after downloading the summary file, but they'd no longer be exposed to the server side race.

There would need to be some cleanup of old summary directories. Maybe you'd keep 3?

BTW, if we're adding the _ext API (which I think we should), it should probably get a flags parameter in case anything is wanted there later.

@rh-atomic-bot
Copy link

☔ The latest upstream changes (presumably 36e8ba1) made this pull request unmergeable. Please resolve the merge conflicts.

@alexlarsson
Copy link
Member

The proposal from @dbnicholson means we always have to start by downloading the entire summary, instead of starting with the sig, before detecting that there has been no chnges. If we do that, then we have to use an etag or something to avoid downloading the large metadata if nothing changed.

Alternatively we can make the subdirectory be named by the hash of the signature. Then you can start by downloading and verifying that, and use its checksum to find the summary file. And for unsigned summaries you can just use the toplevel symlinks, because there is no race.

@dbnicholson
Copy link
Member Author

Yeah, sorry. I'd never thought of that optimization. I thought ostree always downloaded the summary file. Another issue i thought of with that setup is that the symlinks require you to host on a UNIX server. I think the rest of ostree has always allowed it to be hosted on any dumb http server.

So, my other thought is that instead of a symlink that summaries/current is a ref file with the checksum of the summary file, which is also the directory name as above. Then you can download the current file and quickly see if the checksum for the summary file has changed. If you want more verification, you can get the signature file from the checksum directory as above.

Without the symlinks, the top level compat files wouldn't be able to avoid the server side race. The best you could do is narrow it as in the patches here.

@cgwalters
Copy link
Member

It's going to be really messy to try to fix the race solely on the server side without having more intelligence there (like the split metadata/content server model). If we want to support just rsync for example (which today only works sort of by accident because refs/ is alphabetically after objects/ 😄 )

Let's take a step back - do you need the signed summary file vs what we have with per-commit signatures?

And to note again: if we can move the ecosystem to split metadata/content fetches, a lot of things become simpler. The most recent ostree supports this.

@alexlarsson
Copy link
Member

Do we need it? I dunno. Without signed summaries we can't protect against certain things. For instance, someone can MITM you and claim the "org.some.app" app actually contains the commit from "org.other.app" from the same remote. Is this bad? I dunno, its somewhat limited as an attack vector.

We do require signed summaries today though.

What do you mean by split metadata/content, can you describe it in a bit more detail?

@alexlarsson
Copy link
Member

Also, we store some extra info in the summary files, such as app permissions/metadata and sizes in the summary file. Those would be fakeable if not signed, although we could re-verify them after download to catch issues.

@cgwalters cgwalters added the WIP label Oct 4, 2016
@dbnicholson
Copy link
Member Author

dbnicholson commented Jan 21, 2020

I'm gonna close this since it doesn't actually address the client side race. I moved a couple pieces into #1946 - a new API that does the generation and signing in one call and use of a temporary directory for the summary and summary.sig file so that you don't publish mismatched files.

@pwithnall
Copy link
Member

pwithnall commented Jan 22, 2020

Do you mean #1490? It seems unrelated.

@openshift-ci-robot
Copy link
Collaborator

@dbnicholson: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dbnicholson
Copy link
Member Author

Do you mean #1490? It seems unrelated

Oops I did not. It's #1946.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Summary file races
6 participants