P2P code is not Byzantine fault tolerant #1527

mwleeds · 2018-03-30T19:29:01Z

In ostree's P2P code paths, repositories provide unsigned summary files along with signed per-repo and per-commit metadata that corroborates the important information in the summary. However, the signed metadata isn't checked at the same time as the summary file, and decisions are made based on the data in the summary file, so a maliciously crafted summary could DoS a peer on the same network (attacks using Internet remotes are more difficult, see here). Specifically, it's find_remotes_cb that reads the unsigned summary files. So if a peer claims to have the latest versions of every ref, find_remotes_cb would prioritize it and ostree_repo_pull_from_remotes_async would pull from it and fail, never getting any updates (because pull_from_remotes_async doesn't pull refs mapped to NULL checksums, which are refs that seem outdated or missing).

It would be fun to try a proof of concept of this, but I haven't yet.

Some options:

Make it so that pull_from_remotes_async pulls refs from remotes that don't claim to have the latest commit, only when the pull from the remote claiming to provide the latest commit failed.
Pull and check the signed commit metadata to verify the information going into the OstreeRepoFinderResult data structure. Alternatively, move some information mapping refs to commits into ostree-metadata and use that. This wouldn't prevent a peer from claiming to provide the latest real commit, only from claiming to provide a fake future commit. But I think that would be enough to prevent DoS.
Don't do anything, because the attack requires being on the same network as the victim, which usually implies some amount of trust.
Use a blockchain™ (jk)

Moved this into its own issue from #1518 (comment)

The text was updated successfully, but these errors were encountered:

pwithnall · 2018-07-11T11:03:38Z

Make it so that pull_from_remotes_async pulls refs from remotes that don't claim to have the latest commit, only when the pull from the remote claiming to provide the latest commit failed.

This seems reasonable, and was what I intended when writing the initial implementation of pull_from_remotes_async() — that it could be made more intelligent in future wrt pulling from multiple remotes in parallel, falling back to older versions if the newer ones aren’t available at all, etc.

Note that we don’t need to consider an attack to motivate falling back to pulling older commits (which are still newer than what the local machine has) from remotes: the same is necessary if the peer with the latest commits goes offline between finding it and pulling from it.

Pull and check the signed commit metadata to verify the information going into the OstreeRepoFinderResult data structure. Alternatively, move some information mapping refs to commits into ostree-metadata and use that. This wouldn't prevent a peer from claiming to provide the latest real commit, only from claiming to provide a fake future commit. But I think that would be enough to prevent DoS.

I avoided that in the first implementation because I think it would result in too many requests and too much data being pulled. For example, if you’re updating 10 apps on a LAN with machines which advertise 10 different versions of those apps (due to not being up to date, or whatever), that’s 100 requests for commit metadata, and 100 commit metadata files which you need to store locally for a while until they get pruned later on.

Don't do anything, because the attack requires being on the same network as the victim, which usually implies some amount of trust.

That’s OK for the initial implementation (it’s what we currently do), but I think it’s not a long-term solution. While we can trust local networks in some contexts (a well managed company or school, for example), we might not be able to trust them in every context (for example, consider a home network where some IoT device has been compromised) — so I wouldn’t want to put all our eggs in one security basket long term.

Use a blockchain™ (jk)

srsly.

Currently eos-updater asks the download scheduler (mogwai) before fetching an update, so that the user's configured update schedule can be respected. However that system is designed to help users with limited Internet connections; it doesn't make sense to use it when the update comes from local sources, such as the local network or a USB drive. So this commit makes changes to eos-updater so that offline OS updates can happen regardless of whether or not automatic updates are turned on. The way this is implemented is that in the Poll() stage, we first check offline sources for updates, and then only check the Internet if none were found offline. If we simply check all the sources at once as we previously were, you'd have to make significant changes to the Fetch() stage, because the scheduler can't make a decision when the offline and online results are mixed together. There are two notable implications of this implementation: 1. This means that even if the LAN or USB drive has an older update than what's available on the Internet, we update from it first (then on the next check for updates the Internet would be used). In my opinion, this is desired behavior because after the offline update is complete, the delta for the online update may be smaller. 2. Before this commit if the fetch from an offline source fails we try to fetch from the Internet. After this commit, if the offline fetch fails the whole update fails. This might be less than ideal but I don't think it will matter in practice. The alternative would be to rearchitect eos-updater or the libostree API used for the fetch in significant ways. A mounted filesystem or a peer on the network could conceivably prevent you from updating, but that's true both before and after this commit; see ostreedev/ostree#1527 https://phabricator.endlessm.com/T23972

Currently eos-updater asks the download scheduler (mogwai) before fetching an update, so that the user's configured update schedule can be respected. However that system is designed to help users with limited Internet connections; it doesn't make sense to use it when the update comes from local sources, such as the local network or a USB drive. So this commit makes changes to eos-updater so that offline OS updates can happen regardless of whether or not automatic updates are turned on. The way this is implemented is that in the Poll() stage, we first check offline sources for updates, and then only check the Internet if none were found offline. If we simply check all the sources at once as we previously were, you'd have to make significant changes to the Fetch() stage, because the scheduler can't make a decision when the offline and online results are mixed together. There are two notable implications of this implementation: 1. This means that even if the LAN or USB drive has an older update than what's available on the Internet, we update from it first (then on the next check for updates the Internet would be used). In my opinion, this is desired behavior because after the offline update is complete, the delta for the online update may be smaller. 2. Before this commit if the fetch from an offline source fails we try to fetch from the Internet (if they both provide the latest commit). After this commit, if the offline fetch fails the whole update fails. This might be less than ideal but I don't think it will matter in practice. The alternative would be to rearchitect eos-updater or the libostree API used for the fetch in significant ways. A mounted filesystem or a peer on the network could conceivably prevent you from updating, but that's true both before and after this commit; see ostreedev/ostree#1527 https://phabricator.endlessm.com/T23972

Currently libostree essentially has two modes when it's pulling refs: the "legacy" code paths pull only from the Internet, and the code paths that are aware of collection IDs try to pull from the Internet, the local network, and mounted filesystems (such as USB drives). The problem is that while we eventually want to migrate everyone to using collection IDs, we don't want to force checking LAN and USB sources if the user just wants to pull from the Internet, since the LAN/USB code paths can have privacy[1], security[2], and performance[3] implications. So this commit implements a new repo config option called "repo-finders" which can be configured to, for example, "config;lan;mount;" to check all three sources or "config;mount;" to disable searching the LAN. The set of values mirror those used for the --finders option of the find-remotes command. This configuration affects pulls in three places: 1. the ostree_repo_find_remotes_async() API, regardless of whether or not the user of the API provided a list of OstreeRepoFinders 2. the ostree_repo_finder_resolve_async() / ostree_repo_finder_resolve_all_async() API 3. the find-remotes command This feature is especially important right now since we soon want to have Flathub publish a metadata key which will have Flatpak clients update the remote config to add a collection ID.[4] This effectively fixes flatpak/flatpak#1863 but I'll patch Flatpak too, so it doesn't pass finders to libostree only to then have them be removed. [1] flatpak/flatpak#1863 (comment) [2] ostreedev#1527 [3] Based on how long the "ostree find-remotes" command takes to complete, having the LAN finder enabled slows down that step of the pull process by about 40%. See also flatpak/flatpak#1862 [4] flathub/flathub#676

Currently libostree essentially has two modes when it's pulling refs: the "legacy" code paths pull only from the Internet, and the code paths that are aware of collection IDs try to pull from the Internet, the local network, and mounted filesystems (such as USB drives). The problem is that while we eventually want to migrate everyone to using collection IDs, we don't want to force checking LAN and USB sources if the user just wants to pull from the Internet, since the LAN/USB code paths can have privacy[1], security[2], and performance[3] implications. So this commit implements a new repo config option called "repo-finders" which can be configured to, for example, "config;lan;mount;" to check all three sources or "config;mount;" to disable searching the LAN. The set of values mirror those used for the --finders option of the find-remotes command. This configuration affects pulls in three places: 1. the ostree_repo_find_remotes_async() API, regardless of whether or not the user of the API provided a list of OstreeRepoFinders 2. the ostree_repo_finder_resolve_async() / ostree_repo_finder_resolve_all_async() API 3. the find-remotes command This feature is especially important right now since we soon want to have Flathub publish a metadata key which will have Flatpak clients update the remote config to add a collection ID.[4] This effectively fixes flatpak/flatpak#1863 but I'll patch Flatpak too, so it doesn't pass finders to libostree only to then have them be removed. [1] flatpak/flatpak#1863 (comment) [2] #1527 [3] Based on how long the "ostree find-remotes" command takes to complete, having the LAN finder enabled slows down that step of the pull process by about 40%. See also flatpak/flatpak#1862 [4] flathub/flathub#676 Closes: #1758 Approved by: cgwalters

cgwalters added the bug label Mar 31, 2018

mwleeds mentioned this issue Jul 11, 2018

Add a way to disable checking LAN/USB sources flatpak/flatpak#1863

Closed

mwleeds mentioned this issue Oct 15, 2018

eos-updater: Avoid scheduler for offline updates endlessm/eos-updater#240

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

P2P code is not Byzantine fault tolerant #1527

P2P code is not Byzantine fault tolerant #1527

mwleeds commented Mar 30, 2018

pwithnall commented Jul 11, 2018

P2P code is not Byzantine fault tolerant #1527

P2P code is not Byzantine fault tolerant #1527

Comments

mwleeds commented Mar 30, 2018

pwithnall commented Jul 11, 2018