CLI: --offline should mean fully offline #483

Open
woodruffw opened this issue Jan 31, 2023 · 21 comments
Labels
component:cli CLI components component:tuf TUF related components enhancement New feature or request

Comments

@woodruffw
Member

Once #478 is merged, sigstore verify will have an --offline flag that disables online transparency log lookups.

This flag should also disable TUF refreshes, since those require network access. As such, this is a subset/sub-issue of #376.

@woodruffw woodruffw added enhancement New feature or request component:cli CLI components component:tuf TUF related components labels Jan 31, 2023
@di
Member

di commented Jan 31, 2023

While we're at it, we should make sure that TUF refreshes fail gracefully without a network connection.

@emilejbm
Contributor

@emboman13, @omartounsi7, and I were thinking about a possible solution for this. We believe a workaround could be to check whether the offline flag is set around this code:

    path = self._updater.find_cached_target(target_info)
    if path is None:
        path = self._updater.download_target(target_info)

In the case where we are offline and there is no cached data, we would make sure the data is still empty and have the other _get functions handle that exception. This would entail passing the offline flag to the other _get functions for ctfe / rekor keys and fulcio certs.

Are we thinking about this correctly? Where do the TUF refreshes come into play?

@woodruffw
Member Author

There's a bigger conceptual/design challenge here: TUF's security/threat model assumes that the TUF repository can always be refreshed, since the way TUF handles things like revocations is by deleting the relevant key from the repo entirely. We'll need to figure out if and how sigstore-python should compromise on that model, if we choose to support a "full" offline mode.

> In the case where we are offline and there is no cached data, we would make sure the data is still empty and have the other _get functions handle that exception. This would entail passing the offline flag to the other _get functions for ctfe / rekor keys and fulcio certs.

If offline is passed and there's no cached TUF state, we should probably produce a hard error (since there's nothing meaningful we can do, verification wise, if we don't have any root of trust).
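
For illustration, the hard-error behaviour could look roughly like the sketch below. It is not sigstore-python code: `FakeUpdater`, `resolve_target`, and `OfflineCacheMiss` are hypothetical names, and the fake updater only mimics the two calls (`find_cached_target`, `download_target`) that appear in the snippet earlier in this thread.

```python
from __future__ import annotations

from typing import Optional


class OfflineCacheMiss(Exception):
    """Hypothetical error: --offline was set and nothing is cached locally."""


class FakeUpdater:
    """Stand-in for the TUF updater so the sketch runs without a network."""

    def __init__(self, cache: dict[str, str]) -> None:
        self._cache = cache
        self.downloads = 0

    def find_cached_target(self, target_info: str) -> Optional[str]:
        # Return the local path of a cached target, or None if absent.
        return self._cache.get(target_info)

    def download_target(self, target_info: str) -> str:
        # Pretend to download; a real updater would hit the network here.
        self.downloads += 1
        path = f"/tmp/targets/{target_info}"
        self._cache[target_info] = path
        return path


def resolve_target(updater: FakeUpdater, target_info: str, offline: bool) -> str:
    path = updater.find_cached_target(target_info)
    if path is None:
        if offline:
            # No local root of trust: fail hard instead of returning
            # empty data for later code to trip over.
            raise OfflineCacheMiss(
                f"no cached copy of {target_info!r} and --offline is set"
            )
        path = updater.download_target(target_info)
    return path
```

With this shape, the `_get` helpers would not need to handle an "empty" result; they would either receive a path or see the error propagate.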

@emboman13
Contributor

Would a possible solution be adding a timestamp to the cached state and requiring that it be updated every so often to function? Otherwise this does seem to be quite an impasse.

@woodruffw
Member Author

> Would a possible solution be adding a timestamp to the cached state and requiring that it be updated every so often to function?

This would be the technical solution, but we'd need to work out how sigstore-python signals that it's effectively verifying in a "degraded" capacity (similar to how offline Rekor verification is already weaker than online verification).

Some kind of warning on stderr would probably be sufficient, but the most immediate step here is to coordinate with TUF and figure out if this is an already known use case/if they already have best practices written down somewhere.
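
As a rough sketch of the stderr warning idea (not an agreed design): the helper names `degraded_warning` and `report`, and the idea of passing in a `max_age`, are assumptions made here for illustration only.

```python
from __future__ import annotations

import sys
from datetime import datetime, timedelta
from typing import Optional


def degraded_warning(
    last_refresh: datetime, now: datetime, max_age: timedelta
) -> Optional[str]:
    """Return a warning message if the cached state is older than max_age.

    Both datetimes must be of the same kind (both naive or both aware).
    """
    age = now - last_refresh
    if age <= max_age:
        return None
    return (
        f"warning: cached trust root was last refreshed {age.days} day(s) ago; "
        "verifying in a degraded offline mode"
    )


def report(message: Optional[str]) -> None:
    """Print the warning on stderr so that stdout stays machine-readable."""
    if message is not None:
        print(message, file=sys.stderr)
```

Keeping the message construction separate from the printing makes the staleness rule easy to test without capturing stderr.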

@jku
Member

jku commented Mar 27, 2023

This is maybe slightly off-topic (or too high level) for this specific issue, but possibly relevant, so I'll write it down:

There seem to be two TUF pain points for sigstore-python:

  1. Always checking the remote server for new metadata (in other words "I'm running 'sigstore verify' multiple times in a short time, there is no point in checking for a new timestamp more than once")
  2. No offline mode: It should be possible to make a user decision to stay offline (In other words "I'm running 'sigstore verify' without a network connection but have TUF metadata and targets caches: would like to use them even if they are expired"):
    • Is the description here accurate?
    • The end user workaround in this case is to just find the TUF target cache, and use the bundle in the cache as input to sigstore directly (IIRC you can give the bundle as input?). The downside of this workaround is that TUF protections are then completely overridden: if this use case was supported by the tuf library, you could at least check that the metadata otherwise correctly signs the target files -- it's just potentially expired.

I think both of these "issues" would be reasonable, but I wanted to see an agreement on the use cases, preferably with more details than I have above before we try to fix things.... It's so easy to "fix" the wrong thing. @woodruffw can you confirm if the above use cases are correct, if there are any others to take into account, and if they have a priority order for you?

@jku
Member

jku commented Mar 27, 2023

Oh and also: I think the decisions here are also very much sigstore system level decisions:

  • how long can clients use key material without checking for new key material?
  • what sort of compromise is best for offline users?

I'm happy to figure out possible solutions with sigstore-python and python-tuf but in the end the answers probably should have wider sigstore ecosystem consensus

@emboman13
Contributor

> 2. No offline mode: It should be possible to make a user decision to stay offline (In other words "I'm running 'sigstore verify' without a network connection but have TUF metadata and targets caches: would like to use them even if they are expired"):

One of the things we were looking at changing from your previous branch of Python TUF was either using 2 Booleans in the config or just making it a multi-valued integer, such that offline mode would either hard fail if the cached data was expired or try to fetch new up-to-date data online, depending on how the config was set.

@woodruffw
Member Author

> @woodruffw can you confirm if the above use cases are correct, if there are any others to take into account, and if they have a priority order for you?

Those look correct to me! In terms of priority, I'd say (2) is higher priority than (1) at the moment -- IMO reducing roundtrips in the "online" case would be good for us to do, but doesn't reflect a current user pain point (at least, not one that's been reported to us).

> • Is the description here accurate?

I think so -- the way I'd frame it is "I have all of the local materials needed for a Sigstore root of trust, and I don't want to do any network connections at all." This precludes (initial) support for the signing case, only verifying.

> • IIRC you can give the bundle as input?

I actually don't think we directly support this, yet 😅 -- we have a couple of flags that effectively allow the user to build up the root of trust piece-by-piece, but not a flag that just says "use the trust bundle at <path> for everything." I think that would be good for us to add, though!

@woodruffw
Member Author

> • how long can clients use key material without checking for new key material?

Hmm, I'm of a few different minds on this:

  1. KISS: a user who explicitly opts into a fully offline flow is effectively opting into a degraded security model, and we should communicate that (with lots of scary warning messages about understanding one's threat model) rather than trying to mediate it (e.g. by setting a somewhat arbitrary invalidation date for offline materials). This also has the benefit of making the use of a "detached" trust bundle easier, since it won't have a corresponding TUF repository to check staleness against.
  2. Different policies for different offline behaviors: a user who uses a detached "trust" bundle could just be warned, while a user who uses a TUF repository with updates explicitly disabled could be given a failure after 24 hours without an update. The downside there is additional cognitive overhead/complexity in the security model.
  3. Something else?
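
To make option 2 concrete, here is a small sketch of "different policies for different offline behaviors". Every name in it (`TrustSource`, `Outcome`, `offline_policy`) is hypothetical. It also assumes that a TUF cache still inside the 24-hour window produces a warning, which the list above does not specify.

```python
from __future__ import annotations

from datetime import timedelta
from enum import Enum


class TrustSource(Enum):
    DETACHED_BUNDLE = "detached-bundle"            # user-supplied trust bundle
    TUF_CACHE_NO_UPDATES = "tuf-cache-no-updates"  # TUF repo, updates disabled


class Outcome(Enum):
    WARN = "warn"
    FAIL = "fail"


# The 24 hour figure comes from option 2 above.
MAX_TUF_CACHE_AGE = timedelta(hours=24)


def offline_policy(source: TrustSource, age: timedelta) -> Outcome:
    """Map a trust source and the age of its material to an outcome."""
    if source is TrustSource.DETACHED_BUNDLE:
        # There is no TUF repository to check staleness against, so only warn.
        return Outcome.WARN
    if age > MAX_TUF_CACHE_AGE:
        return Outcome.FAIL
    # Assumption: offline use inside the window is still worth a warning.
    return Outcome.WARN
```

Option 1 (KISS) would collapse this to always returning `Outcome.WARN`, which shows how much of the extra complexity lives in the second branch.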

@jku
Member

jku commented Mar 29, 2023

> I think so -- the way I'd frame it is "I have all of the local materials needed for a Sigstore root of trust, and I don't want to do any network connections at all." This precludes (initial) support for the signing case, only verifying.

Can you clarify this a bit: Do you mean you expect the user to provide all key material (the sigstore root of trust) as input if they want to be offline, or is the idea that _internal.tuf module should be able to provide cached key material even if the TUF metadata is invalid because of expiry, when it's told to work "offline"?

The former (user provides all key material) sounds like just sigstore-python UI work (if it's not possible already); the latter needs at least modifying the _internal.tuf module, but likely should be a python-tuf feature -- this might not be completely trivial, but it is a development I'd be interested in seeing.

@jku
Member

jku commented Mar 29, 2023

> > 2. No offline mode: It should be possible to make a user decision to stay offline (In other words "I'm running 'sigstore verify' without a network connection but have TUF metadata and targets caches: would like to use them even if they are expired"):
>
> One of the things we were looking at changing from your previous branch of Python TUF was either using 2 Booleans in the config or just making it a multi-valued integer, such that offline mode would either hard fail if the cached data was expired or try to fetch new up-to-date data online, depending on how the config was set.

Note that my branch does not attempt to solve the offline mode case at all, it's only trying to make the root and timestamp requests a little less often. Implementation of "offline mode" (IOW serving cached targets even if metadata is expired) in python-tuf would likely look different (and as I mentioned in previous comment, I don't yet know 100% if it is what sigstore-python wants).

I do agree the "fail fast if any network requests are absolutely needed" would make sense if an "offline mode" was added in python-tuf.

@emboman13
Contributor

> I do agree the "fail fast if any network requests are absolutely needed" would make sense if an "offline mode" was added in python-tuf.

I should've been clearer; this is what we meant. We wouldn't be deviating much from what you previously had, mostly just adding a flag that makes lazy refresh hard fail instead of fetching new metadata when the cached metadata is expired. Then on the Sigstore side we would largely just be setting appropriate expiry times and passing different config files depending on whether --offline (or even an additional --lazy-refresh flag) was set. That would make both a hard offline mode and your existing lazy refresh accessible to Sigstore users.

This seems like a reasonable potential solution to start work on while specifics on expiry standards are finalized.
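
A sketch of the multi-valued setting described above, purely for discussion. `RefreshMode`, `ExpiredMetadataError`, and `should_fetch` are hypothetical names and are not part of python-tuf or of the draft branch.

```python
from __future__ import annotations

from datetime import datetime
from enum import IntEnum


class RefreshMode(IntEnum):
    """Hypothetical multi-valued setting in place of two booleans."""

    ALWAYS = 0   # contact the remote on every run
    LAZY = 1     # contact the remote only once cached metadata has expired
    OFFLINE = 2  # never contact the remote; expired metadata is a hard failure


class ExpiredMetadataError(Exception):
    """Cached metadata has expired and the mode forbids fetching more."""


def should_fetch(mode: RefreshMode, expires: datetime, now: datetime) -> bool:
    """Return True if a network refresh is needed; fail fast when offline."""
    expired = now >= expires
    if mode is RefreshMode.ALWAYS:
        return True
    if mode is RefreshMode.LAZY:
        return expired
    # RefreshMode.OFFLINE: no network access is allowed at all.
    if expired:
        raise ExpiredMetadataError(
            f"cached metadata expired at {expires.isoformat()}"
        )
    return False
```

On the Sigstore side, `--offline` would then select `OFFLINE` and a `--lazy-refresh` flag (if added) would select `LAZY`.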

@woodruffw
Member Author

> Can you clarify this a bit: Do you mean you expect the user to provide all key material (the sigstore root of trust) as input if they want to be offline, or is the idea that _internal.tuf module should be able to provide cached key material even if the TUF metadata is invalid because of expiry, when it's told to work "offline"?

I was thinking of it as the former, but I could be (dis)convinced of either approach 🙂

I agree the former would primarily be UI work, rather than TUF work -- in effect it'd just be something like sigstore verify identity --offline --bundle /path/to/trust/bundle, which would cause us to read the specified trust bundle rather than attempting to update the TUF repository.

My thinking there was that the default value of --bundle would be whatever's already in the TUF repo, if it's been initialized. If it hasn't, then using --offline would produce an error. I think that should be fine, but I might have missed something!
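
The defaulting rule could be sketched as follows. This uses plain `argparse` rather than the real sigstore CLI. The `--bundle` flag here is the proposed trust-bundle option from this comment, not an existing flag, and `select_trust_bundle` and `OfflineError` are hypothetical. Letting an explicit bundle win even when online is an assumption.

```python
from __future__ import annotations

import argparse
from pathlib import Path
from typing import Optional


class OfflineError(Exception):
    """--offline was requested but no local trust material is available."""


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="verify-sketch")
    parser.add_argument("--offline", action="store_true")
    parser.add_argument("--bundle", type=Path, default=None, help="trust bundle")
    return parser


def select_trust_bundle(
    args: argparse.Namespace, tuf_cached_bundle: Optional[Path]
) -> Optional[Path]:
    """Pick the trust bundle to read, or None for the normal TUF update flow.

    tuf_cached_bundle is the bundle already in the local TUF repo, or None
    if the repo has never been initialized.
    """
    if args.bundle is not None:
        return args.bundle  # an explicit path always wins (assumption)
    if not args.offline:
        return None  # online: let the TUF client refresh as usual
    if tuf_cached_bundle is None:
        raise OfflineError("--offline needs an initialized TUF cache or --bundle")
    return tuf_cached_bundle
```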

@jku
Member

jku commented Mar 31, 2023

I think this sounds quite reasonable.

  • first step is --offline that works by side-stepping TUF altogether, just looking into the target cache to find the key material. This is likely doable in sigstore-python only
  • a further improvement would be a python-tuf feature that would allow verifying the cached target in an offline mode (in other words, using cached metadata even if it is expired already, and not downloading any new metadata): this would give sigstore-python confidence that the local key material is valid (or was considered valid at an earlier date) when it is used with --offline. There likely is no python-tuf issue for this yet but I think it's an interesting idea
  • A separate improvement (somewhat unrelated to the two above) would be to decrease the number of requests python-tuf makes (IOW, in some cases not downloading metadata if we already have valid metadata we can use): this likely requires the python-tuf feature described in theupdateframework/python-tuf#2225 (comment), "ngclient feature: Add option to only update metadata if needed", and may require a TUF spec addition
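
To illustrate part of the second bullet, the sketch below checks a cached target against cached targets metadata while ignoring the `expires` field. It is deliberately partial: it covers only the length and sha256 comparison and does not verify the metadata signatures, which a real offline mode would still have to do. The function name and the target name in the usage note are hypothetical. The dictionary layout follows the TUF targets metadata format (`signed.targets.<name>.length` and `.hashes`).

```python
from __future__ import annotations

import hashlib
from typing import Any


def target_matches_metadata(
    name: str, content: bytes, targets_metadata: dict[str, Any]
) -> bool:
    """Compare a cached target's length and sha256 with cached metadata.

    The "expires" field is intentionally not consulted, and signature
    verification of the metadata itself is out of scope for this sketch.
    """
    entry = targets_metadata["signed"]["targets"].get(name)
    if entry is None:
        return False
    if entry["length"] != len(content):
        return False
    expected = entry["hashes"].get("sha256")
    return expected is not None and hashlib.sha256(content).hexdigest() == expected
```

For example, calling it with a hypothetical target name such as `"example.pub"` and metadata whose `expires` date is in the past would still return `True`, as long as the length and hash match.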

@jku
Member

jku commented Mar 31, 2023

I found one more (possibly different) requirement:

> we should make sure that TUF refreshes fail gracefully without a network connection.

@di What does this mean exactly? This does not quite sound like the third point in my previous comment (avoiding requests in situations where we think it's safe)...

What does a graceful failure look like in detail? The common situations we might want to consider:

  • no key material or metadata is locally cached
  • key material is locally cached, but TUF metadata is expired
  • (this third case should mostly be covered by my previous comment:) key material and metadata are up-to-date but TUF still wants to make requests

@di
Member

di commented Mar 31, 2023

> What does a graceful failure look like in detail?

Generally by this, I just mean "not raise an exception to the user in the CLI".
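
In code terms, that could mean something like the sketch below: catch the failure at the CLI boundary, print a short message, and return a non-zero exit code instead of a traceback. `run_cli` is a hypothetical wrapper, and the built-in `ConnectionError` and `TimeoutError` stand in for whatever exception the TUF client actually raises.

```python
from __future__ import annotations

import sys
from typing import Callable


def run_cli(refresh: Callable[[], None]) -> int:
    """Run a refresh step and turn network failures into an exit code."""
    try:
        refresh()
    except (ConnectionError, TimeoutError) as exc:
        # A short message on stderr replaces the uncaught-exception traceback.
        print(f"error: could not refresh TUF metadata: {exc}", file=sys.stderr)
        return 1
    return 0
```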

@emboman13
Contributor

emboman13 commented Apr 24, 2023

Opened a draft PR for python-tuf that, if implemented, should provide a clean way to get offline functionality within Sigstore again. The mention from Emile above is an implementation of this fix in a testing setting.
theupdateframework/python-tuf#2363

@fproulx-boostsecurity

Looking forward to this. Any chance this might ship before the end of the year?

@jku
Member

jku commented Nov 16, 2023

Thanks for the ping... We discussed the TUF aspects with @woodruffw a couple of weeks ago, but it seems I did not update the issue (sorry):

  • the "offline" feature for python-tuf does not currently lead to security wins that would make the (development and runtime) complexity reasonable. It's a long story, but at least theupdateframework/python-tuf#1168 ("Updater feature request: verify chain of trust from bootstrapped root metadata") would be needed to reap tangible benefits from an offline TUF client
  • so the sigstore offline feature should be built to work around TUF, at least for now: If the user says "--offline", don't use the tuf client, instead assume that a) either there is a TUF cached trust root and it is the correct one or b) user provides the trust root material explicitly as arguments

The TUF workaround should be fairly easy to implement. I'm not sure if there are other aspects to --offline that need to be done.


The following is a hand-wavy design:

  • make _internal/tuf.py aware of --offline state
  • modify that component so that the public methods (like get_ctfe_keys()) will, when offline, just look into the target cache and return the cached target without verifying the target with the actual tuf client
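
A very rough sketch of the two bullets above, assuming the cached targets live as plain files in one directory. The class name, constructor arguments, and the `ctfe.pub` file name are all hypothetical. Only the method name `get_ctfe_keys()` comes from the design notes, and the online path is intentionally left out.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


class OfflineAwareTrustRoot:
    """Hypothetical stand-in for the component in _internal/tuf.py."""

    def __init__(self, targets_dir: Path, offline: bool) -> None:
        self._targets_dir = targets_dir
        self._offline = offline

    def _read_target(self, name: str) -> bytes:
        if not self._offline:
            # The online path would go through the real TUF client.
            raise NotImplementedError("online lookup is not part of this sketch")
        path = self._targets_dir / name
        if not path.is_file():
            raise FileNotFoundError(f"no cached target {name!r}; cannot run offline")
        # Offline: return the cached bytes without any TUF verification.
        return path.read_bytes()

    def get_ctfe_keys(self) -> list[bytes]:
        return [self._read_target("ctfe.pub")]


def demo() -> list[bytes]:
    """Populate a temporary target cache and read it back in offline mode."""
    with tempfile.TemporaryDirectory() as tmp:
        targets = Path(tmp)
        (targets / "ctfe.pub").write_bytes(b"example key bytes")
        return OfflineAwareTrustRoot(targets, offline=True).get_ctfe_keys()
```

Because nothing is verified in this mode, the cached files are trusted exactly as found on disk, which is the trade-off described in the comment above.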

@jku
Member

jku commented Dec 14, 2023

I'm planning to add internal support for this while fixing #821: see #821 (comment)

6 participants