Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crate quarantine #3464

Open
wants to merge 7 commits into
base: master
Choose a base branch
from
Open

Conversation

LawnGnome
Copy link

@LawnGnome LawnGnome commented Jul 25, 2023

/cc @rust-lang/crates-io, @rust-lang/cargo, @rust-lang/mods

Rendered

@LawnGnome LawnGnome added the T-crates-io Relevant to the crates.io team, which will review and decide on the RFC. label Jul 25, 2023
@LawnGnome LawnGnome self-assigned this Jul 25, 2023
text/0000-crate-quarantine.md Outdated Show resolved Hide resolved
text/0000-crate-quarantine.md Outdated Show resolved Hide resolved
text/0000-crate-quarantine.md Outdated Show resolved Hide resolved
text/0000-crate-quarantine.md Outdated Show resolved Hide resolved
LawnGnome and others added 2 commits July 25, 2023 12:34
Co-authored-by: Caleb Cartwright <calebcartwright@users.noreply.github.com>
@tarcieri
Copy link

I've been interested in a similar feature for internal use within a team of crates.io end users that releases crates: they could be configured to be immediately quarantined upon publication until approved by one (or potentially k of n) additional publisher beyond the person who published it before it goes live.

Such a 2-person (or more) system would prevent the compromise of a single publisher's credentials from translating directly to publication of a malicious crate. This would be useful for high-value crates. It would also help secure systems where crates are published by a release automation, then approved by a human/k human(s).

I'm curious if the design of the quarantine system could have an associated policy about how a quarantined crate can be published, i.e. approval of other crate publishers versus approval by crates.io administrators.

@epage
Copy link
Contributor

epage commented Jul 26, 2023

I've been interested in a similar feature for internal use within a team of crates.io end users that releases crates: they could be configured to be immediately quarantined upon publication until approved by one (or potentially k of n) additional publisher beyond the person who published it before it goes live.

Quarantining for extreme cases is easier to deal with imo. When it becomes expected, things get a lot more complex if we also want to handle cases like rust-lang/cargo#1169

@tarcieri
Copy link

@epage if a quarantine system were available to end-users, it could be used to implement rust-lang/cargo#1169.

You could stage all of the releases in quarantine, then set them live at the same time.

@epage
Copy link
Contributor

epage commented Jul 26, 2023

Yes, we could have sandboxed indexes that these quarantined packages go to.

That requires a lot of more design and implementation work than this RFC proposes.

@tarcieri
Copy link

I think all it requires out of the gate is attaching an AuthZ policy about how a crate can be un-quarantined/published to each quarantine state.

Arguably if you only have one AuthZ policy you don’t need that. It can be backfilled later.

I just hope you can keep the design flexible enough that it could potentially accommodate these features in the future.

@epage
Copy link
Contributor

epage commented Jul 26, 2023

As a future possibility, I think its worth it. We just need to recognize that is likely not a simple addition on top of this. The design work alone would need to examine the whole publish pipeline and address parts at each stage (cargo publish, crates.io API, crates.io index and .crate storage, the local index and crate storage, etc).

@LawnGnome
Copy link
Author

I've been interested in a similar feature for internal use within a team of crates.io end users that releases crates: they could be configured to be immediately quarantined upon publication until approved by one (or potentially k of n) additional publisher beyond the person who published it before it goes live.

That definitely sounds like an interesting area to explore! Consensus based peer approval is something that might go very well with distributed reviews in cargo vet, or other similar tooling. And, of course, one of my concerns in this RFC is workload on the crates.io team, so being able to distribute that more widely would be beneficial in the long run if this mechanism is used for more than just dealing with immediate spam attacks and bad actors.

That said, as @epage said, I'd prefer to keep this RFC focused on those more prosaic concerns right now — I don't believe anything in this RFC would prevent future work (and, indeed, simply decoupling the concept of a crate version being available in the index from running cargo publish will help significantly there), but this is largely around getting the crates.io team the tooling we need today, rather than trying to design a full review/quarantine system that is all things to all people for the future.

LawnGnome and others added 2 commits July 30, 2023 11:18
Co-authored-by: Josh Triplett <josh@joshtriplett.org>
Co-authored-by: teor <teor@riseup.net>
@ijackson
Copy link

ijackson commented Aug 7, 2023

Under this proposal, quarantining a crate could very seriously complicate publication of a multi-crate workspace (or other kind of "deep stack"). Rust wants us to make smaller crates for separate compilation reasons, but cargo then insists that these crates which can be conceptually part of a single program must be published separately. For release management to be practical, there must be a way to publish a workspace even if some of the involved crates end up quarantined.

This does not seem to be dicussed at all in the RFC. IMO it should be.

Solutions to this problem that occurred to me:

  • Allow publication to refer to quarantined dependencies, linking the new crate to the quarantined of those dependencies. (This seems like it would be a lot of work.)
  • Allow overriding of the "dependencies actually exist" check. The user would be expected to retry the publication of the failed crate(s) with the new flag. (This seems easy, but the resulting UX is poor.)
  • Allow overriding of the "dependencies actually exist" check. Automatically apply this when a dependency doesn't exist but might be be resolvable using quarantined uploads. (This needs a way to check for the existence of a quarantined crate.)
  • Do the server side "dependencies actually exist" check after quarantine review,; automatically quarantine any crate that has a quarantined dependency. (This may be doable; the UX will be poor, since the user would need to redo with --no-verify)
  • Define circumastances where we promise not to apply quarantine. (But it seems like it would be difficult to define rules that wouldn't be exploitable.)

@ijackson
Copy link

ijackson commented Aug 7, 2023

Another thing that needs to be addressed in the RFC:

When a new crate is quarantined, the name must not be visible via any public API. Otherwise it would be possible for a malicious actor to see these, and engage in "front-running", grabbing the crate name with a crate tailored to not be quarantined.

(The DNS suffers badly from front-running, where that malicious practice can offer benefits that are valuable to the perpetrators. We should try to avoid creating similar sets of incentives.)

@Nemo157
Copy link
Member

Nemo157 commented Aug 7, 2023

When a new crate is quarantined, the name must not be visible via any public API. Otherwise it would be possible for a malicious actor to see these, and engage in "front-running", grabbing the crate name with a crate tailored to not be quarantined.

This is not possible:

A version in the quarantined state [..] will appear in crates.io API responses as an unpublished crate, but will not be added to the index [..]

By being part of the API the crate has already been registered in the database as owned by the publishing user, and the published version registered on that crate. It's simply the index metadata telling cargo about the crate that hasn't been updated yet.


`cargo publish` will also be modified to report when an uploaded crate version has been placed into quarantine.

Once a crate version has been quarantined, then a member of the crates.io team will review the quarantined crate, ideally within 1 week[^sla]. Once reviewed, the crate may be either approved and published to the index, or rejected and deleted. Either way, the crate owner(s) will be notified by e-mail, assuming crates.io has a verified e-mail available for their account.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Once a crate version has been quarantined, then a member of the crates.io team will review the quarantined crate, ideally within 1 week[^sla]. Once reviewed, the crate may be either approved and published to the index, or rejected and deleted. Either way, the crate owner(s) will be notified by e-mail, assuming crates.io has a verified e-mail available for their account.
Once a crate version has been quarantined, then a member of the crates.io team will review the quarantined crate, ideally within 1 week[^sla]. Once reviewed, the crate may be either approved and published to the index, or rejected and deleted. Either way, the crate owner(s) will be notified by e-mail.

We only allow publishes from users with verified email addresses these days :)


Once a crate version has been quarantined, then a member of the crates.io team will review the quarantined crate, ideally within 1 week[^sla]. Once reviewed, the crate may be either approved and published to the index, or rejected and deleted. Either way, the crate owner(s) will be notified by e-mail, assuming crates.io has a verified e-mail available for their account.

Should the crate owner(s) disagree with the quarantine action, they may appeal to the moderation team.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to specify how long we will wait for an answer of the crate owner before potentially deleting a quarantined crate?


### API

A new field will be added to API endpoints that return crate versions called `published`. This will only be `false` for crate versions that are in quarantine.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we should use something like state: "published" and state: "quarantined" instead. this would allow us to add other states in the future without adding more fields to it. one example could be temporary suspension of a crate/version due to policy violations or retroactive quarantine due to potentially malicious code being discovered.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thinking about this some more, I guess we could also encode whether the version has been published to the indexes yet. OTOH we have two of them now, so a single field might not be enough for that then 🤔


The publish endpoint will notify the publisher (generally `cargo publish`) of the quarantined version by extending the response[^publish-response] in two ways:

1. A new field will be added to the response called `quarantined`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about adding a version field instead, containing the version that was just created? That would automatically contain the quarantine status (see "API" section below) and might be useful in other ways too.


## `cargo publish`

The wait loop in `cargo publish`[^wait-loop] currently refreshes the crate index and waits for the new crate version to appear.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like the ^wait-loop reference is missing below 😉


### Large scale spam attack

In the event of a large scale spam attack on the entire site, we may want to either restrict individual users (if the spam is coming from a finite set of users) or quarantine all published versions temporarily (using a `.*` crate name rule) until it subsides, at which point we can analyse which crate versions are actually valid.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another rule that might be useful: putting new crates into quarantine but allowing new versions of existing crates. not necessarily right now for this RFC, but we should keep it in mind when we implement this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-crates-io Relevant to the crates.io team, which will review and decide on the RFC.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

None yet

10 participants