Make the authors field optional #3052

Merged
merged 5 commits into from
Mar 17, 2021
Changes from 4 commits
140 changes: 140 additions & 0 deletions text/0000-deprecate-authors-field.md
@@ -0,0 +1,140 @@
# RFC: Deprecate the authors field

- Feature Name: `deprecate_authors_field`
- Start Date: 2021-01-07
- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)

# Summary
[summary]: #summary

This RFC proposes to deprecate the `package.authors` field of `Cargo.toml`.
This also implies preventing Cargo from auto-filling it, allowing crates to be
published to crates.io without the field being present, and avoiding displaying
its contents on the crates.io and docs.rs UI.

# Motivation
[motivation]: #motivation

The crates.io registry does not allow users to change the contents of already
published versions: this is highly desirable to ensure working builds don't
break in the future, but it also has the unfortunate side-effect of preventing
people from updating the list of crate authors defined in `Cargo.toml`'s
`package.authors` field.
Review comment:

Is it really not possible to redact their names from existing packages? The only real use case for `package.authors` is `env!("CARGO_PKG_AUTHORS")`. Could anyone research how often this is actually used? Even if it is used, redacting a field from an existing package is unlikely to cause any issues unless, for some reason, a certain crate fails to compile without a `:` in `$CARGO_PKG_AUTHORS`, or unless the crate tries to encode some logic inside the authors field. (This is hilarious, but I have actually seen the latter done in another community by someone who didn't want his software to be "stolen" by forking and changing the author name.)

Review comment (Member Author):

Changing the contents of a crate will invalidate its hash, which will prevent any person depending on the crate from building their code.

Review comment (Member):

The only other use-case I know is for listing maintainers of separate crates in large internal workspaces, but that can be easily achieved in some other way.


This is especially problematic when people change their name or want to remove
their name from the Internet, and the crates.io team doesn't have any way to
address that at the moment except for deleting the affected crates or versions
altogether. We don't do that lightly, but there were a few cases where we were
forced to do so.
Review comment:

Is it really justified that we conduct a major change just for a minor use case that happens very rarely?

Review comment:

This also only removes their name from the internet, but not the contents they created. Is this really meaningful in that sense? In particular, what if, for example, their names for some reason got into the code section of another person's crate?

Review comment (Member Author):

> Is it really justified that we conduct a major change just for a minor use case that happens very rarely?

One of the things I value the most is the personal safety of every Rust user. I strongly believe changes like this are justified if they can prevent people from being harmed.

> This also only removes their name from the internet, but not the contents they created. Is this really meaningful in that sense? In particular, what if, for example, their names for some reason got into the code section of another person's crate?

This is anecdotal evidence, but I have had access to help@crates.io for almost two years, and all of the cases where personal information needed to be deleted were related to package.authors, not the source code of the crates. Of course we can't prevent people from intentionally adding their name in the source code, but not forcing them to do so will address most of the issues.


The contents of the field also tend to scale poorly as the size of a project
grows, with projects either making the field useless by just stating "The
$PROJECT developers" or only naming the original authors without mentioning
other major contributors.
Review comment:

What if we look at it another way? `authors` is not for accreditation, but for contacting a maintainer. In that case, what if we just rename `authors` to `maintainer`/`contact`?

Review comment (Member Author):

That's not the main reason why I'd like for this RFC to land. It's another effect that I personally think is positive, but it's more of a collateral benefit.

Review comment (Member):

If the authors want to be contactable, they can still provide contact details in the description/readme/homepage (I assume most maintainers will want to be contacted via their project's issue tracker, not random emails).

Review comment:

If authors is for contact information, you've got the same problem again – people's contact info changes in a manner completely unrelated to crate versions. Do you make a minor release when you change your email address? Things like this shouldn't even need to be in the version control, imo, because they're conceptually unlinked from the software.


# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

crates.io will allow publishing crates without the `package.authors` field, and
it will stop showing the contents of the field in its UI (the current owners
will still be shown). docs.rs will also replace that data with the crate
owners.

`cargo init` will stop pre-populating the field when running the command, and
it will not include the field at all in the default `Cargo.toml`. Crate authors
will still be able to manually include the field before publishing if they so
choose. Eventually Cargo will warn when publishing crates with the field set.

Crates that currently rely on the field being present (for example by reading
the `CARGO_PKG_AUTHORS` environment variable) will have to handle the field
being missing (for example by switching from the `env!` macro to
`option_env!`). Eventually they will have to provide a way to inline the
information in the crate's source code, if the consumer of the crate desires to
do so.
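
As a minimal sketch of the migration described above, a crate that previously
read the field with `env!` could switch to `option_env!` and treat a missing or
empty value as "no authors". The helper names `package_authors` and
`split_authors` are hypothetical and only serve this illustration; the sketch
assumes the value, when present, is the colon-separated list Cargo provides in
`CARGO_PKG_AUTHORS`.

```rust
/// Returns the authors recorded at compile time, or an empty list when
/// `CARGO_PKG_AUTHORS` is not set (hypothetical helper for illustration).
fn package_authors() -> Vec<&'static str> {
    // `option_env!` yields `None` instead of failing the build when the
    // variable is absent, unlike `env!`.
    split_authors(option_env!("CARGO_PKG_AUTHORS"))
}

/// Splits a colon-separated author list, ignoring empty entries
/// (hypothetical helper for illustration).
fn split_authors(raw: Option<&str>) -> Vec<&str> {
    raw.unwrap_or("")
        .split(':')
        .map(str::trim)
        .filter(|author| !author.is_empty())
        .collect()
}
```

Callers can then skip any "authors" output entirely when the returned list is
empty, rather than failing to compile.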

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

The implementation of this RFC spans multiple parts of the Rust project:

## Cargo

Cargo will stop fetching the current user's name and email address when running
`cargo init`, and it will not include the field in the default template for
`Cargo.toml`. Cargo will also treat the field as deprecated, eventually
Review comment (Member):

Does cargo currently warn when a non-existent field is present? Like, say, `[package] does_not_exist = 124`? If not, I don't think we should warn for `authors` either. Alternatively, we should start warning for all unknown or unused fields (which would probably be a change independent of this RFC).

Review comment (Member):

Cargo currently errors on nonexistent fields, IIRC.

Review comment (Member):

`cargo check` emits a warning, but `cargo publish` stays silent.

Review comment (Member):

Ah, my bad

Review comment (Member Author):

@nagisa if we decide to deprecate the field I still think cargo should warn when it's present, regardless of whether other fields are issuing warnings. The main goal of the warning would be to let the authors know that they can and probably should remove the authorship information they left in Cargo.toml.

displaying a deprecation warning when someone tries to publish a crate with the
field set.
Review comment:

Plus, this no longer requires the $USER variable to be set in cargo new. This is actually good news for docker image maintainers.


## crates.io

crates.io will allow publishing versions without the field and with the field
empty. The Web UI will remove the authors section, while retaining the current
owners section.

The API will continue returning the `authors` field in every endpoint which
currently includes it, but the field will always be empty (even if the crate
author manually adds data to it). The database dumps will also stop including
the field.
Review comment:

This seems to cause a superset of the problems caused by redacting authors in existing versions upon author's explicit request. Are you sure this is justified?

Review comment (Member Author):

As far as I'm aware there is no documented API endpoint that exposes the authorship information, and the database dumps are clearly marked as "experimental". Removing the information from there will mean we can delete it from the crates.io database.


## docs.rs

docs.rs will replace the authors with the current owners in its UI.

# Drawbacks
[drawbacks]: #drawbacks

Cargo currently provides author information to the crate via
`CARGO_PKG_AUTHORS`, and some crates (such as `clap`) use this information.
Deprecating the authors field will require crates currently using it to change,
such as by inlining the author data.
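
One way the inlining mentioned above could look, as a sketch only: the crate
keeps a hand-maintained constant in its own source and passes it to whatever
code formats its "about" text. The constant `AUTHORS`, its placeholder value,
and the `about_line` function are hypothetical names introduced for this
illustration; they are not part of Cargo, `clap`, or any other crate.

```rust
/// Hypothetical, hand-maintained author list kept in the crate's source
/// instead of being read from `CARGO_PKG_AUTHORS`.
const AUTHORS: &[&str] = &["The example-tool developers"];

/// Formats a one-line "about" string, omitting the author part entirely
/// when no authors are provided (hypothetical helper for illustration).
fn about_line(name: &str, version: &str, authors: &[&str]) -> String {
    if authors.is_empty() {
        format!("{} {}", name, version)
    } else {
        format!("{} {} by {}", name, version, authors.join(", "))
    }
}
```

Because the list lives in the source, it can be updated or emptied in any
later release without depending on the manifest field.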
Review comment:

What is the expected impact in the long term? If it is eventually removed, will backward compatibility for current packages using `$CARGO_PKG_AUTHORS` be broken?

If we don't intend to remove it in the long term, why deprecate (instead of remove) it at all?

Review comment (Member Author):

How to remove the field in the future is left as a future possibility. An approach I could see working is using the edition mechanism, but I think that's out of scope for this RFC.


This RFC will make it harder for third-party tools to query the author
information of crates published to crates.io.

By design, this RFC removes the ability to know historical crate authors. In
some cases, crate authors may have wanted that information preserved. After
this RFC, crate authors who want to display historical authors who are not
current crate owners will have to present that information in some other way.

# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

This RFC reduces the problems related to changing the names in the authors
field significantly, as people will now have to explicitly want to add that
data instead of it being there by default.

We could do nothing, but that would increase the support load of the crates.io
team and would result in more crates being removed from the registry due to
this issue.

# Prior art
[prior-art]: #prior-art

* **JavaScript:** `package.json` has an optional `author` field, and the
  interactive `npm init` command does not prepopulate it, leaving it empty by
  default. The npm Web UI does not show the contents of the field.
* **Python:** `setup.py` does not require the `author` field. The PyPI Web UI
  shows its contents when present.
* **Ruby:** `*.gemspec` requires the `authors` field, and the RubyGems Web UI
  shows its contents.
* **PHP:** `composer.json` has an optional `authors` field. The interactive
  `composer init` command lets you choose whether to pre-populate it based on
  the current environment or skip it. The Packagist Web UI does not show the
  contents of the field.

# Unresolved questions
[unresolved-questions]: #unresolved-questions

* What should we do about the metadata in already published crates?

# Future possibilities
[future-possibilities]: #future-possibilities

The `package.authors` field could be removed in a future edition.

A future RFC could propose separating metadata fields that could benefit from
being mutable out of `Cargo.toml` and the crate tarball, allowing them to be
changed without having to publish a new version. Such an RFC should also propose a
standardized way to update and distribute the extracted metadata.