Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decision: Separator mapping #2

Closed
Manishearth opened this issue Oct 15, 2020 · 8 comments
Closed

Decision: Separator mapping #2

Manishearth opened this issue Oct 15, 2020 · 8 comments

Comments

@Manishearth
Copy link
Owner

Manishearth commented Oct 15, 2020

Related: #1

The current RFC maps foo/bar to foo_bar in Rust code.

There are alternate mappings that can be selected

The proposal suggests mapping foo/bar to foo_bar, but as mentioned in the typosquatting section, this has problems. There may be other mappings that work out better:

  • foo::bar (see section above)
  • foo::crate::bar
  • foo::/bar
  • ~foo::bar

and the like.

@Manishearth
Copy link
Owner Author

I am personally a huge fan of mapping to _ because it does not require any additional changes to rustc and makes this a purely crates/cargo-side feature

@Manishearth Manishearth changed the title Separator mapping Decision: Separator mapping Oct 15, 2020
@carols10cents
Copy link

carols10cents commented Oct 16, 2020

The typosquatting section referenced:

Dash typosquatting

This proposal does not prevent anyone from taking foo-bar after you publish foo/bar. Given that the Rust crate import syntax for foo/bar is foo_bar, same as foo-bar, it's totally possible for a user to accidentally type foo-bar in Cargo.toml instead of foo/bar, and pull in the wrong, squatted, crate.

We currently prevent foo-bar and foo_bar from existing at the same time. We could do this here as well, but it would only go in one direction: if foo/bar exists foo-bar/foo_bar cannot be published, but not vice versa. This limits the "damage" to cases where someone pre-squats foo-bar before you publish foo/bar, and the damage can be mitigated by checking to see if such a clashing crate exists when publishing, if you actually care about this attack vector. There are some tradeoffs there that we would have to explore.

One thing that could mitigate foo/bar mapping to the potentially ambiguous foo_bar is using something like foo::crate::bar or ~foo::bar or foo::/bar in the import syntax.

I was confused about the multiple negatives and slashes in this sentence, especially when I lost the markdown formatting the first time I copied it (I've fixed it now):

if foo/bar exists foo-bar/foo_bar cannot be published, but not vice versa.

To clarify, I would remove the / and spell out the "vice versa" and the reasoning so that this says:

- if `foo/bar` exists `foo-bar`/`foo_bar` cannot be published, but not vice versa.
+ if `foo/bar` exists, neither `foo-bar` nor `foo_bar` will be allowed to be published. However, 
+ if `foo-bar` or `foo_bar` exist, we would choose to allow `foo/bar` to be 
+ published, because we don't want to limit the use of names within a crate namespace 
+ due to crates outside the namespace existing.

Have I understood correctly?

To put this another way, this would mean crates.io could have the crates serde_gelf and serde/gelf (if serde_gelf exists first, which it would). If for some reason you wanted to depend on both, your Cargo.toml would contain:

[dependencies]
serde_gelf = "0.1.6"
"serde/gelf" = "1.0.0"

and to bring items from each of these into scope, your code would need:

use serde_gelf::stuff; // brings in from `serde_gelf`
use serde_gelf as serde_somethingelse_gelf; // need to choose a new name for one of them
use serde_somethingelse_gelf::other_stuff; // can then bring more names into scope with the alias

Seems... non-ambiguous to the compiler but potentially a little annoying for the programmer, but probably not common?

Manishearth added a commit that referenced this issue Oct 16, 2020
@Manishearth
Copy link
Owner Author

Have I understood correctly?

Correct! I've updated the RFC. The intent is to make sure that foo- does not become a de facto namespace.

use serde_gelf as serde_somethingelse_gelf; // need to choose a new name for one of them

This would have to be renamed in Cargo.toml, otherwise both would map as serde_gelf and Cargo would have to throw an error. Cargo lets you do this already.

(Worth noting, it's already possible to publish multiple crates that have the same lib name, so Cargo needs to deal with this already)

Seems... non-ambiguous to the compiler but potentially a little annoying for the programmer, but probably not common?

Correct, however someone could maliciously do this to confuse people, e.g. by publishing serde-gelf containing a bitcoin miner, whereas serde/gelf is the Actual Good crate.

@carols10cents
Copy link

This would have to be renamed in Cargo.toml

Ahhh right.

Correct, however someone could maliciously do this to confuse people, e.g. by publishing serde-gelf containing a bitcoin miner, whereas serde/gelf is the Actual Good crate.

LOL it's really difficult to consider this and #1 and #3 separately... hmmm....

@Manishearth
Copy link
Owner Author

Yeah to be clear discussion for that should probably belong on #3, but we might need to merge things.

@pksunkara
Copy link

Keeping in mind that we also solve the problem of typosquatting mentioned in #3, I have a proposal:

Let's make the mapping __ (double underscore). So, foo/bar would map to foo__bar in Rust code.

use foo::bar; // mod `bar` of crate `foo`
use foo_bar; // crate `foo_bar`/`foo-bar`
use foo__bar; // crate `foo/bar`

// We can even expand
use actix__web__middleware__cors; // crate `actix/web/middleware/cors`

// We can even combine
use foo__bar_baz; // crate `foo/bar-baz`

There is a relevant issue that proposes the banning of --/__ in crate name which we could implement first while we work on namespacing.

These are the only crates that currently use either of them:

$ rg --files -l -g "*__*" .
./te/ar/teardown_tree___treap
./ir/ef/iref__bbqsrc
./a4/2-/a42-__-
$ rg --files -l -g "*--*" .
./xn/--/xn--ls8h
./co/ns/const_env_impl--value
./co/ns/const_env--value

@pitaj
Copy link

pitaj commented Mar 10, 2022

I like the double underscore idea, since it maps to the same number of characters.

I think it provides a clear path forward:

  • for backwards compatibility with older editions and whatnot, have cargo map ns::crt to ns__ct
  • if future rustc chooses to support ns::crt for namespaced crates, cargo can add a identity map as well, ns::crt to ns::crt

This encapsulates the cargo side from the rustc side and allows for an easy upgrade path.

Manishearth added a commit to Manishearth/rfcs that referenced this issue Mar 25, 2022
@Manishearth
Copy link
Owner Author

Fixed in Manishearth/rfcs@1986c3d, we have switched to ::

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants