Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New #[pkgid]-based library naming is problematic #10922

Closed
lilyball opened this issue Dec 11, 2013 · 21 comments
Closed

New #[pkgid]-based library naming is problematic #10922

lilyball opened this issue Dec 11, 2013 · 21 comments

Comments

@lilyball
Copy link
Contributor

This change to using #[pkgid] is causing a serious problem. Library names are now completely different. And I don't mean the hash. I mean the actual name on-disk. Not only that, but libraries without a #[pkgid] will potentially be named completely incorrectly and collide on-disk.

The library name on disk is lib<name>-<hash>.... Previously, <name> came from the link args, so #[link(name="foo")]; produced libfoo-<hash>.... Now, it's apparently based off of the last path component of the #[pkgid]. So if my library doesn't have a #[pkgid], and the root file is named lib.rs, then I'll end up with liblib-<hash>.... And since the hash is also based off of the #[pkgid], then every single non-#[pkgid] library with the same root source file will result in the same full filename (as both the library name and the hash are identical). That's really bad.

If I go ahead and add a #[pkgid], and use the recommended approach of using my github path as the pkgid, then I end up with a library that's potentially named incorrectly. For example, my github.com/kballard/rust-lua library wants to produce liblua-<hash>... but instead with the #[pkgid] it's now librust-lua-<hash>....

@metajack
Copy link
Contributor

This is working as designed. Whereas you were controlling the library's name with #[link(name="foo")]; before, you must do this with #[pkgid="foo"]; now.

The rust compiler used to infer the name "liblib-HASH-VERSION" for lib.rs, and does so still, so that behavior hasn't changed at all. I think in this case even the hash ids would have collided assuming the libs didn't have dependencies.

You can't have the library be called something different than the package id if we want rustpkg to work in its current form. It uses the pkgid to know where to get what it needs to build. So if you have github.com/kballard/rust-lua it is going to look for librust-lua at that location. You can control the name at the point of use by using extern mod lua = "github.com/kballard/rust-lua";

@brson
Copy link
Contributor

brson commented Dec 11, 2013

I see how this is troublesome, and as this is going to effect everybody I should have thought through how to transition existing packages to this system.

At one point we discussed teaching the compiler to cope with the common repo conventions of "rust-foo" and "foo-rs", but even if we did that, forcing the library to have the same name as the github repo seems limiting (I don't think we've ever resolved the issue of how to have multiple libs in a single repo).

Does anybody have any ideas about what we should do here?

@lilyball
Copy link
Contributor Author

@metajack: Your own blog post recommends #[pkgid] be of the form github.com/kballard/rust-lua#0.1 (assuming that's the correct URL, of course). The problem with this is my library is supposed to be called lua, not rust-lua. This is a problem that rustpkg needs to fix, not one that needs to be enshrined as a core language "feature". And no, requiring all users to say extern mod lua = "github.com/kballard/rust-lua" is not a good fix.

The standard way to deal with this in Go is to have the actual library be one level deeper in the repo, i.e. github.com/kballard/go-lua/lua (note: this project does not actually exist). AFAIK, rustpkg still does not support doing this.

In any case, there are two issues I have with this current implementation:

  1. Libraries that don't declare #[pkgid] are now completely screwed. Assuming a common convention of lib.rs, every single library that doesn't have #[pkgid] now hashes to the same liblib-9ed81b85-0.0.<ext>. That's a serious collision issue. The only libraries that are immune are ones that use a custom filename for the root file.
  2. Libraries that do declare #[pkgid] must now choose between using a semantically-meaningful #[pkgid], such as the one suggested by your blog post that looks like github.com/kballard/rust-lua, or one that produces the desired library name.

I do not understand why this change was made. I mean, I fully understand the desire to make hashes computable externally instead of relying on an obscure (from the perspective of external tools) hashing algorithm operating over an unknown collection of metadata. But it seems like that could be accomplished without also making the library name dependent on this same piece of metadata. What's wrong with still using #[link(name="foo")] (or is it #[link(package_id="foo")]? I have both set to the same string) to produce the actual library name and then the #[pkgid] for the hash?

@lilyball
Copy link
Contributor Author

@brson: I'm not clear on exactly what limitations rustpkg has for figuring out the library name/hash given a URL. Surely it can't require just the URL, or the version would cause an issue. But assuming the path is still important, the same solution that Go uses could be employed here, in that my library would become github.com/kballard/rust-lua/lua. This requires rustpkg be able to support this sort of thing, though.

That said, I think that making rustpkg happy should not impact rust as a whole. I should be able to use rust without having to worry about rustpkg if I don't want to. And by that I mean I should be able to continue using github.com/kballard/rust-lua as the location for my library, if I don't care about rustpkg compatibility (and at the moment I don't as I rely on a custom Makefile to build my library). I like the idea of using the canonical library location + version as the unique identifier that is used for the hash, but I don't think it should affect the library name. The library name should be computed from link attributes (is it #[link(name="foo")] or #[link(package_id="foo")]? That's never been clear to me), falling back to the root source filename if there are no link attributes (or maybe falling back to the pkgid path if there are no link attributes, thus simplifying the common case of a rustpkg-compatible library).

With this approach, all existing libraries would keep their existing name, and only have their hash change (which is perfectly acceptable). And libraries that ditch link attributes and use just #[pkgid], which I'm assuming was the idea here, would end up with the same name as they do with @metajack's change.

@metajack
Copy link
Contributor

Libraries that don't declare their name in lib.rs have always been screwed. That behavior hasn't changed. What has changed is that libraries that used to declare arbitrary names using link metadata now must do this with a different attribute.

There is nothing preventing you from using #[pkgid="lua"];. It won't work with rustpkg, but it will generate the library name you want.

rustpkg also didn't support link metadata, so nothing has really changed here with the choice between using semantically meaningful things and not. The semantically meaningful name is meaningful for use by rustpkg. If you can't use rustpkg, then what that name is seems irrelevant. Since you were using link metadata before, I assume you were not using rustpkg.

I am not sure what the best thing to do is; this seemed to be the most logical way to improve from where we are. I probably agree with you about rustpkg's limitations. This patch does not try to change or address those.

The only semantic difference introduced by this patch is that the crate hash no longer depends metadata from the crates dependencies. That can lead to more collisions, but how many more is not obvious to me.

Is the issue just that you wish link metadata was now disallowed?

@metajack
Copy link
Contributor

Just to be clear, the reason I didn't disallow link metadata was because I figured it would cause the compile to fail to bootstrap.

@lilyball
Copy link
Contributor Author

@metajack: I think you're completely missing the point. Actually, missing two very important points.

The first is that all existing libraries are now screwed, unless they happened to use their library name as the name for the root rust file. Some libraries do, but a lot don't (and I believe the current recommended convention is to use lib.rs). Every library that did not happen to use its own name as the root file now has the completely wrong name, and is at serious risk of name collision.

The second is that #[pkgid="lua"] is a terrible solution. That will lead to a name collision with any other library that uses the same #[pkgid="lua"]. Using a unique package id is an extremely good idea, regardless of whether you care about rustpkg compatibility, in order to get a unique hash. The best way of doing this is using the canonical project location, as that's a string that you can reasonably guarantee will be unique.

In fact, the only reasonable solution I have available to me with this patch is to use a name that's almost my project repo, but with an extra path component at the end containing the real name of my library. This is lying about the package id just to satisfy the rules for library naming, and it feels completely unacceptable to me. The only reason why I should ever use github.com/kballard/rust-lua/lua as my package id is if I actually move the library into the lua subfolder of my project! But besides the fact that I don't want to do that, my understanding is that this will break rustpkg compatibility (not that rust-lua is compatible, but other libraries are) because rustpkg doesn't know how to deal with a library that's in a subfolder.

See my response to @brson for how I wish this to work. Link metadata in order to produce a library name seems like the right approach. #[pkgid] in order to produce the library hash seems like the right approach as well, along with a strong recommendation to use a unique string (e.g. project path#version) as the id. The problem is using #[pkgid] to produce the library name, as that just seems unusable.

@metajack
Copy link
Contributor

Surely it can't require just the URL, or the version would cause an issue.

It's not a url since it doesn't start with a protocol, and it's required to be relative (no beginning /). The version is taken as what appears after #. rustpkg has a fair bit of logic to infer and deal with these when dealing with git repos for example.

@lilyball
Copy link
Contributor Author

@metajack: Surely you know what I meant by "URL" in that sentence. Does rustpkg allow you to use "#vers" in the import line? And what does it do if you don't specify a version? That's what I meant when I was asking how rustpkg deals with this. Saying extern mod lua = "github.com/kballard/rust-lua/lua" doesn't provide enough information to compute a library hash, as it's missing the version (and it's assuming the pkgid is the same as the import string). I don't think this question is particularly relevant to the issue at hand though.

@metajack
Copy link
Contributor

Yes, rustpkg will downoad and build the correct version and allows it to be specified in the extern mod statement. If you don't specify one it just builds master. Or if you already have it downloaded, it just uses what you have.

If you leave out a version from the pkgid, it is infered as 0.0, but the crate hash will be distinct from one with 0.0. I'm not sure if that's a huge issue. I wanted it to be simple to compute without logic for defaults and canonicalization.

@metajack
Copy link
Contributor

In IRC we came up with a potential solution which is to use #[pkgid="github.com/kballard/rust-lua;lua#1.0"]; to specify the library's name independent of its repo. @kballard says this plus making old link metadata a compile error would be satisfactory to him.

@lilyball
Copy link
Contributor Author

@metajack: If your worries regarding making link metadata a compile error are still valid, then making it a warning would be better than nothing.

@metajack
Copy link
Contributor

I saw that @alexcrichton started a new snapshot that has pkgid in it. So we can probably do it once that is done. Maybe it is already :)

@pnkfelix
Copy link
Member

cc me

@metajack
Copy link
Contributor

Current best candidate is #[pkgid="github.com/kballard/rust-lua#lua:1.0"];

Using the ; in the path part changes the behavior of web servers potentially (so does ?). Using the fragment is a bit safer.

@brson
Copy link
Contributor

brson commented Dec 12, 2013

@kballard Existing packages are 'screwed' yes, because link no longer does anything, but I think this has largely been a communication issue, since we didn't make any announcement telling people how to transition. The pkgid attribute provides mostly the same metadata as the link attribute did, and replacing link with pkgid will continue to behave the same, even if the pkgid doesn't agree with the URL of the repo.

I'm considering though whether we should revert this and proceed more carefully.

@brson
Copy link
Contributor

brson commented Dec 12, 2013

We could revert this change and just make a smaller, simpler change that just makes the hashes stable (add a command line flag to compute them), then regroup when we figure out what we're going to do with rustpkg. The link attribute is destined to disappear though because it conflicts with other uses of link as an attribute.

@lilyball
Copy link
Contributor Author

@brson: I think we came up with a decent compromise in IRC today, which is what @metajack summarized here. The problem with existing libraries is more that the link attributes weren't disallowed, which meant existing libraries would compile without any errors or warnings, but would silently produce an unexpected library name. If we disallow the existing link attributes, and then modify #[pkgid] the way @metajack has summarized, that would fix both issues I raised.

@lilyball
Copy link
Contributor Author

@metajack: There's been no activity for 3 days, so it seems nobody has any objections to the listed best candidate.

@metajack
Copy link
Contributor

I am running make check on a new pkgid parser as I type this. PR incoming shortly.

bors added a commit that referenced this issue Dec 17, 2013
This change extends the pkgid attribute to allow of explicit crate names, instead of always inferring them based on the path. This means that if your GitHub repo is called `rust-foo`, you can have your pkgid set your library name to `foo`. You'd do this with a pkgid attribute like `github.com/somewhere/rust-foo#foo:1.0`.

This is half of the fix for #10922.
@lilyball
Copy link
Contributor Author

lilyball commented Jan 7, 2014

I believe this is now completely resolved.

@lilyball lilyball closed this as completed Jan 7, 2014
flip1995 pushed a commit to flip1995/rust that referenced this issue Jun 30, 2023
[`redundant_closure_call`]: handle nested closures

Fixes rust-lang#9956.

This ended up being a much larger change than I'd thought, and I ended up having to pretty much rewrite it as a late lint pass, because it needs access to certain things that I don't think are available in early lint passes (e.g. getting the parent expr). I think this'll be required to fi-x rust-lang#10922 anyway, so this is probably fine.
(edit: had to write "fi-x" because "fix" makes github think that this PR fixes it, which it doesn't 😅 )

Previously, it would suggest changing `(|| || 42)()()` to `|| 42()`, which is a type error (it needs parens: `(|| 42)()`). In my opinion, though, the suggested fix should have really been `42`, so that's what this PR changes.

changelog: [`redundant_closure_call`]: handle nested closures and rewrite as a late lint pass
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants