Implement debuginfo path remapping #38348

Open
wants to merge 1 commit into
from

Projects

None yet

10 participants

@sanxiyn
Member
sanxiyn commented Dec 13, 2016

Fix #38322.

@rust-highfive
Collaborator

r? @pnkfelix

(rust_highfive has picked a reviewer for you, use r? to override)

@alexcrichton alexcrichton removed their assignment Dec 14, 2016
@alexcrichton
Member

(note that I'd like to discuss in tools triage tmw, but seems good to me)

@michaelwoerister
Contributor

Yes, let's discuss it at the meeting.

@brson
Contributor
brson commented Dec 14, 2016

Seems like functionality we need. Thanks @sanxiyn!

Can this be used to let us create seperate rust-dbg packages in rustup? Could we put e.g. std in sysroot/lib/std.rlib and the debuginfo in sysroot/lib/std.debuginfo?

Is this unstable?

Can it be tested? No test cases here.

@brson
Contributor
brson commented Dec 14, 2016

Can we get negative test cases too, for when this feature isn't supported? Does it e.g. work on windows?

@michaelwoerister
Contributor

Can this be used to let us create seperate rust-dbg packages in rustup? Could we put e.g. std in sysroot/lib/std.rlib and the debuginfo in sysroot/lib/std.debuginfo?

Separate debuginfo is a different problem. GDB has ways of handling this: https://www.sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html
We should take advantage of that eventually.

@michaelwoerister
Contributor

@vadimcn suggested : instead of = as separator character, which I personally also like more. That's also how we separate paths for another commandline option (which was it, @vadimcn?)

@michaelwoerister
Contributor

Here is a summary of what we discussed in the tools meeting:

  • We want to support this functionality.
  • We would prefer : over = as the path separator.
  • The feature should be implemented as a -Z flag so we get a testing period before it becomes stable.
  • Once it's implemented as unstable, we'd like to get feedback if it actually solved people's problems.

A usability concerns that was raised:

  • The path where Cargo will place the source code of downloaded dependencies is not really predictable. So in and of itself supporting this in the compiler might not be enough to make this feature usable.

Also, as @brson said, there should be an auto-test for this. A codegen test that checks the generated LLVM IR would probably work best.

@sanxiyn Are you still up to this?

@vadimcn
Contributor
vadimcn commented Dec 15, 2016

Also, I wanted to note that we should be careful if the prefix includes more than just the root source directory. I am finding stuff like this (below) in debug info of a Rust binary. A naive matching of /Users/rustbuild/src/rust-buildbot/slave/nightly-dist-rustc-mac/build/src/libcore would probably fail to find these?

AT_name( "./../src/libstd/lib.rs" )
AT_stmt_list( 0x000000e3 )
AT_comp_dir( "/Users/rustbuild/src/rust-buildbot/slave/nightly-dist-rustc-mac/build/obj" )

and

AT_call_file( "/Users/rustbuild/src/rust-buildbot/slave/nightly-dist-rustc-mac/build/obj/../src/libcore/cell.rs" )
@vadimcn
Contributor
vadimcn commented Dec 15, 2016

@vadimcn suggested : instead of = as separator character, which I personally also like more. That's also how we separate paths for another commandline option (which was it, @vadimcn?)

13477c7#diff-8ad3595966bf31a87e30e1c585628363R1457

@jmesmon
Contributor
jmesmon commented Dec 20, 2016

13477c7#diff-8ad3595966bf31a87e30e1c585628363R1457

Parse string of the form "[KIND=]lib[:new_name]",

Are those actually paths? They look like library names, which are typically more restricted. As I've noted in the issue, to actually be fool-proof here we need to allow each path to be a separate argument.

@infinity0
Contributor

Does this mean paths that contain : themselves won't be able to be remapped-from or remapped-to? Or is there an escaping mechanism planned?

The reason I ask is because we at the reproducible builds project are in the process of specifying the format of a standardised environment variable for this, that different compilers can all support, and we are trying to decide (e.g.) what to use as a separator characters. It would be good to get many different compiler writers' opinions on this. We also have the constraint that multiple mappings would have to be joined up into a single string, for this envvar.

@vadimcn
Contributor
vadimcn commented Dec 29, 2016

Yeah, looks like we forgot about Windows drive letters :(

In today's Tools Team meeting we've discussed several options (none of which is particularly appealing):

  • go back to the '=' separator (cons: '=' may occur in file names),
  • put the "NEW" part into a separate command line argument: -Zdebug-prefix-map=C:\\foo D:\\bar (cons: inconsistent with every other rustc parameter),
  • quoting (but it'd also need escaping and shouldn't use the same characters as the shell),
  • load the map from a .json file, whose name is given on the command line (actually, this isn't bad, aside from implementation complexity).

A couple more that occurred to me later:

  • escape the separator character by doubling it up: -Zdebug-prefix-map=C::\\foo:D::\\bar (cons: surprising to users, who must remember to run paths through string replacement when generating command line programmatically (oh, and for extra fun, both ':' and '=' are used in Windows shell string substitution syntax! ))
  • just be careful with drive letters. They follow a pretty specific pattern: <letter> ':' '\'. We can detect and avoid splitting on colons following a letter and followed by a backslash. With that in mind, strings like -Zdebug-prefix-map=C:\\foo:D:\\bar can be split unambiguously, and stuff like -Zdebug-prefix-map=C:\\foo:\\bar, i.e. OLD=C:\foo, NEW=\bar, is unlikely to be used in practice.
@michaelwoerister
Contributor

We should probably just avoid getting into the whole cross-platform shell/filename escaping business. I'd be for either going the json-file route or just allow an arbitrary number of -Zdebug-prefix-map-from=<path> / -Zdebug-prefix-map-to=<path> pairs which are matched up by their position in the command. (I know that I've argued against position-dependent arguments in the past, but given the alternatives they seem like an OK option)

@michaelwoerister
Contributor

@vadimcn

Also, I wanted to note that we should be careful if the prefix includes more than just the root source directory.

That's a very good point.

@infinity0
Contributor

I still think it's fine just having one argument -Zdebug-prefix-map=A=B with A=B being split at the final = so that A can be arbitrary but B can't contain a =. I think the reduced complexity is worth the loss of function. I haven't seen anyone propose any use-case (even an unreasonable one) where B must be able to contain a =. The consistency with GCC also helps to reinforce this idea globally.

If this is still not convincing to the Rust team, then I suggest that instead of doing complex position-based logic, you define -Zdebug-prefix-map-from= and -Zdebug-prefix-map-to= to be lists that must be equal length, then zip them together. That would be easier to parse and reason about.

It would allow things like -Zdpm-from=A1 -Zdpm-from=A2 -Zdpm-to=B1 -Zdpm-to=B2 but I don't think that's so bad in practice.

@michaelwoerister
Contributor

It would allow things like -Zdpm-from=A1 -Zdpm-from=A2 -Zdpm-to=B1 -Zdpm-to=B2 but I don't think that's so bad in practice.

That's pretty much what I had in mind.

@jsgf
Contributor
jsgf commented Jan 14, 2017

A couple of points:

  • Are debug maps defined at non-directory boundaries? Ie, would a mapping of foo=bar remap foo-src/lib.rs -> bar-src/lib.rs? It looks like the current implementation would do this, but that seems like it could result in unexpected remappings.
  • I'm a bit concerned about not allowing = (or any other separator character) in a path. I'm also don't like introducing any new layers of quoting. One solution that occurred to me was to default to (say) =, but add -Z remap-separator=x, where x is the separator character. This could be specified multiple times, and takes effect for all following parameters until its replaced. Still ugly, but somewhat like separators in MIME.
@infinity0
Contributor
infinity0 commented Jan 15, 2017 edited

It looks like the current implementation would do this [..]

Yes, and this is the same behaviour as GCC. If one wanted foo-src to map to something else (including itself, i.e. no change), one can just add another mapping for foo-src that appears later in the list.

This should be fine - if both foo and foo-src are going to be compiled in the same compilation unit, there must exist some program that knows about both of them, to be able to set these flags. Otherwise, one of them is not going to need to be remapped anyways. I think it's not worth the complexity to add extra logic in there.

At the Reproducible Builds project we have some interest in getting multiple build tools to adopt the same approach (roughly what's been described here, and what GCC does), and adding this complexity would likely hinder adoption.

@jsgf
Contributor
jsgf commented Jan 16, 2017 edited

100% OK with matching gcc behaviour. I guess the user could include the '/' if they wanted to enforce matching directory boundaries.

+ for (old, new) in sess.opts.debug_prefix_map.clone() {
+ if path.starts_with(&old) {
+ return new + &path[old.len()..];
+ }
@jsgf
jsgf Jan 16, 2017 Contributor

It seems unnecessary to clone the whole map first; it would be better to iterate references, then only clone the "new" once we've found a match.

@@ -669,7 +678,10 @@ pub fn file_metadata(cx: &CrateContext, path: &str, full_path: &Option<String>)
}
});
- file_metadata_(cx, path, file_name, &work_dir)
+ let sess = cx.sess();
+ let remap_file_name = remap_path(sess, file_name);
@jsgf
jsgf Jan 16, 2017 Contributor

This ends up matching against a full path rather than the path passed on the command line. The let file_name expression above ends up using the full_path parameter if its present (which it always appears to be) rather than the relative path (which it goes to some lengths to make relative to work_dir). I'm not sure what the larger intent is here, but the effect is that this doesn't work as expected.

It looks like this was done deliberately in 24e7491 by @luser. This suggests that path remapping should be done in absolute space - though the code is pretty inconsistent. If full_path is None, then I think the resulting path will be relative.

@@ -253,6 +253,7 @@ top_level_options!(
// Include the debug_assertions flag into dependency tracking, since it
// can influence whether overflow checks are done or not.
debug_assertions: bool [TRACKED],
+ debug_prefix_map: Option<(String, String)> [TRACKED],
@jsgf
jsgf Jan 16, 2017 Contributor

This only allows a single remapping. It needs to allow an arbitrary number of remappings to be useful.

@alexcrichton
Member

I haven't been following this too closely, but was a conclusion of how to implement this reached? (just triaging some old PRs)

@michaelwoerister
Contributor

@alexcrichton No, not yet, mostly due to my being busy with other things and not properly following up over at #38322. I have an idea what it could look like, I just need to write it up and confirm that it is what people except.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment