
Dump results of analysis phase as CSV (DXR support in Rust) #13222

Merged 1 commit into rust-lang:master on Jun 13, 2014

Conversation

@nrc (Member) commented Mar 31, 2014

Adds a -Z flag, save-analysis, which runs after the analysis phase of the compiler and saves a bunch of info into a CSV file for the crate. This is designed to work with the DXR code browser, but is frontend-independent; that is, this info should be useful for all kinds of code browsers, IDEs, or other tools.

I need to squash commits before landing (there will probably be a fair few to come), please ignore them for now and just comment on the changes.
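
For reference, a minimal invocation sketch (assuming a single-file crate; the output location and exact CSV schema are whatever this patch implements):

```console
$ rustc -Z save-analysis foo.rs
```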

// option. This file may not be copied, modified, or distributed
// except according to those terms.

// Output a CSV file containing the output from rustc's analysis. The data is
Review comment (Member):

This could be a doc comment with //!.
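
That is, something like the following (a sketch of the suggested change to the header comment):

```rust
//! Output a CSV file containing the output from rustc's analysis. The data is
//! ...
```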

@huonw (Member) commented Mar 31, 2014

(Do you happen to have a DXR instance displaying the output of this?)

@nrc (Member, Author) commented Mar 31, 2014

@huonw I'm working on it :-) Need to land the changes to DXR and then set up the instance, hopefully later this week, all being well. I hope I can get it up soon so you can appreciate the joy this patch will bring :-)

@alexcrichton (Member) commented:

It seems unfortunate to bake this support directly into the compiler itself. Most usage of rustc won't be looking for DXR support.

A suggestion by @thestinger on IRC was to put this support in a separate tool inside of the rust repo if maintenance outside of the repo is too difficult.

@nrc (Member, Author) commented Mar 31, 2014

I fear I have been selling this badly - please think of this not as DXR support, but as something like dumping the JSON representation of the AST, only with semantic info too (and in CSV format, not JSON). There is nothing actually DXR-specific here, and if there is by mistake, then I should change it. DXR is just one way to use this info.

That said, I wouldn't mind moving this into a different crate if you think that is a better place for it.

@nrc changed the title from "DXR support in Rust" to "Dump results of analysis phase as CSV (DXR support in Rust)" on Mar 31, 2014
@brson (Contributor) commented Apr 1, 2014

What relationship does this have to the analysis already included in the crate metadata?

@nikomatsakis (Contributor) commented:

I'd love to see this support land somewhere. I don't have a strong opinion about whether it lands in rustc or as an external tool. In principle, it'd be nice if rustc included a nice API for accessing this sort of data in a reliable way -- but that's a long way off. I guess having semantic tools like dxr and rustdoc in tree at least gives us a clear idea of what the requirements are.

@nrc (Member, Author) commented Apr 1, 2014

@brson - there is some overlap - if I knew then what I know now, I would have leveraged some of the metadata code for this. But it is not duplication: in particular, this info includes information about expressions in function bodies, which metadata misses, and it has more detail about intermediate forms. E.g., in foo::bar::baz, metadata only contains info about baz, whereas here we have info on all three idents (although not as much as I would like).

@nikomatsakis Just to clarify, most of DXR is out of tree (in the DXR repo), what is here is just a dump of analysis info in CSV format.

@brson (Contributor) commented Apr 1, 2014

This type of thing is fine with me I think as long as it doesn't impose much maintenance burden.

@brson (Contributor) commented Apr 12, 2014

Let's move this forward. It's very self-contained. No big deal.

@nick29581 I agree with Huon about the doc comment. Can you convert it? The issue with match is very subjective.

With those 'rebase' commits I'm guessing the intermediate commits are broken. The history here looks a bit messy anyway, so can you rewrite it into one or a few commits?

// Return all non-empty prefixes of a path.
// For each prefix, we return the span for the last segment in the prefix and
// a str representation of the entire prefix.
fn process_path_prefixes(&self, path: &ast::Path) -> ~[(Span, ~str)] {
Review comment (Member):

These ~[] should all be Vec.
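
For illustration, a self-contained sketch of the prefix computation using Vec (modern Rust with plain strings standing in for ast::Path segments and Spans; names here are illustrative only, not the PR's actual code):

```rust
// Return all non-empty prefixes of a "path" given as segments.
fn path_prefixes(segments: &[&str]) -> Vec<(usize, String)> {
    let mut prefixes = Vec::new();
    let mut prefix = String::new();
    for (i, seg) in segments.iter().enumerate() {
        if i > 0 {
            prefix.push_str("::");
        }
        prefix.push_str(seg);
        // (index of the last segment in this prefix, the whole prefix as a string)
        prefixes.push((i, prefix.clone()));
    }
    prefixes
}

fn main() {
    assert_eq!(
        path_prefixes(&["foo", "bar", "baz"]),
        vec![
            (0, "foo".to_string()),
            (1, "foo::bar".to_string()),
            (2, "foo::bar::baz".to_string()),
        ]
    );
}
```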


match fs::mkdir_recursive(&root_path, io::UserRWX) {
    Err(_) => {
        sess.bug(format!("Could not create directory {}",
Review comment (Member):

I think this is just an error (not a bug), since it's almost certainly a problem on the user's side. Also, this should print the error too, e.g.:

Err(e) => sess.err(format!("Could not create directory {}: {}",
                           root_path.display(), e).as_slice())
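
In current Rust the same idea would look roughly like this (a sketch only; the 2014-era APIs fs::mkdir_recursive and sess.err differ): report a failed directory creation as a user-facing error that includes the underlying OS error, rather than an internal compiler bug.

```rust
use std::fs;
use std::path::Path;

// Sketch: surface the OS error to the user instead of calling it a compiler bug.
fn ensure_output_dir(root_path: &Path) -> Result<(), String> {
    fs::create_dir_all(root_path)
        .map_err(|e| format!("Could not create directory {}: {}", root_path.display(), e))
}

fn main() {
    if let Err(msg) = ensure_output_dir(Path::new("target/save-analysis")) {
        eprintln!("error: {}", msg);
    }
}
```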

@huonw (Member) commented May 25, 2014

I'm in favour of this, and I think the patch can be landed (modulo a few final nits) and then iteratively improved/extended/refactored if necessary, especially since it's so self-contained. However, it's quite large so someone else might have another suggestion for how to handle it.

I would be super-interested in seeing a regularly updated version of this hosted somewhere (as part of the nightly build? I guess that adds yet another dependency (dxr) to the whole process. :( ). Presumably that would be assisted by targets in the makefile too, e.g. imitating make check like:

make source-analysis
make source-analysis-stage2
make source-analysis-stage2-std
make source-analysis-stage2-rustc
# etc.

(seems fine to postpone.)

@nrc (Member, Author) commented May 27, 2014

@huonw the plan is that Mozilla (web eng) will host an index of Rust and Servo which will be re-indexed daily. It's going to take another few weeks though. Do you also want to be able to download the raw data? We could probably arrange that.

Given that indexing is pretty much orthogonal to the make target, is there any advantage to having a make target for this as well as a -Z flag?

@nrc (Member, Author) commented May 27, 2014

Rebased and addressed @huonw's comments.

@huonw, @brson r? (@brson - @huonw wanted a second opinion on landing this, see #13222 (comment))

@huonw (Member) commented May 27, 2014

Awesome!

I was thinking the make target would just run whatever the rules are for re-indexing, i.e. running rustc -Z save-analysis on each crate. This means that the rule naturally tracks compiler changes & library additions/removals (because the Makefiles have full knowledge of all crates) and that others can easily get the analysis if they wish.

(I personally have no particular use for the raw data, but I imagine some people might. shrug)

r=me assuming there's agreement for landing this.

@nrc (Member, Author) commented May 27, 2014

@huonw couldn't you just use the RUSTFLAGS env var for that? For the DXR index run, I just do a normal make with RUSTFLAGS_STAGE2 += -Zsave-analysis.
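
In other words, something like this in the build configuration (a sketch; exactly where RUSTFLAGS_STAGE2 gets set depends on the Makefile setup of the time):

```make
# Append the flag for stage2 crates so a normal build also emits the CSV data.
RUSTFLAGS_STAGE2 += -Z save-analysis
```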

@huonw (Member) commented May 27, 2014

Oh, I guess so... seems a little bit of a hack, but I suppose it works fine.

@nrc (Member, Author) commented Jun 1, 2014

I don't know what this means: recipe for target `i686-pc-mingw32/stage1/bin/rustlib/i686-pc-mingw32/lib/stamp.rustc' failed

There are no errors I can see in the output, so I don't see why this is failing - is this just Windows build stupidity?

@huonw (Member) commented Jun 1, 2014

Weird. Let's just retry...

@klutzy (Contributor) commented Jun 1, 2014

(my win32 machine compiled this pull request well)

@alexcrichton (Member) commented:

An error like that normally means that the windows bots segfaulted while building. If you push to try you'll get a similar environment which should have the same results, and I can help you see what the stack trace is at the time of the segfault.

@alexcrichton (Member) commented:

It looks like #14604 has suffered a similar build problem which I am currently investigating. It looks like it also requires rustc itself to be built without optimizations.

@alexcrichton (Member) commented:

I looked into the failure cause of #14604, and it appears to fail because of OOM. Perhaps this patch is triggering a large number of allocations?

@alexcrichton (Member) commented:

Note that I have un-gated the builder that this PR failed on (#14791), so a rebase will likely allow this to land.

Adds the option -Zsave-analysis which will dump the results of syntax and type checking into CSV files. These can be interpreted by tools such as DXR to provide semantic information about Rust programs for code search, cross-reference, etc.

Authored by Nick Cameron and Peter Elmers (@pelmers; including enums, type parameters/generics).
bors added a commit that referenced this pull request Jun 13, 2014
@bors merged commit 984e9af into rust-lang:master on Jun 13, 2014
@pelmers (Contributor) commented Jun 13, 2014

\o/ impressive work!

notriddle pushed a commit to notriddle/rust that referenced this pull request Sep 20, 2022
Remove redundant 'resolve_obligations_as_possible' call

Hi! I was looking for a "good first issue" and saw this one: rust-lang/rust-analyzer#7542. I like searching for performance improvements, so I wanted to try to find something useful there.

There are two tests in integrated_benchmarks.rs; I looked at 'integrated_highlighting_benchmark' (not the one discussed in the issue above).

Profile from that test looks like this:
```
$ RUN_SLOW_BENCHES=1 cargo test --release --package rust-analyzer --lib -- integrated_benchmarks::integrated_highlighting_benchmark --exact --nocapture
    Finished release [optimized] target(s) in 0.06s
     Running unittests src/lib.rs (target/release/deps/rust_analyzer-a80ca6bb8f877458)

running 1 test
workspace loading: 358.45ms
initial: 9.60s
change: 13.96µs
cpu profiling is disabled, uncomment `default = [ "cpu_profiler" ]` in Cargo.toml to enable.
  273ms - highlight
      143ms - infer:wait @ per_query_memory_usage
          143ms - infer_query
                0   - crate_def_map:wait (3165 calls)
                4ms - deref_by_trait (967 calls)
               96ms - resolve_obligations_as_possible (22106 calls)
                0   - trait_solve::wait (2068 calls)
       21ms - Semantics::analyze_impl (18 calls)
        0   - SourceBinder::to_module_def (20 calls)
       36ms - classify_name (19 calls)
       19ms - classify_name_ref (308 calls)
        0   - crate_def_map:wait (461 calls)
        4ms - descend_into_macros (628 calls)
        0   - generic_params_query (4 calls)
        0   - impl_data_with_diagnostics_query (1 calls)
       45ms - infer:wait (37 calls)
        0   - resolve_obligations_as_possible (2 calls)
        0   - source_file_to_def (1 calls)
        0   - trait_solve::wait (42 calls)
after change: 275.23ms
test integrated_benchmarks::integrated_highlighting_benchmark ... ok
```
22106 calls to `resolve_obligations_as_possible` seem like the main issue there.

One thing I noticed (and fixed in this PR) is that `InferenceContext::resolve_ty_shallow` first calls `resolve_obligations_as_possible`, and then calls `InferenceTable::resolve_ty_shallow`. But `InferenceTable::resolve_ty_shallow` [inside](https://github.com/rust-lang/rust-analyzer/blob/2e9f1204ca01c3e20898d4a67c8b84899d394a88/crates/hir-ty/src/infer/unify.rs#L372) again calls `resolve_obligations_as_possible`.

`resolve_obligations_as_possible` has a while loop inside, which runs until it can't find any more helpful information. So calling this function a second time does nothing, and one of the calls can safely be removed.
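
Schematically, the call shape looks like this (a self-contained toy sketch, not rust-analyzer's actual code):

```rust
// The table's shallow resolution already runs the obligation fixpoint, so the
// outer context should not run it again first.
struct Table {
    fixpoint_runs: u32,
}

impl Table {
    fn resolve_obligations_as_possible(&mut self) {
        // Loops until no further progress is possible; running it twice in a
        // row does no additional useful work.
        self.fixpoint_runs += 1;
    }

    fn resolve_ty_shallow(&mut self) {
        self.resolve_obligations_as_possible();
        // ... shallow type resolution itself ...
    }
}

struct Context {
    table: Table,
}

impl Context {
    fn resolve_ty_shallow(&mut self) {
        // Before the fix this method also called
        // self.table.resolve_obligations_as_possible() here, doubling the
        // number of fixpoint runs; after the fix it just delegates.
        self.table.resolve_ty_shallow();
    }
}

fn main() {
    let mut cx = Context { table: Table { fixpoint_runs: 0 } };
    cx.resolve_ty_shallow();
    assert_eq!(cx.table.fixpoint_runs, 1);
}
```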

`InferenceContext::resolve_ty_shallow` is actually quite a hot place, and after fixing it, the total number of `resolve_obligations_as_possible` calls in this test is reduced to 15516 (from 22106). The "after change" time also improves from ~270ms to ~240ms, which is not a huge win, but still something measurable.

Same profile after PR:
```
$ RUN_SLOW_BENCHES=1 cargo test --release --package rust-analyzer --lib -- integrated_benchmarks::integrated_highlighting_benchmark --exact --nocapture
    Finished release [optimized] target(s) in 0.06s
     Running unittests src/lib.rs (target/release/deps/rust_analyzer-a80ca6bb8f877458)

running 1 test
workspace loading: 339.86ms
initial: 9.28s
change: 10.69µs
cpu profiling is disabled, uncomment `default = [ "cpu_profiler" ]` in Cargo.toml to enable.
  236ms - highlight
      110ms - infer:wait @ per_query_memory_usage
          110ms - infer_query
                0   - crate_def_map:wait (3165 calls)
                4ms - deref_by_trait (967 calls)
               64ms - resolve_obligations_as_possible (15516 calls)
                0   - trait_solve::wait (2068 calls)
       21ms - Semantics::analyze_impl (18 calls)
        0   - SourceBinder::to_module_def (20 calls)
       34ms - classify_name (19 calls)
       18ms - classify_name_ref (308 calls)
        0   - crate_def_map:wait (461 calls)
        3ms - descend_into_macros (628 calls)
        0   - generic_params_query (4 calls)
        0   - impl_data_with_diagnostics_query (1 calls)
       45ms - infer:wait (37 calls)
        0   - resolve_obligations_as_possible (2 calls)
        0   - source_file_to_def (1 calls)
        0   - trait_solve::wait (42 calls)
after change: 238.15ms
test integrated_benchmarks::integrated_highlighting_benchmark ... ok
```

The performance of this test could be further improved but at the cost of making code more complicated, so I wanted to check if such a change is desirable before sending another PR.

`resolve_obligations_as_possible` is actually called a lot of times even when no new information was provided. As I understand it, `resolve_obligations_as_possible` can do something useful only if some variables/values were unified since the last check. We can store a boolean variable inside `InferenceTable` which indicates whether `try_unify` was called after the last `resolve_obligations_as_possible`. If it wasn't, we can safely skip calling `resolve_obligations_as_possible` again.
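
A toy sketch of that idea (field and method names are hypothetical, not rust-analyzer's): unification sets a dirty bit, and the fixpoint loop is skipped while the table is clean.

```rust
struct InferenceTable {
    unified_since_last_resolve: bool,
    resolves_that_did_work: u32,
}

impl InferenceTable {
    fn try_unify(&mut self) {
        // ... unification logic ...
        self.unified_since_last_resolve = true;
    }

    fn resolve_obligations_as_possible(&mut self) {
        if !self.unified_since_last_resolve {
            // Nothing was unified since the last run, so the loop would make
            // no progress anyway.
            return;
        }
        self.unified_since_last_resolve = false;
        self.resolves_that_did_work += 1;
        // ... the existing while loop over pending obligations ...
    }
}

fn main() {
    let mut table = InferenceTable {
        unified_since_last_resolve: false,
        resolves_that_did_work: 0,
    };
    table.resolve_obligations_as_possible(); // skipped: nothing unified yet
    table.try_unify();
    table.resolve_obligations_as_possible(); // does work
    table.resolve_obligations_as_possible(); // skipped again
    assert_eq!(table.resolves_that_did_work, 1);
}
```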

I tested this change locally, and it reduces the number of `resolve_obligations_as_possible` calls to several thousand (it is not shown in the profile anymore, so I don't know the exact number), and the total time is reduced to ~180ms. Here is a generated profile:
```
$ RUN_SLOW_BENCHES=1 cargo test --release --package rust-analyzer --lib -- integrated_benchmarks::integrated_highlighting_benchmark --exact --nocapture
    Finished release [optimized] target(s) in 0.06s
     Running unittests src/lib.rs (target/release/deps/rust_analyzer-a80ca6bb8f877458)

running 1 test
workspace loading: 349.92ms
initial: 8.56s
change: 11.32µs
cpu profiling is disabled, uncomment `default = [ "cpu_profiler" ]` in Cargo.toml to enable.
  175ms - highlight
       21ms - Semantics::analyze_impl (18 calls)
        0   - SourceBinder::to_module_def (20 calls)
       33ms - classify_name (19 calls)
       17ms - classify_name_ref (308 calls)
        0   - crate_def_map:wait (461 calls)
        3ms - descend_into_macros (628 calls)
        0   - generic_params_query (4 calls)
        0   - impl_data_with_diagnostics_query (1 calls)
       97ms - infer:wait (38 calls)
        0   - resolve_obligations_as_possible (2 calls)
        0   - source_file_to_def (1 calls)
        0   - trait_solve::wait (42 calls)
after change: 177.04ms
test integrated_benchmarks::integrated_highlighting_benchmark ... ok
```
Let me know if adding a new bool field seems like a reasonable tradeoff, so I can send a PR.
flip1995 pushed a commit to flip1995/rust that referenced this pull request Aug 8, 2024

Use a deterministic number of digits in rustc_tools_util commit hashes

Using `git rev-parse --short` in rustc_tools_util causes nondeterministic compilation of projects that use `setup_version_info!` and `get_version_info!` when built from the exact same source code and git commit. The number of digits printed by `--short` is sensitive to how many other branches and tags in the repository have been fetched so far, what other commits have been worked on in other branches, how recently you had run `git gc`, platform-specific variation in git's default configuration, and platform differences in the sequence of steps performed by the release pipeline. Someone can compile a tool from a particular commit, switch branches to work on a different commit (or simply do a git fetch), go back to the first commit and be unable to reproduce the binary that was built from it previously.

Currently, variation in short commit hashes causes Clippy version strings to be out of sync between different targets. On x86_64-unknown-linux-gnu:

```console
$ clippy-driver +1.80.0 --version
clippy 0.1.80 (0514789 2024-07-21)
```

Whereas on aarch64-apple-darwin:

```console
$ clippy-driver +1.80.0 --version
clippy 0.1.80 (0514789 2024-07-21)
```
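
For illustration, one deterministic alternative (not necessarily the exact fix in this commit) is to take a fixed-length prefix of the full hash rather than relying on `--short`:

```console
$ git rev-parse HEAD | cut -c1-10
```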

---

changelog: none