Implement support for LLVM's code coverage instrumentation #34701

Open
gnzlbg opened this Issue Jul 7, 2016 · 13 comments

7 participants
@gnzlbg
Contributor

gnzlbg commented Jul 7, 2016

There are ways to more or less easily obtain code coverage information from Rust binaries, some of which are in widespread use (e.g. travis-cargo + gcov + coveralls.io). However, these are either platform-specific (gcov/kcov are Linux-only) or incomplete (coverage for documentation tests is not collected).

It would be better if rustc could instrument Rust binaries and tests using LLVM Coverage Instrumentation. This would make code coverage work portably across the supported platforms and produce reliable coverage information.

Ideally, such support would be integrated into an easy-to-use cargo coverage subcommand and the Rust Language Server, so that IDEs can use this information to guide the user while writing tests.

@alexcrichton

Member

alexcrichton commented Jul 7, 2016

I know I'd love to see this work! Do you know what it would entail, in terms of the hoops a prototype would have to jump through? Also, does this work reliably on platforms like Windows? Or is it still best on Linux and "works mostly elsewhere"?

@gnzlbg

Contributor

gnzlbg commented Jul 7, 2016

A prototype would need to generate the LLVM IR for the instrumentation. For that, it needs a mapping between source ranges and performance counters (I would start with just functions). Then it needs to generate the IR and embed it in the object files. It would be a good idea to look at how clang outputs this into a file so that we use the same format (which is documented), and to test that we can read it with llvm-cov on Linux and macOS (I don't know about Windows support, but since clang is gaining full Windows support, if it's not there yet it will be soon).

I think that would be enough for a prototype. From there we can move on to generating instrumentation for conditionals (match/loop/jumps/ifs...) and blocks (how many times a loop body was executed). We could go down to expressions, but then it would be wise to offer users a way to control how much instrumentation is generated: functions, branches, and loops (which are branches) are typically enough. We should then support skipped regions (for conditional compilation), expansions (for macros, maybe plugins), and unreachable code (there is a performance counter that is always zero for that).

The meat of the work is in the source => performance counter mapping, generating the IR, and generating the conforming output. llvm-cov works at least on Mac and Linux, but IIRC clang has options to generate coverage information even for particular gcov versions. The source code for all of this is available, so taking a look wouldn't hurt.
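
To make the "source range => counter" idea concrete, here is a minimal sketch at function granularity. Everything in it (SourceRegion, CoverageMap, hit, uncovered) is a hypothetical name made up for illustration; it is not LLVM's coverage mapping format and not what rustc does, just the bookkeeping the paragraph above describes.

```rust
/// A half-open byte range in a source file (hypothetical type, for illustration only).
#[derive(Debug, Clone, PartialEq)]
struct SourceRegion {
    file: &'static str,
    start: usize,
    end: usize,
}

/// Associates each instrumented region with one execution counter.
#[derive(Default)]
struct CoverageMap {
    regions: Vec<SourceRegion>,
    counters: Vec<u64>,
}

impl CoverageMap {
    /// Registers a region and returns the index of the counter assigned to it.
    fn add_region(&mut self, region: SourceRegion) -> usize {
        self.regions.push(region);
        self.counters.push(0);
        self.counters.len() - 1
    }

    /// Stands in for the increment the instrumentation would insert at region entry.
    fn hit(&mut self, counter: usize) {
        self.counters[counter] += 1;
    }

    /// Regions whose counter is still zero, i.e. never executed.
    fn uncovered(&self) -> Vec<&SourceRegion> {
        self.regions
            .iter()
            .zip(&self.counters)
            .filter(|(_, count)| **count == 0)
            .map(|(region, _)| region)
            .collect()
    }
}

fn main() {
    let mut map = CoverageMap::default();
    // Two "functions", each covered by one region and one counter.
    let parse = map.add_region(SourceRegion { file: "lib.rs", start: 0, end: 120 });
    let _emit = map.add_region(SourceRegion { file: "lib.rs", start: 121, end: 300 });
    map.hit(parse);
    map.hit(parse);
    for region in map.uncovered() {
        println!("never executed: {}:{}..{}", region.file, region.start, region.end);
    }
}
```

A real implementation would emit the counters as IR and serialize the region table in the format llvm-cov expects; the sketch only shows why the mapping is the central data structure.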

Also, does this work reliably on platforms like Windows? Or is it still best on Linux and "works mostly elsewhere"?

It works on Mac and Linux; I don't know about Windows. Even if it doesn't work everywhere, this is probably the way to make it work on as many platforms as possible anyway. Clang support on Windows is pretty good already and it is only getting better.

@alexcrichton

Member

alexcrichton commented Aug 3, 2016

@sujayakar's done some awesome work to prototype this on OSX at least, discovering:

  • For now, codegen-units probably has to be one to make this work (worth investigating)
  • The pass to insert into LLVM is -C passes=insert-gcov-profiling
  • The runtime support is available through -C link-args=-lclang_rt.profile_osx
  • That library lives near the directory reported by xcrun --find clang, in a path like usr/lib/clang/7.3.0/lib/darwin/libclang_rt.profile_osx.a.
  • Binaries run with that instrumentation will produce a .gcda file
  • Next to xcrun --find clang, there's an llvm-cov tool, and that can be run over the output file to generate information.

That should at least get code coverage working on OSX in some respect! This is also a pretty strong case that it may not be too hard to actually get this working for all platforms if LLVM's standard tool suite "just works".
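
For reference, the flags from the list above could be collected into a single rustc invocation roughly like this. It is only a sketch: the function name gcov_rustc_command and the runtime_dir parameter are hypothetical, passing the library directory via -L is my assumption about how the linker would find the runtime, and the flags are copied from the list as-is rather than checked against any particular toolchain. The command is built but not executed.

```rust
use std::process::Command;

/// Builds (but does not run) a rustc invocation using the flags listed above.
/// `runtime_dir` is a hypothetical parameter: the directory that contains
/// libclang_rt.profile_osx.a on the local machine.
fn gcov_rustc_command(source: &str, runtime_dir: &str) -> Command {
    let mut cmd = Command::new("rustc");
    cmd.arg(source)
        // A single codegen unit, as suggested in the first bullet.
        .args(["-C", "codegen-units=1"])
        // The LLVM pass named in the list.
        .args(["-C", "passes=insert-gcov-profiling"])
        // Assumption: add the runtime's directory to the library search path.
        .args(["-L", runtime_dir])
        // Link the profiling runtime.
        .args(["-C", "link-args=-lclang_rt.profile_osx"]);
    cmd
}

fn main() {
    // The directory below is a placeholder, not a real path.
    let cmd = gcov_rustc_command("src/main.rs", "/path/to/clang/lib/darwin");
    // Print the assembled command line for inspection.
    println!("{:?}", cmd);
}
```

Running the resulting binary is what would produce the .gcda file that llvm-cov then reads.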

@sanxiyn

Member

sanxiyn commented Aug 4, 2016

@alexcrichton That's gcov coverage, which is completely different from what @gnzlbg meant in this issue.

The correct pass name is instrprof, implemented in lib/Transforms/Instrumentation/InstrProfiling.cpp. See also LDC LLVM profiling instrumentation.

@gnzlbg

Contributor

gnzlbg commented Aug 4, 2016

@sanxiyn is correct.

I just want to add that I don't think we should implement clang-like gcov coverage generation.

The LLVM format can be used for many more things (e.g. profile-guided optimization), and LLVM's llvm-cov tool can generate gcov files from it.

@frewsxcv

Member

frewsxcv commented May 2, 2017

afaik, there's a PR open for this: #38608

@sanxiyn

Member

sanxiyn commented May 2, 2017

As far as I can tell, #38608 is still gcov coverage and not instrprof.

@alexcrichton

Member

alexcrichton commented Aug 25, 2017

I believe this is now implemented as -Z profile and tracked at #42524, so closing in favor of that.

@sanxiyn

Member

sanxiyn commented Aug 28, 2017

This is not implemented yet. LLVM coverage is different from gcov coverage, which is also different from sanitizer coverage. LLVM implements three separate coverage mechanisms.

@mehcode

Contributor

mehcode commented Sep 21, 2017

Would LLVM coverage (instrprof) enable usable code coverage of generic-heavy code? None of the existing code coverage solutions work well with heavy use of generics.

My project (a gameboy emulator) makes very heavy usage of generics and it'd be wonderful to get code coverage working on integration tests but I'm not really sure how.

@gnzlbg

Contributor

gnzlbg commented Sep 21, 2017

I think good code coverage support is the single most important piece of tooling missing in Rust.

When I run cargo test, and everything is green, I get a good feeling, but this feeling actually means nothing at all. To make cargo test mean something, it would need to at least tell me how many code-paths of my application have been exercised.

This information does not mean much either (just because a code-path was exercised does not mean that it was exercised for all inputs), but it would mean more than nothing, and it would make writing tests and assessing the value of tests much easier.
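
A small illustration of that caveat, as a sketch with a made-up function (midpoint is hypothetical and not taken from any real crate): a single test executes every line of the function, yet another input on the very same path gives a wrong answer.

```rust
/// Midpoint of two values. There is a single code path and no branches.
fn midpoint(a: u32, b: u32) -> u32 {
    // wrapping_add makes the overflow explicit and deterministic for this demo.
    a.wrapping_add(b) / 2
}

fn main() {
    // This one call executes every line of `midpoint`: full coverage.
    assert_eq!(midpoint(2, 4), 3);
    // Same path, different input: the sum wraps around, so the result is not u32::MAX.
    assert_ne!(midpoint(u32::MAX, u32::MAX), u32::MAX);
    println!("fully covered, still incorrect for large inputs");
}
```

A coverage report would show this function as fully covered after the first assertion alone, which is why coverage says which paths ran, not whether they are correct.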

IMO adding this kind of capability to cargo test (and yes, it should be part of cargo test, people should always know their code coverage) would be extremely valuable.

There is only one other piece of tooling infrastructure that would come close to this in value, and that is an undefined behavior detector for all Rust code.

This might sound like a rant, but I just want to raise awareness that this is a very important issue, at least for me (and, judging from the other issues being filed, for others as well), because it directly affects the correctness of Rust programs (we don't want them to only be memory safe, but also to have correct logic).

Maybe if rustfmt, clippy, and the RLS advance enough during the impl period, the Rust tool team could prioritize these two tools next?

If I don't have precise auto-completion information or my source code is not perfectly formatted, well, I can live with that. But if my program has undefined behavior and I am not catching it because I am only testing 40% of my code-paths then... I am screwed. Does this make sense?

@mssun

Contributor

mssun commented Apr 12, 2018

This is not implemented yet. LLVM coverage is different from gcov coverage, which is also different from sanitizer coverage. LLVM implements three separate coverage mechanisms.

Hi, can anyone explain the difference between "LLVM coverage" (profile-instr-generate) and "gcov" (insert-gcov-profiling)? I don't know the detailed implementations of these two passes, but will profile-instr-generate be more accurate than insert-gcov-profiling?

@mssun

Contributor

mssun commented Apr 12, 2018

I found @kennytm's answer to my question. Let me put it here in case someone like me gets lost in the overwhelming number of references: #44673 (comment)
