
Add a command "cargo grcov" #326

Open · calixteman opened this issue Oct 7, 2019 · 29 comments · May be fixed by #533

@calixteman (Collaborator) commented Oct 7, 2019

https://github.com/mozilla/productivity-tools-projects/issues/289:

It would be helpful to hide some complexity (e.g. the rustc options).
We could also add something to directly upload data to ccov servers...
Ideally just having something like:
cargo grcov ... -token FOO -server=codecov.io... in Travis or Taskcluster would be great.

@gilescope (Contributor):

Would this be a separate crate? Could we still put it in a sub-directory of this repo? I'm happy to do an initial PR to get this going if there's appetite.

@gilescope (Contributor):

cargo grcov --open -- cargo test
The command would set up the environment variables beforehand, then run the command given after --. By default it would produce an output type of -t html and write it to target/debug/coverage, and it would accept a --release flag to output into the release dir instead.

The --open flag would open the HTML results in a browser.

That's roughly what I was thinking. Does it tally with your Mozilla-internal spec?
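
For concreteness, here is a minimal sketch of the env-setup-and-run step. The specific flag `-C instrument-coverage` and the profraw location under `target/` are assumptions for illustration, not part of the proposal:

```rust
use std::process::Command;

/// Run the wrapped command (e.g. `cargo test`) with coverage instrumentation
/// enabled. A sketch only.
fn run_instrumented(cmd: &[String]) -> std::io::Result<std::process::ExitStatus> {
    let (program, args) = cmd.split_first().expect("empty command after `--`");
    // Append to any user-provided RUSTFLAGS rather than clobbering them.
    let rustflags = format!(
        "{} -C instrument-coverage",
        std::env::var("RUSTFLAGS").unwrap_or_default()
    );
    Command::new(program)
        .args(args)
        .env("RUSTFLAGS", rustflags)
        // Keep generated .profraw files under the target dir; %p/%m make the
        // file name unique per process/binary.
        .env("LLVM_PROFILE_FILE", "target/coverage/%p-%m.profraw")
        .status()
}
```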

@marco-c (Collaborator) commented Feb 25, 2020

@gilescope that sounds good to me, it could be part of this repo.

An additional feature could be uploading to the coverage services.

@marco-c (Collaborator) commented Nov 24, 2020

@gilescope would you still be interested in contributing?

@gilescope (Contributor):

I think it's high time this was done and I've got some time on my hands. I've done similar in other places but really having it in one place would be excellent. I can start with a straw man PR and we can iterate from there?

@gilescope (Contributor):

(Unless you've got a burning desire to have a crack at it yourself @marco-c - I'm good with that too. I'd be just as happy testing a PR out and giving feedback.)

@marco-c (Collaborator) commented Nov 24, 2020

> I think it's high time this was done and I've got some time on my hands. I've done similar in other places but really having it in one place would be excellent. I can start with a straw man PR and we can iterate from there?

@gilescope that'd be great, thanks!

> (Unless you've got a burning desire to have a crack at it yourself @marco-c - I'm good with that too. I'd be just as happy testing a PR out and giving feedback.)

I'm happy to just review ;)

@gilescope (Contributor):

OK, here's my take on things: I don't think one command is enough. I think two are required:

cargo grcov-build -- cargo build
cargo grcov-report --output-types html,json

grcov-build sets up the environment variables so that rustc is invoked correctly and coverage data is generated. I suspect we don't need to do a full cargo clean beforehand, but we probably should delete existing coverage files (and old reports?) in the target dir. It would invoke whatever comes after the '--', so it would work with xargo etc. as well as cargo build.
The build command could be a cargo test that implicitly compiles in the simplest case.

Breaking it into two separate commands lets people run whatever testing framework(s) they wish in between (for example, pytest calling into Rust via pyo3).

Once all the tests have run, grcov-report would run grcov to produce a consolidated report.
There would be a standard location in the target dir for the coverage report; by default it would produce an HTML report and print a text summary to stdout.

(Both commands should support --message-format=json for machine-readable output, like most cargo commands do.)

Ideally for the simplest case something like this should be enough:

cargo grcov-build -- cargo test
cargo grcov-report

Are people comfortable with the above? Have I missed some common use cases that this wouldn't work well for?
(Would the security implications of starting any process be a concern - should we limit it to just running cargo?)
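
A rough sketch of what the grcov-report step might do, shelling out to the grcov binary. The flags mirror grcov's documented CLI; the default output location under the target dir is an assumption taken from this thread:

```rust
use std::path::Path;
use std::process::Command;

/// Turn the collected coverage artifacts into an HTML report under the target dir.
fn generate_report(target_dir: &Path) -> std::io::Result<std::process::ExitStatus> {
    Command::new("grcov")
        .arg(".") // search the working tree for coverage artifacts
        .arg("--binary-path")
        .arg(target_dir.join("debug"))
        .args(["-s", ".", "-t", "html", "--ignore-not-existing"])
        .arg("-o")
        .arg(target_dir.join("debug").join("coverage"))
        .status()
}
```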

@marco-c (Collaborator) commented Nov 25, 2020

Having two separate commands sounds good to me, as many projects might have different needs than just running cargo test. As you said, some will run pytest. Others might not even run automated tests but instead collect coverage from manual test runs.

I'd only do one thing differently: if cargo grcov-report is run when there are no build or coverage artifacts available, it should automatically run the first command too.
This way the simplest case becomes just cargo grcov-report.
This second part can also be a follow-up.
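
A small sketch of that fallback, assuming profraw-based artifacts (a gcov-based build would look for .gcda files instead); in a real implementation the fallback run would go through the same env setup as the build command so it is actually instrumented:

```rust
use std::path::Path;
use std::process::Command;

/// Does any .profraw file exist under `dir`?
fn has_profraw(dir: &Path) -> bool {
    let Ok(entries) = std::fs::read_dir(dir) else {
        return false;
    };
    entries.flatten().any(|entry| {
        let path = entry.path();
        if path.is_dir() {
            has_profraw(&path)
        } else {
            path.extension().map_or(false, |ext| ext == "profraw")
        }
    })
}

/// If no coverage artifacts exist yet, run a plain `cargo test` first.
fn ensure_coverage_artifacts(target_dir: &Path) -> std::io::Result<()> {
    if !has_profraw(target_dir) {
        Command::new("cargo").arg("test").status()?;
    }
    Ok(())
}
```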

@gilescope (Contributor):

I like the idea of that. Would one run cargo test with no parameters in that case?

@gilescope (Contributor):

I see these new .profraw files appearing in the repository root. I'm going to try setting PROFDATA_DIR so that they're generated inside the CARGO_TARGET_DIR, as I think it's unhygienic for generated files to be produced outside the target dir (I like cargo clean to delete all the generated stuff).

@marco-c (Collaborator) commented Nov 26, 2020

> I like the idea of that. Would one run cargo test with no parameters in that case?

Yep.
Thinking on it a bit more, I think we could support the following commands:

  • cargo grcov env -- BUILD_COMMAND to set up the coverage env variables and run a user-defined command to build;
  • cargo grcov build to set up the coverage env variables and run cargo build;
  • cargo grcov test to set up the coverage env variables, run tests with cargo test and then parse the resulting coverage artifacts;
  • cargo grcov report -- TEST_COMMAND to run the user-defined command and then parse any resulting coverage artifacts.

This would cover most use cases easily by running cargo grcov build and cargo grcov test (or even just cargo grcov test).
And it would cover any complex scenarios by running cargo grcov env -- COMMAND and cargo grcov report -- COMMAND.
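
A minimal sketch of how that subcommand surface might be declared. clap is not mentioned anywhere in this thread; its derive API is used here purely for illustration, and cargo-subcommand argument handling (cargo passing "grcov" as the first argument) is glossed over:

```rust
use clap::{Parser, Subcommand};

/// Hypothetical CLI surface for the four subcommands described above.
#[derive(Parser)]
#[command(name = "cargo-grcov", bin_name = "cargo grcov")]
struct Cli {
    #[command(subcommand)]
    cmd: Cmd,
}

#[derive(Subcommand)]
enum Cmd {
    /// Set up the coverage env variables, then run the build command after `--`.
    Env {
        #[arg(last = true)]
        build_command: Vec<String>,
    },
    /// Set up the coverage env variables and run `cargo build`.
    Build,
    /// Set up the coverage env variables, run `cargo test`, then parse the artifacts.
    Test,
    /// Run the test command after `--`, then parse any resulting coverage artifacts.
    Report {
        #[arg(last = true)]
        test_command: Vec<String>,
    },
}

fn main() {
    let _cli = Cli::parse();
    // Dispatch on _cli.cmd here.
}
```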

> I see these new .profraw files appearing in the repository root. I'm going to try setting PROFDATA_DIR so that they're generated inside the CARGO_TARGET_DIR, as I think it's unhygienic for generated files to be produced outside the target dir (I like cargo clean to delete all the generated stuff).

@gilescope you can use LLVM_PROFILE_FILE as we do in .travis.yml (it's also your best bet to avoid problems with multiple processes writing to the same profraw file)

@gilescope (Contributor):

Obviously I'm keen on as little configuration as we can get away with, but I suspect we might need a few arguments. Do we aim for the most detailed coverage we can get (i.e. source-level if the llvm-tools rustup component is available/installed)?

If we take cargo grcov build as an example, do we pass almost all parameters straight on to cargo build and only parse the grcov-specific ones ourselves, or were you thinking more along the lines of cargo grcov build -- *cargo parameters*?

@marco-c (Collaborator) commented Nov 27, 2020

> If we take cargo grcov build as an example, do we pass almost all parameters straight on to cargo build and only parse the grcov-specific ones ourselves, or were you thinking more along the lines of cargo grcov build -- *cargo parameters*?

I was thinking the first option, but maybe the second more clearly splits the grcov arguments from the cargo build arguments.

@gilescope (Contributor):

What about +nightly? If it's good enough for cargo then I guess we could have cargo grcov --setting build *cargo-build-params*, but maybe '--' is less surprising. Let's try '--' and see how that looks.

@gilescope (Contributor):

@marco-c for tests I'm tempted to use the command_access feature on nightly so I can see which env vars are set (and only assert those bits of the test when run on nightly). But it seems the only nice way to do that is to define a 'nightly' feature so we know whether we're on nightly or not. If that's ok, let me know. If not, I'll ditch command_access and maybe wrap Command instead, or maybe switch out the actual command for env/set depending on the OS and call the Command to see what's set. Or maybe I should go all in and just write integration tests...

gilescope linked a pull request Dec 2, 2020 that will close this issue
@marco-c (Collaborator) commented Dec 2, 2020

> But it seems the only nice way to do that is to define a 'nightly' feature so we know whether we're on nightly or not. If that's ok, let me know.

That seems a bit ugly :(

@gilescope (Contributor):

Agreed - I went for lots of quick integration tests, and it looks like that's going to be fine, as grcov is built on CI before the tests run, so I can call it.

@lu-zero (Collaborator) commented Dec 4, 2020

> * `cargo grcov env -- BUILD_COMMAND` to set up the coverage env variables and run a user-defined command to build;
> * `cargo grcov build` to set up the coverage env variables and run `cargo build`;
> * `cargo grcov test` to set up the coverage env variables, run tests with `cargo test` and then parse the resulting coverage artifacts;
> * `cargo grcov report -- TEST_COMMAND` to run the user-defined command and then parse any resulting coverage artifacts.

It looks good; just make sure you default to a custom target-dir so that cargo build and cargo grcov test do not stomp on each other.

It would be good to add a cargo grcov clean that wipes the matching target-dir or all of them.
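
A sketch of that target-dir separation and the clean command, assuming a dedicated coverage target directory (the name target/coverage-build is made up here, and the coverage env vars are omitted for brevity):

```rust
use std::path::PathBuf;
use std::process::Command;

/// A dedicated target dir for instrumented builds so they do not stomp on the
/// regular `cargo build` artifacts.
fn coverage_target_dir() -> PathBuf {
    PathBuf::from("target").join("coverage-build")
}

fn cargo_build_for_coverage() -> std::io::Result<std::process::ExitStatus> {
    Command::new("cargo")
        .arg("build")
        .env("CARGO_TARGET_DIR", coverage_target_dir())
        .status()
}

/// `cargo grcov clean`: wipe only the coverage-specific target dir.
fn grcov_clean() -> std::io::Result<()> {
    match std::fs::remove_dir_all(coverage_target_dir()) {
        Err(e) if e.kind() == std::io::ErrorKind::NotFound => Ok(()),
        other => other,
    }
}
```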

@gilescope (Contributor) commented Dec 7, 2020 via email

@lu-zero (Collaborator) commented Dec 7, 2020

> (To be honest it would be nice if rustc could hold a few different flavours of builds in the target dir at once. It can handle check and build at the same time without stomping, but it definitely does stomp at the moment when we change rustcflags.)

Beware: changes in rustcflags would definitely invalidate everything almost all the time. Check is a conceptual subset of build even if right now it isn't exactly the case in practice (cargo build && cargo check would not run a quicker check).

@gilescope (Contributor):

True enough. Ideally we'd use the shiny new source-based coverage, given that it's significantly closer to correct for Rust.

> The path to the compiled binary must be given as an argument when source-based coverage is used

I have lots of questions.

I'm thinking that if we run a bunch of tests across the different test exes that get created, then we would have not one binary but several. (We can get the JSON output of the test build to figure out the exes, but that seems messy.)

Is the location of the exe(s) recorded in the profraw file? Does grcov definitely accept only one binary for source-based coverage? Are there related issues that should be mentioned here?

@marco-c (Collaborator) commented Jun 29, 2021

The workaround solution described at https://doc.rust-lang.org/nightly/unstable-book/compiler-flags/instrument-coverage.html#tips-for-listing-the-binaries-automatically can be used to get the list of binaries.
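
That tip boils down to asking cargo for machine-readable build output and collecting the test executables. A rough Rust equivalent of the same idea (the unstable book shows it with jq), assuming serde_json is available for parsing; the field names follow cargo's compiler-artifact JSON messages:

```rust
use std::io::{BufRead, BufReader};
use std::process::{Command, Stdio};

/// Build the tests without running them and collect the paths of the produced
/// test executables from cargo's JSON messages.
fn list_test_binaries() -> std::io::Result<Vec<String>> {
    let mut child = Command::new("cargo")
        .args(["test", "--no-run", "--message-format=json"])
        .stdout(Stdio::piped())
        .spawn()?;
    let stdout = child.stdout.take().expect("piped stdout");

    let mut binaries = Vec::new();
    for line in BufReader::new(stdout).lines() {
        let line = line?;
        // Skip any non-JSON output lines.
        let Ok(msg) = serde_json::from_str::<serde_json::Value>(&line) else {
            continue;
        };
        if msg["reason"] == "compiler-artifact" && msg["profile"]["test"] == true {
            if let Some(exe) = msg["executable"].as_str() {
                binaries.push(exe.to_owned());
            }
        }
    }
    child.wait()?;
    Ok(binaries)
}
```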

@jonhoo (Contributor) commented Dec 15, 2021

My understanding was that -b could simply be ./target/debug/ (assuming no CARGO_TARGET_DIR or other target dir location config). Of course the question becomes what do we do if someone does, say:

cargo grcov env -- cargo test
cargo grcov env -- cargo test --release
cargo grcov report

There is no one -b that would work here, since we'd need to use both target/debug and target/release.

@jonhoo (Contributor) commented Dec 15, 2021

I think this is arguably just a shortcoming of grcov. It should allow --binary-path to be passed multiple times by just adjusting the logic here:

grcov/src/llvm_tools.rs

Lines 44 to 59 in fb68ecc

let binaries = if binary_path.is_file() {
    vec![binary_path.to_owned()]
} else {
    let mut paths = vec![];
    for entry in WalkDir::new(&binary_path) {
        let entry =
            entry.unwrap_or_else(|_| panic!("Failed to open directory '{:?}'.", binary_path));
        if entry.path().is_executable() && entry.metadata().unwrap().len() > 0 {
            paths.push(entry.into_path());
        }
    }
    paths
};

That way we could always pass in both target/debug and target/release.
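
One way the quoted block could be generalized, sketched under the assumption that --binary-path becomes repeatable and arrives as a Vec<PathBuf>; the WalkDir and is_executable() helpers are the ones the quoted snippet already uses, and the imports shown are a guess at the originals:

```rust
use std::path::PathBuf;

use is_executable::IsExecutable;
use walkdir::WalkDir;

/// Collect candidate binaries from every --binary-path value instead of just one.
fn collect_binaries(binary_paths: &[PathBuf]) -> Vec<PathBuf> {
    let mut paths = Vec::new();
    for binary_path in binary_paths {
        if binary_path.is_file() {
            paths.push(binary_path.clone());
            continue;
        }
        for entry in WalkDir::new(binary_path) {
            let entry = entry
                .unwrap_or_else(|_| panic!("Failed to open directory '{:?}'.", binary_path));
            if entry.path().is_executable() && entry.metadata().unwrap().len() > 0 {
                paths.push(entry.into_path());
            }
        }
    }
    paths
}
```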

@jerel commented Mar 3, 2023

The multi --binary-path usage as discussed here seems to be untenable on medium to large projects. I have a project that produces ~100 test binaries and 2 libraries. If I use grcov on the libraries' profraw files and give --binary-path a path to the two binaries, it works fine. But if I point --binary-path at target/debug, or at a directory that contains only the 100 + 2 binaries, it takes 20+ minutes to complete. For comparison, llvm-profdata merge + llvm-cov export --object --object + genhtml does the same work in around a minute.

@jonhoo (Contributor) commented Mar 5, 2023

Doesn't that suggest a performance issue in grcov itself that should be fixed, rather than indicate that the feature is undesired?

@jerel commented Mar 5, 2023

Yes, the feature is desirable IMO. My reason for reporting was that I could easily see --binary-path one --binary-path two getting implemented and shipping without the performance issue being apparent on test projects.

What ended up working for me on this large project was to first manually run llvm-profdata merge and then llvm-cov export --object one --object two ... to get the lcov file; grcov was then happy to do the HTML generation.

So, since my working solution uses all the same pieces as grcov, I suspect the performance issue might be in how grcov calls llvm-cov export: it is being called in a loop with binary.as_ref() as the positional argument, instead of being called once with multiple --object binary.as_ref() args.

grcov/src/llvm_tools.rs

Lines 65 to 72 in fb68ecc

let args = [
    "export".as_ref(),
    binary.as_ref(),
    "--instr-profile".as_ref(),
    profdata_path.as_ref(),
    "--format".as_ref(),
    "lcov".as_ref(),
];

If I get time I'll see if I can track it down further.
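
For comparison, the shape of the call jerel describes: one llvm-cov export invocation with the first binary passed positionally and the rest via --object. This is a sketch of that idea, not grcov's actual code:

```rust
use std::path::Path;
use std::process::Command;

/// Export lcov data for all binaries in a single llvm-cov invocation instead
/// of one invocation per binary.
fn llvm_cov_export(
    binaries: &[&Path],
    profdata: &Path,
) -> std::io::Result<std::process::ExitStatus> {
    let (first, rest) = binaries.split_first().expect("at least one binary");
    let mut cmd = Command::new("llvm-cov");
    cmd.arg("export").arg(*first);
    for object in rest {
        cmd.arg("--object").arg(*object);
    }
    cmd.arg("--instr-profile")
        .arg(profdata)
        .args(["--format", "lcov"])
        .status()
}
```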

@marco-c (Collaborator) commented Mar 7, 2023

Regarding binary, see also #535.
