Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stabilize fn coverage by creating a clean room #7576

Merged
merged 5 commits into from Dec 23, 2019

Conversation

@ryoqun
Copy link
Member

ryoqun commented Dec 19, 2019

Problem

Coverage report is lying (partly). So, increasing coverage isn't fun. ;)

Summary of Changes

Casual quick inspection and my hunch says, this might fix the issue. :p I'll add the findings as a comment later. First, let the CI spin. :)

Found in the middle of #7546

echo "Reverting files"
find target/cov -newer target/cov/build-finished -type f -print -delete
echo "Reverting (empty) directories"
find target/cov -newer target/cov/build-finished -type d -empty -print -delete

This comment has been minimized.

Copy link
@ryoqun

ryoqun Dec 19, 2019

Author Member

After quick googling, I bet -delete and -newer is available on macOS. Dunno why the current code uses print0-and-xargs trick needlessly.

This comment has been minimized.

Copy link
@ryoqun

ryoqun Dec 19, 2019

Author Member

As a matter of fact, this can't be replaced with the trick because we want to recursively remove directories which might get empty while doing search & destroy. The trick is effectively a BFS, and this find is effectively a DFS.

This comment has been minimized.

Copy link
@t-nelson

t-nelson Dec 20, 2019

Contributor

Confirmed macOS find has -delete and -newer

@mvines

This comment has been minimized.

Copy link
Member

mvines commented Dec 19, 2019

Cool I hope this helps! 🤞

source ci/rust-version.sh nightly

RUST_LOG=solana=trace _ cargo +$rust_nightly test --target-dir target/cov --no-run "${packages[@]}"
touch target/cov/build-finished

This comment has been minimized.

Copy link
@ryoqun

ryoqun Dec 20, 2019

Author Member

The new logic is simpler: Remove 'em all files and directories written after the build including any coverage data and report file. Just all of them.

@codecov

This comment has been minimized.

Copy link

codecov bot commented Dec 20, 2019

Codecov Report

Merging #7576 into master will increase coverage by 9.8%.
The diff coverage is n/a.

@@           Coverage Diff            @@
##           master   #7576     +/-   ##
========================================
+ Coverage    71.8%   81.7%   +9.8%     
========================================
  Files         244     244             
  Lines       56626   49807   -6819     
========================================
+ Hits        40680   40706     +26     
+ Misses      15946    9101   -6845
@t-nelson

This comment has been minimized.

Copy link
Contributor

t-nelson commented Dec 20, 2019

This has to be better than the full cargo clean I usually do when coverage looks weird 😅

@ryoqun

This comment has been minimized.

Copy link
Member Author

ryoqun commented Dec 21, 2019

After painful trials, I've found pruning doesn't work.

The somewhat-complete detailed investigation report is here. :)

The excerpt for the pinned-down culprit of our diverged coverage reports:

Good:

Current view: 	top level - merkle-tree/src - merkle_tree.rs (source / functions) 			Hit 	Total 	Coverage
Test: 	lcov.info 		Lines: 	139 	153 	90.8 %
Date: 	2019-12-18 14:49:55 		Functions: 	34 	***57*** 	59.6 %
Legend: 	Lines: hit not hit 	

Bad:

Current view: 	top level - merkle-tree/src - merkle_tree.rs (source / functions) 			Hit 	Total 	Coverage
Test: 	lcov.info 		Lines: 	139 	153 	90.8 %
Date: 	2019-12-18 10:48:20 		Functions: 	34 	***114*** 	29.8 %
Legend: 	Lines: hit not hit 	

As shown above, the bad coverage report double-counts the total number of functions. This seems to happen for many files, not alone this.

This is caused by the existence of stale .gcno files, which accumulates over time in the build dir and which isn't removed by coverage.sh.

My guess is that multiple stale gcno files cause grcov to mistakenly double-counts fn definitions. No wonder for the doubled count if it can't see these are latest and stale versions and sees different object files.

Simply removing them doesn't help as well. Removing would cause grcov to fail. Because it won't be created by cargo test --no-run later. And that file is mandatory to generate a coverage report. Also, randomly removing .gcno files (like was with pruning) would make our shared? build dir a bad state, which should be expected to be consumed at various points of the git history...

Ideally, grcov should ignore old gcno files. But it don't, as it would require it to tap into cargo's build artifact file's arcane caching logic.

So, drawing from these tearful and restraining findings, as the last resort, let's stash updated and relevant coverage data files into a newly-created empty dir...

(Sorry bad explanation; I'm getting tired... :p)

find target/cov -name \*.gcda -newer "$timing_file" |
(while read -r gcda_file; do
gcno_file="${gcda_file%.gcda}.gcno"
ln -s "../../../$gcda_file" "$data_dir/$(basename "$gcda_file")"

This comment has been minimized.

Copy link
@ryoqun

ryoqun Dec 21, 2019

Author Member

I guess realpath isn't available on macOS...

This comment has been minimized.

Copy link
@mvines

mvines Dec 23, 2019

Member

yeah realpath is a GNU program that requires an extra install for macOS. We can require it if absolutely needed but that's not ideal (another barrier for entry)

This comment has been minimized.

Copy link
@mvines

mvines Dec 23, 2019

Member

I suggest just removing the data_dir variable and directly embedding target/cov/tmp everywhere data_dir is used. The two ln command aren't generic enough to handle any modifications to data_dir due to the ../../.. so the variable doesn't really help much (and in fact could hurt a future person who attempts to modify this file).

This comment has been minimized.

Copy link
@ryoqun

ryoqun Dec 23, 2019

Author Member

I think you have a nice point!: 690a7b3

rm -rf "$data_dir"
mkdir -p "$data_dir"

find target/cov -name \*.gcda -newer "$timing_file" |

This comment has been minimized.

Copy link
@mvines

mvines Dec 23, 2019

Member

I'm curious why you didn't go for a simple for loop:

for gcda_file in $(find target/cov -name \*.gcda -newer "$timing_file"); do
  declare gcno_file="${gcda_file%.gcda}.gcno"
  ln -s "../../../$gcda_file" "$data_dir/$(basename "$gcda_file")"
  ln -s "../../../$gcno_file" "$data_dir/$(basename "$gcno_file")"
done

This comment has been minimized.

Copy link
@ryoqun

ryoqun Dec 23, 2019

Author Member

No problem. :) That's a natural question. In short, I wanted to avoid another shellcheck exception. (SC2044). And among various alternative loop constructions, I ended to choose this particular one based on existing other uses of find in this script after some enduring time for compat. thinking. ;)

As a bonus for this round of commenting, I've made it more robust: 97fe9d9.

Ideally, I wanted to write like this (but I can't due to the restriction of bash 3+) based on this:

ryoqun@ubuqun:~/work/solana/solana$ git diff
diff --git a/scripts/coverage.sh b/scripts/coverage.sh
index 7a0df653a..3b985712f 100755
--- a/scripts/coverage.sh
+++ b/scripts/coverage.sh
@@ -51,12 +51,14 @@ mkdir -p target/cov/tmp
 
 # Can't use a simpler construct under the condition of SC2044 and bash 3
 # (macOS's default). See: https://github.com/koalaman/shellcheck/wiki/SC2044
-find target/cov -name \*.gcda -newer "$timing_file" -print0 |
-  (while IFS= read -r -d '' gcda_file; do
+(
+  shopt -s globstar nullglob
+  for gcda_file in target/cov/**/*.gcda; do
     gcno_file="${gcda_file%.gcda}.gcno"
     ln -s "../../../$gcda_file" "target/cov/tmp/$(basename "$gcda_file")"
     ln -s "../../../$gcno_file" "target/cov/tmp/$(basename "$gcno_file")"
-  done)
+  done
+)
 
 _ grcov target/cov/tmp > target/cov/lcov-full.info
Copy link
Member

mvines left a comment

Thank for chasing this down!

@ryoqun ryoqun changed the title Stabilize fn coverage by pruning all updated files Stabilize fn coverage by creating a clean room Dec 23, 2019
@ryoqun ryoqun force-pushed the ryoqun:fn-coverage-stabilization branch from f18c324 to 97fe9d9 Dec 23, 2019
@mergify mergify bot dismissed mvines’s stale review Dec 23, 2019

Pull request has been modified.

@@ -31,9 +31,12 @@ export RUST_MIN_STACK=8388608

echo "--- remove old coverage results"
if [[ -d target/cov ]]; then
find target/cov -name \*.gcda -print0 | xargs -0 rm -f
find target/cov -name \*.gcda -delete

This comment has been minimized.

Copy link
@ryoqun

ryoqun Dec 23, 2019

Author Member

Let's utilize our new findings: #7576 (comment)

@ryoqun ryoqun merged commit 141131f into solana-labs:master Dec 23, 2019
12 checks passed
12 checks passed
Summary 1 rule matches and 3 potential rules
Details
buildkite/solana Build #17036 passed (37 minutes, 55 seconds)
Details
buildkite/solana/bench Passed (2 seconds)
Details
buildkite/solana/checks Passed (1 minute, 48 seconds)
Details
buildkite/solana/coverage Passed (11 minutes, 57 seconds)
Details
buildkite/solana/local-cluster Passed (18 minutes, 2 seconds)
Details
buildkite/solana/move Passed (4 seconds)
Details
buildkite/solana/pipeline-upload Passed (11 seconds)
Details
buildkite/solana/shellcheck Passed (33 seconds)
Details
buildkite/solana/stable Passed (32 minutes, 58 seconds)
Details
buildkite/solana/stable-perf Passed (10 seconds)
Details
ci-gate Pull Request accepted for CI pipeline
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.