Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stabilize fn coverage by creating a clean room #7576

Merged
merged 5 commits into from
Dec 23, 2019

Conversation

ryoqun
Copy link
Member

@ryoqun ryoqun commented Dec 19, 2019

Problem

Coverage report is lying (partly). So, increasing coverage isn't fun. ;)

Summary of Changes

Casual quick inspection and my hunch says, this might fix the issue. :p I'll add the findings as a comment later. First, let the CI spin. :)

Found in the middle of #7546

echo "Reverting files"
find target/cov -newer target/cov/build-finished -type f -print -delete
echo "Reverting (empty) directories"
find target/cov -newer target/cov/build-finished -type d -empty -print -delete
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After quick googling, I bet -delete and -newer is available on macOS. Dunno why the current code uses print0-and-xargs trick needlessly.

Copy link
Member Author

@ryoqun ryoqun Dec 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a matter of fact, this can't be replaced with the trick because we want to recursively remove directories which might get empty while doing search & destroy. The trick is effectively a BFS, and this find is effectively a DFS.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed macOS find has -delete and -newer

@mvines
Copy link
Member

mvines commented Dec 19, 2019

Cool I hope this helps! 🤞

source ci/rust-version.sh nightly

RUST_LOG=solana=trace _ cargo +$rust_nightly test --target-dir target/cov --no-run "${packages[@]}"
touch target/cov/build-finished
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new logic is simpler: Remove 'em all files and directories written after the build including any coverage data and report file. Just all of them.

@codecov
Copy link

codecov bot commented Dec 20, 2019

Codecov Report

Merging #7576 into master will increase coverage by 9.8%.
The diff coverage is n/a.

@@           Coverage Diff            @@
##           master   #7576     +/-   ##
========================================
+ Coverage    71.8%   81.7%   +9.8%     
========================================
  Files         244     244             
  Lines       56626   49807   -6819     
========================================
+ Hits        40680   40706     +26     
+ Misses      15946    9101   -6845

@t-nelson
Copy link
Contributor

This has to be better than the full cargo clean I usually do when coverage looks weird 😅

@ryoqun
Copy link
Member Author

ryoqun commented Dec 21, 2019

After painful trials, I've found pruning doesn't work.

The somewhat-complete detailed investigation report is here. :)

The excerpt for the pinned-down culprit of our diverged coverage reports:

Good:

Current view: 	top level - merkle-tree/src - merkle_tree.rs (source / functions) 			Hit 	Total 	Coverage
Test: 	lcov.info 		Lines: 	139 	153 	90.8 %
Date: 	2019-12-18 14:49:55 		Functions: 	34 	***57*** 	59.6 %
Legend: 	Lines: hit not hit 	

Bad:

Current view: 	top level - merkle-tree/src - merkle_tree.rs (source / functions) 			Hit 	Total 	Coverage
Test: 	lcov.info 		Lines: 	139 	153 	90.8 %
Date: 	2019-12-18 10:48:20 		Functions: 	34 	***114*** 	29.8 %
Legend: 	Lines: hit not hit 	

As shown above, the bad coverage report double-counts the total number of functions. This seems to happen for many files, not alone this.

This is caused by the existence of stale .gcno files, which accumulates over time in the build dir and which isn't removed by coverage.sh.

My guess is that multiple stale gcno files cause grcov to mistakenly double-counts fn definitions. No wonder for the doubled count if it can't see these are latest and stale versions and sees different object files.

Simply removing them doesn't help as well. Removing would cause grcov to fail. Because it won't be created by cargo test --no-run later. And that file is mandatory to generate a coverage report. Also, randomly removing .gcno files (like was with pruning) would make our shared? build dir a bad state, which should be expected to be consumed at various points of the git history...

Ideally, grcov should ignore old gcno files. But it don't, as it would require it to tap into cargo's build artifact file's arcane caching logic.

So, drawing from these tearful and restraining findings, as the last resort, let's stash updated and relevant coverage data files into a newly-created empty dir...

(Sorry bad explanation; I'm getting tired... :p)

find target/cov -name \*.gcda -newer "$timing_file" |
(while read -r gcda_file; do
gcno_file="${gcda_file%.gcda}.gcno"
ln -s "../../../$gcda_file" "$data_dir/$(basename "$gcda_file")"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess realpath isn't available on macOS...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah realpath is a GNU program that requires an extra install for macOS. We can require it if absolutely needed but that's not ideal (another barrier for entry)

Copy link
Member

@mvines mvines Dec 23, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest just removing the data_dir variable and directly embedding target/cov/tmp everywhere data_dir is used. The two ln command aren't generic enough to handle any modifications to data_dir due to the ../../.. so the variable doesn't really help much (and in fact could hurt a future person who attempts to modify this file).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you have a nice point!: 690a7b3

rm -rf "$data_dir"
mkdir -p "$data_dir"

find target/cov -name \*.gcda -newer "$timing_file" |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious why you didn't go for a simple for loop:

for gcda_file in $(find target/cov -name \*.gcda -newer "$timing_file"); do
  declare gcno_file="${gcda_file%.gcda}.gcno"
  ln -s "../../../$gcda_file" "$data_dir/$(basename "$gcda_file")"
  ln -s "../../../$gcno_file" "$data_dir/$(basename "$gcno_file")"
done

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem. :) That's a natural question. In short, I wanted to avoid another shellcheck exception. (SC2044). And among various alternative loop constructions, I ended to choose this particular one based on existing other uses of find in this script after some enduring time for compat. thinking. ;)

As a bonus for this round of commenting, I've made it more robust: 97fe9d9.

Ideally, I wanted to write like this (but I can't due to the restriction of bash 3+) based on this:

ryoqun@ubuqun:~/work/solana/solana$ git diff
diff --git a/scripts/coverage.sh b/scripts/coverage.sh
index 7a0df653a..3b985712f 100755
--- a/scripts/coverage.sh
+++ b/scripts/coverage.sh
@@ -51,12 +51,14 @@ mkdir -p target/cov/tmp
 
 # Can't use a simpler construct under the condition of SC2044 and bash 3
 # (macOS's default). See: https://github.com/koalaman/shellcheck/wiki/SC2044
-find target/cov -name \*.gcda -newer "$timing_file" -print0 |
-  (while IFS= read -r -d '' gcda_file; do
+(
+  shopt -s globstar nullglob
+  for gcda_file in target/cov/**/*.gcda; do
     gcno_file="${gcda_file%.gcda}.gcno"
     ln -s "../../../$gcda_file" "target/cov/tmp/$(basename "$gcda_file")"
     ln -s "../../../$gcno_file" "target/cov/tmp/$(basename "$gcno_file")"
-  done)
+  done
+)
 
 _ grcov target/cov/tmp > target/cov/lcov-full.info

mvines
mvines previously approved these changes Dec 23, 2019
Copy link
Member

@mvines mvines left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank for chasing this down!

@ryoqun ryoqun changed the title Stabilize fn coverage by pruning all updated files Stabilize fn coverage by creating a clean room Dec 23, 2019
@mergify mergify bot dismissed mvines’s stale review December 23, 2019 06:28

Pull request has been modified.

@@ -31,9 +31,12 @@ export RUST_MIN_STACK=8388608

echo "--- remove old coverage results"
if [[ -d target/cov ]]; then
find target/cov -name \*.gcda -print0 | xargs -0 rm -f
find target/cov -name \*.gcda -delete
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's utilize our new findings: #7576 (comment)

@ryoqun ryoqun merged commit 141131f into solana-labs:master Dec 23, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants