Dart VM code coverage reports may include false negatives #42061

rmacnak-google · 2020-05-26T19:16:11Z

Code coverage reports from the VM are based on invocation counters on functions (Function) and call sites (ICData). If the invocation counter is non-zero, the position is a hit, otherwise a miss. When code coverage was originally added, the optimizer was much simpler. A call site would not be inlined unless it was executed, so every call site that ever executed would increment the ICData usage counter before being optimized away (probably modulo a small set or recognized static functions). Since then, the optimizer has gotten smarter and may inline calls with a zero usage count (during AOT compilation all usage counts are zero). The language has also changed to enforced static types, giving the optimizer an opportunity to inline away calls before any type feedback is collected. If the VM inlines away a call site before its first invocation, code coverage will report it as a miss. This means currently the only way to get accurate coverage data is to run with the VM with the optimizer disabled, --optimization_counter_threshold=-1.

This comes a heavy performance cost. We should add a flag that specifically disables only the optimizations that interfere with code coverage: inlining or otherwise replacing call sites with a zero usage count.

@bkonyi @MichaelRFairhurst

The text was updated successfully, but these errors were encountered:

mraleph · 2020-05-27T07:30:14Z

I think we should consider emitting coverage collecting instructions into the graph (similar to DebugBreak) and decouple it from an already existing machinery - I think this might simplify both coverage and compiler itself.

Optimiser then could contain a very simple canonicalisation rule which removes coverage instructions which already have triggered.

rmacnak-google · 2023-01-18T21:02:13Z

@liamappelbe

rmacnak-google · 2023-01-18T21:55:34Z

The situation is probably somewhat improved after, i.e., 8896894 and 2eb4ea5, though I'm not sure if this handles everything that might be optimized away.

stereotype441 · 2024-06-04T20:44:20Z

Note that this recently game up in https://dart-review.googlesource.com/c/sdk/+/368920. See analysis from @jensjoha in this comment: https://dart-review.googlesource.com/c/sdk/+/368920/comment/74b0ce83_a454627a/

jensjoha · 2024-06-06T08:46:42Z

8896894 doesn't really work either because (I think) RecordCoverageInstr goes away with optimization --- see https://dart-review.googlesource.com/c/sdk/+/369920

/cc @mraleph

jensjoha · 2024-06-07T13:31:08Z

https://dart-review.googlesource.com/c/sdk/+/370220 could make it not go away with optimization.

Dead Code Elimination can remove `RecordCoverageInstr`, but shouldn't. This CL makes it stay by returning true in `HasUnknownSideEffects` making `MayHaveVisibleEffect` return true, making Dead Code Elimination keep it. Note that `RecordCoverageInstr::Canonicalize` will still let it go away if the position is already covered which is what we want. Bug: #42061 TEST=pkg/vm_service/test/coverage_closure_call_after_optimization_test.dart,vm/cc/IL_RecordCoverageSurvivesOptimizations Change-Id: Ifd72f9071a51924fd71f3dae91687acb1467047d Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/370220 Reviewed-by: Slava Egorov <vegorov@google.com> Commit-Queue: Jens Johansen <jensj@google.com>

liamappelbe · 2024-06-10T21:30:28Z

We already insert some of these instructions behind the --branch-coverage flag. So I guess what we're proposing here is to add the instruction in way more places, to totally replace the ICData based coverage logic.

We'll still want to gate this behind a flag, because the extra instructions will significantly bloat the binary. So either way, users will have to pass a VM flag to collect coverage. The only difference between a --coverage flag and the existing --optimization_counter_threshold=-1 flag is that --coverage will have a much smaller performance impact.

jensjoha · 2024-06-11T06:15:44Z

Also worth noting is that --optimization_counter_threshold=-1 for whatever reason does not produce the right output.

mraleph · 2024-06-11T06:25:29Z

We'll still want to gate this behind a flag, because the extra instructions will significantly bloat the binary.

It's unclear if that really matters that much. I think the only binary we could potentially care about is app-jit snapshot size - so we could add a flag which we would disable during app-jit training, but keep enabled by default otherwise. JIT snapshots for tools are on their way out anyway, so it's a really niche thing.

The size of unoptimized code might be of some consideration (we should probably measure the impact), but I am not sure if it is worth worrying about it too much (and it is already rather bloated).

liamappelbe · 2024-06-11T22:43:55Z

It's unclear if that really matters that much.

I think I've missed something. Are these coverage instructions being removed in AOT builds somehow?

mraleph · 2024-06-12T01:40:56Z

It's unclear if that really matters that much.

I think I've missed something. Are these coverage instructions being removed in AOT builds somehow?

Right now they are never emitted in AOT in the first place. See SupportsCoverage:

sdk/runtime/vm/compiler/frontend/base_flow_graph_builder.cc

Lines 24 to 30 in d0998d6

    
           static bool SupportsCoverage() { 
        
           #if defined(PRODUCT) 
        
             return false; 
        
           #else 
        
             return !CompilerState::Current().is_aot(); 
        
           #endif 
        
           }

FWIW AOT builds don't use ICs for method dispatching and all the code is specialized and optimized - so you can't collect good coverage data with the current approach anyway. With RecordCoverage instructions we can actually support collecting coverage from AOT builds if we wanted. (and if that's indeed something that we wanted we could also do few local optimizations to significantly limit code size blow out as well)

Before this CL, becase of optimizations, coverage of some of these goes away, making coverage unreliable. I believe this fixes the issues for "regular" runs (at least it seems to be stable on the CFE coverage tests). If setting `--optimization-counter-threshold=-1` there'll still be trouble though and we would have to also insert these calls in the start of FunctionBody and the start of FieldInitializer for it to produce the same results. TEST=pkg/vm_service/test/coverage_instance_call_after_optimization_test.dart,pkg/vm_service/test/coverage_static_call_after_optimization_test.dart Bug: #42061 Bug: #55959 Bug: #56018 Change-Id: I34947f0d4b123e52ce67b71a195782d31e4bda16 Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/370501 Commit-Queue: Jens Johansen <jensj@google.com> Reviewed-by: Slava Egorov <vegorov@google.com>

rmacnak-google added area-vm Use area-vm for VM related issues, including code coverage, FFI, and the AOT and JIT backends. vm-regression labels May 26, 2020

jensjoha mentioned this issue Sep 14, 2020

VM coverage is unreliable #43405

Closed

rmacnak-google mentioned this issue Oct 19, 2022

Field guards from unreachable code can be moved to reachable code and can cause repeated deoptimization #50245

Closed

alexmarkov mentioned this issue Jun 26, 2024

Coverage reports differ when --optimization-counter-threshold=-1 is used #56075

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Dart VM code coverage reports may include false negatives #42061

Dart VM code coverage reports may include false negatives #42061

rmacnak-google commented May 26, 2020

mraleph commented May 27, 2020

rmacnak-google commented Jan 18, 2023

rmacnak-google commented Jan 18, 2023

stereotype441 commented Jun 4, 2024

jensjoha commented Jun 6, 2024

jensjoha commented Jun 7, 2024

liamappelbe commented Jun 10, 2024

jensjoha commented Jun 11, 2024

mraleph commented Jun 11, 2024

liamappelbe commented Jun 11, 2024

mraleph commented Jun 12, 2024 •

edited

Loading

Dart VM code coverage reports may include false negatives #42061

Dart VM code coverage reports may include false negatives #42061

Comments

rmacnak-google commented May 26, 2020

mraleph commented May 27, 2020

rmacnak-google commented Jan 18, 2023

rmacnak-google commented Jan 18, 2023

stereotype441 commented Jun 4, 2024

jensjoha commented Jun 6, 2024

jensjoha commented Jun 7, 2024

liamappelbe commented Jun 10, 2024

jensjoha commented Jun 11, 2024

mraleph commented Jun 11, 2024

liamappelbe commented Jun 11, 2024

mraleph commented Jun 12, 2024 • edited Loading

mraleph commented Jun 12, 2024 •

edited

Loading