Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optionally extend branch coverage to instrument the last operand of lazy boolean expressions #124120

Closed
RenjiSann opened this issue Apr 18, 2024 · 10 comments · Fixed by #125756
Closed
Labels
A-code-coverage Area: Source-based code coverage (-Cinstrument-coverage)

Comments

@RenjiSann
Copy link
Contributor

This issue aims to be a discussion about the implementation of branch coverage, to follow up the discussions in #79649 and #124118.

In the following code example, the current implementation of branch coverage does not instrument the outcome of b in the indirect function, because it has no impact on the control flow of the program and is just assigned to v. a, however, creates a conditional evaluation of b because of the short circuits semantics and thus creates a branch in the control flow.

fn direct(a: bool, b: bool) {
    if a || b { 
        println!("success");
    }
}

fn indirect(a: bool, b: bool) {
    let v = a || b;
    if v { 
        println!("success");
    }
}

#[coverage(off)]
fn main() {
    direct(true, false);
    direct(false, false);

    indirect(true, false);
    indirect(false, false);
}

Even though this instrumentation is "branch coverage" according to the definition, there are several reasons for which one would expect b to be instrumented as well :

  • The current behavior may be found counter-intuitive, as illustrated by the question in Add support for branch coverage in source-based coverage #79649.
  • This is the behavior adopted in clang/LLVM.
  • MC/DC coverage instrumentation depends on branch coverage, and in certification context, MC/DC coverage implies that all the conditions in all the boolean expressions with 2+ operands are instrumented.

@Zalathar gave pertinent reasons on why the "instrument b" behavior might not suit everyone's needs, one being that it implies inserting branches in the MIR that are not here by default. Roughly speaking, this would imply that the MIR of the expression would look like the one of let x = if a || b { true } else { false }; rather than let x = if a { b } else { false };.


In order to give the choice to the user, I think we could implement this behavior under another compilation flags: -Z coverage-options=condition.
This idea is that condition coverage instruments the outcome of each condition inside boolean expressions:

  • when the expression has 2+ conditions and is outside of a control flow structure
  • anytime the expression is in a control flow structure.

Basically, this is branch coverage with the special case of b being instrumented.

In terms of implementation, I think the easiest would be to make condition imply branch coverage, and treat the special case only when condition is enabled.

@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Apr 18, 2024
@Zalathar
Copy link
Contributor

@rustbot label +A-code-coverage

@rustbot rustbot added the A-code-coverage Area: Source-based code coverage (-Cinstrument-coverage) label Apr 18, 2024
@Zalathar
Copy link
Contributor

There's a similar concern for some other boolean expressions that don't directly participate in a branch. For example:

// Non-lazy logical operators:
a | b
a & b

// Branch hidden in a standard-library function:
b.then_some(x)

So any mode that tries to instrument boolean expressions probably needs to consider whether/how to handle cases like these, too.

@Zalathar
Copy link
Contributor

In terms of implementation, I think the easiest would be to make condition imply branch coverage, and treat the special case only when condition is enabled.

When I originally designed -Zcoverage-options, I imagined that it would mostly consist of individual boolean flags.

But with branch coverage, I think what we're seeing is closer to an enum-valued setting with multiple levels, e.g.:

  • branch=off
  • branch=basic (alias: branch, branch=on)
  • branch=condition (alias: condition)
  • branch=mcdc (alias: mcdc)

That might relieve the awkwardness of having so many “independent” flags that actually do depend on each other.

@RenjiSann RenjiSann changed the title Optionnaly extend branch coverage to instrument the last operand of lazy boolean expressions Optionally extend branch coverage to instrument the last operand of lazy boolean expressions Apr 19, 2024
@Lambdaris
Copy link

Lambdaris commented Apr 21, 2024

Based on #124217 we could accept it in mcdc first as they are required on the same occasion mostly.
If we do that we probably get different results of llvm-cov --show-branches=count between --coverage-options=branch and --coverage-options=mcdc.
What's more, since LLVM also produces branch coverage reports with MappingKind::MCDCBranch, we can view --coverage-options=mcdc as another "branch coverage" and make them mutually exclusive temporarily.

@RenjiSann
Copy link
Contributor Author

I am unsure about how this should be implemented.
Shall we

  1. insert an actual "useless" branch and instrument it no matter what;
  2. or maybe be we could read the assigned result and update the corresponding counter/condition bitmap for MCDC.

Solution 1. is probably the simplest to implement, at the cost of small runtime bloat.
Solution 2. may involve harder implementations of the next steps.

I would rather go with solution 1, because again, slowdown is expected when instrumenting for code coverage

@RenjiSann
Copy link
Contributor Author

In terms of implementation, I think the easiest would be to make condition imply branch coverage, and treat the special case only when condition is enabled.

When I originally designed -Zcoverage-options, I imagined that it would mostly consist of individual boolean flags.

But with branch coverage, I think what we're seeing is closer to an enum-valued setting with multiple levels, e.g.:

  • branch=off
  • branch=basic (alias: branch, branch=on)
  • branch=condition (alias: condition)
  • branch=mcdc (alias: mcdc)

That might relieve the awkwardness of having so many “independent” flags that actually do depend on each other.

Yes, we definitely need a better way to organize all the options.
However, I am not convinced by the name branch=*, as MC/DC and Branch coverage are clearly different things.

Maybe we could use -Z coverage-level=(statement (default) | branch | condition | mcdc), and -Z coverage-options= could be used for things like no-pattern-matching

@saethlin saethlin removed the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Apr 28, 2024
bors added a commit to rust-lang-ci/rust that referenced this issue Apr 30, 2024
…errors

coverage: Replace boolean options with a `CoverageLevel` enum

After rust-lang#123409, and some discussion at rust-lang#79649 (comment) and rust-lang#124120, it became clear to me that we should have a unified concept of “coverage level”, instead of having several separate boolean flags that aren't actually independent.

This PR therefore introduces a `CoverageLevel` enum, to replace the existing boolean flags for `branch` and `mcdc`.

The `no-branch` value (for `-Zcoverage-options`) has been renamed to `block`, instructing the compiler to only instrument for block coverage, with no branch coverage or MD/DC instrumentation.

`@rustbot` label +A-code-coverage
cc `@ZhuUx` `@Lambdaris` `@RenjiSann`
@Zalathar
Copy link
Contributor

I still don't understand the motivation for wanting to instrument the RHS of lazy boolean operators specifically, but then not also instrument other boolean-valued expressions.

Because if instrumenting all booleans is the goal, then I don't think focusing on lazy booleans specifically is a good intermediate step.

@RenjiSann
Copy link
Contributor Author

I still don't understand the motivation for wanting to instrument the RHS of lazy boolean operators specifically, but then not also instrument other boolean-valued expressions.

Because if instrumenting all booleans is the goal, then I don't think focusing on lazy booleans specifically is a good intermediate step.

Condition coverage would really only be meant as a building block for MC/DC.

The MC/DC criterion only makes sense for boolean expressions with at least two operands, so tracking branch coverage for simple boolean expressions isn't really relevant.
We do need to track the valuation of all operands, however, as otherwise it would be too easy to cheese the criteria (even for branch coverage). Instead of requiring 4 tests to show full MC/DC coverage (and same for branch coverage, due to the short circuit operators) for

if (a && b && c) { ... }

We could rewrite this as `

let tmp1 = a && b;
let tmp2 = tmp1 && c;
if (tmp2) { ... }

Then, by only testing with a == True and a == False, all other variables being True we'd get 100% branch coverage.

The reason why we don't want to instrument all booleans (including simple assignements like let x: bool = a) is because it produces an overwhelming quantity of coverage information which is not very relevant for certification contexts

For example:

fn foo(a: bool) {
	let x = a;  // 2. check
	if x {      // 3. check
		...
	}
}

fn main() {
	let res = foo(true); // 1. check
}

In this example. if we were to instrument all boolean values, we would end up with 3 tests for the exact same value, when the 3rd one would be enough.

Regarding the instrumentation of short-circuit operators only, there are seemingly 2 reasons for that:

  • llvm-cov builds the expressions BDD (binary decision diagram) from the mappings present in the binary. From what I understood, it assumes that operators are short-circuits, and will consider short-circuited conditions as DONT_CARE when performing analysis, no matter what the actual operators are. I am not sure how this can interfere with the actual correctness of the analysis.
  • The MC/DC criterion is easier to fulfill for short-circuited expressions, for the DONT_CARE reason. Also, legacy MCDC analysis solutions (including gnatcov) were based on tracing the branches in assembly at runtime, thus it was only possible to track the execution of short-circuit operands. Also, these solutions needed a -fpreserve-control-flow flag to be passed to the compiler to instrument the last conditions as well.

@Zalathar
Copy link
Contributor

Zalathar commented May 3, 2024

Condition coverage would really only be meant as a building block for MC/DC.

I've been thinking some more about whether we should have a separate condition mode at all (instead of just having it be a part of mcdc), and I think my conclusion for now is that I'm reluctantly OK with it.

My main concern is that I want to avoid a proliferation of modes that complicate and weaken testing, especially if they'll be contentious to remove later on.

On the other hand, I do recognise the implementation benefits of having a separate tier on a temporary basis, so that it can be developed and tested separately from the full complexity of MC/DC, without affecting the main branch coverage mode.

So I'd like to suggest that we primarily think of the condition mode as a temporary implementation detail that will eventually go away once the MC/DC mode is more mature, unless there turns out to be real end-user demand for it.

Does that sound OK?

@RenjiSann
Copy link
Contributor Author

Does that sound OK?

Yes that sounds good. No need to keep condition AND branch coverage available to the user in the long term, as it will likely create confusion.

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue May 31, 2024
coverage: Optionally instrument the RHS of lazy logical operators

(This is an updated version of rust-lang#124644 and rust-lang#124402. Fixes rust-lang#124120.)

When `||` or `&&` is used outside of a branching context (such as the condition of an `if`), the rightmost value does not directly influence any branching decision, so branch coverage instrumentation does not treat it as its own true-or-false branch.

That is a correct and useful interpretation of “branch coverage”, but might be undesirable in some contexts, as described at rust-lang#124120. This PR therefore adds a new coverage level `-Zcoverage-options=condition` that behaves like branch coverage, but also adds additional branch instrumentation to the right-hand-side of lazy boolean operators.

---

As discussed at rust-lang#124120 (comment), this is mainly intended as an intermediate step towards fully-featured MC/DC instrumentation. It's likely that we'll eventually want to remove this coverage level (rather than stabilize it), either because it has been incorporated into MC/DC instrumentation, or because it's getting in the way of future MC/DC work. The main appeal of landing it now is so that work on tracking conditions can proceed concurrently with other MC/DC-related work.

`@rustbot` label +A-code-coverage
jieyouxu added a commit to jieyouxu/rust that referenced this issue May 31, 2024
coverage: Optionally instrument the RHS of lazy logical operators

(This is an updated version of rust-lang#124644 and rust-lang#124402. Fixes rust-lang#124120.)

When `||` or `&&` is used outside of a branching context (such as the condition of an `if`), the rightmost value does not directly influence any branching decision, so branch coverage instrumentation does not treat it as its own true-or-false branch.

That is a correct and useful interpretation of “branch coverage”, but might be undesirable in some contexts, as described at rust-lang#124120. This PR therefore adds a new coverage level `-Zcoverage-options=condition` that behaves like branch coverage, but also adds additional branch instrumentation to the right-hand-side of lazy boolean operators.

---

As discussed at rust-lang#124120 (comment), this is mainly intended as an intermediate step towards fully-featured MC/DC instrumentation. It's likely that we'll eventually want to remove this coverage level (rather than stabilize it), either because it has been incorporated into MC/DC instrumentation, or because it's getting in the way of future MC/DC work. The main appeal of landing it now is so that work on tracking conditions can proceed concurrently with other MC/DC-related work.

``@rustbot`` label +A-code-coverage
jieyouxu added a commit to jieyouxu/rust that referenced this issue May 31, 2024
coverage: Optionally instrument the RHS of lazy logical operators

(This is an updated version of rust-lang#124644 and rust-lang#124402. Fixes rust-lang#124120.)

When `||` or `&&` is used outside of a branching context (such as the condition of an `if`), the rightmost value does not directly influence any branching decision, so branch coverage instrumentation does not treat it as its own true-or-false branch.

That is a correct and useful interpretation of “branch coverage”, but might be undesirable in some contexts, as described at rust-lang#124120. This PR therefore adds a new coverage level `-Zcoverage-options=condition` that behaves like branch coverage, but also adds additional branch instrumentation to the right-hand-side of lazy boolean operators.

---

As discussed at rust-lang#124120 (comment), this is mainly intended as an intermediate step towards fully-featured MC/DC instrumentation. It's likely that we'll eventually want to remove this coverage level (rather than stabilize it), either because it has been incorporated into MC/DC instrumentation, or because it's getting in the way of future MC/DC work. The main appeal of landing it now is so that work on tracking conditions can proceed concurrently with other MC/DC-related work.

```@rustbot``` label +A-code-coverage
@bors bors closed this as completed in 7667a91 May 31, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this issue May 31, 2024
Rollup merge of rust-lang#125756 - Zalathar:branch-on-bool, r=oli-obk

coverage: Optionally instrument the RHS of lazy logical operators

(This is an updated version of rust-lang#124644 and rust-lang#124402. Fixes rust-lang#124120.)

When `||` or `&&` is used outside of a branching context (such as the condition of an `if`), the rightmost value does not directly influence any branching decision, so branch coverage instrumentation does not treat it as its own true-or-false branch.

That is a correct and useful interpretation of “branch coverage”, but might be undesirable in some contexts, as described at rust-lang#124120. This PR therefore adds a new coverage level `-Zcoverage-options=condition` that behaves like branch coverage, but also adds additional branch instrumentation to the right-hand-side of lazy boolean operators.

---

As discussed at rust-lang#124120 (comment), this is mainly intended as an intermediate step towards fully-featured MC/DC instrumentation. It's likely that we'll eventually want to remove this coverage level (rather than stabilize it), either because it has been incorporated into MC/DC instrumentation, or because it's getting in the way of future MC/DC work. The main appeal of landing it now is so that work on tracking conditions can proceed concurrently with other MC/DC-related work.

````@rustbot```` label +A-code-coverage
fmease added a commit to fmease/rust that referenced this issue Jun 19, 2024
…l, r=nnethercote

MCDC Coverage: instrument last boolean RHS operands from condition coverage

Fresh PR from rust-lang#124652

--

This PR ensures that the top-level boolean expressions that are not part of the control flow are correctly instrumented thanks to condition coverage.

See discussion on rust-lang#124120.
Depends on `@Zalathar` 's condition coverage implementation rust-lang#125756.
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Jun 19, 2024
Rollup merge of rust-lang#125766 - RenjiSann:fresh-mcdc-branch-on-bool, r=nnethercote

MCDC Coverage: instrument last boolean RHS operands from condition coverage

Fresh PR from rust-lang#124652

--

This PR ensures that the top-level boolean expressions that are not part of the control flow are correctly instrumented thanks to condition coverage.

See discussion on rust-lang#124120.
Depends on `@Zalathar` 's condition coverage implementation rust-lang#125756.
github-actions bot pushed a commit to rust-lang/miri that referenced this issue Jun 21, 2024
…hercote

MCDC Coverage: instrument last boolean RHS operands from condition coverage

Fresh PR from #124652

--

This PR ensures that the top-level boolean expressions that are not part of the control flow are correctly instrumented thanks to condition coverage.

See discussion on rust-lang/rust#124120.
Depends on `@Zalathar` 's condition coverage implementation #125756.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-code-coverage Area: Source-based code coverage (-Cinstrument-coverage)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants