False negatives definition #239

audreymaniez · 2018-07-26T08:16:52Z

In section 15.2, Accuracy Benchmarking, false positives and false negatives are defined as follows:

False positives: This is the percentage of test targets, that were failed by the rule, but were not failed by an accessibility expert.

False negatives: This is the percentage of test targets, that were passed by the rule, but were failed by an accessibility expert.

To be consistent with the definition of a false positive, should the false negatives definition say "not passed" instead of "failed"? A test target can be missed by the accessibility expert, who would then evaluate the rule as "inapplicable" rather than "failed".

Proposal:

False negatives: This is the percentage of test targets, that were passed by the rule, but were not passed by an accessibility expert.

WilcoFiers · 2018-07-30T14:01:57Z

@audreymaniez Thanks for the feedback. This is intentional. I would not consider it a false negative if something was passed by a rule and judged inapplicable by an accessibility expert. I don't think that's a noteworthy distinction.

maryjom · 2018-07-31T20:33:33Z

More to Wilco's point - if an accessibility expert called it inapplicable, that is essentially equivalent to "pass" - which wouldn't be a false negative.
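To make the difference between the two wordings concrete, here is a minimal sketch that computes both percentages over paired outcomes. The function name `benchmark`, the `expert_must_fail` flag, and the choice of all test targets as the denominator are assumptions made for illustration; the section under discussion does not specify any of them.

```python
from typing import Iterable, Tuple

# Each pair is (rule outcome, expert outcome) for one test target.
# Outcomes are assumed to be "passed", "failed" or "inapplicable".
Pair = Tuple[str, str]


def benchmark(pairs: Iterable[Pair], expert_must_fail: bool = True) -> Tuple[float, float]:
    """Return (false positive %, false negative %) over all test targets.

    expert_must_fail=True follows the current wording: a false negative
    needs the expert to have failed the target.
    expert_must_fail=False follows the proposed wording: anything the
    expert did not pass (including "inapplicable") counts.
    """
    pairs = list(pairs)
    total = len(pairs)
    if total == 0:
        raise ValueError("no test targets to benchmark")

    # Failed by the rule, but not failed by the expert.
    false_pos = sum(1 for rule, expert in pairs
                    if rule == "failed" and expert != "failed")

    if expert_must_fail:
        # Passed by the rule, but failed by the expert.
        false_neg = sum(1 for rule, expert in pairs
                        if rule == "passed" and expert == "failed")
    else:
        # Passed by the rule, but not passed by the expert.
        false_neg = sum(1 for rule, expert in pairs
                        if rule == "passed" and expert != "passed")

    return 100 * false_pos / total, 100 * false_neg / total


sample = [
    ("passed", "passed"),
    ("passed", "failed"),
    ("passed", "inapplicable"),  # the case this issue is about
    ("failed", "passed"),
]
print(benchmark(sample))                          # current wording
print(benchmark(sample, expert_must_fail=False))  # proposed wording
```

With this sample, only the third target is counted differently: the current wording ignores it, while the proposed wording counts it as a false negative.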

WilcoFiers · 2018-08-20T12:03:44Z

Discussed during the call today. General consensus seems to be that the way the Accuracy Benchmark section is written up today doesn't quite work. None of the contributors measure accuracy in the way proposed. Instead, they rely on bug reports to find false positives, and in some cases make "beta" versions of rules to try them out before finalising them.

The proposal on the table is to replace this section with a more generic section on how to approach rule quality. This should probably not be part of the normative document but be placed in notes that can be referenced from the rules format.

maryjom · 2018-09-06T13:22:38Z

Need a generic section talking about benchmarking, false positives, false negatives. Make it a "SHOULD" and link out to the Review Process document.

* Remove aspects MUST requirement #264
* Updated aspects based on feedback #257
* Make test cases required for all rules
* Change "local laws" to "laws" #231
* Add a paragraph on accessibility of rules #226
* Rewrote benchmark to non-normative section #236 #239 #163
* Consistency in rule-aggregation #266
* Tweaked accessibility support language #221
* Changed test subject from MUST to MAY #220
* Require rule IDs in atomic rules list #261
* Fix rule type example #230
* Add rule type to the rule structure #232
* Update from #274
* Add "satify" explanation for Rules to SCs. #227
* Example to "satisfy" WCAG SCs #250
* Scrub document to ensure correct use of "should" and "may" #267
* Break up the PR

WilcoFiers added the discussion topic label Aug 21, 2018

WilcoFiers mentioned this issue Aug 21, 2018

15.2. Accuracy Benchmarking - Update definition #236

Closed

WilcoFiers added a commit that referenced this issue Sep 21, 2018

Rewrote benchmark to non-normative section #236 #239 #163

315420c

WilcoFiers mentioned this issue Sep 21, 2018

Bunch of editorial edits #273

Merged

WilcoFiers closed this as completed in #273 Sep 24, 2018

WilcoFiers mentioned this issue Sep 24, 2018

Update rule accuracy section #277

Closed

nitedog added For CR and removed For CR labels Mar 29, 2019
