False negatives definition #239

Closed
audreymaniez opened this issue Jul 26, 2018 · 4 comments

@audreymaniez

In section 15.2, Accuracy Benchmarking, false positives and false negatives are defined as follows:

False positives: This is the percentage of test targets, that were failed by the rule, but were not failed by an accessibility expert.

False negatives: This is the percentage of test targets, that were passed by the rule, but were failed by an accessibility expert.

To be consistent with the definition of a false positive, I would say "not passed" instead of "failed" for false negatives, because a test target can be missed by the accessibility expert, who would then evaluate the rule as "inapplicable" rather than "failed".

Proposal:

False negatives: This is the percentage of test targets, that were passed by the rule, but were not passed by an accessibility expert.
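To make the difference between the two wordings concrete, here is a minimal sketch that computes each percentage over a small, made-up set of results. The outcome labels, the function names, and the choice of "all test targets" as the denominator are assumptions for illustration; the section quoted above does not spell them out.

```python
# Hypothetical outcome labels; the quoted section does not define them.
PASSED, FAILED, INAPPLICABLE = "passed", "failed", "inapplicable"


def rate(pairs, predicate):
    """Percentage of (rule, expert) outcome pairs that satisfy predicate.

    Assumption: the denominator is every test target in the sample.
    """
    if not pairs:
        return 0.0
    hits = sum(1 for rule, expert in pairs if predicate(rule, expert))
    return 100.0 * hits / len(pairs)


def false_positives(pairs):
    # Failed by the rule, but not failed by the expert.
    return rate(pairs, lambda rule, expert: rule == FAILED and expert != FAILED)


def false_negatives_current(pairs):
    # Current wording: passed by the rule, but failed by the expert.
    return rate(pairs, lambda rule, expert: rule == PASSED and expert == FAILED)


def false_negatives_proposed(pairs):
    # Proposed wording: passed by the rule, but not passed by the expert.
    return rate(pairs, lambda rule, expert: rule == PASSED and expert != PASSED)


# Made-up sample: each tuple is (rule outcome, expert outcome).
results = [
    (PASSED, PASSED),
    (PASSED, FAILED),
    (PASSED, INAPPLICABLE),
    (FAILED, PASSED),
]

print(false_positives(results))           # 25.0
print(false_negatives_current(results))   # 25.0
print(false_negatives_proposed(results))  # 50.0
```

The two false negative definitions differ only on the third target, which the rule passed and the expert judged inapplicable: the proposed wording counts it, the current wording does not.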

@WilcoFiers
Collaborator

@audreymaniez Thanks for the feedback. This is intentional. I would not consider it a false negative if something was passed by a rule and judged inapplicable by an accessibility expert. I don't think that's a noteworthy distinction.

@maryjom
Collaborator

maryjom commented Jul 31, 2018

More to Wilco's point - if an accessibility expert called it inapplicable, that is essentially equivalent to "pass" - which wouldn't be a false negative.

@WilcoFiers
Collaborator

Discussed during the call today. General consensus seems to be that the way the Accuracy Benchmark section is written up today doesn't quite work. None of the contributors measure accuracy in the way proposed. Instead, they rely on bug reports to identify false positives, and in some cases release "beta" versions of rules to try them out before finalising them.

The proposal on the table is to replace this section with a more generic section on how to approach rule quality. This should probably not be part of the normative document, but be placed in notes that can be referenced from the rules format.

@maryjom
Collaborator

maryjom commented Sep 6, 2018

Need a generic section talking about benchmarking, false positives, false negatives. Make it a "SHOULD" and link out to the Review Process document.

WilcoFiers added a commit that referenced this issue Sep 24, 2018
* Remove aspects MUST requirement #264

* Updated aspects based on feedback #257

* Make test cases required for all rules

* Change "local laws" to "laws" #231

* Add a paragraph on accessibility of rules #226

* Rewrote benchmark to non-normative section #236 #239 #163

* Consistency in rule-aggregation #266

* Tweaked accessibility support language #221

* Changed test subject from MUST to MAY #220

* Require rule IDs in atomic rules list #261

* Fix rule type example #230

* Add rule type to the rule structure #232

* Update from #274

* Add "satisfy" explanation for Rules to SCs. #227

* Example to "satisfy" WCAG SCs #250

* Scrub document to ensure correct use of "should" and "may" #267

* Break up the PR

nitedog added For CR and removed For CR labels Mar 29, 2019