
Validity of independent review (retitled) #1622

Open
DavidMacDonald opened this issue Feb 6, 2021 · 17 comments
Labels: Challenges with Conformance (Issues relating to the document at https://w3c.github.io/wcag/conformance-challenges/)

Comments

@DavidMacDonald (Contributor) commented Feb 6, 2021

The document cites a "Brajnick et al., 2012" study which is not provided and is not in the reference list, and based on one bullet point it quotes from that study, it states:

"This means that an 80% target for agreement, when audits are conducted without communication between evaluators, is not attainable, even with experienced evaluators." link to quote in Challenges doc

I think this is unnecessarily disparaging to WCAG, and this sentence should be removed. I recently evaluated an international site: I was in Canada, and another professional in Paris conducted an evaluation of the same pages without any communication. Our agreement was strong, much higher than 80%.

@alastc (Contributor) commented Feb 6, 2021

Hi @DavidMacDonald, sorry, which document?

@alastc added the "Challenges with Conformance" label Feb 7, 2021
@bruce-usab (Contributor) commented Feb 8, 2021

That's an important catch, @DavidMacDonald, so I hope you can track down the right GitHub page for suggesting an edit!

The "not attainable" conclusion (as represented in the Challenges doc) is just factually incorrect, because (1) I am pretty sure it misrepresents (mathematically) what we as WG members understand as 80% reliability for inter-rater agreement, and (2) as currently stated it is disproven by a single counter-example (such as the one you provided).

I will note that a core motivation for the methodology underpinning DHS Trusted Tester is to have a repeatable process that is as unambiguous as humanly possible. They aim for much, much higher agreement on any one test than 80%! Moreover, the TT credentialing process is aimed at allowing inexperienced evaluators to achieve high inter-rater reliability.

Finally, I would note that the ACT rules aim for 100% inter-rater reliability.

I would love to read the article though.

EDIT: Here is a citation from ResearchGate. The title is "Is accessibility conformance an elusive property? A study of validity and reliability of WCAG 2.0," and the author's last name is Brajnik (not Brajnick, so no c).

@bruce-usab (Contributor)

I am of the opinion that just deleting the last bullet (exactly the bit that David excerpted) is a reasonable fix for now.

My pull request also corrects the spelling of the author's last name, adds the article title, and provides a link. I don't think this article needs to be added to the References section at this time.

@bruce-usab (Contributor)

I ran the article abstract by my colleague @kengdoj and thought I would share her observations:

  • The research that yielded this figure evaluated testers using different methodologies, which we know is a source of varying test results. Not surprising, but it further supports the work of the ACT and ICT Baseline.
  • It would be interesting if the researchers had administered their test pages to testers all following the same methodology. I bet those scores would be much higher. I would hope that Trusted Testers (TTs) would be around 90%, but 80%+ should be easy assuming the TTs have been testing consistently.

@sajkaj changed the title from "The claim that 80% agreement is not attainable seems inaccurate and editorial" to "Hi: Thanks all for this most helpful set of comments. Unfortunately, I'm loathe to accept the PR at #629 as is for the simple reason that the relevant section of the Challenges document being discussed in this issue is a straight copy and paste from [Silver Problem Statements](https://www.w3.org/WAI/GL/task-forces/silver/wiki/Problem_Statements#Conformance_Model). Yes, it seems there was some data loss event that killed all the hyperlinks in the original, as well as in the copy submitted into the Challenges doc." Mar 3, 2021
@sajkaj self-assigned this Mar 3, 2021
@jspellman (Contributor)

I think we need a different solution, as I have spent several hours searching old archives to see if I can find the original paper. We had a data loss when the Google Drive structure belonging to a no-longer-active member was deleted. The data still exists (so I am told), but I can't find it. As a side note, we are encouraging W3C to find a Google Drive solution, because Drive is accessible to some people with disabilities and we expect to keep using it in the future.

I have been thinking about possibilities for addressing the problem. First, the paper with the 80% figure is dated. It would be helpful to find the date, but I remember it as being associated with the release of WCAG 2.0, so I suspect it is from the 2008-2012 time frame. If there is more recent research with a different percentage, then I would recommend using it; I don't think the Silver Task Force would object to using updated research. Otherwise, use the 80% figure with a note that the research is associated with the release of WCAG 2.0.

Members of the Silver Task Force (myself included) have been loath to see the Silver Problem Statements submerged in the Challenges document because they were the result of research with academic and corporate researchers. However, I would like to propose a way forward. I would be amenable to paraphrasing the Silver research results as long as there are frequent references to the Silver Problem Statements.

The Silver research was broader in scope than the Challenges, because the Silver research addressed a wider population than large organizations. I still do not want to see the Challenges document used to justify changes to the WCAG3 Requirements or to WCAG3 itself. The Challenges document is the opinion of a relatively small (but influential) group of people and should not be considered of greater importance than the research.

A paragraph in the Introduction could explain that. I am open to further discussion and ideas of a way forward. I would also like to hear from @slauriat on this issue. I have flagged it as a topic for a Silver leadership discussion.

@bruce-usab (Contributor) commented Mar 8, 2021

@jspellman I am pretty sure I linked to the article in question in my first reply in this issue thread. Here is that URL:
https://www.researchgate.net/publication/235339930_Is_accessibility_conformance_an_elusive_property_A_study_of_validity_and_reliability_of_WCAG_20

March 2012 is the date. I tried to get the article text directly via ResearchGate but they have not approved my request (even after I ticked the boxes for reconsideration). I choose to believe that it is an automaton making that choice!

@sajkaj - I think you may have renamed this issue with what was perhaps supposed to be a comment. I cannot quite tell what is going on.

@johnfoliot commented Mar 8, 2021 via email

@bruce-usab (Contributor) commented Mar 8, 2021

@johnfoliot et al., the problematic bullet @DavidMacDonald cites in this issue is a direct excerpt from the abstract you pasted in: "This means that an 80% target for agreement, when audits are conducted without communication between evaluators, is not attainable, even with experienced evaluators." See Conformance Challenges — Themes from Research.

From the abstract, it does seem to be true that these researchers came to that conclusion.

It is not, however, a factually correct statement, and IMHO it does not belong in the Challenges document. Moreover, the formatting ascribes more authority than the bullet warrants. Maybe it is just me, but before digging out that citation, I didn't realize the bullet was a quotation. After reading the abstract, I would argue that characterizing that bullet as a theme from research overstates what is really just one data point. It is an assertion from one study, where the authors' own abstract provides evidence of a flawed methodology.

@johnfoliot commented Mar 8, 2021 via email

@bruce-usab (Contributor) commented Mar 8, 2021

@johnfoliot GitHub put my email address in plain text, so I edited your comment. (Not that my email is hard to find, but who needs the extra spam?) FWIW, I don't seem to have your current email, or I would have asked you to edit the comment yourself. Also, I find it surprising that I could edit your comment!

Correct, I am saying that the conclusion is not fact.

It may be a fact that the paper's authors reached such a conclusion, but I regard that as irrelevant to the question of whether the AG WG should include this particular bullet in the Challenges document. And without reading the full article, I am not confident that the authors actually reached this conclusion; the phrasing used in the abstract is somewhat ambiguous.

FWIW, I agree that inter-rater reliability is not what we want it to be. But I strongly disagree that an 80% target for inter-rater agreement is not attainable; that assertion is simply not credible. Assertions which are not credible should not be repeated verbatim in an AG WG document, or at least not without lots of context and/or caveats.
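
For concreteness, here is a minimal sketch of what an 80% agreement target means in practice: two independent evaluators record pass/fail per success criterion for the same pages, and agreement is the share of criteria on which their verdicts match. The evaluator data below is made up purely for illustration and is not taken from the Brajnik et al. study.

```python
# Hypothetical per-success-criterion verdicts from two independent evaluators
# auditing the same page (illustrative data only, not from Brajnik et al.).
from typing import Dict

def percent_agreement(a: Dict[str, str], b: Dict[str, str]) -> float:
    """Share of success criteria on which both evaluators gave the same verdict."""
    shared = a.keys() & b.keys()
    matches = sum(1 for sc in shared if a[sc] == b[sc])
    return matches / len(shared)

evaluator_1 = {"1.1.1": "fail", "1.3.1": "fail", "1.4.3": "pass",
               "2.1.1": "pass", "2.4.4": "fail", "4.1.2": "fail"}
evaluator_2 = {"1.1.1": "fail", "1.3.1": "fail", "1.4.3": "pass",
               "2.1.1": "pass", "2.4.4": "pass", "4.1.2": "fail"}

print(f"agreement: {percent_agreement(evaluator_1, evaluator_2):.0%}")  # agreement: 83%
```

Read this way, an 80% target simply means matching verdicts on at least four out of every five criteria, which is consistent with the kind of independent counter-example David describes at the top of this thread.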

@johnfoliot commented Mar 8, 2021 via email

@bruce-usab (Contributor)

@johnfoliot, sticking to the very narrow issue raised by @DavidMacDonald at the start of this thread (which I now find @sajkaj has accidentally overwritten), we can recognize the real concern raised by these researchers without this particular quotation from the abstract. Further, I would argue that including the quote (because it is so easily debunked) is counter-productive to the important lesson that inter-rater reliability needs to be improved.

@michael-n-cooper changed the title from the accidental comment text (quoted above) to "Validity of independent review (retitled)" Mar 9, 2021
@sajkaj (Contributor) commented Mar 9, 2021

My apologies to everyone, and especially to @DavidMacDonald, for over-writing the head of this issue. David's original text has now been restored. It had been my intent to comment, but I misused the hub command.

@sajkaj closed this as completed Mar 9, 2021
@alastc reopened this Mar 9, 2021
@sajkaj (Contributor) commented Mar 9, 2021

The section of the Challenges document being discussed in this issue is a straight copy and paste from [Silver Problem Statements](https://www.w3.org/WAI/GL/task-forces/silver/wiki/Problem_Statements#Conformance_Model). As @jspellman notes above, there was a data loss event that resulted in the loss of all the hyperlinks in the original, as well as in the copy submitted into the Challenges doc.

So, I am gratefully accepting the citation on behalf of the Challenges doc, and I'm sure a PR against the original would also be welcome. If you can help with additional citations missing from Challenges (and from the upstream doc), I'm confident we'd all appreciate having those.

I am, however, leaving the conclusion drawn from the Silver research for further discussion in Silver and AGWG. I don't feel it's appropriate for me, as document Editor, to make that substantive change on my own.

Meanwhile, please note that the current Editor's Draft for Challenges has moved Section 5 to [Appendix C in the latest Challenges draft](https://raw.githack.com/w3c/wcag/conformance-challenges-5aside/conformance-challenges/index.html#silver-research-problem-statements). Please now create a PR against that draft.

@sajkaj closed this as completed Mar 9, 2021
@bruce-usab reopened this Mar 9, 2021
@bruce-usab (Contributor)

Reopening because the pull request did not actually address the issue raised by @DavidMacDonald. In my opinion, this is something the AG WG would appreciate having called to their attention.

@alastc (Contributor) commented Mar 9, 2021

Hi @sajkaj, as the Silver problem statements are not an official (draft) Note, there is a higher bar here. This issue should remain open until the original point is addressed, or it is brought to the group and the group agrees not to address it.

@sajkaj (Contributor) commented Mar 9, 2021

Agreed. My bad, yet again in this issue. I meant to hit "Comment," not "Comment and close," but I was in too much of a hurry to post before being late to the Silver call. I agree the underlying question remains unresolved, even if the citation is now available.
