
Should one tester be able to assign verdicts to AT responses collected by another? #936

Open
jugglinmike opened this issue Apr 27, 2023 · 3 comments


@jugglinmike (Contributor) commented Apr 27, 2023

Context: explanation of the term "verdict"

Currently, the process of "running a test" involves two steps:

  1. collecting AT responses, and
  2. interpreting whether the collected AT responses satisfy the given assertions

The result is a series of "verdicts" (we'll be adding a definition along these lines to the glossary soon).
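For concreteness, the two steps could be modeled as separate records, roughly like this minimal sketch (hypothetical names only, not the actual ARIA-AT App schema):

```typescript
// Hypothetical types for illustration only; not the actual ARIA-AT App schema.

type Verdict = 'pass' | 'fail';

// Step 1: what the AT actually said in response to a given command.
interface CollectedAtResponse {
  testId: string;
  commandId: string;
  atResponse: string;
  collectedBy: string; // a human tester or, eventually, the automated system
}

// Step 2: a judgment of whether that response satisfies a single assertion.
interface AssertionVerdict {
  testId: string;
  commandId: string;
  assertionId: string;
  verdict: Verdict;
  assignedBy: string; // the open question: may this differ from collectedBy?
}
```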

Historically, we haven't discussed these as separate operations because we've expected every tester to perform both. It will soon be necessary to consider them as distinct steps because we will support automated AT response collection, but we will not support automated verdict assignment.

We know that a human tester will one day need to assign verdicts to AT responses collected by the automated system. However, we have not discussed whether a human tester should be able to assign verdicts to AT responses collected by other human testers. Even if we ignore automation for a moment, there are reasons to think this capability may be valuable. With it, we could:

  1. Reduce the effort required by individual contributors (by further segmenting and distributing the work)
  2. Recognize the difference in authority/experience required for each step (i.e. by reserving verdict assignment for contributors with more expertise)

An understanding of these use cases (plus any others) will help us refine both ARIA-AT App and the nascent automation system.

This brings me to the question I posed in the title of this issue: should one tester be able to assign verdicts to AT responses collected by another?

@jsanthoz commented

Yes, it should be allowed. Allowing one tester to assign verdicts to AT responses collected by another helps segment and distribute the work. First, it can reduce the effort required by individual contributors, since different testers can specialize in different parts of the testing process; it also allows more efficient use of resources and potentially faster turnaround times. Second, evaluating AT responses and assigning verdicts requires a certain level of expertise, and reserving verdict assignment for testers with more experience and specialized knowledge helps ensure that the results are accurate and reliable.

@mcking65 (Contributor) commented

> We know that a human tester will one day need to assign verdicts to AT responses collected by the automated system.

Yes. However, to avoid doing this more often than necessary, I believe we agreed that the initial implementation of the automated system will assign verdicts when the AT response is identical to a previously analyzed response to the same command in the same test.
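To make that reuse heuristic concrete, here is a minimal sketch in TypeScript; the names, the in-memory history, and the whitespace normalization are all assumptions for illustration, not the app's actual data model:

```typescript
// A sketch of the reuse heuristic, using hypothetical names; the real system
// may key or normalize responses differently.

type Verdict = 'pass' | 'fail';

interface AnalyzedResponse {
  testId: string;
  commandId: string;
  atResponse: string;
  verdicts: Record<string, Verdict>; // keyed by assertion id, assigned by a human
}

// Collapse trivial whitespace differences so they do not block reuse.
const normalize = (response: string): string =>
  response.trim().replace(/\s+/g, ' ');

const key = (testId: string, commandId: string, atResponse: string): string =>
  `${testId}::${commandId}::${normalize(atResponse)}`;

/**
 * If an identical AT response to the same command in the same test has
 * already been analyzed by a human, return those verdicts for reuse;
 * otherwise return null to signal that human review is required.
 */
function reuseVerdicts(
  history: AnalyzedResponse[],
  testId: string,
  commandId: string,
  atResponse: string
): Record<string, Verdict> | null {
  const match = history.find(
    (h) => key(h.testId, h.commandId, h.atResponse) === key(testId, commandId, atResponse)
  );
  return match ? match.verdicts : null;
}
```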

> However, we have not discussed whether a human tester should be able to assign verdicts to AT responses collected by other human testers.

Admins have this capability. They can open a test plan run as another user and modify that user's results.

> Even if we ignore automation for a moment, there are reasons to think this capability may be valuable. With it, we could:
>
> 1. Reduce the effort required by individual contributors (by further segmenting and distributing the work)

In practice, I have found it difficult to look at the content in the response field and assign verdicts without experiencing the test case itself. Perhaps that could change with practice. Regardless, if my job is to assign verdicts to previously collected responses, I think I would always do a better job if I were to open the test page, run the setup, and run the test for most of the commands.

That is to say, I think the primary value of this use case is that the AT responses are already recorded. Recording them is arduous and time-consuming.

So, the question I have is whether we would ever want people to be response collectors if we have systems that can collect responses. The primary role of people would be to verify the system-recorded response is accurate and to assign the verdicts.

> 2. Recognize the difference in authority/experience required for each step (i.e. by reserving verdict assignment for contributors with more expertise)

I don't anticipate much value from the use case of assigning some people to only collect responses. We already have a mentoring model built into the tool:

  • A mix of senior and new testers are assigned to run a plan
  • All record responses and assign verdicts
  • If results are different, we have identified the places where new people might have a different understanding

I think this part of the approach is working well.

> An understanding of these use cases (plus any others) will help us refine both ARIA-AT App and the nascent automation system.
>
> This brings me to the question I posed in the title of this issue: should one tester be able to assign verdicts to AT responses collected by another?

We have a form of this in the admin role. I don't see much value in extending the current UI to support the additional use case of having some CG members only collect responses.

@jugglinmike (Contributor, Author) commented

> > We know that a human tester will one day need to assign verdicts to AT responses collected by the automated system.
>
> Yes. However, to avoid doing this more often than necessary, I believe we agreed that the initial implementation of the automated system will assign verdicts when the AT response is identical to a previously analyzed response to the same command in the same test.

I do recall our discussion about verdict reuse, though I'm not sure if it needs to happen in the automation subsystem. Since reuse seems like an optimization which is orthogonal to the capability discussed here, I'll refrain from going into detail in this thread. (This will certainly come up as we continue to refine the design of the automation system, but if you'd like to continue the discussion right away, I'm happy to open a new issue.)

> In practice, I have found it difficult to look at the content in the response field and assign verdicts without experiencing the test case itself. Perhaps that could change with practice. Regardless, if my job is to assign verdicts to previously collected responses, I think I would always do a better job if I were to open the test page, run the setup, and run the test for most of the commands.

This reminds me of a discussion from a recent CG meeting. To the extent that "you had to be there" to explain the assignment of a verdict, that seems like a problem for the public's consumption of ARIA-AT's reports. If we could more concretely describe the kinds of situations for which the AT response data is insufficient, that might uncover improvements to ARIA-AT's processes/systems which ultimately promote transparency.

Do you think we could include an agenda item for the next CG meeting to ask the Testers present about their experience along these lines?

> > An understanding of these use cases (plus any others) will help us refine both ARIA-AT App and the nascent automation system. This brings me to the question I posed in the title of this issue: should one tester be able to assign verdicts to AT responses collected by another?
>
> We have a form of this in the admin role. I don't see much value in extending the current UI to support the additional use case of having some CG members only collect responses.

Thanks!
