Should it be possible for multiple testers to assign verdicts to a single set of AT responses? #937

Open
jugglinmike opened this issue Apr 27, 2023 · 2 comments

Comments

@jugglinmike
Contributor

jugglinmike commented Apr 27, 2023

Context: explanation of the term "verdict"

Currently, the process of "running a test" involves two steps:

  1. collecting AT responses, and
  2. interpreting whether the collected AT responses satisfy the given assertions

The result is a series of "verdicts" (we'll be adding a definition along these lines to the glossary soon).

Historically, we haven't discussed these as separate operations because we've expected every tester to perform both. It will soon be necessary to consider them as distinct steps because we will support automated AT response collection, but we will not support automated verdict assignment.

Thinking forward to a world where we have a robust system for collecting AT responses, we might trust the data reported by that system as the one-and-only source of AT responses. Then, we would still likely want more than one person to assign verdicts.

Even before we have an automated system, though, this capability may be desirable. Will we ever want different levels of corroboration for the two steps in running a test? For example, would we ever want to require two people to collect equivalent AT responses while requiring three people to assign equivalent verdicts?
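
To make the question concrete, here is a minimal Python sketch of separate corroboration levels for the two steps. Everything in it is an assumption for illustration: the names `Thresholds`, `largest_agreement`, and `is_corroborated` are hypothetical and not part of the app, and "equivalent" is modeled as exact equality, which a real implementation would probably need to relax.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical settings; the defaults mirror the example in the question.
    response_collectors: int = 2
    verdict_assigners: int = 3


def largest_agreement(values):
    """Return the size of the largest group of identical values (0 if empty)."""
    counts = Counter(values)
    return max(counts.values(), default=0)


def is_corroborated(responses, verdicts, thresholds=Thresholds()):
    """Check each step of running a test against its own threshold.

    responses: one AT response string per person who collected responses
    verdicts:  one tuple of per-assertion verdicts per person who assigned them
    """
    return (
        largest_agreement(responses) >= thresholds.response_collectors
        and largest_agreement(verdicts) >= thresholds.verdict_assigners
    )
```

With these defaults, two matching responses satisfy the first step, but the test only counts as corroborated once three people have assigned identical verdicts to those responses.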

@mcking65
Contributor

> Thinking forward to a world where we have a robust system for collecting AT responses, we might trust the data reported by that system as the one-and-only source of AT responses.

That will be lovely, although I imagine it will take some time and experience to get there. First, we need to have people verifying that the system-recorded response is accurate. I assume we will eventually get to a high level of trust as the automated systems mature.

> Then, we would still likely want more than one person to assign verdicts.

Definitely

> Even before we have an automated system, though, this capability may be desirable. Will we ever want different levels of corroboration for the two steps in running a test? For example, would we ever want to require two people to collect equivalent AT responses while requiring three people to assign equivalent verdicts?

This could be extremely useful! I can imagine taking advantage of app support for a scenario like:

  1. Assign two people to run a plan
  2. The responses are all good, but we want a broader set of views on how to interpret some of the tests.
  3. Assign additional people to go through the plan and only assign verdicts to the previously collected responses.

It would be a lot easier to get the additional people for step 3 if those people didn't have to record responses.
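
The scenario above could be modeled roughly as follows. This is only a sketch under stated assumptions: `CollectedResponse` and its methods are hypothetical names, not the app's data model, and verdicts are plain strings. The point it illustrates is that a response is recorded once, while any number of reviewers can attach verdicts to it afterwards.

```python
from dataclasses import dataclass, field


@dataclass
class CollectedResponse:
    """One AT response, recorded once (hypothetical model, not the app's schema)."""

    test_id: str
    collected_by: str
    output: str
    # Verdicts keyed by reviewer name, then by assertion text.
    verdicts: dict = field(default_factory=dict)

    def assign_verdicts(self, reviewer, verdicts_by_assertion):
        """Record one reviewer's verdicts without modifying the response itself."""
        self.verdicts[reviewer] = dict(verdicts_by_assertion)

    def disputed_assertions(self):
        """Return the assertions on which reviewers disagree, in sorted order."""
        seen = {}
        for per_assertion in self.verdicts.values():
            for assertion, verdict in per_assertion.items():
                seen.setdefault(assertion, set()).add(verdict)
        return sorted(a for a, found in seen.items() if len(found) > 1)


# Steps 1 and 2: a tester collects the response and assigns verdicts.
response = CollectedResponse("test-1", "tester-a", "button, Save")
response.assign_verdicts("tester-a", {"role is conveyed": "pass", "name is conveyed": "pass"})

# Step 3: an additional reviewer assigns verdicts only, reusing the response.
response.assign_verdicts("reviewer-c", {"role is conveyed": "pass", "name is conveyed": "fail"})
```

Here `response.disputed_assertions()` would flag `"name is conveyed"` as the assertion needing discussion, and the additional reviewer never had to run the AT.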

We have manual ways of working around this now. In the near term, it is more important that we have a way to see the report from a draft plan without publishing it to the reports page or candidate review page. If we had that, a third reviewer could just review the report and raise issues where there are concerns. This use case comes up in the working mode scenario analysis.

@jugglinmike
Contributor Author

Thanks, @mcking65! There are two places in the Working Mode where Testers are assigned to Test Plans, so I'm wondering where that scenario applies. Do you envision it occurring before a "Draft" Test Plan advances to the "Candidate" phase? Is it something that might happen when reporting on "Recommended" Test Plans?
