Under what conditions are machine oracles preferable to human oracles? #39

anupamck · 2026-06-24T05:51:59Z

anupamck
Jun 24, 2026
Maintainer

In Testing, we rightly assign a special place to human judgement. Human judgement should play a central role in qualifying software for human use.

Yet, certain quality assurance tasks are much better suited for machines, like spell-check or linting.

In Testing, an oracle is the means by which we identify whether there is a problem. The intuition by which a human user detects that the formatting on a site is broken and the algorithm by which spell-check software detects misspelt words and draws a red squiggly line under them are both oracles.

The aim of this enquiry is to step back and understand the conditions under which machines outperform humans as oracles. By exploring this question more deeply, I hope we can extrapolate our findings to GenAI and determine which kinds of quality assurance tasks we can delegate to GenAI-based systems, and where we humans will retain our edge.

So here is my concrete ask:

Can you think of other examples where machine oracles outperform humans?
Can you think of such examples where GenAI systems are involved?
What characterises the conditions under which these machines outperform human judgement?

PS: This enquiry was motivated by reflecting on the work of behavioural scientists like Daniel Kahneman, who have explored the fallibility of human judgement and the conditions under which algorithms produce superior results compared to human judgement.

anupamck · 2026-06-24T06:15:05Z

anupamck
Jun 24, 2026
Maintainer Author

At the recent EuroSTAR conference, I initiated a round-table discussion on this topic and this led to some interesting findings.

Programs like spell-check often involve performing a long list of rule-based checks.
We humans often find the act of verifying these rules repetitive and cumbersome.
The rules need to be applied consistently every time.
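
To make the rule-based case concrete, here is a minimal Python sketch of a machine oracle that runs a fixed list of checks over a piece of text. It is only an illustration: the word list, the two rules, and the names `Finding`, `RULES` and `machine_oracle` are all hypothetical, and a real spell-checker is far more elaborate.

```python
import re
from typing import Callable, List, NamedTuple


class Finding(NamedTuple):
    rule: str    # which rule raised the problem
    detail: str  # the offending word


# Hypothetical word list standing in for a real dictionary.
KNOWN_WORDS = {"the", "oracle", "checks", "every", "rule", "time"}


def _words(text: str) -> List[str]:
    """Lower-case the text and split it into words."""
    return re.findall(r"[a-z']+", text.lower())


def misspelt_words(text: str) -> List[Finding]:
    """Flag every word that is not in the known word list."""
    return [Finding("spelling", w) for w in _words(text) if w not in KNOWN_WORDS]


def doubled_words(text: str) -> List[Finding]:
    """Flag a word that is immediately repeated, e.g. 'the the'."""
    words = _words(text)
    return [Finding("doubled-word", b) for a, b in zip(words, words[1:]) if a == b]


# The oracle is just the full rule list, applied in the same order every time.
RULES: List[Callable[[str], List[Finding]]] = [misspelt_words, doubled_words]


def machine_oracle(text: str) -> List[Finding]:
    """Run every rule and collect all findings; an empty list means no problem seen."""
    findings: List[Finding] = []
    for rule in RULES:
        findings.extend(rule(text))
    return findings
```

Calling `machine_oracle("The the oracle chekcs every rule")` reports one spelling finding and one doubled-word finding. The point is not the rules themselves but that the machine applies all of them, identically, on every run, which is exactly the part humans find tedious.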

On the topic of AI systems, we discussed their use in anomaly detection, and three examples came up. The first is software in cars that detects whether drivers are distracted by picking up signs such as their eyes wandering off the road or the vehicle drifting out of its lane; the software then alerts the driver to take a break. The second is the use of AI software on platforms in the Swedish subway to detect and prevent suicides by detecting anomalous behaviour and alerting station personnel to intervene. The third is the successful use of AI in financial fraud detection.

In all these cases, a further set of characteristics surfaced:

The inputs often consisted of unstructured information that an AI system clustered into structured patterns. In certain situations, machine learning systems are much more adept than human beings at identifying and flagging such patterns.
The cost of a false positive is quite low, while a true positive can actually end up saving lives. A false positive, such as the AI system wrongly flagging a person on the platform as being at risk of suicide, is relatively harmless considering the life-saving upside of such a system.
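
One way to reason about this cost asymmetry is a simple expected-cost rule: raise an alert whenever the expected cost of staying silent exceeds the expected cost of a false alarm. The Python sketch below is a framing of my own under that assumption, not something the round-table worked out, and the function names and cost figures are purely illustrative.

```python
def alert_threshold(cost_false_positive: float, cost_false_negative: float) -> float:
    """Probability above which alerting has lower expected cost than staying silent.

    Alerting is worthwhile when p * cost_false_negative > (1 - p) * cost_false_positive,
    which rearranges to p > cost_false_positive / (cost_false_positive + cost_false_negative).
    """
    return cost_false_positive / (cost_false_positive + cost_false_negative)


def should_alert(p_problem: float, cost_false_positive: float,
                 cost_false_negative: float) -> bool:
    """Decide whether to alert, given the estimated probability of a real problem."""
    return p_problem > alert_threshold(cost_false_positive, cost_false_negative)


# Illustrative, made-up costs: a needless intervention costs 1 unit,
# a missed incident costs 999 units.
cheap_alarm = should_alert(0.01, cost_false_positive=1, cost_false_negative=999)

# Same 1% suspicion, but a false alarm is now as costly as a miss.
costly_alarm = should_alert(0.01, cost_false_positive=1, cost_false_negative=1)
```

With the first set of costs the threshold is 0.001, so even a 1% suspicion justifies an alert; with equal costs the threshold rises to 0.5 and the same suspicion does not. This suggests that a machine oracle can afford to be noisy when false alarms are cheap and misses are severe.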

Does reading this trigger any ideas of similar examples? Or some other interesting observations here?

Credits: Sanne Visser, Tara Walton, @sharovatov, @maryiat (and a few others whose names I am blanking on) participated in this discussion.
