… in some cases. r=Bert

In the inaccuracy reftest mode, the overall test succeeds as long as
at least one of the multiple test images mismatches the reference
image.

However, the previous implementation printed an unexpected result
whenever any individual comparison was equal, even though the overall
test result was reported correctly.

This patch changes wrench so that it reports an unexpected failure
only for the overall test, not for each individual reference.
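
The intended behaviour can be illustrated with a minimal sketch. This
is not the wrench code: `Comparison`, `Outcome`, and
`evaluate_inaccurate` are hypothetical names introduced only for this
example, and it assumes that a test with no comparisons counts as a
failure.

```rust
/// Result of comparing one test image against the reference image.
/// Hypothetical type for illustration; not taken from wrench.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Comparison {
    Equal,
    NotEqual,
}

/// Overall outcome of one inaccuracy-mode test (hypothetical type).
#[derive(Debug, PartialEq)]
struct Outcome {
    /// True if at least one test image mismatched the reference.
    passed: bool,
    /// Number of "unexpected" reports that would be printed.
    unexpected_reports: usize,
}

/// Evaluates an inaccuracy-mode test from its individual comparisons.
/// An equal comparison is not reported on its own; a single unexpected
/// report is produced only when the overall test fails.
fn evaluate_inaccurate(comparisons: &[Comparison]) -> Outcome {
    // The test passes as soon as any comparison is a mismatch.
    // Assumption: an empty slice yields `false`, i.e. a failure.
    let passed = comparisons.iter().any(|c| *c == Comparison::NotEqual);
    Outcome {
        passed,
        unexpected_reports: if passed { 0 } else { 1 },
    }
}
```

With one equal and one mismatching comparison, the sketch passes and
reports nothing; with only equal comparisons, it fails and reports
once.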

Differential Revision: https://phabricator.services.mozilla.com/D60564

[ghsync] From https://hg.mozilla.org/mozilla-central/rev/3d076ae68a9411e4f412c90d2743987f7b916850