BERT simple true / false comparison scoring seems wrong #177

Open
james-deee opened this issue Feb 1, 2024 · 0 comments
I am making a very basic call comparing simple "true" / "false" response variations, and I am getting a very odd result that I don't quite understand.

Here's the code snippet that makes the BERTScore call:

    from typing import Tuple

    from bert_score import score
    from torch import Tensor

    # score() returns a (precision, recall, F1) tuple of tensors.
    bert_score: Tuple[Tensor, Tensor, Tensor] = score(
        ['false.'],
        ['false'],
        model_type='microsoft/deberta-xlarge-mnli',
        lang="en",
    )
    print(f'Score: {bert_score}')

The result for this call is:

Score: (tensor([0.7015]), tensor([0.7298]), tensor([0.7153]))

But here's the odd part: if I make the call with a candidate of ['True'] instead, it actually scores higher:

    # Same call, but comparing the candidate 'True' against the reference 'false'.
    bert_score: Tuple[Tensor, Tensor, Tensor] = score(
        ['True'],
        ['false'],
        model_type='microsoft/deberta-xlarge-mnli',
        lang="en",
    )
    print(f'Score: {bert_score}')

The result for this call is:

Score: (tensor([0.7599]), tensor([0.7599]), tensor([0.7599]))

This just seems flat-out wrong to me, and I'm wondering if someone can give me insight into what is happening. Thanks.
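
For anyone trying to reproduce this, here is a minimal self-contained sketch (my own script, assuming only the bert_score package; it batches both candidates against the same reference so the F1 values can be compared side by side):

    from bert_score import score

    # Score both candidates against the same reference in one batched call.
    candidates = ['false.', 'True']
    references = ['false', 'false']

    P, R, F1 = score(
        candidates,
        references,
        model_type='microsoft/deberta-xlarge-mnli',
        lang="en",
    )

    for cand, f1 in zip(candidates, F1.tolist()):
        print(f'candidate={cand!r}  F1={f1:.4f}')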
