# Error analysis

This notebook contains the source code for the error analysis described in the project's report. The wrong answers of each model are taken from the SQuAD v1.1 dev set.

## Imports

To import the project's source files, we first add the `src` folder to Python's module search path (`sys.path`)$\dots$

In [1]:
import sys

sys.path.insert(0, "src")

Then, we can import packages as usual$\dots$

In [2]:
import os
import json

import numpy as np
from transformers.trainer_utils import set_seed

import config

%load_ext autoreload
%autoreload 2

## Initialization

### PyTorch and NumPy

Set the random seed to a fixed number for reproducible results$\dots$

In [8]:
set_seed(config.RANDOM_SEED)

## Wrong answers loading

Load the errors made on the SQuAD v1.1 dev set by each model$\dots$

In [9]:
with open('results/wrong/baseline.json') as f:
    baseline_errors = json.load(f)
with open('results/wrong/bidaf.json') as f:
    bidaf_errors = json.load(f)
with open('results/wrong/bert.json') as f:
    bert_errors = json.load(f)
with open('results/wrong/distilbert.json') as f:
    distilbert_errors = json.load(f)
with open('results/wrong/electra.json') as f:
    electra_errors = json.load(f)
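
The five near-identical `open` / `json.load` blocks above could also be written as a loop. The following is a minimal sketch, not the notebook's own code: the helper `load_errors` and the constant `MODEL_NAMES` are hypothetical names introduced here, and the sketch assumes every model's errors live in a `<name>.json` file inside a single folder.

```python
import json
from pathlib import Path

# Hypothetical constant: file stems of the per-model error files.
MODEL_NAMES = ["baseline", "bidaf", "bert", "distilbert", "electra"]


def load_errors(directory, names=MODEL_NAMES):
    """Return {model name: parsed JSON} for every `<name>.json` in `directory`."""
    errors = {}
    for name in names:
        with open(Path(directory) / f"{name}.json", encoding="utf-8") as f:
            errors[name] = json.load(f)
    return errors
```

With this helper, `errors = load_errors("results/wrong")` would give access to, for example, `errors["bidaf"]` in place of `bidaf_errors`.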

Observe how many errors each model makes$\dots$

In [10]:
print(f"The Baseline model makes {len(baseline_errors)} errors")
print(f"The BiDAF model makes {len(bidaf_errors)} errors")
print(f"The BERT model makes {len(bert_errors)} errors")
print(f"The DistilBERT model makes {len(distilbert_errors)} errors")
print(f"The ELECTRA model makes {len(electra_errors)} errors")

The Baseline model makes 7700 errors
The BiDAF model makes 4222 errors
The BERT model makes 2731 errors
The DistilBERT model makes 2789 errors
The ELECTRA model makes 2062 errors
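
Raw counts are easier to compare as error rates. The sketch below rests on two assumptions that the notebook does not state: that the SQuAD v1.1 dev set contains 10,570 questions, and that each entry in an error file corresponds to exactly one dev question. The counts are copied from the output above.

```python
# Error counts copied from the cell output above.
error_counts = {
    "Baseline": 7700,
    "BiDAF": 4222,
    "BERT": 2731,
    "DistilBERT": 2789,
    "ELECTRA": 2062,
}

# Assumption: size of the SQuAD v1.1 dev set, one error entry per question.
DEV_SET_SIZE = 10_570

# Fraction of dev questions each model answers incorrectly.
error_rates = {model: count / DEV_SET_SIZE for model, count in error_counts.items()}

for model, rate in error_rates.items():
    print(f"{model}: {rate:.1%} of the dev questions answered incorrectly")
```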


## Wrong answers analysis

Compute the errors common to all the models, and the errors common to the best models, i.e. BiDAF and ELECTRA$\dots$

In [11]:
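# Sort the shared question ids: set iteration order for strings can change
# between Python sessions, which would make the seeded sampling below irreproducible.
all_common_errors = sorted(
    set(electra_errors.keys())
    & set(bidaf_errors.keys())
    & set(baseline_errors.keys())
    & set(bert_errors.keys())
    & set(distilbert_errors.keys())
)
best_common_errors = sorted(set(electra_errors.keys()) & set(bidaf_errors.keys()))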

In [12]:
print(f"The number of common errors between all the models is {len(all_common_errors)}")
print(f"The number of common errors between BiDAF and ELECTRA is {len(best_common_errors)}")

The number of common errors between all the models is 940
The number of common errors between BiDAF and ELECTRA is 1535
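
To make the intersection step concrete, here is a toy sketch with made-up question ids (`q1` to `q5`) and empty records. None of these values come from the actual result files. It relies on the fact that dictionary key views support set operators such as `&` and `-` directly.

```python
# Toy error dictionaries keyed by question id (hypothetical data).
model_a_errors = {"q1": {}, "q2": {}, "q3": {}}
model_b_errors = {"q2": {}, "q3": {}, "q4": {}}
model_c_errors = {"q3": {}, "q5": {}}

# Questions that both A and B get wrong.
shared_ab = model_a_errors.keys() & model_b_errors.keys()

# Questions that every model gets wrong.
shared_all = set.intersection(
    *(set(errors) for errors in (model_a_errors, model_b_errors, model_c_errors))
)

# Questions that A gets wrong but B answers correctly.
only_a = model_a_errors.keys() - model_b_errors.keys()

print(sorted(shared_ab), sorted(shared_all), sorted(only_a))
```

The same set difference could be used on the real dictionaries to list, for instance, the questions that one model misses and another one answers correctly.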


Take a random subset of errors (among the common ones between BiDAF and ELECTRA)$\dots$

In [13]:
random_best_common_errors = np.random.choice(best_common_errors, 50, replace=False)
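
As an alternative to the global NumPy random state, the sample could be drawn from a local generator. This is only a sketch under stated assumptions: `sample_ids` is a hypothetical helper, not part of the project, and it sorts the ids first so that the same seed always selects the same questions. It also caps the sample size so that it does not fail when fewer ids than requested are available.

```python
import numpy as np


def sample_ids(ids, k, seed):
    """Draw up to `k` distinct ids, deterministically for a given seed."""
    ordered = sorted(ids)              # fixed order, independent of set hashing
    rng = np.random.default_rng(seed)  # local generator, no global state
    k = min(k, len(ordered))           # never ask for more ids than exist
    picked = rng.choice(ordered, size=k, replace=False)
    return [str(i) for i in picked]    # plain Python strings as dictionary keys
```

For example, `sample_ids(best_common_errors, 50, config.RANDOM_SEED)` would play the role of the `np.random.choice` call above, although it would not select the same questions.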

Show the selected errors$\dots$

In [14]:
for e in random_best_common_errors:
    context = electra_errors[e]["context"]
    question = electra_errors[e]["question"]
    answers = electra_errors[e]["answers"]
    electra_pred = electra_errors[e]["prediction"]
    bidaf_pred = bidaf_errors[e]["prediction"]
    print(f"Context: {context}")
    print(f"Question: {question}")
    print(f"Answers: {answers}")
    print(f"Predictions: [ELECTRA] {electra_pred} [BiDAF] {bidaf_pred}")
    print()

Context: European Union law is applied by the courts of member states and the Court of Justice of the European Union. Where the laws of member states provide for lesser rights European Union law can be enforced by the courts of member states. In case of European Union law which should have been transposed into the laws of member states, such as Directives, the European Commission can take proceedings against the member state under the Treaty on the Functioning of the European Union. The European Court of Justice is the highest court able to interpret European Union law. Supplementary sources of European Union law include case law by the Court of Justice, international law and general principles of European Union law.
Question: What is one supplementary source of European Union law?
Answers: ['international law', 'international law', 'international law', 'international law']
Predictions: [ELECTRA] general principles of european union law [BiDAF] case law by court of justice internationa