<a href="https://colab.research.google.com/github/LCaravaggio/NLP/blob/main/notebooks/08a-BERTClfFineTuning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

We will predict the sentiment of movie reviews using two approaches.

1. **BERT + fine-tuning**: we fine-tune BERT plus a linear classification layer on the reviews dataset.
2. **"Pre-fine-tuned" BERT**: we use a model (BERT + classifier) that was previously trained for sentiment analysis, without training it on our data.

-----------------------

Task: understand all the code and answer wherever it says **QUESTION**.

### Environment setup


In [None]:
!pip install -qU transformers accelerate datasets watermark

In [None]:
%reload_ext watermark

In [None]:
%watermark -vmp transformers,datasets,torch,numpy,pandas,tqdm

Python implementation: CPython
Python version       : 3.11.12
IPython version      : 7.34.0

transformers: 4.51.3
datasets    : 3.5.1
torch       : 2.6.0+cu124
numpy       : 2.0.2
pandas      : 2.2.2
tqdm        : 4.67.1

Compiler    : GCC 11.4.0
OS          : Linux
Release     : 6.1.123+
Machine     : x86_64
Processor   : x86_64
CPU cores   : 2
Architecture: 64bit



To use a GPU, go to the top right, select "Change runtime type" and then "T4 GPU".

In [None]:
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

cuda


## Dataset

We load and explore the Rotten Tomatoes movie review dataset.

In [None]:
from datasets import load_dataset

dataset = load_dataset("rotten_tomatoes")



In [None]:
dataset

DatasetDict({
    train: Dataset({
        features: ['text', 'label'],
        num_rows: 8530
    })
    validation: Dataset({
        features: ['text', 'label'],
        num_rows: 1066
    })
    test: Dataset({
        features: ['text', 'label'],
        num_rows: 1066
    })
})

In [None]:
import pandas as pd
import numpy as np
import datasets
from IPython.display import display, HTML

def show_random_elements(dataset, num_examples=10):
    """Show num_examples random examples from the dataset.
    Indices are drawn with replacement, so repeats are possible.
    """
    indices = np.random.randint(0, len(dataset), num_examples)
    df = pd.DataFrame(dataset[indices])
    for column, typ in dataset.features.items():
        if isinstance(typ, datasets.ClassLabel):
            df[column] = df[column].transform(lambda i: typ.names[i])
    display(HTML(df.to_html()))

np.random.seed(33)
show_random_elements(dataset["train"], num_examples=6)

Unnamed: 0,text,label
0,"drives for the same kind of bittersweet , conciliatory tone that three seasons achieved but loses its way in rhetorical excess and blatant sentimentality .",neg
1,"until it goes off the rails in its final 10 or 15 minutes , wendigo , larry fessenden's spooky new thriller , is a refreshingly smart and newfangled variation on several themes derived from far less sophisticated and knowing horror films .",pos
2,"excessive , profane , packed with cartoonish violence and comic-strip characters .",neg
3,haynes has so fanatically fetishized every bizarre old-movie idiosyncrasy with such monastic devotion you're not sure if you should applaud or look into having him committed .,pos
4,"singer/composer bryan adams contributes a slew of songs  a few potential hits , a few more simply intrusive to the story  but the whole package certainly captures the intended , er , spirit of the piece .",pos
5,a refreshing korean film about five female high school friends who face an uphill battle when they try to take their relationships into deeper waters .,pos


In [None]:
print("Class distribution:")
for k in dataset.keys():
    print(k)
    print(pd.Series(dataset[k]["label"]).value_counts())
    print("-"*70)

Class distribution:
train
1    4265
0    4265
Name: count, dtype: int64
----------------------------------------------------------------------
validation
1    533
0    533
Name: count, dtype: int64
----------------------------------------------------------------------
test
1    533
0    533
Name: count, dtype: int64
----------------------------------------------------------------------


In [None]:
print("Document length (in words), deciles:")
for k in dataset.keys():
    print(k)
    largos = pd.Series(dataset[k]["text"]).str.split().apply(len)
    print(np.quantile(largos, q=np.arange(0, 1.1, .1)).astype(int))
    print("-"*70)

Document length (in words), deciles:
train
[ 1  9 12 15 18 20 23 26 29 34 59]
----------------------------------------------------------------------
validation
[ 1  8 12 16 18 21 23 26 29 34 54]
----------------------------------------------------------------------
test
[ 3  9 13 15 18 20 23 26 29 34 52]
----------------------------------------------------------------------


In [None]:
# This will come in handy later:
label_names = dataset["train"].features["label"].names
label2id = {name: dataset["train"].features["label"].str2int(name) for name in label_names}
id2label = {id: label for label, id in label2id.items()}

print(label_names)
print(id2label[0], id2label[1])
print(label2id["neg"] , label2id["pos"])

['neg', 'pos']
neg pos
0 1


## _Fine-tuning_ BERT

We will use BERT (here, the smaller `distilbert-base-uncased` checkpoint) to extract a vector representation of each sequence and train a linear classifier on top of it. We train the _entire_ architecture simultaneously on our data. Because we start from pre-trained weights, this is called **fine-tuning**.

We will use Hugging Face functions that automate many of the tasks we did manually when we used the model with static word2vec embeddings.

We start by loading the tokenizer and the pre-trained model from HF.

In [None]:
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_checkpoint = "distilbert-base-uncased"

tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
bert_model = AutoModelForSequenceClassification.from_pretrained(
    model_checkpoint, num_labels=2, id2label=id2label, label2id=label2id
)
bert_model = bert_model.to(device)

Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight', 'pre_classifier.bias', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


**QUESTION**: what does `.to(device)` do?

The first step is **tokenization**: converting each example into a sequence of tokens that the model can process. Specifically, each example ends up represented as a dictionary of the form `{'input_ids': ..., 'attention_mask': ..., 'label': ...}`.

In [None]:
def tokenize_fn(examples):
    """Tokenize **without padding**; padding is applied later, dynamically,
    on each training batch.
    """
    return tokenizer(
        examples["text"], truncation=True, max_length=tokenizer.model_max_length
    )

In [None]:
# Example:
subset_example = dataset["train"][:3]
tokenized_subset = tokenize_fn(subset_example)

for k, v in tokenized_subset.items():
    print(k)
    print(v)
    print(len(v))
    print("Length of each input:", [len(x) for x in v])
    print("-"*70)

input_ids
[[101, 1996, 2600, 2003, 16036, 2000, 2022, 1996, 7398, 2301, 1005, 1055, 2047, 1000, 16608, 1000, 1998, 2008, 2002, 1005, 1055, 2183, 2000, 2191, 1037, 17624, 2130, 3618, 2084, 7779, 29058, 8625, 13327, 1010, 3744, 1011, 18856, 19513, 3158, 5477, 4168, 2030, 7112, 16562, 2140, 1012, 102], [101, 1996, 9882, 2135, 9603, 13633, 1997, 1000, 1996, 2935, 1997, 1996, 7635, 1000, 11544, 2003, 2061, 4121, 2008, 1037, 5930, 1997, 2616, 3685, 23613, 6235, 2522, 1011, 3213, 1013, 2472, 2848, 4027, 1005, 1055, 4423, 4432, 1997, 1046, 1012, 1054, 1012, 1054, 1012, 23602, 1005, 1055, 2690, 1011, 3011, 1012, 102], [101, 4621, 2021, 2205, 1011, 8915, 23267, 16012, 24330, 102]]
3
Length of each input: [47, 52, 10]
----------------------------------------------------------------------
attention_mask
[[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,

In [None]:
tokenized_dataset = dataset.map(tokenize_fn)
tokenized_dataset.set_format("torch", columns=["input_ids", "attention_mask", "label"])

In [None]:
print(tokenized_dataset["train"][0])

{'label': tensor(1), 'input_ids': tensor([  101,  1996,  2600,  2003, 16036,  2000,  2022,  1996,  7398,  2301,
         1005,  1055,  2047,  1000, 16608,  1000,  1998,  2008,  2002,  1005,
         1055,  2183,  2000,  2191,  1037, 17624,  2130,  3618,  2084,  7779,
        29058,  8625, 13327,  1010,  3744,  1011, 18856, 19513,  3158,  5477,
         4168,  2030,  7112, 16562,  2140,  1012,   102]), 'attention_mask': tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])}


In [None]:
print(bert_model)

DistilBertForSequenceClassification(
  (distilbert): DistilBertModel(
    (embeddings): Embeddings(
      (word_embeddings): Embedding(30522, 768, padding_idx=0)
      (position_embeddings): Embedding(512, 768)
      (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (transformer): Transformer(
      (layer): ModuleList(
        (0-5): 6 x TransformerBlock(
          (attention): DistilBertSdpaAttention(
            (dropout): Dropout(p=0.1, inplace=False)
            (q_lin): Linear(in_features=768, out_features=768, bias=True)
            (k_lin): Linear(in_features=768, out_features=768, bias=True)
            (v_lin): Linear(in_features=768, out_features=768, bias=True)
            (out_lin): Linear(in_features=768, out_features=768, bias=True)
          )
          (sa_layer_norm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
          (ffn): FFN(
            (dropout): Dropout(p=0.1, inplace=False)


In [None]:
param_names = [name for name, _ in bert_model.named_parameters()]
print(len(param_names))
print(param_names[:5])
print(param_names[-5:])

104
['distilbert.embeddings.word_embeddings.weight', 'distilbert.embeddings.position_embeddings.weight', 'distilbert.embeddings.LayerNorm.weight', 'distilbert.embeddings.LayerNorm.bias', 'distilbert.transformer.layer.0.attention.q_lin.weight']
['distilbert.transformer.layer.5.output_layer_norm.bias', 'pre_classifier.weight', 'pre_classifier.bias', 'classifier.weight', 'classifier.bias']


We will fine-tune all of the model's weights.

Alternatively, we could train only the classification layers and the last N blocks of BERT, leaving the remaining layers _frozen_, by running this:



In [None]:
if False:
    # freeze all layers
    for param in bert_model.parameters():
        param.requires_grad = False
    # unfreeze the two layers of the classification head
    for param in bert_model.pre_classifier.parameters():
        param.requires_grad = True
    for param in bert_model.classifier.parameters():
        param.requires_grad = True
    # and the last N transformer blocks (here N = 2):
    for param in bert_model.distilbert.transformer.layer[-2:].parameters():
        param.requires_grad = True

**QUESTION**: what does it mean to "freeze a layer"?

We use an HF **data collator**, which groups the examples into batches and applies padding dynamically (that is, each batch is padded only up to the length of its longest example).

In [None]:
from transformers import DataCollatorWithPadding

data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
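
To make "dynamic padding" concrete, here is a minimal pure-Python sketch of the idea. It is not the actual `DataCollatorWithPadding` implementation: the helper name `pad_batch` is hypothetical, the token ids are made up, and `pad_token_id=0` is assumed because the padded positions shown in this notebook's outputs are filled with 0.

```python
def pad_batch(sequences, pad_token_id=0):
    """Pad every sequence to the length of the longest one in this batch."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(list(seq) + [pad_token_id] * n_pad)
        # 1 marks a real token, 0 marks padding that the model should ignore
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return {"input_ids": input_ids, "attention_mask": attention_mask}


# Toy batch with lengths 5, 3 and 2 (illustrative token ids)
batch = pad_batch([[101, 7, 8, 9, 102], [101, 7, 102], [101, 102]])
print(batch["input_ids"])
print(batch["attention_mask"])
```

Each batch is padded only to its own longest sequence (5 tokens here) rather than to a fixed global length, which saves computation when most reviews are short.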

We write a function to compute metrics during training.

In [None]:
from transformers import EvalPrediction
from torch import nn

loss_fn = nn.CrossEntropyLoss()

def compute_metrics(logits, labels):
    """Args:
        logits: array shape (batch_size, num_labels)
        labels: array shape (batch_size,)
    """
    # We use torch so we can reuse loss_fn, but we could also work on CPU with numpy
    if not isinstance(logits, torch.Tensor):
        logits = torch.tensor(logits)
    if not isinstance(labels, torch.Tensor):
        labels = torch.tensor(labels)
    predictions = torch.argmax(logits, dim=-1)
    accuracy = (predictions == labels).float().mean().item()
    cross_entropy = loss_fn(logits, labels).item()
    return {"accuracy": accuracy, "cross_entropy": cross_entropy}

def compute_metrics_for_hf(pred: EvalPrediction) -> dict:
    """EvalPrediction: tuple with two elements, predictions and label_ids.
    NOTE: Trainer puts everything the model returns into EvalPrediction.
    """
    logits = pred.predictions
    labels = pred.label_ids
    return compute_metrics(logits, labels)

To run the training, we use HF's `Trainer` class: it works as a _wrapper_ that handles the training and evaluation loop we previously wrote by hand.

In [None]:
n_epochs = 2
batch_size = 32
optimization_steps = int(np.ceil(len(tokenized_dataset["train"]) * n_epochs / batch_size))

print(f"N epochs: {n_epochs}")
print(f"Batch size: {batch_size}")
print(f"Optimization steps: {optimization_steps}")

N epochs: 2
Batch size: 32
Optimization steps: 534
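
As a sanity check on the step count, the same arithmetic can be spelled out with the standard library. This sketch assumes the last, smaller batch of each epoch is kept rather than dropped; the names `steps_per_epoch` and `total_steps` are illustrative.

```python
import math

n_train = 8530      # rows in the train split, as printed earlier
n_epochs = 2
batch_size = 32

# 8530 / 32 = 266.56..., so each epoch has 266 full batches plus one partial batch
steps_per_epoch = math.ceil(n_train / batch_size)
total_steps = math.ceil(n_train * n_epochs / batch_size)

print(steps_per_epoch)             # 267
print(total_steps)                 # 534
print(steps_per_epoch * n_epochs)  # 534
```

Both ways of counting agree here, so 534 optimization steps correspond to two full passes over the training split.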


In [None]:
from transformers import Trainer, TrainingArguments

args = TrainingArguments(
    "distilbert-ft-reviews",
    learning_rate=2e-5,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    max_steps=optimization_steps,
    weight_decay=0.01,
    eval_strategy="steps",
    logging_strategy="steps",
    eval_steps=50,
    logging_steps=50,
    load_best_model_at_end=True,
    metric_for_best_model="accuracy", # the name of the metric in compute_metrics()
    push_to_hub=False,
    seed=33,
    report_to="none",
)

trainer = Trainer(
    bert_model, args,
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["validation"],
    processing_class=tokenizer,  # replaces the deprecated `tokenizer=` argument
    data_collator=data_collator,
    compute_metrics=compute_metrics_for_hf,
)

In [None]:
trainer.train()

Step,Training Loss,Validation Loss,Accuracy,Cross Entropy
50,0.6017,0.467732,0.790807,0.467732
100,0.4312,0.391461,0.829268,0.391461
150,0.3963,0.385668,0.826454,0.385668
200,0.3777,0.377991,0.837711,0.377991
250,0.3877,0.362356,0.834897,0.362356
300,0.3287,0.375857,0.838649,0.375857
350,0.2737,0.367597,0.846154,0.367597
400,0.2924,0.363814,0.839587,0.363814
450,0.2576,0.363992,0.842402,0.363992
500,0.2585,0.364503,0.849906,0.364503


TrainOutput(global_step=534, training_loss=0.353304343277149, metrics={'train_runtime': 103.7809, 'train_samples_per_second': 164.655, 'train_steps_per_second': 5.145, 'total_flos': 232905111728376.0, 'train_loss': 0.353304343277149, 'epoch': 2.0})

In [None]:
# Evaluate on test:
test_results = trainer.evaluate(tokenized_dataset["test"])

In [None]:
print(test_results)

{'eval_loss': 0.39136093854904175, 'eval_accuracy': 0.8499062061309814, 'eval_cross_entropy': 0.39136090874671936, 'eval_runtime': 1.5881, 'eval_samples_per_second': 671.236, 'eval_steps_per_second': 21.409, 'epoch': 2.0}


## _Pre-fine-tuned_ BERT

Instead of fine-tuning BERT on the reviews dataset, we can use a BERT model that has already been fine-tuned for this task, even if on a different dataset.

This is known as "zero-shot" usage, since the model never sees our training data, and it is useful when we have no annotated data to train on. It is like using a model "in production".

In [None]:
from transformers import pipeline

sentiment_clf = pipeline(
    model="distilbert/distilbert-base-uncased-finetuned-sst-2-english",
    device=device, batch_size=32
)


Device set to use cuda


In [None]:
# Inference on the test dataset:
from transformers.pipelines.pt_utils import KeyDataset

test_outputs = []
for output in sentiment_clf(KeyDataset(dataset["test"], "text"), top_k=None):
    test_outputs.append(output)

# We use KeyDataset to treat the input column we care about as if it were a
# dataset. This makes the computation more efficient. See:
# https://huggingface.co/docs/transformers/en/pipeline_tutorial#using-pipelines-on-a-dataset

In [None]:
len(test_outputs)

1066

In [None]:
test_outputs[0]

[{'label': 'POSITIVE', 'score': 0.9998145699501038},
 {'label': 'NEGATIVE', 'score': 0.00018543035548646003}]

We want the logits in order to compute the loss. To get them, we load the pipeline with the argument `function_to_apply="none"`.



In [None]:
sentiment_clf = pipeline(
    model="distilbert/distilbert-base-uncased-finetuned-sst-2-english",
    device=device, batch_size=32, function_to_apply="none"
)

test_outputs = []
for output in sentiment_clf(KeyDataset(dataset["test"], "text"), top_k=None):
    test_outputs.append(output)

print(test_outputs[0])

Device set to use cuda


[{'label': 'POSITIVE', 'score': 4.434323787689209}, {'label': 'NEGATIVE', 'score': -4.158322334289551}]
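
As a quick check that these raw scores are logits, applying a softmax to them by hand should approximately reproduce the probabilities printed earlier for the same review. This is a sketch using only the standard library; the helper `softmax` is written here for illustration and is not part of the pipeline API.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Logits printed above for the first test review, in the order POSITIVE, NEGATIVE
probs = softmax([4.434323787689209, -4.158322334289551])
print(probs)  # close to the earlier scores: ~0.99981 and ~0.000185
```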


In [None]:
# Put the logits into a np array (n_samples x num_classes)
test_logits = []

for output in test_outputs:
    logits = [0] * len(label_names)
    for item in output:
        label_ = item["label"][:3].lower()
        id_ = label2id[label_]
        logits[id_] = item["score"]
    logits_arr = np.array(logits)
    test_logits.append(logits_arr)

test_logits = np.vstack(test_logits)

In [None]:
print(test_logits)
print(test_logits.shape)

[[-4.15832233  4.43432379]
 [-4.34086657  4.70104599]
 [ 2.54996228 -2.14845586]
 ...
 [ 4.27816534 -3.49481678]
 [ 4.19412088 -3.3804338 ]
 [ 4.05372381 -3.39171791]]
(1066, 2)


In [None]:
test_labels = dataset["test"]["label"]

print(test_labels[:5])
print(len(test_labels))

[1, 1, 1, 1, 1]
1066


In [None]:
compute_metrics(test_logits, test_labels)

{'accuracy': 0.8968105316162109, 'cross_entropy': 0.5667334676437694}
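
One possible reading of these numbers, compared with the fine-tuned model's test results (accuracy about 0.850, cross-entropy about 0.391): the pre-fine-tuned model is right more often, but cross-entropy also penalizes how confident the wrong predictions are. The sketch below computes the per-example loss for two rows of `test_logits` printed above, both with true label 1 (`pos`). The helper `cross_entropy` is illustrative, not a library function.

```python
import math

def cross_entropy(logits, label):
    """Per-example cross-entropy: log-sum-exp of the logits minus the true-class logit."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[label]

# Columns follow label2id: index 0 is neg, index 1 is pos
confident_and_right = cross_entropy([-4.15832233, 4.43432379], label=1)
confident_and_wrong = cross_entropy([2.54996228, -2.14845586], label=1)

print(confident_and_right)  # about 0.0002
print(confident_and_wrong)  # about 4.7
```

A handful of confident mistakes like the second row can raise the mean cross-entropy noticeably even when accuracy is high.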

## Error analysis

It is often useful to analyze a model's errors in order to detect opportunities for improvement, as well as errors in the data.

In this case, we will inspect the _most egregious_ false positives and false negatives of the first model, i.e. the examples where the loss is highest.

In [None]:
data_collator = trainer.data_collator

def run_inference(examples, model):
    """Build a batch from examples and add each example's proba, prediction and loss to it
    """
    examples = {k: v for k, v in examples.items() if k in ['label', 'input_ids', 'attention_mask']}
    batch = data_collator(examples)
    input_ids = batch["input_ids"].to(device)
    attention_mask = batch["attention_mask"].to(device)
    labels = batch["labels"].to(device)
    with torch.inference_mode():
        output = model(input_ids, attention_mask)
        batch["proba"] = torch.softmax(output.logits, dim=1)[:, 1]
        batch["predicted_label"] = torch.argmax(output.logits, axis=1)
    # reduction="none" --> per-example loss
    loss = torch.nn.functional.cross_entropy(output.logits, labels, reduction="none")
    batch["loss"] = loss
    return batch

In [None]:
# Example:
subset_example = tokenized_dataset["validation"][:3]
run_inference(subset_example, bert_model)

{'input_ids': tensor([[  101, 29353,  2135, 15102,  1996,  9428, 20868,  2890,  8663,  6895,
         20470,  2571,  3663,  2090,  4603,  3017,  3008,  1998,  2037, 24211,
          5637,  1998, 11690,  2336,  1012,   102,     0,     0,     0,     0],
        [  101,  1996,  6050,  2894,  2003,  4276,  1996,  3976,  1997,  9634,
          1012,   102,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0],
        [  101,  9172,  2515,  1037, 21459,  3105,  1997,  5762, 11268, 16281,
          5365,  2806,  1011,  1011,  9179,  6581,  3763,  5889,  1997,  2035,
          5535,  1011,  1011,  1037,  9874,  2146,  2058, 20041,  1012,   102]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 0, 0, 0, 0],
        [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1, 1, 1, 1, 1

In [None]:
bert_model.eval()
errors_dataset = tokenized_dataset['validation'].map(
    lambda examples: run_inference(examples, bert_model), batched=True, batch_size=32)

Map:   0%|          | 0/1066 [00:00<?, ? examples/s]

In [None]:
errors_df = errors_dataset.to_pandas()[['text', 'label', 'proba', 'predicted_label', 'loss']]

In [None]:
pd.set_option("display.max_colwidth", None)

In [None]:
# false positives
errors_df.query("label == 0").sort_values("loss", ascending=False).head()

Unnamed: 0,text,label,proba,predicted_label,loss
705,"an uplifting drama . . . what antwone fisher isn't , however , is original .",0,0.984112,1,4.142165
632,how much you are moved by the emotional tumult of [françois and michèle's] relationship depends a lot on how interesting and likable you find them .,0,0.972672,1,3.599855
748,"meticulously mounted , exasperatingly well-behaved film , which ticks off kahlo's lifetime milestones with the dutiful precision of a tax accountant .",0,0.972529,1,3.594626
702,"a bizarre piece of work , with premise and dialogue at the level of kids' television and plot threads as morose as teen pregnancy , rape and suspected murder",0,0.969385,1,3.486253
942,"one of those based-on-truth stories that persuades you , with every scene , that it could never really have happened this way .",0,0.969061,1,3.475724


In [None]:
# false negatives
errors_df.query("label == 1").sort_values("loss", ascending=False).head()

Unnamed: 0,text,label,proba,predicted_label,loss
258,idiotic and ugly .,1,0.02682,0,3.618592
428,like these russo guys lookin' for their mamet instead found their sturges .,1,0.032658,0,3.421658
236,"if the plot seems a bit on the skinny side , that's because panic room is interested in nothing more than sucking you inand making you sweat .",1,0.035095,0,3.349692
485,underachieves only in not taking the shakespeare parallels quite far enough .,1,0.035536,0,3.337212
198,the ending feels at odds with the rest of the film .,1,0.036081,0,3.321985
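
When reading these tables, it may help to recall that for a binary classifier the per-example loss is the negative log of the probability assigned to the true class. The sketch below recomputes the loss of the top row of each table from its `proba` column (the predicted probability of `pos`). The helper `loss_from_proba` is hypothetical.

```python
import math

def loss_from_proba(label, proba_pos):
    """Cross-entropy for one example, given P(pos) and the true label (0=neg, 1=pos)."""
    p_true = proba_pos if label == 1 else 1.0 - proba_pos
    return -math.log(p_true)

# Top false positive (index 705): label 0, proba 0.984112, reported loss 4.142165
print(loss_from_proba(0, 0.984112))
# Top false negative (index 258): label 1, proba 0.02682, reported loss 3.618592
print(loss_from_proba(1, 0.02682))
```

Sorting by loss is therefore equivalent to sorting by how little probability the model gave to the annotated label.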


**QUESTION**: how do you interpret the most egregious false negative in the table above?