## Model Classification Correction Method using Feedback with LwF (Learning without Forgetting) and Continual Learning with Transformers

In [1]:
import numpy as np
import torch
from datasets import Dataset, load_dataset, load_metric
from transformers import (AdamW, AutoModelForSequenceClassification,
                          AutoTokenizer, Trainer, TrainingArguments)


In [None]:

raw_datasets = load_dataset("imdb")

In [2]:

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=2)


Some weights of the model checkpoint at distilbert-base-uncased were not used when initializing DistilBertForSequenceClassification: ['vocab_projector.bias', 'vocab_projector.weight', 'vocab_transform.weight', 'vocab_layer_norm.weight', 'vocab_transform.bias', 'vocab_layer_norm.bias']
- This IS expected if you are initializing DistilBertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DistilBertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['classifier.bias', 'pre_classifier.weight', 'classifier

In [3]:
def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True)

tokenized_datasets = raw_datasets.map(tokenize_function, batched=True)

Loading cached processed dataset at C:\Users\Subha\.cache\huggingface\datasets\imdb\plain_text\1.0.0\2fdd8b9bcadd6e7055e742a706876ba43f19faee861df134affd7a3f60fc38a1\cache-96e59664766de491.arrow



Loading cached processed dataset at C:\Users\Subha\.cache\huggingface\datasets\imdb\plain_text\1.0.0\2fdd8b9bcadd6e7055e742a706876ba43f19faee861df134affd7a3f60fc38a1\cache-dd32903ddb52e6d8.arrow


In [5]:


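# Use 1000-example subsets for quick experiments.
# NOTE: the cell that defined these subsets is missing from this export. The sizes match
# the "Num examples = 1000" lines in the logs below; the split names and the shuffle seed
# are assumptions.
small_train_dataset = tokenized_datasets["train"].shuffle(seed=42).select(range(1000))
small_eval_dataset = tokenized_datasets["test"].shuffle(seed=42).select(range(1000))

# Define training arguments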
training_args = TrainingArguments(
    output_dir='./results',          # output directory
    num_train_epochs=3,              # total number of training epochs
    per_device_train_batch_size=4,   # batch size per device during training
    per_device_eval_batch_size=4,    # batch size for evaluation
    warmup_steps=500,                # number of warmup steps for the learning rate scheduler
    weight_decay=0.01,               # strength of weight decay
    logging_dir='./logs',            # directory for storing logs
    logging_steps=10,                # log the training loss every 10 steps
    fp16=True,                       # mixed-precision training (requires a CUDA GPU)
    # save a checkpoint after each epoch and keep only the most recent one
    save_strategy="epoch",
    save_total_limit=1,
    # evaluate the model after each epoch
    evaluation_strategy="epoch",
)



metric = load_metric("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return metric.compute(predictions=predictions, references=labels)

# Define trainer (Trainer is already imported in the first cell)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=small_train_dataset,
    eval_dataset=small_eval_dataset,
    compute_metrics=compute_metrics,
)

# Train model
trainer.train()

Using cuda_amp half precision backend
The following columns in the training set don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `DistilBertForSequenceClassification.forward`,  you can safely ignore this message.
***** Running training *****
  Num examples = 1000
  Num Epochs = 3
  Instantaneous batch size per device = 4
  Total train batch size (w. parallel, distributed & accumulation) = 4
  Gradient Accumulation steps = 1
  Total optimization steps = 750
  Number of trainable parameters = 66955010



{'loss': 0.6918, 'learning_rate': 1.0000000000000002e-06, 'epoch': 0.04}
{'loss': 0.6979, 'learning_rate': 2.0000000000000003e-06, 'epoch': 0.08}
{'loss': 0.6892, 'learning_rate': 3e-06, 'epoch': 0.12}
{'loss': 0.6881, 'learning_rate': 4.000000000000001e-06, 'epoch': 0.16}
{'loss': 0.6997, 'learning_rate': 5e-06, 'epoch': 0.2}
{'loss': 0.689, 'learning_rate': 6e-06, 'epoch': 0.24}
{'loss': 0.6753, 'learning_rate': 7.000000000000001e-06, 'epoch': 0.28}
{'loss': 0.6734, 'learning_rate': 8.000000000000001e-06, 'epoch': 0.32}
{'loss': 0.6902, 'learning_rate': 9e-06, 'epoch': 0.36}
{'loss': 0.6554, 'learning_rate': 1e-05, 'epoch': 0.4}
{'loss': 0.6507, 'learning_rate': 1.1000000000000001e-05, 'epoch': 0.44}
{'loss': 0.6614, 'learning_rate': 1.2e-05, 'epoch': 0.48}
{'loss': 0.6119, 'learning_rate': 1.3000000000000001e-05, 'epoch': 0.52}
{'loss': 0.5973, 'learning_rate': 1.3900000000000002e-05, 'epoch': 0.56}
{'loss': 0.4696, 'learning_rate': 1.49e-05, 'epoch': 0.6}
{'loss': 0.6368, 'learning

The following columns in the evaluation set don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `DistilBertForSequenceClassification.forward`,  you can safely ignore this message.
***** Running Evaluation *****
  Num examples = 1000
  Batch size = 4


{'loss': 0.2127, 'learning_rate': 2.47e-05, 'epoch': 1.0}



Saving model checkpoint to ./results\checkpoint-250
Configuration saved in ./results\checkpoint-250\config.json


{'eval_loss': 0.5845013856887817, 'eval_accuracy': 0.821, 'eval_runtime': 8.4396, 'eval_samples_per_second': 118.489, 'eval_steps_per_second': 29.622, 'epoch': 1.0}


Model weights saved in ./results\checkpoint-250\pytorch_model.bin


{'loss': 0.3982, 'learning_rate': 2.57e-05, 'epoch': 1.04}
{'loss': 0.3194, 'learning_rate': 2.6600000000000003e-05, 'epoch': 1.08}
{'loss': 0.4687, 'learning_rate': 2.7600000000000003e-05, 'epoch': 1.12}
{'loss': 0.2347, 'learning_rate': 2.86e-05, 'epoch': 1.16}
{'loss': 0.3041, 'learning_rate': 2.96e-05, 'epoch': 1.2}
{'loss': 0.4541, 'learning_rate': 3.06e-05, 'epoch': 1.24}
{'loss': 0.2949, 'learning_rate': 3.16e-05, 'epoch': 1.28}
{'loss': 0.6417, 'learning_rate': 3.26e-05, 'epoch': 1.32}
{'loss': 0.6843, 'learning_rate': 3.3600000000000004e-05, 'epoch': 1.36}
{'loss': 0.2406, 'learning_rate': 3.46e-05, 'epoch': 1.4}
{'loss': 0.4716, 'learning_rate': 3.56e-05, 'epoch': 1.44}
{'loss': 0.3938, 'learning_rate': 3.66e-05, 'epoch': 1.48}
{'loss': 0.4684, 'learning_rate': 3.76e-05, 'epoch': 1.52}
{'loss': 0.3668, 'learning_rate': 3.86e-05, 'epoch': 1.56}
{'loss': 0.4193, 'learning_rate': 3.960000000000001e-05, 'epoch': 1.6}
{'loss': 0.1935, 'learning_rate': 4.0600000000000004e-05, 'epoc

The following columns in the evaluation set don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `DistilBertForSequenceClassification.forward`,  you can safely ignore this message.
***** Running Evaluation *****
  Num examples = 1000
  Batch size = 4


{'loss': 0.5307, 'learning_rate': 4.96e-05, 'epoch': 2.0}



Saving model checkpoint to ./results\checkpoint-500
Configuration saved in ./results\checkpoint-500\config.json


{'eval_loss': 0.5543596148490906, 'eval_accuracy': 0.853, 'eval_runtime': 8.5054, 'eval_samples_per_second': 117.572, 'eval_steps_per_second': 29.393, 'epoch': 2.0}


Model weights saved in ./results\checkpoint-500\pytorch_model.bin
Deleting older checkpoint [results\checkpoint-250] due to args.save_total_limit


{'loss': 0.2747, 'learning_rate': 4.88e-05, 'epoch': 2.04}
{'loss': 0.2469, 'learning_rate': 4.6800000000000006e-05, 'epoch': 2.08}
{'loss': 0.4055, 'learning_rate': 4.4800000000000005e-05, 'epoch': 2.12}
{'loss': 0.0329, 'learning_rate': 4.2800000000000004e-05, 'epoch': 2.16}
{'loss': 0.3212, 'learning_rate': 4.08e-05, 'epoch': 2.2}
{'loss': 0.2497, 'learning_rate': 3.88e-05, 'epoch': 2.24}
{'loss': 0.4263, 'learning_rate': 3.68e-05, 'epoch': 2.28}
{'loss': 0.6302, 'learning_rate': 3.48e-05, 'epoch': 2.32}
{'loss': 0.3214, 'learning_rate': 3.2800000000000004e-05, 'epoch': 2.36}
{'loss': 0.1071, 'learning_rate': 3.08e-05, 'epoch': 2.4}
{'loss': 0.5213, 'learning_rate': 2.88e-05, 'epoch': 2.44}
{'loss': 0.0042, 'learning_rate': 2.6800000000000004e-05, 'epoch': 2.48}
{'loss': 0.4518, 'learning_rate': 2.48e-05, 'epoch': 2.52}
{'loss': 0.1712, 'learning_rate': 2.2800000000000002e-05, 'epoch': 2.56}
{'loss': 0.0789, 'learning_rate': 2.08e-05, 'epoch': 2.6}
{'loss': 0.2197, 'learning_rate': 

The following columns in the evaluation set don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `DistilBertForSequenceClassification.forward`,  you can safely ignore this message.
***** Running Evaluation *****
  Num examples = 1000
  Batch size = 4


{'loss': 0.1178, 'learning_rate': 8.000000000000001e-07, 'epoch': 3.0}



Saving model checkpoint to ./results\checkpoint-750
Configuration saved in ./results\checkpoint-750\config.json


{'eval_loss': 0.6955428719520569, 'eval_accuracy': 0.852, 'eval_runtime': 8.5212, 'eval_samples_per_second': 117.354, 'eval_steps_per_second': 29.338, 'epoch': 3.0}


Model weights saved in ./results\checkpoint-750\pytorch_model.bin
Deleting older checkpoint [results\checkpoint-500] due to args.save_total_limit


Training completed. Do not forget to share your model on huggingface.co/models =)




{'train_runtime': 150.808, 'train_samples_per_second': 19.893, 'train_steps_per_second': 4.973, 'train_loss': 0.41781923739115395, 'epoch': 3.0}


TrainOutput(global_step=750, training_loss=0.41781923739115395, metrics={'train_runtime': 150.808, 'train_samples_per_second': 19.893, 'train_steps_per_second': 4.973, 'train_loss': 0.41781923739115395, 'epoch': 3.0})
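
The `learning_rate` values in the log above follow a linear warmup over the 500 `warmup_steps`, then a linear decay to zero at step 750. The sketch below reproduces that shape. It assumes the `TrainingArguments` defaults that the notebook does not override (a base learning rate of 5e-5 and the linear scheduler); `linear_schedule_lr` is a hypothetical helper written for illustration, not a library function.

```python
def linear_schedule_lr(step, base_lr=5e-5, warmup_steps=500, total_steps=750):
    """Learning rate after `step` scheduler updates: linear warmup, then linear decay."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr
        return base_lr * step / warmup_steps
    # Decay: ramp linearly from base_lr down to 0 at total_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))


for step in (10, 250, 500, 746, 750):
    print(step, linear_schedule_lr(step))
```

The logged values trail this ideal curve slightly (for example 2.47e-05 at epoch 1.0 instead of 2.5e-05, and 8e-07 at the end instead of 0). One plausible explanation, not confirmed by the log, is that a few optimizer steps were skipped under `fp16`, which also delays the scheduler.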

In [6]:
# Evaluate the model on the evaluation subset (small_eval_dataset)
trainer.evaluate()

# Save the model
trainer.save_model("./models/IMDB-Transformer")

The following columns in the evaluation set don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `DistilBertForSequenceClassification.forward`,  you can safely ignore this message.
***** Running Evaluation *****
  Num examples = 1000
  Batch size = 4



Saving model checkpoint to ./models/IMDB-Transformer
Configuration saved in ./models/IMDB-Transformer\config.json
Model weights saved in ./models/IMDB-Transformer\pytorch_model.bin
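
Because `load_best_model_at_end` is not set in the training arguments, the model saved here is the one from the end of epoch 3. The per-epoch evaluation records logged during training make it easy to check whether that is also the best epoch. The snippet below is a small sketch that copies the three logged records by hand; `eval_log` is a name introduced for illustration.

```python
# Per-epoch evaluation results copied from the training log above
eval_log = [
    {"epoch": 1.0, "eval_loss": 0.5845013856887817, "eval_accuracy": 0.821},
    {"epoch": 2.0, "eval_loss": 0.5543596148490906, "eval_accuracy": 0.853},
    {"epoch": 3.0, "eval_loss": 0.6955428719520569, "eval_accuracy": 0.852},
]

# Pick the best epoch under each criterion
best_by_loss = min(eval_log, key=lambda record: record["eval_loss"])
best_by_accuracy = max(eval_log, key=lambda record: record["eval_accuracy"])

print("Lowest eval loss at epoch", best_by_loss["epoch"])
print("Highest eval accuracy at epoch", best_by_accuracy["epoch"])
```

Both criteria point to epoch 2; by epoch 3 the evaluation loss has risen while accuracy is essentially flat, which suggests the model is starting to overfit the 1000 training examples. If keeping the best checkpoint matters, `TrainingArguments` offers `load_best_model_at_end=True` together with `metric_for_best_model`.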


In [9]:
# Error analysis: show the examples in small_eval_dataset that the model misclassified

# Get the model outputs on small_eval_dataset
pred = trainer.predict(small_eval_dataset)

# Convert logits to predicted class ids and collect the true labels
predictions = np.argmax(pred.predictions, axis=-1)
labels = np.array(small_eval_dataset["label"])

# Indices of the examples the model got wrong
mistakes = np.where(predictions != labels)[0]

# Print each misclassified example with its true label and the model's prediction
for i in mistakes:
    example = small_eval_dataset[int(i)]
    print("Text: ", example["text"])
    print("Label: ", example["label"])
    print("Prediction: ", predictions[i])
    print(" ")

The following columns in the test set don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `DistilBertForSequenceClassification.forward`,  you can safely ignore this message.
***** Running Prediction *****
  Num examples = 1000
  Batch size = 4



Text:  Intended as light entertainment, this film is indeed successful as such during its first half, but then succumbs to a rapidly foundering script that drops it down. Harry (Judd Nelson), a "reformed" burglar, and Daphne (Gina Gershon), an aspiring actress, are employed as live window mannequins at a department store where one evening they are late in leaving and are locked within, whereupon they witness, from their less than protective glass observation point, an apparent homicide occurring on the street. The ostensible murderer, Miles Raymond (Nick Mancuso), a local sculptor, returns the following day to observe the mannequins since he realizes that they are the only possible witnesses to the prior night's violent event and, when one of the posing pair "flinches", the fun begins. Daphne and Harry report their observations at a local police station, but when the detective taking a crime report remembers Harry's criminal background, he becomes cynical. There are a great many ways i
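
Before choosing feedback examples, it can help to split the mistakes by direction. In IMDB, label 0 is a negative review and label 1 is a positive one, so a false positive is a negative review predicted as positive. The sketch below uses small made-up arrays rather than the notebook's real predictions; `split_mistakes` and the `toy_*` names are hypothetical.

```python
import numpy as np


def split_mistakes(predictions, labels):
    """Return the indices of false positives and false negatives for binary labels."""
    predictions = np.asarray(predictions)
    labels = np.asarray(labels)
    false_positives = np.where((predictions == 1) & (labels == 0))[0]
    false_negatives = np.where((predictions == 0) & (labels == 1))[0]
    return false_positives, false_negatives


# Toy data for illustration only
toy_predictions = [1, 0, 1, 1, 0, 0]
toy_labels = [1, 1, 0, 1, 0, 1]

fp, fn = split_mistakes(toy_predictions, toy_labels)
print("False positives:", fp.tolist())
print("False negatives:", fn.tolist())
```

With the real data, the same function could be called as `split_mistakes(predictions, labels)` to see whether the feedback set is dominated by one kind of error.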

## Feedback Loop - Continual Learning

In [12]:
## Feedback loop
# Take a few of the misclassified examples and use them as feedback for continual learning

# Read each column once instead of re-reading it for every index
texts = small_eval_dataset["text"]
true_labels = small_eval_dataset["label"]

# Create a dataset from the misclassified examples (text and true label only)
mistakes_dataset = Dataset.from_dict({
    "text": [texts[i] for i in mistakes],
    "label": [true_labels[i] for i in mistakes],
})
# Keep the first 5 mistakes
mistakes_dataset = mistakes_dataset.select(range(5))

In [14]:
# Loop over mistakes_dataset and show the selected misclassified examples
for example in mistakes_dataset:
    print("Text: ", example["text"])
    print("Label: ", example["label"])
    print(" ")

Text:  Intended as light entertainment, this film is indeed successful as such during its first half, but then succumbs to a rapidly foundering script that drops it down. Harry (Judd Nelson), a "reformed" burglar, and Daphne (Gina Gershon), an aspiring actress, are employed as live window mannequins at a department store where one evening they are late in leaving and are locked within, whereupon they witness, from their less than protective glass observation point, an apparent homicide occurring on the street. The ostensible murderer, Miles Raymond (Nick Mancuso), a local sculptor, returns the following day to observe the mannequins since he realizes that they are the only possible witnesses to the prior night's violent event and, when one of the posing pair "flinches", the fun begins. Daphne and Harry report their observations at a local police station, but when the detective taking a crime report remembers Harry's criminal background, he becomes cynical. There are a great many ways i

In [17]:
# Build the feedback data as (text, correct_label) pairs,
# e.g. [("example input 1", 1), ("example input 2", 0)].
# The labels are the true labels from the dataset, i.e. the opposite of what the model predicted.
feedback_data = [(example["text"], example["label"]) for example in mistakes_dataset]

feedback_data

[('Intended as light entertainment, this film is indeed successful as such during its first half, but then succumbs to a rapidly foundering script that drops it down. Harry (Judd Nelson), a "reformed" burglar, and Daphne (Gina Gershon), an aspiring actress, are employed as live window mannequins at a department store where one evening they are late in leaving and are locked within, whereupon they witness, from their less than protective glass observation point, an apparent homicide occurring on the street. The ostensible murderer, Miles Raymond (Nick Mancuso), a local sculptor, returns the following day to observe the mannequins since he realizes that they are the only possible witnesses to the prior night\'s violent event and, when one of the posing pair "flinches", the fun begins. Daphne and Harry report their observations at a local police station, but when the detective taking a crime report remembers Harry\'s criminal background, he becomes cynical. There are a great many ways in 

In [19]:
# Save the feedback data
import pickle
with open('feedback_data.pkl', 'wb') as f:
    pickle.dump(feedback_data, f)

# Save the small_eval_dataset using save_to_disk
small_eval_dataset.save_to_disk("small_eval_dataset")

Flattening the indices:   0%|          | 0/1 [00:00<?, ?ba/s]

In [2]:
# Load the saved model
model = AutoModelForSequenceClassification.from_pretrained("./models/IMDB-Transformer", num_labels=2)
# The tokenizer was not saved with the model, so reload the base tokenizer
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
# Load the saved eval dataset
small_eval_dataset = Dataset.load_from_disk("./small_eval_dataset")

# Load the saved feedback data pickle file
import pickle
with open('feedback_data.pkl', 'rb') as f:
    feedback_data = pickle.load(f)


In [3]:
feedback_data

[('Intended as light entertainment, this film is indeed successful as such during its first half, but then succumbs to a rapidly foundering script that drops it down. Harry (Judd Nelson), a "reformed" burglar, and Daphne (Gina Gershon), an aspiring actress, are employed as live window mannequins at a department store where one evening they are late in leaving and are locked within, whereupon they witness, from their less than protective glass observation point, an apparent homicide occurring on the street. The ostensible murderer, Miles Raymond (Nick Mancuso), a local sculptor, returns the following day to observe the mannequins since he realizes that they are the only possible witnesses to the prior night\'s violent event and, when one of the posing pair "flinches", the fun begins. Daphne and Harry report their observations at a local police station, but when the detective taking a crime report remembers Harry\'s criminal background, he becomes cynical. There are a great many ways in 

In [4]:
# Save original parameters of the model
previous_param = {name: param.detach().clone() for name, param in model.named_parameters()}
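
This snapshot is the anchor for the regularizer used in the feedback update further down: each parameter tensor is penalized by its mean squared distance from the saved copy. The NumPy sketch below shows that penalty in isolation. It assumes `lambda_value` is meant to scale the penalty; `drift_penalty` and the two toy parameter dictionaries are hypothetical names used only here.

```python
import numpy as np


def drift_penalty(params, previous_params, lambda_value):
    """lambda_value times the sum over tensors of mean((theta - theta_prev) ** 2)."""
    total = 0.0
    for name, value in params.items():
        if name not in previous_params:
            continue  # no reference copy for this tensor
        total += np.mean((value - previous_params[name]) ** 2)
    return lambda_value * total


# Toy parameters for illustration only
previous = {"w": np.array([1.0, 2.0]), "b": np.array([0.5])}
current = {"w": np.array([1.0, 4.0]), "b": np.array([0.5])}

print(drift_penalty(previous, previous, 0.1))  # no drift, so no penalty
print(drift_penalty(current, previous, 0.1))   # 0.1 * mean([0, 4])
```

A larger `lambda_value` keeps the updated model closer to the snapshot, which trades how quickly the feedback is absorbed against how much of the earlier behavior is preserved.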

In [5]:
test_sentence = "This movie is great"
test_output = model(**tokenizer(test_sentence, return_tensors="pt"))

In [6]:
test_output

SequenceClassifierOutput(loss=None, logits=tensor([[-3.0744,  2.5791]], grad_fn=<AddmmBackward0>), hidden_states=None, attentions=None)

In [7]:
test_output.logits.to("cpu").detach().numpy()

array([[-3.0744128,  2.579098 ]], dtype=float32)
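
The two numbers are raw logits for class 0 (negative) and class 1 (positive). A softmax turns them into probabilities; the sketch below applies it in NumPy to the logits printed above. The `softmax` helper is written here for illustration.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


# Logits copied from the output above
logits = np.array([[-3.0744128, 2.579098]], dtype=np.float32)
probs = softmax(logits)

print(probs)                  # class probabilities, summing to 1
print(probs.argmax(axis=-1))  # predicted class id
```

The probability of the positive class comes out above 0.99, so the model is confident that "This movie is great" is a positive review. The argmax of the probabilities always matches the argmax of the logits, so softmax is only needed when the confidence itself is of interest.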

In [12]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Define the loss function
criterion = torch.nn.CrossEntropyLoss()
# Define the optimizer  
optimizer = AdamW(model.parameters(), lr=5e-5)
model.to(device)
mse_loss = torch.nn.MSELoss()

# update the model with feedback
def update_feedback(feedback_data, lambda_value):
    total_loss = 0

    model.train()
    for text, target in feedback_data:
        # Zero the gradients
        target = torch.tensor([target]).to(device)
        optimizer.zero_grad()
        # Forward pass
        inputs = tokenizer(text, return_tensors="pt")
        inputs = {k: v.to(device) for k, v in inputs.items()}
        output = model(**inputs, labels=target)
        # Compute the loss
        loss, logits = output[:2]
        # lambda_value = torch.tensor(lambda_value).to(device).view(1, -1)
        for name, param in model.named_parameters():
            if name not in previous_param:
                continue
            print(f"Param Name: {name}")
            print(f"lambda_value: {lambda_value}")
            param = param.to(device)
            previous_param[name] = previous_param[name].to(device)
            param_diff = param - previous_param[name]
            # print(param_diff.shape)
            # print(lambda_value.shape)
            # broadcast the lambda_value to the shape of param_diff
           
            # lambda_value = lambda_value.expand_as(param_diff)
            loss += mse_loss(param_diff, torch.zeros_like(param_diff))
            # loss = loss.clone()
            # loss = loss.expand_as(param_diff)
            # total_loss += lambda_value * (param_diff) ** 2
            # print(f"Total Loss shape: {total_loss.shape}")
        # Backward pass and optimization
        loss.backward()
        optimizer.step()
    # Save the current parameters
    for name, param in model.named_parameters():
        previous_param[name] = param.detach().clone()


# Update the model with feedback
update_feedback(feedback_data, 0.1)



Param Name: distilbert.embeddings.word_embeddings.weight
lambda_value: 0.1
Param Name: distilbert.embeddings.position_embeddings.weight
lambda_value: 0.1
Param Name: distilbert.embeddings.LayerNorm.weight
lambda_value: 0.1
Param Name: distilbert.embeddings.LayerNorm.bias
lambda_value: 0.1
Param Name: distilbert.transformer.layer.0.attention.q_lin.weight
lambda_value: 0.1
Param Name: distilbert.transformer.layer.0.attention.q_lin.bias
lambda_value: 0.1
Param Name: distilbert.transformer.layer.0.attention.k_lin.weight
lambda_value: 0.1
Param Name: distilbert.transformer.layer.0.attention.k_lin.bias
lambda_value: 0.1
Param Name: distilbert.transformer.layer.0.attention.v_lin.weight
lambda_value: 0.1
Param Name: distilbert.transformer.layer.0.attention.v_lin.bias
lambda_value: 0.1
Param Name: distilbert.transformer.layer.0.attention.out_lin.weight
lambda_value: 0.1
Param Name: distilbert.transformer.layer.0.attention.out_lin.bias
lambda_value: 0.1
Param Name: distilbert.transformer.layer.0
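
The update above regularizes in parameter space. Learning without Forgetting, as usually described, instead regularizes in output space: the outputs of the model before the update are recorded, and the updated model is penalized for drifting away from them through a temperature-softened distillation term. The NumPy sketch below shows only that term. It is an illustration of the idea, not what the cell above computes; the temperature of 2, the KL form of the loss, and the `new_logits` values are assumptions chosen for the example.

```python
import numpy as np


def softened(logits, temperature):
    """Softmax of logits / temperature (a higher temperature gives a softer distribution)."""
    scaled = logits / temperature
    scaled = scaled - scaled.max(axis=-1, keepdims=True)
    exp = np.exp(scaled)
    return exp / exp.sum(axis=-1, keepdims=True)


def distillation_loss(new_logits, old_logits, temperature=2.0):
    """Mean KL(p_old || p_new) between the softened old and new output distributions."""
    p_old = softened(old_logits, temperature)
    p_new = softened(new_logits, temperature)
    kl = np.sum(p_old * (np.log(p_old) - np.log(p_new)), axis=-1)
    return float(np.mean(kl))


# Old logits taken from the "This movie is great" output earlier; new logits are made up
old_logits = np.array([[-3.0744, 2.5791]])
new_logits = np.array([[-1.0, 0.5]])

print(distillation_loss(old_logits, old_logits))  # identical outputs give zero loss
print(distillation_loss(new_logits, old_logits))  # drifted outputs give a positive loss
```

In a full training loop this term would be added to the cross-entropy loss on the feedback examples with its own weight, in the same place where the parameter penalty is added above.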

In [13]:
model.eval()

DistilBertForSequenceClassification(
  (distilbert): DistilBertModel(
    (embeddings): Embeddings(
      (word_embeddings): Embedding(30522, 768, padding_idx=0)
      (position_embeddings): Embedding(512, 768)
      (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (transformer): Transformer(
      (layer): ModuleList(
        (0): TransformerBlock(
          (attention): MultiHeadSelfAttention(
            (dropout): Dropout(p=0.1, inplace=False)
            (q_lin): Linear(in_features=768, out_features=768, bias=True)
            (k_lin): Linear(in_features=768, out_features=768, bias=True)
            (v_lin): Linear(in_features=768, out_features=768, bias=True)
            (out_lin): Linear(in_features=768, out_features=768, bias=True)
          )
          (sa_layer_norm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
          (ffn): FFN(
            (dropout): Dropout(p=0.1, inplace=False)
       

In [17]:
feedback_data[1]

("It's really too bad that nobody knows about this movie. I think if it were just spruced up a little and if it weren't so low-budget, I think one of the major film companies might have wanted to take it. I first saw this movie when I was 11, and I thought it was so powerful with the many great, yet illegal lengths that Mitchell goes to just to keep his family together. It inspired me then and it amazes me now. If you're lucky enough to find a copy of this movie, don't miss it!",
 1)

In [24]:
import numpy as np

for i in range(len(feedback_data)):
    print(f"Text: {feedback_data[i][0]}")
    print(f"Label: {feedback_data[i][1]}")
    print(f"Prediction: {np.argmax(torch.nn.functional.softmax(model(**tokenizer(feedback_data[i][0], return_tensors='pt').to(device)).logits).to('cpu').detach().numpy())}")
    print(" ")
# np.argmax(torch.nn.functional.softmax(model(**tokenizer(feedback_data[1][0], return_tensors="pt").to(device)).logits).to("cpu").detach().numpy())

Text: Intended as light entertainment, this film is indeed successful as such during its first half, but then succumbs to a rapidly foundering script that drops it down. Harry (Judd Nelson), a "reformed" burglar, and Daphne (Gina Gershon), an aspiring actress, are employed as live window mannequins at a department store where one evening they are late in leaving and are locked within, whereupon they witness, from their less than protective glass observation point, an apparent homicide occurring on the street. The ostensible murderer, Miles Raymond (Nick Mancuso), a local sculptor, returns the following day to observe the mannequins since he realizes that they are the only possible witnesses to the prior night's violent event and, when one of the posing pair "flinches", the fun begins. Daphne and Harry report their observations at a local police station, but when the detective taking a crime report remembers Harry's criminal background, he becomes cynical. There are a great many ways in


Prediction: 0
 
Text: It's really too bad that nobody knows about this movie. I think if it were just spruced up a little and if it weren't so low-budget, I think one of the major film companies might have wanted to take it. I first saw this movie when I was 11, and I thought it was so powerful with the many great, yet illegal lengths that Mitchell goes to just to keep his family together. It inspired me then and it amazes me now. If you're lucky enough to find a copy of this movie, don't miss it!
Label: 1
Prediction: 1
 
Text: "An astronaut (Michael Emmet) dies while returning from a mission and his body is recovered by the military. The base where the dead astronaut is taken to becomes the scene of a bizarre invasion plan from outer space. Alien embryos inside the dead astronaut resurrect the corpse and begin a terrifying assault on the military staff in the hopes of conquering the world," according to the DVD sleeve's synopsis.<br /><br />A Roger Corman "American International" prod