
# Home Exercise 3: Hyperparameters and Evaluation
In this third home exercise, you will use the knowledge from Tutorial 4 to experiment with hyperparameters, create a test set, and evaluate your final model on the created test set.

In this notebook, please complete all instructions starting with 👋 ⚒ in the code cell after the sign or provide your analysis in the text cell after the sign.

## **Distilbert: Hyperparameters and Evaluation**

Use the code of Tutorial 4 to load and fine-tune the `distilbert-base-cased` model on the small subset of the `imdb` Movie Review Dataset. For convenience, the code of Tutorial 4 required for this exercise is already provided in the code cells below.

👋 ⚒ When creating the dataset splits in the code cell below, additionally create a test set to be used after the training. Make sure that your test set does not contain any of the sentences contained in the training or validation set and is approximately the same size as the validation set.
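One way to verify the no-overlap requirement is to compare the raw texts of the splits pairwise. The sketch below is only an illustration and not part of the tutorial code: `find_overlap` and the toy splits are hypothetical names introduced here, and the check only catches exact string duplicates.

```python
def find_overlap(splits):
    """Return the texts shared between each pair of splits.

    `splits` maps a split name to a list of texts. Only pairs that
    actually share at least one text appear in the result.
    """
    names = list(splits)
    overlaps = {}
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            shared = set(splits[first]) & set(splits[second])
            if shared:
                overlaps[(first, second)] = shared
    return overlaps


# Toy data (hypothetical) with one deliberate duplicate between train and test.
toy_splits = {
    "train": ["great film", "awful plot"],
    "val": ["loved it"],
    "test": ["awful plot", "boring"],
}
print(find_overlap(toy_splits))
```

In this notebook, the same function could be applied to `{name: list(split["text"]) for name, split in small_imdb_dataset.items()}` once the splits exist; an empty result would indicate that no review text is shared between splits.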

In [2]:
!pip install transformers
!pip install datasets
!pip install evaluate
!pip install accelerate --upgrade



In [3]:
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-cased")
print(tokenizer)

DistilBertTokenizerFast(name_or_path='distilbert/distilbert-base-cased', vocab_size=28996, model_max_length=512, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'unk_token': '[UNK]', 'sep_token': '[SEP]', 'pad_token': '[PAD]', 'cls_token': '[CLS]', 'mask_token': '[MASK]'}, clean_up_tokenization_spaces=False),  added_tokens_decoder={
	0: AddedToken("[PAD]", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	100: AddedToken("[UNK]", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	101: AddedToken("[CLS]", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	102: AddedToken("[SEP]", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	103: AddedToken("[MASK]", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
}


In [4]:
from datasets import load_dataset, DatasetDict
from transformers import DataCollatorWithPadding

imdb_dataset = load_dataset("imdb")
# The line above loads the imdb dataset; comment it out if the dataset is already loaded
# Make sure you have the right tokenizer
tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-cased")


# Just take the first 50 whitespace-separated words for speed on CPU
def truncate(example):
    return {
        'text': " ".join(example['text'].split()[:50]),
        'label': example['label']
    }

# Take 128 random examples for train, 32 for validation, and 32 for test (test comes from the separate imdb test split)
small_imdb_dataset = DatasetDict(
    train=imdb_dataset['train'].shuffle(seed=24).select(range(128)).map(truncate),
    val=imdb_dataset['train'].shuffle(seed=24).select(range(128, 160)).map(truncate),
    test=imdb_dataset['test'].shuffle(seed=24).select(range(160, 192)).map(truncate)
)

def tokenize_function(examples):
    return tokenizer(examples["text"], padding=True, truncation=True)

small_tokenized_dataset = small_imdb_dataset.map(tokenize_function, batched=True, batch_size=16)
data_collator = DataCollatorWithPadding(tokenizer=tokenizer)


In [5]:
small_tokenized_dataset

DatasetDict({
    train: Dataset({
        features: ['text', 'label', 'input_ids', 'attention_mask'],
        num_rows: 128
    })
    val: Dataset({
        features: ['text', 'label', 'input_ids', 'attention_mask'],
        num_rows: 32
    })
    test: Dataset({
        features: ['text', 'label', 'input_ids', 'attention_mask'],
        num_rows: 32
    })
})

In [8]:
small_tokenized_dataset["train"]["text"][:10]

["I didn't know this was a silent movie with narration. I don't care for silent movies - the corny humor, flickering lighting and film, etc. I'm sure that attributes to the low score I assigned it. It was about chapter 8 before I found any interest in this story and",
 'Well, first off, if you\'re checking out Revolt of the Zombies as some very early Night of the Living Dead (1968)-type film, forget it. This is about "zombies" in a more psychological sense, where that term merely denotes someone who is not in control of their will, but who must',
 'John Thaw, of Inspector Morse fame, plays old Tom Oakley in this movie. Tom lives in a tiny English village during 1939 and the start of the Second World War. A bit of a recluse, Tom has not yet recovered from the death of his wife and son while he',
 'I view probably 200 movies a year both at theaters and at home and I can say with confidence that this movie is by far the worst I have seen this year (If not ever, however I have not actually

In [6]:
small_imdb_dataset

DatasetDict({
    train: Dataset({
        features: ['text', 'label'],
        num_rows: 128
    })
    val: Dataset({
        features: ['text', 'label'],
        num_rows: 32
    })
    test: Dataset({
        features: ['text', 'label'],
        num_rows: 32
    })
})

In [7]:
small_imdb_dataset["train"]["text"][:10]

["I didn't know this was a silent movie with narration. I don't care for silent movies - the corny humor, flickering lighting and film, etc. I'm sure that attributes to the low score I assigned it. It was about chapter 8 before I found any interest in this story and",
 'Well, first off, if you\'re checking out Revolt of the Zombies as some very early Night of the Living Dead (1968)-type film, forget it. This is about "zombies" in a more psychological sense, where that term merely denotes someone who is not in control of their will, but who must',
 'John Thaw, of Inspector Morse fame, plays old Tom Oakley in this movie. Tom lives in a tiny English village during 1939 and the start of the Second World War. A bit of a recluse, Tom has not yet recovered from the death of his wife and son while he',
 'I view probably 200 movies a year both at theaters and at home and I can say with confidence that this movie is by far the worst I have seen this year (If not ever, however I have not actually

👋 ⚒ For this exercise, we will use the Hugging Face Trainer class to play with hyperparameters. Try to find a set of hyperparameter settings that achieves the highest possible accuracy on the **validation set** with the small dataset and model in this setup.

**Optional:** If you want to follow a more systematic route, feel free to use available frameworks for hyperparameter optimization, such as [Optuna](https://optuna.org/).
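Short of a full framework, a plain grid search already makes the experimentation systematic. The sketch below is a minimal illustration under stated assumptions: `grid_search` and `placeholder_objective` are hypothetical names, and the placeholder objective is an arbitrary stand-in that does not train anything or reflect real results. In the notebook, the objective would build `TrainingArguments` from `params`, train a fresh model, and return the validation accuracy.

```python
from itertools import product


def grid_search(objective, grid):
    """Try every combination in `grid` and return (best_params, best_score)."""
    keys = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[key] for key in keys)):
        params = dict(zip(keys, values))
        score = objective(params)
        if score > best_score:  # keep the first combination on ties
            best_params, best_score = params, score
    return best_params, best_score


def placeholder_objective(params):
    """Stand-in score, NOT a real training run (assumption for illustration)."""
    return (-abs(params["learning_rate"] - 5e-5)
            - 0.001 * abs(params["num_train_epochs"] - 5))


grid = {
    "learning_rate": [2e-5, 5e-5, 1e-4],
    "num_train_epochs": [3, 5, 8],
}
best_params, best_score = grid_search(placeholder_objective, grid)
print(best_params)
```

Because every combination means a full fine-tuning run, the grid should stay small on CPU. Selecting by validation accuracy only, and touching the test set once at the very end, keeps the final estimate unbiased.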

In [9]:
import numpy as np
import evaluate
from transformers import TrainingArguments, Trainer
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained('distilbert/distilbert-base-cased', num_labels=2)
accuracy = evaluate.load("accuracy")

arguments = TrainingArguments(
    output_dir="sample_cl_trainer",
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    logging_steps=8,
    num_train_epochs=5,
    eval_strategy="epoch", # run validation at the end of each epoch
    save_strategy="epoch",
    learning_rate=2e-5,
    weight_decay=0.01,
    load_best_model_at_end=True,
    report_to='none',
    seed=224
)

def compute_metrics(eval_pred):
    """Called at the end of validation. Gives accuracy"""
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    # calculates the accuracy
    return accuracy.compute(predictions=predictions, references=labels)


trainer = Trainer(
    model=model,
    args=arguments,
    train_dataset=small_tokenized_dataset['train'],
    eval_dataset=small_tokenized_dataset['val'], # change to test when you do your final evaluation!
    processing_class=tokenizer,
    data_collator=data_collator,
    compute_metrics=compute_metrics
)

Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert/distilbert-base-cased and are newly initialized: ['classifier.bias', 'classifier.weight', 'pre_classifier.bias', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.



In [10]:
trainer.train()

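| Epoch | Training Loss | Validation Loss | Accuracy |
|-------|---------------|-----------------|----------|
| 1 | 0.6893 | 0.697243 | 0.46875 |
| 2 | 0.6675 | 0.681305 | 0.5625 |
| 3 | 0.6352 | 0.668048 | 0.5625 |
| 4 | 0.6122 | 0.657491 | 0.65625 |
| 5 | 0.587 | 0.650964 | 0.625 |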


TrainOutput(global_step=40, training_loss=0.6382334113121033, metrics={'train_runtime': 634.4002, 'train_samples_per_second': 1.009, 'train_steps_per_second': 0.063, 'total_flos': 18636507148416.0, 'train_loss': 0.6382334113121033, 'epoch': 5.0})
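Checkpoint folders are named after the global step, not the epoch, so it helps to work out which checkpoint belongs to which epoch. The sketch below is a small worked calculation using the numbers from the training log above; the helper names are hypothetical, and it assumes a single device and no gradient accumulation.

```python
import math


def steps_per_epoch(num_examples, batch_size):
    """Number of optimizer steps in one pass over the training data."""
    return math.ceil(num_examples / batch_size)


def checkpoint_for_epoch(epoch, num_examples, batch_size):
    """Folder name of the checkpoint saved at the end of `epoch`."""
    return f"checkpoint-{epoch * steps_per_epoch(num_examples, batch_size)}"


# Values copied from the training log above (epochs 1 to 5).
val_loss = [0.697243, 0.681305, 0.668048, 0.657491, 0.650964]
val_accuracy = [0.46875, 0.5625, 0.5625, 0.65625, 0.625]

best_by_loss = val_loss.index(min(val_loss)) + 1
best_by_accuracy = val_accuracy.index(max(val_accuracy)) + 1

print(checkpoint_for_epoch(best_by_loss, 128, 16))      # checkpoint-40
print(checkpoint_for_epoch(best_by_accuracy, 128, 16))  # checkpoint-32
```

With 128 training examples and a batch size of 16 there are 8 steps per epoch, which matches `global_step=40` after 5 epochs. Note that `load_best_model_at_end=True` without an explicit `metric_for_best_model` selects by validation loss, so the model kept in memory after training (epoch 5) is not the one with the highest validation accuracy (epoch 4). Setting `metric_for_best_model="accuracy"` in `TrainingArguments` would align the two.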

In [12]:
results = trainer.predict(small_tokenized_dataset['val'])
print(results)

PredictionOutput(predictions=array([[ 0.10912199, -0.24862015],
       [-0.08390947, -0.09545843],
       [-0.23216808,  0.02856027],
       [-0.1546615 , -0.06547712],
       [ 0.03445987, -0.20733963],
       [-0.05204806, -0.17613503],
       [ 0.11700454, -0.24513961],
       [ 0.19191882, -0.32213825],
       [-0.04224282, -0.10301159],
       [ 0.13488218, -0.23724899],
       [-0.12211907, -0.05076962],
       [ 0.17487013, -0.32186297],
       [ 0.23023017, -0.31963074],
       [-0.05891467, -0.1700373 ],
       [-0.14595997, -0.04894198],
       [ 0.058415  , -0.21517295],
       [ 0.10162449, -0.18161692],
       [-0.07434604, -0.15198843],
       [-0.15634504, -0.03174475],
       [ 0.07476897, -0.24102265],
       [-0.15211006, -0.04129795],
       [-0.06554891, -0.13556752],
       [-0.08547176, -0.11775363],
       [ 0.06410259, -0.27182096],
       [-0.17509888, -0.00364114],
       [ 0.2685417 , -0.3666491 ],
       [-0.13844222, -0.11180785],
       [-0.19865847,  0.05

In [13]:
import torch

In [51]:
test_str = "I love this movie!"

fine_tuned_model = AutoModelForSequenceClassification.from_pretrained("sample_cl_trainer/checkpoint-32")  # checkpoint-32 is the end of epoch 4 (8 steps per epoch), the epoch with the highest validation accuracy
model_inputs = tokenizer(test_str, return_tensors="pt")
prediction = torch.argmax(fine_tuned_model(**model_inputs).logits)
print(["NEGATIVE", "POSITIVE"][prediction])

NEGATIVE


In [60]:
val_set = small_tokenized_dataset["val"]["text"]

fine_tuned_model = AutoModelForSequenceClassification.from_pretrained("sample_cl_trainer/checkpoint-32")
tokenizer = AutoTokenizer.from_pretrained("sample_cl_trainer/checkpoint-32")
model_inputs = tokenizer(val_set, return_tensors="pt", padding=True, truncation=True)
model_outputs = fine_tuned_model(**model_inputs)
predictions = torch.softmax(model_outputs.logits, dim=1)
predicted_indices = torch.argmax(predictions, dim=1)
labels = ["NEGATIVE", "POSITIVE"]
predicted_labels = [labels[idx] for idx in predicted_indices]
print(predicted_labels)

['NEGATIVE', 'NEGATIVE', 'POSITIVE', 'POSITIVE', 'NEGATIVE', 'NEGATIVE', 'NEGATIVE', 'NEGATIVE', 'NEGATIVE', 'NEGATIVE', 'POSITIVE', 'NEGATIVE', 'NEGATIVE', 'NEGATIVE', 'POSITIVE', 'NEGATIVE', 'NEGATIVE', 'NEGATIVE', 'POSITIVE', 'NEGATIVE', 'POSITIVE', 'NEGATIVE', 'NEGATIVE', 'NEGATIVE', 'POSITIVE', 'NEGATIVE', 'NEGATIVE', 'POSITIVE', 'NEGATIVE', 'NEGATIVE', 'NEGATIVE', 'NEGATIVE']


In [54]:
print(model_outputs.logits.shape)

torch.Size([32, 2])


In [55]:
print(model_inputs)

print(tokenizer.batch_decode(model_inputs.input_ids, skip_special_tokens=True))

{'input_ids': tensor([[ 101, 1135,  112,  ...,    0,    0,    0],
        [ 101,  146, 1138,  ...,    0,    0,    0],
        [ 101,  119,  119,  ...,    0,    0,    0],
        ...,
        [ 101, 1987,  119,  ...,    0,    0,    0],
        [ 101, 5203, 1136,  ...,    0,    0,    0],
        [ 101, 1192, 1221,  ..., 8431, 1193,  102]]), 'attention_mask': tensor([[1, 1, 1,  ..., 0, 0, 0],
        [1, 1, 1,  ..., 0, 0, 0],
        [1, 1, 1,  ..., 0, 0, 0],
        ...,
        [1, 1, 1,  ..., 0, 0, 0],
        [1, 1, 1,  ..., 0, 0, 0],
        [1, 1, 1,  ..., 1, 1, 1]])}
["It ' s a good thing I didn ' t watch this while i was pregnant. I definitely would have cried my eyes out and / or vomit. It was Kind of gruesome mainly disturbing. I personally thought the baby was adorable in its own twisted little way. However as a mom I cringed when Beth stabbed herself", 'I have been a fan of Pushing Daisies since the very beginning. It is wonderfully thought up, and Bryan Fuller has the most re

In [57]:
print(predictions)

print(len(predictions))

tensor([[0.5872, 0.4128],
        [0.5246, 0.4754],
        [0.4478, 0.5522],
        [0.4944, 0.5056],
        [0.5694, 0.4306],
        [0.5419, 0.4581],
        [0.5879, 0.4121],
        [0.6222, 0.3778],
        [0.5242, 0.4758],
        [0.5949, 0.4051],
        [0.4965, 0.5035],
        [0.6206, 0.3794],
        [0.6269, 0.3731],
        [0.5402, 0.4598],
        [0.4950, 0.5050],
        [0.5729, 0.4271],
        [0.5784, 0.4216],
        [0.5311, 0.4689],
        [0.4788, 0.5212],
        [0.5823, 0.4177],
        [0.4842, 0.5158],
        [0.5234, 0.4766],
        [0.5121, 0.4879],
        [0.5827, 0.4173],
        [0.4745, 0.5255],
        [0.6465, 0.3535],
        [0.5026, 0.4974],
        [0.4509, 0.5491],
        [0.6206, 0.3794],
        [0.5639, 0.4361],
        [0.5588, 0.4412],
        [0.6259, 0.3741]], grad_fn=<SoftmaxBackward0>)
32
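The cells above turn logits into probabilities with softmax and then pick the larger one with argmax. The sketch below reproduces that step in plain Python for a single example; the logit values are hypothetical and `softmax` and `argmax` are helper names introduced here, not the PyTorch functions.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


labels = ["NEGATIVE", "POSITIVE"]
logits = [0.25, -0.35]  # hypothetical logits for one review
probs = softmax(logits)

print(probs)
print(labels[argmax(probs)])
```

Softmax preserves the ordering of the logits, so taking argmax of the logits directly gives the same label. The probabilities printed above mostly sit between 0.35 and 0.65, which suggests the model is only weakly confident after this short fine-tuning run.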


In [61]:
print(val_set)
print(len(val_set))

["It's a good thing I didn't watch this while i was pregnant.I definitely would have cried my eyes out and/or vomit. It was Kind of gruesome mainly disturbing. I personally thought the baby was adorable in its own twisted little way.However as a mom I cringed when Beth stabbed herself", 'I have been a fan of Pushing Daisies since the very beginning. It is wonderfully thought up, and Bryan Fuller has the most remarkable ideas for this show.<br /><br />It is unbelievable on how much TV has been needing a creative, original show like Pushing Daisies. It is a huge', "... but the trouble of this production is that it's very far from a good musical.<br /><br />Granted, one can't always expect the witty masters like Sondheim or Bernstein or Porter; yet the music of this piece makes even Andrew Lloyd Webber look witty. It's deadly dull and uninventive (with", 'This film moved me beyond comprehension, it is and will remain my favourite film of all time, mainly because it has almost every emotio

Fine-tuned model

In [4]:
from datasets import load_dataset, DatasetDict
from transformers import DataCollatorWithPadding

imdb_dataset = load_dataset("imdb")
# The line above loads the imdb dataset; comment it out if the dataset is already loaded
# Make sure you have the right tokenizer
tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-cased")

def truncate100(example):
    return {
        'text': " ".join(example['text'].split()[:100]),  # keep the first 100 words instead of 50
        'label': example['label']
    }

# Take 128 random examples for train, 32 for validation, and 32 for test (test comes from the separate imdb test split)
small_imdb_dataset = DatasetDict(
    train=imdb_dataset['train'].shuffle(seed=24).select(range(128)).map(truncate100),
    val=imdb_dataset['train'].shuffle(seed=24).select(range(128, 160)).map(truncate100),
    test=imdb_dataset['test'].shuffle(seed=24).select(range(160, 192)).map(truncate100)
)

def tokenize_function(examples):
    return tokenizer(examples["text"], padding=True, truncation=True)

small_tokenized_dataset = small_imdb_dataset.map(tokenize_function, batched=True, batch_size=16)
data_collator = DataCollatorWithPadding(tokenizer=tokenizer)


In [None]:
import numpy as np
import evaluate
from transformers import TrainingArguments, Trainer
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained('distilbert/distilbert-base-cased', num_labels=2)
accuracy = evaluate.load("accuracy")

arguments2 = TrainingArguments(
    output_dir="sample_cl_trainer",
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    logging_steps=4,
    num_train_epochs=5,
    eval_strategy="epoch",
    save_strategy="epoch",
    learning_rate=2e-3,
    weight_decay=0.01,
    load_best_model_at_end=True,
    report_to='none',
    seed=224
)

def compute_metrics(eval_pred):
    """Called at the end of validation. Gives accuracy"""
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    # calculates the accuracy
    return accuracy.compute(predictions=predictions, references=labels)


trainer = Trainer(
    model=model,
    args=arguments2,
    train_dataset=small_tokenized_dataset['train'],
    eval_dataset=small_tokenized_dataset['val'], # change to test when you do your final evaluation!
    processing_class=tokenizer,
    data_collator=data_collator,
    compute_metrics=compute_metrics
)

In [None]:
trainer.train()

👋 ⚒ Change the following code cell so that, instead of a single sentence, the entire newly created test set is evaluated on your trained model (make sure to use the correct checkpoint!).

This might also be a good occasion to get familiar with the [Hugging Face documentation and tutorials](https://huggingface.co/docs/transformers/index).

In [None]:
test_trainer = Trainer(
    model=model,
    args=arguments,
    train_dataset=small_tokenized_dataset['train'],
    eval_dataset=small_tokenized_dataset['test'],
    processing_class=tokenizer,
    data_collator=data_collator,
    compute_metrics=compute_metrics
)

In [None]:
test_trainer.train()

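| Epoch | Training Loss | Validation Loss | Accuracy |
|-------|---------------|-----------------|----------|
| 1 | 0.6068 | 0.636106 | 0.65625 |
| 2 | 0.4987 | 0.58742 | 0.71875 |
| 3 | 0.387 | 0.55904 | 0.6875 |
| 4 | 0.3037 | 0.549808 | 0.71875 |
| 5 | 0.2407 | 0.537601 | 0.71875 |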


TrainOutput(global_step=40, training_loss=0.40737428665161135, metrics={'train_runtime': 447.7848, 'train_samples_per_second': 1.429, 'train_steps_per_second': 0.089, 'total_flos': 18636507148416.0, 'train_loss': 0.40737428665161135, 'epoch': 5.0})

In [None]:
test_str = "I love this movie!"

fine_tuned_model = AutoModelForSequenceClassification.from_pretrained("sample_cl_trainer/checkpoint-40")
model_inputs = tokenizer(test_str, return_tensors="pt")
prediction = torch.argmax(fine_tuned_model(**model_inputs).logits)
print(["NEGATIVE", "POSITIVE"][prediction])

OSError: sample_cl_trainer/checkpoint-40 is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo either by logging in with `huggingface-cli login` or by passing `token=<your_token>`

In [None]:
test_set_string = small_imdb_dataset["test"]["text"][0]

fine_tuned_model = AutoModelForSequenceClassification.from_pretrained("sample_cl_trainer/checkpoint-32")
model_inputs = tokenizer(test_set_string, return_tensors="pt", padding=True, truncation=True)
prediction = torch.argmax(fine_tuned_model(**model_inputs).logits)
print(["NEGATIVE", "POSITIVE"][prediction])

NEGATIVE


In [None]:
print(test_set_string)

If you are 10 years old and never seen a movie before, maybe this film may be entertainment for you, but if you've seen several movies, this one will be a silly fully-cliched cheap and predictable for you. Don't waste your time with this.


In [None]:
test_set = small_imdb_dataset["test"]["text"]

fine_tuned_model = AutoModelForSequenceClassification.from_pretrained("sample_cl_trainer/checkpoint-32")
tokenizer = AutoTokenizer.from_pretrained("sample_cl_trainer/checkpoint-32")

tokenized_test = tokenizer(
    test_set,  # All test sentences
    return_tensors="pt",  # PyTorch tensors
    padding=True,  # Pad to the longest sequence
    truncation=True,  # Truncate sequences longer than the model's max length
    max_length=128  # Optional: cap inputs at 128 tokens (below the model's 512-token limit)
)

from torch.utils.data import DataLoader, TensorDataset

# Create a DataLoader for batching
batch_size = 16  # Set an appropriate batch size
test_dataset = TensorDataset(tokenized_test["input_ids"], tokenized_test["attention_mask"])
test_loader = DataLoader(test_dataset, batch_size=batch_size)

# Make predictions
predictions = []
fine_tuned_model.eval()  # Set the model to evaluation mode
with torch.no_grad():  # Disable gradient computation for testing
    for batch in test_loader:
        input_ids, attention_mask = batch  # Unpack batch
        logits = fine_tuned_model(input_ids=input_ids, attention_mask=attention_mask).logits
        batch_predictions = torch.argmax(logits, dim=1)  # Get predicted class indices
        predictions.extend(batch_predictions.numpy())

label_map = ["NEGATIVE", "POSITIVE"]

# Map predictions to labels
mapped_predictions = [label_map[p] for p in predictions]

# Print the predicted labels for the whole test set
print(mapped_predictions)

['NEGATIVE', 'NEGATIVE', 'NEGATIVE', 'POSITIVE', 'NEGATIVE', 'POSITIVE', 'POSITIVE', 'POSITIVE', 'POSITIVE', 'POSITIVE', 'POSITIVE', 'POSITIVE', 'NEGATIVE', 'POSITIVE', 'POSITIVE', 'POSITIVE', 'POSITIVE', 'POSITIVE', 'NEGATIVE', 'NEGATIVE', 'POSITIVE', 'NEGATIVE', 'NEGATIVE', 'NEGATIVE', 'POSITIVE', 'POSITIVE', 'NEGATIVE', 'POSITIVE', 'NEGATIVE', 'POSITIVE', 'POSITIVE', 'NEGATIVE']
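The loop above yields one predicted label per test review, but a test-set evaluation also needs a comparison against the true labels. The sketch below is a minimal accuracy computation in plain Python; the function name and the toy values are hypothetical and do not come from the notebook's results.

```python
def accuracy(predictions, references):
    """Fraction of predictions that match the reference labels."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(int(p == r) for p, r in zip(predictions, references))
    return correct / len(references)


# Toy values (hypothetical): 0 = NEGATIVE, 1 = POSITIVE.
toy_predictions = [0, 1, 1, 0]
toy_references = [0, 1, 0, 0]
print(accuracy(toy_predictions, toy_references))  # 0.75
```

In this notebook, the references would presumably be `small_imdb_dataset["test"]["label"]` and the predictions the integer class indices collected in the loop, converted with `int(p)` if needed. The `evaluate` accuracy metric already loaded earlier computes the same quantity.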


In [None]:
test_set_results = trainer.predict(small_tokenized_dataset['test'])
print(test_set_results)

PredictionOutput(predictions=array([[-0.12628499, -0.1814078 ],
       [-0.13654362, -0.16797242],
       [-0.1676146 , -0.12522896],
       [-0.1624929 , -0.08154882],
       [-0.15949064, -0.15483914],
       [-0.2699575 , -0.04173411],
       [-0.20386572, -0.01026182],
       [-0.2156885 , -0.05279515],
       [-0.25099704, -0.0090025 ],
       [-0.19909038, -0.10893662],
       [-0.22393948, -0.10907632],
       [-0.22485451, -0.00032955],
       [-0.21493222, -0.07440689],
       [-0.1967145 , -0.05077852],
       [-0.22332624, -0.06344146],
       [-0.26871625, -0.04528964],
       [-0.2432943 , -0.05067471],
       [-0.26071727, -0.00381552],
       [-0.19610348, -0.11476539],
       [-0.19854954, -0.10746909],
       [-0.16860248, -0.0921507 ],
       [-0.1962165 , -0.08704866],
       [-0.13990955, -0.17649779],
       [-0.1305032 , -0.17894071],
       [-0.20959505, -0.06493679],
       [-0.2210635 , -0.05317314],
       [-0.15122974, -0.17080607],
       [-0.22553392, -0.04