# Llama 3.2 fine-tuning with "chopped" dataset

2025-01-06 10:33

Over a week of fine-tuning on the size-color-text data set, with little to show for it. The training froze several times without an error message and had to be restarted; I might need to move the loss calculation to a web service. The training loss stayed flat at about 0.286, and the output is not picking up on the colors and sizes. That is a little odd. It might have something to do with the restarted training, although I am not sure. The number of trained parameters should not be an issue for a relatively simple data set like this one.

In [2]:
!apt-get install build-essential -y

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
  binutils binutils-common binutils-x86-64-linux-gnu bzip2 cpp cpp-11 dirmngr
  dpkg-dev fakeroot g++ g++-11 gcc gcc-11 gcc-11-base gnupg gnupg-l10n
  gnupg-utils gpg-agent gpg-wks-client gpg-wks-server gpgsm
  libalgorithm-diff-perl libalgorithm-diff-xs-perl libalgorithm-merge-perl
  libasan6 libbinutils libcc1-0 libctf-nobfd0 libctf0 libdpkg-perl libfakeroot
  libfile-fcntllock-perl libgcc-11-dev libisl23 libitm1 libksba8
  liblocale-gettext-perl liblsan0 libmpc3 libmpfr6 libnpth0 libstdc++-11-dev
  libtsan0 libubsan1 lto-disabled-list make patch pinentry-curses xz-utils
Suggested packages:
  binutils-doc bzip2-doc cpp-doc gcc-11-locales pinentry-gnome3 tor
  debian-keyring g++-multilib g++-11-multilib gcc-11-doc gcc-multilib
  manpages-dev autoconf automake libtool flex bison gdb gcc-doc
  gcc-11-multilib parcimonie xloadimage scdaemon

In [3]:
!pip install unsloth
!pip uninstall unsloth -y && pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"

!pip install sacrebleu
!pip install pytest-playwright
!playwright install
!pip install matplotlib
!pip install pillow
!pip install torchvision
!pip install lpips

!playwright install-deps  

!pip install -U numpy
!pip install tensorboard

Collecting unsloth
  Using cached unsloth-2024.12.11-py3-none-any.whl.metadata (59 kB)
Collecting unsloth_zoo>=2024.12.5 (from unsloth)
  Using cached unsloth_zoo-2024.12.6-py3-none-any.whl.metadata (16 kB)
Collecting xformers>=0.0.27.post2 (from unsloth)
  Using cached xformers-0.0.29-cp311-cp311-manylinux_2_28_x86_64.whl.metadata (1.0 kB)
Collecting bitsandbytes (from unsloth)
  Using cached bitsandbytes-0.45.0-py3-none-manylinux_2_24_x86_64.whl.metadata (2.9 kB)
Collecting tyro (from unsloth)
  Using cached tyro-0.9.5-py3-none-any.whl.metadata (9.4 kB)
Collecting transformers!=4.47.0,>=4.46.1 (from unsloth)
  Using cached transformers-4.47.1-py3-none-any.whl.metadata (44 kB)
Collecting datasets>=2.16.0 (from unsloth)
  Using cached datasets-3.2.0-py3-none-any.whl.metadata (20 kB)
Collecting sentencepiece>=0.2.0 (from unsloth)
  Using cached sentencepiece-0.2.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (7.7 kB)
Collecting accelerate>=0.34.1 (from unsloth)
  

In [1]:
import os
import numpy as np
import pandas as pd

import torch
from trl import SFTTrainer
from transformers import TrainingArguments, TextStreamer
from unsloth.chat_templates import get_chat_template
from unsloth import FastLanguageModel
from datasets import Dataset
from unsloth import is_bfloat16_supported

# Saving model
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Warnings
import warnings
warnings.filterwarnings("ignore")

🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
🦥 Unsloth Zoo will now patch everything to make training faster!


In [2]:
max_seq_length = 131_072

def load_model():
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="unsloth/Llama-3.2-1B-bnb-4bit",
        max_seq_length=max_seq_length,
        load_in_4bit=True,
        dtype=None,
    )
    
    model = FastLanguageModel.get_peft_model(
        model,
        r=16,
        lora_alpha=16,
        lora_dropout=0,
        target_modules=["q_proj", "k_proj", "v_proj", "up_proj", "down_proj", "o_proj", "gate_proj"],
        use_rslora=True,
        use_gradient_checkpointing="unsloth",
        random_state = 32,
        loftq_config = None,
    )
    return model, tokenizer
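
As a sanity check on the adapter size, the trainable parameter count can be worked out by hand. This is a minimal sketch, assuming the published Llama 3.2 1B dimensions (hidden size 2048, MLP size 8192, 16 layers, 8 key/value heads of dimension 64) and the usual LoRA layout of one `rank x in` matrix and one `out x rank` matrix per adapted projection. The helper names are made up for illustration. The result matches the "Number of trainable parameters" line that Unsloth prints in the training log further down.

```python
import math

# Llama 3.2 1B dimensions (assumed from the published model config).
HIDDEN = 2048
INTERMEDIATE = 8192
NUM_LAYERS = 16
KV_DIM = 8 * 64  # 8 key/value heads of dimension 64 (grouped-query attention)

# (in_features, out_features) of every projection listed in target_modules.
PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_parameter_count(rank: int) -> int:
    """Count LoRA weights: rank * (in + out) per projection, for every layer."""
    per_layer = sum(rank * (n_in + n_out) for n_in, n_out in PROJECTIONS.values())
    return per_layer * NUM_LAYERS


def lora_scaling(rank: int, alpha: float, use_rslora: bool) -> float:
    """Scale applied to the adapter output: alpha/sqrt(r) for rsLoRA, else alpha/r."""
    return alpha / math.sqrt(rank) if use_rslora else alpha / rank


print(lora_parameter_count(16))    # 11272192
print(lora_scaling(16, 16, True))  # 4.0
```

With `use_rslora=True`, the adapter output is scaled by 4 rather than by 1, so the effective update is larger than the same `r` and `lora_alpha` would give with plain LoRA.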

In [3]:
def create_trainer(model, tokenizer, training_data, max_steps):
    training_arguments = TrainingArguments(
        learning_rate=3e-4,
        lr_scheduler_type="linear",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=64,  # effective batch size: 2 * 64 = 128
        num_train_epochs=40,
        fp16=not is_bfloat16_supported(),
        bf16=is_bfloat16_supported(),
        logging_steps=1,
        # A positive max_steps overrides num_train_epochs; -1 (the default) disables it.
        max_steps=max_steps if max_steps is not None else -1,
        optim="adamw_8bit",
        weight_decay=0.01,
        warmup_steps=10,
        output_dir="output",
        seed=0,
        save_total_limit=3,
    )
    
    return SFTTrainer(
        model=model,
        tokenizer=tokenizer,
        train_dataset=training_data,
        dataset_text_field="text",
        max_seq_length=max_seq_length,
        dataset_num_proc=10,
        packing=True,
        args=training_arguments,
    )
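
The step and epoch numbers in the training log below follow from these settings. Here is a rough sketch of the arithmetic, assuming the 1,219 packed sequences that Unsloth reports as "Num examples" and assuming the trainer counts one optimizer step per 64 micro-batches and discards the incomplete remainder at the end of an epoch. The function names are illustrative and not part of any library.

```python
import math


def optimizer_steps_per_epoch(num_sequences: int, per_device_batch_size: int,
                              grad_accum_steps: int) -> int:
    """Micro-batches per epoch, integer-divided by the accumulation steps."""
    micro_batches = math.ceil(num_sequences / per_device_batch_size)
    return max(micro_batches // grad_accum_steps, 1)


def epochs_needed(max_steps: int, steps_per_epoch: int) -> int:
    """Number of (possibly partial) epochs required to reach max_steps."""
    return math.ceil(max_steps / steps_per_epoch)


per_epoch = optimizer_steps_per_epoch(1219, 2, 64)
print(per_epoch)                      # 9
print(per_epoch * 40)                 # 360
print(epochs_needed(159, per_epoch))  # 18
print(epochs_needed(172, per_epoch))  # 20
```

Under these assumptions, 40 epochs come to 360 optimizer steps, which is the upper bound of the training loop further down. At roughly an hour per step, as the progress bar reports, a full run would take about 15 days.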

In [4]:
import numpy as np
from utils.similarity import calculate_metrics
from torch.utils.tensorboard import SummaryWriter
from PIL import Image
import torch

log_dir = 'output/runs'

def add_image_to_tensorboard(name, step, img_path):
    image = Image.open(img_path)
    image = image.convert('RGB')
    image_array = np.array(image)
    image_tensor = torch.from_numpy(image_array)
    image_tensor = image_tensor.permute(2, 0, 1)
    image_tensor = image_tensor.float() / 255.0
    
    writer = SummaryWriter(log_dir=log_dir)
    writer.add_image(name, image_tensor, step)
    
def add_text_to_tensorboard(name, step, text):
    writer = SummaryWriter(log_dir=log_dir)
    writer.add_text(name, text, step)

def postprocess_text(preds, labels):
    preds = [pred.strip().replace('<unk>', '') for pred in preds]
    labels = [[label.strip().replace('<unk>', '')] for label in labels]

    return preds, labels

def compute_metrics(decoded_predictions, decoded_labels, steps):
    similarity_scores = []
    perceptual_losses = []
    index = 1
    
    for prediction, label in zip(decoded_predictions, decoded_labels):
        prediction = prediction.replace(tokenizer.eos_token, '')
        
        add_text_to_tensorboard(f'valid_{index}_label_text', steps, label)
        add_text_to_tensorboard(f'valid_{index}_prediction_text', steps, prediction)
        
        metrics = calculate_metrics(prediction, label)
        
        if metrics is not None:
            similarity_scores.append(metrics['similarity'])
            perceptual_losses.append(metrics['perceptual_loss'])
            
            add_image_to_tensorboard(f'valid_{index}_expectation', steps, metrics['expected_screenshot_path'])
            add_image_to_tensorboard(f'valid_{index}_prediction', steps, metrics['predicted_screenshot_path'])
        
        index += 1

    results = {
        "similarity": float(np.mean(similarity_scores)),
        "perceptual_loss": float(np.mean(perceptual_losses)),
    }
    
    writer = SummaryWriter(log_dir=log_dir)
    writer.add_scalar('similarity', results['similarity'], steps)
    writer.add_scalar('perceptual_loss', results['perceptual_loss'], steps)
    
    print("Similarity:", results['similarity'])
    print("Perceptual loss:", results['perceptual_loss'])

    return results

def test_prediction(model, data, steps):
    answers = []
    labels = []
    print("Generating predictions...")
    for row in data:
        # Build the prompt with an empty response slot for the model to complete.
        prompt = data_prompt.format(row['svg'], "")
        inputs = tokenizer([prompt], return_tensors="pt").to("cuda")

        outputs = model.generate(**inputs, max_new_tokens=5020, use_cache=True)
        answer = tokenizer.batch_decode(outputs)
        # Keep only the generated text after the response marker.
        answers.append(answer[0].split("### Response:")[-1])
        labels.append(row['html'])

    print("Computing metrics...")
    # Return the metrics so the training loop can check them for early stopping.
    return compute_metrics(answers, labels, steps)

In [5]:
!rm -rf output

In [13]:
!apt install zip -y
!rm -rf data-rb-size-color-text
!mkdir -p data-rb-size-color-text
!wget "https://www.dropbox.com/scl/fi/689uan4tngw5z1b38hlgt/data-rb-size-color-text.zip?rlkey=tpl5lin2hh2vyn5k3c4dcdanw&dl=0" -O model.zip
!unzip model.zip -d data-rb-size-color-text

!rm -rf data-rb-validate
!mkdir -p data-rb-validate

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
zip is already the newest version (3.0-12build2).
0 upgraded, 0 newly installed, 0 to remove and 29 not upgraded.
--2024-12-28 14:01:38--  https://www.dropbox.com/scl/fi/689uan4tngw5z1b38hlgt/data-rb-size-color-text.zip?rlkey=tpl5lin2hh2vyn5k3c4dcdanw&dl=0
Resolving www.dropbox.com (www.dropbox.com)... 162.125.13.18, 2620:100:6057:18::a27d:d12
Connecting to www.dropbox.com (www.dropbox.com)|162.125.13.18|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://uc508029b6da911debf9a83ba374.dl.dropboxusercontent.com/cd/0/inline/ChFJuhnuQI87BhArxW-kT2t4STu2b0DY6PWVHI0ArIEn_Jw-9gE9prU7XmdWHXW8exEbGFCIkjdeWLvuH7Op_llRJyJBEkJ_CN5I6FoWTb5f4SZRj4vqxzUq8NOqD-y6FR0/file# [following]
--2024-12-28 14:01:39--  https://uc508029b6da911debf9a83ba374.dl.dropboxusercontent.com/cd/0/inline/ChFJuhnuQI87BhArxW-kT2t4STu2b0DY6PWVHI0ArIEn_Jw-9gE9prU7XmdWHXW8exEbGFCIkjdeWLvuH7Op_llRJyJ

In [14]:
from datasets import load_from_disk
dataset = load_from_disk('data-rb-size-color-text')

# An integer test_size is an absolute row count: hold out 4 examples for validation.
dataset = dataset.train_test_split(test_size=4)

dataset

DatasetDict({
    train: Dataset({
        features: ['svg', 'html'],
        num_rows: 99849
    })
    test: Dataset({
        features: ['svg', 'html'],
        num_rows: 4
    })
})

In [5]:
model, tokenizer = load_model()

data_prompt = """Your job is to take an SVG file of a web design and convert it into a pixel-perfect HTML and CSS markup and stylesheet.

### Input:
{}

### Response:
{}"""

EOS_TOKEN = tokenizer.eos_token
def formatting_prompt(examples):
    inputs       = examples["svg"]
    outputs      = examples["html"]
    texts = []
    for input_, output in zip(inputs, outputs):
        text = data_prompt.format(input_, output) + EOS_TOKEN
        texts.append(text)
    return { "text" : texts, }



==((====))==  Unsloth 2024.12.11: Fast Llama patching. Transformers: 4.47.1.
   \\   /|    GPU: NVIDIA H100 NVL. Max memory: 93.003 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.5.1+cu124. CUDA: 9.0. CUDA Toolkit: 12.4. Triton: 3.1.0
\        /    Bfloat16 = TRUE. FA [Xformers = 0.0.29. FA2 = False]
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!


Unsloth 2024.12.11 patched 16 layers with 16 QKV layers, 16 O layers and 16 MLP layers.


In [16]:
training_data = dataset.map(formatting_prompt, batched=True)

Map:   0%|          | 0/99849 [00:00<?, ? examples/s]

Map:   0%|          | 0/4 [00:00<?, ? examples/s]

In [17]:
training_data

DatasetDict({
    train: Dataset({
        features: ['svg', 'html', 'text'],
        num_rows: 99849
    })
    test: Dataset({
        features: ['svg', 'html', 'text'],
        num_rows: 4
    })
})

In [18]:
def get_token_lengths(examples):
    inputs = tokenizer(
        examples['text'],
        truncation=False,  # Don't truncate yet
        padding=False,     # Don't pad yet
        return_length=True,
    )

    return inputs

tokenized_data = training_data.map(get_token_lengths, batched=True)

def filter_function(example):
    return example['length'] <= max_seq_length

filtered_data = tokenized_data.filter(filter_function)

print(filtered_data)

Map:   0%|          | 0/99849 [00:00<?, ? examples/s]

Map:   0%|          | 0/4 [00:00<?, ? examples/s]

Filter:   0%|          | 0/99849 [00:00<?, ? examples/s]

Filter:   0%|          | 0/4 [00:00<?, ? examples/s]

DatasetDict({
    train: Dataset({
        features: ['svg', 'html', 'text', 'input_ids', 'attention_mask', 'length'],
        num_rows: 99849
    })
    test: Dataset({
        features: ['svg', 'html', 'text', 'input_ids', 'attention_mask', 'length'],
        num_rows: 4
    })
})


In [19]:
filtered_data = filtered_data.remove_columns(["input_ids", "attention_mask", "length"])
filtered_data.save_to_disk('data-rb-size-color-text-filtered-' + str(max_seq_length))

Saving the dataset (0/3 shards):   0%|          | 0/99849 [00:00<?, ? examples/s]

Saving the dataset (0/1 shards):   0%|          | 0/4 [00:00<?, ? examples/s]

In [6]:
from datasets import load_from_disk

filtered_data = load_from_disk('data-rb-size-color-text-filtered-' + str(max_seq_length))

filtered_data

DatasetDict({
    train: Dataset({
        features: ['svg', 'html', 'text'],
        num_rows: 99849
    })
    test: Dataset({
        features: ['svg', 'html', 'text'],
        num_rows: 4
    })
})

In [7]:
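import torch
from tqdm import tqdm

# Set to False for a fresh run; True continues from the latest checkpoint in output/.
# resume = False
resume = True
# Train one optimizer step per iteration and evaluate on the test rows after each one.
for steps in tqdm(range(159, 360, 1)):
    print(f"Steps: {steps}")

    if steps > 0:
        # steps is the cumulative max_steps target, so each iteration adds one step.
        trainer = create_trainer(model, tokenizer, filtered_data['train'], steps)
        if resume:
            trainer.train(resume_from_checkpoint=True)
        else:
            trainer.train()
            resume = True
        
    model = FastLanguageModel.for_inference(model)

    results = test_prediction(model, filtered_data['test'], steps)

    # Stop early once the rendered prediction matches the expectation exactly.
    if results is not None and results['perceptual_loss'] == 0.0:
        break

    model = FastLanguageModel.for_training(model)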

    

  0%|          | 0/201 [00:00<?, ?it/s]

Steps: 159


==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 1,219 | Num Epochs = 18
O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 64
\        /    Total batch size = 128 | Total steps = 159
 "-____-"     Number of trainable parameters = 11,272,192


Step,Training Loss
161,0.286


Generating predictions...
Computing metrics...


  0%|          | 1/201 [1:00:58<203:14:38, 3658.39s/it]

Similarity: 0.6557506948709488
Perceptual loss: 0.6466260254383087
Steps: 160


==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 1,219 | Num Epochs = 18
O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 64
\        /    Total batch size = 128 | Total steps = 160
 "-____-"     Number of trainable parameters = 11,272,192


Step,Training Loss
162,0.2858


Generating predictions...
Computing metrics...


  1%|          | 2/201 [2:01:49<201:59:13, 3654.04s/it]

Similarity: 0.6557506948709488
Perceptual loss: 0.6466260254383087
Steps: 161


==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 1,219 | Num Epochs = 18
O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 64
\        /    Total batch size = 128 | Total steps = 161
 "-____-"     Number of trainable parameters = 11,272,192


Step,Training Loss


Generating predictions...
Computing metrics...


  1%|▏         | 3/201 [2:04:55<113:52:04, 2070.33s/it]

Similarity: 0.6557506948709488
Perceptual loss: 0.6466260254383087
Steps: 162


==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 1,219 | Num Epochs = 18
O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 64
\        /    Total batch size = 128 | Total steps = 162
 "-____-"     Number of trainable parameters = 11,272,192


Step,Training Loss


Generating predictions...
Computing metrics...


  2%|▏         | 4/201 [2:08:02<72:37:15, 1327.08s/it] 

Similarity: 0.6557506948709488
Perceptual loss: 0.6466260254383087
Steps: 163


==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 1,219 | Num Epochs = 19
O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 64
\        /    Total batch size = 128 | Total steps = 163
 "-____-"     Number of trainable parameters = 11,272,192


Step,Training Loss
163,0.2859


Generating predictions...
Computing metrics...


  2%|▏         | 5/201 [3:08:49<117:48:12, 2163.74s/it]

Similarity: 0.6557506948709488
Perceptual loss: 0.6466260254383087
Steps: 164


==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 1,219 | Num Epochs = 19
O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 64
\        /    Total batch size = 128 | Total steps = 164
 "-____-"     Number of trainable parameters = 11,272,192


Step,Training Loss
164,0.2859


Generating predictions...
Computing metrics...


  3%|▎         | 6/201 [4:09:44<144:39:20, 2670.57s/it]

Similarity: 0.6557506948709488
Perceptual loss: 0.6466260254383087
Steps: 165


==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 1,219 | Num Epochs = 19
O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 64
\        /    Total batch size = 128 | Total steps = 165
 "-____-"     Number of trainable parameters = 11,272,192


Step,Training Loss
165,0.2859


Generating predictions...
Computing metrics...


  3%|▎         | 7/201 [5:10:32<161:08:08, 2990.14s/it]

Similarity: 0.6557506948709488
Perceptual loss: 0.6466260254383087
Steps: 166


==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 1,219 | Num Epochs = 19
O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 64
\        /    Total batch size = 128 | Total steps = 166
 "-____-"     Number of trainable parameters = 11,272,192


Step,Training Loss
166,0.2858


Generating predictions...
Computing metrics...


  4%|▍         | 8/201 [6:11:31<171:43:32, 3203.17s/it]

Similarity: 0.6557506948709488
Perceptual loss: 0.6466260254383087
Steps: 167


==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 1,219 | Num Epochs = 19
O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 64
\        /    Total batch size = 128 | Total steps = 167
 "-____-"     Number of trainable parameters = 11,272,192


Step,Training Loss
167,0.2857


Generating predictions...
Computing metrics...


  4%|▍         | 9/201 [7:12:17<178:13:01, 3341.57s/it]

Similarity: 0.6557506948709488
Perceptual loss: 0.6466260254383087
Steps: 168


==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 1,219 | Num Epochs = 19
O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 64
\        /    Total batch size = 128 | Total steps = 168
 "-____-"     Number of trainable parameters = 11,272,192


Step,Training Loss
168,0.2859


Generating predictions...
Computing metrics...


  5%|▍         | 10/201 [8:13:17<182:30:28, 3439.94s/it]

Similarity: 0.6557506948709488
Perceptual loss: 0.6466260254383087
Steps: 169


==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 1,219 | Num Epochs = 19
O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 64
\        /    Total batch size = 128 | Total steps = 169
 "-____-"     Number of trainable parameters = 11,272,192


Step,Training Loss
169,0.286


Generating predictions...
Computing metrics...


  5%|▌         | 11/201 [9:14:06<184:55:04, 3503.71s/it]

Similarity: 0.6557506948709488
Perceptual loss: 0.6466260254383087
Steps: 170


==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 1,219 | Num Epochs = 19
O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 64
\        /    Total batch size = 128 | Total steps = 170
 "-____-"     Number of trainable parameters = 11,272,192


Step,Training Loss
170,0.2861


Generating predictions...
Computing metrics...


  6%|▌         | 12/201 [10:15:00<186:21:10, 3549.58s/it]

Similarity: 0.6557506948709488
Perceptual loss: 0.6466260254383087
Steps: 171


==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 1,219 | Num Epochs = 19
O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 64
\        /    Total batch size = 128 | Total steps = 171
 "-____-"     Number of trainable parameters = 11,272,192


Step,Training Loss
171,0.2859


Generating predictions...
Computing metrics...


  6%|▋         | 13/201 [11:15:52<186:59:08, 3580.58s/it]

Similarity: 0.6557506948709488
Perceptual loss: 0.6466260254383087
Steps: 172


==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 1,219 | Num Epochs = 20
O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 64
\        /    Total batch size = 128 | Total steps = 172
 "-____-"     Number of trainable parameters = 11,272,192


Step,Training Loss
172,0.286


Generating predictions...
Computing metrics...


  7%|▋         | 14/201 [12:16:50<187:12:32, 3604.02s/it]

Similarity: 0.6557506948709488
Perceptual loss: 0.6466260254383087
Steps: 173


==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 1,219 | Num Epochs = 20
O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 64
\        /    Total batch size = 128 | Total steps = 173
 "-____-"     Number of trainable parameters = 11,272,192


Step,Training Loss
173,0.286


Generating predictions...
Computing metrics...


  7%|▋         | 15/201 [13:17:42<186:56:50, 3618.34s/it]

Similarity: 0.6557506948709488
Perceptual loss: 0.6466260254383087
Steps: 174


==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 1,219 | Num Epochs = 20
O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 64
\        /    Total batch size = 128 | Total steps = 174
 "-____-"     Number of trainable parameters = 11,272,192
 12%|█▏        | 25/201 [24:03:32<169:22:35, 3464.52s/it]


KeyboardInterrupt: 
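
One possible contributor to the flat loss is the interaction between the one-step-at-a-time loop and `lr_scheduler_type="linear"`. A linear schedule decays the learning rate to zero at `max_steps`. If the schedule is rebuilt from the new `max_steps` on every resume (an assumption, not verified against the Trainer source), the single new step in each iteration always sits at the very end of the decay. The sketch below reproduces the standard linear-with-warmup formula in plain Python, using the notebook's `learning_rate=3e-4` and `warmup_steps=10`. The function name is made up for illustration.

```python
def linear_schedule_lr(step: int, max_steps: int,
                       base_lr: float = 3e-4, warmup_steps: int = 10) -> float:
    """Learning rate for optimizer update number `step` (0-based) under
    linear warmup followed by linear decay to zero at `max_steps`."""
    if step < warmup_steps:
        factor = step / max(1, warmup_steps)
    else:
        factor = max(0.0, (max_steps - step) / max(1, max_steps - warmup_steps))
    return base_lr * factor


# Resuming at step 159 with max_steps=160: the update is the last one of the schedule.
print(f"{linear_schedule_lr(159, 160):.2e}")  # 2.00e-06

# The same update inside one uninterrupted 360-step run.
print(f"{linear_schedule_lr(159, 360):.2e}")  # 1.72e-04
```

If that assumption holds, every step in the loop above ran at about 1/150 of the configured learning rate, which would be consistent with a loss that barely moves and validation metrics that are identical from step to step. Passing the final step count as `max_steps` and evaluating from a callback, or switching to a constant schedule, would be ways to test this.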

In [9]:
test_index = 0
text = filtered_data['test'][test_index]['svg']
model = FastLanguageModel.for_inference(model)
inputs = tokenizer(
[
    data_prompt.format(
        #instructions
        text,
        #answer
        "",
    )
], return_tensors = "pt").to("cuda")

outputs = model.generate(**inputs, max_new_tokens = 5020, use_cache = True)
answer=tokenizer.batch_decode(outputs)
answer = answer[0].split("### Response:")[-1]

print(filtered_data['test'][test_index]['html'])
print("Answer of the question is:", answer)

<body><div id="top"><div id="top-left">NIGHT</div><div id="top-right">OCCUR</div></div><div id="bottom"><div id="bottom-left">QUEEN</div><div id="bottom-right">QUEEN</div></div></body>

<style>


        body {
            margin: 0;
            display: flex;
            flex-direction: column;
            min-height: 100vh;
            font-weight: bold;
        }

        #top {
            background-color: #7dc826;
            flex-basis: 52vh;
            flex-grow: 0;
            flex-shrink: 0;
            color: #5ade23;
            font-size: 27px;
        }

        #bottom {
            background-color: #0ce6ce;
            flex: 1 1 auto;
            color: #69150d;
            font-size: 18pt;
        }

        #top, #bottom {
            display: flex;
        }

        #top-left {
            flex-basis: 68vw;
            background: #64190b;
            color: #7018f0;
            font-size: 3em;
        }

        #bottom-left {
            flex-basis: 163px;
     

In [10]:
test_prediction(model, filtered_data['test'], steps)

Generating predictions...
Computing metrics...
Similarity: 0.6557506948709488
Perceptual loss: 0.6466260254383087
