**Fine-tuning the LLaMa model with the reasoning strategy we added**

The goal is to fine-tune a pre-trained LLaMa 3.1 8B model. The fine-tuning data is the training data from https://arcprize.org, extended with human-written reasonings in the following format:  <br>
**-Overall Rule**: Which rule produces the output from the input? <br>
**-Dimension**: What does the rule imply for the dimensions of the output grid? <br>
**-Values**: What does the rule imply for the values used in the output? <br>
**-Changed values**: What does the rule imply for the entries that have to change their values? (Mostly applicable when the output starts as a copy of the input.) <br> <br>
The aim is to push the fine-tuned model towards a reasoning strategy, so it is fine-tuned to approach the ARC puzzles in the same way: first think about the overall rule that maps the input to the output, then work out what that rule implies for the concrete output. <br> <br>
To fine-tune the LLaMa model efficiently, Unsloth (https://huggingface.co/blog/mlabonne/sft-llama3) is used.
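
As a rough illustration of the four-part reasoning format, the sketch below joins the parts into one labelled text block. The helper name `format_reasoning`, the label layout, and the example wording are assumptions made for illustration; the actual strings stored in the reasonings file may be laid out differently.

```python
# Hypothetical helper: the exact layout of the stored reasoning strings is not
# shown in this notebook, so the "Label: text" format below is an assumption.
def format_reasoning(overall_rule: str, dimension: str, values: str, changed_values: str) -> str:
    """Join the four reasoning parts into one labelled text block."""
    parts = [
        ("Overall Rule", overall_rule),
        ("Dimension", dimension),
        ("Values", values),
        ("Changed values", changed_values),
    ]
    return "\n".join(f"{label}: {text}" for label, text in parts)


# Illustrative wording only, not taken from the real reasonings file.
example_reasoning = format_reasoning(
    overall_rule="Every non-zero input cell is replaced by a copy of the whole input grid.",
    dimension="A 3x3 input therefore produces a 9x9 output.",
    values="The output uses only 0 and the single colour of the input.",
    changed_values="Not applicable, the output is built from scratch.",
)
print(example_reasoning)
```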


Installing relevant packages

In [1]:
#!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
#!pip install --no-deps "xformers<0.0.27" "trl<0.9.0" peft accelerate bitsandbytes

Import packages

In [2]:
# For fine-tuning
import torch
from trl import SFTTrainer, SFTConfig
from transformers import TrainingArguments, TextStreamer, AutoTokenizer, DataCollatorForSeq2Seq
from unsloth.chat_templates import get_chat_template
from unsloth import FastLanguageModel, is_bfloat16_supported

# For dataset preparation
import json
from datasets import load_dataset
from datasets import Dataset
import pandas as pd

# For HP-Tuning
import itertools
import time

import os



🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
🦥 Unsloth Zoo will now patch everything to make training faster!


Prepare the training JSON file with reasonings as a list of conversations, each of which is a list of dictionaries, as shown in https://docs.unsloth.ai/basics/chat-templates
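
A minimal sketch of the target structure may help here. It assumes ShareGPT-style turns with `from` and `value` keys wrapped in a `conversations` entry, which is what the cells further down build; the helper name `make_record` and the placeholder texts are hypothetical.

```python
# Hypothetical helper showing the shape of one training record:
# a dict with a "conversations" list of {"from": ..., "value": ...} turns.
def make_record(human_text: str, assistant_text: str) -> dict:
    return {
        "conversations": [
            {"from": "human", "value": human_text},   # puzzle examples + test inputs
            {"from": "gpt", "value": assistant_text},  # reasoning + solutions
        ]
    }


# Placeholder texts, only meant to show the nesting.
training_data_example = [
    make_record(
        "Input1:\n[[0, 1], [1, 0]]\nOutput1:\n[[1, 0], [0, 1]]",
        "Overall Rule: placeholder\nSolution to input1:\n[[1, 0], [0, 1]]",
    )
]
print(training_data_example[0]["conversations"][0]["from"])
```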

In [3]:
file_path_solutions = "arc-agi_training_solutions.json"
file_path_challenges = "arc-agi_training_challenges.json"

# Open and load the JSON file
with open(file_path_solutions, "r") as file:
    data_solutions = json.load(file)

with open(file_path_challenges, "r") as file:
    data_challenges = json.load(file)


reasonings = pd.read_csv("reasonings_final.csv", delimiter=";", engine="python")
reasonings_dict = reasonings.set_index(reasonings.columns[0])[reasonings.columns[1]].to_dict()
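
The lookup built above maps the first CSV column (task id) to the second (reasoning text). The following standard-library sketch shows the same idea on a tiny in-memory file; the column names, ids, and texts are made up, and the real `reasonings_final.csv` may contain more columns.

```python
import csv
import io

# Hypothetical two-column, semicolon-separated layout: task id; reasoning text.
sample_csv = "id;reasoning\ntask_a;Overall Rule: placeholder A\ntask_b;Overall Rule: placeholder B\n"

reader = csv.reader(io.StringIO(sample_csv), delimiter=";")
header = next(reader)  # skip the header row
reasonings_lookup = {row[0]: row[1] for row in reader}

print(reasonings_lookup["task_a"])
```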

Visualize Data

In [4]:
#print(data_solutions)

In [5]:
#print(data_challenges)
#print(len(data_challenges))

In [6]:
#print(reasonings_dict)
#print(len(reasonings_dict))

In [None]:
# final list of list of dictionaries the model will be fine-tuned on
training_data = []

for id in data_challenges:

    #list representing the conversation between LLM and user - for each training example there will be one such list appended to the training_data list
    conversation = []


    #examples of inputs and outputs given to the model so that it can observe the input -> output transformation rule
    human_input = "Following are examples which all follow the same transformation rule from input to output."

    train_example_count = 1
    for train_example in data_challenges[id]["train"]:
        human_input += f'''
        Input{train_example_count}:
        {train_example['input']}
        Output{train_example_count}:
        {train_example['output']}
        '''
        train_example_count += 1

    human_input += "Please solve the following inputs with the same transformation rule observed in the examples above."
    test_example_count = 1
    for test_example in data_challenges[id]['test']:
        human_input += f'''
        Input{test_example_count}:
        {test_example['input']}
        '''
        test_example_count += 1

    conversation.append({'from' : 'human', 'value' : human_input})

    assistant_output = 'Sure, following my reasoning strategy and the predicted outputs.\n'
    try:
        assistant_output += reasonings_dict[id] + "\n"
    except KeyError:
        print(f'ID: {id} not found in reasonings_dict')
    
    solutions_count = 1
    for sol in data_solutions[id]:
        assistant_output += f'''Solution to input{solutions_count}:
        {sol}
        '''
        solutions_count += 1
    print('Human input now')
    print(human_input)
    print('Assistant output now')
    print(assistant_output)
    conversation.append({'from': 'gpt', 'value': assistant_output})

    training_data.append({'conversations' : conversation})

Human input now
Following are examples which all follow the same transformation rule from input to output.
        Input1:
        [[0, 7, 7], [7, 7, 7], [0, 7, 7]]
        Output1:
        [[0, 0, 0, 0, 7, 7, 0, 7, 7], [0, 0, 0, 7, 7, 7, 7, 7, 7], [0, 0, 0, 0, 7, 7, 0, 7, 7], [0, 7, 7, 0, 7, 7, 0, 7, 7], [7, 7, 7, 7, 7, 7, 7, 7, 7], [0, 7, 7, 0, 7, 7, 0, 7, 7], [0, 0, 0, 0, 7, 7, 0, 7, 7], [0, 0, 0, 7, 7, 7, 7, 7, 7], [0, 0, 0, 0, 7, 7, 0, 7, 7]]
        
        Input2:
        [[4, 0, 4], [0, 0, 0], [0, 4, 0]]
        Output2:
        [[4, 0, 4, 0, 0, 0, 4, 0, 4], [0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 4, 0, 0, 0, 0, 0, 4, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 4, 0, 4, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 4, 0, 0, 0, 0]]
        
        Input3:
        [[0, 0, 0], [0, 0, 2], [2, 0, 2]]
        Output3:
        [[0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0

In [8]:
#print(training_data)

**Loading Model**

In [9]:
#del model
#del tokenizer
#torch.cuda.empty_cache()

#torch.cuda.memory_summary(device=None, abbreviated=False)

In [10]:
max_seq_length = 2048
load_in_4bit = False
instruct = True
model_name = ''

if load_in_4bit and instruct:
    model_name = "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit"
if load_in_4bit and not instruct:
    model_name = "unsloth/Meta-Llama-3.1-8B-bnb-4bit"
if not load_in_4bit and not instruct:
    model_name = "unsloth/Meta-Llama-3.1-8B"
if not load_in_4bit and instruct:
    model_name = "unsloth/Meta-Llama-3.1-8B-Instruct"

description = ""
if load_in_4bit:
    description += "_load_in_4_bit"
if instruct: 
    description += "_instruct"


model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_name,
    max_seq_length=max_seq_length,
    load_in_4bit=load_in_4bit,
    dtype=None,
)

==((====))==  Unsloth 2025.2.12: Fast Llama patching. Transformers: 4.48.3.
   \\   /|    GPU: Tesla V100-SXM2-32GB. Max memory: 31.733 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.5.1. CUDA: 7.0. CUDA Toolkit: 12.1. Triton: 3.1.0
\        /    Bfloat16 = FALSE. FA [Xformers = 0.0.28.post3. FA2 = False]
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!


Loading checkpoint shards: 100%|██████████| 4/4 [00:19<00:00,  4.96s/it]


**Hyperparameter Tuning Setup**

In [11]:
# Hyperparameter search space
param_grid = {
    "learning_rate": [5e-5, 5e-4, 1e-4],
    "per_device_train_batch_size": [4, 8, 16],  
    "num_train_epochs": [1, 3, 5],
    "weight_decay": [0.01, 0.1],
}

# Generate all combinations
param_combinations = list(itertools.product(*param_grid.values()))

# Store results
results = []
model_save_paths = {}
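
To get a feel for the size of this search space, the sketch below expands the same grid into named configurations and counts them. The variable names `keys` and `combos` are introduced here for illustration only.

```python
import itertools

param_grid = {
    "learning_rate": [5e-5, 5e-4, 1e-4],
    "per_device_train_batch_size": [4, 8, 16],
    "num_train_epochs": [1, 3, 5],
    "weight_decay": [0.01, 0.1],
}

# Pair each value tuple with its parameter names for readability.
keys = list(param_grid)
combos = [dict(zip(keys, values)) for values in itertools.product(*param_grid.values())]

print(len(combos))  # 3 * 3 * 3 * 2 = 54 training runs
print(combos[0])    # first configuration in iteration order
```

Each of the 54 configurations is a full training run, so the grid search is expensive on an 8B model.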

Rank-stabilized LoRA (rsLoRA: https://arxiv.org/abs/2312.03732) is used for training.
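
As a small sketch of what rank stabilization changes: standard LoRA scales the adapter update by `lora_alpha / r`, whereas rsLoRA scales it by `lora_alpha / sqrt(r)`. The function names below are illustrative and are not part of any library API.

```python
import math


def lora_scale(lora_alpha: float, r: int) -> float:
    """Scaling factor of standard LoRA."""
    return lora_alpha / r


def rslora_scale(lora_alpha: float, r: int) -> float:
    """Scaling factor of rank-stabilized LoRA."""
    return lora_alpha / math.sqrt(r)


# With r = 16 and lora_alpha = 16, the values used in this notebook:
print(lora_scale(16, 16))    # 1.0
print(rslora_scale(16, 16))  # 4.0
```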

**Training**

In [12]:
model = FastLanguageModel.get_peft_model(
            model,
            r=16,
            lora_alpha=16,
            lora_dropout=0,
            target_modules=["q_proj", "k_proj", "v_proj", "up_proj", "down_proj", "o_proj", "gate_proj"],
            use_rslora=True,
            use_gradient_checkpointing="unsloth"
        )

Unsloth 2025.2.12 patched 32 layers with 32 QKV layers, 32 O layers and 32 MLP layers.


Parse the dataset in ShareGPT style (not strictly necessary, but convenient) using a chat template
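
The sketch below approximates how a ShareGPT-style conversation is laid out in ChatML. It is a simplified stand-in, not the template that `get_chat_template` actually applies: the real template may add or remap special tokens (for example the end-of-turn token), and `render_chatml` is a hypothetical helper.

```python
# Simplified, hypothetical ChatML renderer for ShareGPT-style turns.
def render_chatml(conversation: list) -> str:
    role_names = {"human": "user", "gpt": "assistant"}
    chunks = []
    for turn in conversation:
        role = role_names[turn["from"]]
        chunks.append(f"<|im_start|>{role}\n{turn['value']}<|im_end|>\n")
    return "".join(chunks)


demo_conversation = [
    {"from": "human", "value": "Hi"},
    {"from": "gpt", "value": "Hello"},
]
demo_text = render_chatml(demo_conversation)
print(demo_text)
```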

In [13]:
tokenizer = get_chat_template(
    tokenizer,
    mapping={"role": "from", "content": "value", "user": "human", "assistant": "gpt"},
    chat_template="chatml",
)

def apply_template(examples):
    messages = examples["conversations"]
    text = [tokenizer.apply_chat_template(message, tokenize=False, add_generation_prompt=False) for message in messages]
    return {"text": text}

dataset = Dataset.from_list(training_data)
dataset = dataset.map(apply_template, batched=True)

split_dataset = dataset.train_test_split(test_size=0.2)

train_dataset = split_dataset['train']
eval_dataset = split_dataset['test']

Unsloth: Will map <|im_end|> to EOS = <|eot_id|>.
Map: 100%|██████████| 400/400 [00:00<00:00, 13897.74 examples/s]


In [None]:
# Visualize one instruction-answer pair
train_dataset[0]

In [14]:
# CSV file where results are stored
csv_file = "grid_search_results.csv"

# Load previous results if file exists, otherwise create empty DataFrame
if os.path.exists(csv_file):
    df_results = pd.read_csv(csv_file)
else:
    df_results = pd.DataFrame(columns=["model_name", "learning_rate", "batch_size", "epochs", "weight_decay", 
                                       "eval_loss", "train_time", "status", "save_path"])

for params in param_combinations:
    lr, batch_size, epochs, wd = params
    model_name = f"FineLlama-3.1-8B{description}_eval_lr{lr}_batch{batch_size}_epochs{epochs}_wd{wd}"

    # Check if the current HP combination has already been trained
    if ((df_results["learning_rate"] == lr) & 
        (df_results["batch_size"] == batch_size) & 
        (df_results["epochs"] == epochs) & 
        (df_results["weight_decay"] == wd)).any():
        print(f"Skipping {model_name} (Already trained with this HP setting)")
        continue  # Skip to the next HP configuration
    
    save_path = f"trained_models/{model_name}"
    print(f"\nTrying: {model_name}")

    
    start_time = time.time()

    try:
        
        trainer = SFTTrainer(
            model=model,
            tokenizer=tokenizer,
            train_dataset=train_dataset,
            eval_dataset=eval_dataset,
            args=TrainingArguments(
                max_steps=-1,
                learning_rate=lr,
                lr_scheduler_type="linear",
                per_device_train_batch_size=batch_size,
                per_device_eval_batch_size=batch_size,
                gradient_accumulation_steps=2,
                num_train_epochs=epochs,
                fp16=not is_bfloat16_supported(),
                bf16=is_bfloat16_supported(),
                logging_steps=1,
                save_strategy="steps",
                save_steps=20,
                save_total_limit=2,
                eval_strategy="steps",
                eval_steps=20,
                load_best_model_at_end=False,
                optim="adamw_8bit",
                weight_decay=wd,
                warmup_steps=10,
                output_dir="output",
                seed=0,
            ),
        )

        # Train the model
        trainer.train(resume_from_checkpoint=False)

        end_time = time.time()
        train_time = end_time - start_time

        # Evaluate model
        eval_results = trainer.evaluate()
        eval_loss = eval_results["eval_loss"]

        # Log results
        new_result = {
            "model_name": model_name,
            "learning_rate": lr,
            "batch_size": batch_size,
            "epochs": epochs,
            "weight_decay": wd,
            "eval_loss": eval_loss,
            "train_time": train_time,
            "status": "Success",
            "save_path": save_path  # Save path is recorded but model is saved only if in top 3
        }

        df_results.loc[len(df_results)] = new_result
        

        # Sort results by eval_loss and keep only top 3 models
        df_results_top3 = df_results.sort_values(by="eval_loss").head(3)

        # Save only if this model is in the top 3
        if model_name in df_results_top3["model_name"].values:
            os.makedirs(save_path, exist_ok=True)
            model.save_pretrained_merged(save_path, tokenizer, save_method="merged_16bit", safe_serialization=None)
            model_save_paths[model_name] = save_path
            print(f"Model {model_name} saved as one of the top 3 models.")

            # If there are now more than 3 saved models, remove the oldest (rank #4 before update) - otherwise runs into disk space issues
            previously_saved_models = set(model_save_paths.keys())
            new_top3_models = set(df_results_top3["model_name"].values)
            
            # Identify model that dropped out of the top 3 (should always just be one at a time)
            models_to_remove = previously_saved_models - new_top3_models

            for old_model in models_to_remove:
                old_save_path = model_save_paths.get(old_model, None)
                if old_save_path and os.path.exists(old_save_path):
                    print(f"Removing old model: {old_model} to free disk space.")
                    os.system(f"rm -rf {old_save_path}")  # Delete directory
                    del model_save_paths[old_model]  # Remove from tracking list

            

    except torch.cuda.OutOfMemoryError:
        print(f"OOM Error: Skipping {model_name} due to VRAM limits!")
        torch.cuda.empty_cache()

        new_result = {
            "model_name": model_name,
            "learning_rate": lr,
            "batch_size": batch_size,
            "epochs": epochs,
            "weight_decay": wd,
            "eval_loss": None,
            "train_time": None,
            "status": "OOM Error",
            "save_path": None
        }
        df_results.loc[len(df_results)] = new_result
    

    except Exception as e:
        print(f"Other Error: {str(e)} - Skipping {model_name}")

        new_result = {
            "model_name": model_name,
            "learning_rate": lr,
            "batch_size": batch_size,
            "epochs": epochs,
            "weight_decay": wd,
            "eval_loss": None,
            "train_time": None,
            "status": f"Error: {str(e)}",
            "save_path": None
        }
        df_results.loc[len(df_results)] = new_result
    
    # Write the full results table (previous and new runs) back to the CSV
    df_results.to_csv(csv_file, index=False)

    print(f"Results saved. Current combinations checked: {len(df_results)}")

print("Grid search completed!")



Skipping FineLlama-3.1-8B_instruct_eval_lr5e-05_batch4_epochs1_wd0.01 (Already trained with this HP setting)
Skipping FineLlama-3.1-8B_instruct_eval_lr5e-05_batch4_epochs1_wd0.1 (Already trained with this HP setting)
Skipping FineLlama-3.1-8B_instruct_eval_lr5e-05_batch4_epochs3_wd0.01 (Already trained with this HP setting)
Skipping FineLlama-3.1-8B_instruct_eval_lr5e-05_batch4_epochs3_wd0.1 (Already trained with this HP setting)
Skipping FineLlama-3.1-8B_instruct_eval_lr5e-05_batch4_epochs5_wd0.01 (Already trained with this HP setting)
Skipping FineLlama-3.1-8B_instruct_eval_lr5e-05_batch4_epochs5_wd0.1 (Already trained with this HP setting)
Skipping FineLlama-3.1-8B_instruct_eval_lr5e-05_batch8_epochs1_wd0.01 (Already trained with this HP setting)
Skipping FineLlama-3.1-8B_instruct_eval_lr5e-05_batch8_epochs1_wd0.1 (Already trained with this HP setting)
Skipping FineLlama-3.1-8B_instruct_eval_lr5e-05_batch8_epochs3_wd0.01 (Already trained with this HP setting)
Skipping FineLlama-3.1-

**Saving best models**

Uploading the best three models (ranked by eval_loss) to HF
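
The ranking step can be sketched without pandas as follows: failed runs are dropped, the rest are sorted by `eval_loss` (lower is better), and the first three are kept. The run names and loss values are made up for illustration, and `top_k_by_eval_loss` is a hypothetical helper.

```python
# Hypothetical results: names and eval_loss values are invented for this sketch.
results = [
    {"model_name": "run_a", "eval_loss": 0.42, "status": "Success"},
    {"model_name": "run_b", "eval_loss": None, "status": "OOM Error"},
    {"model_name": "run_c", "eval_loss": 0.31, "status": "Success"},
    {"model_name": "run_d", "eval_loss": 0.55, "status": "Success"},
    {"model_name": "run_e", "eval_loss": 0.38, "status": "Success"},
]


def top_k_by_eval_loss(rows: list, k: int = 3) -> list:
    """Keep successful runs only and return the k with the lowest eval_loss."""
    successful = [row for row in rows if row["status"] == "Success"]
    return sorted(successful, key=lambda row: row["eval_loss"])[:k]


top_models_demo = top_k_by_eval_loss(results)
print([row["model_name"] for row in top_models_demo])  # ['run_c', 'run_e', 'run_a']
```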

In [17]:
#Upload Best Trained Models

# Load results
df_results = pd.read_csv("grid_search_results.csv")

# Drop failed runs (OOM or errors)
df_results = df_results[df_results["status"] == "Success"]

# Sort by eval_loss (lower is better)
df_results = df_results.sort_values(by="eval_loss")

# Select the best three trained models
top_models = df_results.head(3)

print("Uploading the top 3 trained models based on eval_loss...")

for idx, row in top_models.iterrows():
    model_name = row["model_name"]
    save_path = row["save_path"]

    if save_path and os.path.exists(save_path):

        print(f"Reloading model: {model_name}")

        # Load the trained model from `save_path`
        model, tokenizer = FastLanguageModel.from_pretrained(
            model_name=save_path,  
            max_seq_length=max_seq_length,  
            load_in_4bit=load_in_4bit,     
            dtype=None
        )
        
        print(f"Uploading model: {model_name}")

        # Upload trained model to Hugging Face
        # The access token is read from the HF_TOKEN environment variable instead of being hard-coded
        repo_id = f"phogen/{model_name}"
        model.push_to_hub_merged(repo_id, tokenizer, save_method="merged_16bit", token=os.environ.get("HF_TOKEN"), safe_serialization=None)

        print(f"Model {model_name} uploaded successfully!")
    else:
        print(f"Skipping {model_name} because no trained model was found.")

print("Upload of top 3 trained models completed!")

Uploading the top 3 trained models based on eval_loss...
Reloading model: FineLlama-3.1-8B_instruct_eval_lr5e-05_batch4_epochs1_wd0.01
==((====))==  Unsloth 2025.2.12: Fast Llama patching. Transformers: 4.48.3.
   \\   /|    GPU: Tesla V100-SXM2-32GB. Max memory: 31.733 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.5.1. CUDA: 7.0. CUDA Toolkit: 12.1. Triton: 3.1.0
\        /    Bfloat16 = FALSE. FA [Xformers = 0.0.28.post3. FA2 = False]
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!


Loading checkpoint shards: 100%|██████████| 4/4 [00:24<00:00,  6.04s/it]


Uploading model: FineLlama-3.1-8B_instruct_eval_lr5e-05_batch4_epochs1_wd0.01


Unsloth: You are pushing to hub, but you passed your HF username = phogen.
We shall truncate phogen/FineLlama-3.1-8B_instruct_eval_lr5e-05_batch4_epochs1_wd0.01 to FineLlama-3.1-8B_instruct_eval_lr5e-05_batch4_epochs1_wd0.01


Unsloth: Merging 4bit and LoRA weights to 16bit...
Unsloth: Will use up to 265.17 out of 377.54 RAM for saving.
Unsloth: Saving model... This might take 5 minutes ...


100%|██████████| 32/32 [00:00<00:00, 1129.33it/s]


Unsloth: Saving tokenizer...

tokenizer.json: 100%|██████████| 17.2M/17.2M [00:01<00:00, 11.7MB/s]


 Done.





Done.
Saved merged model to https://huggingface.co/phogen/FineLlama-3.1-8B_instruct_eval_lr5e-05_batch4_epochs1_wd0.01
Model FineLlama-3.1-8B_instruct_eval_lr5e-05_batch4_epochs1_wd0.01 uploaded successfully!
Reloading model: FineLlama-3.1-8B_instruct_eval_lr0.0005_batch4_epochs5_wd0.1
==((====))==  Unsloth 2025.2.12: Fast Llama patching. Transformers: 4.48.3.
   \\   /|    GPU: Tesla V100-SXM2-32GB. Max memory: 31.733 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.5.1. CUDA: 7.0. CUDA Toolkit: 12.1. Triton: 3.1.0
\        /    Bfloat16 = FALSE. FA [Xformers = 0.0.28.post3. FA2 = False]
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!


Loading checkpoint shards: 100%|██████████| 4/4 [00:25<00:00,  6.45s/it]


Uploading model: FineLlama-3.1-8B_instruct_eval_lr0.0005_batch4_epochs5_wd0.1


Unsloth: You are pushing to hub, but you passed your HF username = phogen.
We shall truncate phogen/FineLlama-3.1-8B_instruct_eval_lr0.0005_batch4_epochs5_wd0.1 to FineLlama-3.1-8B_instruct_eval_lr0.0005_batch4_epochs5_wd0.1


Unsloth: Merging 4bit and LoRA weights to 16bit...
Unsloth: Will use up to 265.1 out of 377.54 RAM for saving.
Unsloth: Saving model... This might take 5 minutes ...


100%|██████████| 32/32 [00:00<00:00, 539.46it/s]


Unsloth: Saving tokenizer...

tokenizer.json: 100%|██████████| 17.2M/17.2M [00:01<00:00, 11.6MB/s]


 Done.






Done.
Saved merged model to https://huggingface.co/phogen/FineLlama-3.1-8B_instruct_eval_lr0.0005_batch4_epochs5_wd0.1
Model FineLlama-3.1-8B_instruct_eval_lr0.0005_batch4_epochs5_wd0.1 uploaded successfully!
Reloading model: FineLlama-3.1-8B_instruct_eval_lr5e-05_batch4_epochs1_wd0.1
==((====))==  Unsloth 2025.2.12: Fast Llama patching. Transformers: 4.48.3.
   \\   /|    GPU: Tesla V100-SXM2-32GB. Max memory: 31.733 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.5.1. CUDA: 7.0. CUDA Toolkit: 12.1. Triton: 3.1.0
\        /    Bfloat16 = FALSE. FA [Xformers = 0.0.28.post3. FA2 = False]
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!


Loading checkpoint shards: 100%|██████████| 4/4 [00:24<00:00,  6.19s/it]


Uploading model: FineLlama-3.1-8B_instruct_eval_lr5e-05_batch4_epochs1_wd0.1


Unsloth: You are pushing to hub, but you passed your HF username = phogen.
We shall truncate phogen/FineLlama-3.1-8B_instruct_eval_lr5e-05_batch4_epochs1_wd0.1 to FineLlama-3.1-8B_instruct_eval_lr5e-05_batch4_epochs1_wd0.1


Unsloth: Merging 4bit and LoRA weights to 16bit...
Unsloth: Will use up to 265.18 out of 377.54 RAM for saving.
Unsloth: Saving model... This might take 5 minutes ...


100%|██████████| 32/32 [00:00<00:00, 1126.86it/s]


Unsloth: Saving tokenizer...

tokenizer.json: 100%|██████████| 17.2M/17.2M [00:01<00:00, 9.37MB/s]


 Done.



Done.
Saved merged model to https://huggingface.co/phogen/FineLlama-3.1-8B_instruct_eval_lr5e-05_batch4_epochs1_wd0.1
Model FineLlama-3.1-8B_instruct_eval_lr5e-05_batch4_epochs1_wd0.1 uploaded successfully!
Upload of top 3 trained models completed!
