HW3

Implement the “Self-Alignment with Instruction Backtranslation” paper. When fine-tuning the model, use LoRA or QLoRA; full fine-tuning will not fit in the available GPU memory.


Link to paper: https://arxiv.org/pdf/2308.06259.pdf
If you are not able to connect to a GPU on colab, you can try to create a PyTorch Lightning Studio or a Kaggle notebook.


In particular:

Finetune the base language model (llama2 7B) with (output, instruction) pairs {(yi, xi)} from the seed data to obtain a backward model Myx := p(x|y). In other words, finetune a model that uses the output to predict the instruction. Use the training split of the openassistant-guanaco dataset. (25 points)
Push the backwards model to HF and paste url here.  https://huggingface.co/panchub/backward_model

Self-Augmentation -- generate instructions from the LIMA dataset’s completions, filtering out any multi-turn examples (25 points)

Self curation (selecting high quality examples) using few shot prompting in addition to the prompt in Table 1 of the paper. (25 points)
Push the dataset to HF hub and paste the url here  https://huggingface.co/datasets/panchub/high_quality_dataset

Finetune base model on dataset generated by step 3 (25 points)
Push the instruction fine tuned model to HF hub and paste the url here: https://huggingface.co/panchub/fine_tuned_instruction
Please include a link to your colab notebook here:


In [None]:
!nvidia-smi

Mon Feb 26 03:45:15 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02              Driver Version: 530.30.02    CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                        On | 00000000:00:1E.0 Off |                    0 |
| N/A   21C    P8                9W /  70W|      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

In [None]:
!pip install -q -U git+https://github.com/huggingface/peft.git
!pip install -q -U git+https://github.com/huggingface/accelerate.git
!pip install bitsandbytes



In [None]:
!pip install -q datasets

In [None]:
import bitsandbytes
import transformers

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import torch
from torch import nn as nn
from torch.nn import functional as F
from torch import optim

Finetune the base language model (llama2 7B) with (output, instruction) pairs {(yi, xi)} from the seed data to obtain a backward model Myx := p(x|y). In other words, finetune a model that uses the output to predict the instruction. Use the training split of the openassistant-guanaco dataset. (25 points)
Push the backwards model to HF and paste url here


In [None]:
from transformers import AutoTokenizer
model_tag = "NousResearch/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_tag)

if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

In [None]:
from datasets import load_dataset
dataset = load_dataset('timdettmers/openassistant-guanaco')
dataset = dataset['train'].train_test_split(test_size=0.1)
dataset = dataset.filter(lambda x: len(tokenizer.tokenize(x['text'])) < 256)

print(dataset)

alpaca_prompt = """Below is a response that appropriately completes the request. Write an instruction that describes a task.
### Response:
{}

### Instruction:
{}"""

EOS_TOKEN = tokenizer.eos_token



Filter:   0%|          | 0/8861 [00:00<?, ? examples/s]

Filter:   0%|          | 0/985 [00:00<?, ? examples/s]

DatasetDict({
    train: Dataset({
        features: ['text'],
        num_rows: 3050
    })
    test: Dataset({
        features: ['text'],
        num_rows: 317
    })
})


In [None]:
alpaca_prompt_M0 = """Below is an instruction that contains a task. Write a response that appropriately completes the request.
### Instruction:
{}

### Response:
{}"""


In [None]:
splits = ['train', 'test']
texts = []
texts_M0=[]
for split in splits:
    for text in dataset[split]["text"]:
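        # Each record looks like "### Human: ...### Assistant: ..."; keep only the first turn.
        chunks = [c for c in text.split("### ") if len(c) > 0][:2]
        # Skip records that lack a well-formed first Human/Assistant pair.
        if len(chunks) < 2 or not (chunks[0].startswith('Human: ') and chunks[1].startswith('Assistant: ')):
            continue
        # Strip only the leading role prefix, not later occurrences inside the text.
        instruction = chunks[0][len("Human: "):]
        response = chunks[1][len("Assistant: "):]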

        texts.append(alpaca_prompt.format(response,instruction) + EOS_TOKEN)
        texts_M0.append(alpaca_prompt_M0.format(instruction,response) + EOS_TOKEN)


In [None]:
from datasets import Dataset
dataset = Dataset.from_dict({"text": texts})
dataset_M0= Dataset.from_dict({"text": texts_M0})

In [None]:
dataset

Dataset({
    features: ['text'],
    num_rows: 3367
})

In [None]:
dataset_M0["text"][0]

'Below is an instruction that contains a task. Write a response that appropriately completes the request. \n### Instruction:\nPlease copy my next message to you into your reply for that message without removing or adding anything. are you ready?\n\n### Response:\nYes, i am ready. Please input your next message, and i will reply with that same message without changing anything.</s>'

In [None]:
max_length = 256
def generate_and_tokenize_prompt(prompt):
    result = tokenizer(
        prompt['text'],
        truncation=True,
        max_length=max_length,
        padding="max_length",
    )
    result["labels"] = result["input_ids"].copy()
    return result

In [None]:
tokenized_dataset = dataset.map(generate_and_tokenize_prompt)
tokenized_dataset_M0=dataset_M0.map(generate_and_tokenize_prompt)

Map:   0%|          | 0/3367 [00:00<?, ? examples/s]

Map:   0%|          | 0/3367 [00:00<?, ? examples/s]

In [None]:
import os
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
    pipeline,
    logging,
)
from peft import LoraConfig

In [None]:
compute_dtype = torch.float16

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=compute_dtype,
    bnb_4bit_use_double_quant=False,
)

In [None]:
from transformers import AutoTokenizer, AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained(
    model_tag,
    quantization_config=quant_config,
    device_map="auto",
)
model.config.use_cache = False
model.config.pretraining_tp = 1

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]



In [None]:
from peft import LoraConfig, get_peft_model

config = LoraConfig(
    r=32,
    lora_alpha=64,
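    # Llama-2 module names. Add "o_proj", "gate_proj", "up_proj", and
    # "down_proj" to adapt the remaining linear layers as well.
    target_modules=[
        "q_proj",
        "k_proj",
        "v_proj",
        "lm_head"
    ],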
    bias="none",
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, config)
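
As a rough sanity check on the adapter size, LoRA adds r * (in_features + out_features) trainable parameters to each wrapped linear layer. The sketch below applies that formula to the Llama-2 7B shapes (hidden size 4096, 32 decoder layers, vocabulary 32000). The helper name `lora_param_count` is hypothetical, and the total assumes that only q_proj, k_proj, and v_proj in every layer plus lm_head receive adapters.

```python
def lora_param_count(in_features: int, out_features: int, r: int) -> int:
    """Trainable parameters LoRA adds to one linear layer: A is (r x in), B is (out x r)."""
    return r * in_features + out_features * r


# Assumed Llama-2 7B shapes and the rank used in this notebook.
HIDDEN, VOCAB, LAYERS, RANK = 4096, 32000, 32, 32

per_attention_proj = lora_param_count(HIDDEN, HIDDEN, RANK)
lm_head = lora_param_count(HIDDEN, VOCAB, RANK)
# Three attention projections per layer, plus one adapter on lm_head.
total = LAYERS * 3 * per_attention_proj + lm_head
print(per_attention_proj, lm_head, total)
```

Comparing this estimate with the count reported by `model.print_trainable_parameters()` is a quick way to confirm which modules were actually wrapped.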

Train a backward model


In [None]:
from transformers import Trainer, TrainingArguments, DataCollatorForLanguageModeling



training_args = TrainingArguments(
    output_dir="./output",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=1,
    lr_scheduler_type='cosine',
    max_steps=50,
    learning_rate=2e-5,
    optim="paged_adamw_8bit",
    logging_steps=5,
)


trainer = Trainer(
    model=model,
    train_dataset=tokenized_dataset,
    args=training_args,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False)
)

trainer.train()



TrainOutput(global_step=50, training_loss=2.407985973358154, metrics={'train_runtime': 35.134, 'train_samples_per_second': 1.423, 'train_steps_per_second': 1.423, 'total_flos': 509465434521600.0, 'train_loss': 2.407985973358154, 'epoch': 0.01})

Train a forward model

In [None]:


training_args = TrainingArguments(
    output_dir="./output_1",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=1,
    lr_scheduler_type='cosine',
    max_steps=50,
    learning_rate=2e-5,
    optim="paged_adamw_8bit",
    logging_steps=5,
)
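# Caution: `model` is the same LoRA-wrapped object that was just trained as the
# backward model, so this run keeps updating those adapter weights. For an
# independent forward model, load a fresh copy of the base model first.
trainer_M0 = Trainer(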
    model=model,
    train_dataset=tokenized_dataset_M0,
    args=training_args,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False)
)
trainer_M0.train()

Step,Training Loss
5,2.0114
10,2.1256
15,1.9066
20,1.9059
25,2.122
30,1.8969
35,1.9151
40,1.9308
45,1.7498


TrainOutput(global_step=50, training_loss=1.9823297119140626, metrics={'train_runtime': 33.9612, 'train_samples_per_second': 1.472, 'train_steps_per_second': 1.472, 'total_flos': 509465434521600.0, 'train_loss': 1.9823297119140626, 'epoch': 0.01})

In [None]:
from huggingface_hub import notebook_login

# Use the notebook_login function to log in
notebook_login()

VBox(children=(HTML(value='<center> <img\nsrc=https://huggingface.co/front/assets/huggingface_logo-noborder.sv…

In [None]:
new_model_name = "llama-2-7b-chat-guanaco"
new_model_name_fwd= "llama-2-7b-chat-guanaco_fwd"

trainer.tokenizer = tokenizer
trainer_M0.tokenizer = tokenizer
# Save model and tokenizer
trainer.model.save_pretrained(new_model_name)
trainer.tokenizer.save_pretrained(new_model_name)
# Push the backward model to the Hub
trainer.model.push_to_hub('panchub/backward_model')
# Save the forward model under its own directory, then push it
trainer_M0.model.save_pretrained(new_model_name_fwd)
trainer_M0.tokenizer.save_pretrained(new_model_name_fwd)
trainer_M0.model.push_to_hub('panchub/forward_model')

print(trainer.model)
print(trainer.tokenizer)




adapter_model.safetensors:   0%|          | 0.00/365M [00:00<?, ?B/s]

adapter_model.safetensors:   0%|          | 0.00/365M [00:00<?, ?B/s]

PeftModelForCausalLM(
  (base_model): LoraModel(
    (model): LlamaForCausalLM(
      (model): LlamaModel(
        (embed_tokens): Embedding(32000, 4096, padding_idx=0)
        (layers): ModuleList(
          (0-31): 32 x LlamaDecoderLayer(
            (self_attn): LlamaSdpaAttention(
              (q_proj): lora.Linear4bit(
                (base_layer): Linear4bit(in_features=4096, out_features=4096, bias=False)
                (lora_dropout): ModuleDict(
                  (default): Dropout(p=0.05, inplace=False)
                )
                (lora_A): ModuleDict(
                  (default): Linear(in_features=4096, out_features=32, bias=False)
                )
                (lora_B): ModuleDict(
                  (default): Linear(in_features=32, out_features=4096, bias=False)
                )
                (lora_embedding_A): ParameterDict()
                (lora_embedding_B): ParameterDict()
              )
              (k_proj): lora.Linear4bit(
                (base_

Self-Augmentation -- generate instructions from the LIMA dataset’s completions, filtering out any multi-turn examples

In [None]:
dataset_lima=load_dataset('GAIR/lima')

In [None]:
dataset_lima = dataset_lima.filter(lambda x: len((x['conversations'])) < 3)

print(dataset_lima)

DatasetDict({
    train: Dataset({
        features: ['conversations', 'source'],
        num_rows: 1000
    })
    test: Dataset({
        features: ['conversations', 'source'],
        num_rows: 300
    })
})


In [None]:
dataset_lima["train"][0]

{'conversations': ['Can brain cells move? By movement I mean long distance migration (preferably within the brain only).',
  'The question is relatively broad and one should take into account that the brain not only consists of neurons, but also glial cells (supportive cells) and pre-mitotic neuronal stem cells. Furthermore, as critical fellow-scientists have indicated, developmental stage is very important, as the developing embryonic brain is very different from the adult brain.\nHowever, after sifting through various publications, the answer to the question is actually remarkably simple: Yes, brain cells migrate.\nIn  the adult brain glial cells migrate in the brain (Klämbt, 2009). Glial cells are involved in a myriad of functions, but a notable example of migrating glial cells are the oligodendrocytes that migrate relative long distances to find their target axons onto which they wrap themselves to form the insulating myelin sheath (Tsai and Miller, 2002).\nNeuronal stem cells migr

In [None]:
# Given output
output_text = """Suppose you have a theorem that says "If $X$, then $Y$."  There are two ways to strengthen such a theorem:\n\n* Assume less.  If you can reduce the number of hypotheses, but still prove the same conclusion, then you have proved a more "powerful" result (in the sense that it applies in more situations).\n* Prove more.  If you can keep the same hypotheses, but add more information to the conclusion, then you have also produced a more "powerful" result.\n\nHere is an easy example from Geometry.\n\n  Let $ABCD$ be a (non-square) rectangle.  Then the internal angle bisectors of the vertices intersect at four points $WXYZ$, which are the vertices of a rectangle.\n\n(You need the condition that $ABCD$ is not a square because if it is a square then all four angle bisectors coincide at a single point.)\nHere are a few ways to strengthen the theorem:\n\n* The hypothesis "$ABCD$ is a (non-square) rectangle" can be relaxed to the more general "$ABCD$ is a (non-rhombic) parallelogram".  The conclusion that $WXYZ$ is a rectangle still holds.\n* Alternatively, you can keep the original hypothesis that $ABCD$ is a (non-square) rectangle, and strengthen to the conclusion to say that $WXYZ$ is not just a rectangle, but a square.\n* Having done that, you can then strengthen the conclusion of the theorem even more, by noting that the diagonal of square $WXYZ$ is equal in length to the difference of the lengths of the sides of $ABCD$.\n* Once you know that, you can now strengthen the theorem even more by (finally) removing the hypothesis that $ABCD$ is non-square, and including the  case in which the four angle bisectors coincide at a single point as forming a "degenerate" square with a diagonal of length zero.\n'"""

# Prepare input prompt for the backward model
prompt = f"Generate an instruction for the following output:\n{output_text}\n generated question:"
# Move the prompt tokens to the model's device before generating
input_ids = trainer.tokenizer.encode(prompt, return_tensors="pt").to(trainer.model.device)

# Generate instruction using the fine-tuned backward model
generated_ids = trainer.model.generate(input_ids, max_length=500)  # Adjust max_length as needed
generated_instruction = trainer.tokenizer.decode(generated_ids[0], skip_special_tokens=True)

print(generated_instruction.split("generated question:")[-1].replace("\n", ""))




What are the different ways to strengthen the theorem?Answer:There are two ways to strengthen the theorem:1. Assume less: Relax the hypothesis that $ABCD
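
The output above runs on into an "Answer:" continuation. One possible cause is that the inference prompt differs from the template used for finetuning. A minimal sketch, assuming the backward model was trained on the `alpaca_prompt` layout defined earlier: fill the response slot and leave the instruction slot empty, so that generation continues right after the "### Instruction:" header. `build_backward_prompt` and `extract_instruction` are hypothetical helper names, not part of the original notebook.

```python
BACKWARD_TEMPLATE = """Below is a response that appropriately completes the request. Write an instruction that describes a task.
### Response:
{}

### Instruction:
{}"""


def build_backward_prompt(response: str) -> str:
    # Leave the instruction slot empty so generation starts right after the header.
    return BACKWARD_TEMPLATE.format(response, "")


def extract_instruction(decoded: str) -> str:
    # Keep only the text after the last "### Instruction:" header.
    return decoded.rsplit("### Instruction:", 1)[-1].strip()


prompt_text = build_backward_prompt("Paris is the capital of France.")
print(prompt_text)
```

The string returned by `build_backward_prompt` would be tokenized and passed to `generate` in place of the ad hoc prompt, and `extract_instruction` applied to the decoded result.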


In [None]:
instructions = []
max_conv = 100
i = 0
for conv in dataset_lima["train"]:
    if i > max_conv:
        break
    ques, ans = conv["conversations"][0], conv["conversations"][1]

    # Prepare input prompt for the backward model
    prompt = f"Generate an instruction for the following output:\n{ans}\n generated question:"
    input_ids = trainer.tokenizer.encode(prompt, return_tensors="pt").to(trainer.model.device)

    # Generate instruction using the fine-tuned backward model
    generated_ids = trainer.model.generate(input_ids, max_length=2000)  # Adjust max_length as needed
    generated_instruction = trainer.tokenizer.decode(generated_ids[0], skip_special_tokens=True)

    # Keep the original question next to the generated one for later inspection
    instructions.append({"question": ques, "ans": ans, "generated_ques": generated_instruction})

    if i % 10 == 0:
        print(f"Completed {i} predictions")

    i += 1

augmented_example=[]
# Print or use the generated instructions as needed
for instruction in instructions:
    augmented_example.append({'instruction':instruction["generated_ques"],'output':instruction["ans"]})


Completed 0 predictions
Completed 10 predictions
Completed 20 predictions
Completed 30 predictions
Completed 40 predictions
Completed 50 predictions
Completed 60 predictions
Completed 70 predictions
Completed 80 predictions
Completed 90 predictions
Completed 100 predictions


In [None]:
len(instructions)

101

In [None]:
new_augmented_example = []
for example in augmented_example:
    new_example = {
        # The decoded text echoes the prompt; the part after the
        # "generated question:" marker is the generated question.
        'instruction': example['instruction'].split("generated question:")[-1].replace("\n", ""),
        'output': example['output']
    }
    new_augmented_example.append(new_example)
new_augmented_example[0]['instruction']


'What are the mechanisms that control the migration of brain cells?Answer:Yes, brain cells migrate. In the adult brain, glial cells migrate, and neuronal stem cells migrate over long distances in response to injury. Non-differentiated neurons have been shown to migrate in the adult brain in fish, mammals, and non-human primates. During embryonic development, post-mitotic neurons destined to fulfill peripheral functions migrate over relatively long distances from the neural crest to their target locations.'
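
The generated instruction above still carries an "Answer:" continuation after the question. A minimal sketch of one way to trim it, assuming the first "Answer:" marker always begins unwanted text. `trim_generated_instruction` is a hypothetical helper, not part of the original notebook.

```python
def trim_generated_instruction(text: str, marker: str = "Answer:") -> str:
    # Drop everything from the first marker onward and tidy surrounding whitespace.
    return text.split(marker, 1)[0].strip()


raw = "What are the mechanisms that control the migration of brain cells?Answer:Yes, brain cells migrate."
print(trim_generated_instruction(raw))
```

Text without the marker is returned unchanged apart from whitespace stripping, so the helper is safe to apply to every generated instruction.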

Self curation (selecting high quality examples) using few shot prompting in addition to the prompt in Table 1 of the paper.

In [None]:
# Define the prompting mechanism
prompt = """Below is an instruction from an user and a candidate answer.  Evaluate whether or
not the answer is a good example of how AI Assistant should respond to the user’s
instruction.  Please assign a score using the following 5-point scale:
1:  It means the answer is incomplete, vague, off-topic, controversial, or not
exactly what the user asked for.  For example, some content seems missing, numbered
list does not start from the beginning, the opening sentence repeats user’s question.
Or the response is from another person’s perspective with their personal experience
(e.g.  taken from blog posts), or looks like an answer from a forum.  Or it contains
promotional text, navigation text, or other irrelevant information.
2:  It means the answer addresses most of the asks from the user.  It does not
directly address the user’s question.  For example, it only provides a high-level
methodology instead of the exact solution to user’s question.
3:  It means the answer is helpful but not written by an AI Assistant.  It addresses
all the basic asks from the user.  It is complete and self contained with the
drawback that the response is not written from an AI assistant’s perspective, but
from other people’s perspective.  The content looks like an excerpt from a blog post,
web page, or web search results.  For example, it contains personal experience or
opinion, mentions comments section, or share on social media, etc.
4:  It means the answer is written from an AI assistant’s perspective with a
clear focus of addressing the instruction.  It provide a complete, clear, and
comprehensive response to user’s question or instruction without missing or
irrelevant information.  It is well organized, self-contained, and written in a
helpful tone.  It has minor room for improvement, e.g.  more concise and focused.
5:  It means it is a perfect answer from an AI Assistant.  It has a clear focus on
being a helpful AI Assistant, where the response looks like intentionally written
to address the user’s question or instruction without any irrelevant sentences.  The
answer provides high quality content, demonstrating expert knowledge in the area, is
very well written, logical, easy-to-follow, engaging and insightful.
Please first provide a brief reasoning you used to derive the rating score, and
then write "Score:  <rating>" in the last line.

Instruction: {}\n
Answer: {}

"""




max_iterations = 100
i = 0
curated_set = []
for example in new_augmented_example:
    if i > max_iterations:
        break
    # Fill the template into a new variable so `prompt` keeps its {} placeholders
    # for the next example.
    filled_prompt = prompt.format(example["instruction"], example["output"])
    input_ids = trainer_M0.tokenizer.encode(filled_prompt, return_tensors="pt").to(trainer_M0.model.device)
    generated_ids = trainer_M0.model.generate(input_ids, max_new_tokens=512)
    # Decode only the newly generated tokens so the rubric's own "Score:" text is not parsed
    score = tokenizer.decode(generated_ids[0][input_ids.shape[1]:], skip_special_tokens=True)
    curated_set.append({"generated_question": example["instruction"], "answer": example["output"], "score":score})
    i+=1
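
The assignment asks for few-shot prompting in addition to the Table 1 prompt. A sketch of one way to do this, assuming a handful of hand-scored demonstrations are placed between the rubric and the example to be rated. The demonstrations and the short rubric string below are invented placeholders, and `build_few_shot_prompt` is a hypothetical helper.

```python
from typing import List, Tuple


def build_few_shot_prompt(rubric: str, shots: List[Tuple[str, str, str, int]],
                          instruction: str, answer: str) -> str:
    """Rubric, then scored demonstrations, then the unscored example to rate."""
    parts = [rubric.strip()]
    for shot_instruction, shot_answer, reasoning, score in shots:
        parts.append(
            f"Instruction: {shot_instruction}\nAnswer: {shot_answer}\n"
            f"{reasoning}\nScore: {score}"
        )
    # The final block is left open so the model writes its own reasoning and score.
    parts.append(f"Instruction: {instruction}\nAnswer: {answer}\n")
    return "\n\n".join(parts)


# Placeholder demonstrations (invented for illustration only).
shots = [
    ("What is 2 + 2?", "4.", "Correct but terse, with no explanation.", 3),
    ("Name a primary color.", "I like dogs.", "The answer is off-topic.", 1),
]
few_shot = build_few_shot_prompt("Rate the answer on a 5-point scale.", shots,
                                 "Can brain cells move?", "Yes, brain cells migrate.")
print(few_shot)
```

In the notebook, the rubric argument would be the Table 1 text up to the "Instruction:" line, and the demonstrations would be a few seed examples scored by hand.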




In [None]:
new_scores = []
print(float(curated_set[1]["score"].rsplit("Score:", 1)[-1].strip(" <>")[:1]))
for sample in curated_set:
    try:
        # Take the first character after the last "Score:" as the rating
        score = float(sample["score"].rsplit("Score:", 1)[-1].strip(" <>")[:1])
        new_scores.append({
            "generated_question": sample["generated_question"],
            "answer_to_question": sample["answer"],
            "score": score
        })
    except ValueError as ve:
        # Samples without a parsable rating are skipped
        print(f"Error processing score for {sample}: {ve}")




5.0
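
The parsing above takes the first character after the last "Score:". A slightly more defensive sketch, assuming the rating is a single digit from 1 to 5 that may be wrapped in angle brackets. `parse_score` is a hypothetical helper, not part of the original notebook.

```python
import re
from typing import Optional

# "Score:", optional whitespace, optional "<", then one standalone digit from 1 to 5.
SCORE_RE = re.compile(r"Score:\s*<?\s*([1-5])\b")


def parse_score(text: str) -> Optional[int]:
    """Return the last 1-5 rating that follows "Score:", or None if absent."""
    matches = SCORE_RE.findall(text)
    return int(matches[-1]) if matches else None


print(parse_score("The answer is complete.\nScore: 4"))
print(parse_score('write "Score:  <rating>" in the last line'))
```

Returning None instead of raising lets the caller decide whether to drop or re-score samples whose rating could not be read.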


In [None]:
high_ranking_score = float(3)
filtered_scores = [score for score in new_scores if score["score"] >= high_ranking_score]
filtered_scores



[{'generated_question': 'What are the mechanisms that control the migration of brain cells?Answer:Yes, brain cells migrate. In the adult brain, glial cells migrate, and neuronal stem cells migrate over long distances in response to injury. Non-differentiated neurons have been shown to migrate in the adult brain in fish, mammals, and non-human primates. During embryonic development, post-mitotic neurons destined to fulfill peripheral functions migrate over relatively long distances from the neural crest to their target locations.',
  'answer_to_question': 'The question is relatively broad and one should take into account that the brain not only consists of neurons, but also glial cells (supportive cells) and pre-mitotic neuronal stem cells. Furthermore, as critical fellow-scientists have indicated, developmental stage is very important, as the developing embryonic brain is very different from the adult brain.\nHowever, after sifting through various publications, the answer to the questi

In [None]:
texts_for_dataset = []
for data in filtered_scores:
    texts_for_dataset.append(alpaca_prompt_M0.format(data['generated_question'],data['answer_to_question']) + EOS_TOKEN)



In [None]:
#push dataset to hub
dataset_to_push = Dataset.from_dict({"text":texts_for_dataset})
dataset_to_push.push_to_hub('panchub/high_quality_dataset')

Uploading the dataset shards:   0%|          | 0/1 [00:00<?, ?it/s]

Creating parquet from Arrow format:   0%|          | 0/1 [00:00<?, ?ba/s]

CommitInfo(commit_url='https://huggingface.co/datasets/panchub/high_quality_dataset/commit/702d461829a4b74cc88ac9d609784d96e36538b3', commit_message='Upload dataset', commit_description='', oid='702d461829a4b74cc88ac9d609784d96e36538b3', pr_url=None, pr_revision=None, pr_num=None)

Finetune base model on dataset generated by step 3 (25 points)
Push the instruction fine tuned model to HF hub and paste the url here


In [None]:
from datasets import load_dataset
dataset = load_dataset('panchub/high_quality_dataset')
dataset = dataset['train'].train_test_split(test_size=0.1)

print(dataset)
splits = ['train', 'test']
texts = []


for split in splits:
    for text in dataset[split]["text"]:
        # EOS_TOKEN was already appended when the dataset was built, so do not add it again
        texts.append(text)


dataset = Dataset.from_dict({"text": texts})
max_length = 256


Downloading readme:   0%|          | 0.00/270 [00:00<?, ?B/s]

Downloading data:   0%|          | 0.00/148k [00:00<?, ?B/s]

Generating train split:   0%|          | 0/101 [00:00<?, ? examples/s]

DatasetDict({
    train: Dataset({
        features: ['text'],
        num_rows: 90
    })
    test: Dataset({
        features: ['text'],
        num_rows: 11
    })
})


In [None]:
def generate_and_tokenize_prompt(prompt):
    result = tokenizer(
        prompt['text'],
        truncation=True,
        max_length=max_length,
        padding="max_length",
    )
    result["labels"] = result["input_ids"].copy()
    return result

tokenized_dataset = dataset.map(generate_and_tokenize_prompt)
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=compute_dtype,
    bnb_4bit_use_double_quant=False,
)

model_with_new_dataset = AutoModelForCausalLM.from_pretrained(
    model_tag,
    quantization_config=quant_config,
    device_map="auto",
)
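# Configure the freshly loaded model, not the earlier `model` object
model_with_new_dataset.config.use_cache = False
model_with_new_dataset.config.pretraining_tp = 1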





Map:   0%|          | 0/101 [00:00<?, ? examples/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]



In [None]:
from peft import LoraConfig, get_peft_model

config = LoraConfig(
    r=32,
    lora_alpha=64,
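    # Llama-2 module names. Add "o_proj", "gate_proj", "up_proj", and
    # "down_proj" to adapt the remaining linear layers as well.
    target_modules=[
        "q_proj",
        "k_proj",
        "v_proj",
        "lm_head"
    ],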
    bias="none",
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model_with_new_dataset = get_peft_model(model_with_new_dataset, config)

In [None]:
from transformers import Trainer, TrainingArguments, DataCollatorForLanguageModeling



training_args = TrainingArguments(
    output_dir="./output",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=1,
    lr_scheduler_type='cosine',
    max_steps=50,
    learning_rate=2e-5,
    optim="paged_adamw_8bit",
    logging_steps=5,
)


trainer_new = Trainer(
    model=model_with_new_dataset,
    train_dataset=tokenized_dataset,
    args=training_args,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False)
)

trainer_new.train()



Step,Training Loss
5,2.1051
10,1.4684
15,2.005
20,1.5855
25,1.6046
30,2.1404
35,1.6524
40,1.9084
45,1.7909


TrainOutput(global_step=50, training_loss=1.835178213119507, metrics={'train_runtime': 34.1717, 'train_samples_per_second': 1.463, 'train_steps_per_second': 1.463, 'total_flos': 509465434521600.0, 'train_loss': 1.835178213119507, 'epoch': 0.5})

Push the model fine-tuned on the curated dataset to the HF hub



In [None]:
trainer_new.model.push_to_hub('panchub/fine_tuned_instruction')

README.md:   0%|          | 0.00/5.18k [00:00<?, ?B/s]



adapter_model.safetensors:   0%|          | 0.00/365M [00:00<?, ?B/s]

CommitInfo(commit_url='https://huggingface.co/panchub/fine_tuned_instruction/commit/b34357e953cb0db774917974baf563126a40f764', commit_message='Upload model', commit_description='', oid='b34357e953cb0db774917974baf563126a40f764', pr_url=None, pr_revision=None, pr_num=None)