In [1]:
from unsloth import FastLanguageModel
import torch

max_seq_length = 4096 # Any length works; Unsloth handles RoPE scaling internally.
dtype = None # None for auto-detection. Use float16 for Tesla T4/V100, bfloat16 for Ampere and newer.
load_in_4bit = True # Use 4-bit quantization to reduce memory usage. Can be False.

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen2.5-1.5B-Instruct",
    max_seq_length=max_seq_length,
    dtype=dtype,
    load_in_4bit=load_in_4bit,
)

🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
==((====))==  Unsloth 2024.10.7: Fast Qwen2 patching. Transformers = 4.45.2.
   \\   /|    GPU: NVIDIA GeForce RTX 3090. Max memory: 23.668 GB. Platform = Linux.
O^O/ \_/ \    Pytorch: 2.4.0+cu121. CUDA = 8.6. CUDA Toolkit = 12.1.
\        /    Bfloat16 = TRUE. FA [Xformers = 0.0.27.post2. FA2 = True]
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth


Unsloth: We fixed a gradient accumulation bug, but it seems like you don't have the latest transformers version!
Please update transformers, TRL and unsloth via:
`pip install --upgrade --no-cache-dir --no-deps unsloth transformers git+https://github.com/huggingface/trl.git`
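
The `dtype = None` setting in the cell above leaves the precision choice to the library. As a rough illustration of the rule stated in that comment (float16 for T4/V100, bfloat16 for Ampere and newer), the sketch below maps a CUDA compute capability to a dtype name. `pick_dtype_name` is a hypothetical helper and is not Unsloth's detection code; it only assumes that bfloat16 needs compute capability 8.0 or higher.

```python
def pick_dtype_name(compute_capability):
    """Hypothetical helper: choose a dtype name from a (major, minor) CUDA capability.

    Assumption: bfloat16 is used on compute capability 8.0+ (Ampere and newer),
    float16 otherwise. This mirrors the comment in the cell, not library internals.
    """
    major, _minor = compute_capability
    return "bfloat16" if major >= 8 else "float16"


print(pick_dtype_name((8, 6)))  # RTX 3090, reported as "CUDA = 8.6" in the log
print(pick_dtype_name((7, 5)))  # Tesla T4
print(pick_dtype_name((7, 0)))  # V100
```

For the RTX 3090 in this run the rule selects bfloat16, which agrees with the `Bfloat16 = TRUE` line in the banner.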


In [2]:
model = FastLanguageModel.get_peft_model(
    model,
    r=64, # Any rank > 0 works. Suggested values: 8, 16, 32, 64, 128
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=64,
    lora_dropout=0, # Currently only supports dropout=0
    bias="none",    # Currently only supports bias="none"
    # [NEW] "unsloth" uses 30% less VRAM, fits 2x larger batch sizes!
    use_gradient_checkpointing="unsloth", # True or "unsloth" for very long context
    random_state=3407,
    use_rslora=False,  # We support rank stabilized LoRA
    loftq_config=None, # And LoftQ
)

Unsloth 2024.10.7 patched 28 layers with 0 QKV layers, 28 O layers and 28 MLP layers.
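
The rank and target modules chosen above determine how many LoRA parameters are trained: each adapted linear layer of shape (in, out) adds r * (in + out) weights. The sketch below works through that arithmetic. The layer dimensions are assumptions taken from the public Qwen2.5-1.5B configuration rather than from this notebook, and `lora_param_count` is a hypothetical helper.

```python
# Assumed Qwen2.5-1.5B dimensions (from the public model config, not this notebook).
HIDDEN = 1536
INTERMEDIATE = 8960
NUM_LAYERS = 28                    # matches "patched 28 layers" in the log
KV_DIM = 2 * 128                   # 2 key/value heads x head dimension 128

# (in_features, out_features) for every module listed in target_modules.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_param_count(rank, shapes=TARGET_SHAPES, num_layers=NUM_LAYERS):
    """Count LoRA weights: matrix A is (rank x in), matrix B is (out x rank)."""
    per_layer = sum(rank * (n_in + n_out) for n_in, n_out in shapes.values())
    return per_layer * num_layers


total = lora_param_count(rank=64)
print(f"{total:,}")  # 73,859,072
```

Under these assumptions the result equals the "Number of trainable parameters" figure that the trainer prints when training starts.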


In [12]:
prompt_template = """<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
{}<|im_end|>
<|im_start|>assistant
{}<|im_end|>"""

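EOS_TOKEN = tokenizer.eos_token # Expected to be "<|im_end|>" for this Instruct tokenizer

def format_prompt(examples):
    prompts   = examples["prompt"]
    responses = examples["response"]
    texts = []
    for prompt, response in zip(prompts, responses):
        text = prompt_template.format(prompt, response)
        # Each example must end with EOS_TOKEN, otherwise generation will go on forever!
        # The template already ends with <|im_end|>; append EOS_TOKEN only if it is missing.
        if not text.endswith(EOS_TOKEN):
            text += EOS_TOKEN
        texts.append(text)
    return {"text": texts}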

from datasets import load_dataset
dataset = load_dataset("OpenEndedLM/OpenEnded-v0.2")["train"]
dataset = dataset.map(format_prompt, batched=True)
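
With `batched=True`, the mapped function receives a dict of columns (lists of values) rather than a single row, and must return a dict of equally long lists. The sketch below shows that contract on a toy batch without loading the dataset. `format_batch` and `toy_batch` are hypothetical names, and the two examples are made up for illustration.

```python
# Same ChatML-style template as in the cell above, written as one string.
PROMPT_TEMPLATE = (
    "<|im_start|>system\n"
    "You are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\n"
    "{}<|im_end|>\n"
    "<|im_start|>assistant\n"
    "{}<|im_end|>"
)


def format_batch(batch):
    """Turn parallel 'prompt' and 'response' columns into one 'text' column."""
    texts = [
        PROMPT_TEMPLATE.format(prompt, response)
        for prompt, response in zip(batch["prompt"], batch["response"])
    ]
    return {"text": texts}


# Illustrative data only; not taken from the real dataset.
toy_batch = {
    "prompt": ["What is 2 + 2?", "Name a primary color."],
    "response": ["4", "Red"],
}
formatted = format_batch(toy_batch)
print(formatted["text"][0])
```

Each formatted example ends with `<|im_end|>`, which is what tells the fine-tuned model where an assistant turn stops.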


In [13]:
print(dataset["text"][-1])

<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Can you explain the difference between an abstract class and an interface in Java? What situations would you use one over the other?<|im_end|>
<|im_start|>assistant
In Java, both abstract classes and interfaces serve to define a blueprint for objects or common behaviors but do so in different ways:

### Abstract Class

1. **Definition**: An abstract class is a class that cannot be instantiated on its own; it must be inherited by another (concrete) class.
2. **Abstract Methods & Concrete Methods**:
   - You can have both abstract methods (methods without implementation) and concrete methods (methods with implementations) in an abstract class.
3. **Field Access Modifiers**: Fields in abstract classes can have any access modifier, allowing them to be `public`, `private`, or protected`.
4. **Use Case**: Use an abstract class when you want to define common fields and behaviors that are shared among multiple subclasse

In [6]:
from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    dataset_num_proc = 2,
    packing = False, # Setting packing = True can make training 5x faster for short sequences.
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,

        num_train_epochs = 2,
        #max_steps = 60,

        learning_rate = 2e-4,
        fp16 = not is_bfloat16_supported(),
        bf16 = is_bfloat16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = "outputs",
        report_to = "none", # Set to "wandb" etc. to enable experiment logging
    ),
)
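
The batch size, gradient accumulation, and epoch settings above fix the number of optimizer steps. The sketch below reproduces the arithmetic, using the 2,000 examples that the trainer reports for this dataset. `training_steps` is a hypothetical helper, and it assumes the dataset divides evenly into effective batches, so the handling of partial batches is not modeled.

```python
def training_steps(num_examples, per_device_batch_size, grad_accum_steps,
                   num_epochs, num_gpus=1):
    """Return (effective batch size, total optimizer steps).

    Assumption: num_examples is a multiple of the effective batch size.
    """
    effective_batch = per_device_batch_size * grad_accum_steps * num_gpus
    steps_per_epoch = num_examples // effective_batch
    return effective_batch, steps_per_epoch * num_epochs


effective_batch, total_steps = training_steps(
    num_examples=2000,
    per_device_batch_size=2,
    grad_accum_steps=4,
    num_epochs=2,
)
print(effective_batch, total_steps)  # 8 500
```

These values agree with the "Total batch size = 8 | Total steps = 500" line in the training banner.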


In [7]:
trainer_stats = trainer.train()

==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 2,000 | Num Epochs = 2
O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 4
\        /    Total batch size = 8 | Total steps = 500
 "-____-"     Number of trainable parameters = 73,859,072


**** Unsloth: Please use our fixed gradient_accumulation_steps by updating transformers, TRL and Unsloth!
`pip install --upgrade --no-cache-dir --no-deps unsloth transformers git+https://github.com/huggingface/trl.git`



{'loss': 1.3925, 'grad_norm': 0.7581949234008789, 'learning_rate': 4e-05, 'epoch': 0.0}
{'loss': 1.3313, 'grad_norm': 0.6711294054985046, 'learning_rate': 8e-05, 'epoch': 0.01}
{'loss': 1.3238, 'grad_norm': 0.625776469707489, 'learning_rate': 0.00012, 'epoch': 0.01}
{'loss': 1.1564, 'grad_norm': 0.5471979975700378, 'learning_rate': 0.00016, 'epoch': 0.02}
{'loss': 1.3209, 'grad_norm': 0.509019136428833, 'learning_rate': 0.0002, 'epoch': 0.02}
{'loss': 1.1166, 'grad_norm': 0.5876598954200745, 'learning_rate': 0.0001995959595959596, 'epoch': 0.02}
{'loss': 1.1308, 'grad_norm': 0.42803826928138733, 'learning_rate': 0.0001991919191919192, 'epoch': 0.03}
{'loss': 0.9379, 'grad_norm': 0.35973235964775085, 'learning_rate': 0.00019878787878787878, 'epoch': 0.03}
{'loss': 1.09, 'grad_norm': 0.3838503360748291, 'learning_rate': 0.00019838383838383837, 'epoch': 0.04}
{'loss': 0.9814, 'grad_norm': 0.3114466369152069, 'learning_rate': 0.000197979797979798, 'epoch': 0.04}
{'loss': 0.7578, 'grad_norm
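
The logged `learning_rate` values above are consistent with `warmup_steps = 5` and `lr_scheduler_type = "linear"`: a linear ramp up to 2e-4 over the first 5 steps, then a linear decay that reaches zero at step 500. The sketch below reimplements that rule in plain Python. `linear_lr` is a hypothetical helper written for illustration, not the scheduler object the trainer uses.

```python
def linear_lr(step, base_lr=2e-4, warmup_steps=5, total_steps=500):
    """Learning rate logged after `step` optimizer steps.

    Linear warmup from 0 to base_lr, then linear decay to 0 at total_steps.
    """
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


# Print the first ten values for comparison with the log lines above.
for step in range(1, 11):
    print(step, linear_lr(step))
```

Step 1 gives 4e-05 and step 6 gives 2e-4 * 494 / 495, matching the first and sixth log entries up to floating-point rounding.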

In [45]:
prompt = """<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Can you tell me about triangles...<|im_end|>
<|im_start|>assistant
"""

FastLanguageModel.for_inference(model) # Enable native 2x faster inference
inputs = tokenizer([prompt], return_tensors="pt").to("cuda")

outputs = model.generate(**inputs, max_new_tokens=512, use_cache=True)
print(tokenizer.batch_decode(outputs)[0])

<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Can you tell me about triangles...<|im_end|>
<|im_start|>assistant
Certainly! Triangles are fundamental geometric shapes and play crucial roles in various fields such as architecture, engineering, physics, and computer science.

### Basic Types of Triangles

1. **Equilateral Triangle**: All three sides have the same length.
2. **Isosceles Triangle**: Two sides are equal in length; the third side is different from these two.
3. **Scalene Triangle**: No sides are equal in length.

### Classification Based on Angles

- **Acute-Angled Triangle**: All interior angles are less than 90 degrees (right angle).
- **Right-Angled Triangle**: One angle measures exactly 90 degrees (a right angle). The side opposite this angle is called the hypotenuse, which is always the longest side.
- **Obtuse-Angled Triangle**: One interior angle is greater than 90 degrees but less than 180 degrees.

### Properties of Triangles

- **Sum of
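
The decoded output above repeats the whole prompt before the assistant's answer, because `generate` returns the prompt tokens followed by the new ones. The sketch below shows one way to build the same inference prompt and strip it from the decoded string. `build_inference_prompt` and `extract_reply` are hypothetical helpers that work on plain strings, so they run without a model; the decoded text in the example is a stand-in, not real model output.

```python
IM_START, IM_END = "<|im_start|>", "<|im_end|>"


def build_inference_prompt(user_message,
                           system_message="You are a helpful assistant."):
    """Build a prompt that ends with an open assistant turn."""
    return (
        f"{IM_START}system\n{system_message}{IM_END}\n"
        f"{IM_START}user\n{user_message}{IM_END}\n"
        f"{IM_START}assistant\n"
    )


def extract_reply(decoded, prompt):
    """Drop the echoed prompt and anything from the first end-of-turn marker on."""
    reply = decoded[len(prompt):] if decoded.startswith(prompt) else decoded
    return reply.split(IM_END, 1)[0].strip()


prompt = build_inference_prompt("Can you tell me about triangles...")
# Stand-in for tokenizer.batch_decode(outputs)[0]; not real model output.
decoded = prompt + "Certainly! Triangles are fundamental geometric shapes." + IM_END
print(extract_reply(decoded, prompt))
```

Leaving the assistant turn open at the end of the prompt is what cues the model to answer, and cutting at `<|im_end|>` removes the end-of-turn marker from the reply.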

In [46]:
model.save_pretrained_merged("output")

Unsloth: You're not saving a tokenizer as well?
You can do it separately via `tokenizer.save_pretrained(...)`


Unsloth: Merging 4bit and LoRA weights to 16bit...
Unsloth: Will use up to 14.05 out of 31.11 RAM for saving.


100%|██████████| 28/28 [00:00<00:00, 186.20it/s]


Unsloth: Saving model... This might take 5 minutes for Llama-7b...





Done.
