<a href="https://colab.research.google.com/github/spider2048/mlexperiments/blob/main/bigcode_qwen_coder_2_5_1_5b.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
import re
import torch; v = re.match(r"[0-9\.]{3,}", str(torch.__version__)).group(0)
xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
!uv pip install --system --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
!uv pip install --system sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
!uv pip install --system --no-deps unsloth
!uv pip install --system transformers==4.55.4
!uv pip install --system --no-deps trl==0.22.2

Using Python 3.12.11 environment at: /usr
Resolved 8 packages in 289ms
Audited 8 packages in 0.50ms
Using Python 3.12.11 environment at: /usr
Audited 5 packages in 535ms
Using Python 3.12.11 environment at: /usr
Resolved 1 package in 53ms
Audited 1 package in 0.19ms
Using Python 3.12.11 environment at: /usr
Audited 1 package in 199ms
Using Python 3.12.11 environment at: /usr
Audited 1 package in 149ms
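The install cell picks an xformers pin from the installed torch version. The sketch below isolates that selection logic so it can be checked without installing anything; `pick_xformers` is a hypothetical helper name, and the version strings passed to it are only illustrative inputs.

```python
import re

def pick_xformers(torch_version: str) -> str:
    """Return the xformers requirement the install cell would choose.

    `pick_xformers` is a hypothetical helper, not part of any library.
    """
    # Keep only the leading run of digits and dots, e.g. "2.8.0+cu126" -> "2.8.0"
    v = re.match(r"[0-9\.]{3,}", torch_version).group(0)
    return "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")

print(pick_xformers("2.8.0+cu126"))
print(pick_xformers("2.6.0+cu124"))
```

Only an exact `2.8.0` match selects the newer pin; every other torch version falls back to `0.0.29.post3`, so a future torch release would need the condition extended.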


In [None]:
import torch
from unsloth import FastLanguageModel
from trl import SFTTrainer, SFTConfig

🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
🦥 Unsloth Zoo will now patch everything to make training faster!


In [None]:
from datasets import load_dataset, Dataset
from transformers import AutoTokenizer, AutoModelForCausalLM

In [None]:
MODEL_NAME = "Qwen/Qwen2.5-Coder-1.5B-Instruct"
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = MODEL_NAME,
    max_seq_length = 2048,   # Context length - can be longer, but uses more memory
    load_in_4bit = True,     # 4bit uses much less memory
    load_in_8bit = False,    # A bit more accurate, uses 2x memory
    full_finetuning = False, # False = train LoRA adapters only, not all weights
)

==((====))==  Unsloth 2025.9.9: Fast Llama patching. Transformers: 4.55.4.
   \\   /|    Tesla T4. Num GPUs = 1. Max memory: 14.741 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.8.0+cu126. CUDA: 7.5. CUDA Toolkit: 12.6. Triton: 3.4.0
\        /    Bfloat16 = FALSE. FA [Xformers = 0.0.32.post2. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!


model.safetensors.index.json: 0.00B [00:00, ?B/s]

Fetching 2 files:   0%|          | 0/2 [00:00<?, ?it/s]

model-00002-of-00002.safetensors:   0%|          | 0.00/1.46G [00:00<?, ?B/s]

model-00001-of-00002.safetensors:   0%|          | 0.00/4.97G [00:00<?, ?B/s]

KeyboardInterrupt: 

In [None]:
model

In [None]:
model = FastLanguageModel.get_peft_model(
    model,
    r = 8,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 16,
    lora_dropout = 0,
    bias = "none",    # Supports any, but = "none" is optimized
    # [NEW] "unsloth" uses 30% less VRAM, fits 2x larger batch sizes!
    use_gradient_checkpointing = "unsloth", # True or "unsloth" for very long context
    random_state = 3407,
    use_rslora = False,   # We support rank stabilized LoRA
    loftq_config = None,  # And LoftQ
)

Unsloth 2025.9.9 patched 28 layers with 28 QKV layers, 28 O layers and 28 MLP layers.
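To get a feel for what `r = 8` means in terms of trainable weights, here is a rough back-of-the-envelope sketch. LoRA adds two small matrices per targeted linear layer, `A` of shape `(r, d_in)` and `B` of shape `(d_out, r)`, and scales their product by `lora_alpha / r`. The layer count of 28 comes from the output above; the hidden, intermediate and key/value dimensions are assumptions that this notebook never prints, so verify them against `model.config` before relying on the number.

```python
def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one d_in -> d_out linear layer."""
    # A has shape (r, d_in) and B has shape (d_out, r)
    return r * (d_in + d_out)

# ASSUMED dimensions (not shown in this notebook); check model.config.
HIDDEN, INTERMEDIATE, KV_DIM = 1536, 8960, 256
LAYERS = 28  # matches the "patched 28 layers" message

# (d_in, d_out) for each module listed in target_modules
PROJ_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

def total_lora_params(r: int) -> int:
    """Sum the adapter sizes over all targeted modules and layers."""
    per_layer = sum(lora_params(d_in, d_out, r) for d_in, d_out in PROJ_SHAPES.values())
    return per_layer * LAYERS

r, lora_alpha = 8, 16
print("trainable LoRA parameters:", total_lora_params(r))
print("update scaling (lora_alpha / r):", lora_alpha / r)
```

Doubling `r` doubles the adapter size; keeping `lora_alpha = 2 * r` keeps the scaling factor at 2.0.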


In [None]:
DATASET_NAME = "bigcode/self-oss-instruct-sc2-exec-filter-50k"
dataset = load_dataset(DATASET_NAME, split="train")

In [None]:
dataset[0].keys()

dict_keys(['fingerprint', 'sha1', 'seed', 'response', 'concepts', 'prompt', 'instruction', 'id'])

In [None]:
dataset[0]['instruction']

'Write a Python function named `get_value` that takes a matrix (represented by a list of lists) and a tuple of indices, and returns the value at that index in the matrix. The function should handle index out of range errors by returning None.'

In [None]:
def generate_conversation(examples):
    # Called with batched = True: each field is a list with one entry per example.
    problems = examples["prompt"]
    instructions = examples["instruction"]
    solutions = examples["response"]
    conversations = []
    for problem, instr, solution in zip(problems, instructions, solutions):
        conversations.append([
            {"role": "system", "content": problem},
            {"role": "user", "content": instr},
            {"role": "assistant", "content": solution},
        ])
    return {"conversations": conversations}

In [None]:
reasoning_conversations = tokenizer.apply_chat_template(
    dataset.map(generate_conversation, batched = True)["conversations"],
    tokenize = False,
)
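With `tokenize = False`, `apply_chat_template` returns plain strings rather than token ids. The sketch below approximates the ChatML layout that Qwen-style templates produce, to show roughly what ends up in the `text` column. `render_chatml` is a hypothetical stand-in, not the tokenizer's real Jinja template, which handles more cases (default system prompts, tools).

```python
def render_chatml(messages, add_generation_prompt=False):
    """Approximate ChatML rendering of a list of {"role", "content"} dicts.

    Hypothetical helper for illustration; the real template lives in the tokenizer.
    """
    text = ""
    for message in messages:
        # Each turn is wrapped in <|im_start|>role ... <|im_end|>
        text += f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here
        text += "<|im_start|>assistant\n"
    return text

conversation = [
    {"role": "system", "content": "Provide the best response to a given instruction."},
    {"role": "user", "content": "Write a function that adds two numbers."},
    {"role": "assistant", "content": "def add(a, b):\n    return a + b"},
]
print(render_chatml(conversation))
```

For training, the assistant turn is included and closed; for inference, the assistant turn is left out and `add_generation_prompt=True` opens it instead.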

In [None]:
dataset_text = Dataset.from_dict({"text": reasoning_conversations})

In [None]:
trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset_text,
    eval_dataset = None, # Can set up evaluation!
    args = SFTConfig(
        dataset_text_field = "text",
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4, # Use GA to mimic batch size!
        warmup_steps = 5,
        num_train_epochs = 1, # Ignored while max_steps is set to a positive value
        max_steps = 30,       # Short trial run; set to -1 to train for num_train_epochs instead
        learning_rate = 2e-4, # Reduce to 2e-5 for long training runs
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        report_to = "none", # Set to "wandb", "tensorboard", etc. to enable logging
    ),
)

Unsloth: Tokenizing ["text"] (num_proc=6):   0%|          | 0/50661 [00:00<?, ? examples/s]

TimeoutError: 
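The trainer settings above are easier to judge with the arithmetic written out. This sketch computes the effective batch size and how much of the 50661-row dataset 30 steps would cover, assuming the single GPU reported earlier, and evaluates a linear warmup-then-decay schedule of the kind `lr_scheduler_type = "linear"` selects. `linear_lr` is a hypothetical re-implementation for illustration, not the scheduler object the trainer builds.

```python
def linear_lr(step, base_lr=2e-4, warmup_steps=5, total_steps=30):
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup steps
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

per_device_batch, grad_accum, max_steps, dataset_rows = 2, 4, 30, 50661

# One optimizer step consumes batch * accumulation examples (single GPU assumed)
effective_batch = per_device_batch * grad_accum
examples_seen = effective_batch * max_steps

print("effective batch size:", effective_batch)
print("examples seen:", examples_seen)
print("fraction of dataset:", examples_seen / dataset_rows)
for step in (0, 1, 5, 15, 30):
    print(step, linear_lr(step))
```

With these settings the run touches well under 1% of the dataset, which is consistent with treating `max_steps = 30` as a quick trial rather than a full fine-tune.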

In [None]:
trainer_stats = trainer.train()

In [None]:
def generate_question(examples):
    # Same as generate_conversation, but without the assistant turn (inference prompts).
    problems = examples["prompt"]
    instructions = examples["instruction"]
    conversations = []
    for problem, instr in zip(problems, instructions):
        conversations.append([
            {"role": "system", "content": problem},
            {"role": "user", "content": instr},
        ])
    return {"conversations": conversations}

In [None]:
qreasoning_conversations = tokenizer.apply_chat_template(
    dataset.map(generate_question, batched = True)["conversations"],
    tokenize = False,
    add_generation_prompt=True
)

qdataset_text = Dataset.from_dict({"text": qreasoning_conversations})

In [None]:
# `prompt` is defined in a later cell (the ChatML string near the end); run that cell first.
item = prompt
input_ids = tokenizer.encode(item, return_tensors="pt",).to(model.device)
out_ids = model.generate(input_ids, max_new_tokens=2048, do_sample=True, top_p=0.9, temperature=0.7)
new_ids = out_ids[:, input_ids.size(-1):]
decoded = tokenizer.batch_decode(new_ids, skip_special_tokens=True)[0]
print(decoded)
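The `generate` call samples with `temperature=0.7` and `top_p=0.9`. As a rough illustration of what those two knobs do to a next-token distribution, here is a pure-Python sketch: temperature rescales the logits before the softmax, and nucleus (top-p) filtering keeps the smallest set of most likely tokens whose total probability reaches `top_p`. The helper names and the toy logits are made up for illustration, and the library's own implementation differs in detail.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; temperature < 1 sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their mass reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

toy_logits = [2.0, 1.0, 0.5, -1.0]  # illustrative values only
probs = softmax(toy_logits, temperature=0.7)
print(top_p_filter(probs, top_p=0.9))
```

Lower temperatures and lower `top_p` values both make the output more deterministic; setting `do_sample=False` would bypass sampling entirely.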

In [None]:
print(decoded)

Sure! Here's a Python function that multiplies two matrices:

```python
def multiply_matrices(matrix1, matrix2):
    # Check if the number of columns in matrix1 matches the number of rows in matrix2
    if len(matrix1[0]) != len(matrix2):
        raise ValueError("Number of columns in matrix1 must match number of rows in matrix2")

    # Initialize the result matrix with zeros
    result_matrix = [[0 for _ in range(len(matrix2[0]))] for _ in range(len(matrix1))]

    # Multiply each row of matrix1 with each column of matrix2
    for i in range(len(matrix1)):
        for j in range(len(matrix2[0])):
            for k in range(len(matrix2)):
                result_matrix[i][j] += matrix1[i][k] * matrix2[k][j]

    return result_matrix
```

You can use this function by passing in two matrices as arguments like so:

```python
matrix1 = [[1, 2], [3, 4]]
matrix2 = [[5, 6], [7, 8]]

result_matrix = multiply_matrices(matrix1, matrix2)
print(result_matrix)
```

This will output:

```
[[19, 22],

In [None]:
prompt = """
<|im_start|>system
Provide the best response to a given instruction. Follow the following steps to craft your response:
1. reason about the given instruction
2. provide a high-quality solution
3. offer a concise explanation
4. write tests to verify the correctness your solution

## Example 1
### Instruction
Design a Python function that takes a sorted array and a target value, and return a valid index where target can be inserted to maintain the array's sorted order. Optimize the function to run in logarithmic time complexity.

For example, given `array = [1, 3, 5, 5, 6]` and `target = 5`, the function should return either 2 or 3 because 5 presents at both indices 2 and 3.

### Response
[Reasoning]
To solve this problem efficiently and ensure logarithmic time complexity, we can use a binary search algorithm. Compared with a standard binary search that looks for an exact match, we can modify the algorithm such that when the target is not found, we return the `left` bound, which represents the index where the target can be inserted to maintain the array's sorted order. Since any valid index for insertion is acceptable, we can direclty return the index if there is an exact match.

[Implementation]
Here is a Python function that implements this approach:

```python
from typing import List

def search_insert_position(nums: List[int], target: int) -> int:
    left, right = 0, len(nums) - 1

    while left <= right:
        mid = (left + right) // 2

        # Directly return the index if there's an exact match
        if nums[mid] == target:
            return mid
        elif nums[mid] < target:
            left = mid + 1
        else:
            right = mid - 1

    # At this point, `left` is the correct insertion index
    return left
```

[Explanation]
This implementation ensures that when `target` is not found, `left` represents the correct insertion index. This is because at the end of the loop, all elements to the left of `left` are smaller than `target`, all elements to the right of `left` are equal to or larger than `target`, and `left > right`. Therefore, `left` is the correct index for insertion to maintain the sorted order.

[Tests]
To test this function, you can use the example you provided:

```python
# Provided example
assert search_insert_position([1, 3, 5, 5, 6], 5) in [2, 3]
# Additional cases
assert search_insert_position([1, 3, 5, 5, 6], 2) == 1
assert search_insert_position([1, 3, 5, 5, 6], 7) == 5
```

These tests cover the scenario where the target is found at multiple indices, as well as cases where the target is not present in the array but needs to be inserted at the correct position to maintain the sorted order.

## Example 2
### Instruction
Write a Python function that finds the first element in a list of numbers that is divisible by 5. If no such element exists, the function should return -1. The function should handle any errors that may occur during its execution and return 1000.

### Response<|im_end|>
<|im_start|>user
Can you write a python function to use einops to multiply two matrices?<|im_end|>
<|im_start|>assistant
"""

In [None]:
print(item)
print(decoded)