## Fine-tune Code Llama on Bootstrapped Data

Based on Younes Belkada's [GitHub Gist](https://gist.github.com/younesbelkada/9f7f75c94bdc1981c8ca5cc937d4a4da) and Maxime Labonne's blog post [Fine-Tune Your Own Llama 2 Model in a Colab Notebook](https://mlabonne.github.io/blog/posts/Fine_Tune_Your_Own_Llama_2_Model_in_a_Colab_Notebook.html).

This notebook runs on an A100 GPU. (Last update: 24 Aug 2023)


In [None]:
# Install dependencies

# transformers is installed from pull request 25740 on GitHub
!pip install git+https://github.com/huggingface/transformers.git@refs/pull/25740/head

# accelerate and peft (latest releases), plus bitsandbytes for quantization
!pip install -q accelerate peft bitsandbytes

# trl provides the SFTTrainer used for supervised fine-tuning
!pip install -q trl

# sentencepiece is required by the slow Llama tokenizer
!pip install sentencepiece

# einops provides readable tensor operations; used by falcon-7b, llama-2, etc.
!pip install -q -U einops

# datasets is used to build the training dataset
!pip install -q datasets


Collecting git+https://github.com/huggingface/transformers.git@refs/pull/25740/head
  Cloning https://github.com/huggingface/transformers.git (to revision refs/pull/25740/head) to /tmp/pip-req-build-kq54rv95
  Running command git clone --filter=blob:none --quiet https://github.com/huggingface/transformers.git /tmp/pip-req-build-kq54rv95
  Running command git fetch -q https://github.com/huggingface/transformers.git refs/pull/25740/head
  Running command git checkout -q c6c6daa3f07e753cff91a08c4294df4a6ea6227b
  Resolved https://github.com/huggingface/transformers.git to commit c6c6daa3f07e753cff91a08c4294df4a6ea6227b
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done


In [None]:
import random
import torch
import numpy as np

def set_seed(seed_value=42):
    """Set seeds for Python's random module, NumPy, and PyTorch for reproducibility.

    Args:
        seed_value (int): The seed value to set for random number generators.
    """
    random.seed(seed_value)
    np.random.seed(seed_value)
    torch.manual_seed(seed_value)
    torch.cuda.manual_seed_all(seed_value)

    # Additional steps for deterministic behavior
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

# Set the seed
set_seed(42)  # You can replace 42 with any other seed value of your choice
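
To illustrate what seeding buys you, here is a minimal sketch that uses only Python's built-in `random` module. The helper `draw` is hypothetical and not part of the notebook. Re-seeding with the same value replays the same sequence; the NumPy and PyTorch generators seeded above follow the same principle, although GPU kernels may still introduce some nondeterminism.

```python
import random


def draw(seed_value, n=5):
    """Return n pseudo-random integers drawn after seeding with seed_value."""
    # Use an independent generator so the global random state is left untouched
    rng = random.Random(seed_value)
    return [rng.randint(0, 999) for _ in range(n)]


first = draw(42)
second = draw(42)
other = draw(43)

print(first == second)  # True: same seed, same sequence
print(first == other)   # a different seed will almost certainly give a different sequence
```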


In [None]:
from google.colab import drive
drive.mount('/content/drive')

path = "/content/drive/MyDrive/MultilingualLLMBias"

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
import os
import torch
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    HfArgumentParser,
    TrainingArguments,
    pipeline,
    logging,
)
from peft import LoraConfig, PeftModel
from trl import SFTTrainer

In [None]:
# The model that you want to train from the Hugging Face hub
model_name = "codellama/CodeLlama-7b-Instruct-hf"
#model_name = "codellama/CodeLlama-7b-Python-hf" #"codellama/CodeLlama-7b-hf"
#model_name = "codellama/CodeLlama-13b-Python-hf"
#model_name = "codellama/CodeLlama-13b-Instruct-hf"

# The instruction dataset to use
#dataset_name = path+"/GPT3.5-finetune-data/stacked_combined_for_codellama.txt"
dataset_name = path+"/training_data_bootstrap_llama_13b.txt"

# Fine-tuned model name
#new_model = "CodeLlama-7b-Instruct-multi"
#new_model = "CodeLlama-13b-Python-multi"
#new_model = "CodeLlama-13b-Instruct-bootstrapped-multi"
new_model = "CodeLlama-7b-Instruct-bootstrapped-multi"

################################################################################
# QLoRA parameters
################################################################################

# LoRA attention dimension
lora_r = 64

# Alpha parameter for LoRA scaling
lora_alpha = 16

# Dropout probability for LoRA layers
lora_dropout = 0.1

################################################################################
# bitsandbytes parameters
################################################################################

# Activate 4-bit precision base model loading
use_4bit = True

# Compute dtype for 4-bit base models
bnb_4bit_compute_dtype = "float16"

# Quantization type (fp4 or nf4)
bnb_4bit_quant_type = "nf4"

# Activate nested quantization for 4-bit base models (double quantization)
use_nested_quant = False

################################################################################
# TrainingArguments parameters
################################################################################

# Output directory where the model predictions and checkpoints will be stored
output_dir = "./results"

# Number of training epochs
num_train_epochs = 2

# Enable fp16/bf16 training (set bf16 to True with an A100)
fp16 = False
bf16 = True

# Batch size per GPU for training
per_device_train_batch_size = 4

# Batch size per GPU for evaluation
per_device_eval_batch_size = 4

# Number of update steps to accumulate the gradients for
gradient_accumulation_steps = 1

# Enable gradient checkpointing
gradient_checkpointing = True

# Maximum gradient norm (gradient clipping)
max_grad_norm = 0.3

# Initial learning rate (AdamW optimizer)
learning_rate = 2e-4

# Weight decay to apply to all layers except bias/LayerNorm weights
weight_decay = 0.001

# Optimizer to use
optim = "paged_adamw_32bit"

# Learning rate schedule
lr_scheduler_type = "cosine"

# Number of training steps (overrides num_train_epochs)
max_steps = -1

# Ratio of steps for a linear warmup (from 0 to learning rate)
warmup_ratio = 0.03

# Group sequences into batches with same length
# Saves memory and speeds up training considerably
group_by_length = True

# Save checkpoint every X update steps
save_steps = 0

# Log every X update steps
logging_steps = 25

################################################################################
# SFT parameters
################################################################################

# Maximum sequence length to use
max_seq_length = None

# Pack multiple short examples in the same input sequence to increase efficiency
packing = False

# Load the entire model on GPU 0
device_map = {"": 0}
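
As a rough sanity check on the QLoRA settings above, the following sketch computes the LoRA scaling factor (`lora_alpha / lora_r`) and an estimate of the number of trainable adapter weights. The model dimensions (hidden size 4096, 32 decoder layers) and the choice of adapted matrices (`q_proj` and `v_proj` only) are assumptions for a 7B Llama-style model; they are not stated in the notebook, and the helper names are hypothetical.

```python
def lora_scaling(alpha, r):
    """Scaling factor applied to the low-rank update: alpha / r."""
    return alpha / r


def lora_params_per_matrix(d_out, d_in, r):
    """A is (r x d_in) and B is (d_out x r), so the adapter adds r * (d_in + d_out) weights."""
    return r * (d_in + d_out)


# Values from the configuration cell above
lora_r = 64
lora_alpha = 16

# Assumptions (not stated in the notebook): hidden size 4096, 32 decoder layers,
# adapters attached to the q_proj and v_proj projections only.
hidden_size = 4096
num_layers = 32
adapted_per_layer = 2

per_matrix = lora_params_per_matrix(hidden_size, hidden_size, lora_r)
total = per_matrix * adapted_per_layer * num_layers

print(lora_scaling(lora_alpha, lora_r))  # 0.25
print(total)                             # 33554432
```

Under these assumptions the adapters add roughly 33.5 million trainable weights, a small fraction of the frozen base model.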

In [None]:
!nvidia-smi

Fri Oct 20 20:52:24 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.105.17   Driver Version: 525.105.17   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  NVIDIA A100-SXM...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   40C    P0    46W / 400W |      3MiB / 40960MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

In [None]:
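import subprocess

from numba import cuda

# Release cached PyTorch memory, then reset the current CUDA device via numba
torch.cuda.empty_cache()
device = cuda.get_current_device()
device.reset()

def reset_gpu(gpu_id):
    """Ask nvidia-smi to reset the given GPU and report whether it succeeded."""
    try:
        # check=True raises CalledProcessError when nvidia-smi exits with an error
        subprocess.run(['nvidia-smi', '-r', '-i', str(gpu_id)], check=True)
        print(f"GPU {gpu_id} has been reset.")
    except Exception as e:
        print(f"Failed to reset GPU {gpu_id} with error: {e}")

# Replace gpu_id with the actual ID of the GPU you want to reset
gpu_id = 0

reset_gpu(gpu_id)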

GPU 0 has been reset.


In [None]:
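import ast

# Load dataset: each line is expected to be a Python string literal holding one example.
# ast.literal_eval parses literals only, so it cannot run arbitrary code the way eval() can.
with open(dataset_name, "r", encoding="utf-8") as f:
  lines = [ast.literal_eval(line.strip()) for line in f]
  print(f"Total training data = {len(lines)}")

# Wrap the examples in a Hugging Face Dataset with a single "text" column
dataset = Dataset.from_dict({"text": lines})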

# Load tokenizer and model with QLoRA configuration
compute_dtype = getattr(torch, bnb_4bit_compute_dtype)

bnb_config = BitsAndBytesConfig(
    load_in_4bit=use_4bit,
    bnb_4bit_quant_type=bnb_4bit_quant_type,
    bnb_4bit_compute_dtype=compute_dtype,
    bnb_4bit_use_double_quant=use_nested_quant,
)

# Check GPU compatibility with bfloat16
if compute_dtype == torch.float16 and use_4bit:
    major, _ = torch.cuda.get_device_capability()
    if major >= 8:
        print("=" * 80)
        print("Your GPU supports bfloat16: accelerate training with bf16=True")
        print("=" * 80)

# Load base model
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map=device_map
)
model.config.use_cache = False
model.config.pretraining_tp = 1

# Load LLaMA tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right" # Fix weird overflow issue with fp16 training

# Load LoRA configuration
peft_config = LoraConfig(
    lora_alpha=lora_alpha,
    lora_dropout=lora_dropout,
    r=lora_r,
    bias="none",
    task_type="CAUSAL_LM",
)

# Set training parameters
training_arguments = TrainingArguments(
    output_dir=output_dir,
    num_train_epochs=num_train_epochs,
    per_device_train_batch_size=per_device_train_batch_size,
    gradient_accumulation_steps=gradient_accumulation_steps,
    optim=optim,
    save_steps=save_steps,
    logging_steps=logging_steps,
    learning_rate=learning_rate,
    weight_decay=weight_decay,
    fp16=fp16,
    bf16=bf16,
    max_grad_norm=max_grad_norm,
    max_steps=max_steps,
    warmup_ratio=warmup_ratio,
    group_by_length=group_by_length,
    lr_scheduler_type=lr_scheduler_type,
    report_to="tensorboard"
)

# Set supervised fine-tuning parameters
# Reproducibility: max_seq_length = 512 for 7b models
# max_seq_length = 256 for 13b models
trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",
    max_seq_length=256,
    tokenizer=tokenizer,
    args=training_arguments,
    packing=packing,
)

# Train model
trainer.train()

# Save trained model
trainer.model.save_pretrained(new_model)

Total training data = 127
Your GPU supports bfloat16: accelerate training with bf16=True
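
The training cell reads the dataset file line by line and evaluates each line as a Python expression. The file's contents are not shown in the notebook, so the sketch below assumes each line is a quoted Python string literal; the two sample lines and the file name are made up for illustration. It parses them with `ast.literal_eval`, which accepts only literals.

```python
import ast
import os
import tempfile

# Hypothetical sample lines: each line is a quoted Python string literal,
# so the two characters "\n" inside the quotes become a real newline after parsing.
sample_lines = [
    r"'<s>[INST] Write a function to add two numbers. [/INST] def add(a, b):\n    return a + b </s>'",
    r"'<s>[INST] Reverse a string. [/INST] def rev(s):\n    return s[::-1] </s>'",
]

with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample_training_data.txt")
    with open(sample_path, "w", encoding="utf-8") as f:
        f.write("\n".join(sample_lines) + "\n")

    with open(sample_path, "r", encoding="utf-8") as f:
        # Skip blank lines; literal_eval rejects anything that is not a plain literal
        texts = [ast.literal_eval(line.strip()) for line in f if line.strip()]

print(f"Total training data = {len(texts)}")
print(texts[0])
```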



You're using a CodeLlamaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.


| Step | Training Loss |
|------|---------------|
| 25   | 1.4566        |
| 50   | 0.5478        |
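
Only two loss values are logged because the run is short. The sketch below estimates the number of optimizer steps from the settings used in this notebook (127 examples, batch size 4, no gradient accumulation, 2 epochs, `warmup_ratio = 0.03`, `logging_steps = 25`). It is an approximation of how a single-GPU run counts steps, not a copy of the trainer's internal logic, and `training_schedule` is a hypothetical helper.

```python
import math


def training_schedule(num_examples, batch_size, grad_accum, epochs, warmup_ratio, logging_steps):
    """Estimate optimizer steps, warmup steps, and logged steps for a single-GPU run."""
    batches_per_epoch = math.ceil(num_examples / batch_size)
    steps_per_epoch = max(batches_per_epoch // grad_accum, 1)
    total_steps = math.ceil(epochs * steps_per_epoch)
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    logged = list(range(logging_steps, total_steps + 1, logging_steps))
    return total_steps, warmup_steps, logged


total_steps, warmup_steps, logged = training_schedule(
    num_examples=127, batch_size=4, grad_accum=1, epochs=2,
    warmup_ratio=0.03, logging_steps=25,
)
print(total_steps)   # 64
print(warmup_steps)  # 2
print(logged)        # [25, 50]
```

With 64 steps in total and logging every 25 steps, the loss is reported at steps 25 and 50 only, which matches the table above.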


In [None]:
# %load_ext tensorboard
# %tensorboard --logdir results/runs

In [None]:
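# Empty VRAM
import gc

# Drop references only if they exist, so the cell can be re-run without a NameError.
# `pipeline` is left alone: at this point it is the function imported from transformers.
for name in ("model", "trainer"):
    globals().pop(name, None)
gc.collect()
torch.cuda.empty_cache()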

In [None]:
# Reload model in FP16 and merge it with LoRA weights
base_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    low_cpu_mem_usage=True,
    return_dict=True,
    torch_dtype=torch.float16,
    device_map=device_map,
)
model = PeftModel.from_pretrained(base_model, new_model)
model = model.merge_and_unload()

# Reload tokenizer to save it
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

#model.save_pretrained(f"{path}/{new_model}")
#tokenizer.save_pretrained(f"{path}/{new_model}")
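
Conceptually, merging folds the low-rank update into the base weight: W' = W + (alpha / r) · B · A. The toy sketch below checks that the merged matrix gives the same output as the base weight plus the adapter path. It uses plain Python lists and made-up values; it illustrates the arithmetic only and does not reproduce PEFT's actual `merge_and_unload` implementation.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, alpha, r):
    """Return W + (alpha / r) * (B @ A), the merged weight matrix."""
    scale = alpha / r
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


# Toy 2x2 base weight with a rank-1 adapter (all values are made up for illustration)
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0]]       # shape (r, d_in) = (1, 2)
B = [[0.5], [-0.5]]    # shape (d_out, r) = (2, 1)
alpha, r = 2, 1

merged = merge_lora(W, A, B, alpha, r)
x = [[3.0], [4.0]]     # column-vector input

# Adapter path: W x + (alpha / r) * B (A x); merged path: W' x
base_out = matmul(W, x)
lora_out = matmul(B, matmul(A, x))
adapter_out = [[b[0] + (alpha / r) * l[0]] for b, l in zip(base_out, lora_out)]
merged_out = matmul(merged, x)

print(merged)                     # [[2.0, 2.0], [-1.0, -1.0]]
print(adapter_out == merged_out)  # True
```

After merging, inference needs only the single matrix W', so the adapter adds no extra computation at generation time.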



In [None]:
#!rm -rf ~/.cache/huggingface/hub/models--codellama--CodeLlama-13b*
#!rm -rf CodeLlama*

In [None]:
import locale
locale.getpreferredencoding = lambda: "UTF-8"
#!pip install huggingface_hub["cli"]
#!huggingface-cli delete-cache
model.save_pretrained(f"{path}/{new_model}",safe_serialization=True)
tokenizer.save_pretrained(f"{path}/{new_model}")


('/content/drive/MyDrive/MultilingualLLMBias/CodeLlama-7b-Instruct-bootstrapped-multi/tokenizer_config.json',
 '/content/drive/MyDrive/MultilingualLLMBias/CodeLlama-7b-Instruct-bootstrapped-multi/special_tokens_map.json',
 '/content/drive/MyDrive/MultilingualLLMBias/CodeLlama-7b-Instruct-bootstrapped-multi/tokenizer.model',
 '/content/drive/MyDrive/MultilingualLLMBias/CodeLlama-7b-Instruct-bootstrapped-multi/added_tokens.json',
 '/content/drive/MyDrive/MultilingualLLMBias/CodeLlama-7b-Instruct-bootstrapped-multi/tokenizer.json')

## Run Inference on Multilingual Prompts

In [None]:
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    HfArgumentParser,
    TrainingArguments,
    pipeline,
    logging,
)
import transformers
import torch

path = "/content/drive/MyDrive/MultilingualLLMBias/results/"

finetune_flag = True
# Enable these models to reproduce results

#model = "codellama/CodeLlama-7b-Instruct-hf"
#model = "codellama/CodeLlama-7b-Python-hf" #"codellama/CodeLlama-7b-hf"
#model = "codellama/CodeLlama-13b-Python-hf"
#model = "codellama/CodeLlama-13b-Instruct-hf"

# Fine tuned
#model = "/content/drive/MyDrive/MultilingualLLMBias/CodeLlama-7b-Instruct-multi"
#model = "/content/drive/MyDrive/MultilingualLLMBias/CodeLlama-13b-Instruct-multi"
#model = "/content/drive/MyDrive/MultilingualLLMBias/CodeLlama-13b-Instruct-bootstrapped-multi"
model = "/content/drive/MyDrive/MultilingualLLMBias/CodeLlama-7b-Instruct-bootstrapped-multi"

dirname = model.split("/")[-1]
result_dir = path+f"results-{dirname}"
!mkdir -p $result_dir


tokenizer = AutoTokenizer.from_pretrained(model)
if not finetune_flag:
  # Base models: let the pipeline load the checkpoint directly in fp16
  pipeline = transformers.pipeline(
      "text-generation",
      model=model,
      torch_dtype=torch.float16,
      device_map="auto",
      tokenizer=tokenizer,
  )
else:
  # Activate 4-bit precision base model loading
  use_4bit = False

  # Compute dtype for 4-bit base models
  bnb_4bit_compute_dtype = "float16"

  # Quantization type (fp4 or nf4)
  bnb_4bit_quant_type = "nf4"

  # Activate nested quantization for 4-bit base models (double quantization)
  use_nested_quant = False

  compute_dtype = getattr(torch, bnb_4bit_compute_dtype)

  bnb_config = BitsAndBytesConfig(
      load_in_4bit=use_4bit,
      bnb_4bit_quant_type=bnb_4bit_quant_type,
      bnb_4bit_compute_dtype=compute_dtype,
      bnb_4bit_use_double_quant=use_nested_quant,
  )

  # Fine-tuned models: load the merged checkpoint, then wrap it in a pipeline.
  # Kept inside this branch so the base-model pipeline above is not overwritten
  # and bnb_config is always defined when it is used.
  model = AutoModelForCausalLM.from_pretrained(
      model,
      quantization_config=bnb_config,
      torch_dtype=torch.float16,
      device_map="auto"
  )
  pipeline = transformers.pipeline(
      "text-generation",
      model=model,
      tokenizer=tokenizer
  )


In [None]:
import pandas
import tqdm
languages = ["en","hi","ru","es","ja","zh-cn"]


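def create_template_rich(question):
  """Wrap the question in a Llama-2-style [INST] / <<SYS>> chat prompt."""
  system = "Provide answers in Python"
  user = question
  # Note: "\\n" yields a literal backslash followed by "n", not a line break;
  # kept unchanged so the prompts match the recorded output of this cell.
  prompt =  f"<s>[INST] <<SYS>>\\n{system}\\n<</SYS>>\\n\\n{user}[/INST]"
  return prompt

def create_template(question):
  """Prepend only the system block, without [INST] tags."""
  system = "Provide answers in Python"
  user = question
  prompt = f"<s><<SYS>>\n{system}\n<</SYS>>\n\n{user}"
  return prompt

def create_template_normal(question):
  """Use the bare question as the prompt."""
  return question.strip()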

for language in tqdm.tqdm(languages):

  df = pandas.read_csv(f"/content/drive/MyDrive/MultilingualLLMBias/test.{language}.sanitized.csv")
  all_data = []

  # prompts_df = df["prompt"].apply(create_template)
  prompts_df = df["prompt"].apply(create_template_rich)
  df["prompt_modified"] = prompts_df
  prompts = prompts_df.to_list()

  # Enable the following for Code Llama models
  pipeline.tokenizer.pad_token_id = pipeline.model.config.eos_token_id

  # Enable the following for Llama models
  #pipeline.tokenizer.pad_token = "[PAD]"
  #pipeline.tokenizer.padding_side = "left"
  print(prompts)
  sequences = pipeline(
      prompts,
      do_sample=True,
      temperature=0.8,
      num_return_sequences=1,
      eos_token_id=tokenizer.eos_token_id,
      max_length=200,
      add_special_tokens=False,
      batch_size = 16,
  )

  for sequence in sequences:
    all_data.append(sequence[0]['generated_text'])

  df["results"] = all_data
  df.to_csv(f"{result_dir}/test.{language}.sanitized.results.csv")
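
One detail of `create_template_rich` is easy to miss: inside a regular (non-raw) string, `\\n` is a backslash followed by the letter n, while `\n` is a line break. The sketch below builds both variants with a hypothetical helper, `build_prompt`, so the difference is visible. Which variant matches the fine-tuning data is not shown in the notebook, so this is an illustration rather than a recommendation.

```python
def build_prompt(system, user, newline="\n"):
    """Format a single-turn Llama-2-style chat prompt; `newline` controls the separator."""
    return f"<s>[INST] <<SYS>>{newline}{system}{newline}<</SYS>>{newline}{newline}{user}[/INST]"


system = "Provide answers in Python"
question = "Write a function that returns the perimeter of a square given its side length as input."

as_in_notebook = build_prompt(system, question, newline="\\n")  # literal backslash + n
with_newlines = build_prompt(system, question, newline="\n")    # real line breaks

print(repr(as_in_notebook))
print(repr(with_newlines))
print(len(as_in_notebook) - len(with_newlines))  # 4: one extra character per separator
```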

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


['<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\nWrite a python function to remove first and last occurrence of a given character from the string.[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\nWrite a function to sort a given matrix in ascending order according to the sum of its rows.[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\nWrite a python function to find the volume of a triangular prism.[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\nWrite a function to that returns true if the input string contains sequences of lowercase letters joined with an underscore and false otherwise.[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\nWrite a function that returns the perimeter of a square given its side length as input.[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\nWrite a function to remove characters from the first string which are present in the s


['<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\nस्ट्रिंग से किसी दिए गए कैरेक्टर की पहली और आखिरी घटना को हटाने के लिए एक पायथन फ़ंक्शन लिखें।[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\nकिसी दिए गए मैट्रिक्स को उसकी पंक्तियों के योग के अनुसार आरोही क्रम में क्रमबद्ध करने के लिए एक फ़ंक्शन लिखें।[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\nत्रिकोणीय प्रिज्म का आयतन ज्ञात करने के लिए एक पायथन फ़ंक्शन लिखें।[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\nएक ऐसा फ़ंक्शन लिखें जो सत्य लौटाता है यदि इनपुट स्ट्रिंग में अंडरस्कोर के साथ जुड़े हुए लोअरकेस अक्षरों का अनुक्रम होता है और अन्यथा गलत होता है।[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\nएक फ़ंक्शन लिखें जो इनपुट के रूप में एक वर्ग की भुजा की लंबाई दी गई परिधि लौटाता है।[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\nपहली स्ट्रिंग से उन वर्णों को हटाने के लिए एक फ़ंक्शन लिखें जो दूसरी स


['<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\nНапишите функцию Python для удаления первого и последнего вхождения данного символа из строки.[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\nНапишите функцию, сортирующую заданную матрицу в порядке возрастания суммы ее строк.[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\nНапишите функцию Python, чтобы найти объем треугольной призмы.[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\nНапишите функцию, которая возвращает true, если входная строка содержит последовательность строчных букв, соединенных подчеркиванием, и false в противном случае.[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\nНапишите функцию, которая возвращает периметр квадрата, учитывая длину его стороны в качестве входных данных.[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\nНапишите функцию для удаления символов из первой строки, прис


['<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\nEscriba una función de Python para eliminar la primera y la última aparición de un carácter determinado de la cadena.[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\nEscribe una función para ordenar una matriz dada en orden ascendente según la suma de sus filas.[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\nEscribe una función de Python para encontrar el volumen de un prisma triangular.[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\nEscriba una función que devuelva verdadero si la cadena de entrada contiene secuencias de letras minúsculas unidas con un guión bajo y falso en caso contrario.[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\nEscribe una función que devuelva el perímetro de un cuadrado dada la longitud de su lado como entrada.[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\nEscriba una funci


['<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\n文字列から指定された文字の最初と最後の出現を削除する Python 関数を作成します。[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\n指定された行列を行の合計に従って昇順に並べ替える関数を作成します。[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\n三角柱の体積を求める Python 関数を作成します。[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\n入力文字列にアンダースコアで結合された一連の小文字が含まれる場合は true を返し、それ以外の場合は false を返す関数を作成します。[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\n入力として辺の長さを指定すると、正方形の周囲長を返す関数を作成します。[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\n2 番目の文字列に存在する文字を最初の文字列から削除する関数を作成します。[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\n指定された整数の配列に重複する要素が含まれているかどうかを確認する関数を作成します。[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\n指定された数字がウッドボールかどうかを確認する関数を作成します。[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\n指定された数値がその逆の 2 倍より 1 小さいかどうかを確認する


['<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\n编写一个 python 函数，从字符串中删除第一次和最后一次出现的给定字符。[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\n编写一个函数，根据给定矩阵的行总和对给定矩阵进行升序排序。[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\n编写一个 python 函数来求三棱柱的体积。[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\n编写一个函数，如果输入字符串包含以下划线连接的小写字母序列，则返回 true，否则返回 false。[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\n编写一个函数，在给定输入边长的情况下返回正方形的周长。[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\n编写一个函数，从第一个字符串中删除第二个字符串中存在的字符。[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\n编写一个函数来查找给定的整数数组是否包含重复元素。[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\n编写一个函数来检查给定的数字是否是木球。[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\n编写一个 python 函数来检查给定数字是否小于其倒数两倍。[/INST]', '<s>[INST] <<SYS>>\\nProvide answers in Python\\n<</SYS>>\\n\\n编写一个 p


In [None]:
print(result_dir)