<a target="_blank" href="https://colab.research.google.com/github/raghavbali/mastering_llms_workshop/blob/main/docs/module_03_instruction_tuning_and_alignment/01_instruction_tuning_llama_txt2py.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

# Instruction Tuning with Optimizations

Instruction tuning is form of fine-tuning that enhances a model's ability to generalize across diverse tasks. This concept is particularly useful in making models more adaptable and efficient in understanding and executing new instructions, even those they haven't been explicitly trained on.

## Ok, But What is Instruction Tuning?

Instruction tuning differs from task specific fine-tuning (SFT) approach primarily in the nature of the training data. While both methods involve training on **input-output** pairs, instruction tuning adds a critical layer: **instructions**. This additional context helps the model understand the task it is being asked to perform, leading to improved generalization to unseen tasks. Also, as we will see in this notebook, one of the ways of doing instruction tuning helps us skip the trouble of designing task specific heads or loss functions!

### Key Differences:
- **Task Specific Fine-Tuning**: Trains models using input examples and their corresponding outputs.
- **Instruction Tuning**: Augments the input-output pairs with instructions, enhancing the model's ability to generalize to new tasks.

### Example

**Task Specific Fine-Tuning**:
- **Task**: "Translate English to French" (this is not explicitly part of training dataset but implied by the training objective)
- **Input**: "The cat is on the mat."
- **Output**: "Le chat est sur le tapis."

**Instruction Tuning**:
- **Instruction**: "Translate the following sentence to French." (this is explicitly passed as input to the model)
- **Input**: "The cat is on the mat."
- **Output**: "Le chat est sur le tapis."

By incorporating instructions, the model gains a better understanding of the task, leading to more robust performance across a wider range of tasks.

In this notebook, we will go deeper into the mechanics of instruction tuning and tune our own model.


We covered a few optimization techniques such as **Quantization** and **LoRA** in the previous module, here we will leverage those techniques to perform instruction tuning of the latest Llama 3.1 (or Llama 2 if you do not have access yet) to **convert natual language text** to ``python``


> Note: This notebook, even though optimized, does not work on colab for LLaMA 3.1 but LLaMA 2 fits on colab T4

> Built with Llama ❤

## Environment Setup

In [2]:
import torch
torch.__version__
#'2.2.0+cu121'

'2.2.0+cu121'

In [1]:
import transformers
transformers.__version__
#'4.44.0'

'4.44.0'

In [None]:
!pip install -U datasets==4.0.0 accelerate==1.10.0 peft==0.17.0 bitsandbytes==0.47.0 trl==0.21.0 tensorboardX==2.6.2.2

In [None]:
# We need this second time on colab as it doesn't pick the latest version
# !pip install --upgrade transformers

## Import Packages

In [3]:
import os
import torch
from datasets import load_dataset,Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
    pipeline,
    logging,
)
from peft import LoraConfig, PeftModel,get_peft_model
from trl import SFTTrainer

In [6]:
!export HF_TOKEN="<ADD YOUR TOKEN>"

In [7]:
# Ignore warnings
logging.set_verbosity(logging.CRITICAL)

In [8]:
from huggingface_hub import notebook_login

notebook_login()

VBox(children=(HTML(value='<center> <img\nsrc=https://huggingface.co/front/assets/huggingface_logo-noborder.sv…

## Notebook Configurations

In [9]:
## LLAMA Versions
LLAMA_2 = "2-7b-chat" # fits t4 on colab
LLAMA_3_1 = "3.1-8b-Instruct" # better performing but needs a bigger GPU. Needs access request approved on HF-Hub

base_model_version =  LLAMA_2# set to LLAMA_2 if you want to try this notebook on colab

### Model and Dataset Configs

In [10]:
# The base model from the Hugging Face hub
if base_model_version == LLAMA_3_1:
    base_model_name =  "meta-llama/Meta-Llama-3.1-8B-Instruct"
elif base_model_version == LLAMA_2:
    base_model_name = "NousResearch/Llama-2-7b-chat-hf"
else:
    print(f"This notebook is not setup for Base Model ={base_model_version}")
    base_model_name = "ERROR"

# Dataset
dataset_name = "flytech/python-codes-25k"

# Name of the Instruction Tuned model
output_dir = new_model = f"llama-{base_model_version}-t2py-ft"

## Dataset Preparation

In [11]:
# Load dataset (you can process it here)
dataset = load_dataset(dataset_name, split="train",cache_dir='/workspace')

README.md: 0.00B [00:00, ?B/s]

### What is LLaMA again?

**[LLaMA (Large Language Model Meta AI)](https://ai.meta.com/blog/large-language-model-llama-meta-ai/)**, is a state-of-the-art foundational large language model designed to help researchers advance their work in this subfield of AI. Smaller, more performant models such as LLaMA enable others in the research community who don’t have access to large amounts of infrastructure to study these models, further democratizing access in this important, fast-changing field.
Within this series, Meta has released several versions of different sizes. They have progressively improved in their performance while adjusting how the inputs and prompt formats work.

This notebook works for two smaller sized LLaMAs, version ``2-7B`` and ``3.1-8B``.

### General Format for Instructions for LLaMA Models
**LLaMA 2** :
```
[INST]<<SYS>>{system text}<</SYS>>
{user text}[/INST]
{assistant response}
```

**LLaMA 3.1**:
```
<|start_header_id|>system<|end_header_id|>
{system text}
<|eot_id|>\n<|start_header_id|>user<|end_header_id|>
{user text}<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>
{assistant response}
...
```

We will leverage these formats to prepare our **Instruction Dataset**.

In [12]:
# [Txt2Python]
def fstr_llama_template(model_version,question,output=''):
    template_3_1 = "<|start_header_id|>system<|end_header_id|>\nConvert the following textual question into python code.<|eot_id|>\n<|start_header_id|>user<|end_header_id|>\nquestion:{question}\noutput:<|eot_id|>\n<|start_header_id|>assistant<|end_header_id|>"
    template_3_1_output = "{output}<|end_of_text|>"
    template_2_0 = "[INST]<<SYS>>Convert the following textual question into python code.<</SYS>>\nquestion:{question}\noutput:[/INST]"
    template_2_0_output = "{output}</s>"
    if model_version == LLAMA_3_1:
        if output !='':
            return eval(f'f"""{template_3_1+template_3_1_output}"""')
        else:
            return eval(f'f"""{template_3_1}"""')
    elif model_version == LLAMA_2:
        if output !='':
            return eval(f'f"""{template_2_0+template_2_0_output}"""')
        else:
            return eval(f'f"""{template_2_0}"""')
    else:
        print(f"This notebook is not setup for Base Model ={template}")
        return "ERROR"

In [13]:
instruction_formatted_dataset = []
DATASET_SIZE = 1000 #25000 # if you have more GPU RAM
for row in dataset.select(range(DATASET_SIZE)):
    instruction_formatted_dataset.append(
        {'text':fstr_llama_template(base_model_version,row['instruction'],row['output'])}
    )

In [14]:
# Transform List to Dataset Object
instruct_datset = Dataset.from_list(instruction_formatted_dataset)

In [15]:
# Get one sample data point
print(instruct_datset[0]['text'])

[INST]<<SYS>>Convert the following textual question into python code.<</SYS>>
question:Help me set up my daily to-do list!
output:[/INST]```python
tasks = []
while True:
    task = input('Enter a task or type 'done' to finish: ')
    if task == 'done': break
    tasks.append(task)
print(f'Your to-do list for today: {tasks}')
```</s>


## Configurations

### Quantization

In [16]:
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True, # 4-bit precision base model loading
    bnb_4bit_quant_type="nf4", #quantization type
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_storage=torch.bfloat16
)

### QLoRA

In [17]:
# LoRA rank dimension
lora_r = 64

# Alpha-LoRA for scaling
lora_alpha = 64

# Dropout for LoRA
lora_dropout = 0.1

In [18]:
# Load LoRA configuration
peft_config = LoraConfig(
    lora_alpha=lora_alpha,
    lora_dropout=lora_dropout,
    r=lora_r,
    bias="none",
    task_type="CAUSAL_LM",
)

### Fine-Tuning Configs

In [19]:
# Maximum sequence length to use
max_seq_length = 1028

# pack multiple examples in the same input sequence to increase efficiency
packing = False

# Load the entire model on the GPU 0
device_map = {"": 0}

### Training Setup Configs

In [20]:
# Number of training epochs
num_train_epochs = 1

fp16 = False
bf16 = False # True for A100

# Batch size per GPU for training
per_device_train_batch_size = 2 # increase to 4 for A100

# batch size per GPU for eval
per_device_eval_batch_size = 4

# update steps to accumulate the gradients
gradient_accumulation_steps = 1

gradient_checkpointing = True
max_grad_norm = 0.3
learning_rate = 2e-4
weight_decay = 0.001

# optimizer
optim = "paged_adamw_32bit"

# learning rate schedule
lr_scheduler_type = "cosine"
max_steps = -1 # setting this will override num_train_epochs, do not change
warmup_ratio = 0.03

# speeds up training considerably by grouping samples by length
group_by_length = True
save_steps = 0
logging_steps = 25

## Load base model

In [21]:
# this will take some time to download
model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    quantization_config=bnb_config,
    device_map=device_map,
    cache_dir='/workspace'
)

model.config.use_cache = False
model.config.pretraining_tp = 1

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

## Load LLaMA tokenizer

In [22]:
tokenizer = AutoTokenizer.from_pretrained(base_model_name, trust_remote_code=True,cache_dir='/workspace')
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

## Set training parameters

In [24]:
training_arguments = TrainingArguments(
    output_dir=output_dir,
    num_train_epochs=num_train_epochs,
    per_device_train_batch_size=per_device_train_batch_size,
    gradient_accumulation_steps=gradient_accumulation_steps,
    optim=optim,
    save_steps=save_steps,
    logging_steps=logging_steps,
    learning_rate=learning_rate,
    weight_decay=weight_decay,
    fp16=fp16,
    bf16=bf16,
    max_grad_norm=max_grad_norm,
    max_steps=max_steps,
    warmup_ratio=warmup_ratio,
    group_by_length=group_by_length,
    lr_scheduler_type=lr_scheduler_type,
    report_to="tensorboard",
    remove_unused_columns=True
)

## Set Supervised Fine-Tuning Parameters

In [25]:
trainer = SFTTrainer(
    model=model,
    train_dataset=instruct_datset,
    peft_config=peft_config,
    args=training_arguments,
)

Adding EOS to train dataset:   0%|          | 0/1000 [00:00<?, ? examples/s]

Tokenizing train dataset:   0%|          | 0/1000 [00:00<?, ? examples/s]

Truncating train dataset:   0%|          | 0/1000 [00:00<?, ? examples/s]

## Time to Fine Tune our Text2Python LLaMA

In [26]:
#25k samples take 1hr 10mins on avg on A100PCIe
# 5k samples take about 15mins on avg on A100PCIe
# 1k samples take about 20mins on avg on T4 colab
trainer.train()

{'loss': 1.6432, 'grad_norm': 2.3384714126586914, 'learning_rate': 0.00019983011763899673, 'num_tokens': 7598.0, 'mean_token_accuracy': 0.7280278921127319, 'epoch': 0.05}
{'loss': 0.7669, 'grad_norm': 1.2205456495285034, 'learning_rate': 0.00019758460594972068, 'num_tokens': 11859.0, 'mean_token_accuracy': 0.8340024781227112, 'epoch': 0.1}
{'loss': 0.6662, 'grad_norm': 0.49444037675857544, 'learning_rate': 0.00019278563861498723, 'num_tokens': 18957.0, 'mean_token_accuracy': 0.8404974913597107, 'epoch': 0.15}
{'loss': 0.6288, 'grad_norm': 0.9828441739082336, 'learning_rate': 0.00018555878820963013, 'num_tokens': 23056.0, 'mean_token_accuracy': 0.8561684083938599, 'epoch': 0.2}
{'loss': 0.6174, 'grad_norm': 0.7831445336341858, 'learning_rate': 0.0001760931567112291, 'num_tokens': 29488.0, 'mean_token_accuracy': 0.852472972869873, 'epoch': 0.25}
{'loss': 0.5975, 'grad_norm': 0.8687554597854614, 'learning_rate': 0.0001646364273476067, 'num_tokens': 33312.0, 'mean_token_accuracy': 0.864137

config.json:   0%|          | 0.00/583 [00:00<?, ?B/s]

{'train_runtime': 186.3806, 'train_samples_per_second': 5.365, 'train_steps_per_second': 2.683, 'train_loss': 0.6494150886535645, 'epoch': 1.0}


TrainOutput(global_step=500, training_loss=0.6494150886535645, metrics={'train_runtime': 186.3806, 'train_samples_per_second': 5.365, 'train_steps_per_second': 2.683, 'total_flos': 4542055501725696.0, 'train_loss': 0.6494150886535645})

## Save T2Python Model

In [28]:
trainer.model.save_pretrained(new_model)

In [None]:
# Uncomment below snippet if you would like to persist training on huggingface repo
# trainer.push_to_hub(commit_message="training_complete",blocking=True)

## Let Us Convert Some Text to Python

In [32]:
from IPython.display import display,Markdown

In [36]:
# Sample a few Instructions
test_instructions = list(dataset.shuffle().select(range(5))['instruction'])
for idx,inst in enumerate(test_instructions):
    print(f"Instruction={idx+1}")
    print(inst)
    print("-"*25)

Instruction=1
Create a relational mapping between two arrays using python arr1 = [1, 2, 3]
arr2 = [a, b, c]
-------------------------
Instruction=2
Define a function in Python that takes three numbers as input and return the largest number
-------------------------
Instruction=3
Write a python script to segregate a list of numbers into lists odd and even numbers list_of_numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9]
-------------------------
Instruction=4
Can you write a Python script to perform a Linear Search on a list of numbers? [1, 2, 3, 4, 5] 3
-------------------------
Instruction=5
Serialize a list of dictionaries in Python [{'name':'John', 'age':30},
{'name':'Paul', 'age':22}]
-------------------------


In [37]:
for instruction in test_instructions:
    prompt = fstr_llama_template(base_model_version,instruction)
    inputs = tokenizer(prompt, return_tensors="pt", return_token_type_ids=False).to("cuda:0")
    print("----INSTRUCT-TUNED-MODEL ----")
    outputs = trainer.model.generate(**inputs,max_new_tokens=500, temperature=0.2)
    display(Markdown((tokenizer.decode(outputs[0], skip_special_tokens=True))))
    print("---- END ----")

----INSTRUCT-TUNED-MODEL ----


[INST]<<SYS>>Convert the following textual question into python code.<</SYS>>
question:Create a relational mapping between two arrays using python arr1 = [1, 2, 3]
arr2 = [a, b, c]
output:[/INST]  ```python
import numpy as np
arr1 = np.array([1, 2, 3])
arr2 = np.array(['a', 'b', 'c'])
relationship = np.corrcoef(arr1, arr2)
print(relationship)
```
In this example, we are using the `corrcoef` function from the NumPy library to calculate the correlation coefficient between two arrays. The `corrcoef` function takes two arrays as input and returns a tuple containing the correlation coefficient and its standard error.

Note that the `corrcoef` function assumes that the arrays are of the same shape and size. If the arrays are of different shapes or sizes, you may need to use a different method to calculate the correlation coefficient.

---- END ----
----INSTRUCT-TUNED-MODEL ----


[INST]<<SYS>>Convert the following textual question into python code.<</SYS>>
question:Define a function in Python that takes three numbers as input and return the largest number
output:[/INST]  ```python
def largest_number(a, b, c):
    return max(a, b, c)
```
This function takes three numbers as input and returns the largest number among them.
```python
largest_number(1, 2, 3) # Output: 3
```
You can also use the `max()` function to achieve the same result:
```python
def largest_number(a, b, c):
    return max(a, b, c)
```
This function is more concise and easier to read.

---- END ----
----INSTRUCT-TUNED-MODEL ----


[INST]<<SYS>>Convert the following textual question into python code.<</SYS>>
question:Write a python script to segregate a list of numbers into lists odd and even numbers list_of_numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9]
output:[/INST]  ```python
odd_list = []
even_list = []
for num in list_of_numbers:
    if num % 2 == 0:
        even_list.append(num)
    else:
        odd_list.append(num)
```
This script will iterate through the list of numbers and if the number is even, it will add it to the even list, otherwise it will add it to the odd list.

You can also use the built-in `sorted()` function to sort the list and then use the `if __name__ == '__main__':` block to run the script.
```python
if __name__ == '__main__':
    list_of_numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    odd_list, even_list = sorted(list_of_numbers).split(set(list_of_numbers))
    print(f'Odd List: {odd_list}')
    print(f'Even List: {even_list}')
```
This will sort the list and then split it into two lists based on the parity of the numbers.

You can also use the `math.isodd()` function to check if a number is odd or even.
```python
if __name__ == '__main__':
    list_of_numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    odd_list = [num for num in list_of_numbers if math.isodd(num)]
    even_list = [num for num in list_of_numbers if math.iseven(num)]
    print(f'Odd List: {odd_list}')
    print(f'Even List: {even_list}')
```
This will iterate through the list and check if the number is odd or even and then add it to the appropriate list.

---- END ----
----INSTRUCT-TUNED-MODEL ----


[INST]<<SYS>>Convert the following textual question into python code.<</SYS>>
question:Can you write a Python script to perform a Linear Search on a list of numbers? [1, 2, 3, 4, 5] 3
output:[/INST]  ```python
def linear_search(arr, target):
    for i in range(len(arr)):
        if arr[i] == target:
            return i
    return -1
```</s>
This script defines a linear search function that takes a list of numbers and a target value as input. It iterates through the list and checks if the target value is present in the list. If it is, it returns the index of the value. If it's not, it returns -1.
```python
arr = [1, 2, 3, 4, 5]
target = 3
result = linear_search(arr, target)
print(result)
```</s>
This code will print 2, indicating that the target value 3 is present in the list at index 2.

---- END ----
----INSTRUCT-TUNED-MODEL ----


[INST]<<SYS>>Convert the following textual question into python code.<</SYS>>
question:Serialize a list of dictionaries in Python [{'name':'John', 'age':30},
{'name':'Paul', 'age':22}]
output:[/INST]  ```python
import json
list_of_dicts = [{'name': 'John', 'age': 30}, {'name': 'Paul', 'age': 22}]
json_string = json.dumps(list_of_dicts)
print(json_string)
```
This will output the following JSON string:
```json
[{"name": "John", "age": 30}, {"name": "Paul", "age": 22}]
```
You can also use `json.dump()` instead of `json.dumps()` if you want to write the JSON to a file.
```python
import json
list_of_dicts = [{'name': 'John', 'age': 30}, {'name': 'Paul', 'age': 22}]
with open('list_of_dicts.json', 'w') as f:
    json.dump(list_of_dicts, f)
```
This will write the JSON to a file named `list_of_dicts.json`.

---- END ----


---

## Load Model From HuggingFace [Optional] 

In [38]:
lora_config = LoraConfig.from_pretrained(f"raghavbali/{new_model}",cache_dir='/workspace')
# hf_tokenizer = AutoTokenizer.from_pretrained(f"raghavbali/{new_model}")

hf_ft_model = AutoModelForCausalLM.from_pretrained(
    lora_config.base_model_name_or_path,
    quantization_config=bnb_config,
    use_auth_token=True, #replace param with token if using transformers v5+
    device_map=device_map,
    cache_dir='/workspace')

adapter_config.json:   0%|          | 0.00/836 [00:00<?, ?B/s]



Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

In [None]:
# # uncomment only if you have bigger than T4 GPU
# this is for comparison only
# base_model = AutoModelForCausalLM.from_pretrained(
#     base_model_name,
#     quantization_config=bnb_config,
#     device_map=device_map,
#     use_auth_token=True,
#     cache_dir='/workspace'
# )

### Apply LoRA Adaptors

In [39]:
hf_ft_peft_model = get_peft_model(hf_ft_model, lora_config)
print(hf_ft_peft_model.dtype)
print(hf_ft_model.num_parameters())

torch.float16
6771970048


In [40]:
from IPython.display import display, Markdown

In [41]:
for instruction in test_instructions[:5]:
    prompt = fstr_llama_template(base_model_version,instruction)
    inputs = tokenizer(prompt, return_tensors="pt", return_token_type_ids=False).to("cuda:0")
    print("----INSTRUCT-TUNED-MODEL ----")
    outputs = hf_ft_peft_model.generate(**inputs,max_new_tokens=200, temperature=0.2)
    display(Markdown((tokenizer.decode(outputs[0], skip_special_tokens=True))))
    print("---- END ----")

    # optional comparison with task specific fine-tuning
    # prompt = fstr_llama_template(base_model_version,instruction)
    # inputs = tokenizer(prompt, return_tensors="pt", return_token_type_ids=False).to("cuda:0")
    # print("----TASK SPECIFIC-FINE-TUNED-MODEL ----")
    # outputs = base_model.generate(**inputs,max_new_tokens=200, temperature=0.2)
    # display(Markdown((tokenizer.decode(outputs[0], skip_special_tokens=True))))
    # print("---- END ----")

----INSTRUCT-TUNED-MODEL ----


[INST]<<SYS>>Convert the following textual question into python code.<</SYS>>
question:Create a relational mapping between two arrays using python arr1 = [1, 2, 3]
arr2 = [a, b, c]
output:[/INST]  To create a relational mapping between two arrays in Python, you can use the `zip()` function. Here's an example:
```
arr1 = [1, 2, 3]
arr2 = ['a', 'b', 'c']

result = [i + j for i, j in zip(arr1, arr2)]

print(result)  # [1 + 'a', 2 + 'b', 3 + 'c']
```
The `zip()` function takes two or more iterables as input and returns an iterator that generates tuples of corresponding elements from the input iterables. In this case, we pass `arr1` and `arr2` as arguments to `zip()`, and then use a list comprehension to create a new list `result` that contains the relational mapping between the two arrays.

Alternatively, you can use the `map()` function to create the

---- END ----
----INSTRUCT-TUNED-MODEL ----


[INST]<<SYS>>Convert the following textual question into python code.<</SYS>>
question:Define a function in Python that takes three numbers as input and return the largest number
output:[/INST]  Sure! Here is a possible implementation of a function in Python that takes three numbers as input and returns the largest of them:
```
def largest_number(a, b, c):
    largest = max(a, b, c)
    return largest
```
Here, `a`, `b`, and `c` are the three numbers passed as input to the function, and `largest` is a variable that stores the largest of the three numbers. The `max()` function is used to find the largest number among `a`, `b`, and `c`. Finally, the `return` statement returns the value of `largest`.

Here's an example of how you could use this function:
```
print(largest_number(3, 4, 5))  # Output: 5
```
I hope this helps! Let me know if you have any questions.

---- END ----
----INSTRUCT-TUNED-MODEL ----


[INST]<<SYS>>Convert the following textual question into python code.<</SYS>>
question:Write a python script to segregate a list of numbers into lists odd and even numbers list_of_numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9]
output:[/INST]  Sure! Here is a Python script that segregates a list of numbers into lists of odd and even numbers:
```
list_of_numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Define a function to check if a number is odd or even
def is_odd(num):
    return num % 2 == 1

# Segregate the list into odd and even lists
odd_numbers = [num for num in list_of_numbers if is_odd(num)]
even_numbers = [num for num in list_of_numbers if not is_odd(num)]

print(odd_numbers)  # Output: [1, 3, 5, 7]
print(even_numbers)  # Output: [2, 4, 6, 8, 9

---- END ----
----INSTRUCT-TUNED-MODEL ----


[INST]<<SYS>>Convert the following textual question into python code.<</SYS>>
question:Can you write a Python script to perform a Linear Search on a list of numbers? [1, 2, 3, 4, 5] 3
output:[/INST]  Certainly! Here is a Python script that performs a linear search on a list of numbers:
```
def linear_search(numbers, target):
    for i in range(len(numbers)):
        if numbers[i] == target:
            return i
    return -1 # not found

# Example usage
numbers = [1, 2, 3, 4, 5]
result = linear_search(numbers, 3)
print(result) # output: 2
```
Explanation:

* The `linear_search` function takes two arguments: `numbers` and `target`. `numbers` is a list of numbers, and `target` is the number we are searching for.
* The function iterates over the `numbers` list using a `for` loop.
* In each iteration, it checks if the current element in the list is equal to `target`.

---- END ----
----INSTRUCT-TUNED-MODEL ----


[INST]<<SYS>>Convert the following textual question into python code.<</SYS>>
question:Serialize a list of dictionaries in Python [{'name':'John', 'age':30},
{'name':'Paul', 'age':22}]
output:[/INST]  Sure! Here's the Python code to serialize a list of dictionaries:
```python
import json

data = [
    {'name': 'John', 'age': 30},
    {'name': 'Paul', 'age': 22}
]

json_data = json.dumps(data)

print(json_data)
```
The `json.dumps()` function takes a list of dictionaries as input and serializes them into a JSON string. The `print()` function then prints the resulting JSON string.

Here's the output:
```json
[{"name": "John", "age": 30}, {"name": "Paul", "age": 22}]
```
Note that the serialized list of dictionaries is a JSON array, which is represented as `[...]` in the output.

---- END ----
