<a target="_blank" href="https://colab.research.google.com/github/raghavbali/mastering_llms_workshop/blob/main/docs/module_03_instruction_tuning_and_alignment/01_instruction_tuning_llama_txt2py.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

# Instruction Tuning with Optimizations

Instruction tuning is form of fine-tuning that enhances a model's ability to generalize across diverse tasks. This concept is particularly useful in making models more adaptable and efficient in understanding and executing new instructions, even those they haven't been explicitly trained on.

## Ok, But What is Instruction Tuning?

Instruction tuning differs from task specific fine-tuning (SFT) approach primarily in the nature of the training data. While both methods involve training on **input-output** pairs, instruction tuning adds a critical layer: **instructions**. This additional context helps the model understand the task it is being asked to perform, leading to improved generalization to unseen tasks. Also, as we will see in this notebook, one of the ways of doing instruction tuning helps us skip the trouble of designing task specific heads or loss functions!

### Key Differences:
- **Task Specific Fine-Tuning**: Trains models using input examples and their corresponding outputs.
- **Instruction Tuning**: Augments the input-output pairs with instructions, enhancing the model's ability to generalize to new tasks.

### Example

**Task Specific Fine-Tuning**:
- **Task**: "Translate English to French" (this is not explicitly part of training dataset but implied by the training objective)
- **Input**: "The cat is on the mat."
- **Output**: "Le chat est sur le tapis."

**Instruction Tuning**:
- **Instruction**: "Translate the following sentence to French." (this is explicitly passed as input to the model)
- **Input**: "The cat is on the mat."
- **Output**: "Le chat est sur le tapis."

By incorporating instructions, the model gains a better understanding of the task, leading to more robust performance across a wider range of tasks.

In this notebook, we will go deeper into the mechanics of instruction tuning and tune our own model.


We covered a few optimization techniques such as **Quantization** and **LoRA** in the previous module, here we will leverage those techniques to perform instruction tuning of the latest Llama 3.1 (or Llama 2 if you do not have access yet) to **convert natual language text** to ``python``


> Note: This notebook, even though optimized, does not work on colab for LLaMA 3.1 but LLaMA 2 fits on colab T4

> Built with Llama ❤

## Environment Setup

In [None]:
# !pip install -q accelerate peft bitsandbytes transformers trl tensorboardX bitsandbytes

In [None]:
# We need this second time on colab as it doesn't pick the latest version
# !pip install --upgrade transformers

## Import Packages

In [1]:
import os
import torch
from datasets import load_dataset,Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
    pipeline,
    logging,
)
from peft import LoraConfig, PeftModel,get_peft_model
from trl import SFTTrainer

In [2]:
!export HF_TOKEN="<ADD YOUR TOKEN>"

In [3]:
# Ignore warnings
logging.set_verbosity(logging.CRITICAL)

In [4]:
from huggingface_hub import notebook_login

notebook_login()

VBox(children=(HTML(value='<center> <img\nsrc=https://huggingface.co/front/assets/huggingface_logo-noborder.sv…

## Notebook Configurations

In [5]:
## LLAMA Versions
LLAMA_2 = "2-7b-chat" # fits t4 on colab
LLAMA_3_1 = "3.1-8b-Instruct" # better performing but needs a bigger GPU. Needs access request approved on HF-Hub

base_model_version =  LLAMA_2# set to LLAMA_2 if you want to try this notebook on colab

### Model and Dataset Configs

In [6]:
# The base model from the Hugging Face hub
if base_model_version == LLAMA_3_1:
    base_model_name =  "meta-llama/Meta-Llama-3.1-8B-Instruct"
elif base_model_version == LLAMA_2:
    base_model_name = "NousResearch/Llama-2-7b-chat-hf"
else:
    print(f"This notebook is not setup for Base Model ={base_model_version}")
    base_model_name = "ERROR"

# Dataset
dataset_name = "flytech/python-codes-25k"

# Name of the Instruction Tuned model
output_dir = new_model = f"llama-{base_model_version}-t2py-ft"

## Dataset Preparation

In [7]:
# Load dataset (you can process it here)
dataset = load_dataset(dataset_name, split="train",cache_dir='/workspace')

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


### What is LLaMA again?

**[LLaMA (Large Language Model Meta AI)](https://ai.meta.com/blog/large-language-model-llama-meta-ai/)**, is a state-of-the-art foundational large language model designed to help researchers advance their work in this subfield of AI. Smaller, more performant models such as LLaMA enable others in the research community who don’t have access to large amounts of infrastructure to study these models, further democratizing access in this important, fast-changing field.
Within this series, Meta has released several versions of different sizes. They have progressively improved in their performance while adjusting how the inputs and prompt formats work.

This notebook works for two smaller sized LLaMAs, version ``2-7B`` and ``3.1-8B``.

### General Format for Instructions for LLaMA Models
**LLaMA 2** :
```
[INST]<<SYS>>{system text}<</SYS>>
{user text}[/INST]
{assistant response}
```

**LLaMA 3.1**:
```
<|start_header_id|>system<|end_header_id|>
{system text}
<|eot_id|>\n<|start_header_id|>user<|end_header_id|>
{user text}<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>
{assistant response}
...
```

We will leverage these formats to prepare our **Instruction Dataset**.

In [9]:
# [Txt2Python]
def fstr_llama_template(model_version,question,output=''):
    template_3_1 = "<|start_header_id|>system<|end_header_id|>\nConvert the following textual question into python code.<|eot_id|>\n<|start_header_id|>user<|end_header_id|>\nquestion:{question}\noutput:<|eot_id|>\n<|start_header_id|>assistant<|end_header_id|>"
    template_3_1_output = "{output}<|end_of_text|>"
    template_2_0 = "[INST]<<SYS>>Convert the following textual question into python code.<</SYS>>\nquestion:{question}\noutput:[/INST]"
    template_2_0_output = "{output}</s>"
    if model_version == LLAMA_3_1:
        if output !='':
            return eval(f'f"""{template_3_1+template_3_1_output}"""')
        else:
            return eval(f'f"""{template_3_1}"""')
    elif model_version == LLAMA_2:
        if output !='':
            return eval(f'f"""{template_2_0+template_2_0_output}"""')
        else:
            return eval(f'f"""{template_2_0}"""')
    else:
        print(f"This notebook is not setup for Base Model ={template}")
        return "ERROR"

In [10]:
instruction_formatted_dataset = []
DATASET_SIZE = 1000 #25000 # if you have more GPU RAM
for row in dataset.select(range(DATASET_SIZE)):
    instruction_formatted_dataset.append(
        {'text':fstr_llama_template(base_model_version,row['instruction'],row['output'])}
    )

In [11]:
# Transform List to Dataset Object
instruct_datset = Dataset.from_list(instruction_formatted_dataset)

In [12]:
# Get one sample data point
print(instruct_datset[0]['text'])

[INST]<<SYS>>Convert the following textual question into python code.<</SYS>>
question:Help me set up my daily to-do list!
output:[/INST]```python
tasks = []
while True:
    task = input('Enter a task or type 'done' to finish: ')
    if task == 'done': break
    tasks.append(task)
print(f'Your to-do list for today: {tasks}')
```</s>


## Configurations

### Quantization

In [13]:
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True, # 4-bit precision base model loading
    bnb_4bit_quant_type="nf4", #quantization type
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_storage=torch.bfloat16
)

### QLoRA

In [14]:
# LoRA rank dimension
lora_r = 64

# Alpha-LoRA for scaling
lora_alpha = 16

# Dropout for LoRA
lora_dropout = 0.1

In [15]:
# Load LoRA configuration
peft_config = LoraConfig(
    lora_alpha=lora_alpha,
    lora_dropout=lora_dropout,
    r=lora_r,
    bias="none",
    task_type="CAUSAL_LM",
)

### Fine-Tuning Configs

In [16]:
# Maximum sequence length to use
max_seq_length = 1028

# pack multiple examples in the same input sequence to increase efficiency
packing = False

# Load the entire model on the GPU 0
device_map = {"": 0}

### Training Setup Configs

In [17]:
# Number of training epochs
num_train_epochs = 1

fp16 = False
bf16 = False # True for A100

# Batch size per GPU for training
per_device_train_batch_size = 2 # increase to 4 for A100

# batch size per GPU for eval
per_device_eval_batch_size = 4

# update steps to accumulate the gradients
gradient_accumulation_steps = 1

gradient_checkpointing = True
max_grad_norm = 0.3
learning_rate = 2e-4
weight_decay = 0.001

# optimizer
optim = "paged_adamw_32bit"

# learning rate schedule
lr_scheduler_type = "cosine"
max_steps = -1 # setting this will override num_train_epochs, do not change
warmup_ratio = 0.03

# speeds up training considerably by grouping samples by length
group_by_length = True
save_steps = 0
logging_steps = 25

## Load base model

In [18]:
model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    quantization_config=bnb_config,
    device_map=device_map,
    cache_dir='/workspace'
)

model.config.use_cache = False
model.config.pretraining_tp = 1

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

## Load LLaMA tokenizer

In [19]:
tokenizer = AutoTokenizer.from_pretrained(base_model_name, trust_remote_code=True,cache_dir='/workspace')
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

## Set training parameters

In [20]:
training_arguments = TrainingArguments(
    output_dir=output_dir,
    num_train_epochs=num_train_epochs,
    per_device_train_batch_size=per_device_train_batch_size,
    gradient_accumulation_steps=gradient_accumulation_steps,
    optim=optim,
    save_steps=save_steps,
    logging_steps=logging_steps,
    learning_rate=learning_rate,
    weight_decay=weight_decay,
    fp16=fp16,
    bf16=bf16,
    max_grad_norm=max_grad_norm,
    max_steps=max_steps,
    warmup_ratio=warmup_ratio,
    group_by_length=group_by_length,
    lr_scheduler_type=lr_scheduler_type,
    report_to="tensorboard"
)

## Set Supervised Fine-Tuning Parameters

In [21]:
trainer = SFTTrainer(
    model=model,
    train_dataset=instruct_datset,
    peft_config=peft_config,
    args=training_arguments,
)

Adding EOS to train dataset:   0%|          | 0/1000 [00:00<?, ? examples/s]

Tokenizing train dataset:   0%|          | 0/1000 [00:00<?, ? examples/s]

Truncating train dataset:   0%|          | 0/1000 [00:00<?, ? examples/s]

## Time to Fine Tune our Text2Python LLaMA

In [22]:
#25k samples take 1hr 10mins on avg on A100PCIe
# 5k samples take about 15mins on avg on A100PCIe
# 1k samples take about 20mins on avg on T4 colab
trainer.train()

{'loss': 2.0767, 'grad_norm': 0.7677287459373474, 'learning_rate': 0.00019983011763899673, 'num_tokens': 7598.0, 'mean_token_accuracy': 0.6830279684066772, 'epoch': 0.05}
{'loss': 0.9611, 'grad_norm': 0.6689113974571228, 'learning_rate': 0.00019758460594972068, 'num_tokens': 11859.0, 'mean_token_accuracy': 0.8102640295028687, 'epoch': 0.1}
{'loss': 0.6963, 'grad_norm': 0.2795554995536804, 'learning_rate': 0.00019278563861498723, 'num_tokens': 18957.0, 'mean_token_accuracy': 0.8369954037666321, 'epoch': 0.15}
{'loss': 0.6513, 'grad_norm': 0.6598781943321228, 'learning_rate': 0.00018555878820963013, 'num_tokens': 23056.0, 'mean_token_accuracy': 0.8517589926719665, 'epoch': 0.2}
{'loss': 0.6387, 'grad_norm': 0.6539326906204224, 'learning_rate': 0.0001760931567112291, 'num_tokens': 29488.0, 'mean_token_accuracy': 0.8469677329063415, 'epoch': 0.25}
{'loss': 0.6138, 'grad_norm': 0.5377271175384521, 'learning_rate': 0.0001646364273476067, 'num_tokens': 33312.0, 'mean_token_accuracy': 0.859945

config.json:   0%|          | 0.00/583 [00:00<?, ?B/s]

{'train_runtime': 1318.9226, 'train_samples_per_second': 0.758, 'train_steps_per_second': 0.379, 'train_loss': 0.7069786338806152, 'epoch': 1.0}


TrainOutput(global_step=500, training_loss=0.7069786338806152, metrics={'train_runtime': 1318.9226, 'train_samples_per_second': 0.758, 'train_steps_per_second': 0.379, 'total_flos': 4542055501725696.0, 'train_loss': 0.7069786338806152})

## Save T2Python Model

In [23]:
trainer.model.save_pretrained(new_model)

In [None]:
# Uncomment below snippet if you would like to persist training on huggingface repo
# trainer.push_to_hub(commit_message="training_complete",blocking=True)

## Let Us Convert Some Text to Python

In [None]:
# Sample a few Instructions
test_instructions = list(dataset.shuffle().select(range(5))['instruction'])
test_instructions

In [33]:
for instruction in test_instructions:
    prompt = fstr_llama_template(base_model_version,instruction)
    inputs = tokenizer(prompt, return_tensors="pt", return_token_type_ids=False).to("cuda:0")
    print("----INSTRUCT-TUNED-MODEL ----")
    outputs = trainer.model.generate(**inputs,max_new_tokens=500, temperature=0.2)
    display(Markdown((tokenizer.decode(outputs[0], skip_special_tokens=True))))
    print("---- END ----")

----INSTRUCT-TUNED-MODEL ----


[INST]<<SYS>>Convert the following textual question into python code.<</SYS>>
question:I am trying to use `psycopg2` to connect to a Postgres DB but apparently it is not supported in 3.9. Is there another alternative I can use?


output:[/INST]  Yes, you can use `psycopg2` with PostgreSQL 3.9, but you need to install the `psycopg2-binary` package. Here's an example of how to use `psycopg2` with PostgreSQL 3.9:
```python
import psycopg2.ext
psycopg2.ext.register_type(psycopg2.ext.TARGET_TYPE)
conn = psycopg2.connect(host='localhost', port=5432, dbname='mydatabase', user='myusername', password='mypassword')
```
Alternatively, you can use `psycopg2-binary` which is a binary-compatible version of `psycopg2` that supports PostgreSQL 3.9:
```python
import psycopg2.ext
psycopg2.ext.register_type(psycopg2.ext.TARGET_TYPE)
conn = psycopg2.connect(host='localhost', port=5432, dbname='mydatabase', user='myusername', password='mypassword')
```
Note that `psycopg2-binary` is not a standard part of the `psycopg2` package, so you may need to install it separately.

---- END ----
----INSTRUCT-TUNED-MODEL ----


[INST]<<SYS>>Convert the following textual question into python code.<</SYS>>
question:Create a Python program to find the highest common factor of two given numbers
output:[/INST]```python
def highest_common_factor(a, b):
    if a == 0 or b == 0:
        return 0
    else:
        return max(a, b)
```</s>
```python
a = 12
b = 15
print(highest_common_factor(a, b))
```</s>
```output
15
```</s>
In this example, the `highest_common_factor()` function takes two arguments `a` and `b`, which are the two numbers to find the highest common factor. The function returns the highest common factor of the two numbers.
```python
a = 12
b = 15
print(highest_common_factor(a, b))
```</s>
```output
15
```</s>
In this example, the `highest_common_factor()` function returns `15` because it is the highest common factor of `12` and `15`.
```python
a = 12
b = 18
print(highest_common_factor(a, b))
```</s>
```output
6
```</s>
In this example, the `highest_common_factor()` function returns `6` because it is the highest common factor of `12` and `18`.
```python
a = 12
b = 20
print(highest_common_factor(a, b))
```</s>
```output
4
```</s>
In this example, the `highest_common_factor()` function returns `4` because it is the highest common factor of `12` and `20`.
```python
a = 12
b = 25
print(highest_common_factor(a, b))
```</s>
```output
5
```</s>
In this example, the `highest_common_factor()` function returns `5` because it is the highest common factor of `12` and `25`.
```python
a = 12
b = 30
print(highest_common_factor(a, b))
```</s>

---- END ----
----INSTRUCT-TUNED-MODEL ----


[INST]<<SYS>>Convert the following textual question into python code.<</SYS>>
question:Optimize the following Python code by changing its structure and by incorporating efficient algorithms def factor_of_three(n):
 count = 0
 for i in range(1, n+1):
  if i % 3 == 0:
   count += 1
 return count

factor_of_three(50)
output:[/INST]  Here's the optimized version of the code:
```python
def factor_of_three(n):
    return sum(1 for i in range(1, n+1) if i % 3 == 0)
```
Explanation:

1. The original code uses a for loop to iterate from 1 to n, and inside the loop, it checks if the current number is divisible by 3. Instead, we can use the sum() function to calculate the sum of the numbers that are divisible by 3.
2. The original code uses a variable 'count' to store the number of numbers that are divisible by 3. Instead, we can use the sum() function to calculate the total number of divisors of n.

Output:
```python
factor_of_three(50)
```
Output: 15

Note: The optimized code is more efficient than the original code because it avoids unnecessary iterations and uses a more efficient algorithm.

---- END ----
----INSTRUCT-TUNED-MODEL ----


[INST]<<SYS>>Convert the following textual question into python code.<</SYS>>
question:Write the code for a web scraper using Selenium.
output:[/INST]  Here's an example of a simple web scraper using Selenium in Python:
```python
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.alert import Alert
import time

driver = webdriver.Chrome()
driver.get('https://www.example.com')
WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, 'h1')))
print('Title:', driver.title)
print('Title:', driver.find_element(By.CSS_SELECTOR, 'h1').text)
```
This code uses the `Chrome` driver to navigate to the website, and then uses the `WebDriverWait` class to wait for the title of the page to be loaded. Once the title is loaded, the `print` statement is used to print the title.

You can modify the URL and the locator to scrape different websites.

Please note that web scraping can be illegal depending on the website and the data you are trying to scrape. Make sure you are scraping data that is publicly available and legal to scrape.

---- END ----
----INSTRUCT-TUNED-MODEL ----


KeyboardInterrupt: 

---

## Load Model From HuggingFace [Optional] 

In [40]:
lora_config = LoraConfig.from_pretrained(f"raghavbali/{new_model}",cache_dir='/workspace')
# hf_tokenizer = AutoTokenizer.from_pretrained(f"raghavbali/{new_model}")

hf_ft_model = AutoModelForCausalLM.from_pretrained(
    lora_config.base_model_name_or_path,
    quantization_config=bnb_config,
    use_auth_token=True,
    device_map=device_map,
    cache_dir='/workspace')

adapter_config.json:   0%|          | 0.00/836 [00:00<?, ?B/s]



Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

In [None]:
# # uncomment only if you have bigger than T4 GPU
# this is for comparison only
# base_model = AutoModelForCausalLM.from_pretrained(
#     base_model_name,
#     quantization_config=bnb_config,
#     device_map=device_map,
#     use_auth_token=True,
#     cache_dir='/workspace'
# )

### Apply LoRA Adaptors

In [41]:
hf_ft_peft_model = get_peft_model(hf_ft_model, lora_config)
print(hf_ft_peft_model.dtype)
print(hf_ft_model.num_parameters())

torch.float16
6771970048


In [42]:
from IPython.display import display, Markdown

In [45]:
for instruction in test_instructions[:5]:
    prompt = fstr_llama_template(base_model_version,instruction)
    inputs = tokenizer(prompt, return_tensors="pt", return_token_type_ids=False).to("cuda:0")
    print("----INSTRUCT-TUNED-MODEL ----")
    outputs = hf_ft_peft_model.generate(**inputs,max_new_tokens=200, temperature=0.2)
    display(Markdown((tokenizer.decode(outputs[0], skip_special_tokens=True))))
    print("---- END ----")

    # optional comparison with task specific fine-tuning
    # prompt = fstr_llama_template(base_model_version,instruction)
    # inputs = tokenizer(prompt, return_tensors="pt", return_token_type_ids=False).to("cuda:0")
    # print("----TASK SPECIFIC-FINE-TUNED-MODEL ----")
    # outputs = base_model.generate(**inputs,max_new_tokens=200, temperature=0.2)
    # display(Markdown((tokenizer.decode(outputs[0], skip_special_tokens=True))))
    # print("---- END ----")

----INSTRUCT-TUNED-MODEL ----


[INST]<<SYS>>Convert the following textual question into python code.<</SYS>>
question:I am trying to use `psycopg2` to connect to a Postgres DB but apparently it is not supported in 3.9. Is there another alternative I can use?


output:[/INST]  Sure! Here's the Python code to convert the given textual question:
```
import psycopg2

# Check if psycopg2 is supported in the current Python version
print(f"Is psycopg2 supported in your Python version? {psycopg2.__version__}")

# If not, look for an alternative PostgreSQL driver
alternative_driver = "psycopg3"
print(f"Using alternative PostgreSQL driver: {alternative_driver}")
```
Explanation:

1. The question is asking if the `psycopg2` library is supported in the current Python version.
2. To check this, we use the `__version__` attribute of the `psycopg2` module, which returns the version number of the library.
3. If `psycopg2` is not supported in the current

---- END ----
----INSTRUCT-TUNED-MODEL ----


[INST]<<SYS>>Convert the following textual question into python code.<</SYS>>
question:Create a Python program to find the highest common factor of two given numbers
output:[/INST]  Sure! Here is a Python program that finds the highest common factor of two given numbers:
```
def highest_common_factor(num1, num2):
    # Use the divide and conquer method to find the highest common factor
    while num1 > 1 and num2 > 1:
        if num1 % num2 == 0:
            return num2
        num1 = num1 // num2
    return num1

# Test the function
print(highest_common_factor(12, 15)) # Output: 3
print(highest_common_factor(24, 30)) # Output: 6
```
Explanation:

The `highest_common_factor` function uses the divide and conquer method to find the highest common factor of two given numbers. It works by repeatedly dividing one of the numbers by the other until one of the

---- END ----
----INSTRUCT-TUNED-MODEL ----


[INST]<<SYS>>Convert the following textual question into python code.<</SYS>>
question:Optimize the following Python code by changing its structure and by incorporating efficient algorithms def factor_of_three(n):
 count = 0
 for i in range(1, n+1):
  if i % 3 == 0:
   count += 1
 return count

factor_of_three(50)
output:[/INST]  Here is the Python code to optimize the given code:
```
def factor_of_three(n):
    count = 0
    return_value = 0
    i = 1
    while i <= n:
        if i % 3 == 0:
            count += 1
            return_value += count
            count = 0
        i += 1
    return return_value

factor_of_three(50)
```
Explanation:

1. The original code uses a for loop to iterate from 1 to n, and inside the loop, it checks if i % 3 == 0 and increments the count variable. This can be optimized by using a single while loop, where the condition is i <= n and i % 3 == 0. This reduces the number of iterations and makes the code more efficient.
2. In the original code,

---- END ----
----INSTRUCT-TUNED-MODEL ----


[INST]<<SYS>>Convert the following textual question into python code.<</SYS>>
question:Write the code for a web scraper using Selenium.
output:[/INST]  Sure! Here is an example of how you could use Selenium to write a web scraper in Python:
```
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Set up the webdriver
driver = webdriver.Chrome()

# Navigate to the website you want to scrape
driver.get("https://www.example.com")

# Find the element you want to scrape
element = driver.find_element(By.CSS_SELECTOR, ".scrapable-element")

# Wait for the element to be visible
WebDriverWait(driver, EC.presence_of_element_located((By.CSS_SELECTOR, ".scrapable-element")))
driver.implicitly_wait(1

---- END ----
----INSTRUCT-TUNED-MODEL ----


[INST]<<SYS>>Convert the following textual question into python code.<</SYS>>
question:Create a function in Python that takes in a list and sorts it in ascending order [3, 5, 1, 2, 9]
output:[/INST]  Sure! Here is the Python code to sort a list in ascending order:
```
def sort_list(list):
    return sorted(list)

# Example usage
list = [3, 5, 1, 2, 9]
sorted_list = sort_list(list)
print(sorted_list)  # [1, 2, 3, 5, 9]
```
Explanation:

* The `def sort_list(list):` line defines a function called `sort_list` that takes in a list as input.
* The `return sorted(list)` line sorts the input list using the `sorted()` function and returns the sorted list.
* In the example usage, we create a list `list` containing the numbers 3, 5, 1, 2, and 9, and pass it to the `sort_list()` function

---- END ----
