## Setup

Run the cells below to set up the environment and install the required libraries. For this experiment we need `accelerate`, `peft`, `transformers`, `datasets`, and TRL, which provides the recent [`SFTTrainer`](https://huggingface.co/docs/trl/main/en/sft_trainer). We use `bitsandbytes` to [quantize the base model to 4-bit](https://huggingface.co/blog/4bit-transformers-bitsandbytes). We also install `einops`, which is required to load Falcon models.

In [1]:
%pip install -U trl transformers accelerate git+https://github.com/huggingface/peft.git
%pip install datasets bitsandbytes einops wandb

Collecting git+https://github.com/huggingface/peft.git
  Cloning https://github.com/huggingface/peft.git to /tmp/pip-req-build-hbh5cj9t
  Running command git clone --filter=blob:none --quiet https://github.com/huggingface/peft.git /tmp/pip-req-build-hbh5cj9t
  Resolved https://github.com/huggingface/peft.git to commit 043238578f9c7af88334005819543cf60e263760
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting trl
  Downloading trl-0.7.4-py3-none-any.whl (133 kB)
Collecting transformers
  Downloading transformers-4.35.2-py3-none-any.whl (7.9 MB)
Collecting accelerate
  Downloading accelerate-0.24.1-py3-non
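
Because the install log above is cut off, it can help to confirm what actually ended up in the environment. The following is a minimal sketch that uses only the standard library; the helper name `installed_versions` and the `REQUIRED` list are illustrative and not part of the original notebook.

```python
from importlib.metadata import PackageNotFoundError, version

# Packages requested by the two %pip commands in the setup cell.
REQUIRED = ["trl", "transformers", "accelerate", "peft",
            "datasets", "bitsandbytes", "einops", "wandb"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(REQUIRED).items():
    print(f"{name:15s} {ver or 'NOT INSTALLED'}")
```

Any package reported as `NOT INSTALLED` suggests the corresponding `%pip` command did not complete and should be rerun before loading the model.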

## Loading the model

In [2]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "codellama/CodeLlama-13b-Instruct-hf"

# Quantize the weights to 4-bit NF4; computation runs in float16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=False,
)

# device_map="auto" lets accelerate place the layers on the available devices;
# weights that have to be offloaded to disk are written to the "offload" folder.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    trust_remote_code=True,
    device_map="auto",
    offload_folder="offload",
)
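
To see why 4-bit loading matters for a model of this size, here is a rough back-of-the-envelope estimate of the memory taken by the weights alone. It is a sketch under stated assumptions: the parameter count is the nominal 13 billion from the model name, and the helper `weight_memory_gb` is hypothetical. It ignores quantization constants, layers that stay unquantized, activations, and the KV cache, so real usage will be higher.

```python
def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


N_PARAMS = 13e9  # assumption: nominal size taken from the "13b" in the model name

for label, bits in [("float16", 16), ("int8", 8), ("nf4", 4)]:
    print(f"{label:8s} ~{weight_memory_gb(N_PARAMS, bits):.1f} GB")
```

Under these assumptions the weights need roughly 26 GB in float16 and about 6.5 GB in 4-bit, which is the difference between needing several GPUs and fitting on a single mid-range one.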


Let's also load the tokenizer and reuse the end-of-sequence token as the padding token:

In [3]:
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
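
A padding token is needed whenever sequences of different lengths are batched together. The sketch below illustrates the idea in plain Python; it is not the tokenizer's actual implementation. `pad_batch` is a hypothetical helper, the token ids in the example are arbitrary, and the pad id of 2 is the `eos_token_id` that the generation log later in this notebook reports.

```python
EOS_ID = 2  # eos_token_id for this tokenizer, reused as the pad id


def pad_batch(sequences, pad_id=EOS_ID):
    """Right-pad lists of token ids to a common length and build attention masks."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(list(seq) + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks padding the model should ignore.
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return input_ids, attention_mask


ids, mask = pad_batch([[1, 306, 626], [1, 306]])  # arbitrary example ids
print(ids)
print(mask)
```

Because the pad id and the EOS id coincide, the attention mask is the only thing that separates padding from a real end-of-sequence token.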


In [4]:
example = {
    'question': 
        "Can you solve this real interview question? Find The Original Array of Prefix Xor - You are given an integer array pref of size n. Find and return the array arr of size n that satisfies:\n\n * pref[i] = arr[0] ^ arr[1] ^ ... ^ arr[i].\n\nNote that ^ denotes the bitwise-xor operation.\n\nIt can be proven that the answer is unique.\n\n \n\nExample 1:\n\n\nInput: pref = [5,2,0,3,1]\nOutput: [5,7,2,3,2]\nExplanation: From the array [5,7,2,3,2] we have the following:\n- pref[0] = 5.\n- pref[1] = 5 ^ 7 = 2.\n- pref[2] = 5 ^ 7 ^ 2 = 0.\n- pref[3] = 5 ^ 7 ^ 2 ^ 3 = 3.\n- pref[4] = 5 ^ 7 ^ 2 ^ 3 ^ 2 = 1.\n\n\nExample 2:\n\n\nInput: pref = [13]\nOutput: [13]\nExplanation: We have pref[0] = arr[0] = 13.\n\n\n \n\nConstraints:\n\n * 1 <= pref.length <= 105\n * 0 <= pref[i] <= 106",
    "code": "class Solution {\npublic:\n    vector<int> findArray(vector<int>& pref) {\n        int n = pref.size();\n        \n        vector<int> arr;\n        arr.push_back(pref[0]);\n        for (int i = 1; i < n; i++) {\n            arr.push_back(pref[i] ^ pref[i - 1]);\n        }\n        \n        return arr;\n    }\n};"
}  # A single hand-picked example: a problem statement and its C++ solution
text = f"""
    <s>[INST] <<SYS>>
    {{You are a smart AI Explainer. You will explain what is happening in the ###SOLUTION based on what is asked in the ###QUESTION and add the explaination after the ###EXPLANATION }}
    <</SYS>>
    ###QUESTION: {example['question']} 
    ###SOLUTION: {example['code']}[/INST]
    
    ###EXPLANATION:"""
text

'\n    <s>[INST] <<SYS>>\n    {You are a smart AI Explainer. You will explain what is happening in the ###SOLUTION based on what is asked in the ###QUESTION and add the explaination after the ###EXPLANATION }\n    <</SYS>>\n    ###QUESTION: Can you solve this real interview question? Find The Original Array of Prefix Xor - You are given an integer array pref of size n. Find and return the array arr of size n that satisfies:\n\n * pref[i] = arr[0] ^ arr[1] ^ ... ^ arr[i].\n\nNote that ^ denotes the bitwise-xor operation.\n\nIt can be proven that the answer is unique.\n\n \n\nExample 1:\n\n\nInput: pref = [5,2,0,3,1]\nOutput: [5,7,2,3,2]\nExplanation: From the array [5,7,2,3,2] we have the following:\n- pref[0] = 5.\n- pref[1] = 5 ^ 7 = 2.\n- pref[2] = 5 ^ 7 ^ 2 = 0.\n- pref[3] = 5 ^ 7 ^ 2 ^ 3 = 3.\n- pref[4] = 5 ^ 7 ^ 2 ^ 3 ^ 2 = 1.\n\n\nExample 2:\n\n\nInput: pref = [13]\nOutput: [13]\nExplanation: We have pref[0] = arr[0] = 13.\n\n\n \n\nConstraints:\n\n * 1 <= pref.length <= 105\n * 
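
For reference when judging the model's explanation, the C++ solution in the example relies on the identity `arr[i] = pref[i] ^ pref[i - 1]`: XOR-ing two consecutive prefixes cancels every shared term. A minimal Python port makes this easy to check against the examples in the problem statement; the function names `find_array` and `prefix_xor` are illustrative.

```python
def find_array(pref):
    """Recover arr from its prefix-XOR array pref (Python port of the C++ solution)."""
    arr = [pref[0]]
    for i in range(1, len(pref)):
        # pref[i] ^ pref[i - 1] cancels arr[0..i-1], leaving only arr[i].
        arr.append(pref[i] ^ pref[i - 1])
    return arr


def prefix_xor(arr):
    """Forward direction: rebuild pref from arr, used as a round-trip check."""
    pref, running = [], 0
    for value in arr:
        running ^= value
        pref.append(running)
    return pref


print(find_array([5, 2, 0, 3, 1]))  # Example 1 from the problem statement
print(find_array([13]))             # Example 2
```

Applying `prefix_xor` to the result should reproduce the original input, which gives a quick correctness check independent of the model's output.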

In [5]:
device = "cuda:0"

# Tokenize the input text
inputs = tokenizer(text, return_tensors="pt").to(device)

# Generate the output using the model
outputs = model.generate(**inputs, max_new_tokens=200)

# Decode and print the output
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.



     [INST] <<SYS>>
    {You are a smart AI Explainer. You will explain what is happening in the ###SOLUTION based on what is asked in the ###QUESTION and add the explaination after the ###EXPLANATION }
    <</SYS>>
    ###QUESTION: Can you solve this real interview question? Find The Original Array of Prefix Xor - You are given an integer array pref of size n. Find and return the array arr of size n that satisfies:

 * pref[i] = arr[0] ^ arr[1] ^ ... ^ arr[i].

Note that ^ denotes the bitwise-xor operation.

It can be proven that the answer is unique.

 

Example 1:


Input: pref = [5,2,0,3,1]
Output: [5,7,2,3,2]
Explanation: From the array [5,7,2,3,2] we have the following:
- pref[0] = 5.
- pref[1] = 5 ^ 7 = 2.
- pref[2] = 5 ^ 7 ^ 2 = 0.
- pref[3] = 5 ^ 7 ^ 2 ^ 3 = 3.
- pref[4] = 5 ^ 7 ^ 2 ^ 3 ^ 2 = 1.


Example 2:


Input: pref = [13]
Output: [13]
Explanation: We have pref[0] = arr[0] = 13.


 

Constraints:

 * 1 <= pref.length <= 105
 * 0 <= pref[i] <= 106 
    ###SOLUTION: class
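
The printed output starts by repeating the whole prompt because, for decoder-only models, `generate` returns the prompt tokens followed by the newly generated ones. One common way to show only the completion is to slice off the prompt before decoding. The sketch below shows the slicing on plain lists with arbitrary ids; `new_token_ids` is a hypothetical helper, and the commented lines show how the same idea would presumably apply to the `inputs` and `outputs` variables from the cell above.

```python
def new_token_ids(output_ids, prompt_length):
    """Drop the first prompt_length ids, keeping only what the model generated."""
    return output_ids[prompt_length:]


# Toy stand-ins for the tokenizer and model output; the ids are arbitrary.
prompt_ids = [1, 518, 25580, 29962]
output_ids = prompt_ids + [450, 1650, 2]

print(new_token_ids(output_ids, len(prompt_ids)))

# In the notebook this would correspond to (not run here):
# completion = outputs[0][inputs["input_ids"].shape[1]:]
# print(tokenizer.decode(completion, skip_special_tokens=True))
```

With `max_new_tokens=200` and a prompt this long, trimming the echoed prompt makes the explanation itself much easier to read.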