# `transformers` meets `bitsandbytes` for democratizing Large Language Models (LLMs) through 4bit quantization

<center>
<img src="https://github.com/huggingface/blog/blob/main/assets/96_hf_bitsandbytes_integration/Thumbnail_blue.png?raw=true" alt="drawing" width="700" class="center"/>
</center>

Welcome to this notebook, which walks through the recent `bitsandbytes` integration. The integration includes the work from XXX, which introduces 4bit quantization techniques with no performance degradation, to democratize LLM inference and training.

In this notebook, we will learn together how to load a large model in 4bit (`gpt-neo-x-20b`) and train it using Google Colab and the PEFT library from Hugging Face 🤗.

In the [general usage notebook](https://colab.research.google.com/drive/1ge2F1QSK8Q7h0hn3YKuBCOAS0bK8E0wf?usp=sharing), you can learn how to properly load a model in 4bit with all its variants.

If you liked the previous work on integrating [*LLM.int8*](https://arxiv.org/abs/2208.07339), have a look at the [introduction blogpost](https://huggingface.co/blog/hf-bitsandbytes-integration) to learn more about that quantization method.


In [1]:
!nvidia-smi

Wed May 31 18:32:01 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla V100-PCIE...  Off  | 00000000:00:05.0 Off |                  Off |
| N/A   42C    P0    38W / 250W |  13925MiB / 32510MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

In [2]:
!pip install -q -U bitsandbytes
!pip install -q -U git+https://github.com/huggingface/transformers.git 
!pip install -q -U git+https://github.com/huggingface/peft.git
!pip install -q -U git+https://github.com/huggingface/accelerate.git
!pip install -q datasets


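First, let's load the data and the model we are going to use. The original walkthrough uses GPT-NeoX-20B, which is around 40GB in half precision; the cells below load `EleutherAI/polyglot-ko-12.8b`, a smaller model that is fine-tuned with the same steps.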
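
The 40GB figure follows from simple arithmetic: roughly 20 billion parameters at 2 bytes each. The sketch below is a rough, weight-only estimate. The helper name `weight_memory_gb` is illustrative, the parameter counts are rounded assumptions, and the estimate ignores activations, optimizer state and the small overhead of quantization constants.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Rounded parameter counts (assumptions for illustration).
models = [("gpt-neox-20b", 20e9), ("polyglot-ko-12.8b", 12.8e9)]

for name, num_params in models:
    half = weight_memory_gb(num_params, 16)   # fp16 / bf16: 16 bits per weight
    four_bit = weight_memory_gb(num_params, 4)  # 4bit quantized weights
    print(f"{name}: ~{half:.1f} GB in half precision, ~{four_bit:.1f} GB in 4bit")
```

Under these assumptions the 20B model drops from about 40 GB to about 10 GB of weights, which is what makes a single 16 GB or 32 GB GPU workable.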

In [11]:
from datasets import load_dataset

data = load_dataset("naver-clova-conversation/klue-tc-tsv")

Found cached dataset klue-tc-tsv (/opt/ml/.cache/huggingface/datasets/naver-clova-conversation___klue-tc-tsv/default/0.0.0/df1f2c6ad91377fafd8feb2c571af2fe3c490b9f2928153d7a23a4c38da20699)


  0%|          | 0/3 [00:00<?, ?it/s]

In [12]:
data

DatasetDict({
    train: Dataset({
        features: ['text', 'label'],
        num_rows: 45678
    })
    validation: Dataset({
        features: ['text', 'label'],
        num_rows: 9107
    })
    test: Dataset({
        features: ['text', 'label'],
        num_rows: 10
    })
})

In [13]:
# # data
# data = data.map(
#     lambda x: 
#     {'text': f"### Î™ÖÎ†πÏñ¥: {x['instruction']}\n\n###Îß•ÎùΩ: {x['input']}\n\n### ÎãµÎ≥Ä: {x['output']}<|endoftext|>" }
#     if x['input'] else 
#     {'text':f"### Î™ÖÎ†πÏñ¥: {x['instruction']}\n\n### ÎãµÎ≥Ä: {x['output']}<|endoftext|>"},
# )
# data
label_dict = {0 : 'ITÍ≥ºÌïô',  1 : 'Í≤ΩÏ†ú', 2 : 'ÏÇ¨Ìöå', 3 : 'ÏÉùÌôúÎ¨∏Ìôî', 4 : 'ÏÑ∏Í≥Ñ', 5 :'Ïä§Ìè¨Ï∏†', 6 : 'Ï†ïÏπò'}

data = data['train'].map(
    lambda x: {'text': f"### ÏßàÎ¨∏: {label_dict[x['label']]}\n\n### ÎãµÎ≥Ä: {x['text']}<|endoftext|>" }
)

Map:   0%|          | 0/45678 [00:00<?, ? examples/s]

In [14]:
data[0]

{'text': '### 질문: 정치\n\n### 답변: 유튜브 내달 2일까지 크리에이터 지원 공간 운영<|endoftext|>',
 'label': 6}
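
The record above shows the prompt layout used for fine-tuning: a `### 질문:` (question) section holding the topic label, then a `### 답변:` (answer) section holding the headline, closed by `<|endoftext|>`. As a minimal sketch, the same layout can be written as a plain function. `build_prompt` is a hypothetical helper, not part of `datasets` or `transformers`; leaving the answer out gives the open-ended form that is fed to the model at generation time.

```python
from typing import Optional

EOS = "<|endoftext|>"  # end-of-text marker appended to every training example


def build_prompt(question: str, answer: Optional[str] = None) -> str:
    """Format text with the '### 질문 / ### 답변' layout.

    With an answer: a complete training example ending in EOS.
    Without an answer: the open prompt used for generation.
    """
    prompt = f"### 질문: {question}\n\n### 답변:"
    if answer is None:
        return prompt
    return f"{prompt} {answer}{EOS}"


# Training-style example (label and headline taken from the record above).
print(build_prompt("정치", "유튜브 내달 2일까지 크리에이터 지원 공간 운영"))

# Inference-style prompt: the model is expected to continue after "### 답변:".
print(build_prompt("오늘 날씨는?"))
```

Keeping the training and inference strings in one function is a design choice that makes it harder for the two formats to drift apart.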

In [16]:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

model_id = "EleutherAI/polyglot-ko-12.8b"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map={"":0})

CUDA SETUP: CUDA runtime path found: /opt/conda/lib/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 7.0
CUDA SETUP: Detected CUDA version 110
CUDA SETUP: Required library version not found: libbitsandbytes_cuda110_nocublaslt.so. Maybe you need to compile it from source?
CUDA SETUP: Defaulting to libbitsandbytes_cpu.so...

CUDA SETUP: CUDA detection failed! Possible reasons:
1. CUDA driver not installed
2. CUDA not installed
3. You have multiple conflicting CUDA libraries
4. Required library not pre-compiled for this bitsandbytes release!
CUDA SETUP: If you compiled from source, try again with `make CUDA_VERSION=DETECTED_CUDA_VERSION` for example, `make CUDA_VERSION=113`.
CUDA SETUP: The CUDA version for the compile might depend on your conda install. Inspect CUDA version via `conda list | grep cuda`.

CUDA SETUP: Something unexpected happened. Please compile from source:
git clone git@github.com:TimDettmers/bitsandbytes.git
cd bitsandbytes
CUDA_VERSION=110 make

RuntimeError: 
        CUDA Setup failed despite GPU being available. Please run the following command to get more information:

        python -m bitsandbytes

        Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
        to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
        and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues

Next, we apply some preprocessing to the model to prepare it for training, using the `prepare_model_for_kbit_training` function from PEFT.

In [None]:
data = data.map(lambda samples: tokenizer(samples["text"]), batched=True)



In [None]:
from peft import prepare_model_for_kbit_training

model.gradient_checkpointing_enable()
model = prepare_model_for_kbit_training(model)

In [None]:
def print_trainable_parameters(model):
    """
    Prints the number of trainable parameters in the model.
    """
    trainable_params = 0
    all_param = 0
    for _, param in model.named_parameters():
        all_param += param.numel()
        if param.requires_grad:
            trainable_params += param.numel()
    print(
        f"trainable params: {trainable_params} || all params: {all_param} || trainable%: {100 * trainable_params / all_param}"
    )

In [None]:
from peft import LoraConfig, get_peft_model

config = LoraConfig(
    r=8, 
    lora_alpha=32, 
    target_modules=["query_key_value"], 
    lora_dropout=0.05, 
    bias="none", 
    task_type="CAUSAL_LM"
)

model = get_peft_model(model, config)
print_trainable_parameters(model)

trainable params: 6553600 || all params: 6608701440 || trainable%: 0.09916622894073424
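
The trainable-parameter count printed above can be reproduced by hand. LoRA adds two low-rank matrices to each targeted linear layer, `A` of shape `r × d_in` and `B` of shape `d_out × r`, so each layer contributes `r * (d_in + d_out)` new weights. The sketch below assumes the dimensions listed on the `polyglot-ko-12.8b` model card (40 layers, hidden size 5120, with the fused `query_key_value` projection mapping 5120 to 3 × 5120); these dimensions are assumptions that are not stated in this notebook.

```python
def lora_param_count(r: int, d_in: int, d_out: int, num_layers: int) -> int:
    """Weights added by LoRA: A is (r x d_in) and B is (d_out x r), per layer."""
    return r * (d_in + d_out) * num_layers


# Assumed model dimensions (from the model card, not from this notebook).
hidden_size = 5120
num_layers = 40

trainable = lora_param_count(
    r=8,                     # matches LoraConfig(r=8)
    d_in=hidden_size,
    d_out=3 * hidden_size,   # fused query, key and value projections
    num_layers=num_layers,
)
print(trainable)  # 6553600

# Share of all parameters, using the "all params" value printed above.
all_params = 6608701440
print(f"{100 * trainable / all_params:.4f}%")
```

With `bias="none"` no bias terms are trained, so the LoRA matrices account for the whole count.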


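The original walkthrough loads a common dataset, English quotes, at this point to fine-tune the model on famous quotes. This notebook instead trains on the `klue-tc-tsv` data loaded and formatted earlier.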

In [None]:
!nvidia-smi

Tue May 30 05:21:32 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   52C    P0    30W /  70W |   9435MiB / 15360MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

Run the cell below to start training! For the sake of the demo, we only run it for a few steps to showcase how this integration works with existing tools in the HF ecosystem.

In [None]:
import transformers

# needed for gpt-neo-x tokenizer
tokenizer.pad_token = tokenizer.eos_token

trainer = transformers.Trainer(
    model=model,
    train_dataset=data,  # `data` is already the train split after the map step above
    args=transformers.TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=1,
        # warmup_steps=200,
        max_steps=500,  # tiny training run only: 10 steps = just 20 samples
        learning_rate=2e-4,
        fp16=True,
        logging_steps=10,
        output_dir="outputs",
        optim="paged_adamw_8bit"
    ),
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
model.config.use_cache = False  # silence the warnings. Please re-enable for inference!
trainer.train()

Step,Training Loss
10,2.1921
20,2.0258
30,1.9715
40,1.9402
50,1.895
60,2.0488
70,1.9512
80,1.8986
90,1.9194
100,1.8066


TrainOutput(global_step=500, training_loss=1.9244559707641602, metrics={'train_runtime': 2828.6026, 'train_samples_per_second': 0.354, 'train_steps_per_second': 0.177, 'total_flos': 1.103736813379584e+16, 'train_loss': 1.9244559707641602, 'epoch': 0.05})
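
As a quick sanity check on the run above, the number of examples the trainer consumes on a single device is `max_steps × per_device_train_batch_size × gradient_accumulation_steps`. The sketch below plugs in the values from `TrainingArguments` and cross-checks them against the throughput reported in `TrainOutput`; the helper name `samples_seen` is illustrative.

```python
def samples_seen(max_steps: int, per_device_batch: int,
                 grad_accum: int = 1, num_devices: int = 1) -> int:
    """Number of training examples consumed over the whole run."""
    return max_steps * per_device_batch * grad_accum * num_devices


total = samples_seen(max_steps=500, per_device_batch=2, grad_accum=1)
print(total)  # 1000

# Cross-check with the metrics reported in TrainOutput.
train_runtime_s = 2828.6026
train_samples_per_second = 0.354
print(round(train_runtime_s * train_samples_per_second))  # 1001
```

The two figures agree up to the rounding of the reported throughput, so the 500 steps covered roughly 1,000 examples.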

In [None]:
print("wow")

wow


In [None]:
model.eval()
model.config.use_cache = True  # re-enable the KV cache for inference

In [None]:
# Prompt: "### Question: What is the weather today?"
model.generate(**tokenizer("### 질문: 오늘 날씨는?", return_tensors='pt', return_token_type_ids=False))

Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.


tensor([[   6,    6,    6, 2438,   29, 1832, 4770,  272,   34,  224,  202, 4588,
         4770,  272, 1382, 5674,   34, 5000, 9357,  272]])

In [None]:
def gen(x):
    gened = model.generate(
        **tokenizer(
            f"### 질문: {x}\n\n### 답변:",  # "### Question: {x}", then an open "### Answer:"
            return_tensors='pt', 
            return_token_type_ids=False
        ), 
        max_new_tokens=256,
        early_stopping=True,
        do_sample=True,
        eos_token_id=2,
    )
    print(tokenizer.decode(gened[0]))

In [None]:
gen('건강하게 살기 위한 세 가지 방법은?')  # "What are three ways to live a healthy life?"

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


'### 질문: 건강하게 살기 위한 세 가지 방법은?\n\n### 답변: 1. 균형있는 식사: 편식을 멀리하고, 채소, 과일, 생선을 섭취하면 건강해져요.\n\n2. 과식을 줄일 것: 과식하게 되면 위장이 활발하여져서 병이 나기 쉽게 되요.\n3. 적당한 스트레스: 스트레스를 쌓아두는 것은 내 안에 독이 쌓이는 것이라서 오히려 건강에 좋지 않아요.\n\n즉, 균형있는 식사, 적당한 스트레스 해소, 과식을 줄이면 우리 몸에 좋습니다. 이 세 가지를 습관화하면서 건강한 삶을 유지한다면 건강하게 살 수 있게 될 것입니다. \n\n위의 세 가지 방법은 초등학생 때부터 배우던 것이므로, 쉽게 실행할 수 있습니다. 모두 건강하게 사는 지름길이므로, 이를 실천하여 건강한 삶을 살 수 있도록 노력합시다.\n### 답변을 작성하는데, 약간의 시간이 걸렸습니다. 질문 내용에서도 알 수 있는 것처
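
The decoded text above contains the prompt itself and, after the answer, drifts into a new `###` section, because generation continues until `max_new_tokens` is reached or an end-of-sequence token is produced. A minimal post-processing sketch, assuming the `### 질문:` / `### 답변:` layout used in this notebook, keeps only the first answer. `extract_answer` is a hypothetical helper and is not part of `transformers`.

```python
ANSWER_MARKER = "### 답변:"
STOP_STRINGS = ("<|endoftext|>", "\n###")


def extract_answer(decoded: str) -> str:
    """Return the text after the first answer marker, cut at EOS or the next '###'."""
    _, found, tail = decoded.partition(ANSWER_MARKER)
    if not found:
        # No marker present: return the input unchanged apart from whitespace.
        return decoded.strip()
    for stop in STOP_STRINGS:
        tail = tail.split(stop, 1)[0]
    return tail.strip()


# Made-up string for illustration only, not real model output.
sample = "### 질문: 오늘 날씨는?\n\n### 답변: 맑습니다.\n### 답변을 작성하는데"
print(extract_answer(sample))  # 맑습니다.
```

Applied to the string returned by `tokenizer.decode`, this would drop both the echoed question and any trailing section the model starts after its answer.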

In [None]:
gen('슈카월드가 무엇인가요?')  # "What is Syuka World?"

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


'### ÏßàÎ¨∏: ÏäàÏπ¥ÏõîÎìúÍ∞Ä Î¨¥ÏóáÏù∏Í∞ÄÏöî?\n\n### ÎãµÎ≥Ä: ÏäàÏπ¥ÏõîÎìúÎäî Ïú†ÌäúÎ∏å Ïù∏Í∏∞ Î∞©ÏÜ°Ïù∏ÏúºÎ°ú, ÏµúÏ†ïÏÉÅ Í≤ΩÏ†ú Ïú†ÌäúÎ≤ÑÎ°ú ÌèâÍ∞ÄÎ∞õÍ≥† ÏûàÏäµÎãàÎã§. ÏäàÏπ¥ÏôÄ Ìï®ÍªòÌïòÎäî Í≤ΩÏ†úÏùò Î™®Îì† Í≤ÉÏù¥ÎùºÎäî ÌÉÄÏù¥ÌãÄÎ°ú ÏßÑÌñâÎêòÎäî Ïù¥ Ï±ÑÎÑêÏóêÏÑúÎäî Ï£ºÏãù, Í≤ΩÏ†úÏóê ÎåÄÌïú Îã§ÏñëÌïú Ïù¥ÏïºÍ∏∞Í∞Ä Ïò¨ÎùºÏò§Í≥† ÏûàÏäµÎãàÎã§. ÏäàÏπ¥ÏôÄ Ìï®ÍªòÌïòÎäî Í≤ΩÏ†úÏùò Î™®Îì† Í≤ÉÏùò Ï±ÑÎÑê ÎßÅÌÅ¨Îäî Îã§ÏùåÍ≥º Í∞ôÏäµÎãàÎã§.\nhttps://www.youtube.com/channel/UCZ6ny7P7lEIHN5bMljHyKQw ÏäàÏπ¥ÏõîÎìúwww.youtube.com Ïú†ÌäúÎ∏åÏóêÏÑúÎäî Í≤ΩÏ†úÏóê Í¥ÄÏã¨ÏùÑ Í∞ÄÏßÄÏãúÎäî Í≤ÉÏù¥ Ï¢ãÏßÄ ÏïäÏùÑÍπå Ïã∂ÏäµÎãàÎã§. Îî∞ÎùºÏÑú Í∏àÏúµ Í≤ΩÏ†úÏóê Í¥ÄÏã¨Ïù¥ ÎßéÏúºÏã§ Í≤É Í∞ôÏùÄÎç∞, ÏäàÏπ¥Í∞Ä Ïö¥ÏòÅÌïòÎäî ÏµúÏ†ïÏÉÅ Í≤ΩÏ†ú Ïú†ÌäúÎ∏å ÏäàÏπ¥ÏõîÎìúÎ•º Ï∂îÏ≤úÎìúÎ¶ΩÎãàÎã§. ÌäπÌûà ÏäàÏπ¥ÏõîÎìú ÏòÅÏÉÅÏù¥ ÏóÖÎ°úÎìúÎêòÎäî ÏõîÏöîÏùºÏùÄ Íº≠ Ï¶êÍ≤®Î≥¥ÏãúÍ∏∞ Î∞îÎûçÎãàÎã§. ÏõîÏöîÏùºÏóê ÏóÖÎ°úÎìúÎêòÎäî ÏäàÏπ¥ÏõîÎìú ÏòÅÏÉÅÏùÄ ÏäàÏπ¥ÏõîÎìúÏùò ÎåÄÌëú ÏΩîÎÑàÏù¥Î©∞, Îã§ÏñëÌïú Ïπ¥ÌÖåÍ≥†Î¶¨Ïóê ÎåÄÌï¥ Ïù¥ÏïºÍ∏∞ÌïòÍ≥† ÏûàÍ∏∞ ÎïåÎ¨∏Ïó

In [None]:
gen('공공성을 측정하기위한 방법은?')  # "How can publicness be measured?"

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


'### 질문: 공공성을 측정하기위한 방법은?\n\n### 답변: 공공성을 측정하기 위한 몇 가지 방법 중 하나는 공공성 척도를 구하는 것입니다. 측정 방법은 다음과 같습니다. \n① 사회성과 경제적 효과 - 경제적 이익과 사회·문화적 이익을 모두 고려하는 종합성과 기준에 따라 파악.\n\n② 정책성과 : 공공성의 평가 척도는 공공정책의 효과나 영향으로 인해 발생하는 혜택과 공공 정책에 대한 가치판단이나 선호에 직접적으로 연계된 정책적 요구 사항을 기준으로 합니다. \n\n③ 참여성과 공동체 의식 : 공공성의 평가 척도는 정책과 관련한 시민들의 직접참여와 공동체 의식, 시민들이 느끼는 문제나 욕구, 시민의 문제해결 과정에서의 경험이나 느낌 등에 대한 공동체 의식입니다. \n따라서 공공성이 높은 정책일수록 경제성과 사회·문화적 측면, 정책적 효과

In [None]:
gen('주식 시장에서 안정적으로 수익을 얻기 위한 방법은?')  # "How can one earn stable returns in the stock market?"

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


### 질문: 주식 시장에서 안정적으로 수익을 얻기 위한 방법은?

### 답변: 주식 시장에서 꾸준히 높은 수익을 얻기 위해서는 무엇보다도 종목 선택을 잘해야 합니다. 코스피 지수가 하락하면 수익률 관리가 필요하며, 이를 위해 투자 자산 배분 및 개별종목 선택이 중요합니다. 하지만 주식 시장에서 높은 수익을 얻기 위해서는 종목 선택이 중요하기 때문에, 본인의 투자 성향을 파악하여 신중한 투자를 해야합니다. 이를 위해서는 주식 투자에 대한 전문적인 분석과 기업 정보, 투자 전략 등을 수립해야 합니다. 이를 위해 개인이 직접 투자를 하기 보다는 전문가의 도움을 받아 투자하는 것이 좋습니다.

만약, 전문적인 분석이 어려울 경우 인공지능 서비스를 활용하면 개인도 쉽게 성공적인 투자를 할 수 있으니, 참고하시기 바랍니다. 또한, 투자를 할 때

In [None]:
# "What is the difference between put options and call options, and which position should an ordinary retail investor choose?"
gen('풋옵션과 콜옵션의 차이, 그리고 일반 개미 투자자가 선택해야 할 포지션은?')

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


### 질문: 풋옵션과 콜옵션의 차이, 그리고 일반 개미 투자자가 선택해야 할 포지션은?

### 답변: 풋이란 주식을 사는 것(구매)을 의미하고, 콜이란 주식을 파는(판매) 것입니다. 예를 들면, 기업의 주가가 10만원인데, 그 회사에서 신규 사업을 진행 중이라, 앞으로 매출이 증대되어 기업 가치가 상승할 것으로 판단해, 해당 회사의 주식을 사는 것이 풋옵션입니다. 이는 주식을 사는 시점과 해당 회사의 주가가 하락할 경우에는 기업의 주가와 함께 주가가 하락하는 리스크가 있습니다. 반면, 콜옵션은 주식을 파는 것입니다. 이는 주식을 파는 시점과 해당 회사의 주가가 증가해서 향후 기업 전망이 밝을 경우 주가가 상승하며, 해당 주식의 프리미엄을 받을 수 있는 안정적인 투자법입니다. 일반 개미 투자자들은 주식을 매수할 때와 주식을 매도할 때 다양한

In [None]:
# "What is the difference between selling a put option and buying a call option, and which position should an ordinary retail investor choose?"
gen('풋옵션 매도와 콜옵션 매수의 차이, 그리고 일반 개미 투자자가 선택해야 할 포지션은?')

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


### 질문: 풋옵션 매도와 콜옵션 매수의 차이, 그리고 일반 개미 투자자가 선택해야 할 포지션은?

### 답변: 풋오버(풋을 매도하고 그보다 상대적으로 가격이 낮은 콜을 사는 것)를 한다고 해서 더 높은 가격의 콜을 산다는 의미는 아닙니다. 더 낮은 가격의 콜을 사는 것이 기본입니다. 그러나 풋을 사는 것은 만기까지 보유하지 않고 풋 매도 포지션은 팔 수 있기 때문에 단기적으로 낮은 가격의 콜을 매수하는 것보다 높은 가격의 콜을 매수할 수 있는 선택의 범위가 있다고 할 수 있습니다. 이에 비해 콜오버(콜을 매도하고 그보다 상대적으로 가격이 낮은 풋을 사는 것)를 하면 만기까지 보유해야 하기 때문에 단기적으로 높은 가격의 풋을 사거나 콜을 살 수는 없지만, 만기까지 주가가 상승한다면 풋오버는 콜오버에 비해 높은 수익을 낼 가능성이 높습니

In [None]:
# "Why do margin calls happen, and how can a margin call be prevented?"
gen("마진콜이 발생하는 이유가 뭐야? 그리고 어떻게 해야 마진콜을 막을 수 있어?")

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


### 질문: 마진콜이 발생하는 이유가 뭐야? 그리고 어떻게 해야 마진콜을 막을 수 있어?

### 답변: 마진콜이란, 선물/옵션 매매 시 증거금이 부족해서 추가증거금 예치를 요구받는 것을 말합니다. 추가증거금은 선물/옵션 거래 계약시 미리 정해진 약정증거금에서 추가로 부담해야 하는 증거금을 의미합니다. 통상적으로 매수 계약 시에는 계약 금액의 10%를 증거금으로 예치하고, 매도 계약 시에는 해당 금액의 3%를 예치합니다. 따라서, 매수 계약 시에는 총 20%의 증거금이 필요하고, 매도 계약 시에는 총 9%의 증거금이 필요합니다. 따라서, 선물 매매 중에는 매일 매일의 시세에 따라 증거금을 수시로 필요하면 추가로 납부해야해서 추가증거금이 발생하며, 해당 필요액 부족시에는 마진콜이 발생합니다. 따라서, 선물 거래를 하기 전에는 꼭 추가증거금을

In [None]:
# ### 질문: 마진콜이 발생하는 이유가 뭐야? 그리고 어떻게 해야 마진콜을 막을 수 있어?

# ### 답변: 마진콜이란, 선물/옵션 매매 시 증거금이 부족해서 추가증거금 예치를 요구받는 것을 말합니다.
# 추가증거금은 선물/옵션 거래 계약시 미리 정해진 약정증거금에서 추가로 부담해야 하는 증거금을 의미합니다.
# 통상적으로 매수 계약 시에는 계약 금액의 10%를 증거금으로 예치하고, 매도 계약 시에는 해당 금액의 3%를 예치합니다.
# 따라서, 매수 계약 시에는 총 20%의 증거금이 필요하고, 매도 계약 시에는 총 9%의 증거금이 필요합니다.
# 따라서, 선물 매매 중에는 매일 매일의 시세에 따라 증거금을 수시로 필요하면 추가로 납부해야해서 추가증거금이 발생하며, 해당 필요액 부족시에는 마진콜이 발생합니다.
# 따라서, 선물 거래를 하기 전에는 꼭 추가증거금을 예치하는 것이 중요합니다.
# 추가증거금은 '선물/옵션 -> 증거금 -> 추가예치/청산' 방법으로 조회 가능하며, 인터넷이나 전화로도 조회할 수 있습니다.
# 또, '주문/증거금 -> 추가증거금'을 통해 추가증거금을 입금할 수도 있습니다. 추가증거금을 예치