In [29]:
!huggingface-cli login --token hf_xxxxxx

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Traceback (most recent call last):
  File "/cpfs/user/boyuan/miniconda3/envs/pretrained/lib/python3.10/site-packages/huggingface_hub/utils/_http.py", line 409, in hf_raise_for_status
    response.raise_for_status()
  File "/cpfs/user/boyuan/miniconda3/envs/pretrained/lib/python3.10/site-packages/requests/models.py", line 1026, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/api/whoami-v2

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/cpfs/user/boyuan/miniconda3/envs/pretrained/lib/python3.10/site-packages/huggingface_hub/hf_api.py", line 1746, in whoami
    hf_raise_for_status(r)
 

In [30]:
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "bigscience/bloomz-560m"
#model_name="bigscience/bloom-1b1"

tokenizer = AutoTokenizer.from_pretrained(model_name)
foundation_model = AutoModelForCausalLM.from_pretrained(model_name)

In [31]:
# A simple inference function
def get_outputs(model, inputs, max_new_tokens=100):
    outputs = model.generate(
        input_ids=inputs["input_ids"].to(model.device),
        attention_mask=inputs["attention_mask"].to(model.device),
        max_new_tokens=max_new_tokens,
        repetition_penalty=1.5,  # Discourages the model from repeating itself; the default is 1.0
        eos_token_id=tokenizer.eos_token_id
    )
    return outputs
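
To make the effect of `repetition_penalty=1.5` concrete, here is a minimal pure-Python sketch of the penalty rule as commonly described for this option: scores of tokens that already appear in the sequence are divided by the penalty when positive and multiplied by it when negative. The helper name `apply_repetition_penalty` and the toy logits are hypothetical and only illustrate the idea; the library applies the equivalent operation to tensors inside `generate`.

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.5):
    """Return a copy of `logits` with already-seen tokens made less likely."""
    adjusted = list(logits)
    for token_id in set(generated_ids):
        score = adjusted[token_id]
        # Positive scores shrink toward zero; negative scores move further below zero.
        adjusted[token_id] = score / penalty if score > 0 else score * penalty
    return adjusted


logits = [3.0, -1.0, 0.5, 2.0]  # toy scores for a 4-token vocabulary
seen = [0, 1]                   # token ids already present in the sequence
penalized = apply_repetition_penalty(logits, seen, penalty=1.5)
print(penalized)
```

With a penalty of 1.0 the scores are unchanged, which is why 1.0 is the neutral default.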

In [32]:
# Test the inference function
input_sentences = tokenizer("I love this movie because", return_tensors="pt")
foundational_outputs_sentence = get_outputs(foundation_model, input_sentences, max_new_tokens=50)

print(tokenizer.batch_decode(foundational_outputs_sentence, skip_special_tokens=True))

['I love this movie because it is so funny and I am sure that my friends will enjoy too']


In [33]:
from datasets import load_dataset
dataset = "noob123/imdb_review_3000"

# Load the dataset and tokenize the reviews.
data = load_dataset(dataset)
data = data.map(lambda samples: tokenizer(samples['review']), batched=True)
# train_sample = data["train"].select(range(50))
train_sample = data["train"]


train_sample = train_sample.remove_columns('sentiment')

display(train_sample)

Dataset({
    features: ['review', 'input_ids', 'attention_mask'],
    num_rows: 2999
})

In [34]:
print(train_sample[:1])

{'review': ["One of the other reviewers has mentioned that after watching just 1 Oz episode you'll be hooked. They are right, as this is exactly what happened with me.<br /><br />The first thing that struck me about Oz was its brutality and unflinching scenes of violence, which set in right from the word GO. Trust me, this is not a show for the faint hearted or timid. This show pulls no punches with regards to drugs, sex or violence. Its is hardcore, in the classic use of the word.<br /><br />It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary. It focuses mainly on Emerald City, an experimental section of the prison where all the cells have glass fronts and face inwards, so privacy is not high on the agenda. Em City is home to many..Aryans, Muslims, gangstas, Latinos, Christians, Italians, Irish and more....so scuffles, death stares, dodgy dealings and shady agreements are never far away.<br /><br />I would say the main appeal of the show is d

In [35]:
import peft
from peft import LoraConfig, get_peft_model, PeftModel

lora_config = LoraConfig(
    r=4,  # Rank of the update matrices; a larger r means more trainable parameters.
    lora_alpha=1,  # Scaling factor: the LoRA update is multiplied by lora_alpha / r (0.25 here).
    target_modules=["query_key_value"],  # Modules to adapt; a list of target modules per model is available at the URL above.
    lora_dropout=0.05,  # Dropout on the LoRA path; helps reduce overfitting.
    bias="lora_only",  # Train only the bias terms of the LoRA-adapted layers.
    task_type="CAUSAL_LM"
)
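
As a rough illustration of what `r` and `lora_alpha` control, the sketch below builds the effective weight `W + (lora_alpha / r) * B @ A` for a toy 2x2 layer in pure Python. The helper names `matmul` and `lora_effective_weight` and the toy matrices are hypothetical; this shows the LoRA formula under those assumptions, not PEFT's actual implementation.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, lora_a, lora_b, r, lora_alpha):
    """Frozen weight plus the scaled low-rank update (lora_alpha / r) * B @ A."""
    scale = lora_alpha / r
    delta = matmul(lora_b, lora_a)  # (out x r) @ (r x in) -> (out x in)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Toy example: 2x2 frozen weight, rank-1 adapter.
w = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[1.0, 2.0]]    # shape (r, in_features)
lora_b = [[4.0], [0.0]]  # shape (out_features, r)
w_eff = lora_effective_weight(w, lora_a, lora_b, r=1, lora_alpha=0.5)
print(w_eff)

# Scale implied by the config in this notebook: lora_alpha=1, r=4.
notebook_scale = 1 / 4
print(notebook_scale)
```

With `lora_alpha=1` and `r=4` the update is scaled by 0.25, so the adapter's contribution is damped relative to a configuration where `lora_alpha` equals `r`.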

In [36]:
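peft_model = get_peft_model(foundation_model, lora_config)
# print_trainable_parameters() prints its summary itself and returns None,
# so it does not need to be wrapped in print().
peft_model.print_trainable_parameters()

trainable params: 393,216 || all params: 559,607,808 || trainable%: 0.07026635339584111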
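
The trainable-parameter count can be reproduced by hand. The sketch below assumes a hidden size of 1024 and 24 transformer layers for `bigscience/bloomz-560m` (values not shown in this notebook; they could be checked against `foundation_model.config`), and that `query_key_value` is a single fused projection from `hidden_size` to `3 * hidden_size`.

```python
# Assumed architecture values for bigscience/bloomz-560m (not printed in the notebook).
hidden_size = 1024
num_layers = 24
r = 4  # LoRA rank from lora_config

in_features = hidden_size
out_features = 3 * hidden_size  # fused query/key/value projection

# Each adapted layer adds A with shape (r, in) and B with shape (out, r).
params_per_layer = r * in_features + out_features * r
trainable = params_per_layer * num_layers

all_params = 559_607_808  # taken from the printed summary
pct = 100 * trainable / all_params
print(trainable, pct)
```

Under these assumptions the A and B matrices alone account for the 393,216 trainable parameters reported in the summary.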


In [37]:
import os
working_dir = './'

output_directory = os.path.join(working_dir, "peft_lab_outputs")

In [38]:
import transformers
from transformers import TrainingArguments, Trainer
training_args = TrainingArguments(
    output_dir=output_directory,
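    auto_find_batch_size=True,  # Retry with a smaller batch size if the current one runs out of memory.
    learning_rate=3e-2,  # Much higher than typical full fine-tuning rates; the loss logged below rises from about 4.3 to above 10, which suggests this value is too high here.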
    num_train_epochs=5,
    use_cpu=False,
    report_to="wandb",
    run_name="lora-bloomz-peft",
    logging_strategy="steps",  # Log at a fixed step interval
    logging_steps=10,
)

In [39]:
trainer = Trainer(
    model=peft_model,
    args=training_args,
    train_dataset=train_sample,
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False)
)
trainer.train()
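
For readers unfamiliar with `DataCollatorForLanguageModeling(tokenizer, mlm=False)`, the following pure-Python sketch shows the behaviour it is generally understood to have for causal language modeling: sequences are padded to a common length and the labels are a copy of the input ids with padding positions set to -100 so the loss ignores them. The function name `collate_causal_lm` is hypothetical, and right padding is used only for simplicity; the real tokenizer may pad on the other side.

```python
IGNORE_INDEX = -100  # label value that the cross-entropy loss skips


def collate_causal_lm(batch_input_ids, pad_token_id):
    """Pad a batch and derive causal-LM labels from the input ids."""
    max_len = max(len(ids) for ids in batch_input_ids)
    input_ids, attention_mask, labels = [], [], []
    for ids in batch_input_ids:
        n_pad = max_len - len(ids)
        padded = ids + [pad_token_id] * n_pad
        input_ids.append(padded)
        attention_mask.append([1] * len(ids) + [0] * n_pad)
        # Labels mirror the inputs; pad tokens are excluded from the loss.
        labels.append([IGNORE_INDEX if t == pad_token_id else t for t in padded])
    return {"input_ids": input_ids, "attention_mask": attention_mask, "labels": labels}


batch = collate_causal_lm([[5, 6, 7], [8, 9]], pad_token_id=3)
print(batch["labels"])
```

The shift between inputs and targets (predicting token t+1 from tokens up to t) happens inside the model's loss computation, not in the collator.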



Step,Training Loss
10,4.3339
20,4.8152
30,9.2833
40,10.3511
50,10.3785
60,10.2219


Step,Training Loss
10,10.7808
20,10.8821
30,11.427
40,11.0117
50,11.3711
60,11.4416
70,11.346
80,11.509
90,11.1681
100,11.1441




TrainOutput(global_step=1875, training_loss=19.3436814453125, metrics={'train_runtime': 733.9253, 'train_samples_per_second': 20.431, 'train_steps_per_second': 2.555, 'total_flos': 1.849129512357888e+16, 'train_loss': 19.3436814453125, 'epoch': 5.0})
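
The numbers in `TrainOutput` can be cross-checked against the dataset size. This sketch assumes a single device and no gradient accumulation (neither is stated in the notebook), and recovers the effective batch size from the reported throughput.

```python
import math

# Values copied from the outputs above.
num_rows = 2999
num_epochs = 5
global_step = 1875
samples_per_second = 20.431
steps_per_second = 2.555

steps_per_epoch = global_step // num_epochs
# Samples processed per optimizer step, rounded to the nearest integer.
batch_size = round(samples_per_second / steps_per_second)

print(steps_per_epoch, batch_size)
# The last batch of each epoch is partial, hence the ceiling.
print(math.ceil(num_rows / batch_size) == steps_per_epoch)
```

Under those assumptions, 375 steps per epoch over 2,999 rows is consistent with a batch size of 8.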

In [40]:
peft_model_path = os.path.join(output_directory, "lora_model")

trainer.model.save_pretrained(peft_model_path)

In [41]:
loaded_model = PeftModel.from_pretrained(foundation_model, peft_model_path, is_trainable=False)
input_sentences = tokenizer("I love this movie because", return_tensors="pt")
loaded_model_outputs_sentence = get_outputs(loaded_model, input_sentences, max_new_tokens=100)

print(tokenizer.batch_decode(loaded_model_outputs_sentence, skip_special_tokens=True))

['I love this movie because -others covesitoa segueixen ,ˈutipankar adequat Skor negatiu riba Eks Firman nge midaurangiakuwa pah Sulaimanokea kag emos Psik Energi Merah Dini karen Keg complicat tancats Maaf allah arti Capella PBB Menjiamatajian vaidade Kemb kaw Statistik Uji roti waw mínim Alami Prodi Lain Milikwendoreen salat usus daur Eka Naik Kabar Komun Fokus Artikel drafurangən fes eficient Beli Sandi Bawa optimis yuk ambição trista illaakari sumb tand Jaga Malah Teror abu spes Fitur Pujoltzaile contribueix compten oli Jl ilm ganja Makan keny lempunjung fillsoreng interessant']
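
One way to read this output together with the training log is to compare the logged losses with the cross-entropy of a model that guesses uniformly over the vocabulary, which is `ln(vocab_size)`. The sketch below assumes a vocabulary size of 250,880 for BLOOM (not shown in this notebook; `foundation_model.config.vocab_size` would confirm it).

```python
import math

vocab_size = 250_880  # assumed BLOOM vocabulary size, not printed in the notebook
uniform_loss = math.log(vocab_size)  # cross-entropy of a uniform guess, in nats

first_logged_loss = 4.3339            # step 10 of the first logged run
late_logged_loss = 11.1441            # step 100 of the last logged run
average_train_loss = 19.3436814453125  # from TrainOutput

print(round(uniform_loss, 2))
for name, value in [
    ("first logged", first_logged_loss),
    ("late logged", late_logged_loss),
    ("run average", average_train_loss),
]:
    relation = "above" if value > uniform_loss else "below"
    print(f"{name}: {value:.2f} is {relation} the uniform baseline")
```

Under that assumption, the loss starts well below the uniform baseline, climbs to just under it within the first hundred steps, and averages above it over the whole run, which is consistent with the incoherent text generated here. This suggests the fine-tuning run diverged; lowering `learning_rate` is a common first thing to try, though that is untested in this notebook.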
