First refactor train #3

Ssukriti · 2023-12-26T22:00:56Z

As part of the first refactor, I have created a train() function that accepts predefined dataclasses as arguments instead of taking every possible parameter individually.
This was done for the following reasons:

more readability - it is now clear exactly which arguments the train function accepts
easier parameter validation + usability - since users have to pass either the LoraConfig dataclass or the prompt tuning dataclass, they know upfront which parameters are relevant for each type
modular code - rather than having everything in one big main function, we can have train, load and run in the future
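
The shape described above can be sketched with the standard library alone. The dataclass fields, the train() signature and the returned summary dict are illustrative assumptions, not the code in this PR; the real function hands these objects to the trainer.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class ModelArguments:
    model_name_or_path: str


@dataclass
class DataArguments:
    data_path: str
    dataset_text_field: str = "output"


@dataclass
class TrainingArguments:
    output_dir: str
    gradient_accumulation_steps: int = 4


@dataclass
class LoraConfig:
    r: int = 8
    lora_alpha: int = 32
    target_modules: List[str] = field(default_factory=lambda: ["q_proj", "v_proj"])


@dataclass
class PromptTuningConfig:
    num_virtual_tokens: int = 8  # hypothetical field, for illustration only


def train(
    model_args: ModelArguments,
    data_args: DataArguments,
    train_args: TrainingArguments,
    peft_config: Optional[Union[LoraConfig, PromptTuningConfig]] = None,
) -> dict:
    """Validate the inputs and report what would be trained (no real training)."""
    if train_args.gradient_accumulation_steps <= 0:
        raise ValueError("gradient_accumulation_steps must be greater than 0")
    if peft_config is not None and not isinstance(
        peft_config, (LoraConfig, PromptTuningConfig)
    ):
        raise TypeError(f"unsupported tuning config: {type(peft_config).__name__}")
    # None means plain fine tuning; otherwise the config type names the method.
    method = "fine-tuning" if peft_config is None else type(peft_config).__name__
    return {
        "model": model_args.model_name_or_path,
        "data": data_args.data_path,
        "method": method,
    }
```

Because each tuning method has its own dataclass, a caller only sees the parameters that apply to the method they picked, and validation can live in one place.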

The main function reads all arguments from the command line and passes the relevant params to train() as an example.

I will move main() to an example script going forward. We can continue to call that main() function with command line arguments, or let it serve as a reference for users who want to call the train() function directly.
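
One way such a main() could turn command line flags into dataclasses and forward them to train() is sketched below. It is a standard-library stand-in written for illustration (transformers ships HfArgumentParser with a similar parse_args_into_dataclasses method); parse_into_dataclasses, the trimmed dataclasses and the train() stub are assumptions, not this repo's code.

```python
import argparse
from dataclasses import MISSING, dataclass, fields


@dataclass
class ModelArguments:
    model_name_or_path: str
    torch_dtype: str = "float32"


@dataclass
class TrainingArguments:
    output_dir: str
    num_train_epochs: int = 1
    learning_rate: float = 1e-5


def parse_into_dataclasses(dataclass_types, argv=None):
    """Create one --flag per dataclass field, then rebuild each dataclass."""
    parser = argparse.ArgumentParser()
    for dc in dataclass_types:
        for f in fields(dc):
            required = f.default is MISSING
            # f.type is str/int/float here, so it doubles as the converter.
            # bool and list fields would need dedicated handling.
            parser.add_argument(
                f"--{f.name}",
                type=f.type,
                required=required,
                default=None if required else f.default,
            )
    namespace = parser.parse_args(argv)
    return tuple(
        dc(**{f.name: getattr(namespace, f.name) for f in fields(dc)})
        for dc in dataclass_types
    )


def train(model_args, train_args):
    """Stub standing in for the real train(); it only echoes its inputs."""
    return {
        "model": model_args.model_name_or_path,
        "epochs": train_args.num_train_epochs,
    }


def main(argv=None):
    model_args, train_args = parse_into_dataclasses(
        (ModelArguments, TrainingArguments), argv
    )
    return train(model_args, train_args)
```

Calling main() with no argument reads sys.argv, so the same function serves both as the command line entry point and as a reference for calling train() directly.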

Usage of the script has not changed in this PR. All code changes are structural only, and I verified that pt, lora and ft work the same way as they do on the main branch.

python tuning/sft_trainer.py  \
--model_name_or_path $MODEL_PATH  \
--data_path $DATA_PATH  \
--output_dir $OUTPUT_PATH  \
--num_train_epochs 5  \
--per_device_train_batch_size 4  \
--per_device_eval_batch_size 4  \
--gradient_accumulation_steps 4  \
--evaluation_strategy "no"  \
--save_strategy "epoch"  \
--learning_rate 1e-5  \
--weight_decay 0.  \
--warmup_ratio 0.03  \
--lr_scheduler_type "cosine"  \
--logging_steps 1  \
--include_tokens_per_second  \
--packing False  \
--response_template "\n### Label:"  \
--dataset_text_field "output" \
--use_flash_attn False  \
--tokenizer_name_or_path $MODEL_PATH \
--torch_dtype "float32" \
--peft_method None

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

Ssukriti · 2023-12-26T22:05:08Z

tuning/config/peft_config.py

 @dataclass
-class lora_config:
+class LoraConfig:


Dataclass names should be CamelCase as per Python naming conventions. I want to add a black formatter to this repo going forward, and it would fail without this change. To handle name resolution with HF peft, I have renamed the file to peft_config. In code we would refer to it as peft_config.LoraConfig.
If there is a need to further distinguish the two in the future, we can rename ours to something like CustomLoraConfig.

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

Ssukriti · 2023-12-28T05:36:26Z

tuning/config/configs.py

@@ -30,7 +30,6 @@ class DataArguments:

 @dataclass
 class TrainingArguments(transformers.TrainingArguments):
-    peft_method: str = "lora"  # None, pt


I got rid of this. Users can pass the relevant peft config object to train(), and it will be passed directly to the trainer.
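
Since the command line example still passes --peft_method, a caller such as main() presumably has to turn that string into a config object before calling train(). A minimal sketch of that mapping follows; the helper name tuning_config_from_method and the dataclass fields are hypothetical, and the accepted values ("lora", "pt", None) are taken from the removed line and the usage command.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class LoraConfig:
    r: int = 8
    lora_alpha: int = 32


@dataclass
class PromptTuningConfig:
    num_virtual_tokens: int = 8  # hypothetical field, for illustration only


def tuning_config_from_method(
    peft_method: Optional[str],
) -> Optional[Union[LoraConfig, PromptTuningConfig]]:
    """Map the old string flag to a config object; None means fine tuning."""
    # The CLI delivers the literal string "None", so accept both spellings.
    if peft_method is None or peft_method == "None":
        return None
    if peft_method == "lora":
        return LoraConfig()
    if peft_method == "pt":
        return PromptTuningConfig()
    raise ValueError(f"unknown peft_method: {peft_method!r}")
```

With this in place the string flag stays a command line concern, and train() only ever sees a typed config object or None.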

Ssukriti · 2023-12-28T05:37:24Z

tuning/config/peft_config.py

    r: int = 8
    lora_alpha: int = 32
    target_modules: List[str] = field(default_factory=lambda: ["q_proj", "v_proj"])
    bias = "none"
-    task_type: str = "CAUSAL_LM"


Since the repo only supports causal LMs, we need not expose task_type to the user as an argument.

Ssukriti · 2023-12-28T05:41:49Z

tuning/utils/config_utils.py

-        config = prompt_tuning_config()
-        update_config(config, **kwargs)
-        peft_config = PromptTuningConfig(**asdict(config))
+def get_hf_peft_config(task_type, tuning_config):


Since train() now accepts the peft_config objects, it makes sense to use those to get the corresponding HF peft config.
The earlier functionality of passing all kwargs has been moved to the create_tuning_config utility, which can be combined with get_hf_peft_config if needed.
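
The two-step flow described here can be sketched as follows. Both helpers are hypothetical stand-ins with sketch_ prefixes: the real create_tuning_config signature is not shown in this PR excerpt, and instead of constructing an HF peft object the second helper returns the keyword arguments that would be passed to one, so the example runs without peft installed.

```python
from dataclasses import asdict, dataclass, field, fields
from typing import List


@dataclass
class LoraConfig:
    r: int = 8
    lora_alpha: int = 32
    target_modules: List[str] = field(default_factory=lambda: ["q_proj", "v_proj"])
    bias: str = "none"


def sketch_create_tuning_config(config_cls, **kwargs):
    """Build a tuning dataclass from loose kwargs, ignoring unrelated keys."""
    known = {f.name for f in fields(config_cls)}
    return config_cls(**{k: v for k, v in kwargs.items() if k in known})


def sketch_hf_peft_kwargs(task_type, tuning_config):
    """Return kwargs for the matching HF peft config, or None for fine tuning."""
    if tuning_config is None:
        return None
    # task_type is supplied by the caller rather than stored on the dataclass,
    # mirroring its removal from the user-facing config.
    return {**asdict(tuning_config), "task_type": task_type}
```

For a LoraConfig, the resulting dict could be expanded into peft.LoraConfig(**kwargs) when peft is available, since r, lora_alpha, target_modules, bias and task_type are all parameters it accepts.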

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

raghukiran1224

lgtm

Ssukriti added 3 commits December 26, 2023 12:19

first refactor on train

899a64d

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

fix:None peft type for fine tuning

f13843f

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

accept None as tuningconfig

fd38ffb

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

Ssukriti requested review from raghukiran1224 and lchu-ibm December 26, 2023 22:00

Ssukriti added 2 commits December 26, 2023 15:16

correct docstrings

b544203

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

refactor of get_hf_peft_config

d051a82

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

Ssukriti mentioned this pull request Dec 28, 2023

validate parameters and edge cases #4

Closed

Ssukriti linked an issue Dec 28, 2023 that may be closed by this pull request

validate parameters and edge cases #4

Closed

Ssukriti added 3 commits December 28, 2023 11:50

semantics for python 3.9

b87a383

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

support python 3.9

4c1b423

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

check accumulate steps>0

03d7f4c

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

raghukiran1224 approved these changes Jan 2, 2024

Ssukriti merged commit 49223e8 into main Jan 2, 2024

kpouget pushed a commit to kpouget/fms-hf-tuning that referenced this pull request Jul 19, 2024

Merge pull request foundation-model-stack#3 from red-hat-data-service…

3e56600

…s/konflux updating konflux pipeline timeout
