[AutoTuner] Add auto tuner to obtain optimal configuration #54460
Conversation
Your PR has been submitted successfully. Thank you for your contribution to the open-source project!
Force-pushed from 90ea799 to 6d47943
LGTM for set_tests_properties(test_auto_tuner PROPERTIES LABELS "RUN_TYPE=EXCLUSIVE" TIMEOUT 100)
LGTM, you can refine the code based on the comments in the next PR.
process = subprocess.Popen(cmd)
process.wait()
self.assertEqual(process.returncode, 0)
Check the config searched?
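A minimal sketch of such a check, assuming the tuner dumps the selected configuration to a JSON file after the search; the file name and keys below are hypothetical, not the PR's actual output:

```python
import json
import os

# Hypothetical follow-up assertions for the test above: verify that a result
# file exists and that the searched values stay inside the declared space.
best_cfg_path = "./auto_tuner_best_cfg.json"  # assumed output path
self.assertTrue(os.path.exists(best_cfg_path))
with open(best_cfg_path) as f:
    best_cfg = json.load(f)
self.assertIn(best_cfg["dp_degree"], [1, 2, 4, 8])
self.assertIn(best_cfg["mp_degree"], [1, 2, 4, 8])
```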
import copy
import json
import signal
import sys
import time

from ..auto_tuner.tuner import AutoTuner
from ..auto_tuner.utils import gen_new_args
from . import controllers
Better to import these at the top of the file.
cur_cfg = auto_tuner.search_once()

# get max time per task run
max_time_per_task = tuner_cfg.get("max_time_per_task", 1800)
max_time_per_task -> max_time_in_seconds_per_task?
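Whatever the option is finally named, a common way to enforce such a per-task budget is a timeout on the launched trial process; a minimal sketch (not necessarily the PR's actual implementation), reusing the `tuner_cfg` and `cmd` from the snippets above:

```python
import subprocess

# Illustrative only: bound how long we wait for one trial before killing it.
max_time_per_task = tuner_cfg.get("max_time_per_task", 1800)  # seconds
process = subprocess.Popen(cmd)
try:
    process.wait(timeout=max_time_per_task)
except subprocess.TimeoutExpired:
    process.kill()   # give up on this trial
    process.wait()   # reap the killed process
```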
def __init__(self, tuner_cfg):
    self.cur_task_id = 1
    self.task_limit = tuner_cfg.get("task_limit", 100)
DEFAULT_MAX_TASK_LIMIT = 100 ?
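That is, hoist the magic number into a named module-level default, roughly like this (the class name is a placeholder):

```python
DEFAULT_MAX_TASK_LIMIT = 100  # named default instead of an inline magic number


class TaskController:  # placeholder name for the class shown above
    def __init__(self, tuner_cfg):
        self.cur_task_id = 1
        self.task_limit = tuner_cfg.get("task_limit", DEFAULT_MAX_TASK_LIMIT)
```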
PR types
New features
PR changes
Others
Description
Pcard-72023
Finding the optimal configuration for distributed training/inference of large models usually requires designing multiple sets of experiments based on experience (network, parameter size, GPU memory, FLOPs, etc.) and comparing the results. This process relies heavily on human expertise, and the configuration it produces may not be the global optimum. Whenever any condition changes, the whole process has to be repeated, which hurts the usability of large models.
To address the above issues, we have implemented AutoTuner based on Profiling, with the main modules as follows:
At present, grid search is built in over 8 dimensions: dp degree, mp degree, pp degree, micro batch size (mbs), sharding degree, sharding stage, recompute, and recompute granularity. The example JSON is as follows:
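(A sketch of what such a JSON might look like; the keys and candidate values below are illustrative assumptions, not the exact schema shipped in this PR.)

```json
{
    "dp_degree": "auto",
    "mp_degree": [1, 2, 4, 8],
    "pp_degree": [1, 2, 4],
    "micro_batch_size": [1, 2, 4, 8],
    "sharding_degree": "auto",
    "sharding_stage": [1, 2, 3],
    "use_recompute": [true, false],
    "recompute_granularity": ["full", "full_attn", "core_attn"],
    "task_limit": 100,
    "max_time_per_task": 1800
}
```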
The usage is as follows:
python -m paddle.distributed.launch --devices "0,1,2,3,4,5,6,7" --auto_tuner_json=test.json your_train.py your_args
NOTE: Since the auto_tuner is non-invasive, users need to expose the corresponding args in their training script so that the configuration generated by auto_tuner can be applied.
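For illustration, the training script could expose the tunable options as ordinary command-line arguments so that the values generated by auto_tuner can be passed through; the argument names below are an assumption and only need to match the mapping used by `gen_new_args`:

```python
# your_train.py -- hypothetical sketch of exposing tunable args
import argparse


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("--dp_degree", type=int, default=1)
    parser.add_argument("--mp_degree", type=int, default=1)
    parser.add_argument("--pp_degree", type=int, default=1)
    parser.add_argument("--micro_batch_size", type=int, default=1)
    parser.add_argument("--sharding_degree", type=int, default=1)
    parser.add_argument("--sharding_stage", type=int, default=1)
    # exact flag form depends on how gen_new_args emits boolean options
    parser.add_argument("--use_recompute", action="store_true")
    parser.add_argument("--recompute_granularity", type=str, default="full")
    return parser.parse_args()


if __name__ == "__main__":
    args = parse_args()
    # ... build the distributed strategy / model from args and train ...
```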