[Train] Update docstring and user guides for train_loop_config
#43691
Merged: matthewdeng merged 9 commits into ray-project:master from woshiyyya:update_train_loop_config on Mar 8, 2024
Commits (9, all by woshiyyya)
df46a30 update docstring for train_loop_config
1f30c72 updating
f8644ed replace note with warning
6aa01bd remove all config arguments in train_func()
5c6e520 take out the train_func configuration into a separate doc
6e55db7 Update torch-configure-train_func.rst
6ce05f3 Merge branch 'master' into update_train_loop_config
0a5cfb5 update
22d1177 Merge remote-tracking branch 'origin/update_train_loop_config' into u…
@@ -0,0 +1,51 @@
First, update your training code to support distributed training.
Begin by wrapping your code in a :ref:`training function <train-overview-training-function>`:

.. testcode::
    :skipif: True

    def train_func():
        # Your model training code here.
        ...

Each distributed training worker executes this function.

You can also specify the input argument for `train_func` as a dictionary via the Trainer's `train_loop_config`. For example:

.. testcode:: python
    :skipif: True

    def train_func(config):
        lr = config["lr"]
        num_epochs = config["num_epochs"]

    config = {"lr": 1e-4, "num_epochs": 10}
    trainer = ray.train.torch.TorchTrainer(train_func, train_loop_config=config, ...)

.. warning::

    Avoid passing large data objects through `train_loop_config` to reduce the
    serialization and deserialization overhead. Instead, it's preferred to
    initialize large objects (e.g. datasets, models) directly in `train_func`.

    .. code-block:: diff

         def load_dataset():
             # Return a large in-memory dataset
             ...

         def load_model():
             # Return a large in-memory model instance
             ...

        -config = {"data": load_dataset(), "model": load_model()}

         def train_func(config):
        -    data = config["data"]
        -    model = config["model"]

        +    data = load_dataset()
        +    model = load_model()
             ...

         trainer = ray.train.torch.TorchTrainer(train_func, train_loop_config=config, ...)
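The warning in the diff above can be made concrete with a minimal sketch that compares the serialized size of a config embedding a large object against one carrying only small values. It does not use Ray: the standard-library `pickle` stands in for whatever serializer is applied to `train_loop_config`, and `large_config`, `small_config`, the `num_rows` key, and this version of `load_dataset` are all hypothetical names chosen for illustration.

```python
import pickle


def load_dataset(num_rows):
    # Hypothetical loader: builds a large in-memory list of floats.
    return [float(i) for i in range(num_rows)]


# Anti-pattern: the dataset itself travels inside the config.
large_config = {"lr": 1e-4, "data": load_dataset(100_000)}

# Preferred: the config carries only small values; each worker builds the data.
small_config = {"lr": 1e-4, "num_rows": 100_000}


def train_func(config):
    # The large object is created inside the training function,
    # so it never has to be serialized along with the config.
    data = load_dataset(config["num_rows"])
    return len(data)


# pickle is an assumption here, used only to approximate serialization cost.
large_bytes = len(pickle.dumps(large_config))
small_bytes = len(pickle.dumps(small_config))
print(f"config with embedded data:   {large_bytes} bytes")
print(f"config with parameters only: {small_bytes} bytes")
```

The parameters-only config stays a few dozen bytes no matter how large the dataset grows, whereas the embedded variant grows with the data.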
Not showing the `config` argument in the first place, since we didn't specify `train_loop_config` in `TorchTrainer` in this code snippet. Users will be confused about where to put the `train_func` arguments.
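The two call shapes this comment distinguishes can be sketched without Ray. The `run_on_workers` helper below is hypothetical and only imitates the idea that a training function receives `train_loop_config` when it declares a parameter and is called with no arguments otherwise. It is not Ray's implementation, and passing an empty dict when no config is given is an assumption of this sketch.

```python
import inspect


def run_on_workers(train_func, train_loop_config=None, num_workers=2):
    """Hypothetical stand-in for a trainer: call train_func once per worker."""
    num_params = len(inspect.signature(train_func).parameters)
    if num_params > 1:
        raise ValueError("train_func should take zero or one argument")
    # Assumption of this sketch: a missing config becomes an empty dict.
    config = {} if train_loop_config is None else train_loop_config
    results = []
    for _ in range(num_workers):
        if num_params == 1:
            results.append(train_func(config))
        else:
            results.append(train_func())
    return results


def train_func_no_config():
    # No train_loop_config is specified, so no argument is declared.
    return "trained"


def train_func_with_config(config):
    # Hyperparameters arrive through the single dict argument.
    return config["lr"] * config["num_epochs"]


no_config_results = run_on_workers(train_func_no_config)
with_config_results = run_on_workers(
    train_func_with_config,
    train_loop_config={"lr": 0.5, "num_epochs": 10},
)
```

Read this way, a snippet that never sets `train_loop_config` pairs naturally with the zero-argument form, which is the point the comment makes.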