Need more runtime hooks during a training step #250
Comments
Data fetching, the forward pass, and backpropagation are implemented in the schedule, so I don't think these are trainer hooks. Is there any use case for such hooks?
Correct, and that is why I didn't call them trainer hooks.
I do agree that this is not supported by Colossal-AI. I found these use cases are indeed not related to the schedule, even if we are adding hooks to the schedule. Splitting the batch can be done at the dataset/dataloader or in the first layer of the model, and applying mixup should be done at the dataset/dataloader.
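For instance, mixup can live in the dataloader rather than in the training step. A minimal sketch using a custom `collate_fn` (the function name is illustrative, and it assumes targets are already soft/one-hot tensors):

```python
import numpy as np
import torch

def mixup_collate(samples, alpha=0.2):
    # standard collation: stack inputs and targets into batch tensors
    inputs = torch.stack([s[0] for s in samples])
    targets = torch.stack([s[1] for s in samples])  # assumes soft/one-hot targets
    # mixup: blend each example with a randomly chosen partner
    lam = float(np.random.beta(alpha, alpha))
    perm = torch.randperm(inputs.size(0))
    inputs = lam * inputs + (1 - lam) * inputs[perm]
    targets = lam * targets + (1 - lam) * targets[perm]
    return inputs, targets

# usage: DataLoader(dataset, batch_size=32, collate_fn=mixup_collate)
```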
I am also not sure how to implement such hooks. I just opened the issue to collect ideas.
I think if we can abstract this part, it will provide some flexibility and extensibility to the schedule class. For example, there is a …
We have updated a lot. This issue was closed due to inactivity. Thanks.
Describe the feature
In the typical PyTorch fashion, we usually train a model with a loop like the following.
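Below is a minimal sketch, assuming `model`, `optimizer`, `criterion`, and `dataloader` are already constructed; the comments mark the points this issue is about:

```python
# minimal sketch of a standard PyTorch training loop
for inputs, labels in dataloader:       # fetch an input batch
    # <- no hook can run here (between data fetching and the forward pass)
    outputs = model(inputs)             # forward pass
    loss = criterion(outputs, labels)
    # <- no hook can run here (between the forward and backward passes)
    optimizer.zero_grad()
    loss.backward()                     # backward pass
    optimizer.step()
```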
In the trainer of Colossal-AI, hooks can only be added before and after a training step; users cannot customize the behavior between fetching an input batch and the forward pass, or between the forward and backward passes.
Also, since OpHook is applied to modules recursively, it is not appropriate for this use case either. We may need to add at least the two extra hooks mentioned above.
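As a rough illustration, the two extra hook points might look like the sketch below. The hook names (`before_forward`, `before_backward`) and the `train_step` helper are hypothetical, not part of the Colossal-AI API:

```python
class StepHooks:
    """Hypothetical hook interface; the method names are illustrative only."""

    def before_forward(self, batch):
        # runs between fetching an input batch and the forward pass,
        # e.g. to split the batch or apply mixup
        return batch

    def before_backward(self, loss):
        # runs between the forward and backward passes,
        # e.g. to scale or log the loss
        return loss


def train_step(model, optimizer, criterion, batch, hooks):
    inputs, labels = hooks.before_forward(batch)
    outputs = model(inputs)
    loss = criterion(outputs, labels)
    loss = hooks.before_backward(loss)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```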