Regarding the issue of setup(model, optimizer) in fabric FSDP #3
Thanks for the note and sorry about the hassle. I think that's because I had the current dev version installed, which has not been released yet. The new version allows you to combine the two lines `model = fabric.setup_module(model)` and `optimizer = fabric.setup_optimizers(optimizer)` into a single line, `model, optimizer = fabric.setup(model, optimizer)`. But for the latest stable release, the two separate lines are necessary. I updated the code accordingly.
I tried the change you suggested, but setting up the optimizer still raises an error:
  File "/project/Code/Skin/aa.py", line 21, in <module>
    main()
  File "/project/Code/Skin/aa.py", line 15, in main
    optimizer = fabric.setup_optimizers(optimizer)
  File "/opt/conda/envs/wsk-py310/lib/python3.10/site-packages/lightning/fabric/fabric.py", line 289, in setup_optimizers
    optimizers = [self._strategy.setup_optimizer(optimizer) for optimizer in optimizers]
  File "/opt/conda/envs/wsk-py310/lib/python3.10/site-packages/lightning/fabric/fabric.py", line 289, in <listcomp>
    optimizers = [self._strategy.setup_optimizer(optimizer) for optimizer in optimizers]
  File "/opt/conda/envs/wsk-py310/lib/python3.10/site-packages/lightning/fabric/strategies/fsdp.py", line 213, in setup_optimizer
    raise ValueError(
ValueError: The optimizer does not seem to reference any FSDP parameters. HINT: Make sure to create the optimizer after setting up the model.
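The hint in the `ValueError` above points at ordering: the optimizer has to be created from the model that Fabric returns, not from the original module. A minimal sketch of that ordering, assuming the stable-release calls named earlier in this thread (`fabric.setup_module` and `fabric.setup_optimizers`). The helper `setup_separately` and its `make_optimizer` factory argument are hypothetical names introduced here for illustration and are not part of Lightning:

```python
from typing import Any, Callable, Tuple


def setup_separately(
    fabric: Any,
    model: Any,
    make_optimizer: Callable[[Any], Any],
) -> Tuple[Any, Any]:
    """Set up the model first, then build and set up the optimizer.

    `make_optimizer` is a hypothetical factory: it receives the model
    returned by Fabric, so the optimizer is built from the wrapped
    model's parameters rather than from the original module.
    """
    # Step 1: let Fabric wrap the model.
    model = fabric.setup_module(model)
    # Step 2: only now create the optimizer, from the wrapped model.
    optimizer = make_optimizer(model)
    # Step 3: hand the freshly created optimizer to Fabric.
    optimizer = fabric.setup_optimizers(optimizer)
    return model, optimizer
```

With a real Fabric object this might be called as `model, optimizer = setup_separately(fabric, model, lambda m: torch.optim.Adam(m.parameters()))`. Passing a factory instead of a ready-made optimizer makes it hard to build the optimizer too early by accident.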
I followed your code and my environment matches yours, but `fabric.setup(model, optimizer)` raises an error. Your code is written the same way. Have you ever encountered this problem? Looking forward to your reply!
The source code is as follows:
```python
import torch
import torchvision as tv
import lightning as L


def main():
    # Use every visible GPU with the FSDP strategy and bf16 mixed precision.
    fabric = L.Fabric(accelerator='cuda', precision='bf16-mixed', devices=torch.cuda.device_count(), strategy='fsdp')
    fabric.launch()


if __name__ == '__main__':
    main()
```
The error is reported as follows:
Traceback (most recent call last):
  File "/project/Code/Skin/aa.py", line 23, in <module>
    main()
  File "/project/Code/Skin/aa.py", line 14, in main
    model, optimizer = fabric.setup(model, optimizer)
  File "/opt/conda/envs/wsk-py310/lib/python3.10/site-packages/lightning/fabric/fabric.py", line 198, in setup
    self._validate_setup(module, optimizers)
  File "/opt/conda/envs/wsk-py310/lib/python3.10/site-packages/lightning/fabric/fabric.py", line 856, in _validate_setup
    raise RuntimeError(
RuntimeError: The `Fabric` requires the model and optimizer(s) to be set up separately. Create and set up the model first through `model = self.setup_model(model)`. Then create the optimizer and set it up: `optimizer = self.setup_optimizer(optimizer)`.