
Add a tutorial for distributed inference #100

Closed
awaelchli opened this issue Oct 11, 2021 · 3 comments
Labels: documentation (Improvements or additions to documentation), Example / Demo / Tutorial, help wanted (Extra attention is needed), won't fix (This will not be worked on)

Comments

@awaelchli (Member)

🚀 Feature

Users are asking for examples of how to run prediction with models in a distributed setting.

Motivation

We could link to such a tutorial from the main PyTorch Lightning docs.

Pitch

Add a tutorial page for prediction on a single GPU, multiple GPUs, and multiple nodes.
It should cover the PredictionWriterCallback and how to use it.

Alternatives

Additional context

Related PR #52

@awaelchli added the enhancement, help wanted, documentation, and Example / Demo / Tutorial labels and removed the enhancement label on Oct 11, 2021
@gianscarpe

Just pointing out a useful and elegant solution for multi-GPU prediction (it does not involve creating new files):
https://github.com/open-mmlab/mmdetection/blob/482f60fe55c364e50e4fc4b50893a25d8cc261b0/mmdet/apis/test.py#L160


stale bot commented Jan 19, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the won't fix label on Jan 19, 2022
stale bot closed this as completed on Feb 8, 2022
@allanchan339

@awaelchli
Any update on how to gather all validation_step_outputs to a single device (rank 0) at validation_epoch_end for further metric calculation? I was redirected here from discussion Lightning-AI/pytorch-lightning#5788.
