Extend Dry Run Mode to Cover Trainer Initialization in TorchTitan

Pull request [#2012](https://github.com/pytorch/torchtitan/pull/2012/) introduces a dry run mode for TorchTitan. However, the current implementation restricts dry run to the configuration system, so other components, such as `Trainer.__init__()`, are not exercised by it.

To make dry run mode more useful, we should consider leveraging the fake process group (PG) mode. With it, dry run could cover the whole of `Trainer.__init__()`, allowing more comprehensive validation and testing without requiring a full environment setup.
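To illustrate the idea, here is a minimal, framework-free sketch of what running a trainer constructor against a fake process group could catch. Everything in it is an assumption made for illustration: `FakeProcessGroup`, `ParallelConfig`, this `Trainer`, and `dry_run` are hypothetical names, not TorchTitan or PyTorch APIs, and the real `Trainer.__init__()` does considerably more.

```python
from dataclasses import dataclass


@dataclass
class FakeProcessGroup:
    """Hypothetical stand-in for a process group: no network, no GPUs."""

    world_size: int
    rank: int = 0

    def all_reduce(self, values):
        # A real collective would combine values across ranks; the fake
        # version simply returns a copy of the local values.
        return list(values)


@dataclass
class ParallelConfig:
    """Hypothetical parallelism settings, not TorchTitan's real config."""

    dp_degree: int
    tp_degree: int


class Trainer:
    """Hypothetical trainer whose __init__ validates the parallel layout."""

    def __init__(self, config, process_group):
        expected = config.dp_degree * config.tp_degree
        if expected != process_group.world_size:
            raise ValueError(
                f"dp_degree * tp_degree = {expected}, "
                f"but world size is {process_group.world_size}"
            )
        self.config = config
        self.process_group = process_group


def dry_run(config, world_size):
    """Run Trainer.__init__ against a fake group.

    Returns the error message if construction fails, otherwise None.
    """
    try:
        Trainer(config, FakeProcessGroup(world_size=world_size))
    except ValueError as exc:
        return str(exc)
    return None
```

In this sketch, `dry_run(ParallelConfig(dp_degree=2, tp_degree=3), world_size=8)` reports the mismatch from a single process, which is the kind of check that a config-only dry run would not reach.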

Extend Dry Run Mode to Cover Trainer Initialization in TorchTitan #2044
