@deep-diver
Contributor

This PR extends the existing Alignment Handbook example to cover multi-node configurations.

@peterschmidt85
Contributor

@deep-diver I don't see the multi-node task configuration

ports:
  - 6006
resources:
  gpu: 1..2
Contributor

Just curious, why 1..2? Should it be 24GB:2?

Contributor Author

Sure, no problem!
If there is a way to configure this at runtime, that would be much better, IMO.

Contributor

I always prefer the declarative approach over runtime configuration. At runtime, the user can always override the declared value if needed.
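
For readers comparing the two forms discussed in this thread, here is a minimal sketch of how each would sit in the `resources` block. It assumes that dstack reads a bare range such as `1..2` as a GPU count, and `24GB:2` as memory per GPU followed by a count; that reading is an assumption and should be checked against the dstack resources reference.

```yaml
# Option A (as in this PR): a GPU count range, assumed to mean 1 to 2 GPUs.
resources:
  gpu: 1..2
---
# Option B (the reviewer's suggestion): assumed to mean 2 GPUs with 24GB of memory each.
resources:
  gpu: 24GB:2
```

Under that reading, Option B pins both the memory and the GPU count in the file, which matches the declarative preference stated above.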

@deep-diver
Contributor Author

> @deep-diver I don't see the multi-node task configuration

What do you mean?

@peterschmidt85
Contributor

> What do you mean?

@deep-diver I mean that you only have train.dstack.yaml, which is single-node; there is no train-distrib.dstack.yaml for multiple nodes, and the other configuration files are missing as well.
As a user, how do I run the multi-node configuration with a single command, without writing the configuration file myself?
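
To illustrate the request, a hypothetical `train-distrib.dstack.yaml` might look roughly like the sketch below. This is not the file that was added in the PR: the task name, the `train.py` script, the master port, and the GPU spec are placeholders, and the sketch assumes dstack's `nodes` property and the `DSTACK_*` environment variables that dstack provides for distributed tasks.

```yaml
type: task
name: train-distrib        # hypothetical task name
nodes: 2                   # assumed: run the same commands on two nodes
python: "3.10"

commands:
  # train.py is a placeholder for the actual training entry point.
  - >-
    torchrun
    --nnodes=$DSTACK_NODES_NUM
    --node_rank=$DSTACK_NODE_RANK
    --nproc_per_node=$DSTACK_GPUS_PER_NODE
    --master_addr=$DSTACK_MASTER_NODE_IP
    --master_port=29500
    train.py

resources:
  gpu: 24GB:2              # spec suggested earlier in this review
```

With a file like this checked in next to train.dstack.yaml, a user could launch the multi-node run by pointing dstack at it instead of writing the configuration by hand.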

@deep-diver
Contributor Author

@peterschmidt85 updated!

@peterschmidt85 merged commit 609036d into dstackai:master on Jul 11, 2024