Use a shell script as the entry point for AI Dynamo #615
@karya0, please review and provide feedback when you have a chance. |
Given that AI Dynamo changes frequently and is under active development, Kapil's script encapsulates and absorbs these changes without requiring many code changes on the CloudAI side. It provides more flexibility and value to the actual end user. We saw similar issues with Grok/Nemo and now with AI Dynamo. The common need across all of them is flexibility and fast iteration. The user is fully aware of, and responsible for, the flags that get passed. Once AI Dynamo stabilizes and the benchmark matures, the verification/validation aspects could be brought back. |
|
Re: #4554587: AI-Dynamo: Dry-Run with DSE fails @TaekyungHeo What is the fix for this? I don't see any handler or |
|
@srivatsankrishnan, explicit code changes were not needed to fix #4554587: AI-Dynamo: Dry-Run with DSE fails. Please find the log below. It works after merging the changes. |
What caused this issue for Kapil, then? Was it a misattribution? |
|
@srivatsankrishnan, maybe. The bug was actually reported from his fork, not the public main branch. It seems to have originated from his local changes. |
We are also not testing with a DSE config, correct? I think dry run works for non-DSE cases. Was it only the DSE case that caused this issue? |
|
Had a call with @srivatsankrishnan. This is not satisfied and has been removed.
I had to run a command using the DSE-enabled test configuration to validate this feature. |
|
Waiting for @srivatsankrishnan's approval. I spoke with @karya0 yesterday. This is not the final version of run.sh. I will match the functionality of his run.sh by merging other PRs and creating additional ones. |
Clarification on this: ideally, a lot of his changes should be contained within his
Summary
The goal of this PR is to support a custom run.sh to launch AI Dynamo.
Kapil's requirements met by this PR:
model_config = ConfigDict(extra="forbid", populate_by_name=True)
RM4548255
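As a rough illustration of the pass-through approach discussed in this PR (the user owns the flags, and the launcher forwards them to run.sh without validating them), here is a minimal sketch. It is not the code in this PR: the helper name `build_run_command` and the flag names in the example are hypothetical, and the real run.sh interface may differ.

```python
import shlex


def build_run_command(script_path, args):
    """Build a shell command line that forwards user-supplied args to a script.

    Hypothetical helper, not CloudAI's API. No flag is validated here:
    the user is responsible for whatever gets passed through.
    """
    parts = ["bash", script_path]
    for key, value in args.items():
        flag = f"--{key}"
        if value is True:
            # Boolean switches are emitted without a value.
            parts.append(flag)
        elif value is False or value is None:
            # Disabled or unset options are skipped entirely.
            continue
        else:
            parts.extend([flag, str(value)])
    # shlex.join quotes any part that contains shell-special characters.
    return shlex.join(parts)


# Example with made-up flag names.
print(build_run_command("run.sh", {"model": "my-model", "num-nodes": 2, "dry-run": True}))
```

Because the helper only formats what it is given, new or renamed flags in the script need no change on the launcher side. The trade-off is that typos in flag names surface only when the script runs.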
Test Plan
Take https://github.com/Mellanox/cloudaix/pull/319.
https://drive.google.com/drive/folders/1e5L80zLqZUywJ0SpCgT-s-tJyid13zBf?usp=sharing