Support GCP region override for vertex ai jobs #294
/unit_test
/integration_test
/e2e_test
GiGL Automation@ 24:19:17UTC : 🔄 @ 24:58:36UTC : ✅ Workflow completed successfully.
GiGL Automation@ 24:19:20UTC : 🔄 @ 01:39:49UTC : ✅ Workflow completed successfully.
GiGL Automation@ 24:19:22UTC : 🔄 @ 01:15:47UTC : ✅ Workflow completed successfully.
svij-sc
left a comment
The tests feel like overkill for a 2 line change, and prone to constant refactoring.
An alternative here is to do this in the wrapper or use OmegaConf resolutions so we don't need to add conditionals in our trainer/inferencer code.
Synced offline, let's do this in the wrapper :)
/unit_test
GiGL Automation@ 17:35:15UTC : 🔄 @ 18:12:10UTC : ✅ Workflow completed successfully.
xgao4-sc
left a comment
Looks good to me. Thanks a lot for the work!
svij-sc
left a comment
two minor comments, otherwise LGTM - thanks for the iteration
Add `gcp_region_override` as a proto field to allow launching VAI jobs in a separate region. We do this so that if there are more GPUs in one region we can launch training jobs there, even if there are generally more resources in the region specified in `CommonComputeConfig.region`.
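
As a rough illustration of the fallback described above, here is a minimal sketch of how a wrapper might pick the region for a VAI job. Only `gcp_region_override` and `CommonComputeConfig.region` come from this PR; the dataclasses stand in for the real proto messages, and `VertexAiJobConfig` and `resolve_vertex_ai_region` are hypothetical names used for illustration, not the actual GiGL API.

```python
from dataclasses import dataclass


@dataclass
class CommonComputeConfig:
    # Stand-in for the proto message; only the `region` field is modeled here.
    region: str


@dataclass
class VertexAiJobConfig:
    # Hypothetical stand-in for the VAI resource config proto.
    # An unset proto3 string field reads as "", so "" means "no override".
    gcp_region_override: str = ""


def resolve_vertex_ai_region(
    job_config: VertexAiJobConfig,
    common_config: CommonComputeConfig,
) -> str:
    """Return the override region if one is set, else the common region."""
    # An empty string is falsy, so `or` falls back to the common region.
    return job_config.gcp_region_override or common_config.region


common = CommonComputeConfig(region="us-central1")

# No override: the job launches in the common compute region.
default_region = resolve_vertex_ai_region(VertexAiJobConfig(), common)

# Override set: the job launches in the overridden region instead.
override_region = resolve_vertex_ai_region(
    VertexAiJobConfig(gcp_region_override="us-east4"), common
)
```

Keeping the fallback in one helper called from the wrapper matches the outcome of the review discussion: the trainer and inferencer code receive a resolved region and need no conditionals of their own.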