Add VertexAIService.launch_graph_store_job #350
Conversation
/unit_test
/integration_tests
/e2e_test
GiGL Automation@ 17:22:07UTC : 🔄 @ 17:28:21UTC : ❌ Workflow failed.
GiGL Automation@ 17:22:12UTC : 🔄 @ 18:31:11UTC : ✅ Workflow completed successfully.
GiGL Automation@ 17:22:21UTC : 🔄 @ 18:42:28UTC : ✅ Workflow completed successfully.
/unit_test
GiGL Automation@ 18:54:41UTC : 🔄 @ 19:46:25UTC : ✅ Workflow completed successfully.
/unit_test
/integration_test
GiGL Automation@ 16:30:50UTC : 🔄 @ 17:23:00UTC : ✅ Workflow completed successfully.
GiGL Automation@ 16:30:58UTC : 🔄 @ 17:32:29UTC : ✅ Workflow completed successfully.
mkolodner-sc
left a comment
Thanks Kyle, generally LGTM -- just would prefer us to use existing VAI environment vars that are autopopulated instead of creating our own here
The reliance isn't on this side, it's on the "application code" (e.g. whatever code we run in these jobs). If we rely on the VAI CLUSTER_SPEC, then any of that code is going to be harder to update in the future. We could expose this all through some utility, but I'm not really sure what the downside is to adding new env variables that we are in control of here.
Gotcha, thanks for the clarification. I don't feel too strongly here as that makes sense to me as well -- will defer to @svij-sc here. Approving to unblock on my end
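To make the trade-off in this thread concrete, here is a minimal sketch of the "utility" option: application code asks one helper for its replica info, and the helper decides whether that comes from launcher-controlled variables or from the `CLUSTER_SPEC` JSON that Vertex AI populates. The `EXAMPLE_*` variable names, `ReplicaInfo`, and `read_replica_info` are hypothetical and are not taken from this PR; the `CLUSTER_SPEC` layout (`cluster` and `task` keys) is assumed to match what Vertex AI documents for custom training jobs.

```python
import json
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ReplicaInfo:
    pool_name: str      # e.g. "workerpool0"
    replica_index: int  # index of this replica within its pool
    num_replicas: int   # number of replicas in this pool


# HYPOTHETICAL variable names, invented for this sketch only.
CUSTOM_VARS = ("EXAMPLE_POOL_NAME", "EXAMPLE_REPLICA_INDEX", "EXAMPLE_NUM_REPLICAS")


def read_replica_info(env: Optional[Mapping[str, str]] = None) -> ReplicaInfo:
    """Return replica info from custom variables if set, else from CLUSTER_SPEC."""
    env = os.environ if env is None else env

    # Option 1: variables the launcher sets itself and therefore controls.
    if all(name in env for name in CUSTOM_VARS):
        pool, index, count = (env[name] for name in CUSTOM_VARS)
        return ReplicaInfo(pool, int(index), int(count))

    # Option 2: the platform-populated CLUSTER_SPEC JSON
    # (assumed layout: {"cluster": {pool: [hosts...]}, "task": {"type", "index"}}).
    spec = json.loads(env["CLUSTER_SPEC"])
    pool = spec["task"]["type"]
    return ReplicaInfo(pool, int(spec["task"]["index"]), len(spec["cluster"][pool]))
```

With a helper like this, the application code does not care which source is used, so either choice could be swapped later without touching job code.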
/unit_test
/integration_test
GiGL Automation@ 20:50:34UTC : 🔄 @ 21:41:16UTC : ✅ Workflow completed successfully.
GiGL Automation@ 20:50:39UTC : 🔄 @ 20:56:51UTC : ✅ Workflow completed successfully.
Sorry for another round of comments, but these seem like minimal changes, although the test cycles for worker pool 2 and switching around the compute / storage clusters might take a little time.
I will go ahead and pre-emptively approve, given that the rest of the comments will be addressed pre-merge.
Sure, we can put the compute pool first. For whatever reason, VAI rejects it when I put anything (gpu or not) into worker pool 2. idk why.
oh lol it randomly works now...
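For readers following the worker-pool ordering discussed above, here is a rough sketch of a heterogeneous `worker_pool_specs` list with the compute pool first and a second pool in a later slot. It builds plain dicts only and does not call Vertex AI. The helper names, machine types, image URIs, and especially the choice of slot for the storage pool are assumptions made for illustration; they are not the layout used by `launch_graph_store_job`. The dict keys follow the snake_case form accepted by the Vertex AI Python SDK, and unused slots are left as empty dicts.

```python
from typing import Any, Dict, List, Optional


def make_pool(
    machine_type: str,
    replica_count: int,
    image_uri: str,
    accelerator_type: Optional[str] = None,
    accelerator_count: int = 0,
) -> Dict[str, Any]:
    """Build one worker pool spec as a plain dict."""
    machine_spec: Dict[str, Any] = {"machine_type": machine_type}
    if accelerator_type is not None:
        machine_spec["accelerator_type"] = accelerator_type
        machine_spec["accelerator_count"] = accelerator_count
    return {
        "machine_spec": machine_spec,
        "replica_count": replica_count,
        "container_spec": {"image_uri": image_uri},
    }


def layout_pools(
    compute: Dict[str, Any],
    storage: Dict[str, Any],
    storage_slot: int = 2,  # ASSUMPTION: which slot holds the storage pool
) -> List[Dict[str, Any]]:
    """Place the compute pool at index 0 and the storage pool at storage_slot."""
    if storage_slot < 1:
        raise ValueError("storage_slot must be >= 1; slot 0 is the compute pool")
    # Slots between the two pools are left empty.
    return [compute] + [{} for _ in range(storage_slot - 1)] + [storage]


# Placeholder values, not taken from the PR.
compute_pool = make_pool("n1-standard-8", 1, "gcr.io/example/compute:latest",
                         accelerator_type="NVIDIA_TESLA_T4", accelerator_count=1)
storage_pool = make_pool("n1-highmem-16", 2, "gcr.io/example/storage:latest")
specs = layout_pools(compute_pool, storage_pool)
```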
Scope of work done
Add ability to launch heterogeneous VAI clusters via new VertexAIService.launch_graph_store_job API
Where is the documentation for this feature?: In doc comments
Did you add automated tests or write a test plan? Added unit tests
Updated Changelog.md? NO
Ready for code review?: YES