Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhance integration manager to enable external controllers to provide Kueue integrations #2003

Closed
3 tasks
dgrove-oss opened this issue Apr 17, 2024 · 3 comments · Fixed by #2059
Closed
3 tasks
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@dgrove-oss
Copy link
Contributor

What would you like to be added:
We need a mechanism to enable Kueue's jobframework.integrationManager to recognize GVKs that are being managed by an instantiation of jobframework.ReconcilerFactory that is defined/running in an external controller. In particular, methods like jobframework.IsOwnerManagedByKueue and GetEmptyOwnerObject should be extended to consult an additional table of GVKs that are known to be Kueue enabled. This table would be populated from information added to the Integrations sub-structure of Kueue's Configuration Resource.

Why is this needed:
Being able to cleanly extend Kueue with the ability to manage additional GVKs without needing to modify Kueue itself would make it easier to grow the Kueue ecosystem.

In the AppWrapper project (https://github.com/project-codeflare/appwrapper), we have a working example of such an external controller that extends Kueue to manage a new GVK. As described in more detail in the Working Group call of 4/11 and this presentation the inability to inform the integrationManager of the new Kueue managed type results in a failure to correctly recognize child jobs, which then requires a fragile workaround (our child admission controller).

Completion requirements:

With this enhancement, when an AppWrapper containing another Kueue-managed GVK (for example a PyTorch Job) is admitted by Kueue, the wrapped PyTorch Job should be properly recognized by Kueue as a child job of an already admitted Job and be admitted. To test, we would disable our child admission controller in the AppWrapper operator and verify that the child was admitted as expected.

  • Design doc
  • API change
  • Docs update

The artifacts should be linked in subsequent comments.

@dgrove-oss dgrove-oss added the kind/feature Categorizes issue or PR as related to a new feature. label Apr 17, 2024
@dgrove-oss
Copy link
Contributor Author

I intend to work on an implementation once there is agreement on a design. I think an extension to the Configuration.Integrations to add a new array of strings listing externally-managed GVKs should work, but I'm happy to do something else if there is a preferred alternative.

@mimowo
Copy link
Contributor

mimowo commented Apr 18, 2024

/cc @alculquicondor @tenzen-y

@alculquicondor
Copy link
Contributor

I think I would add a field parallel to this

Frameworks []string `json:"frameworks,omitempty"`

externalFrameworks?

Another option could be to just add all the frameworks into the same frameworks list, and have kueue identify which frameworks are built in. But then it wouldn't be possible for users to "override" a built-in framework, if they have such need.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants