feat(common): allow nvidia runtimeclass outside of scaleGPU on SCALE #704

Merged (7 commits) on Feb 17, 2024

Ornias1993
Member

Description
This is an early draft that correctly sets the runtimeClass name when using an NVIDIA GPU on SCALE without the SCALE GPU GUI.

⚙️ Type of change

  • ⚙️ Feature/App addition
  • 🪛 Bugfix
  • ⚠️ Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • 🔃 Refactor of current code

🧪 How Has This Been Tested?

📃 Notes:

✔️ Checklist:

  • ⚖️ My code follows the style guidelines of this project
  • 👀 I have performed a self-review of my own code
  • #️⃣ I have commented my code, particularly in hard-to-understand areas
  • 📄 I have made corresponding changes to the documentation
  • ⚠️ My changes generate no new warnings
  • 🧪 I have added tests to this description that prove my fix is effective or that my feature works
  • ⬆️ I increased versions for any altered app according to semantic versioning

➕ App addition

If this PR is an app addition please make sure you have done the following.

  • 🖼️ I have added an icon in the Chart's root directory called icon.png

Please don't blindly check all the boxes. Read them and only check those that apply.
Those checkboxes are there for the reviewer to see what is this all about and
the status of this PR with a quick glance.

@stavros-k
Member

stavros-k commented Feb 17, 2024

TODO:

  • actually render additional resources under limits
  • handle nvidia.com/gpu from resources in fixedEnv

@stavros-k stavros-k self-assigned this Feb 17, 2024
@stavros-k stavros-k marked this pull request as ready for review February 17, 2024 13:15
@stavros-k
Member

This should allow arbitrary resources under limits (at both the top level and the container level).

NVIDIA_* envs are handled for both SCALE and native Helm (by checking resources.limits.nvidia.com/gpu).

For SCALE ONLY, the runtime is set to nvidia when nvidia.com/gpu is detected under resources, or when the middleware indicates that scaleGPU is used.

Native Helm users have to set the runtime manually if the default one does not support NVIDIA (or whatever resource they assign). This is because native Helm users can name their runtimes whatever they want.
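
The runtime selection described above can be sketched roughly as follows. This is an illustrative Python model, not the chart's actual Helm template code: the function names, the parameters (`on_scale`, `scale_gpu_in_use`, `user_runtime_class`) and the choice to treat a zero GPU count as "no GPU" are assumptions made for the example. It also assumes an explicitly configured runtime is only consulted when the SCALE rule does not apply, which the PR does not state.

```python
from typing import Optional

GPU_RESOURCE = "nvidia.com/gpu"


def requests_nvidia_gpu(resources: Optional[dict]) -> bool:
    """Return True when resources.limits holds a non-zero nvidia.com/gpu entry.

    Treating 0 (or an unparsable value) as "no GPU" is an assumption.
    """
    limits = (resources or {}).get("limits") or {}
    value = limits.get(GPU_RESOURCE)
    if value is None:
        return False
    try:
        return int(value) > 0
    except (TypeError, ValueError):
        return False


def resolve_runtime_class(
    resources: Optional[dict],
    *,
    on_scale: bool,
    scale_gpu_in_use: bool = False,           # hypothetical flag: middleware reports scaleGPU
    user_runtime_class: Optional[str] = None,  # hypothetical: runtime set manually by the user
) -> Optional[str]:
    """Pick a runtimeClass name following the rules described in this PR."""
    # SCALE only: force the nvidia runtime when a GPU is requested either way.
    if on_scale and (scale_gpu_in_use or requests_nvidia_gpu(resources)):
        return "nvidia"
    # Native Helm: runtimes can have any name, so nothing is guessed here;
    # whatever the user configured (possibly nothing) is used as-is.
    return user_runtime_class
```

Under these assumptions, a native Helm user who requests `nvidia.com/gpu` still gets no runtime unless they name one themselves, for example `resolve_runtime_class({"limits": {"nvidia.com/gpu": 1}}, on_scale=False, user_runtime_class="my-nvidia")`.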

@Ornias1993 Ornias1993 merged commit 73a90f0 into main Feb 17, 2024
128 checks passed
@stavros-k stavros-k deleted the gpu-runtime-scale branch February 17, 2024 13:47
2 participants