feat: install nvidia driver using runfile #482
$(MAKE) devkit.run WHAT="make build-$* \
BUILD_DRY_RUN=${BUILD_DRY_RUN} \
VERBOSITY=$(VERBOSITY) \
ADDITIONAL_ARGS="--instance-type=g4dn.2xlarge$(if $(ADDITIONAL_ARGS),$(SPACE)$(ADDITIONAL_ARGS))" \
This instance type is what Kaptain uses for their testing. We should also test on it regularly; we've seen issues with it in the past.
# to pick up the changes to unload nouveau
# some hardware like aws g4dn.2xlarge require this
- name: unconditionally reboot the machine with all defaults
Will the machine also reboot for preprovisioned, or is sysprep==false
in that case?
It's false.
However, depending on the OS, they might need to reboot it on their own if they were previously using the nouveau driver.
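For preprovisioned machines, one way to tell whether a manual reboot is still needed is to check whether the nouveau module is currently loaded. A minimal sketch, assuming `lsmod`-style input on stdin; `nouveau_loaded` is a hypothetical helper and not part of this PR:

```shell
# Hypothetical helper (not from this PR): succeeds (exit 0) when a kernel
# module named exactly "nouveau" is listed in lsmod-style text on stdin.
# Only the first column is checked, so a line like "drm ... nouveau"
# (nouveau listed as a user of another module) does not count by itself.
nouveau_loaded() {
  awk '$1 == "nouveau" { found = 1 } END { exit !found }'
}

# Example use on a real host:
#   lsmod | nouveau_loaded && echo "nouveau is loaded; reboot after blacklisting it"
```

Reading from stdin rather than calling `lsmod` inside the function keeps the check easy to try against saved output.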
- name: run nvidia-smi
  shell:
    executable: /bin/bash
    cmd: nvidia-smi
it should add it to the path on install.
Great work. 🎉
Not sure I understand all of it, but it looks good to me. One comment: the NVIDIA naming is inconsistent in certain places. Not a huge issue, but something to consider in the future.
apt:
  name: linux-headers-{{ ansible_kernel }}
  name:
    - linux-headers-{{ ansible_kernel }}
$ apt-cache depends build-essential
build-essential
|Depends: libc6-dev
Depends: <libc-dev>
libc6-dev
Depends: gcc
Depends: g++
Depends: make
make-guile
Depends: dpkg-dev
makes it a little cleaner :)
Just want to make clear that this patch deleted the vault functionality, which made sure we stay on the kernel of the CentOS minor version. That means that whenever Red Hat releases a 7.10, we'll either build a half-7.9, half-7.10 image (installing 7.10 patches into a 7.9 image) or have to drop support for 7.9 immediately to follow 7.10. But the minor version is hard-coded into every KIB build. So @faiq, I really suggest keeping this backward compatibility; otherwise we render previous KIB versions unusable as soon as there is a new RHEL 7/CentOS 7 release. CentOS moves the 7.9 kernel packages into the vault almost instantly once a new release is out. This is the reason for the vault code: it removes the dependency on CentOS releases and their timeline.
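To illustrate the idea of pinning a build to one minor release, here is a rough sketch of deriving a vault repository URL from a hard-coded release string. It is not the vault code removed by this patch: `vault_baseurl` is a hypothetical helper, and the `https://vault.centos.org/<release>/<repo>/<arch>/` layout and the `7.9.2009` release string are assumptions about the public mirror.

```shell
# Hypothetical helper (not the removed vault code): print the base URL of a
# CentOS vault repository for a pinned release.
# Assumed layout: https://vault.centos.org/<release>/<repo>/<arch>/
#   $1 = release string (required), e.g. 7.9.2009
#   $2 = repository name, defaults to "os"
#   $3 = architecture, defaults to "x86_64"
vault_baseurl() {
  printf 'https://vault.centos.org/%s/%s/%s/\n' "$1" "${2:-os}" "${3:-x86_64}"
}

# Example: base URL for the updates repository of the pinned release.
#   vault_baseurl 7.9.2009 updates
```

With the release pinned this way, kernel packages for that minor version would stay resolvable after a newer release moves them out of the main mirrors.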
What problem does this PR solve?:
Starting work on installing drivers with runfiles
Which issue(s) does this PR fix?:
Special notes for your reviewer:
Does this PR introduce a user-facing change?: