The template below is mostly useful for bug reports and support questions. Feel free to remove anything which doesn't apply to you and add more information where it makes sense.
I checked the hami-device-plugin YAML; the nodename env is present.

1. Issue or feature description
kubectl logs -f -n kube-system hami-device-plugin-gt9h5 -c device-plugin
E0422 08:03:18.033017 27021 register.go:171] get node error resource name may not be empty
E0422 08:03:18.033309 27021 register.go:193] Failed to register annotation: resource name may not be empty
kubectl logs -f -n kube-system hami-scheduler-774d56586c-98hjg vgpu-scheduler-extender
E0422 08:02:30.014651 1 scheduler.go:289] get node xxxxx device error, node xxxxx not found
E0422 08:02:30.014749 1 scheduler.go:289] get node xxxxx device error, node xxxxx not found
E0422 08:02:30.014774 1 scheduler.go:289] get node xxxxx device error, node xxxxx not found
E0422 08:02:30.014781 1 scheduler.go:289] get node xxxxx device error, node xxxxx not found
E0422 08:02:30.014789 1 scheduler.go:289] get node xxxxx device error, node xxxxx not found
E0422 08:02:30.014799 1 scheduler.go:289] get node xxxxx device error, node xxxxx not found
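Both errors point at the node name not being resolved. For reference, one way to inspect the env actually set on the device-plugin container is a jsonpath query like the one below; the pod and container names are taken from the logs above and may differ in your deployment:

# Show the env entries of the device-plugin container (assumes the pod/container names above)
kubectl get pod hami-device-plugin-gt9h5 -n kube-system \
  -o jsonpath='{.spec.containers[?(@.name=="device-plugin")].env}'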
2. Steps to reproduce the issue
3. Information to attach (optional if deemed irrelevant)
Common error checking:
The output of nvidia-smi -a on your host
Your docker or containerd configuration file (e.g: /etc/docker/daemon.json)
The vgpu-device-plugin container logs
The vgpu-scheduler container logs
The kubelet logs on the node (e.g: sudo journalctl -r -u kubelet)
Additional information that might help better understand your environment and reproduce the bug:
Docker version from docker version
Docker command, image and tag used
Kernel version from uname -a
Any relevant kernel output lines from dmesg
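For reference, a quick way to collect most of the items above in one pass; the pod names are the ones from this report, so adjust them (and the daemon.json path) for your own cluster:

nvidia-smi -a > nvidia-smi.txt
cat /etc/docker/daemon.json > docker-daemon.json
kubectl logs -n kube-system hami-device-plugin-gt9h5 -c device-plugin > device-plugin.log
kubectl logs -n kube-system hami-scheduler-774d56586c-98hjg vgpu-scheduler-extender > vgpu-scheduler.log
sudo journalctl -r -u kubelet > kubelet.log
docker version > docker-version.txt
uname -a > uname.txt
dmesg > dmesg.txt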
Hi @betterxu,
Thanks for opening an issue!
We will look into it as soon as possible.
Instructions for interacting with me using comments are available here.
If you have questions or suggestions related to my behavior, please file an issue against the gh-ci-bot repository.
I encountered the same situation before. The reason in my case was that the env name "NODE_NAME" is invalid in some versions. I added the env below and the problem was resolved.
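For reference, a minimal sketch of the kind of env entry meant here, using the standard Kubernetes downward API to pass the node name into the device-plugin container; the exact variable name the plugin reads can differ between versions, so NODE_NAME is only an illustration and should be checked against your chart/DaemonSet:

env:
  - name: NODE_NAME                # name the plugin expects; may differ across versions
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName   # injects the name of the node the pod is scheduled on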