The GPU section fails kubernetes-worker on localhost (conjure-up) deployments #248

Closed
mbruzek opened this Issue Apr 4, 2017 · 2 comments

Contributor

mbruzek commented Apr 4, 2017

I was testing the 1.6.x release candidate for charms. We added GPU support and it properly detected a GPU on my laptop. The installer then failed on my machine, with the following sections in the log.

Job for bluetooth.service failed because the control process exited with error code. See "systemctl status bluetooth.service" and "journalctl -xe" for details.
invoke-rc.d: initscript bluetooth, action "start" failed.
● bluetooth.service - Bluetooth service
   Loaded: loaded (/lib/systemd/system/bluetooth.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Tue 2017-04-04 19:18:24 UTC; 9ms ago
     Docs: man:bluetoothd(8)
  Process: 10009 ExecStart=/usr/lib/bluetooth/bluetoothd (code=exited, status=1/FAILURE)
 Main PID: 10009 (code=exited, status=1/FAILURE)
   Status: "Starting up"

Apr 04 19:18:24 juju-c77d20-3 systemd[1]: Starting Bluetooth service...
Apr 04 19:18:24 juju-c77d20-3 bluetoothd[10009]: Bluetooth daemon 5.37
Apr 04 19:18:24 juju-c77d20-3 systemd[1]: bluetooth.service: Main process ex...E
Apr 04 19:18:24 juju-c77d20-3 systemd[1]: Failed to start Bluetooth service.
Apr 04 19:18:24 juju-c77d20-3 systemd[1]: bluetooth.service: Unit entered fa....
Apr 04 19:18:24 juju-c77d20-3 systemd[1]: bluetooth.service: Failed with res....
Hint: Some lines were ellipsized, use -l to show in full.
dpkg: error processing package bluez (--configure):

...

dpkg: dependency problems prevent configuration of indicator-bluetooth:
 indicator-bluetooth depends on bluez (>= 5); however:
  Package bluez is not configured yet.
 indicator-bluetooth depends on gnome-bluetooth | ubuntu-system-settings; however:
  Package gnome-bluetooth is not configured yet.
  Package ubuntu-system-settings is not installed.

dpkg: error processing package indicator-bluetooth (--configure):
 dependency problems - leaving unconfigured
No apport report written because MaxReports is reached already
dpkg: dependency problems prevent configuration of unity-control-center:
 unity-control-center depends on indicator-bluetooth; however:
  Package indicator-bluetooth is not configured yet.

dpkg: error processing package unity-control-center (--configure):
 dependency problems - leaving unconfigured
Setting up unity-control-center-faces (15.04.0+16.04.20160705-0ubuntu1) ...
No apport report written because MaxReports is reached already
dpkg: dependency problems prevent configuration of unity-control-center-signon:
 unity-control-center-signon depends on unity-control-center; however:
  Package unity-control-center is not configured yet.

dpkg: error processing package unity-control-center-signon (--configure):
 dependency problems - leaving unconfigured
Setting up upstart (1.13.2-0ubuntu21.1) ...
No apport report written because MaxReports is reached already
/usr/sbin/grub-probe: error: failed to get canonical path of `/dev/sdb1'.
Setting up usbmuxd (1.1.0-2) ...
Warning: The home dir /var/lib/usbmux you specified can't be accessed: No such file or directory

...

Errors were encountered while processing:
 bluez
 gnome-bluetooth
 gnome-user-share
 indicator-bluetooth
 unity-control-center
 unity-control-center-signon
E: Sub-process /usr/bin/dpkg returned an error code (1)
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-kubernetes-worker-0/charm/hooks/install", line 19, in <module>
    main()
  File "/usr/local/lib/python3.5/dist-packages/charms/reactive/__init__.py", line 78, in main
    bus.dispatch()
  File "/usr/local/lib/python3.5/dist-packages/charms/reactive/bus.py", line 434, in dispatch
    _invoke(other_handlers)
  File "/usr/local/lib/python3.5/dist-packages/charms/reactive/bus.py", line 417, in _invoke
    handler.invoke()
  File "/usr/local/lib/python3.5/dist-packages/charms/reactive/bus.py", line 371, in invoke
    subprocess.check_call([self._filepath, '--invoke', self._test_output], env=os.environ)
  File "/usr/lib/python3.5/subprocess.py", line 581, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/var/lib/juju/agents/unit-kubernetes-worker-0/charm/reactive/cuda.sh', '--invoke', b'install_cuda\n']' returned non-zero exit status 100
unit-kubernetes-worker-0: 14:19:12 ERROR juju.worker.uniter.operation hook "install" failed: exit status 1
unit-kubernetes-worker-0: 14:19:12 DEBUG juju.worker.uniter.operation lock released

So it looks like the GPU packages are bringing in additional dependencies that then fail to install or start. Do we really need the bluetooth package when we install GPU support?
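For triage, it can help to pull the list of packages that dpkg left unconfigured out of a log like the one above. The following is only a rough sketch: the helper name failed_packages is hypothetical and not part of the charm, and it assumes the log keeps apt's usual "Errors were encountered while processing:" marker followed by one indented package name per line.

```python
MARKER = "Errors were encountered while processing:"


def failed_packages(log_text):
    """Return the package names listed after apt's error summary marker.

    Hypothetical helper for reading hook logs; not part of the charm.
    """
    failed = []
    collecting = False
    for line in log_text.splitlines():
        if line.strip() == MARKER:
            collecting = True
            continue
        if collecting:
            # apt indents each failed package by one space; the first
            # non-indented (or empty) line ends the list.
            if line.startswith(" ") and line.strip():
                failed.append(line.strip())
            else:
                collecting = False
    return failed


# Excerpt copied from the log in this issue.
sample = """\
Errors were encountered while processing:
 bluez
 gnome-bluetooth
 gnome-user-share
 indicator-bluetooth
 unity-control-center
 unity-control-center-signon
E: Sub-process /usr/bin/dpkg returned an error code (1)
"""

print(failed_packages(sample))
```

Every package in that list is a desktop or bluetooth component, which is consistent with the guess that the failure comes from extra dependencies rather than from the GPU driver itself.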

@mbruzek mbruzek changed the title from The GPU enablement fails on localhost. to The GPU section fails kubernetes-worker on localhost (conjure-up) deployments Apr 4, 2017

Contributor

mbruzek commented Apr 4, 2017

I tried several things to work around this problem, such as removing the apt-get upgrade -yqq here, but got similar errors because bluetooth would not start in the LXD container. While debugging, I also found that we were unable to remove the nouveau package.

https://github.com/juju-solutions/layer-nvidia-cuda/blob/master/reactive/cuda.sh#L336

Invoking bash reactive handler: install_cuda
++ install_cuda
++ apt-get update -qq
++ apt-get remove -yqq --purge 'libdrm-nouveau*'
E: Error, pkgProblemResolver::Resolve generated breaks, this may be caused by held packages.
Traceback (most recent call last):
  File "hooks/install", line 19, in <module>
    main()
  File "/usr/local/lib/python3.5/dist-packages/charms/reactive/__init__.py", line 78, in main
    bus.dispatch()
  File "/usr/local/lib/python3.5/dist-packages/charms/reactive/bus.py", line 434, in dispatch
    _invoke(other_handlers)
  File "/usr/local/lib/python3.5/dist-packages/charms/reactive/bus.py", line 417, in _invoke
    handler.invoke()
  File "/usr/local/lib/python3.5/dist-packages/charms/reactive/bus.py", line 371, in invoke
    subprocess.check_call([self._filepath, '--invoke', self._test_output], env=os.environ)
  File "/usr/lib/python3.5/subprocess.py", line 581, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/var/lib/juju/agents/unit-kubernetes-worker-1/charm/reactive/cuda.sh', '--invoke', b'install_cuda\n']' returned non-zero exit status 100
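Since both failures happen inside an LXD container on a localhost deployment, one possible workaround would be to skip the CUDA install when the unit is running in a container. The sketch below is an assumption, not what the layer does: detect_container and should_install_cuda are hypothetical names, and the check relies on the systemd-detect-virt --container command, which exits 0 and prints the container type (for example lxc) when it finds one.

```python
import subprocess


def detect_container():
    """Return the container type from systemd-detect-virt, or '' if none.

    Hypothetical helper; returns '' when the tool is missing or when no
    container is detected.
    """
    try:
        result = subprocess.run(
            ["systemd-detect-virt", "--container"],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
        )
    except OSError:
        # systemd-detect-virt is not installed or cannot be executed.
        return ""
    # Exit status 0 means a container was detected; its type is on stdout.
    return result.stdout.strip() if result.returncode == 0 else ""


def should_install_cuda(container_type):
    """Assumed policy: install the driver stack only outside containers."""
    return container_type in ("", "none")


kind = detect_container()
if should_install_cuda(kind):
    print("not in a container: CUDA install would proceed")
else:
    print("running in a {} container: CUDA install would be skipped".format(kind))
```

Whether skipping is the right behaviour is a design question for the charm; a container with GPU passthrough configured might still want the user-space libraries without touching nouveau or the kernel driver.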

Member

tvansteenburgh commented Apr 4, 2017

Here's a link to a similar full log from @ktsakalozos: http://paste.ubuntu.com/24308435/
