domain-0 is using > 100% CPU time and VMs hang #4073

Open
stuart12 opened this Issue Jul 13, 2018 · 3 comments


Qubes OS version:

R4.0

Affected component(s):

domain-0


Steps to reproduce the behavior:

unknown

Expected behavior:

Machine remains usable

Actual behavior:

Some VMs appear to hang (terminals do not receive the focus for example) and xentop says that Domain-0 is using > 100% CPU (12% Memory). The fan becomes audible.

General notes:

If I wait long enough the stuck VMs come back to life (and CPU usage of Domain-0 drops). None of the logs in /var/log/xen/console contain anything obvious to me (except "Temperature above threshold").


Related issues:

Aekez Jul 13, 2018

@stuart12 I'm no expert, but I think if you provide a log of the exact PIDs eating up your system resources in dom0, it might give the experts good extra insight to follow up on.

By default, top should sort processes with the most CPU-hungry ones at the top.

Neither "top > output.txt" nor "script -c top output.txt" seems to save the top output correctly, but here is a workaround:

  1. Run top in dom0, just like you did with xentop.
  2. Then exit top (ctrl+c).
  3. The output from top should remain in the dom0 terminal; select it all with your mouse, right-click and choose "copy".
  4. Then in the dom0 terminal type "nano top-moment-print.txt" (or whatever you prefer to call it) and paste the copied output.
  5. Exit nano with (ctrl+x), followed by (y) to save and then (enter) to confirm the file name.
  6. Then in the dom0 terminal type "qvm-copy-to-vm targetVM /path/to/your/log.txt". If you created the file as in step 4, it'll be in your dom0 home folder.
  7. Then in the targetVM where you access GitHub, drag and drop the file into the GitHub comment field here to upload it, or click the blue "selecting them" link at the bottom of the comment field instead.

This way it should be clearer exactly which process is eating up your system resources in dom0.
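As an alternative to the copy-and-paste steps above, here is a minimal sketch that may be worth trying. It relies on top's batch mode ("-b", with "-n 1" for a single iteration), which procps top provides for writing plain text to a file or pipe; plain redirection most likely fails because top otherwise redraws an interactive screen. The function name capture_snapshot and the file name are only illustrative, and this has not been checked on a Qubes dom0.

```python
import subprocess


def capture_snapshot(path, cmd=("top", "-b", "-n", "1")):
    """Run cmd once and write its standard output to path.

    The default command asks top for one batch-mode iteration, which
    prints plain text instead of redrawing the terminal.
    """
    result = subprocess.run(
        list(cmd),
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode output as text (works on Python 3.5+)
        check=True,               # raise if the command exits with an error
    )
    with open(path, "w") as fh:
        fh.write(result.stdout)
    return result.stdout
```

Calling capture_snapshot("top-moment-print.txt") in dom0 should leave a file that can then be sent with qvm-copy-to-vm as in step 6.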

stuart12 Jul 16, 2018

Well, I'd love to copy and paste the output of top(1) into this ticket but the copy-and-paste from dom0 is not working at the moment. copypaste-from-dom0 worked last time that I tried it. Qubes OS is somewhat frustrating in that I don't know where to look when things stop working.

OK, so I used the qvm-copy-to-vm method and see:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
30023 root       0 -20       0      0      0 I  14.2  0.0   0:13.15 kworker/u9:4
30165 root       0 -20       0      0      0 I  13.9  0.0   0:08.60 kworker/u9:1
30255 root       0 -20       0      0      0 I  10.6  0.0   0:05.27 kworker/u9:0
30239 root       0 -20       0      0      0 I   8.3  0.0   0:06.07 kworker/u9:3
30070 root       0 -20       0      0      0 I   7.3  0.0   0:11.02 kworker/u9:2
30032 root      20   0       0      0      0 R   6.9  0.0   0:05.58 kworker/u8:2
30271 root      20   0       0      0      0 I   6.6  0.0   0:00.62 kworker/u8:3
24617 root      20   0       0      0      0 S   4.3  0.0   0:03.87 7.xvda-2
24618 root      20   0       0      0      0 S   3.3  0.0   0:03.84 7.xvda-3
22850 root      20   0  949452 261888 224992 S   2.6 13.0   4:39.41 Xorg
24615 root      20   0       0      0      0 S   2.3  0.0   0:04.14 7.xvda-0
23019 s.pook    20   0  536396  35376  25344 S   2.0  1.8   0:00.68 xfce4-terminal
24616 root      20   0       0      0      0 S   2.0  0.0   0:03.25 7.xvda-1
    7 root      20   0       0      0      0 S   0.7  0.0   0:00.72 ksoftirqd/0
22922 s.pook    20   0  121524   2252   1916 S   0.7  0.1   0:18.67 qrexec-client
30272 s.pook    20   0   48612   4420   3388 R   0.7  0.2   0:00.11 top

marmarek Jul 16, 2018

Member

It looks like domain 7 (see xl list) is heavily using its disk, which causes dom0 CPU usage for encryption etc. Is there too little memory in that domain, with something swapping heavily?
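To make that reading of the top output easier to repeat, here is a rough sketch that adds up the %CPU of the disk backend threads per domain. It assumes, based on the "7.xvda-2" style names in the listing above, that these threads are named "<domain id>.<device>-<queue>"; the function name cpu_by_domain and the regular expression are illustrative only. The sample rows are copied from the output posted earlier.

```python
import re
from collections import defaultdict

# A few rows copied from the top output posted in this issue.
TOP_SAMPLE = """\
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
30023 root       0 -20       0      0      0 I  14.2  0.0   0:13.15 kworker/u9:4
24617 root      20   0       0      0      0 S   4.3  0.0   0:03.87 7.xvda-2
24618 root      20   0       0      0      0 S   3.3  0.0   0:03.84 7.xvda-3
22850 root      20   0  949452 261888 224992 S   2.6 13.0   4:39.41 Xorg
24615 root      20   0       0      0      0 S   2.3  0.0   0:04.14 7.xvda-0
24616 root      20   0       0      0      0 S   2.0  0.0   0:03.25 7.xvda-1
"""

# Assumption: backend threads look like "<domid>.<device>-<queue>".
BACKEND_THREAD = re.compile(r"^(\d+)\.(xvd[a-z]+)-\d+$")


def cpu_by_domain(top_text):
    """Sum the %CPU column of disk backend threads, keyed by domain id."""
    totals = defaultdict(float)
    for line in top_text.splitlines():
        fields = line.split()
        # Skip the header and anything that is not a full process row.
        if len(fields) < 12 or not fields[0].isdigit():
            continue
        match = BACKEND_THREAD.match(fields[11])  # COMMAND column
        if match:
            totals[int(match.group(1))] += float(fields[8])  # %CPU column
    return {dom: round(cpu, 1) for dom, cpu in totals.items()}


print(cpu_by_domain(TOP_SAMPLE))  # {7: 11.9}
```

The domain id in the result can be matched against the ID column of xl list to find which VM is doing the disk I/O. The kworker threads are not attributed to any domain here, since their names do not say whose work they are doing.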
