Fix memory leak in qubes manager #860
Comments
marmarek added this to the Release 2.1 (post R2) milestone on Mar 8, 2015
marmarek added the bug, C: qubes-manager, P: minor labels on Mar 8, 2015
nvesely commented Apr 25, 2015
I'm getting 851M VIRT and 63792 RES in htop while Qubes Manager is iconified, and 923M VIRT and 68036 RES while open.
tasket commented Jun 30, 2016
On R3.2rc1, memory usage appears to grow in 4KB increments, at roughly 56KB per minute.
I wonder if moving to python 3 would solve or improve the issue.
As a quick remedy, perhaps QM could restart itself at set intervals if there has been no user input for at least N minutes?
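A rough sketch of what such an idle-triggered self-restart could look like for a PyQt4 application (the manager is PyQt4-based); the 30-minute threshold, the IdleRestartFilter class and the use of os.execv are illustrative assumptions, not anything Qubes Manager actually implements:

import os
import sys
from PyQt4 import QtCore

IDLE_RESTART_MS = 30 * 60 * 1000  # hypothetical "N minutes" threshold

class IdleRestartFilter(QtCore.QObject):
    # Restart the current process after a period with no keyboard/mouse input.
    def __init__(self, parent=None):
        super(IdleRestartFilter, self).__init__(parent)
        self._timer = QtCore.QTimer(self)
        self._timer.setSingleShot(True)
        self._timer.timeout.connect(self._restart)
        self._timer.start(IDLE_RESTART_MS)

    def eventFilter(self, obj, event):
        # Any user input resets the idle countdown; never swallow the event.
        if event.type() in (QtCore.QEvent.KeyPress,
                            QtCore.QEvent.MouseMove,
                            QtCore.QEvent.MouseButtonPress):
            self._timer.start(IDLE_RESTART_MS)
        return False

    def _restart(self):
        # Re-exec the same interpreter and arguments, dropping any leaked memory.
        os.execv(sys.executable, [sys.executable] + sys.argv)

# usage: app.installEventFilter(IdleRestartFilter(app)), where app is the QApplication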
tasket commented Jun 30, 2016
Also, closing the QM window appears to halt the increase, while re-opening it causes it to resume.
alex-mazzariol commented Jun 30, 2016
I profiled the memory usage of the Qubes Manager and found that the number of "dict" and "instancemethod" instances keeps increasing. Specifically, the instancemethod objects are all
<bound method QubesWatch._retry_register_watches of <qubes.qubesutils.QubesWatch object at 0x7f2bd40da2d0>>
while the ever-growing dictionaries are all
{'cb': <bound method QubesWatch._retry_register_watches of <qubes.qubesutils.QubesWatch object at 0x7f2bd40da2d0>>, 'opaque': <libvirt.virDomain object at 0x7f2bd40da710>}
The backreference graphs for these instances are nearly useless - maybe it's just me, I'm not that experienced at debugging Python, and being in dom0 does not really help.
Hope this can help, let me know if I may assist further.
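A minimal sketch of this kind of heap inspection using only the standard library; the exact profiling tool used above isn't named in the comment, so this is just one way to obtain comparable per-type counts (assuming Python 2, where bound methods show up as "instancemethod"):

import gc
import types

gc.collect()

# Count live objects per type name (only gc-tracked objects are seen).
counts = {}
for obj in gc.get_objects():
    counts[type(obj).__name__] = counts.get(type(obj).__name__, 0) + 1
for type_name, count in sorted(counts.items(), key=lambda kv: -kv[1])[:10]:
    print("%s: %d" % (type_name, count))

# Print a few bound methods to see which callback they wrap
# (in this case: QubesWatch._retry_register_watches).
methods = [o for o in gc.get_objects() if isinstance(o, types.MethodType)]
print("%d bound methods" % len(methods))
for m in methods[:3]:
    print(repr(m))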
marmarek (Member) commented Jul 1, 2016
The code responsible for this is here: https://github.com/QubesOS/qubes-core-admin/blob/master/core/qubesutils.py#L671-L799
But first, the timeout should automatically unregister itself after it fires (0.5 sec):
def _register_watches(self, libvirt_domain):
    (...)
    libvirt.virEventAddTimeout(500, self._retry_register_watches,
                               libvirt_domain)
    (...)

def _retry_register_watches(self, timer, libvirt_domain):
    libvirt.virEventRemoveTimeout(timer)
    self._register_watches(libvirt_domain)
And second, _register_watches shouldn't be called that often - even if some connection fails each time, it should just retry later (remove one timeout object, create another; none should be leaked). Can you track where, and for which domains, _register_watches is called that often?
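One way to answer that without editing qubesutils.py would be to wrap the two libvirt event calls and watch how many timers stay alive; this is only a sketch (the wrapper names are illustrative) and assumes it runs in the manager process before QubesWatch starts registering watches:

import libvirt

_orig_add = libvirt.virEventAddTimeout
_orig_remove = libvirt.virEventRemoveTimeout
_live_timers = set()

def _counted_add_timeout(timeout, cb, opaque):
    # Every registration should eventually be matched by a removal.
    timer = _orig_add(timeout, cb, opaque)
    _live_timers.add(timer)
    print("added timeout %d for %r (live: %d)" % (timer, opaque, len(_live_timers)))
    return timer

def _counted_remove_timeout(timer):
    _live_timers.discard(timer)
    print("removed timeout %d (live: %d)" % (timer, len(_live_timers)))
    return _orig_remove(timer)

libvirt.virEventAddTimeout = _counted_add_timeout
libvirt.virEventRemoveTimeout = _counted_remove_timeout

If the "live" counter keeps climbing, timeouts are being added faster than the 0.5 sec callback removes them.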
alex-mazzariol commented Jul 1, 2016
I added a little logging around the various calls to libvirt, like this:
def _register_watches(self, libvirt_domain):
    if libvirt_domain:
        name = libvirt_domain.name()
        if name in self._qdb:
            return
        if not libvirt_domain.isActive():
            return
        # open separate connection to Qubes DB:
        # 1. to not confuse pull() with responses to real commands sent from
        #    other threads (like read, write etc) with watch events
        # 2. to not think whether QubesDB is thread-safe (it isn't)
        try:
            self._qdb[name] = QubesDB(name)
        except Error as e:
            if e.args[0] != 2:
                raise
            try:
                with open("/tmp/qwatch_log", "a") as qwatchlog:
                    qwatchlog.write("Adding timeout for " + name + ", " + str(e) + "\n")
            except:
                pass
            libvirt.virEventAddTimeout(500, self._retry_register_watches,
                                       libvirt_domain)
            return
    else:
        name = "dom0"
        self._qdb[name] = QubesDB(name)
    try:
        with open("/tmp/qwatch_log", "a") as qwatchlog:
            qwatchlog.write("Registering for " + name + "\n")
    except:
        pass
    try:
        self._qdb[name].watch('/qubes-block-devices')
    except Error as e:
        if e.args[0] == 102:  # Connection reset by peer
            # QubesDB daemon not running - most likely we've connected to
            # stale daemon which just exited; retry later
            libvirt.virEventAddTimeout(500, self._retry_register_watches,
                                       libvirt_domain)
            return
    self._qdb_events[name] = libvirt.virEventAddHandle(
        self._qdb[name].watch_fd(),
        libvirt.VIR_EVENT_HANDLE_READABLE,
        self._qdb_handler, name)
and it logged something like this when starting qubes-manager:
Registering for netvm
Registering for firewallvm
Registering for personal
Registering for work
Registering for skype-1
Adding timeout for Domain-0, (2, 'No such file or directory')
Registering for dom0
Adding timeout for Domain-0, (2, 'No such file or directory')
Adding timeout for Domain-0, (2, 'No such file or directory')
Adding timeout for Domain-0, (2, 'No such file or directory')
Adding timeout for Domain-0, (2, 'No such file or directory')
Adding timeout for Domain-0, (2, 'No such file or directory')
Adding timeout for Domain-0, (2, 'No such file or directory')
Adding timeout for Domain-0, (2, 'No such file or directory')
Adding timeout for Domain-0, (2, 'No such file or directory')
Adding timeout for Domain-0, (2, 'No such file or directory')
Adding timeout for Domain-0, (2, 'No such file or directory')
Adding timeout for Domain-0, (2, 'No such file or directory')
I don't remember ever creating a VM with that name (Domain-0), and it looks like QubesDB cannot find it either. A quick find / -name Domain* in dom0 doesn't yield anything, either.
added a commit to marmarek/old-qubes-core-admin that referenced this issue on Jul 1, 2016
marmarek (Member) commented Jul 1, 2016
This is dom0. It looks like virsh list lists it as Domain-0.
Try this commit: marmarek/qubes-core-admin@6bc7733
alex-mazzariol commented Jul 1, 2016
In implementing your commit I added a check for libvirt_domain being a valid object before accessing the ID() method, and nothing else.
From what I can see, two watches are registered for dom0, but this does not seem to affect anything else; there is no more loop. Qubes Manager is running; I will collect new memory statistics in a few hours and report back.
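Based on the thread, the guard presumably looks something like the sketch below: libvirt reports dom0 as Domain-0 with ID 0, a name QubesDB doesn't know, so registration should fall through to the "dom0" branch instead of scheduling retry timeouts forever. This is a reconstruction from the comments, not the exact content of the commit:

def _register_watches(self, libvirt_domain):
    # dom0 appears in libvirt as "Domain-0" with ID 0; QubesDB only knows it
    # as "dom0", so treat it like the libvirt_domain=None case.
    if libvirt_domain and libvirt_domain.ID() != 0:
        name = libvirt_domain.name()
        # ... per-VM registration as before ...
    else:
        name = "dom0"
        self._qdb[name] = QubesDB(name)
    # ... rest of the method unchanged ...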
alex-mazzariol commented Jul 1, 2016
After applying the patch with the mentioned difference (a check for libvirt_domain before accessing ID()), there is no noticeable increase in any object count. This may fix the issue.
marmarek (Member) commented Jul 1, 2016
Other events worth checking (whether they cause an increase in object count):
- starting/stopping some VM (either from Qubes Manager, or independently)
- changing some VM setting
alex-mazzariol commented Jul 1, 2016
From quick estimates (looking at the counts I took while investigating), some other object types increase with user actions, but not nearly as much as the mishandling of dom0's name did.
I will produce more exact results later, but I think you can close the issue just by fixing your commit (adding a check for libvirt_domain being null) - user interactions do not leak this much memory!
Thanks!
marmarek closed this in marmarek/old-qubes-core-admin@597f9b3 on Jul 1, 2016
added a commit to marmarek/old-qubes-core-admin that referenced this issue on Jul 1, 2016
alex-mazzariol commented Jul 2, 2016
I'm back with some measurements, as @marmarek asked.
I counted the objects in the heap after two forced garbage collections, after performing an action and its reverse once and then three times, to look for object count increases. It's the same methodology I used for the other memory leak.
First the good news! Starting and stopping a VM thrice instead of once does not lead to any object count increase, so I think that's OK from this point of view.
Then the not-so-good news: changing settings six times (three actual modifications from the base state, and three reverts) instead of twice (one actual modification and a revert) does lead to leaked objects - not a big problem, though.
The settings I changed are the template and vcpu count for a vm, and nothing else. The last time I left the modified vcpu count, so I did not even open the "advanced" tab.
It seems that there were 7 more "dict" objects and 14 more "list" objects than when I changed the settings only twice. The dicts belonged to some PyQt4.QtGui.QMenu objects, plus some dicts with the keys "menu", "tray_notifier_type", "tray_notifier", "blk_menu" and "blk_manager", but I can't check the exact count.
The lists were... numbers. I think they may have something to do with the window layout, but cannot confirm. A lot of them were empty lists.
I don't think these leaks will lead to actual problems.
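The methodology described here could be scripted roughly as follows; the heap_snapshot helper and the placement of the action steps are illustrative, not the actual harness used:

import gc
from collections import Counter

def heap_snapshot():
    # Two forced collections, so cyclic garbage doesn't skew the counts.
    gc.collect()
    gc.collect()
    return Counter(type(obj).__name__ for obj in gc.get_objects())

base = heap_snapshot()
# ... perform the action and its reverse once ...
after_once = heap_snapshot()
# ... perform the action and its reverse until three repetitions in total ...
after_thrice = heap_snapshot()

# Types whose count keeps growing with extra repetitions are leak candidates.
for type_name, growth in (after_thrice - after_once).most_common(10):
    print("%s: +%d after one repetition, +%d after three" % (
        type_name,
        after_once[type_name] - base[type_name],
        after_thrice[type_name] - base[type_name]))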
marmarek (Member) commented Jul 12, 2016
Automated announcement from builder-github
The package qubes-core-dom0-3.2.5-1.fc23 has been pushed to the r3.2 testing repository for dom0.
To test this update, please install it with the following command:
sudo qubes-dom0-update --enablerepo=qubes-dom0-current-testing
marmarek added the r3.2-dom0-cur-test label on Jul 12, 2016
marmarek (Member) commented Jul 28, 2016
Automated announcement from builder-github
The package qubes-core-dom0-3.2.7-1.fc23 has been pushed to the r3.2 stable repository for dom0.
To install this update, please use the standard update command:
sudo qubes-dom0-update
Or update dom0 via Qubes Manager.
marmarek added the r3.2-dom0-stable label and removed the r3.2-dom0-cur-test label on Jul 28, 2016
added a commit to QubesOS/qubes-core-admin that referenced this issue on Nov 20, 2016
marmarek (Member) commented Nov 20, 2016
Automated announcement from builder-github
The package qubes-core-dom0-3.1.18-1.fc20 has been pushed to the r3.1 testing repository for dom0.
To test this update, please install it with the following command:
sudo qubes-dom0-update --enablerepo=qubes-dom0-current-testing
marmarek added the r3.1-dom0-cur-test label on Nov 20, 2016
marmarek (Member) commented Dec 4, 2016
Automated announcement from builder-github
The package qubes-core-dom0-3.1.18-1.fc20 has been pushed to the r3.1 stable repository for dom0.
To install this update, please use the standard update command:
sudo qubes-dom0-update
Or update dom0 via Qubes Manager.
marmarek commented Mar 8, 2015
Reported by marmarek on 24 May 2014 02:11 UTC
After about 3 weeks, Qubes Manager uses over 1GB of virtual memory (over 200MB resident). Its window was active (not iconified) almost all the time. Visible columns:
Migrated-From: https://wiki.qubes-os.org/ticket/860