Summary
flatpak-update-worker processes launched with -u (update-packages) enter an infinite busy loop consuming ~46% CPU each when the parent mintUpdate process dies or closes the stdin pipe. The workers become orphaned (PPID 1) and spin indefinitely until manually killed.
Environment
- Linux Mint 22.3 (Zena)
- mintupdate 7.1.4
- mint-common 2.5.1
- flatpak 1.14.6-1ubuntu0.1
- Python 3.12.3
Bug
The bug is in message_from_updater() in flatpak-update-worker.py (line 312). When the parent closes the stdin pipe, read_bytes_async completes immediately with empty bytes (EOF). The if bytes_read: block is skipped, but read_bytes_async is re-scheduled unconditionally at line 312:

    def message_from_updater(self, pipe, res):
        if self.cancellable is None or self.cancellable.is_cancelled():
            return
        try:
            bytes_read = pipe.read_bytes_finish(res)
        except GLib.Error as e:
            if e.code != Gio.IOErrorEnum.CANCELLED:
                warn("Error reading from updater: %s" % e.message)
            return
        if bytes_read:
            message = bytes_read.get_data().decode().strip("\n")
            # ... handle message ...
        # BUG: this runs even when bytes_read is empty (EOF)
        pipe.read_bytes_async(4096, GLib.PRIORITY_DEFAULT, self.cancellable, self.message_from_updater)

Each EOF read completes immediately, the callback fires, and it re-schedules another read. This creates a tight loop that never blocks.
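The mechanism can be reproduced without GLib. The following is a minimal sketch, not mintupdate code: `PipeReader`, `simulate`, and the `rearm_on_eof` flag are hypothetical names, a plain `deque` stands in for the main loop, and `os.read` stands in for the async read. It assumes only that a read on a pipe whose write end is closed returns empty bytes immediately, which is the behaviour described above.

```python
import os
from collections import deque


class PipeReader:
    """Hypothetical model of a callback-driven pipe reader (not mintupdate code)."""

    def __init__(self, fd, rearm_on_eof):
        self.fd = fd
        self.rearm_on_eof = rearm_on_eof
        self.pending = deque()      # stands in for the main loop's ready queue
        self.messages = []
        self.callbacks_run = 0

    def schedule_read(self):
        self.pending.append(self.on_read_ready)

    def on_read_ready(self):
        self.callbacks_run += 1
        data = os.read(self.fd, 4096)   # returns b"" at once when at EOF
        if data:
            self.messages.append(data.decode().strip("\n"))
            self.schedule_read()
        elif self.rearm_on_eof:
            self.schedule_read()        # unconditional re-arm, as in the report
        # otherwise: EOF, nothing is re-armed and the queue drains

    def run(self, max_callbacks):
        # max_callbacks is a safety cap so the demo itself cannot spin forever.
        self.schedule_read()
        while self.pending and self.callbacks_run < max_callbacks:
            self.pending.popleft()()


def simulate(rearm_on_eof, max_callbacks=1000):
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b"start\n")
    os.close(write_fd)                  # the "parent" closes its end
    reader = PipeReader(read_fd, rearm_on_eof)
    reader.run(max_callbacks)
    os.close(read_fd)
    return reader.messages, reader.callbacks_run


buggy = simulate(rearm_on_eof=True)
stopping = simulate(rearm_on_eof=False)
print("re-arm on EOF:", buggy)
print("stop on EOF:  ", stopping)
```

With the re-arm enabled, the callback count runs all the way to the cap; with it disabled, the reader handles the one message, sees EOF once, and stops.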
Suggested fix
Move the re-schedule inside the if bytes_read: block and handle EOF explicitly:

        if bytes_read:
            message = bytes_read.get_data().decode().strip("\n")
            # ... handle message ...
            pipe.read_bytes_async(4096, GLib.PRIORITY_DEFAULT, self.cancellable, self.message_from_updater)
        else:
            # EOF: parent closed the pipe
            self.quit()

Evidence
Two orphaned workers were found running for over 24 hours, consuming ~46% CPU each.
I/O stats confirm a busy loop: ~1.18 billion read syscalls (syscr) against only ~7.8 MB ever returned (rchar), i.e. effectively nothing read per call:
rchar: 7834362
syscr: 1181281207
read_bytes: 65536
Sampling over 2 seconds showed ~4,600 syscalls/sec with no bytes transferred.
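For anyone wanting to repeat the measurement, here is a rough sketch of how such a sample could be taken from `/proc/<pid>/io` on Linux. The helper names (`parse_proc_io`, `read_syscall_rate`, `sample`) are made up for illustration, and this is not necessarily how the numbers above were collected.

```python
import time


def parse_proc_io(text):
    """Parse the 'key: value' lines of /proc/<pid>/io into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            stats[key.strip()] = int(value)
    return stats


def read_syscall_rate(before, after, seconds):
    """Read syscalls per second between two parsed snapshots."""
    return (after["syscr"] - before["syscr"]) / seconds


def sample(pid, seconds=2.0):
    """Return (read syscalls/sec, bytes returned by reads) over the interval.

    Needs Linux and permission to read /proc/<pid>/io for that process.
    """
    path = f"/proc/{pid}/io"
    with open(path) as f:
        before = parse_proc_io(f.read())
    time.sleep(seconds)
    with open(path) as f:
        after = parse_proc_io(f.read())
    rate = read_syscall_rate(before, after, seconds)
    return rate, after["rchar"] - before["rchar"]


# Usage (replace 12345 with the PID of a suspect worker):
#   rate, bytes_returned = sample(12345)
```

A high syscall rate combined with a byte delta of zero over the interval is the signature of the EOF loop.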
Thread states:
| Thread | State | Activity |
|---|---|---|
| Main (flatpak-update-) | R (running) | 100% of process CPU time |
| pool-spawner | S (sleeping) | Idle |
| gmain | S (sleeping) | Idle, poll with timeout=-1 |
| gdbus | S (sleeping) | Idle, poll with timeout=-1 |
| dconf worker | S (sleeping) | Idle |
| flatpak-update- (secondary) | S (sleeping) | Blocked on GIL (futex wait) |
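One way to gather the name and state columns of a table like this is to read `/proc/<pid>/task/<tid>/status` for each thread. The sketch below assumes Linux; `parse_thread_status` and `thread_states` are hypothetical helper names, and the Activity column would still need something like strace or GDB.

```python
import os


def parse_thread_status(text):
    """Extract (Name, State) from the contents of a /proc status file."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value.strip()
    return fields.get("Name"), fields.get("State")


def thread_states(pid):
    """Return a list of (tid, name, state) for every thread of a process."""
    task_dir = f"/proc/{pid}/task"
    result = []
    for tid in sorted(os.listdir(task_dir), key=int):
        try:
            with open(os.path.join(task_dir, tid, "status")) as f:
                name, state = parse_thread_status(f.read())
        except FileNotFoundError:
            continue  # thread exited between listdir() and open()
        result.append((int(tid), name, state))
    return result


# Usage (replace 12345 with the PID of a suspect worker):
#   for tid, name, state in thread_states(12345):
#       print(tid, name, state)
```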
GDB backtrace of main thread (both workers show the same pattern):
#0 __stpcpy_sse2_unaligned () at strcpy-sse2-unaligned.S:45
#1 g_strconcat () from libglib-2.0.so.0
#2-#4 libgio internals
#5 g_input_stream_read_async () from libgio-2.0.so.0
#6 g_input_stream_read_bytes_async () from libgio-2.0.so.0
#7-#9 libffi
#10-#12 gi._gi.cpython-312 (Python → GI call)
#13 _PyObject_MakeTpCall ()
#14 _PyEval_EvalFrameDefault () ← message_from_updater calling pipe.read_bytes_async()
#15-#16 gi._gi (completion callback from previous read)
#17-#18 libffi
#19-#22 libgio (dispatching read completion)
#23-#24 libgio (input stream internals)
#25 g_main_context dispatch
#26-#27 g_main_loop_run () from libglib-2.0.so.0
#28 gtk_main () from libgtk-3.so.0
The secondary flatpak thread is blocked waiting for the GIL, called from a g_signal_emit in libflatpak.so.0 — it can never acquire the lock because the main thread never yields.
Additional context from xsession-errors:
Multiple prior warnings suggest a broader issue with concurrent flatpak cache operations that may have contributed to the parent dying:
flatpak-update-worker (WARN): cache not valid, refreshing
flatpak-WARNING: Invalid checksum for indexed summary ...
mint-common (WARN): Could not update appstream for flathub: Error deploying appstream: Error moving file ... File exists
mint-common (WARN): Could not update appstream for flathub: Error deploying appstream: Error moving file ... Directory not empty
These "File exists" / "Directory not empty" errors appear dozens of times, suggesting a race condition when two workers concurrently update the appstream cache.