Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

synchronous IO (read) blocks entire GNOME-Shell (desktop) #496

Open
return42 opened this issue Jan 25, 2019 · 23 comments · Fixed by #497
Open

synchronous IO (read) blocks entire GNOME-Shell (desktop) #496

return42 opened this issue Jan 25, 2019 · 23 comments · Fixed by #497

Comments

@return42
Copy link
Contributor

return42 commented Jan 25, 2019

In extensions calling synchronous IO functions like Shell.get_file_contents_utf8_sync will block GNOME-Shell's main loop. This means nearly everything is blocked: mouse feels sticky, videos are stutter, keyboard-repeat lags and so on.

I can not say for sure, but I suspect issues (like) #433, #439, #417, #350, #368, #263, #235, #207, #202 and #187 are related to such synchronous IO.

I will try to get rid of these synchronous calls in my fork and come back.

UPDATE: Most of the issues have been closed in the past. Still open:

return42 added a commit to return42/handsOn that referenced this issue Jan 28, 2019
interim solution as long as [1] is open

[1] paradoxxxzero/gnome-shell-system-monitor-applet#496

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
@return42
Copy link
Contributor Author

return42 commented Feb 8, 2019

@chrisspen thanks for merging PR #497 ! But this was only the first step, there are some sync-IO calls left (especially Disk.refresh). As soon as I have time for this, I want to take care of it. So my question is, can we left this issue open for the first? / Thanks!

@chrisspen
Copy link
Collaborator

Sure, Github auto-closed the referenced PR when merged. I'll reopen.

@chrisspen chrisspen reopened this Feb 8, 2019
@pwaller
Copy link

pwaller commented Feb 13, 2019

@return42 - I just found myself here. Suddenly noticed my buttery smooth system was not behaving well, and happened to realise that what I did recently was enable the system monitor. Just did some investigating and found my way to #202 and this PR. So glad to see someone tackling it, the pauses are pretty horrendous on my powerful system, causing missed inputs and all sorts of things!

Just wanted to say thanks for working on this ❤️

@pwaller
Copy link

pwaller commented Feb 13, 2019

By the way, I just tried out the master branch which has #497 merged. The Jankiness is gone 🚀 , but now the CPU freq monitor doesn't work. It's just empty. If I show digits, there is nothing shown.

@return42
Copy link
Contributor Author

@pwaller sorry for delay - I just pulled master from origin but can't reproduce any issue with CPU freq. I'am sorry for not having any clue whats going wrong here. What the extension does to get CPU freq is very simple; it reads the vales from /sys/. Can you please check if you get some CPU freq values (kHz):

 $ cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq
 1374140
 1361529
 ...

And can you please check your GNOME shell version, here I tested with:

$ gnome-shell --version
GNOME Shell 3.28.3

@pwaller
Copy link

pwaller commented Feb 14, 2019

Yes, that does show frequencies. I'm get the same version number back.

@return42
Copy link
Contributor Author

hm, I do not have any clue .. very sorry .. I can only ask common places like; what dist you are using, can you reproduce this issue with another system and/or user login?

@return42
Copy link
Contributor Author

@pwaller if problem resist, can you please open a new issue like CPU freq monitor doesn't work any longer with current HEAD. So that others who have the same problem can join them. This may give us more information about this problem. Hint: most people will install the release and not the HEAD. We need more testers for current HEAD (before a new release is build) / thanks!

@pwaller
Copy link

pwaller commented Feb 20, 2019

The problem is now no longer manifesting. Thanks very much for fixing the lag. I've been using it for a while now with no obvious issues.

@return42
Copy link
Contributor Author

return42 commented Feb 20, 2019

These are good news ;) .. thanks a lot for giving feedback.

@roadrunner2
Copy link

@return42 Since e16e803 the cpu-frequency graph is pretty much useless (on my system at least), because it's almost always near max. This is very obvious with the #431 patches where the freq graph is scaled to min-max range. I think it may be an unfortunate timing thing, where the async read happens just late enough that it captures the frequency after it was temporarily ramped up because of the system-monitor activity (and for some reason the synchronous read is fast or slow enough to capture the frequency before or after the temporary spike). I come to this conclusion because monitoring the frequencies via a simple

 watch -n 1 cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq

shows the frequency at min most of the time.

Ideally we'd have some way of getting the average frequency since the last sample (or in the last second or something) rather than an instantaneous value, like for most (all?) of the other metrics, which would solve this issue. But I'm not aware of any way to do so.

@return42
Copy link
Contributor Author

@roadrunner2 thanks a lot for your evaluation.

But I'm not aware of any way to do so.

If so and if it is OK for you, can you please close PR #431 / Thanks!

@roadrunner2
Copy link

But I'm not aware of any way to do so.

If so and if it is OK for you, can you please close PR #431 / Thanks!

Sorry, but I don't understand why, unless you are planning to remove the cpufreq monitoring completely? #431 does not change this issue in any way (one of the patches in that pull just makes things more obvious, but shooting the messenger is not a good strategy IMO).

@roadrunner2
Copy link

A couple more things.

First, I ran across cpufreq stats (CONFIG_CPU_FREQ_STAT which provides a sys/.../cpugreq/stats/ directory) which could be used to get better stats, but that only seems to be available on some distros (e.g. enabled and available on Ubuntu, but on Fedora while enabled (config set) the stats directory is not present for some reason). See also cpufreq-stats.txt.

Second, a thought occurred to me and I did a little experimenting, and the following patch makes the cpufreq values appear much more sane again:

--- a/system-monitor@paradoxxx.zero.gmail.com/extension.js
+++ b/system-monitor@paradoxxx.zero.gmail.com/extension.js
@@ -1538,6 +1538,7 @@ const Freq = new Lang.Class({
             let num_cpus = (this.cpuid < 0) ? GTop.glibtop_get_sysinfo().ncpu : 1;
             let i = (this.cpuid < 0) ? 0 : this.cpuid;
             let file = Gio.file_new_for_path(`/sys/devices/system/cpu/cpu${i}/cpufreq/scaling_cur_freq`);
+            Mainloop.timeout_add(50, Lang.bind(this, function() {
                 file.load_contents_async(null, Lang.bind(this, function cb (source, result) {
                     let as_r = source.load_contents_finish(result);
                     total_frequency += parseInt(as_r[1]);
@@ -1549,6 +1550,8 @@ const Freq = new Lang.Class({
                         file.load_contents_async(null, Lang.bind(this, cb));
                     }
                 }));
+                return false;
+            }));
         } else {
             this.freq = this.fixed_freq;
         }

i.e. delay the reading of the current frequencies by a little bit.

Lastly, I noted another issue with the change to async calls: the this._apply() and this.char.update() calls in the base update function should really now be done from the async completion callbacks - the way it's currently set up, the values are now effectively updated one refresh/update cycle later.

@return42
Copy link
Contributor Author

Sorry, but I don't understand why, unless you are planning to remove the cpufreq monitoring completely? #431 does not change this issue in any way (one of the patches in that pull just makes things more obvious, but shooting the messenger is not a good strategy IMO).

Sorry, you got me wrong, this is not may aim. This issue here (#496) is about "synchronous IO" and all the issues we have about (see my first post in this thread), you mixed the discussion with the merge of your PR #431.

@roadrunner2
Copy link

Sorry, but I don't understand why, unless you are planning to remove the cpufreq monitoring completely? #431 does not change this issue in any way (one of the patches in that pull just makes things more obvious, but shooting the messenger is not a good strategy IMO).

Sorry, you got me wrong, this is not may aim. This issue here (#496) is about "synchronous IO" and all the issues we have about (see my first post in this thread), you mixed the discussion with the merge of your PR #431.

Apologies for the misunderstanding. I only mentioned #431 to say it made the problem introduced in this patch in this issue more obvious, that's all - I did/do not intend to mix the two at all, and we can stop mentioning that PR here.

@return42
Copy link
Contributor Author

Ok, thanks for clarifying. If we not talking about 431, may you can help me once more to understand you right. You wrote:

Since e16e803 the cpu-frequency graph is pretty much useless (on my system at least), because it's almost always near max.

Is it? .. You are the first mentioning that e16e803 breaks the cpu-freq graph display and I can't reproduce it (here the graph works fine) ... Is that possibly why I misunderstand you?

@return42
Copy link
Contributor Author

Argh, sorry, I'am an idiot, now I see, that the graph does not work any longer (sorry for having tomates on my eyes).

@return42
Copy link
Contributor Author

Today I checked cpu-frequency graph again and it works (at least for me).

Bildschirmfoto vom 2019-03-15 08-16-39

And yes, I am really an idiot, in my first test I tested with a black graph on a black background. My only consolation is: Idiots like me won't die out ;-)

@sdorheim
Copy link

I seem to have a similar problem. I use sshfs and when I loose network-connectivity, my entire gnome-shell freezes. I have singled out that the system-monitor-applet is to blame, as it only happens when it is active.
If I can provide more information, or test a fix, please tell, I'm happy to help.

@return42
Copy link
Contributor Author

@sdorheim correct, its system-monitor-applet and the blocking I/O when reading disks stats. Sorry, I do not have time to fix it (I'am not the maintainer, just send PRs from time to time).
A interim workaround for you might be disabling the Disk stat in the system-monitor-applet.

@sdorheim
Copy link

@return42 Thanks.
After some searching I found out how to disable the Disk/Stats. It wasn't clear that Setting Usage-Style to none, possible.

@vith vith mentioned this issue Apr 9, 2019
@Russell-Jones-OxPhys
Copy link

To others who may look at this bug: if your Thermal and/or Frequency graphs only show a maximum value and "NaN" when viewed as text, it may be #520 rather than this bug.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants