Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use PSI (Pressure Stall Information) api in mempressure plugin #14

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

Karry
Copy link

@Karry Karry commented Aug 27, 2022

As discussed on the SFOS forum [1][2] and community meeting [3], cgroup memory usage metric is not ideal for indicating system memory pressure. Main reason is that this metric contains reclaimable memory (I believe). PSI (Pressure Stall Information) api provides better insight what actual memory pressure is.

I believe that together with switch from kernel lowmemorykiller module to some user-space lowmemory killer (using PSI api), this change will improve Sailfish memory management.

Merge request is ready for review but it would be better to squash it before merge...

  1. https://forum.sailfishos.org/t/mce-is-using-wrong-metric-to-properly-measure-ram-usage/12697
  2. https://forum.sailfishos.org/t/4-1-0-24-xperia-10-ii-unknown-memory-level-reported-by-mce/6856
  3. https://irclogs.sailfishos.org/meetings/sailfishos-meeting/2022/sailfishos-meeting.2022-01-20-08.00.html

@JacekJagosz
Copy link

JacekJagosz commented Aug 28, 2022

Could it work with the same config for each device, where 10 II and 10 III have lowmemorykiller set much more aggresively? >Also different levels of RAM usage? I guess it should still be better than the current broken behaviour of MCE?

Those are unrelated, MCE reports, the low memory killer kills. Please discuss elsewhere!

@Thaodan Thaodan requested a review from spiiroin August 28, 2022 06:53
@Thaodan
Copy link
Contributor

Thaodan commented Aug 28, 2022

Good work, this should work on any devices starting with Linux 4.4.

@Karry
Copy link
Author

Karry commented Feb 19, 2023

Hi sailors. Do you have any plan to look at SFOS memory management? I believe that using PSI api in mempressure mce plugin and some userspace lowmemorykiller is the right way to go. Do you agree? I, and possibly other members of community, are able to to provide some help in this area. But it would be great to let us know your plans!

@mlehtima
Copy link
Contributor

I think this idea is good but I probably wouldn't remove the old support because this new PSI only works with newer kernel and therefore would remove the support for older devices.

@lkraav
Copy link

lkraav commented Apr 2, 2023

Merge request is ready for review but it would be better to squash it before merge...

Is there a built binary that interested testers could experiment with?

@spiiroin
Copy link
Contributor

spiiroin commented Apr 4, 2023

Basically: memory pressure signaling should be somewhat in sync with oom killer triggering. So that reacting to memory pressure signals would have a chance to postpone / avoid oom killing. And IIRC the problem is that kernel/android/whatnot oom killer logic varies in sanity from one device type to another and in some instances no amount of mce side tuning helps (say, when "we have stuff in fs caches" is taken as need to kill something soon) -> oom killing logic should be brought under control first.

@JacekJagosz
Copy link

no amount of mce side tuning helps (say, when "we have stuff in fs caches" is taken as need to kill something soon) -> oom killing logic should be brought under control first.

That is true, but also right now what MCE is reporting back is very broken, so this feature is completely useless right now.
So the gain from fixing MCE would be limited until lowmemorykiller gets replaced as well, but it certainly wouldn't hurt for apps to know accurately the memory is running out.

@Karry
Copy link
Author

Karry commented Apr 16, 2023

@mlehtima :

... I probably wouldn't remove the old support because this new PSI only works with newer kernel and therefore would remove the support for older devices.

Not sure if older supported devices (Jolla C, Xperia 10, Xperia XA...) have PSI subsystem available/enabled, but keep in mind that current implementation is not working properly. Reporting "unknown" memory level will not make situation worse IMHO.

@lkraav

Is there a built binary that interested testers could experiment with?

Yes, I made test release of my fork, I just merged current upstream master.

just install mce with rpm:

rpm --replacefiles -if \
  /tmp/mce/mce-1.115.2-1.aarch64.rpm \
  /tmp/mce/mce-debugsource-1.115.2-1.aarch64.rpm \
  /tmp/mce/mce-debuginfo-1.115.2-1.aarch64.rpm \
  /tmp/mce/mce-tools-1.115.2-1.aarch64.rpm 

then restart mce:

systemctl daemon-reload
systemctl restart mce

you can watch mce log in journal:

journalctl -fe -u mce

memory status with dbus-monitor:

dbus-monitor --system sender=com.nokia.mce,interface=com.nokia.mce.signal,member=sig_memory_level_ind

to supress low-memory-killer:

echo "2560,2560,2560,2560,2560" > /sys/module/lowmemorykiller/parameters/minfree

when you wan to see memnotify debug log, stop mce service and run it on foreground with these arguments:

systemctl stop mce
/usr/sbin/mce -vvv --force-stderr 2>&1 | grep mem

@Thaodan
Copy link
Contributor

Thaodan commented Apr 17, 2023

@mlehtima :

... I probably wouldn't remove the old support because this new PSI only works with newer kernel and therefore would remove the support for older devices.

Not sure if older supported devices (Jolla C, Xperia 10, Xperia XA...) have PSI subsystem available/enabled, but keep in mind that current implementation is not working properly. Reporting "unknown" memory level will not make situation worse IMHO.

Jolla C, Xperia X would break anything else is fine.

If possible supporting both the old and new interface would be best.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants