TEST ipq806x: add test patch to improve stability #11173
Add test patch to improve stability. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
@hnyman want to test?
Thanks.
I'm trying out your patch on my C2600 now. Off topic but semi-related: I was getting regular crashes after 1/2 days running master snapshots with the performance governor, but I'm not sure how to set up ramoops to see why it was crashing.
Well, at least 13 hours of uptime already. No problems so far. ;-) Just wanted to provide feedback that the patch clearly changes the CPU freq idle time behaviour: (Note the short flashing gaps in the graph data: master 5.15 until 20:15, then 22.03 for almost two hours, then at 22:00 master 5.15 including this PR. Then 10 minutes of varying traffic load testing with flent after flashing, followed by an idle night.)
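For anyone who wants to compare frequency residency without a graphing setup, here is a minimal sketch. It assumes the kernel's cpufreq statistics are enabled in the build, so that `/sys/devices/system/cpu/cpu0/cpufreq/stats/time_in_state` exists. It is not necessarily how the graph above was produced, and the function name `freq_residency` is made up for illustration.

```shell
#!/bin/sh
# Sketch: print the share of time spent at each CPU frequency.
# Each input line is expected to be "<frequency_kHz> <time_units>".
# The default path is an assumption: it only exists when cpufreq stats are enabled.

freq_residency() {
    file="${1:-/sys/devices/system/cpu/cpu0/cpufreq/stats/time_in_state}"
    awk '
        { freq[NR] = $1; t[NR] = $2; total += $2 }
        END {
            # Nothing to report for an empty file or all-zero counters.
            if (total == 0) exit 1
            for (i = 1; i <= NR; i++)
                printf "%d kHz %.1f%%\n", freq[i], 100 * t[i] / total
        }
    ' "$file"
}
```

Running `freq_residency` before and after flashing a build gives a rough text-only comparison. The counters accumulate since boot, so compare runs with similar uptime.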
Well yes, this is expected, but I first need to understand whether this is a solution before trying to reduce it... Currently we use a really expensive function to make a simple clk change. @motolav that is strange; maybe @hnyman can check whether ramoops can be introduced and whether there are any interesting crash logs.
Regarding ramoops, I have no experience with C2600, but as I packaged the pstore kernel modules with 97158fe, the fundamentals are there, and it just needs to be enabled for each device. For R7800 I then enabled it with two commits:
I assume that adding the same for C2600 might work. Likely/possibly the same memory address would work. @motolav might test it.
Thanks @hnyman, I've flashed my new build, and if it works I'll send a patch in. I was at 14.5 hours of uptime without any issue.
Note that you can also trigger it manually for testing:
echo c > /proc/sysrq-trigger
After the reboot you should have the crash log file in /sys/fs/pstore
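A small sketch of the check after the reboot, assuming pstore is mounted at the path mentioned above. The helper name `dump_pstore` is made up for illustration, and the names of the record files depend on the kernel's pstore backend.

```shell
#!/bin/sh
# Sketch: print every crash record found in the pstore directory.
# The directory defaults to /sys/fs/pstore and can be overridden for testing.

dump_pstore() {
    dir="${1:-/sys/fs/pstore}"
    found=0
    for f in "$dir"/*; do
        # The glob stays literal when the directory is empty or missing.
        [ -f "$f" ] || continue
        found=1
        echo "=== $(basename "$f") ==="
        cat "$f"
    done
    if [ "$found" -eq 0 ]; then
        echo "no pstore records in $dir"
        return 1
    fi
}

# To provoke a test crash first (this reboots the router immediately):
#   echo c > /proc/sysrq-trigger
# Then, after the reboot:
#   dump_pstore
```

If it reports no records after a deliberate crash, ramoops is probably not enabled for that device yet.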
Thanks for the help, the reboot did save a crash log.
@Ansuel
(Never seen that before in the 6 years with R7800, and there has been no change regarding rrdtool for ages.) full ramoops:
I just had a hard lockup after like 20 hours of uptime, no logs unfortunately
Hi, I had a reboot today on r20893. I have never had one on this build before. Latest master crashes more for me. Is there a how-to on getting the ramoops logs?
@tapper82 Which router are you using? If it's not an R7800 then you have to modify your DTS to be able to use ramoops.
Oops#1 Part1
Panic#2 Part1
Thanks @motolav
BTW I am not running this patch on my R7800.
@hnyman random question: can you disable devfreq and the krait cache driver in the config and try an image?
Can you give exact changes to be made? DTS or kernel config?
Just kernel config... setting the 2 flags as not set.
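As a sketch of what marking a kernel config symbol as "not set" looks like in practice: the helper below rewrites `CONFIG_<SYMBOL>=...` to `# CONFIG_<SYMBOL> is not set`, or appends that line if the symbol is absent. The helper name, the symbol names, and the file path in the usage comment are hypothetical; the thread does not name the two flags.

```shell
#!/bin/sh
# Sketch: mark kernel config symbols as "not set" in a config file.
# Usage: disable_kconfig <config-file> <SYMBOL>...

disable_kconfig() {
    cfg="$1"; shift
    for sym in "$@"; do
        if grep -q "^CONFIG_${sym}=" "$cfg"; then
            # Rewrite the existing assignment; use a temp file for portability.
            sed "s/^CONFIG_${sym}=.*/# CONFIG_${sym} is not set/" "$cfg" > "$cfg.tmp" &&
                mv "$cfg.tmp" "$cfg"
        elif ! grep -q "^# CONFIG_${sym} is not set" "$cfg"; then
            # Symbol not mentioned at all: record it explicitly as disabled.
            echo "# CONFIG_${sym} is not set" >> "$cfg"
        fi
    done
}

# Hypothetical symbol names and path, for illustration only:
#   disable_kconfig path/to/kernel-config EXAMPLE_DEVFREQ EXAMPLE_KRAIT_CACHE
```

After changing the config, the image has to be rebuilt for the change to take effect.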
My last crash might be relevant to these changes?
Had another crash overnight, with the same PC and LR as tapper82.
@Ansuel I have not seen a crash since I applied the "devfreq" disablement patch we discussed privately
@hnyman I wonder whether, temporarily, while I'm still searching for the cause of the wrong freq by experimenting with the original firmware, we can consider disabling the devfreq drivers? It will result in worse perf but at least better stability. Just to make sure: you only have the devfreq patch, right? Not the changes to krait-cc, right?
I would restore stability first, and think about performance afterwards.
Yes.
At least with my personal internet usage with a 200/100 connection, I see no actual performance hit in speedtests with SQM limits at 190/85:
Considering the minimal change, it looks sane to just disable the config. This would also confirm that the main problem on this platform is something badly configured around the cache clk.
@hnyman well, I disabled the driver for now, hoping I finally find the real culprit of all this mess...
Add test patch to improve stability.
One patch is a qsdk backport...
The other is a major rework with the idea of moving everything to the safe mux while scaling core clocks... From my limited testing, scaling looks to work correctly, but I wonder how it behaves under load.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>