
treewide: Don't diverge from upstream default HZ settings on 4.19 #1831

Closed
wants to merge 1 commit

Conversation

@diizzyy
Contributor

diizzyy commented Feb 18, 2019

Most targets upstream use 250Hz or even 1000Hz by default, while
100Hz is hardcoded in OpenWrt's default config. Use the upstream
defaults, except for apm821xx, which hard-sets 1000Hz instead of the
platform default of 250Hz.

Signed-off-by: Daniel Engberg <daniel.engberg.lists@pyret.net>

@diizzyy
Contributor Author

diizzyy commented Feb 18, 2019

Compile/Runtime tested on: EdgeRouter Lite with #1799

@chunkeey added the kernel (pull request/issue with Linux kernel related changes) label on Feb 18, 2019
@neheb
Contributor

neheb commented Feb 19, 2019

Compile/Runtime tested on: TP-Link Archer C7v2 with #1635

@hnyman
Contributor

hnyman commented Feb 19, 2019

@diizzyy

Is that PR title (and commit title) wrong?

The title says "Don't derive HZ from upstream", while the explanation and the code changes seem to delete OpenWrt's fixed values and use the upstream Linux values...

Most targets upstream use 250Hz or even 1000Hz by default while
100Hz is hardcoded in OpenWrt's default config. Use upstream default
except for apm821xx which hardsets 1000Hz instead of platform default of
250Hz.

Signed-off-by: Daniel Engberg <daniel.engberg.lists@pyret.net>
@diizzyy changed the title from "treewide: Don't derive HZ from upstream on 4.19" to "treewide: Don't diverge from upstream default HZ settings on 4.19" on Feb 19, 2019
@diizzyy
Contributor Author

diizzyy commented Feb 19, 2019

@hnyman
Fixed, thank you :-)

jow- pushed a commit that referenced this pull request Feb 28, 2019
This patch adds the disabled DRM_RADEON and DRM_AMDGPU
config symbols from the x86 config to the generic target
configs. The existing symbols in the x86 configs are kept
for now, until we know whether we want to remove such
symbols or not (see Github PR #1831, #1825, #1828).

Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
jow- pushed a commit that referenced this pull request Mar 2, 2019
This patch adds more disabled DRM config symbols from the
x86 config to the generic target configs. The existing
symbols in the x86 configs are kept for now, until we
know whether we want to remove such symbols or not
(see Github PR #1831, #1825, #1828).

This patch also contains a squashed patch from
Daniel Engberg <daniel.engberg.lists@pyret.net> titled
"kernel: Fix config for 4.14" which fixes a duplicated line
added by commit 8bdc241 ("x86: fix geode image builds")

Fixes: 8bdc241 ("x86: fix geode image builds")
Signed-off-by: Daniel Engberg <daniel.engberg.lists@pyret.net>
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
@diizzyy
Contributor Author

diizzyy commented Mar 23, 2019

@chunkeey
Any reason why apm821xx is 1000Hz instead of upstream platform default?

@chunkeey
Member

chunkeey commented Apr 3, 2019

Any reason why apm821xx is 1000Hz instead of upstream platform default?

Where are you getting that "upstream platform default" from? If it's just from the Kconfig, then this is probably not enough, since there are also defconfigs for the architectures.

As for the current HZ_1000: it comes from Chris Blake's initial apm821xx tree that he did for the MR24. It was the configuration that booted.

@diizzyy
Contributor Author

diizzyy commented Apr 3, 2019

"upstream platform default" refers to what the kernel (Kconfig) sets by default if you select platform X when you start from scratch without a config.

@chunkeey
Member

chunkeey commented Apr 3, 2019

"upstream platform default" refers to what the kernel (Kconfig) sets by default if you select platform X when you start from scratch without a config.

Sounds like you hit reply too soon.

@diizzyy
Contributor Author

diizzyy commented Apr 3, 2019

I'll admit I'm not very much into Linux kernel development, so what I'm trying to get at is the default value you end up with if you do (MIPS in this case):

export ARCH=mips
make defconfig
make menuconfig

with upstream kernel sources and a "clean" config.
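
A quick way to see what that default actually ends up being, after the steps above, is to grep the generated config (the output shown here is only illustrative and depends on the kernel version and selected platform):

grep -E '^CONFIG_HZ' .config
# e.g. CONFIG_HZ_250=y
#      CONFIG_HZ=250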

@chunkeey
Member

chunkeey commented Apr 3, 2019

Hm, do you know if the "upstream defaults" are any better for OpenWrt? I mean, at its core (i.e. in the generic target) OpenWrt still likes its HZ_PERIODIC, whereas upstream switched over to NO_HZ(_X) for many targets long ago.
If the HZ setting makes a difference, it should be measurable. Like, do you remember the ath9x caldata extraction thing? I only measured an order-of-magnitude improvement on the ipq40xx, whereas it was much, much more on ath79 (and according to wndr3700 owners it fixed the discovery bug). So, if you can measure that the HZ setting affects stuff like popular applications, LAN, WAN and WLAN throughput and latency in a positive way (you could also look at idle/load power consumption, or whatever tickles your fancy), then: sure! If there's something to be gained, I would definitely take the risk of potentially breaking some eggs.

@diizzyy
Contributor Author

diizzyy commented Apr 4, 2019

If that's the case we need to agree on how and what to test, and stick to it.
Preferably it should be something fairly easy to do and doable on most devices.

Suggestions:

OpenSSL (default benchmark tests)
time creation of tar.gz file using busybox
iperf3 tests over eth and wifi if possible (tx rx)
time transfer using ncat-ssl (nmap-suite)
timed transcoding of audio (multithreaded) using ffmpeg4 (package available)

Kernel latency (idlejitter) and softirqs (might be of interest?) can be monitored and measured using netdata.
Implement a simple ping (network latency) test using fping and netdata that runs in the background.

Not sure if we have more user-friendly tools available for performance monitoring or how deep we need to go; a rough command sketch for the iperf3/fping items follows below.
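
A minimal sketch of what the iperf3 and fping items could look like (the addresses, durations and intervals are placeholders, not tested values):

# on a LAN host: iperf3 -s
iperf3 -c 192.168.1.100 -t 60        # TCP throughput, tx
iperf3 -c 192.168.1.100 -t 60 -R     # TCP throughput, rx
fping -l -p 200 -Q 10 192.168.1.100  # continuous ping with a summary every 10 s, run in the background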

@diizzyy
Contributor Author

diizzyy commented Apr 13, 2019

Ping

@diizzyy
Contributor Author

diizzyy commented Apr 16, 2019

FWIW, tested with and without V=s

@diizzyy
Contributor Author

diizzyy commented Jul 8, 2019

Compile tested: ipq40xx, Linksys EA6350v3, OpenWrt master
Run tested: ipq40xx, Linksys EA6350v3, OpenWrt master

@diizzyy
Contributor Author

diizzyy commented Aug 5, 2019

@hauke
Any thoughts on this?

@adschm
Member

adschm commented Oct 31, 2019

@diizzyy Based on what chunkeey's comment says, the next step would be to provide performance measurements. I don't think there has to be a greater plan on what to test, just do some that are appropriate in your opinion.

@diizzyy
Contributor Author

diizzyy commented Oct 31, 2019

@adrianschmutzler
I could ask the exact same question the other way around; there's no information on why this value was chosen, from what I could find.

Looking at the commit history I could only find one commit that isn't very informative about the subject.
cbbfc5f

As for the value itself, it was hard-set for kernel 2.6.39 (I didn't bother to look further back), which is about 7 years ago. I think it's likely that it has been kept because kernel configs seem to have been reused across major version "bumps" and not re-evaluated.
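
For reference, one way to dig for where such a hard-coded value came from is a pickaxe search over the tree (the symbol and path below are illustrative, not a claim about where the setting actually lives):

git log -S 'CONFIG_HZ_100=y' --oneline -- target/linux/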

@adschm
Member

adschm commented Oct 31, 2019

@diizzyy I won't be able to review and merge this; I'm just commenting since I've seen this is stuck. The last comment from a core developer gave the clear direction to do performance measurements. From my side, I cannot judge what's better or worse in this case. But if you were to show that option A leads to an improvement of X over B, I would get the feeling that B should be changed into A.
So, why not just try that?

@diizzyy
Contributor Author

diizzyy commented Oct 31, 2019

I didn't even get any feedback, and given earlier attempts I doubt it's worth wasting time, as it'll most likely end up with "you didn't benchmark thing X". Perhaps if someone else posts a patch for the same thing it'll get merged, who knows...

@CodeFetch
Contributor

CodeFetch commented Feb 5, 2020

As OpenWrt devices are mostly used as routers, the HZ-value is important for the NAPI polling.

The NAPI polling interval is hard-coded to 2 jiffies (correct me if I'm wrong).

NAPI switches from interrupt-driven mode to polling mode when there is a high traffic demand (defined by the NAPI weight).

The time of one jiffy is defined by the HZ-value. 100 Hz means one jiffy is 10 ms long, so the NAPI polling occurs every 20 ms. Increasing the HZ-value to 1000 Hz would lower this to 2 ms.

The polling mode is only used when there are more packets per 2 jiffies than the NAPI weight (correct me if I'm wrong), which is 64 in ag71xx.

I remember I've read a commit message changing the NAPI weight, because Qualcomm said it was better, but actually someone should sit down and do the maths instead of relying on Qualcomm.
So we need someone to calculate which values make sense.
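
As a back-of-the-envelope starting point (assuming the 2-jiffy interval and the weight of 64 mentioned above; these numbers are illustrative, not measurements):

for hz in 100 250 1000; do
  awk -v hz=$hz 'BEGIN {
    jiffy = 1000 / hz          # jiffy length in ms
    poll  = 2 * jiffy          # 2-jiffy polling interval in ms
    rate  = 64 * 1000 / poll   # packet rate needed to keep NAPI in polling mode
    printf "HZ=%-4d jiffy=%.0f ms  poll interval=%.0f ms  ~%.0f pkt/s\n", hz, jiffy, poll, rate
  }'
done

So at 1000 Hz the polling mode only stays engaged at roughly ten times the packet rate that suffices at 100 Hz.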

Edit: Unfortunately I don't have the nerves to do this at the moment. Please have a look at https://lwn.net/Articles/687617/

@ldir-EDB0
Contributor

Thanks! Pulled into my staging tree at https://git.openwrt.org/openwrt/staging/ldir.git

@CodeFetch
Contributor

@ldir-EDB0 Didn't you read my comment about this?

@neheb
Contributor

neheb commented Mar 30, 2020

That's an old mirror. I changed ag71xx to use 32 for NAPI weight.

edit: in any case, changing upstream defaults should be justified.

@CodeFetch
Contributor

@neheb It is justified, I guess. Possibly it diverged from upstream in QSDK.

Increasing the HZ-value to 1000 Hz would lower this to 2 ms.

2 ms is far too low for polling mode, because it could create higher load on devices with low computation power (which most routers are). You need to imagine it like this: polling mode should be used on high traffic demand - latency is secondary then, because the router does not have enough computation power to guarantee both latency and throughput.

I don't want to say these values make sense, only that 2 ms might be too short, rendering the polling mode kind of useless.

@ldir-EDB0
Contributor

@CodeFetch Yes I did read the comment and decided that committing and waiting for the screams in order to revert if required was better than having a PR hang around for another year. If it turns out we DO need to override upstream defaults then the requirement can be documented in a new commit.

@ldir-EDB0 closed this Mar 30, 2020
@CodeFetch
Contributor

@ldir-EDB0 We have embedded routers here. Upstream defaults are for ordinary computers running a more efficient CPU architecture at a higher clock. If people notice their devices are unresponsive when there is high traffic demand, I doubt that they'll know why... When you set the HZ value too high, you might starve user space of computation time. 2 ms means that there are only 2 ms left for user space... I expect this to cripple everything but networking.

@diizzyy
Contributor Author

diizzyy commented Mar 30, 2020

Do note that not all targets set 1000Hz by default, and it doesn't seem to have much impact on userspace, looking at
https://passthroughpo.st/config_hz-how-does-it-affect-kvm/

@f00b4r0
Contributor

f00b4r0 commented Apr 10, 2020

So, I just stumbled upon this PR. TL;DR: I believe this proposal was wrong on all counts.

Some datapoints:

NAPI and HZ are related only in that NAPI, being a polling system, needs to track elapsed time and does this by measuring elapsed timer ticks. Timer ticks are controlled by HZ.

Increasing HZ to "improve NAPI" is a huge misunderstanding of NAPI's goals and implementation. In fact, it nearly entirely defeats its purpose. NAPI aims at reducing the CPU load of servicing IRQs during packet storms. NAPI only engages on high traffic, precisely when the CPU is under load. Increasing HZ raises the IRQ "noise floor", since a higher HZ directly translates into a higher timer IRQ rate, which "distracts" the CPU from its main tasks (a quick way to observe this is sketched after the list below).

On CPU-starved devices like low-end routers, a high HZ will result in higher CPU load:

  • further reducing CPU cycles available to process network packets and perform other tasks
  • increasing power usage for no reason
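
One way to eyeball that timer "noise floor" on a running device (the exact interrupt names differ per architecture, so treat this as a sketch):

grep -iE 'timer' /proc/interrupts; sleep 10; grep -iE 'timer' /proc/interrupts
# the per-CPU counter delta divided by 10 approximates the timer IRQ rate per second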

The first LWN article posted above also clearly explains why the typical load encountered on a router would warrant a low HZ setting. If NAPI results in perceived latency, I think it would first need to be measured (I'd be more inclined to believe that the latency isn't due to NAPI but rather to CPU hogging anyway), and there are tunables better suited to adjusting than fiddling with the timer tick.

This change was proposed without any material evidence supporting a benefit to the typical OpenWrt workload (and since I assume my opinion won't matter much, I will apply the same effort to providing evidence that the change is a bad choice, although I suspect that evidence will materialize soon on the bugtracker and forums 😜).

Assuming that upstream defaults are "better" or even "good for everyone" is absurd. These values need to be set; upstream offers some defaults that may be geared toward a specific workload (and most often for big metal). I certainly hope that the next proposal won't be to switch CONFIG_DEFAULT_IOSCHED back to upstream's CFQ, for instance 😛

HTH

@CodeFetch
Contributor

@f00b4r0 I totally agree with you.

@diizzyy
Contributor Author

diizzyy commented Apr 11, 2020

A lot of kernel code has changed in 10+ years, including compilers, so performance characteristics are most likely very different. Worth noting is that all arguments (so far) are theoretical, without any actual performance benchmarks. Also worth keeping in mind is that, depending on the platform, you may see very different performance characteristics, so what applies to MIPS may not apply to ARM or x86 and vice versa.

While you certainly can tune against specific workloads, I think you can safely assume that mainline is tuned for "general" performance, which may or may not apply to your specific workload.

I've used this on MIPS, PPC and ARM without any obvious performance regressions. I've actually seen a few improvements, but your mileage may vary; however, these haven't been proper isolated tests.

diizzyy referenced this pull request Apr 11, 2020
With the "real HZ" debate out of the way, let's actually
use the apm821xx's default upstream config file at
arch/powerpc/configs/44x/bluestone_defconfig. From what
I can tell, it sets NO_HZ (well, this platform was for
NAS, so this is a bit unexpected) and remove any specific
HZ_$VALUE symbol.

Sadly, Daniel Engberg didn't run any before/after netperf
tests, because it would have been such a slam dunk across
the boards. In case of the apm821xx the tcp tx/rx
performance improved ~14% (from 600Mbps to 700Mps).
This now causes the emac to drop frames too, so let's see
if this is causes more problems or not.

This patch includes a refresh of the configuration too.

Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
@f00b4r0
Contributor

f00b4r0 commented Apr 11, 2020

Thank you for making these completely unsubstantiated (and plain wrong) claims. It shows clearly that you had no idea what you were talking about when you made this PR. I'll rest my case.

@diizzyy
Contributor Author

diizzyy commented Apr 11, 2020

Again, no one (either way, if that wasn't obvious) has done any benchmarks of actual performance, so I honestly don't see why you need to get upset about it, especially since you "believe", which would imply that you aren't sure about your claims. Looking at the recent commit history, apm821xx did receive a performance boost from this change (it was hard-set at 1000Hz by default).

@f00b4r0
Contributor

f00b4r0 commented Apr 11, 2020

For the record, I don't "believe" the CPU impact of increasing HZ, I know it. I don't "believe" how NAPI works either. And I don't understand your last comment about the recent commit. Your change in this PR is a NO-OP for apm821xx, so if you claim your PR had a performance impact there, I'll assume it's thanks to a nice fairy.

The commit you reference apparently switched apm821xx to a full tickless kernel, which is a completely different thing (and might actually be a good thing on some hardware for some workloads: it has performance implications as well, which I won't detail here, it's not the place). Incidentally, full tickless operation means the timer tick can go as low as 1Hz. Are you arguing now that a lower timer frequency can be good for performance, like I did above?

Anyway, as I said, I'm now convinced you don't know what you're talking about, and I'm done feeding this troll. This is a moot case. I suggest next you offer to move the generic configs to the default upstream IOSCHED as well (CFQ), without providing any numbers like you did here, and see what kind of reaction you get. Might be tuned for "general performance" too, and as we all know, a router definitely falls into the "general" category 😛

Cheers

@diizzyy
Contributor Author

diizzyy commented Apr 11, 2020

As far as I can tell no one has ever claimed that CPU load alone isn't impacted by changing HZ? I guess I could have been clearer that it's the end result that's interesting, not whether top shows a few percent less or more. Most of the "trolling" goes back to the fact that there is no hard evidence (i.e. benchmarks) supporting either claim, rather than falling back on by-now very dated documentation. What you also seem to completely ignore is that I did offer to provide data, but no one ever bothered to reply about that; then again, you didn't even offer that, so I don't see how that's going to provide any useful information to move the discussion forward. Either way it's committed, and you're free to do a pull request if you think that it's inappropriate for this kind of application.

@f00b4r0
Contributor

f00b4r0 commented Apr 11, 2020

#1831 (comment)
You were asked to provide measurements. And then, instead of complying, you reversed the burden of proof - as you're doing again - by claiming there was no evidence the other setting was better. If that's not trolling, I don't know what is.

And if you don't understand that "top shows a few percent more" [wasted servicing timer interrupts] means that e.g. wifi throughput will suffer on hardware where the CPU is the bottleneck, then I guess I made my point. I'm done here.

@diizzyy
Contributor Author

diizzyy commented Apr 11, 2020

@f00b4r0
Perhaps you want to browse a bit further back.
#1831 (comment)
...to which I responded, as I didn't want to spend time doing benchmarks only to later find out that whatever applications were used weren't suitable for the task, and got no response.
#1831 (comment)

What I'm trying to say is that you're making assumptions which may not hold the same way as they did 10+ years ago.

Not that it matters much by now (since it's already committed), but I guess I could provide some data on the mvebu platform using iperf3 and netdata, as that's what I have at hand.

@adschm
Member

adschm commented Apr 12, 2020

Since this is merged and the discussion might be educating but pointless, I suggest we make the best out of it: what this commit effectively did was remove the global HZ setting for all targets. However, maybe it's better to deviate from that per target, like it's already done for several targets (and recently added for apm821xx by @chunkeey, who followed the same strategy, if I understood him correctly).

This will make it possible to aim at the specific architecture/performance of the target and might also make performance measurements more meaningful after all.
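
For illustration, such a per-target override is just the usual HZ symbols in the target's kernel config (the file name and value below are only an example, not a recommendation):

# target/linux/apm821xx/config-4.19 (example)
CONFIG_HZ_1000=y
CONFIG_HZ=1000

In practice a change like this would typically be made via make kernel_menuconfig so the dependent symbols stay consistent.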

Labels: kernel (pull request/issue with Linux kernel related changes), needs changes