-
-
Notifications
You must be signed in to change notification settings - Fork 10k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
treewide: Don't diverge from upstream default HZ settings on 4.19 #1831
Conversation
Compile/Runtime tested on: EdgeRouter Lite with #1799 |
Compile/Runtime tested on: TP-Link Archer C7v2 with #1635 |
Is that PR title (and commit title) wrong? Title says "Don't derive HZ from upstream", while the explanation and the code changes seem to delete OpenWrt fixed values and use the upstream Linux values... |
Most targets upstream use 250Hz or even 1000Hz by default while 100Hz is hardcoded in OpenWrt's default config. Use upstream default except for apm821xx which hardsets 1000Hz instead of platform default of 250Hz. Signed-off-by: Daniel Engberg <daniel.engberg.lists@pyret.net>
833fb17
to
7d8b34f
Compare
@hnyman |
This patch adds the disabled DRM_RADEON and DRM_AMDGPU config symbols from the x86' config to the generic target configs. The existing symbols in the x86' configs are kept for now, until we know whenever we want to remove such symbols or not (see Github PR #1831, #1825, #1828). Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
This patch adds more disabled DRM config symbols from the x86' config to the generic target configs. The existing symbols in the x86' configs are kept for now, until we know whenever we want to remove such symbols or not (see Github PR #1831, #1825, #1828). THis patch also contains a squashed patch from Daniel Engberg <daniel.engberg.lists@pyret.net> titled "kernel: Fix config for 4.14" which fixes a duplicated line added by: commit 8bdc241 ("x86: fix geode image builds") Fixes: 8bdc241 ("x86: fix geode image builds") Signed-off-by: Daniel Engberg <daniel.engberg.lists@pyret.net> Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
@chunkeey |
Where are you getting that "upstream platform default" from? If it's just from the Kconfig then this is probably not enough, since there are also defconfigs for the architectures. As for the current HZ_1000. It comes from Chris Blake's initial apm821xx tree he did for the MR24. It was the configuration that booted. |
"upstream platform default" refers to what the kernel (Kconfig) sets by default if you select platform X when you start from scratch without a config. |
sounds like you hit reply too soon. |
I'll admit I'm not very much into Linux kernel development so what I'm trying to accomplish is the default value if you do (MIPS in this case)
with upstream kernel sources and a "clean" config. |
Hm, do you know if the "upstream defaults" are any better for OpenWrt? I mean at it's core (i.e. in the generic target) OpenWrt still likes it's HZ_PERIODIC whereas upstream switched over to NO_HZ(_X) |
If that's the case we need to agree on how and what to test and stick to it Suggestions: OpenSSL (default benchmark tests) kernel latency (idlejitter) and softirqs (might be of interest?) can be monitored and mesured using netdata Not sure if we have more user-friendly tools available for performance monitoring or how deep we need to go. |
Ping |
FWIW, tested with and without |
Compile tested: ipq40xx, Linksys EA6350v3, OpenWrt master |
@hauke |
@diizzyy Based on what chunkeey's comment says, the next step would be to provide performance measurements. I don't think there has to be a greater plan on what to test, just do some that are appropriate in your opinion. |
@adrianschmutzler Looking at the commit history I could only find one commit that isn't very informative about the subject. As for the value itself it was hardset for kernel 2.6.39 (didn't bother to look further back) which is about 7 years ago. I think it's likely that it has been kept as kernel configs seems to have been reused for major versions "bumps" and not evaluated. |
@diizzyy I won't be able to review and merge this, I'm just commenting as I've seen this is stuck. The last comment from a core-developer was the clear direction to do performance measurements. From my side, I cannot judge what's better or worse in this case. But if you would say that option A leads to improvement by X over B, I would get the feeling that B should be changed into A. |
I didn't even get any feedback and given earlier attempts I doubt it's worth wasting time as it'll most likely end up with "you didn't benchmark thing X". Perhaps if someone else posts a patch of the same thing it'll get merged who knows... |
As OpenWrt devices are mostly used as routers, the HZ-value is important for the NAPI polling. The NAPI polling interval is hard-coded to 2 jiffies (correct me if I'm wrong). NAPI switches from interrupt-driven mode to polling mode when there is a high traffic demand (defined by the NAPI weight). The time of one jiffie is defined by the HZ-value. 100 Hz means one jiffie is 10 ms long. So the NAPI polling occurs every 20 ms. Increasing the HZ-value to 1000 Hz would lower this to 2 ms. The polling mode is only being used when there are more packets per 2 jiffies than the NAPI weight (correct me if I'm wrong) which is 64 in ag71xx. I remember I've read a commit message changing the NAPI weight, because Qualcomm said it was better, but actually someone should sit down and do the maths instead of relying on Qualcomm. Edit: Unfortunately I don't have the nerves to do this at the moment. Please have a look at https://lwn.net/Articles/687617/ |
Thanks! Pulled into my staging tree at https://git.openwrt.org/openwrt/staging/ldir.git |
@ldir-EDB0 Didn't you read my comment about this? |
That's an old mirror. I changed ag71xx to use 32 for NAPI weight. edit: in any case, changing upstream defaults should be justified. |
@neheb It is justified, I guess. Possibly it diverged from upstream in QSDK.
2 ms is far to low for polling mode, because it could create higher load on devices with low computation power (which most of the routers are). You need to imagine it like this: polling mode should be used on high traffic demand - latency is secondary then, because the router does not have enough computation power to guarantee both latency and throughput. I don't want to say these values make sense, but only that 2 ms might be too short rendering the polling mode kind of useless. |
@CodeFetch Yes I did read the comment and decided that committing and waiting for the screams in order to revert if required was better than having a PR hang around for another year. If it turns out we DO need to override upstream defaults then the requirement can be documented in a new commit. |
@ldir-EDB0 We have embedded routers here. Upstream defaults are for ordinary computers running with a more efficient CPU architecture and at a higher clock. If people notice their devices are unresponsive when there is a high traffic demand, I doubt that they'll know why... When you set the HZ value too high, this means that you might starve user-space's computation time. 2 ms means that there are only 2 ms for the user-space... I expect this to cripple everything but networking. |
Do note that not all sets 1000Hz by default and it doesn't seem to have much impact |
So, I just stumbled upon this PR. TL;DR: I believe this proposal was wrong on all accounts. Some datapoints:
NAPI and HZ are related only in that NAPI, being a polling system, needs to track elapsed time and does this by measuring elapsed timer ticks. Timer ticks are controlled by HZ. Increasing HZ to "improve NAPI" is a huge misunderstanding of NAPI's goals and implementation. In fact, it nearly entirely defeats its purpose. NAPI aims at reducing the CPU load servicing IRQs during packet storms. NAPI only engages on high traffic, precisely when the CPU is under load. Increasing HZ increases the IRQ "noise floor", since higher HZ directly translates in higher timer IRQ rate, which "distracts" the CPU from its main tasks. On CPU starved devices like low end routers, a high HZ will result in a higher CPU load:
The first LWN article posted above also clearly explain why the typical load encountered on a router would warrant a low HZ setting. If NAPI results in perceived latency, I think it would first need to be measured (I'd be more inclined to believe that the latency isn't due to NAPI but rather to CPU hogging anyway) and there are tunables better suited to adjust than fiddling with the timer tick. This change was proposed without any material evidence supporting a benefit to typical OpenWRT workload (and since I assume my opinion won't matter much, I will apply the same effort in providing evidence the change is a bad choice, although I suspect that evidence will materialize soon on the bugtracker and forums 😜). Assuming that upstream defaults are "better" or even "good for everyone" is absurd. These values need to be set, upstream offers some defaults that may be geared toward a specific workload (and most often for big metal). I certainly hope that the next proposal won't be to switch CONFIG_DEFAULT_IOSCHED back to upstream's CFQ, for instance 😛 HTH |
@f00b4r0 I totally agree with you. |
A lot of kernel code has changed in 10+ years including compilers so performance characteristics are most likely very different and worth noting is that all arguments (so far) are theorical without any actual performance benchmarks. Also worth keeping in mind is that depending on platform you may see very different performance characteristics so what applies to MIPS may not apply to ARM or x86 and vice versa. While you certainly can tune against specific workloads I think you can safely assume that mainline is tuned against "general" performance which however may or may not apply to your specific workload. I've used this on MIPS, PPC and ARM without any obvious performance regressions. I've actually seen a few improvements but your milage may vary however these haven't been proper isolated tests. |
With the "real HZ" debate out of the way, let's actually use the apm821xx's default upstream config file at arch/powerpc/configs/44x/bluestone_defconfig. From what I can tell, it sets NO_HZ (well, this platform was for NAS, so this is a bit unexpected) and remove any specific HZ_$VALUE symbol. Sadly, Daniel Engberg didn't run any before/after netperf tests, because it would have been such a slam dunk across the boards. In case of the apm821xx the tcp tx/rx performance improved ~14% (from 600Mbps to 700Mps). This now causes the emac to drop frames too, so let's see if this is causes more problems or not. This patch includes a refresh of the configuration too. Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Thank you for making these completely unsubstantiated (and plain wrong) claims. It shows clearly that you had no idea what you were talking about when you made this PR. I'll rest my case. |
Again, no one (either way if that wasn't obvious) have done any benchmarks of actual performance which I honestly don't see why you need to get upset about it especially since you "believe" which would imply that you aren't sure about your claims. Looking at recent commit history apm821xx |
For the record, I don't "believe" the cpu impact of increasing HZ, I know it. I don't "believe" how NAPI works either. And I don't understand your last comment about the recent commit. Your change in this PR is a NO-OP for apm821xx so if you claim your PR had a performance impact there, I'll assume it's thanks to a nice fairy. The commit you reference apparently switched apm821xx to full tickless kernel, which a completely different thing (and might actually be a good thing on some hardware for some workloads: this has performance implications as well, I won't detail it there it's not the place). Incidentally, full tickless operation means the timer tick can go as low as 1Hz. Are you arguing now that a lower timer frequency can be good for performance, like I did above? Anyway, as I said, I'm now convinced you don't know what you're talking about, and I'm done feeding this troll. This is a moot case. I suggest next you offer to move the generic configs to the default upstream IOSCHED as well (CFQ), without providing any numbers like you did here, and see what kind of reaction you get. Might be tuned for "general performance" too, and as we all know, a router definitely falls into the "general" category 😛 Cheers |
As far as I can tell no one as ever claimed that cpu load alone isn't impacted by changing HZ? I guess it could've been clearer that it's the end result that's interesting not whether top shows a few percents less or more. Most of the "trolling" goes back to the fact that there is no hard evidence (ie benchmarks) supporting either claim rather than falling back to very dated by now documentation. What you also seem to completely ignore is that I did offer to provide data however no one ever bothered to reply about that but then again you didn't even offer that so I don't see how that's going to provide any useful information making the discussion proceed regarding this matter. Either way it's committed and you're free to do a pull request if you think that it's inappropriate for this kind of application. |
#1831 (comment) And if you don't understand that if "top shows a few percent more" [wasted servicing timer interrupts] it means that e.g. wifi throughput will suffer on hardware where the CPU is the bottleneck, then I guess I made my point. I'm done here. |
@f00b4r0 What I'm trying to say is that you're doing assumptions which may not behave exactly the same as 10+ years ago. Not that it matters much by now (since it's already committed) but I guess I could provide some data using the mvebu platform using iperf3 and netdata as that's what I have at hand. |
Since this is merged and the discussion might be educating, but pointless, I suggest we make the best out it: What this commit effectively did was removed the global HZ setting for all targets. However, maybe it's better to deviate from that target-wise, like it's already done for several targets (and recently added by for apm821xx by @chunkeey , who followed the same strategy, if I understood him correctly) This will enable to aim for the specific architecture/performance of the target and also might make performance measurements more meaningful after all. |
Most targets upstream use 250Hz or even 1000Hz by default while
100Hz is hardcoded in OpenWrt's default config. Use upstream default
except for apm821xx which hardsets 1000Hz instead of platform default of
250Hz.
Signed-off-by: Daniel Engberg daniel.engberg.lists@pyret.net