Add net.isr.maxthreads, net.isr.bindthreads, net.isr.dispatch to Tunables and adjust defaults #5415

Closed
2 tasks done
amezin opened this issue Dec 18, 2021 · 20 comments
Labels: help wanted (Contributor missing / timeout), support (Community support)

@amezin

amezin commented Dec 18, 2021

Important notices

Before you add a new report, we ask you kindly to acknowledge the following:

Is your feature request related to a problem? Please describe.

  • Hardware: Compulab fitlet2 - Atom E3950 CPU, dual Intel gigabit NICs.
  • Gigabit (almost) PPPoE WAN.

When running https://www.speedtest.net/ to my ISP's server, I noticed that OPNsense shows 20-30% packet loss on WAN.

It turns out that all PPP packets are handled on only one CPU core (with my hardware and default OPNsense configuration), and the single-core performance of E3950 is just not enough.

The issue (and the solution) is described in the pfSense docs and in FreeBSD Bugzilla.

Describe the solution you like

I solved the issue by adding the following tunables:

  • net.isr.dispatch: deferred (hybrid works too)
  • net.isr.maxthreads: -1
  • net.isr.bindthreads: 1 (not sure if it's required, enabled just because it "makes sense")
  1. I think OPNsense UI should show these tunables by default.
  2. I think the default values for net.isr.maxthreads and net.isr.bindthreads should be adjusted.
  3. Maybe OPNsense should show a warning on PPP configuration page.
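
As a rough sketch, the three tunables listed above could be written as boot-time entries like the following. This assumes FreeBSD's loader.conf syntax (for example in /boot/loader.conf.local); in OPNsense the same names and values would normally be entered as tunables in the web UI instead. The values are the ones from this report, not tested recommendations.

```conf
# Boot-time netisr tunables (values taken from this report; adjust as needed)
net.isr.dispatch="deferred"   # "hybrid" reportedly works too; changeable at runtime
net.isr.maxthreads="-1"       # -1 = one netisr thread per CPU core
net.isr.bindthreads="1"       # bind each netisr thread to its own core
```

net.isr.maxthreads and net.isr.bindthreads only take effect after a reboot.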

net.isr.maxthreads: I think on a router/firewall it should be set to "all cores" (-1) by default. Also, as far as I understand, this tunable has no effect when net.isr.dispatch == direct (the default).

net.isr.bindthreads: I haven't tested its performance impact. But if a thread is created for every CPU core, I think it makes sense to bind every thread to its own core.

net.isr.maxthreads and net.isr.bindthreads can be set only on boot, so a good default value is a bit more important than usual.

net.isr.dispatch: I've read in multiple places that you should avoid changing it unless you absolutely need to. Still don't understand the details. But at least it can be changed without a reboot, so:

  • Maybe it should be "more visible" than a "tunable"? I.e., maybe it should be added to the "Interfaces: Settings" page, after the various hardware offload options?
  • Maybe a script could detect that PPPoE is used, together with an affected NIC, and set the tunable automatically?
  • Comments in netisr.c suggest that the dispatch policy can be set per-protocol. Maybe it could be set for PPP only? (although I haven't found out how)
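
Since net.isr.dispatch can be changed without a reboot, a runtime switch could look roughly like the sketch below. The helper name `set_dispatch` and the `DRY_RUN` variable are made up for illustration. The three policy names are the ones mentioned in this thread, and the real `sysctl` call needs root on the firewall.

```sh
#!/bin/sh
# set_dispatch POLICY: validate the policy name, then apply it with sysctl.
# Set DRY_RUN=1 to print the command instead of running it (hypothetical helper).
set_dispatch() {
    case "$1" in
        direct|hybrid|deferred) ;;
        *) echo "unknown dispatch policy: $1" >&2; return 1 ;;
    esac
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "sysctl net.isr.dispatch=$1"
    else
        sysctl net.isr.dispatch="$1"
    fi
}

# Example (as root):
#   set_dispatch deferred
```

Running `sysctl net.isr` before and after the change lists the current netisr values, which makes it easier to confirm what was actually applied.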
@AdSchellevis
Member

Optimal settings vary per situation (for example, enabling RSS: https://forum.opnsense.org/index.php?topic=24409.msg116941#msg116941).

Adding a paragraph on optimal settings for PPP-type interfaces to the documentation (https://github.com/opnsense/docs) might be a better option.

AdSchellevis added the support (Community support) label Dec 18, 2021
@amezin
Author

amezin commented Dec 27, 2021

That forum thread you've mentioned also suggests setting net.isr.maxthreads=-1 and net.isr.bindthreads=1. And also:

If RSS is enabled with the 'enabled' sysctl, the packet dispatching policy will move from ‘direct’ to ‘hybrid’.

So if you enable RSS, you effectively apply the configuration I've described here as well.

pfSense docs suggest that they have net.isr.maxthreads=-1 as the default too (not sure about bindthreads):

Tuning the values of net.isr.maxthreads and net.isr.numthreads may yield additional performance gains. Generally these are best left at default values matching the number of CPU cores, but depending on the workload may work better at lower values.

And, again, maxthreads and bindthreads don't seem to have any effect unless you also enable deferred/hybrid dispatch or RSS (or maybe something else), i.e. they have no effect if there are no threads.

So why not change the default? (I'm not asking for net.isr.dispatch, since that could cause regressions - only for maxthreads and bindthreads)

@fichtner
Member

I'm sure you can make this easier by sharing a comparison table across hardware with throughput and sysctl combinations. It’ll make changing the defaults that much easier. Thanks!

@lordraiden

I'm sure you can make this easier by sharing a comparison table across hardware with throughput and sysctl combinations. It’ll make changing the defaults that much easier. Thanks!

Or just use Google to find the evidence on a topic that has been beaten to death and documented in multiple forums.

@lordraiden

lordraiden commented Jan 11, 2022

@fichtner
Member

Yeah, not really. ;)

@lordraiden

Yeah, not really. ;)

Yeah

@fichtner
Member

So what’s the reason FreeBSD has this default and doesn’t listen to “authorities” on the subject?

@lordraiden

lordraiden commented Jan 12, 2022

So what’s the reason FreeBSD has this default and doesn’t listen to “authorities” on the subject?

Maybe it's because FreeBSD is not designed to be a firewall only, I don't know.

Man, you asked for documentation (all the settings are explained with pros and cons), evidence and tests. Now you have them; do as you wish.

If I was rude in my first comment, please accept my apologies.

@lordraiden

lordraiden commented Jan 12, 2022

In addition, @amezin was asking for the settings to be exposed in the web UI instead of being made the default, in case you see any risks. I think this will help people, but you have to judge how important it is.
85% of lines in my country (all but one operator) use PPPoE.

If you decide to review the information, there are more settings that could be exposed and would help performance.

@mimugmail
Member

@lordraiden it's nearly impossible to find something stable for all situations and scenarios with just Google, and since the operating system changes defaults over time (and CPUs get better), it's also quite hard to keep defaults up to date. It's similar to the discussion about IPsec defaults (which is way easier to handle). Usually you only change defaults when you know the impact 100%. IMHO there needs to be a reproducible testbed that compares speed tests for every tunable and shows the differences; then put this in the docs (like pf does too).

@lordraiden

@mimugmail then expose the settings in the UI, explain them based on the config file commented by Calomel, and let people choose.

Some settings are validated and documented by other commercial firewalls, so you are not jumping into the void.

@mimugmail
Member

It is already exposed: you add a tunable via the UI?! The rest is up to the docs.

@mimugmail
Member

mimugmail commented Jan 12, 2022

BTW, how lucky you are. Don't get me wrong: in Germany most users have PPPoE too, but speeds of 100 or 250 Mbit/s are still considered very high. Here at home I have a 100/50 fiber PPPoE line connected to a standard slow Celeron:
CPU: Intel(R) Celeron(R) CPU J1900 @ 1.99GHz (2000.05-MHz K8-class CPU)
This is my speedtest:

root@mimu-fw:~ # speedtest-cli
Retrieving speedtest.net configuration...
Testing from Deutsche Telekom AG (80.151.56.135)...
Retrieving speedtest.net server list...
Selecting best server based on ping...
Hosted by FoxHost (Falkenstein) [83.95 km]: 22.812 ms
Testing download speed................................................................................
Download: 104.00 Mbit/s
Testing upload speed......................................................................................................
Upload: 53.62 Mbit/s

I'm quite sure @fichtner gets the same values, and Deciso in the Netherlands doesn't use PPPoE. Really, don't get me wrong: changing defaults needs qualified testing. A reference to a site, no matter its reputation, shouldn't be enough for anyone to change such things.

I'm also sure most commercial firewall vendors won't let you change such things, as this would lead to very high support rates when people apply changes they found via Google and end up in disaster.

I remember a time long ago when I had a warning about a full disk in a central firewall manager. In a hurry I googled, and the first hit was to reinit the database. Blindly executing this command wiped the DB and also sent all firewalls an empty config :) (wasn't one of my best days, I guess)

@lordraiden

lordraiden commented Jan 12, 2022

In my country everyone but Orange uses PPPoE. 1 Gbps connections are available in any big city or town, although the standard is 300/600 Mbps. Some providers like Digi already have customers on 10 Gbps as a beta service, and the ONT and router provided by Orange when you buy the 1 Gbps service are already 2.5 Gbps capable.

@mimugmail
Member

Then most of the core devs are not able to reproduce this, which makes it even harder to change defaults.

IMHO the best way would be:

  • First, run iperf against an external iperf server, single stream, with a Linux firewall (e.g. IPFire) in between to check whether you reach full speed. If you get full speed at 3 different times of the day, this is good.
  • Second, run the same test at the same times with a default OPNsense installation and compare the results in a sheet.
  • Third, adjust the sysctls one by one, with a reboot in between each change, and add the results to the sheet.
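
The "sheet" in the steps above could be as simple as a CSV file. The following is a minimal sketch, not a finished benchmark harness: the function names `record_result` and `compare_results` and the CSV layout are made up for illustration. The throughput value is whatever you read off your iperf run, and labels must not contain commas.

```sh
#!/bin/sh
# record_result CSV LABEL MBPS: append one benchmark row to the sheet.
record_result() {
    csv=$1
    label=$2
    mbps=$3
    # Write the header once, when the file is missing or still empty.
    [ -s "$csv" ] || echo "timestamp,label,mbit_per_s" > "$csv"
    printf '%s,%s,%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$label" "$mbps" >> "$csv"
}

# compare_results CSV: print each row's throughput as a percentage of the
# first data row, which is treated as the baseline (must be non-zero).
compare_results() {
    awk -F, 'NR == 2 { base = $3 }
             NR >= 2 { printf "%s %.1f%%\n", $2, ($3 / base) * 100 }' "$1"
}

# Example workflow (replace <measured> with your own readings):
#   record_result results.csv default <measured>
#   ...change one sysctl, reboot, rerun iperf...
#   record_result results.csv dispatch-deferred <measured>
#   compare_results results.csv
```

Recording the baseline first keeps the comparison honest: every later row is expressed relative to the default installation, so a tunable that does nothing shows up as roughly 100%.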

It should be easy to see which knobs gain more throughput; these can be added to the docs first.
Then let this evolve over time and let others with less throughput or even other WAN uplinks test against these values. Then, at some point, defaults can be changed.

Just my 2 cents :)

@L1ghtn1ng

Just to add: the UK uses PPPoE as well; it is more common than you think. Maybe it would be an idea to ask people on Twitter and in the forum and point them to a survey to get a better understanding? Or, as @amezin mentioned, detect that PPPoE is used when you set your WAN interface to it and then apply the changes in question? Mind you, I'm not sure what these changes would do if you are using RSS. Just some food for thought, but docs for this would be a good start. If you pick PPPoE for your WAN interface, the UI could also show some help text pointing to the docs page in question, to help get better throughput with this connection type.

@OPNsense-bot

This issue has been automatically timed-out (after 180 days of inactivity).

For more information about the policies for this repository,
please read https://github.com/opnsense/core/blob/master/CONTRIBUTING.md for further details.

If someone wants to step up and work on this issue,
just let us know, so we can reopen the issue and assign an owner to it.

OPNsense-bot added the help wanted (Contributor missing / timeout) label Jun 16, 2022
@brad0

brad0 commented Mar 8, 2023

Just to add the UK uses PPPoE as well, it is more common than you think.

In Canada as well with Bell Canada with GPON / XGS-PON too.

@Zerophase

Zerophase commented Jun 1, 2024

I did not formally bench it, but when I was tuning my Dec850 for throughput, turning these three settings on seemed to significantly improve boot speeds.
