Feature: Knob-less QoS with fq_codel/cake and something like OpenWRT's "SQM" #505

Closed
obrienmd opened this issue Dec 7, 2015 · 30 comments
Labels: feature (Adding new functionality)

@obrienmd

obrienmd commented Dec 7, 2015

Given the amazing success of OpenWRT with fq_codel and SQM at beating back bufferbloat and providing low latency across flows without having to tune: http://www.bufferbloat.net/projects/codel/wiki

I'd like to move some of my company's pfsense boxes over to a distro that uses something like this. Right now IPFire (being linux-based) is able to do this pretty easily, but I would love to use OPNsense.

This seems seriously non-trivial to do in FreeBSD given the chatter in the pfsense community about this.

@AdSchellevis
Member

We're using a different system for traffic shaping and QoS (ipfw dummynet), which doesn't contain the codel algorithm.
There is a custom patch available for ALTQ/pf (which is in pfSense), but it won't match our codebase.

In ipfw/dummynet there also are some options for scheduling, which are not in our UI at the moment, but which should be a more logical approach in our case.
https://www.freebsd.org/cgi/man.cgi?ipfw%288%29#TRAFFIC%09SHAPER_%28DUMMYNET%29_CONFIGURATION

@obrienmd
Author

obrienmd commented Dec 7, 2015

Right - Would you give any consideration to fq_codel (not really the same as codel, see the bufferbloat.net link above), or cake? From my experience in all sorts of shaping / QoS systems, they are significantly ahead of the rest of the field in out-of-the-box effectiveness:

https://indico.uknof.org.uk/getFile.py/access?contribId=3&resId=0&materialId=slides&confId=27

@fichtner
Member

fichtner commented Dec 7, 2015

I really don't see the ALTQ/CODEL combination (it will be in FreeBSD 11 courtesy of pfSense) taking off, as ALTQ is (and most likely will remain) disabled in FreeBSD GENERIC. OpenBSD also removed ALTQ some time ago.

I've read the page you provided and a bit more on the topic (thanks btw). There seems to be dual-licensed code that could make its way into FreeBSD in another way. The work looks very promising. I also like the zero-config approach, although in theory this shouldn't be a subsystem, it should be a holistic switch that covers all traffic flowing through the box (or an interface). As such it may have side effects with an enabled traffic shaper, but it's better than having to deal with "either this or that, not both" scenarios. Or at least that's how the GUI should handle it, right? :)

@obrienmd
Author

obrienmd commented Dec 7, 2015

Honestly, I'm not sure - it's dual licensed but from what I've read (I'm no expert on kernels or low-level nets code) the port to FreeBSD is not easy.

With regard to the GUI side, having something like OpenWRT's SQM would be my personal ideal for my team's deployments.

@obrienmd
Author

obrienmd commented Dec 8, 2015

Looks like there is a Comcast-sponsored student working on getting fq_codel into FreeBSD dummynet:

http://lists.freebsd.org/pipermail/freebsd-net/2015-September/043443.html

@fichtner
Member

fichtner commented Dec 8, 2015

Great news indeed, this probably won't make it into FreeBSD 11.0 in time, but I'm sure we can backport or wait for 11.1. The OPNsense traffic shaper code is basically ready for this as is now.

@obrienmd
Author

obrienmd commented Dec 8, 2015

Very cool. It's hard to overstate just how impressive fq_codel w/ BFQ is in action - I highly recommend spinning up OpenWRT and seeing it in action with SQM limited to 95% of bandwidth.
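
The 95% figure is plain arithmetic on the measured line rate. A minimal sketch, assuming rates are given in kbit/s; the function name `sqm_rate` and the sample numbers are hypothetical and not taken from OpenWRT's SQM scripts:

```shell
#!/bin/sh
# Hypothetical helper: scale a measured line rate (kbit/s) down to a
# percentage of itself (default 95). Integer arithmetic, rounds down.
sqm_rate() {
    measured_kbit=$1
    percent=${2:-95}
    echo $(( measured_kbit * percent / 100 ))
}

# Example values only: a link measured at 100000 kbit/s down, 20000 kbit/s up.
echo "download shaper: $(sqm_rate 100000) kbit/s"
echo "upload shaper:   $(sqm_rate 20000) kbit/s"
```

The results would then be entered as the shaper's download and upload limits.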

@fichtner fichtner added the upstream Third party issue label Feb 16, 2016
@fichtner fichtner added this to the Future milestone Feb 16, 2016
@fichtner
Member

@obrienmd
Author

Wow! I'm impressed they cranked this out that quickly...

@fichtner
Member

We also have a test kernel based on the latest OPNsense code ;)

In case anyone wants to try it:

# opnsense-update -bkr 16.1.3-aqm && /usr/local/etc/rc.reboot

Here's the doc to operate AQM on the command line...

http://caia.swin.edu.au/freebsd/aqm/patches/README-0.1.txt

@obrienmd
Author

obrienmd commented Mar 1, 2016

Great job! Seems to work OK, very interested in how it gets integrated into GUI - take a look at OpenWRT sqm in luci, it's super simple and works great.

@dtaht

dtaht commented Apr 21, 2016

Applause! Benchmarks wanted. (try using https://github.com/tohojo/flent )

Is the -0.2 patch incorporated yet? That fixed a few problems, notably one with ecn handling.

@fichtner
Member

@dtaht not yet shipped, but it's in the repo already opnsense/src@fb03383

I will push another test build soon enough. Preliminary compile looked good yesterday

Thanks for the link, will benchmark what I can from here :)

@fichtner fichtner added feature Adding new functionality and removed upstream Third party issue labels Apr 21, 2016
@fichtner
Member

Test base/kernel with v0.2 is up for amd64:

# opnsense-update -bkr 16.1.9-aqm && /usr/local/etc/rc.reboot

AdSchellevis added a commit that referenced this issue Apr 25, 2016
@AdSchellevis
Member

Performed some simple tests to see how it works (using v0.1); thanks @dtaht for the tip about using flent.

All tests performed using the following command:

flent rrul -p all_scaled -l 60 -H hostname -t "Title" -o filename.png

Test 1: ipfw enabled, but not passed through dummynet
[Figure: bufferbloat_no_codel_no_dummynet]

Test 2: dummynet enabled, using default Weighted Fair Queueing (wf2q+)
[Figure: bufferbloat_no_codel]

Test 3: CoDel enabled, using defaults
[Figure: bufferbloat_codel]

Test 4: FQ-CoDel enabled, using defaults
[Figure: bufferbloat_fq_codel]
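
For anyone wanting to repeat the four runs, here is a rough sketch that prints one flent invocation per configuration, reusing only the flags from the command above. The function name `flent_cmd`, the host `hostname` and the labels are placeholders. The shaper still has to be reconfigured by hand between runs.

```shell
#!/bin/sh
# Print (do not run) a flent RRUL invocation for one test configuration.
# $1 = target host, $2 = label used for both the plot title and the file name.
flent_cmd() {
    host=$1
    label=$2
    printf 'flent rrul -p all_scaled -l 60 -H %s -t "%s" -o %s.png\n' \
        "$host" "$label" "$label"
}

# Placeholder labels matching the four tests listed above.
for label in no_dummynet wf2q codel fq_codel; do
    flent_cmd hostname "$label"
done
```

Because the commands are only printed, they can be reviewed first and then pasted one at a time once the matching shaper settings are active.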

fichtner pushed a commit that referenced this issue Apr 27, 2016
…rrent default (wf2q+) explicit. related to #505

(cherry picked from commit 6fbd2dc)
fichtner pushed a commit that referenced this issue Apr 27, 2016
@fichtner
Member

fichtner commented Apr 27, 2016

Initial work is delivered with 16.1.12 today. I'm going to close this ticket now.

The work will continue on our end, e.g. AQM v0.2 will be merged shortly after a bit more testing.

Feel free to discuss this ticket / its results further and add new tickets for individual improvements and bugs so we can track them independently.

Thank you all for your input, testing and help. :)

@dtaht

dtaht commented Apr 27, 2016

your "codel" result (test 3) doesn't make any sense. You should have seen 10-20ms latency on this test with pure codel. Also, you can turn off log scales when generating test results in flent....

@dtaht

dtaht commented Apr 27, 2016

also, @AdSchellevis were you explicitly shaping to 400mbit or trying to run at line rate? Certainly shaping eats a great deal of cpu, and we don't know how fast BSD boxes can actually forward packets at line rate at all at this point, no matter the aqm/fq technology in play.

(hint, you can produce a comparison test in flent-gui by loading up all the *.flent.gz files and selecting "Data->add other open files" - bar charts, etc.)

PS: If you could stick up your flent.gz files somewhere I could get them, I could do a more full writeup elsewhere (blog.cerowrt.org probably)

Thx VERY much for showing classic FQ result, also.

@AdSchellevis
Member

@dtaht I'm not sure why you expect 10-20ms; without codel it was around 10-20ms under stress and dropped to 2ms with codel enabled. Maybe I'm missing something or misinterpreting the reading.
While testing, I had 2 different pipes (1 up, 1 down) limited to the max line speed (1Gbps), so shaping could certainly have impacted my performance a bit.

I deleted the *.flent.gz files from my machine, the measurements are probably not the best in the world. I can rerun the tests later under approx the same circumstances and send them to you then.
Can you provide me with an email address to send the files (or a link) to?

Thanks for the hint, I will certainly try the flent-gui too.

@dtaht

dtaht commented Apr 28, 2016

From eyeballing the tests you only got 400mbit in both directions on the gbit link, when you could theoretically crack about 880 in both directions simultaneously on a switched network, with suitably fast clients/servers driving the test. Your first test without anything in play (again, from eyeballing, there's a bar chart and totals chart in the flent-gui that makes it easier to read) hit 680/600 or so (which is still well below theoretical). So what you were measuring was loss elsewhere in the stack. Probably. Sure! The end result looks good (I personally will take low jitter and latency all the time at some cost in bandwidth vs big spikes of throughput, high jitter/latency/loss), but...

Try shaping to 100mbit on both sides of the link to see a difference between codel and fq_codel.
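
A dry-run sketch of what such a comparison might look like on the ipfw command line. The `shape_cmds` helper and the pipe numbers are hypothetical, and the ipfw syntax is assumed from ipfw(8) with the dummynet AQM patches applied, so check it against the README linked earlier before running anything. The `ipfw add` rules that send traffic into the pipes are left out because they depend on the local interfaces.

```shell
#!/bin/sh
# Print (do not run) ipfw commands for one shaped direction.
# $1 = pipe/sched/queue number, $2 = bandwidth in Mbit/s, $3 = codel | fq_codel
# Assumed syntax; verify against ipfw(8) on the patched kernel.
shape_cmds() {
    n=$1
    bw=$2
    aqm=$3
    case $aqm in
        codel)
            # CoDel as queue management on the pipe, default scheduler.
            echo "ipfw pipe $n config bw ${bw}Mbit/s codel"
            ;;
        fq_codel)
            # Plain pipe for the rate limit, FQ-CoDel as the scheduler on it.
            echo "ipfw pipe $n config bw ${bw}Mbit/s"
            echo "ipfw sched $n config pipe $n type fq_codel"
            echo "ipfw queue $n config sched $n"
            ;;
        *)
            echo "unknown aqm: $aqm" >&2
            return 1
            ;;
    esac
}

# 100 Mbit/s in each direction: pipe 1 for upload, pipe 2 for download.
for aqm in codel fq_codel; do
    echo "# --- $aqm ---"
    shape_cmds 1 100 "$aqm"
    shape_cmds 2 100 "$aqm"
done
```

Running the same flent test once per printed configuration should make any difference between codel and fq_codel easier to see than at line rate.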

@AdSchellevis
Member

Yesterday I wasn't able to redo some testing, a bit too busy. I will try to run the 100Mbps test like you suggested next week (yes, my measurements were well below max, probably old / slow switches in between, a bit too busy to build a decent test setup ;) ).
If you drop me an email (ad at project domain), I will send you the results next week.

@dtaht

dtaht commented Apr 30, 2016

one result I think you are showing is that fq_codel tends to drop stuff sanely when router cpu is overloaded. ;) I look forward to your results and I dropped you an email a few minutes ago.

@RasoolAlSaadi

I would like to announce that we released Dummynet AQM version 0.2.1 (CoDel, FQ-CoDel, PIE and FQ-PIE), which includes important bug fixes. I highly recommend upgrading to this version.

@fichtner
Member

@RasoolAlSaadi thank you, added via opnsense/src@74aa1a1

@skarekrow
Contributor

So with the recent 16.1.14 does one just create a pipe with FlowQueue-CoDel as the scheduler and with Enable CoDel checked to use FQ-CoDel? Or does the Enable CoDel checkbox interfere with it?

Thanks for the great work guys!

@AdSchellevis
Member

@skarekrow you can either use FlowQueue-CoDel or use "Enable Codel" on another scheduling mechanism (for example the default wf2q+). I don't think the checkbox actually does anything on FlowQueue-CoDel

@skarekrow
Contributor

@AdSchellevis Ah, thank you sir :)

@fichtner fichtner modified the milestones: 16.7, Future Jul 23, 2016
@heri16

heri16 commented Nov 25, 2018

Is cake available yet? Seems to be a marked improvement over fq_codel @fichtner

@mimugmail
Member

Nope, I asked the devs of the CoDel implementation 2 months ago; there are no plans to implement it.

@dtaht

dtaht commented Nov 25, 2018 via email


8 participants