packet loss stats #106

Closed
heistp opened this issue Mar 25, 2017 · 148 comments

heistp (Contributor) commented Mar 25, 2017

Loss stats and jitter are listed in the RRUL spec (https://www.bufferbloat.net/projects/bloat/wiki/RRUL_Spec/) but not available in Flent. I'd like to be able to see packet loss to compare the drop decisions made by different qdiscs, particularly on UDP flows.

It looks like this is actually a limitation of netperf, as I don't see packet loss available in the UDP_RR test results. And I understand that getting TCP loss could be challenging, since we'd have to get it from the OS somehow or capture packets, but isn't there a different utility that could be used for the UDP flows, one that would measure both RTT and packet loss? If not, perhaps one could be written. :)

I remember now that Dave was starting a twd project a while back, but it ended up being a bridge too far. What I'm thinking of is probably simpler, though I don't know if it's enough: UDP packets could be sent from each end (of a certain size, at a certain rate, TBD) with sequence numbers and timestamps, and the receiver could both count how many it didn't receive and send a response packet back to the client, so you have requests and responses being sent and received at each end. One-way and two-way delay could potentially be measured.
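Just to make that concrete, here's a rough sketch in Go of the kind of probe I'm picturing (the packet layout, names and sizes are all made up at this point, not any existing tool's format):

```go
// Hypothetical sketch: each UDP probe carries a sequence number and a send
// timestamp; the far end echoes it back, and the sender derives RTT from the
// echoed timestamp and loss from the sequence numbers that never come back.
package main

import (
	"encoding/binary"
	"fmt"
	"time"
)

type probe struct {
	Seq      uint64 // monotonically increasing sequence number
	SentNano int64  // sender's timestamp in nanoseconds
}

func (p probe) marshal() []byte {
	b := make([]byte, 16)
	binary.BigEndian.PutUint64(b[0:8], p.Seq)
	binary.BigEndian.PutUint64(b[8:16], uint64(p.SentNano))
	return b
}

func unmarshal(b []byte) probe {
	return probe{
		Seq:      binary.BigEndian.Uint64(b[0:8]),
		SentNano: int64(binary.BigEndian.Uint64(b[8:16])),
	}
}

func main() {
	// Simulate one round trip locally, just to show the accounting.
	sent := probe{Seq: 1, SentNano: time.Now().UnixNano()}
	wire := sent.marshal()    // client -> server
	echoed := unmarshal(wire) // server echoes the same bytes back
	rtt := time.Duration(time.Now().UnixNano() - echoed.SentNano)

	// Loss is simply sent minus received once the test ends (or times out).
	sentCount, rcvdCount := 200, 199
	loss := 100 * float64(sentCount-rcvdCount) / float64(sentCount)
	fmt.Printf("seq=%d rtt=%v loss=%.1f%%\n", echoed.Seq, rtt, loss)
}
```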

Suggestions?

heistp (Contributor, Author) commented Mar 26, 2017

Actually, one suggestion is D-ITG, which is already used for the VoIP tests. Just noticed that it can do both one-way and RTT tests, and return packet loss. That could possibly be used instead of netperf for the UDP flows.

dtaht (Collaborator) commented Mar 27, 2017

Yes, twd was a bridge too far at the time. I wanted something that was reliably real-time and capable of doing gigE well in particular, and writing twd in C was too hard. Now things have evolved a bit in these worlds (netmap, for example), and perhaps taking another tack entirely ( https://github.com/dtaht/libv6/blob/master/erm/doc/philosophy.org ) will yield results... eventually.

It still seems like punting the whole TCP (or QUIC?) stack to userspace, using raw sockets, or leveraging BPF (the last seems to have promise) would get me to the extreme I wanted.

heistp (Contributor, Author) commented Mar 27, 2017

Wow, that sounds very high end. That (especially your comment about a userspace TCP stack) triggered a couple of thoughts I've been having as I'm completing my second round of point-to-point WiFi tests:

  1. With my Flent runs, I'm testing what would happen if clients were connected directly to the backhaul at the backhaul's maximum rate, but in reality the client connections are the bottleneck, and several slower, varying-rate clients come together to eventually (and only sometimes) saturate the backhaul links. I think simulating this may produce a different response from AQM than straight-on saturating flows do (and perhaps make AQM look even better). A more accurate test would respect this, but then, as you suggested, either the test itself would need a TCP stack in it, or the client flows would need to go through virtual interfaces (maybe with netem? I don't know how else) to simulate changing rates, latency or loss for the individual flows that feed the RRUL test. This is, for me, a bridge too far right now, but it makes me hope that the results I'm producing are still useful somehow!

  2. The original spirit (and spec) of the RRUL test sounded like it envisioned some sort of "metric" summarizing the response of the system being tested under load. After spending many hours putting together my second round of Flent results, I yearn for a mostly automatic test that would produce relevant results without having to configure enumerations of individual tests. It doesn't need to be a single metric, but a minimal set of metrics representing response under load. I'm not sure if this can be done, but...

For one, I do know that rig setups can be extremely specific and variable (for me, it's all about Wi-Fi backhaul, which is vastly different from other setups), so I'm not proposing something that varies rig setups automatically, but maybe there could be an automatic test of sorts, after rig setup has occurred somehow, that runs in phases and ramps up until some limit. Two possible phases could be:

A) "no load" stuff that summarizes physical link characteristics (useful both for CPE devices with low numbers of users, or for just understanding the basics of backhaul and router links):

  • straight throughput in either and both directions, testing symmetry
  • unloaded one-way latency, RTT and jitter, for VoIP-like, videoconferencing-like, gaming-like and sparse UDP flows like DNS/NTP

B) response under load (useful for loaded CPE devices, and at higher connection counts for backhaul and routers):

  • TCP side: increasing numbers of real-world flows like:
    • conversational and temporary one-way flows, web-browsing-like and POP/IMAP-like
    • pulsed downloads (YouTube / streaming video)
    • aggressive, P2P or torrent-like flow swarms
  • UDP/ICMP side: increasing numbers of flows similar to part A; now how do those look?
  • diffserv markings (and not)

I don't mean to start summarizing the RRUL spec! So I'll stop; I only meant that maybe the process for testing RRUL could have just a minimal set of parameters, and the test program could automate the process of producing relevant results.

The only parameters to specify, in case the user wants to, might be "how far and how fast" (meaning for example how many simultaneous flows to go up to and how quickly to get there), which could be calculated based on either the results from phase A, "what it's probably capable of" or estimated during phase B based on "how it's going". The "how far to take the test" could be specified in case someone wants to push a link well beyond its limits, or wants to stop well short of them to get something quick.

Maybe this could just be a Flent "automatic" test?

So I know there's a place for a highly configurable "hard-core" tester that can hit 10 GigE and produce microsecond-level accuracy, but I think there may also be a place for such an automated, "good enough for many" test.

PS- Go 1.8 was released with GC improvements that bring pauses to "usually under 100 microseconds and often as low as 10 microseconds". I know that still might not be good enough for some tests, especially 10 GigE or microsecond-sensitive results (I'm starting to look at microseconds for VoIP tests as well), but it's getting better.

tohojo (Owner) commented Mar 28, 2017 via email

heistp (Contributor, Author) commented Mar 29, 2017

I noticed that. The VoIP tests were a little painful to get working. My Mac Mini G4 also had really bad clock drift, which at first produced some beautiful but useless delay curves. "adjtimex --tick 10029" got the system clock close enough so that ntp would agree to do the rest. Still, one-way delay can be off by up to a millisecond or so, depending on how ntp is feeling at the moment.

tohojo (Owner) commented Mar 29, 2017 via email

heistp (Contributor, Author) commented Mar 29, 2017

Thanks, I might try PTP, didn't know about it.

Anything depending on the old xinetd echo service is probably out, right? I guess you'd want a small, native standalone client and server? It's not as easy to find as I thought it would be.

tohojo (Owner) commented Mar 30, 2017 via email

heistp (Contributor, Author) commented Mar 30, 2017

Ok, as near as I can tell, netperf UDP_RR sends a packet, waits for a response, then sends another without any delay. If packets are lost, the test apparently stops, although it at least resumes now that I've built and installed 2.7.0 from source (after your tip in an email a while back).

I would think that, instead of stopping after not receiving a response, it should send another packet after some delay so the test doesn't stall. Perhaps the delay could be around 5x the current mean RTT (maybe computed over the last 5x-mean-RTT window of time, so it adapts to changes). That would need testing.
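Purely to illustrate that idea (the numbers and names are made up), the bookkeeping could be as simple as:

```go
// Illustrative only: track a running mean RTT and give up on a probe after
// roughly 5x that mean, instead of stalling the whole test on one lost packet.
package main

import (
	"fmt"
	"time"
)

type rttTracker struct {
	mean time.Duration
	n    int
}

func (t *rttTracker) add(rtt time.Duration) {
	t.n++
	// cumulative mean; a sliding window or EWMA would adapt faster to changes
	t.mean += (rtt - t.mean) / time.Duration(t.n)
}

func (t *rttTracker) timeout() time.Duration {
	if t.n == 0 {
		return time.Second // arbitrary fallback before any samples exist
	}
	return 5 * t.mean
}

func main() {
	var t rttTracker
	samples := []time.Duration{300 * time.Microsecond, 500 * time.Microsecond, 400 * time.Microsecond}
	for _, rtt := range samples {
		t.add(rtt)
	}
	fmt.Println("next probe times out after", t.timeout())
}
```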

I'm surprised that the UDP_RR test is actually that aggressive, that it sends continuously instead of at a fixed rate. It means that your UDP flows are in continuous competition with one another, as well as with the TCP flows, whereas something like VoIP sends at a fixed rate. Perhaps that's what you want for the benchmark.

So I'll write if I find anything...

tohojo (Owner) commented Mar 30, 2017 via email

heistp (Contributor, Author) commented Mar 30, 2017

Thanks, now I see where twd was headed and why. :)

dtaht (Collaborator) commented Mar 30, 2017 via email

heistp (Contributor, Author) commented Apr 7, 2017

I wrote a quick mockup in Go to see what's possible. Here's pinging localhost for 200 packets with standard 'ping':

200 packets transmitted, 200 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.044/0.150/0.224/0.026 ms

And using my 'rrperf' mockup, sending and echoing 200 UDP packets with nanosecond timestamps to localhost:

Iter 199, avg 0.388392 ms, min 0.096244 ms, max 0.527164 ms, stddev 0.075881 ms

Summary:

  • mean RTT is 150 microseconds for ping/ICMP, 388 microseconds for rrperf/UDP
  • stddev is 26 microseconds for ping/ICMP, 76 microseconds for rrperf/UDP

Do you think these stats are within the realm of acceptability for local traffic, and would you use this at all from Flent? If so, I could complete a latency (and, for that matter, throughput) tester pretty quickly in Go that outputs results to, say, JSON.

Basically it could just run multiple isochronous RTT tests simultaneously, specifying packet size, spacing and diffserv marking for each, along with multiple TCP flows, specifying direction and diffserv marking. As for results, I suppose it would have periodic samples from each flow and totals at the end. For the UDP flows, I could have packet loss and RTT, but not OWD, for now (maybe later).
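As a first guess, the results might serialize to something like the sketch below (every field name here is hypothetical, just reflecting the periodic-samples-plus-totals idea):

```go
// A guess at the JSON output shape: per-flow periodic samples plus totals.
// Nothing here is final; field names and units are placeholders.
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

type Sample struct {
	Time time.Time     `json:"time"`
	RTT  time.Duration `json:"rtt_ns"` // Duration marshals as nanoseconds
}

type UDPFlowResult struct {
	PacketsSent     int           `json:"packets_sent"`
	PacketsReceived int           `json:"packets_received"`
	LossPercent     float64       `json:"loss_percent"`
	MeanRTT         time.Duration `json:"mean_rtt_ns"`
	Samples         []Sample      `json:"samples"`
}

type Result struct {
	UDPFlows map[string]UDPFlowResult `json:"udp_flows"`
}

func main() {
	r := Result{UDPFlows: map[string]UDPFlowResult{
		"EF": {PacketsSent: 500, PacketsReceived: 499, LossPercent: 0.2,
			MeanRTT: 388 * time.Microsecond},
	}}
	out, _ := json.MarshalIndent(r, "", "  ")
	fmt.Println(string(out))
}
```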

I don't know what extra features are needed from netperf, but I suspect there can be more detail. :)

Notes / Caveats:

  • I did note that things don't change much using a single goroutine for both send and receive, so I don't think there's much of a measurable impact from goroutine scheduling.
  • Like we talked about, there is the 2+ MB statically linked executable, which can be reduced on some platforms with -s -w to ldflags, or there's upx for executable compression, but it isn't reliable on all platforms all the time in my experience.
  • As for throughput, I think saturating 1Gbit would be no problem, but I don't have 10Gbit to test with.

tohojo (Owner) commented Apr 8, 2017 via email

dtaht (Collaborator) commented Apr 8, 2017 via email

heistp (Contributor, Author) commented Apr 8, 2017

Ok, so I'll start with a latency-only, single-RTT test then. Keep things simple!

The client and server are separate, like netperf, as the server might end up a little smaller.

As for safe to expose, it's a given that it should be safe from buffer overflow problems (I'll avoid Go's 'unsafe' package). But beyond that, which of these is important:

  1. Challenge/response test with a fixed shared key to smoke test for legitimate clients. (I take this as the "three-way handshake".)
  2. Configurable limits on server for length of test, send interval, etc (basic DoS protection).
  3. Accounts / permissions with "request and grant" (I want to do this test, will you let me?)
  4. Invisibility to unauthorized clients (requires PING -D option #3).

I think #1 and #2 make sense to me and are "easy", but as for #3 and #4, I assume they're not needed now. This is something that might run on public servers and you want the server to be safe, but it's not something that needs to be run securely between trusted parties across the open Internet, right?

There's no way to prevent someone from writing a rogue client and hogging up resources, but we could stop random probes with #1, reduce the impact of any attacks with #2, and obviously lock things down more with #3 and #4, with more effort.
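To make #1 concrete, here's the kind of shared-key challenge/response I have in mind (purely illustrative; HMAC-SHA256 is just one possible choice, and all names are made up):

```go
// Illustrative sketch of #1: the server issues a random challenge, the client
// returns HMAC(sharedKey, challenge), and the server only starts a test if
// the MAC verifies. A rogue client that has the key can still connect.
package main

import (
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

var sharedKey = []byte("compiled-in default key")

func newChallenge() ([]byte, error) {
	c := make([]byte, 16)
	_, err := rand.Read(c)
	return c, err
}

func respond(key, challenge []byte) []byte {
	m := hmac.New(sha256.New, key)
	m.Write(challenge)
	return m.Sum(nil)
}

func verify(key, challenge, response []byte) bool {
	return hmac.Equal(respond(key, challenge), response)
}

func main() {
	challenge, err := newChallenge()
	if err != nil {
		panic(err)
	}
	resp := respond(sharedKey, challenge) // what a legitimate client sends back
	fmt.Println("accepted:", verify(sharedKey, challenge, resp))
}
```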

If all this sounds reasonable, I'll just put something together and welcome any critique...

dtaht (Collaborator) commented Apr 9, 2017 via email

heistp (Contributor, Author) commented Apr 9, 2017

Ok, understood.

Also, a compiled-in pre-shared key for the handshake that can be overridden from the command line is still easy to implement and would allow for restricted tests, if needed.
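Something along these lines is what I mean (hypothetical names; overriding a package-level string at build time with the -X link flag is standard Go):

```go
// Sketch of the override idea: a default key baked in at build time, which a
// -key command-line flag can replace at runtime.
package main

import (
	"flag"
	"fmt"
)

// defaultKey can be replaced at build time, e.g.:
//   go build -ldflags "-X main.defaultKey=something-else"
var defaultKey = "public-default"

func main() {
	key := flag.String("key", defaultKey, "pre-shared key for the handshake")
	flag.Parse()
	fmt.Println("using key:", *key)
}
```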

More later when something's ready...

tohojo (Owner) commented Apr 9, 2017 via email

heistp (Contributor, Author) commented Apr 10, 2017

Yep, I'll try to keep it to a single UDP port on the server (random on client).

There might be a bit of a delay as I still have to complete tests of Ubiquiti's stuff for FreeNet and finish the report and presentation by 5/1 (along with the day job :)

One of my main motivations for getting this test done asap though is WMM. After Dave's tip I did tests with WMM on and off (or at least avoided) and the results were really surprising. With WMM on, even when you do the default rrul (not rrul_be) test, latencies are 5-10x what they probably should be. Disable or bypass WMM and things look much, much better. So either:

  1. WMM is "bad" and to be avoided, particularly for higher numbers of diffserv marked flows, OR
  2. The netperf UDP_RR test, with its zero-delay back-and-forth (which arguably doesn't represent what you see in the real world), doesn't play well with WMM when marked with higher-priority diffserv markings like EF or others

Or something in between. Hopefully we can determine that soon.

PS- I finally did Chaos Calmer tests. LEDE's latency improvements under load are pretty staggering, particularly when it comes to dynamic rate drops. So hopefully the report I produce helps highlight the good work you guys are doing. :)

tohojo (Owner) commented Apr 10, 2017 via email

heistp (Contributor, Author) commented Apr 10, 2017 via email

heistp (Contributor, Author) commented Apr 10, 2017 via email

heistp (Contributor, Author) commented Apr 10, 2017 via email

heistp (Contributor, Author) commented Apr 10, 2017 via email

dtaht (Collaborator) commented Apr 10, 2017 via email

heistp (Contributor, Author) commented Apr 10, 2017 via email

tohojo (Owner) commented Sep 18, 2017

So any updates on any of this work? :)

-Toke

heistp (Contributor, Author) commented Sep 18, 2017

Funny you should ask...it was impossible to do anything over the summer, but in the last couple of weeks I've gotten close on the new latency tester. It took some time playing around with timer error, system vs monotonic clock values, and socket options, among other things (Windows might be mostly a lost cause on that). A few more things left to do, and I hope to update more soon...

tron:~/src/github.com/peteheist/irtt:% ./irtt -fill rand -fillall -i 10ms -l 160 -d 5s -timer comp -ts a.b.c.d
IRTT to a.b.c.d (a.b.c.d:2112)

                  RTT: mean=19.7584ms min=12.4221ms max=64.2016ms
   one-way send delay: mean=9.1482ms min=3.8003ms max=44.6247ms
one-way receive delay: mean=10.6096ms min=8.0872ms max=42.1626ms
packets received/sent: 498/499 (0.20% loss)
  bytes received/sent: 79680/79840
    receive/send rate: 127.8 Kbps / 127.7 Kbps
             duration: 5.209s (wait 193ms)
         timer misses: 1/500 (0.20% missed)
          timer error: mean=-997ns (-0.01%) min=-3.492683ms max=2.55169ms
       send call time: mean=84.8µs min=13.2µs max=180.4µs

tohojo (Owner) commented Sep 18, 2017 via email

tohojo (Owner) commented Nov 20, 2017 via email

heistp (Contributor, Author) commented Nov 20, 2017 via email

tohojo (Owner) commented Nov 20, 2017 via email

heistp (Contributor, Author) commented Nov 20, 2017 via email

tohojo (Owner) commented Nov 20, 2017 via email

heistp (Contributor, Author) commented Nov 20, 2017

:) Gut laugh, I know that feeling sometimes...

tohojo (Owner) commented Nov 20, 2017

Okay, testable code in the runner-refactor branch.

Ended up doing a fairly involved refactoring of how runners work with data, which is good, as the new way of structuring things makes a lot more sense in general; but it did mean I had to change the data format, so this can break in quite a few places. So testing is appreciated, both for running new tests and for plotting old data files.

flent-users commented Nov 20, 2017 via email

flent-users commented Nov 20, 2017 via email

heistp (Contributor, Author) commented Nov 21, 2017 via email

heistp (Contributor, Author) commented Nov 21, 2017 via email

heistp (Contributor, Author) commented Nov 21, 2017 via email

tohojo (Owner) commented Nov 21, 2017 via email

heistp (Contributor, Author) commented Nov 21, 2017 via email

tohojo (Owner) commented Nov 21, 2017 via email

heistp (Contributor, Author) commented Nov 21, 2017 via email

dtaht (Collaborator) commented Nov 21, 2017 via email

heistp (Contributor, Author) commented Nov 21, 2017

Trying to confirm how latency was being calculated before with the UDP_RR test. Looking at its raw output, I see that transactions per second is probably used to calculate RTT, with interim results like:

NETPERF_INTERIM_RESULT[0]=3033.41
NETPERF_UNITS[0]=Trans/s
NETPERF_INTERVAL[0]=0.200
NETPERF_ENDING[0]=1511296777.475

So RTT = (1 / 3033.41) ~= 330us

And this presumably reflects the mean over all transactions, summarized at the end of the interval, and the calculated latency is what was plotted in Flent?
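In code form, the conversion is just the inverse (assuming a single outstanding transaction at a time, which is how UDP_RR works):

```go
// The arithmetic from above: with one transaction in flight at a time, mean
// RTT over an interval is the inverse of netperf's transactions-per-second.
package main

import (
	"fmt"
	"time"
)

func rttFromTransPerSec(tps float64) time.Duration {
	return time.Duration(float64(time.Second) / tps)
}

func main() {
	fmt.Println(rttFromTransPerSec(3033.41)) // ≈ 330µs
}
```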

tohojo (Owner) commented Nov 21, 2017 via email

heistp (Contributor, Author) commented Nov 21, 2017 via email

tohojo (Owner) commented Nov 22, 2017 via email

heistp (Contributor, Author) commented Nov 22, 2017 via email

tohojo closed this as completed in 117b39c on Nov 22, 2017
tohojo (Owner) commented Nov 22, 2017

Right, so I convinced myself that I'd fixed most of the breakages in the refactor (which turned out to be a multi-thousand-line patch, but with a net reduction of 400 lines of code; not too bad), so I merged it and closed this issue.

Please open new issue(s) for any breakage that I missed. I'll open a new one specifically for using irtt for VoIP tests.

tohojo (Owner) commented Nov 22, 2017

Oh, and many thanks for your work on irtt, @peteheist! We really needed such a tool :)

heistp (Contributor, Author) commented Nov 22, 2017

Oh yeah, probably time for this issue thread to retire. :)

So I'm glad! Looking forward to playing with this more soon. Thanks for all that refactoring too; it looks like it was some real walking through walls...

tohojo (Owner) commented Nov 22, 2017 via email

dtaht (Collaborator) commented Nov 22, 2017 via email

tohojo (Owner) commented Nov 22, 2017

The OWD data is already being collected, so it's fairly trivial to add the plots...
