Simple Forward with Token Bucket NF #199
Conversation
CI Message: Your results will arrive shortly
In response to PR creation
CI Message
Run successful see results:
✔️ PR submitted to develop branch
✔️ Linter passed
@onvm test this please!
CI Message: Your results will arrive shortly
@onvm test this please!
CI Message
Run successful see results:
✔️ PR submitted to develop branch
✔️ Pktgen performance check passed
✔️ Speed Test performance check passed
✔️ Linter passed
[Results from nimbnode23]
Median TX pps for Pktgen: 7705634
Performance rating - 100.07% (compared to 7700000 average)
Median TX pps for Speed Tester: 42169402
Performance rating - 100.40% (compared to 42000000 average)
I wonder if this whole process of dequeuing a large burst of packets from
the Rx ring and then processing it in the leaky bucket is the most
appropriate way to design a leaky bucket AQM mechanism...
On Mon, Mar 30, 2020 at 9:45 AM, Dennis Afanasev commented on this pull request, in examples/simple_fwd_tb/forward.c:
> +#include <stdlib.h>
+#include <string.h>
+#include <sys/queue.h>
+#include <unistd.h>
+
+#include <rte_common.h>
+#include <rte_ip.h>
+#include <rte_mbuf.h>
+#include <rte_cycles.h>
+
+#include "onvm_nflib.h"
+#include "onvm_pkt_helper.h"
+
+#define NF_TAG "simple_fwd_tb"
+
+#define PKT_READ_SIZE ((uint16_t) 100000000) // TODO: Find proper value for PKT_READ_SIZE
That value, RTE_MP_*_DESC_DEFAULT appears to be related to the port
configuration. NF_QUEUE_RINGSIZE is the size of the RX/TX ring. You may
either check the size and dequeue all packets up to that maximum, or just
set the dequeue burst parameter to the ringsize. I think the latter
approach is more efficient.
--
K. K. Ramakrishnan
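Concretely, the suggestion in the quoted comment could be sketched as below: size the dequeue burst to the ring rather than an arbitrary constant. `NF_QUEUE_RINGSIZE` is given a stand-in value here, since the real one comes from ONVM's headers. Note also that the original constant (100000000) does not fit in a `uint16_t`, so the cast silently truncates it to 57600.

```c
#include <stdint.h>

/* Stand-in for ONVM's ring-size macro (the real value comes from the
 * ONVM headers); used only to illustrate the suggested change. */
#ifndef NF_QUEUE_RINGSIZE
#define NF_QUEUE_RINGSIZE 16384
#endif

/* Dequeue burst sized to the ring, instead of an arbitrary constant
 * that overflows uint16_t. */
#define PKT_READ_SIZE ((uint16_t)NF_QUEUE_RINGSIZE)
```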
I could think of 3 possible ways of implementing the token bucket:
1. Dequeue pkt by pkt based on availability of tokens - This way, we wouldn't be able to apply a dynamic depth constraint; it would be fixed to the size of the ring. Also, we'd have to preemptively dequeue a pkt to know its size if we treat a token as equivalent to a byte instead of a pkt.
2. Burst dequeue and process pkts based on availability of tokens - This way, we can apply depth constraints, since we know how many pkts were in the queue (we dequeue all available pkts). However, the set of packets dropped would differ from the one-by-one approach.
3. Maintain a separate queue apart from the ring to simulate a traditional token bucket - This would involve launching two threads, one for Rx and one for Tx, since it would be hard for a single thread to dequeue from the rx_ring at a constant rate while generating tokens and enqueuing to the tx_ring based on their availability. Maintaining two threads comes with the cost of locks on the queue, which could lead to poor performance.
Considering the above points, I went ahead with the second approach as mentioned above. Does that look fine? Thanks and Regards.
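The second approach (burst dequeue, then process packets while tokens last) can be sketched with plain-C stand-ins for the DPDK calls. The struct and function names below are illustrative, not taken from the PR; one token is treated as one byte, as discussed in option 1.

```c
#include <stdint.h>

/* Hypothetical token-bucket state; names are illustrative. */
struct token_bucket {
    uint64_t tokens; /* current tokens, one token == one byte */
    uint64_t depth;  /* bucket capacity in bytes */
    uint64_t rate;   /* refill rate in bytes per second */
};

/* Refill the bucket for `elapsed_ns` nanoseconds, capped at depth. */
static void
tb_refill(struct token_bucket *tb, uint64_t elapsed_ns)
{
    uint64_t add = tb->rate * elapsed_ns / 1000000000ULL;
    tb->tokens = (tb->tokens + add > tb->depth) ? tb->depth
                                                : tb->tokens + add;
}

/* Process a dequeued burst: forward packets while tokens last.
 * Returns how many of the `n` packet lengths were forwarded; the
 * remainder stays behind (stalled, not dropped). */
static unsigned
tb_process_burst(struct token_bucket *tb, const uint32_t *pkt_lens, unsigned n)
{
    unsigned sent = 0;
    while (sent < n && pkt_lens[sent] <= tb->tokens) {
        tb->tokens -= pkt_lens[sent];
        sent++;
    }
    return sent;
}
```

In the real NF, `tb_refill` would be driven by `rte_get_tsc_cycles()` deltas and `tb_process_burst` would operate on the mbufs dequeued from the Rx ring.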
I feel this is not the right way to implement it. It adds a lot of latency
on the processing of the Rx; potentially drops packets unnecessarily on the
Rx; creates lots of burstiness on the Tx, which in principle will be bad
for downstream nodes processing the traffic. I really don't think you need
to have these multiple threads the way you have structured it. Maybe Sameer
might give you a better design on how to structure the token filling...
K. K.
Thanks for the feedback, Sir. I could remove the Tx burst by modifying my implementation, but I'm not quite sure how the Rx latency is increasing. Should I send an email to Sameer?
Thanks for the inputs. I've simplified the code to work with a batch size of 32. Excess packets are now stalled rather than dropped when tokens are insufficient; packets are dropped only when the ring becomes full.
Thanks @rohit-mp. I think this is a good solution which should reduce burstiness caused by large dequeues/enqueues. How have you been testing this?
I tested the rate by varying it and observing a corresponding change in the packets processed per second. I'm not sure how to verify depth, since that requires bursty conditions and speed-tester wasn't suitable for that. Any input on how to test it thoroughly is appreciated.
You may try testing depth by hooking up pktgen on a client node and sending traffic from there; check out our Wiki page to see how to set it up. I think pktgen lets you attach a script, so you can try simulating bursty behavior through that.
Just like @dennisafa said, write a little Lua code that Pktgen will process and send. If you update the pktgen Lua script you can create these kinds of advanced tests to pump packets through the network.
Hello team, I recently realized that we don't actually need advanced rings mode to simulate the token bucket, but I think this is still a good basic example showing its usage. Would you prefer that I use default rings mode instead? Or would it be good to provide both modes as options, as done in the scaling NF?
It would seem to me that if the default ring mode is sufficient, I would just restrict it to use that capability, unless you are likely to use the advanced ring mode for scaling to multiple instances. I would think that we may have to carefully think through all the ramifications before that.
--
K. K. Ramakrishnan
Thank you for the suggestion. Yes, default ring mode is sufficient. But @twood02 mentioned (in the comments above on this PR) that this would be a good example for helping people learn how to use the advanced rings interface.
In my opinion it would be best to use the default mode unless necessary; advanced rings mode should only be used when we need full control over the RX/TX rings.
Yes, I still think this is quite useful as an example of the advanced rings mode. I suggest we finalize this pull request and merge it with just the advanced rings mode. Later we can make a second version based on the normal packet handler function, which I'd prefer to keep in a separate directory so it is easier to see the differences between the two approaches. If @rohit-mp wants to add the standard packet handler version, that'd be great; otherwise one of our students can do it this summer.
Thanks, let me know if anything is needed to help in finalizing the PR. I'd be happy to add the default ring mode as well soon.
This is the output of the C linter, which failed:
The first two lines would be an issue with every example in the repo. Should I fix that here?
Hi @rohit-mp, thanks for your comment. The reason is that we recently introduced a new linter using GitHub Actions. The new linter updated the verbosity level (the amount of information returned by the linter), which is why the lint failed. I felt this would provide more useful feedback to the user. However, you raised some important points; I will speak to the rest of the team and get back to you.
In order to fix a bug for the case where the token bucket is larger than the packet size, on line 210 of forward_tb.c, you should have a condition:
if (pkt->pkt_len > tb_tokens) {
tb_tokens -= pkt->pkt_len;
} else {
tb_tokens = 0;
}
Also, in your README.md, when describing the -R flag, 'Mbps' should be changed to 'MBps' to specify Megabytes rather than Megabits.
The token bucket being larger than the pkt size is the only valid case here. If the size of the token bucket is less than the pkt size, we would never be able to process any pkts. I think you meant to say tokens instead. These lines make sure that tokens are more than pkt size before reaching line 210:
Good catch. Will fix that in the usage as well.
@rohit-mp I think you are correct. To test this, run speed tester. Your PR mentions that depth should be set carefully, but I think that should be more explicit: at least print an obvious warning if depth is less than, say, 10,000.
Ah, nice catch! Thanks for that. I missed the case of depth being less than pkt_len. But @catherinemeadows's fix would resolve the issue only partially (I'm assuming we have to interchange the if and else parts), since we'd still be processing the pkt without sufficient tokens. I will add a warning for when depth is set to less than 10,000. But along with that warning, should I drop pkts that arrive with a size greater than the depth? That seems like the better option to me, since technically such a packet could never be processed and all the pkts arriving behind it would be stalled forever.
@rohit-mp if you have a chance to add the warning and fix soon, that'd be great (sorry to bug you with all these little things; your contribution is greatly appreciated!). If you won't have time to make them soon, then we can push the edits ourselves. We'd like to merge this for our May release.
Not an issue at all; the careful reviews are highly appreciated. I was waiting for confirmation on what to do when a pkt with length greater than depth arrives (as asked in my last message).
Oops, sorry, I missed your question at the end.
Yes, I agree that dropping packets larger than the depth is fine. I don't think this would come up much in practice (depth should always be much bigger), but I suppose it would be a way to filter out large packets, which might be useful in some cases.
I've added a warning for when depth is less than 10,000 and an unlikely check in the packet handler for when pkt length is more than depth. Let me know if any changes are required.
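The two checks just described can be sketched in plain C as below. The names, the threshold handling, and the split between setup-time and per-packet checks are assumptions for illustration, not the PR's actual code.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Plain-C stand-in for DPDK's branch-prediction hint. */
#ifndef unlikely
#define unlikely(x) __builtin_expect(!!(x), 0)
#endif

#define DEPTH_WARN_THRESHOLD 10000

/* Setup-time check: warn when depth is small enough that ordinary
 * packets risk being dropped. */
static void
check_depth(uint64_t depth)
{
    if (depth < DEPTH_WARN_THRESHOLD)
        fprintf(stderr, "Warning: depth %" PRIu64 " is less than %d\n",
                depth, DEPTH_WARN_THRESHOLD);
}

/* Per-packet check: a packet larger than the bucket depth can never
 * accumulate enough tokens, so it would stall everything behind it;
 * drop it immediately. Returns 1 to keep the packet, 0 to drop. */
static int
admit_packet(uint32_t pkt_len, uint64_t depth)
{
    if (unlikely(pkt_len > depth))
        return 0;
    return 1;
}
```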
Great work!
This PR adds a new example - simple_fwd_tb.
The NF uses the advanced rings mode and simulates a queue with a token bucket. It takes depth and rate as additional parameters and forwards/drops packets based on those values.
Summary:
Usage:
Merging notes:
TODO before merging:
Test Plan:
Trivial inputs have been tested.
The rate was increased while keeping depth constant to observe an increase in tx_pps up to a limit.
The depth was increased while keeping rate constant to observe a decrease in the number of dropped packets.
Any input on how to test the same thoroughly is appreciated.
Testing the rate
R=100
R=1000
R=10000
Testing the depth
D=1000
D=10000
D=100000
Note that the depth should be high enough to enqueue at least 32 packets at once onto the tx_ring for the NF stats to be updated properly. Related to #198
Review:
@twood02
Subscribers: << @-mention people who probably care about these changes >>
@Shashwatha-Mitra @NishanthSubramanian @archit-p @mohittahiliani