Interrupt moderation/coalesce settings don't seem to work on c6gn or m6i instances #159

Closed
talawahtech opened this issue Dec 28, 2020 · 17 comments
Labels: bug, Linux ENA driver

Comments

@talawahtech

talawahtech commented Dec 28, 2020

Hi,

I have been running some tests on a c6gn.xlarge and noticed that the interrupt moderation settings don't seem to have any effect.

ENA version 2.4.0
AMI: amzn2-ami-hvm-2.0.20201218.1-arm64-gp2
Instance: c6gn.xlarge

Here is a comparison between a c5n.xlarge and a c6gn.xlarge.

Moderation settings:

sudo ethtool -C eth0 adaptive-rx off
sudo ethtool -C eth0 rx-usecs 256
sudo ethtool -C eth0 tx-usecs 256
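
A quick way to confirm that the settings above were accepted is to read them back with `ethtool -c`. The following is a minimal sketch; the helper name `coalesce_value` is hypothetical, and it assumes the usual `name: value` layout of `ethtool -c` output. Note that this only shows what the driver reports, not whether the device actually honors the value.

```shell
# Extract one field (for example rx-usecs) from `ethtool -c` output on stdin.
# $1 = field name exactly as printed before the colon.
coalesce_value() {
    awk -F':[ \t]*' -v key="$1" '$1 == key { print $2 }'
}
```

Usage: `ethtool -c eth0 | coalesce_value rx-usecs` should print 256 after the commands above.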

c5n.xlarge (w/ c5n.4xlarge client)

Server Command: iperf3 -s -p 5200 --logfile /dev/null

Client Command: iperf3 -c XXX.XXX.XXX.XXX -P 16 -t 10 -M 88 -p 5200

Monitoring Command: dstat --cpu -y -i -I 27,28,29,30 --net-packets

Result:

----total-cpu-usage---- ---system-- -------interrupts------ -pkt/total-
usr sys idl wai hiq siq| int   csw |  27    28    29    30 |#recv #send
  1  15  58   0   0  26|  15k 4633 |3665  3274  3777  3309 |1801k  931k
  0  17  61   0   0  22|  15k 4840 |3408  3513  3828  3569 |1800k  928k
  1  18  61   0   0  20|  16k 5140 |3386  3401  3849  3697 |1798k  930k
  1  19  59   0   0  21|  16k 5075 |3616  3213  3888  3555 |1802k  922k
  0  22  57   0   0  22|  16k 5787 |3600  3029  3894  3556 |1799k  937k

As expected, the number of interrupts per second per queue stays under 3900, consistent with a ceiling of one interrupt per 256 µs (1,000,000 / 256 ≈ 3906).
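
The ceiling follows from the coalescing timer: with rx-usecs set to N, a queue can raise at most one interrupt every N microseconds. A minimal sketch of that arithmetic, assuming one interrupt per timer expiry (the function name `max_irqs_per_sec` is made up for illustration):

```shell
# Upper bound on interrupts per second for one queue, given the
# coalescing interval in microseconds (integer division, rounds down).
max_irqs_per_sec() {
    echo $(( 1000000 / $1 ))
}

max_irqs_per_sec 256   # prints 3906
```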

c6gn.xlarge (w/ c6gn.4xlarge client)

Server Command: iperf3 -s -p 5200 --logfile /dev/null

Client Command: iperf3 -c XXX.XXX.XXX.XXX -P 16 -t 10 -M 88 -p 5200

Monitoring Command: dstat --cpu -y -i -I 47,48,49,50 --net-packets

Result:

----total-cpu-usage---- ---system-- -------interrupts------ -pkt/total-
usr sys idl wai hiq siq| int   csw |  47    48    49    50 |#recv #send
  0  22  31   0   0  47| 145k 5893 |  37k   55k   32k   20k|2441k 1236k
  0  24  29   0   0  47| 145k 5727 |  31k   65k   26k   21k|2504k 1281k
  0  22  20   0   0  57| 109k 4647 |  12k   59k   23k   13k|2761k 1425k
  1  22  31   0   0  46| 172k 6704 |  40k   74k   36k   21k|2396k 1242k
  0  23  25   0   0  52| 141k 5509 |  20k   67k   40k   13k|2594k 1338k

Much to my surprise I am seeing tens of thousands of interrupts per second per queue for the c6gn.xlarge. Is there something else that I need to do to get this to work?
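
For anyone reproducing the comparison without dstat, the per-queue rates can also be derived from /proc/interrupts. This is a rough sketch, not the method used for the numbers above; the function names are hypothetical, and it assumes the per-CPU counter columns directly follow the `IRQ:` label, as they do on typical kernels.

```shell
# Sum the per-CPU counters for one IRQ line in /proc/interrupts.
# $1 = IRQ number, $2 = file to read (defaults to /proc/interrupts).
irq_total() {
    awk -v irq="$1:" '
        $1 == irq {
            sum = 0
            # Counter columns are purely numeric; stop at the first
            # non-numeric field (the interrupt controller name).
            for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) sum += $i
            print sum
        }' "${2:-/proc/interrupts}"
}

# Interrupts raised on one IRQ over a one-second window.
irq_rate() (
    before=$(irq_total "$1")
    sleep 1
    after=$(irq_total "$1")
    echo $(( after - before ))
)
```

Usage: `irq_rate 47` prints the number of interrupts seen on IRQ 47 during one second.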

@akiyano
Contributor

akiyano commented Dec 28, 2020

Hi @talawahtech,

Thanks for your report.
I've reproduced the issue and am looking into it.

Thanks,
Arthur

@akiyano
Contributor

akiyano commented Dec 29, 2020

Hi @talawahtech,

This is a known issue in c6gn.
A fix is expected to be introduced in Q1/2021
Will update this ticket once it is fixed.

Regards,
Arthur

@talawahtech talawahtech changed the title from "Interrupt moderation/coalesce settings don't seem to work on Graviton2 instance (c6gn.xlarge)" to "Interrupt moderation/coalesce settings don't seem to work on c6gn instances" on Dec 29, 2020
@talawahtech
Author

Hey @akiyano just checking if there is an updated timeline for this fix.

@akiyano
Contributor

akiyano commented Jul 27, 2021

Hi @talawahtech,

Unfortunately, this feature was de-prioritized for the time-being. We might decide to get back to it in early 2022.
However, we would still like to assist you if we can. Is there a real use-case / customer impact that you are trying to solve with interrupt moderation?
You can contact me via email akiyano@amazon.com if you prefer.

Thanks,
Arthur

@talawahtech
Author

Hey @akiyano,

Just to make sure we are on the same page: based on my tests, as things are right now, interrupt moderation works as expected on the c6g, but doesn't work at all (neither static nor adaptive moderation) on the c6gn. Is that correct?

My use case is that I am doing a series of blog posts on network performance tuning on AWS. I started off by doing a number of optimizations to get a c5n.xlarge to be able to serve 1.2M HTTP req/s in my first post.

I was hoping to follow that up with a post covering the c6gn.xlarge. The plan was to see if there are any Graviton2-specific changes needed, and compare the two in terms of price and performance. I am also doing a talk covering the same subject at the P99 CONF virtual conference in October (slides are due in August) and I was hoping to at least have some preliminary Graviton2 numbers for the talk. I did a quick test with the c6g.xlarge, but I hit the 1M pps cap right away with only 70% CPU usage. On the c6gn.xlarge, however, I can't get past 800k req/s because of interrupt processing, so I was holding out for this fix. Is there a pre-release driver available that fixes this issue on the c6gn that I could test?

Just to be fully clear, my talk is not at all dependent on these numbers; it was just a nice-to-have. But 2022 does seem pretty far off for this kind of performance constraint. It may be worth it to update the interrupt moderation section of the new ENA best practices document to indicate that the c6gn doesn't currently support this feature.

@ShayAgros
Contributor

Hi @talawahtech,

> I am also doing a talk covering the same subject at the P99 CONF virtual conference in October (slides are due in August) and I was hoping to at least have some preliminary Graviton2 numbers for the talk

This is quite cool really. I'll go over your first blog post; I might learn something new there. Well done for the effort (:

> Is there a pre-release driver available that fixes this issue on the c6gn that I could test?

Unfortunately the work required to support interrupt moderation for c6gn.* instance types isn't trivial, and not something we can spawn off quickly

> I did a quick test with the c6g.xlarge, but I hit the 1M pps cap right away with only 70% CPU usage

It's hard to tell from the name, but the c6g.* and c6gn.* instance types are very different, and the former is supposed to be less performant than the latter. c6g.* does indeed support interrupt coalescing though.

> I can't get past 800k req/s because of interrupt processing

Interrupt coalescing support on ENA devices is very basic in its functionality. All it does is postpone the interrupt sent to the CPU until some time has passed after the first packet arrives.

In ENA the interrupt line for each queue is automatically masked once an interrupt is fired. It stays this way until the driver unmasks it explicitly (e.g. like it does here).
So the interrupt coalescing behavior can be simulated by delaying the driver from unmasking the interrupt line after a napi cycle.

The Linux kernel provides some tools for delaying interrupt unmasking using sysfs files. For example:

  • net: napi: add hard irqs deferral feature
  • Introduce preferred busy-polling

I suggest trying some of these options. While it won't be the same as interrupt coalescing, since an in-kernel solution requires its own CPU cycles to operate, it might be close enough to improve your performance results.
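
As a rough illustration of the hard-IRQ deferral option, the sketch below writes the two per-interface sysfs files involved. It assumes a kernel that exposes `napi_defer_hard_irqs` alongside `gro_flush_timeout` (which, to my knowledge, takes nanoseconds); the function name, its parameters, and the example values are hypothetical and not tuned for any instance type.

```shell
# Write NAPI IRQ-deferral settings for one interface (run as root).
# $1 = interface name
# $2 = napi_defer_hard_irqs: consecutive empty polls before IRQs are re-armed
# $3 = flush timer in microseconds (converted to nanoseconds for sysfs)
# $4 = sysfs root (defaults to /sys; overridable for dry runs)
set_irq_deferral() (
    dir="${4:-/sys}/class/net/$1"
    echo "$2" > "$dir/napi_defer_hard_irqs" || exit 1
    echo $(( $3 * 1000 )) > "$dir/gro_flush_timeout" || exit 1
)
```

Usage: `set_irq_deferral eth0 2 256` would defer re-arming for up to two empty polls with a 256 µs timer, mirroring the rx-usecs value used earlier in this thread.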

> It may be worth it to update the interrupt moderation section of new ENA best practices document to indicate that the c6gn doesn't currently support this feature.

Yup, it is somewhat irresponsible to omit this information. We'll add a note about this in the document.

If the proposed solutions don't work, I'd appreciate it if you could tell us the symptoms you observe. E.g.

  • CPU usage is constantly at 100%
  • some driver stats ($ ethtool -S [interface name]) that usually don't increase, now do. For your convenience you can use this script diff_stats.sh.txt which shows the stats that change every given interval (usage $ ./diff_stats.sh.txt -i [interval] [interface name], the interval value can be omitted and would be 1 sec by default).
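
For readers without the attachment, the idea of diffing driver stats can be sketched as follows. This is not the attached diff_stats.sh; it is a hypothetical helper that compares two saved `ethtool -S` snapshots, assuming the usual `name: value` line format.

```shell
# Print the counters whose value differs between two saved
# `ethtool -S` snapshots, along with the change.
# $1 = earlier snapshot file, $2 = later snapshot file.
changed_stats() {
    awk -F': *' '
        # Skip the "NIC statistics:" header and any non-numeric lines.
        $2 !~ /^[0-9]+$/ { next }
        { gsub(/^[ \t]+/, "", $1) }
        NR == FNR { before[$1] = $2; next }
        ($1 in before) && $2 != before[$1] { print $1, $2 - before[$1] }
    ' "$1" "$2"
}
```

Usage: save `ethtool -S eth0` to one file, wait for the interval of interest, save it again to a second file, then run `changed_stats` on the two files.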

@talawahtech
Author

talawahtech commented Jul 29, 2021

> > I am also doing a talk covering the same subject at the P99 CONF virtual conference in October (slides are due in August) and I was hoping to at least have some preliminary Graviton2 numbers for the talk
>
> This is quite cool really. I'll go over your first blog post; I might learn something new there. Well done for the effort (:

Thanks, I would love to hear any feedback that you have about the post.

> > Is there a pre-release driver available that fixes this issue on the c6gn that I could test?
>
> Unfortunately the work required to support interrupt moderation for c6gn.* instance types isn't trivial, and not something we can spawn off quickly

I see. My understanding is that the c6gn uses a newer version of the Elastic Network Adapter (ENAv3). Does this mean that future instances that use ENAv3 will also have this limitation?

> The Linux kernel provides some tools for delaying interrupt unmasking using sysfs files. For example:
>
>   • net: napi: add hard irqs deferral feature
>   • Introduce preferred busy-polling
>
> I suggest trying some of these options. While it won't be the same as interrupt coalescing, since an in-kernel solution requires its own CPU cycles to operate, it might be close enough to improve your performance results.

Oh wow! I didn't know that was possible, thanks for sharing those links. It looks like those features weren't added to the kernel until 5.8 and 5.11. I am currently using Amazon Linux 2 with kernel 4.14, so I guess I'll have to wait until I start testing later kernels before I try it out, but I am definitely looking forward to testing napi_defer_hard_irqs in particular.
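
Since the two features depend on the kernel version (5.8 and 5.11, per the links above), a quick check can be sketched like this. The function name `kernel_at_least` is hypothetical, and distribution kernels with backports may not follow the upstream numbering, so checking for the sysfs file itself is the more direct test.

```shell
# Succeed if a kernel release string is at least MAJOR.MINOR.
# $1 = major, $2 = minor, $3 = release string (defaults to `uname -r`).
kernel_at_least() (
    release="${3:-$(uname -r)}"
    major="${release%%.*}"          # text before the first dot
    rest="${release#*.}"            # text after the first dot
    minor="${rest%%[!0-9]*}"        # leading digits of the remainder
    [ "$major" -gt "$1" ] || { [ "$major" -eq "$1" ] && [ "$minor" -ge "$2" ]; }
)
```

Usage: `kernel_at_least 5 8 && echo "hard IRQ deferral should be available"`.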

If there are any other lesser known network performance gems like this that I didn't touch on in my blog post please let me know.

>   • some driver stats ($ ethtool -S [interface name]) that usually don't increase, now do. For your convenience you can use this script diff_stats.sh.txt which shows the stats that change every given interval (usage $ ./diff_stats.sh.txt -i [interval] [interface name], the interval value can be omitted and would be 1 sec by default).

Thanks, that script should be useful.

@akiyano akiyano added the bug and Linux ENA driver labels on Oct 8, 2021
@talawahtech talawahtech changed the title from "Interrupt moderation/coalesce settings don't seem to work on c6gn instances" to "Interrupt moderation/coalesce settings don't seem to work on c6gn or m6i instances" on Dec 14, 2021
@talawahtech
Author

Just checking in to see if this is still on the schedule for early 2022. FYI I tested an m6i a couple months ago and it was also affected. I assume it is the same for the other ENAv3 instances like c6i and m6a as well.

@davidarinzon
Contributor

Hi @talawahtech
Thank you for checking in on this.
We're currently planning to release this support in Q1/22; we'll update this ticket when it is available.
You are correct in your analysis and it is indeed relevant for c6i and m6a as well.
Please note that once deployed, the support will be available for newly attached ENIs.

@talawahtech
Author

Good to know. Thanks for the update @davidarinzon!

@gshanemiller

gshanemiller commented Mar 28, 2022

Do the Interrupt moderation/coalesce settings

sudo ethtool -C eth0 adaptive-rx off
sudo ethtool -C eth0 rx-usecs 256
sudo ethtool -C eth0 tx-usecs 256

have any bearing on DPDK usage via virtio?

@nafeabshara

nafeabshara commented Mar 28, 2022 via email

@gshanemiller

gshanemiller commented Mar 28, 2022

Thank you for the prompt reply. However, it did not scratch my itch. You replied about when features will be delivered. I was asking about how interrupts relate to DPDK code using this driver set in userspace. See also here.

Put another way: if the packet flow

  • app -> DPDK -> virtio-driver -> virt NIC

is essentially bounded by the number of interrupts/sec the HW can handle, almost certainly underperforming what a CPU can do, then interrupt moderation might be a thing to know about. As I ask in the link, I came into DPDK development using this driver set thinking it's all poll-mode driven, which is what the DPDK documentation led me to believe.

@shaibran
Contributor

ENA PMD works in polling mode; Rx interrupt is supported but disabled by default.
The interrupt moderation feature is used by kernel drivers (that is, interrupt-driven drivers). With DPDK, the application uses the interrupt handlers, so the driver can't moderate its own interrupts that way, and the application shouldn't use the PMD API directly.

In general the polling mode should be faster than the interrupt based approach, but the hardware limitations must be taken into consideration there as well.

@davidarinzon
Contributor

Hi @talawahtech

The change to enable interrupt moderation/coalescing was recently introduced, you're welcome to re-execute your testing.
If you have any further queries, please let us know.

Thanks

@talawahtech
Author

talawahtech commented May 12, 2022

> The change to enable interrupt moderation/coalescing was recently introduced, you're welcome to re-execute your testing. If you have any further queries, please let us know.

Awesome news. Thanks David, I'll give it a spin as soon as I get a chance.

@harp-intel

harp-intel commented Jun 6, 2022

> Hi @talawahtech
>
> The change to enable interrupt moderation/coalescing was recently introduced, you're welcome to re-execute your testing. If you have any further queries, please let us know.
>
> Thanks

Edit 2: Please disregard. We found that building/installing the latest driver from source (it was not in the AMI) and enabling adaptive-rx got us the performance we hoped to see.
