ARTEMIS-994 Support Netty Native Epoll on Linux#1093
Conversation
|
I am running the testsuite, and if everything is ok I should merge it tomorrow... I have 50% of tests running and so far so good.. will know in 2 hours... BTW: can you run these commands please? git pull -rebase upstream master
git push origin -f {your branch name}and then: git rebase -i upstream/masterAnd squash these committs into one, with a description that matches it? I could do this myself, but I would rather have you finding the best commit name once you squash them. |
|
Awesome work! Testsuite is good... |
The following changes are made to support Epoll. Refactored SharedNioEventLoopGroup into renamed SharedEventLoopGroup to be generic (as so we can re-use for both Nio and Epoll) Add support and toggles for Epoll in NettyAcceptor and NettyConnector (with fall back to NIO if cannot load Epoll) Removal from code of PartialPooledByteBufAllocator, caused bad address when doing native, and no longer needed - see jira discussion New Connector Properties: useEpoll - toggles to use epoll or not, default true (but we failback to nio gracefully) epollRemotingThreads = same behaviour as nioRemotingThreads but for Epoll. useEpollGlobalWorkerPool = same behaviour as useNioGlobalWorkerPool but for Epoll. New Acceptor Properties: useEpoll - toggles to use epoll or not, default true (but we failback to nio gracefully) useEpollGlobalWorkerPool = same behaviour as useNioGlobalWorkerPool but for Epoll.
|
@clebertsuconic I have just now rebased and squashed, hope this is what you were wanting me to do. |
|
@michaelandrepearce I don't think you really need three parameters for this... just one would do...
the other two you introduced could just use the same parameter, being the same semantic.. being just epoll. That way you would configure to use epoll or not with a single switch. If you need to configure different settings.. than you can just update the values.. it gets easier on users IMHO. WDYT? |
|
@clebertsuconic I though about this but annoyingly theyre named nioBlah as such just using that would be not directly indicate you're affecting epoll. Like wise if we were to strip the nio part then they would be generic but would break any compatibility with existing clients as they'd have to change their properties if used. This is why I introduce the extra duplicate props. If you think renaming the existing properties to generic in nature I can do this but as noted wouldn't be back compatible. |
|
@michaelandrepearce I will add a new generic parameter, and deprecate the old one. Will do that on a separate commit. I just wanted to know if you had any reason. |
|
@clebertsuconic sounds good. |
|
@clebertsuconic I notice this didn't get merged still. Did I miss understand and you expected me to make the configuration changes? if so no worries, just let me know. |
|
@michaelandrepearce nope.. I just procrastinated for an afternoon :) give me till tomorrow |
|
@clebertsuconic I was thinking this afternoon has 2.0.0 been cut for tagging? If not then maybe worth trying to get the property change done before that as a 2.0.0 is a major breaking release anyhow would avoid need for deprecating and creating a new generic we just would need a rename which would be cleaner |
|
@michaelandrepearce 2.0.0 (Artemis) is already cut. 3 days ago.. it was just the vote on... So this will be for 2.0.1. I was thinking about setting the default as NIO at least for a while, and having a --epoll --no-epoll property through the CLI. |
|
@michaelandrepearce I did some performance tests with https://github.com/ssorj/quiver and the test hung... I will have to take double care before merging this.. it may take some extra time, as I don't want to make the broker unstable now. I would like to investigate what's happening before merging this. |
|
@clebertsuconic thanks, ill try look into also see if i spot anything. I assume this is repeatable. (i had a quick go using it out the box via dnf install on fedora, but couldn't get it to run against current master (without my changes), i probably need to rebuild with 2.0.0?) |
|
@michaelandrepearce I couldn't check other options.. I was using AMQP.. so, I was using this command line on quiver to replicate it: you may need to use a few snapshots on quiver. but most stuff is available on fedora. Look at the list of dependencies if you like to try it. |
|
Hi @clebertsuconic i managed to get it running. Some notes: I got the below output. $ quiver q0 --sender qpid-messaging-cpp --receiver qpid-messaging-cpp --messages 1m --body-size 100 --credit 1000 --timeout 10 I did however successfully run using just: $ quiver q0 --messages 1m --body-size 100 --credit 1000 --timeout 10 I am using fedora 25. It ran fine with both server acceptor set to epoll and nio for me. Attached is the output i got for both runs. |
|
the only comment is with epoll, i notice the producer was producing faster than the consumer during the test as such some queue depth occured, with nio producer and consumer speeds were closer as such less queue depth occured. epoll producer rate > nio producer rate by approx 200 messages a second where as consumer rates differed only by approx 50. |
|
@michaelandrepearce thanks for the results! |
|
@michaelandrepearce you have to follow all the dependencies on quiver. probably the cpp didn't finish compilation on make? There's something going on.. I will review it this week. if you run the cpp module, which generates a bit more load on AMQP and you will have a few weird errors. it just needs some testing. |
|
@clebertsuconic i have re-run with march larger run size, and now able to reproduce. So it seems there is a direct memory leak, this occurs with both epoll and nio set, as such i assume this is around removal of PartialPooledByteBufAllocator for the default netty allocator rather than epoll itself. I will dig into this area first. |
|
i turned on the netty leak detector: 22:55:59,492 SEVERE [io.netty.util.ResourceLeakDetector] LEAK: ByteBuf.release() was not called before it's garbage-collected. See http://netty.io/wiki/reference-counted-objects.html for more information. |
|
Also: #1: |
|
@michaelandrepearce that's what I was going after.. some sort of leak while dealing with AMQP. I will review it this week. |
|
I think I got to the bottom of this... there was a leak.. and some proton miss use. |
|
@michaelandrepearce did you ammend anything on your PR? I can't build it any longer (especially outside of Linux.. like on a mac). |
|
No haven't touched it |
|
@clebertsuconic i just did a complete fresh clone and rebuild of this PR/My branch which i rebased the other day. It still built ok for me. Should i rebase again? Maybe some upstream commit in the last few days might have broken something? |
|
=?utf-8?Q?Michael_Andr=C3=A9_Pearce?= on dev@activemq.apache.org replies: Any luck getting it building? Did you take my branch or did you take the pr and apply to a local fork/bran= Also re the amqp stuff, I see a lot of upstream changes if I take latest now= Cheers Sent from my iPhone |
|
Clebert Suconic on dev@activemq.apache.org replies: I need to make some changes also as we talked. I was using this to test |
|
=?utf-8?Q?Michael_Andr=C3=A9_Pearce?= on dev@activemq.apache.org replies: What is the extract failure you're seeing?=20 |
|
Clebert Suconic on dev@activemq.apache.org replies: Now that you mentioned. I think rebuilt netty. And I provably did Will redo in the morning with a fresh Maven. Going to sleep now :) |
The following changes are made to support Epoll. Refactored SharedNioEventLoopGroup into renamed SharedEventLoopGroup to be generic (as so we can re-use for both Nio and Epoll) Add support and toggles for Epoll in NettyAcceptor and NettyConnector (with fall back to NIO if cannot load Epoll) Removal from code of PartialPooledByteBufAllocator, caused bad address when doing native, and no longer needed - see jira discussion New Connector Properties: useEpoll - toggles to use epoll or not, default true (but we failback to nio gracefully) remotingThreads = same behaviour as nioRemotingThreads. Previous property is depreated. useGlobalWorkerPool = same behaviour as useNioGlobalWorkerPool. Old property is deprecated. New Acceptor Properties: useEpoll - toggles to use epoll or not, default true (but we failback to nio gracefully) useGlobalWorkerPool = same behaviour as useNioGlobalWorkerPool but for Epoll. This closes apache#1093
|
Please review master ? |
|
Many thanks Clebert.
Had a brief look over tonight. Seems good. Will spend a bit more time tomorrow and also take the chance to update my fork.
…Sent from my iPhone
On 24 Mar 2017, at 00:47, clebertsuconic ***@***.***> wrote:
Please review master ?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.
|
The following changes are made to support Epoll. Refactored SharedNioEventLoopGroup into renamed SharedEventLoopGroup to be generic (as so we can re-use for both Nio and Epoll) Add support and toggles for Epoll in NettyAcceptor and NettyConnector (with fall back to NIO if cannot load Epoll) Removal from code of PartialPooledByteBufAllocator, caused bad address when doing native, and no longer needed - see jira discussion New Connector Properties: useEpoll - toggles to use epoll or not, default true (but we failback to nio gracefully) remotingThreads = same behaviour as nioRemotingThreads. Previous property is depreated. useGlobalWorkerPool = same behaviour as useNioGlobalWorkerPool. Old property is deprecated. New Acceptor Properties: useEpoll - toggles to use epoll or not, default true (but we failback to nio gracefully) useGlobalWorkerPool = same behaviour as useNioGlobalWorkerPool but for Epoll. This closes apache#1093
The following changes are made to support Epoll. Refactored SharedNioEventLoopGroup into renamed SharedEventLoopGroup to be generic (as so we can re-use for both Nio and Epoll) Add support and toggles for Epoll in NettyAcceptor and NettyConnector (with fall back to NIO if cannot load Epoll) Removal from code of PartialPooledByteBufAllocator, caused bad address when doing native, and no longer needed - see jira discussion New Connector Properties: useEpoll - toggles to use epoll or not, default true (but we failback to nio gracefully) remotingThreads = same behaviour as nioRemotingThreads. Previous property is depreated. useGlobalWorkerPool = same behaviour as useNioGlobalWorkerPool. Old property is deprecated. New Acceptor Properties: useEpoll - toggles to use epoll or not, default true (but we failback to nio gracefully) useGlobalWorkerPool = same behaviour as useNioGlobalWorkerPool but for Epoll. This closes apache#1093
The following changes are made to support Epoll.
New Connector Properties:
useEpoll - toggles to use epoll or not, default true (but we failback to nio gracefully)
epollRemotingThreads = same behaviour as nioRemotingThreads but for Epoll.
useEpollGlobalWorkerPool = same behaviour as useNioGlobalWorkerPool but for Epoll.
New Acceptor Properties:
useEpoll - toggles to use epoll or not, default true (but we failback to nio gracefully)
useEpollGlobalWorkerPool = same behaviour as useNioGlobalWorkerPool but for Epoll.