Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: switch to libevent for node socket handling #11227

Closed
wants to merge 23 commits into
base: master
from

Conversation

8 participants
@theuni
Copy link
Member

theuni commented Sep 3, 2017

Not yet ready for review. This can be considered a staging area. Chunks of this will be PR'd until only the actual libevent switch-over is ready, at which point this PR should be ready for review.

These changes remove our old select() loop for socket handling in favor of libevent, which uses epoll/kqueue/etc. as a back-end. In addition to being faster and more efficient, this allows us to drop some annoying restrictions, namely that select can only handle 1024 sockets in practice on most systems.

Note that this does not yet make the switch to libevent for outgoing connections, that work is happening in parallel, and will be easier to merge after this.

Also, for any reviewers, several of these commits would individually introduce some regressions or slow-downs, but they've been split up in order to clarify why some of the changes are being made.

Depends on:

Todo:

  • Add a ton of documentation
  • RAII the libevent structures
  • Add some tests where possible

@theuni theuni force-pushed the theuni:minimal-libevent3 branch Sep 3, 2017

@theuni

This comment has been minimized.

Copy link
Member Author

theuni commented Sep 4, 2017

I believe that the proxy_test failure is a false-positive.

Looks like the issue is that disconnection happens so quickly now after an error that the testnode fails.

@theuni theuni force-pushed the theuni:minimal-libevent3 branch Sep 4, 2017

src/net.cpp Outdated
@@ -454,7 +454,7 @@ void CConnman::DumpBanlist()

void CNode::CloseSocketDisconnect()
{
fDisconnect = true;
Disconnect();

This comment has been minimized.

@sipa

sipa Sep 4, 2017

Member

It looks like this could be a scripted diff, s/fDisconnect = true/Disconnect()/g.

This comment has been minimized.

@theuni

theuni Sep 4, 2017

Author Member

Probably so, will give it a shot.

src/net.cpp Outdated
@@ -914,12 +914,7 @@ size_t CConnman::SocketSendData(CNode *pnode) const
while (it != pnode->vSendMsg.end()) {
const auto &data = *it;
assert(data.size() > pnode->nSendOffset);
int nBytes = 0;
{
if (pnode->hSocket == INVALID_SOCKET)

This comment has been minimized.

@sipa

sipa Sep 4, 2017

Member

This test is no longer needed?

This comment has been minimized.

@theuni

theuni Sep 4, 2017

Author Member

I believe all of these are gone by the end of the change set, the removal may make more sense in an earlier commit, though.

CloseSocket(hSocket);
return;
const char* method = event_base_get_method(m_event_base);
if (!method || (strncmp(method, "select", 6) == 0)) {

This comment has been minimized.

@sipa

sipa Sep 4, 2017

Member

Heh, I'd have hoped that this restriction wouldn't be needed anymore with libevent.

This comment has been minimized.

@theuni

theuni Sep 4, 2017

Author Member

It's only needed if select ends up being used, which should be very unlikely on most platforms. I'll add a comment here, and print something to the log so that it's clear which back-end is being used.

This comment has been minimized.

@laanwj

laanwj Sep 5, 2017

Member

We could just bail out if select is being used. But not sure what platforms would be affected.

src/net.cpp Outdated
@@ -2397,6 +2370,9 @@ bool CConnman::Start(CScheduler& scheduler, const Options& connOptions)
LOCK(m_cs_interrupt_event);
m_interrupt_event = event_new(m_event_base, -1, 0, [](evutil_socket_t, short, void* data) {static_cast<CConnman*>(data)->InterruptEvents();}, (this));
}
m_event_timeout = timeval{60, 0};
const timeval* common_timeout = event_base_init_common_timeout(m_event_base, &m_event_timeout);
memcpy(&m_event_timeout, common_timeout, sizeof(m_event_timeout));

This comment has been minimized.

@sipa

sipa Sep 4, 2017

Member

The documentation on event_base_init_common_timeout says "To do this, call this function with the common duration. It will return a pointer to a different, opaque timeout value. (Don't depend on its actual contents!)".

I'm not sure whether that means you need to use the exact pointer returned, or just a copy of its value (as this code does).

This comment has been minimized.

@theuni

theuni Sep 4, 2017

Author Member

Good eye! I wasn't sure how to store this properly either. This usage is taken from the official documentation. I'll add a comment about its sanity and a link to the docs.

src/net.cpp Outdated
@@ -456,7 +471,14 @@ void CConnman::DumpBanlist()

void CNode::CloseSocketDisconnect()
{
Disconnect();
if (m_write_event) {

This comment has been minimized.

@sipa

sipa Sep 4, 2017

Member

I would prefer it if you could have some RAII wrappers around the libevent structures before adding all this.

This comment has been minimized.

@theuni

theuni Sep 4, 2017

Author Member

Agreed, see PR description :)

@fanquake fanquake added this to In progress in P2P refactor Sep 4, 2017

@laanwj

This comment has been minimized.

Copy link
Member

laanwj commented Sep 5, 2017

Concept ACK, woohoo!

src/net.cpp Outdated

}
WakeMessageHandler();

This comment has been minimized.

@jimpo

jimpo Sep 5, 2017

Contributor

Should this be moved into the above if block? (It used to be inside the if (notify) statement)

This comment has been minimized.

@theuni

theuni Sep 6, 2017

Author Member

Yes, thank you.

src/net.cpp Outdated
@@ -1150,7 +1150,7 @@ void CConnman::AcceptConnection(const ListenSocket& hListenSocket) {
}
}

bool whitelisted = hListenSocket.whitelisted || IsWhitelistedRange(addr);
whitelisted = whitelisted || IsWhitelistedRange(addr);

This comment has been minimized.

@sipa

sipa Sep 5, 2017

Member

||=

I don't think such an operator exists in C++.

This comment has been minimized.

@jimpo

jimpo Sep 5, 2017

Contributor

Oops, you're right. Should have checked.

This comment has been minimized.

@theuni

theuni Sep 6, 2017

Author Member

I can say for certain that the operator doesn't exist because I tried it first :)

This comment has been minimized.

@laanwj

laanwj Sep 6, 2017

Member

Hehe I also remember it doesn't exist - I looked it up a long time ago and IIRC the rationale is that's because |= is the same when using only booleans. In any case this is ok as it is.

@theuni theuni force-pushed the theuni:minimal-libevent3 branch Sep 12, 2017

@theuni theuni force-pushed the theuni:minimal-libevent3 branch Sep 19, 2017

theuni added some commits Sep 18, 2017

net: when de-activating p2p, set nodes to be disconnected rather than…
… closing the socket immediately

Messing with the socket from a non-network thread adds unnecessary complexity.
net: Add a Disconnect function for nodes
For now, this just sets fDisconnect like before. But a function will be
necessary later.
net: add an EnableReceive function so that the message handler can tr…
…igger it

for now this just disables fPauseRecv, but it will need to be a function later.
net: get rid of optimistic send and the socket lock
The socket lock only existed because it was possible to trigger an immediate
disconnect via threads other than the socket handler.

The optimistic send was the last offender.
net: pass the socket/whitelist args directly to AcceptConnection
Subsequent changes will turn AcceptConnection into a callback where the socket
is one of the supplied params.
net: add an event handler for future read/write callbacks
For now this just handles read/write/error/timeout conditions for each node
serially, but OnEvents will soon be invoked as a callback when socket status
changes.
net: Only close the socket in once place
By only allowing the socket to be closed and the handle to be invalidated in
one place, it becomes much easier to reason about the status at any given time.
scripted-diff: Call Disconnect rather than setting fDisconnect directly
-BEGIN VERIFY SCRIPT-
sed -i 's/fDisconnect = true/Disconnect()/' src/net.cpp src/net_processing.cpp
-END VERIFY SCRIPT-

@theuni theuni force-pushed the theuni:minimal-libevent3 branch to 77cee16 Sep 20, 2017

@theuni theuni changed the title RFC: switch to libevent for node socket handling WIP: switch to libevent for node socket handling Sep 20, 2017

@promag

This comment has been minimized.

Copy link
Member

promag commented Oct 18, 2017

Needs rebase.

laanwj added a commit that referenced this pull request Dec 13, 2017

Merge #11363: net: Split socket create/connect
3830b6e net: use CreateSocket for binds (Cory Fields)
df3bcf8 net: pass socket closing responsibility up to caller for outgoing connections (Cory Fields)
9e3b2f5 net: Move IsSelectableSocket check into socket creation (Cory Fields)
1729c29 net: split socket creation out of connection (Cory Fields)

Pull request description:

  Requirement for #11227.

  We'll need to create sockets and perform the actual connect in separate steps, so break them up.

  #11227 adds an RAII wrapper around connection attempts, as a belt-and-suspenders in case a CloseSocket is missed.

Tree-SHA512: de675bb718cc56d68893c303b8057ca062c7431eaa17ae7c4829caed119fa3f15b404d8f52aca22a6bca6e73a26fb79e898b335d090ab015bf6456cf417fc694
@DrahtBot

This comment has been minimized.

Copy link
Contributor

DrahtBot commented Nov 8, 2018

There hasn't been much activity lately and the patch still needs rebase, so I am closing this for now. Please let me know when you want to continue working on this, so the pull request can be re-opened.

@DrahtBot DrahtBot closed this Nov 8, 2018

@dongcarl dongcarl moved this from In progress to Backlog in P2P refactor Feb 21, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.