Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TCP socket rewrite with strand and queueing support #1938

Merged
merged 7 commits into from May 14, 2019

Conversation

@cryptocode
Copy link
Collaborator

commented Apr 26, 2019

Marking as incomplete as more docs and discussions related to the needs for implementing tcp live messages are needed.

  • Factored out socket stuff from bootstrap
  • Introduced server_socket since it's no longer bootstrap specific (though bootstrap still starts the tcp server, a different PR should refactor that)
  • Socket multithreading support through write queues and strands
  • Write queue means shared_ptr of buffers is required everywhere (to extend buffer lifetime). send_buffer_raw removed for that reason.
  • Socket can be single-writer (default for now) or multi-writer. Former is slightly more efficient (suitable for bootstrap connections which is send/reply), while the latter allows concurrent writers (for live messages if we end up not awaiting replies like with UDP)
  • Checkup and the two timer variables are replaced with a steady deadline timer
  • Added stats for accept success/failures as an accept failure log entry may give the impression that the node has stopped accepting tcp connections even though a timeout will ensure stale connections are dropped.
  • Added network_timeout as a logging option. This was previously tied to bulk pull logging which doesn't make sense for live messages.
  • tcp_client_timeout renamed to tcp_io_timeout. Proposing to increase from 5 to 15 second in case tunnels need to be established etc (5 sec may be a tad too low, causing unnecessary connection timeouts)
  • tcp_server_timeout removed, no longer needed as accept always starts a timer
  • tcp_idle_timeout added, default socket idling timeout, can be overridden by constructor
  • Removed if (!ec) from callback wrappers, in case callback needs notification of errors/do corrective actions/logging.
  • Fix io_ctx#run race in wallet. Thread runner is now started first. This requires a work token, which is cancelled on join.
  • To enable tests with thread_runner to finish more quickly, join takes an argument where event looping can be stopped
  • remote is cached on connect/accept as it's not thread safe to call it directly (caching is more convenient than wrapping in a strand on use sites). For outgoing connections, the remote endpoint is known. For accept, there's an overload to receive it.
  • In testutils, a completion_signal type is added. Some tests will benefit from using thread runner instead of polling (the new socket test is an example) and this class simplifies waiting for async operations to complete.

@cryptocode cryptocode added this to the V19.0 milestone Apr 26, 2019

@cryptocode cryptocode self-assigned this Apr 26, 2019

@zhyatt zhyatt added this to During RC in V19 Apr 30, 2019

@zhyatt zhyatt requested review from SergiySW, wezrule and clemahieu and removed request for wezrule Apr 30, 2019

@cryptocode cryptocode force-pushed the cryptocode:tcp/strand branch from ec206fc to 5cfc136 Apr 30, 2019

@cryptocode cryptocode removed the incomplete label Apr 30, 2019

@cryptocode

This comment has been minimized.

Copy link
Collaborator Author

commented Apr 30, 2019

If merged, update wiki docs wrt config changes

Show resolved Hide resolved nano/core_test/network.cpp Outdated
Show resolved Hide resolved nano/core_test/testutil.hpp Outdated
Show resolved Hide resolved nano/node/CMakeLists.txt
Show resolved Hide resolved nano/node/bootstrap.cpp Outdated
Show resolved Hide resolved nano/core_test/testutil.hpp Outdated
Show resolved Hide resolved nano/node/logging.cpp
Show resolved Hide resolved nano/node/stats.hpp
Show resolved Hide resolved nano/node/transport/udp.cpp Outdated
@wezrule

wezrule approved these changes May 1, 2019

Show resolved Hide resolved nano/node/socket.cpp Outdated
Show resolved Hide resolved nano/node/socket.cpp Outdated
Show resolved Hide resolved nano/node/socket.cpp
Show resolved Hide resolved nano/node/socket.cpp Outdated
Show resolved Hide resolved nano/node/socket.hpp

@cryptocode cryptocode force-pushed the cryptocode:tcp/strand branch from ed0877b to 0486fb9 May 9, 2019

@cryptocode cryptocode force-pushed the cryptocode:tcp/strand branch from 7037e5b to bb83d39 May 10, 2019

@clemahieu clemahieu merged commit 381fbcd into nanocurrency:master May 14, 2019

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@zhyatt zhyatt moved this from During RC to RC 3 (TBD) in V19 May 20, 2019

SergiySW added a commit that referenced this pull request May 22, 2019

Send live network messages over TCP (#1962)
- merge_peer function modified to try TCP connection before UDP
- TCP channels ongoing_keepalive function used to wake up long not sending channels to prevent TCP disconnection from server
- Live TCP connection starts with node ID handshake
- After handshake node sends preferred peering ports for response channels (keepalive self)
- Received TCP live message is responded to sender with response channels
- Using #1938 multi-writer for TCP channel
- Multiple nodes system () tests using TCP connections by default
- Some tests modified to check 2 live network options: TCP connections & UDP connections
- tcp_incoming_connections_max added, max established incoming TCP connections
- bootstrap_server modified to accept live network messages

argakiig added a commit to argakiig/raiblocks that referenced this pull request May 22, 2019

TCP socket rewrite with strand and queueing support (nanocurrency#1938)
* Strand and queuing support in tcp socket

* Check callback validity in async_write, do sync close in tests to avoid address-reuse issue

* Address review items; tests, checking stats

* Remove unrelated websocket changes

* Reintroduce atomic ticket system, but support concurrent writers. Close from destructor, check strand-execution where possible, and fix a test with too low deadline.

* Don't start io operations if closed

* Don't pass bool to join(), use unsigned instead of size_t and return error instead of success in counted_completion

argakiig added a commit to argakiig/raiblocks that referenced this pull request May 22, 2019

Send live network messages over TCP (nanocurrency#1962)
- merge_peer function modified to try TCP connection before UDP
- TCP channels ongoing_keepalive function used to wake up long not sending channels to prevent TCP disconnection from server
- Live TCP connection starts with node ID handshake
- After handshake node sends preferred peering ports for response channels (keepalive self)
- Received TCP live message is responded to sender with response channels
- Using nanocurrency#1938 multi-writer for TCP channel
- Multiple nodes system () tests using TCP connections by default
- Some tests modified to check 2 live network options: TCP connections & UDP connections
- tcp_incoming_connections_max added, max established incoming TCP connections
- bootstrap_server modified to accept live network messages
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.