Implement fragmentation of large distribution messages #2133

Merged
merged 13 commits into from Feb 22, 2019

3 participants
@garazdawi
Contributor

commented Feb 5, 2019

This PR implements fragmentation of large Erlang Distribution signals in order to prevent head-of-line blocking. It introduces two changes to the distribution protocol.

  1. Move exit reason of EXIT, EXIT2 and MONITOR_P_EXIT to after the control message.
  2. Introduce two new distribution headers that represent:
       • Start of a new sequence of fragments
       • A fragment in a sequence

The new distribution headers look like this:

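Start of a new sequence of fragments:

  Size (bytes)                    Field
  1                               131
  1                               69
  8                               SeqId
  8                               FragNo
  1                               NumberOfAtomCacheRefs
  NumberOfAtomCacheRefs/2+1 | 0   Flags
  N | 0                           AtomCacheRefs

A fragment in a sequence:

  Size (bytes)                    Field
  1                               131
  1                               70
  8                               SeqId
  8                               FragNo
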
  • The atom cache is the same for the entire sequence.
  • The FragNo starts at the total number of fragments in the sequence and then decrements to 1, i.e. in a sequence of 2 fragments the start header has FragNo set to 2 and the following fragment has FragNo set to 1.
  • The old distribution header is still used for messages that do not need to be fragmented.
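
The header layout and the FragNo countdown can be sketched as follows. This is an illustration only, not the ERTS implementation: the function name `fragment` and the `frag_size` parameter are hypothetical, big-endian byte order is assumed, the atom cache is assumed to be empty (so Flags and AtomCacheRefs occupy 0 bytes), and the control message and any outer packet framing are left out.

```python
import struct

VERSION_MAGIC = 131  # first byte of both new headers, per the layout above
FRAG_START = 69      # header tag: start of a new sequence of fragments
FRAG_CONT = 70       # header tag: a later fragment in the sequence


def fragment(seq_id: int, payload: bytes, frag_size: int) -> list:
    """Split payload into fragments whose FragNo counts down to 1.

    Assumption: empty atom cache (NumberOfAtomCacheRefs = 0), so the start
    header carries no Flags or AtomCacheRefs bytes.
    """
    if frag_size <= 0:
        raise ValueError("frag_size must be positive")
    chunks = [payload[i:i + frag_size]
              for i in range(0, len(payload), frag_size)] or [b""]
    total = len(chunks)
    frames = []
    for index, chunk in enumerate(chunks):
        frag_no = total - index  # total, total-1, ..., 1
        if index == 0:
            # 131, 69, SeqId (8 bytes), FragNo (8 bytes), NumberOfAtomCacheRefs = 0
            header = struct.pack(">BBQQB", VERSION_MAGIC, FRAG_START,
                                 seq_id, frag_no, 0)
        else:
            # 131, 70, SeqId (8 bytes), FragNo (8 bytes)
            header = struct.pack(">BBQQ", VERSION_MAGIC, FRAG_CONT,
                                 seq_id, frag_no)
        frames.append(header + chunk)
    return frames
```

For a 10-byte payload and a `frag_size` of 4, this produces three frames with FragNo 3, 2 and 1; only the first carries the start header.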

The following restrictions exist when using the message fragmentation:

  • Only the payload of the message may be fragmented. The control sequence may not span across several fragments.
  • Only one sequence may be sent by one process at a time.
  • Fragments must arrive in the correct order, i.e. if a sequence consists of 4 fragments, then the fragments have to arrive as 4, 3, 2, 1.
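
A receiver could enforce the ordering restriction roughly as below. This is a hedged sketch, not code from this PR: the `Reassembler` class, its `feed` method and the `FragmentOrderError` exception are hypothetical names, and fragments are assumed to have already been parsed into `(seq_id, frag_no, is_start, chunk)`.

```python
class FragmentOrderError(Exception):
    """Raised when a fragment arrives out of the expected order."""


class Reassembler:
    """Collects fragments per SeqId and checks that FragNo counts down to 1."""

    def __init__(self):
        # seq_id -> (next expected frag_no, list of payload chunks so far)
        self._pending = {}

    def feed(self, seq_id, frag_no, is_start, chunk):
        """Return the complete payload when FragNo 1 arrives, else None."""
        if is_start:
            if seq_id in self._pending:
                raise FragmentOrderError("sequence already in progress")
            if frag_no < 1:
                raise FragmentOrderError("FragNo must be at least 1")
            chunks = [chunk]
        else:
            if seq_id not in self._pending:
                raise FragmentOrderError("fragment without a start header")
            expected, chunks = self._pending[seq_id]
            if frag_no != expected:
                raise FragmentOrderError(
                    "expected FragNo %d, got %d" % (expected, frag_no))
            chunks.append(chunk)
        if frag_no == 1:
            # Last fragment: the sequence is complete.
            self._pending.pop(seq_id, None)
            return b"".join(chunks)
        self._pending[seq_id] = (frag_no - 1, chunks)
        return None
```

Because state is kept per SeqId, fragments from different sequences may interleave on the wire, which is what lets a small message get through while a large one is still in flight.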

In addition to these changes to the Erlang Distribution protocol, this PR also fixes and optimizes many internal issues.

  • Yield during process exit when sending many exit/down messages.
  • Change the dist inet driver to run in binary mode and fix dist.c to not copy the payload unnecessarily.
  • Trap when sending distributed exit/1, exit/2 and monitor down messages.

NOTE: The documentation for the new distribution headers is not done yet.

garazdawi added some commits Dec 17, 2018

erts: Refactor rbt _yielding to use reductions
All of the Red-Black Tree _yielding functions have been
updated to work with reductions returned by the called
function instead of yielding on each element.

@garazdawi garazdawi self-assigned this Feb 5, 2019

@garazdawi garazdawi requested review from rickard-green and sverker Feb 5, 2019

@michaelklishin

Contributor

commented Feb 5, 2019

@garazdawi how will this work for mixed version clusters, e.g. when an OTP 22 node is connected to a 21.2 one?

@garazdawi

Contributor Author

commented Feb 5, 2019

The feature is only available if both nodes present the distribution flag indicating that they support fragmented messages.
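
The negotiation amounts to a bitwise check on the flags both sides present during the handshake. A minimal sketch, assuming a hypothetical constant `FLAG_FRAGMENTS` whose bit value is a placeholder and is not taken from this PR:

```python
# Placeholder bit for "supports fragmented messages"; the real flag name and
# value are defined by the distribution protocol, not by this sketch.
FLAG_FRAGMENTS = 1 << 23


def can_fragment(local_flags: int, remote_flags: int) -> bool:
    """Fragmentation is used only if both nodes present the flag."""
    return bool(local_flags & remote_flags & FLAG_FRAGMENTS)
```

If either node lacks the flag, the sender falls back to the old distribution header, which is why a mixed-version cluster keeps working.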

garazdawi added some commits Dec 17, 2018

erts: Remove a copy of distribution data payload
Before this change the inet driver was in list mode and
thus the data from it had to be copied when received by
the dist entry. This change puts the tcp port in binary mode
and makes any refc binary created by it be used all the way
to the process where it is decoded.

Thus eliminating one copy of the entire message payload.

erts: Move reason in dist messages to payload
The dist messages EXIT, EXIT2 and MONITOR_DOWN have been
updated with new versions that send the reason term as
part of the payload of the message instead of as part
of the control message.

This allows the decode of the reason to be done by the
receiving process instead of the dist entry which in turn
makes it possible for multiple decodes to be done in
parallel.

This change is done in order to make it easier to fragment
the potentially large payload of EXIT, EXIT2 and MONITOR_DOWN
into multiple distribution messages.

OTP-15611

@garazdawi garazdawi force-pushed the garazdawi:lukas/erts/fragment-dist-messages branch from ccdb9c0 to f4c121b Feb 22, 2019

garazdawi added some commits Sep 26, 2018

erts: Refactor ErtsSendContext to be ErtsDSigSendContext
This commit removes the general send context (which was used
very little anyways) and only uses the distributed send context.
This will make it easier to use the dist API at the cost of
a little bit more code for the local send.

erts: Implement trapping while sending distr exit/down
The reason in EXIT and DOWN may be arbitrarily large,
so we yield and allow other processes to execute while
encoding and sending the signals over the distribution.
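
The idea of yielding during a long encode can be illustrated with Python generators standing in for trapping. This is only an analogy under stated assumptions: `encode_with_yield`, `run_interleaved` and the `budget` parameter are hypothetical, and the real emulator counts reductions rather than bytes.

```python
def encode_with_yield(term_bytes: bytes, budget: int):
    """Emit the encoded data in slices of at most `budget` bytes.

    Each `yield` marks a point where the sender gives up the scheduler.
    """
    for start in range(0, len(term_bytes), budget):
        yield term_bytes[start:start + budget]


def run_interleaved(tasks: dict) -> list:
    """Round-robin over generators; return the order in which tasks ran."""
    order = []
    queue = list(tasks.items())  # [(name, generator), ...]
    while queue:
        name, gen = queue.pop(0)
        try:
            next(gen)  # run one slice of work
        except StopIteration:
            continue   # task finished, drop it from the queue
        order.append(name)
        queue.append((name, gen))
    return order
```

With a large and a small task queued together, the small one completes after a single slice of the large one instead of waiting for the whole encode to finish.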

@garazdawi garazdawi force-pushed the garazdawi:lukas/erts/fragment-dist-messages branch 2 times, most recently from 3ee8e06 to c0c6f6b Feb 22, 2019

@garazdawi garazdawi merged commit c0c6f6b into erlang:master Feb 22, 2019

1 of 2 checks passed

continuous-integration/travis-ci/pr The Travis CI build is in progress
license/cla Contributor License Agreement is signed.