Implement fragmentation of large distribution messages #2133

garazdawi · 2019-02-05T14:35:44Z

This PR implements fragmentation of large Erlang Distribution signals in order to prevent head-of-line blocking. It introduces two changes to the distribution protocol:

1. Move the exit reason of EXIT, EXIT2 and MONITOR_P_EXIT to after the control message.
2. Introduce two new distribution headers that represent:
   - the start of a new sequence of fragments
   - a fragment in a sequence

The new distribution headers look like this:

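| 1 | 1 | 8 | 8 | 1 | NumberOfAtomCacheRefs/2+1 \| 0 | N \| 0 |
|---|---|---|---|---|---|---|
| 131 | 69 | SeqId | FragNo | NumberOfAtomCacheRefs | Flags | AtomCacheRefs |

| 1 | 1 | 8 | 8 |
|---|---|---|---|
| 131 | 70 | SeqId | FragNo |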

The atom cache is the same for the entire sequence.
The FragNo starts at the total number of fragments in the sequence and then decrements to 1. For example, in a sequence of 2 fragments the start header has FragNo set to 2 and the following fragment has FragNo set to 1.
The old distribution header is still used for messages that do not need to be fragmented.
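
A minimal Python sketch of how the two headers above could be laid out on the wire. It assumes big-endian integers, sets NumberOfAtomCacheRefs to 0 so that Flags and AtomCacheRefs occupy no bytes, and treats everything after the header as opaque bytes (the control message is not modeled). The names `fragment_frames` and `frag_size` are made up for illustration and are not part of the PR.

```python
import struct

DIST_MAGIC = 131   # first byte of both headers
FRAG_START = 69    # header that starts a sequence of fragments
FRAG_CONT = 70     # header of every following fragment


def fragment_frames(seq_id, data, frag_size):
    """Split data into chunks and prefix each chunk with its fragment header.

    Illustrative only: byte order and the empty atom cache are assumptions.
    """
    chunks = [data[i:i + frag_size] for i in range(0, len(data), frag_size)] or [b""]
    total = len(chunks)
    frames = []
    for index, chunk in enumerate(chunks):
        frag_no = total - index  # counts down: total, ..., 2, 1
        if index == 0:
            # Start header: 131, 69, SeqId, FragNo, NumberOfAtomCacheRefs = 0.
            header = struct.pack(">BBQQB", DIST_MAGIC, FRAG_START, seq_id, frag_no, 0)
        else:
            # Continuation header: 131, 70, SeqId, FragNo.
            header = struct.pack(">BBQQ", DIST_MAGIC, FRAG_CONT, seq_id, frag_no)
        frames.append(header + chunk)
    return frames
```

With `fragment_frames(7, b"abcdef", 4)` the first frame carries FragNo 2 and the second carries FragNo 1, matching the two-fragment example above.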

The following restrictions apply when using message fragmentation:

Only the payload of the message may be fragmented. The control sequence may not span across several fragments.
Only one sequence may be sent by one process at a time.
Fragments must arrive in the correct order, i.e. if a sequence consists of 4 fragments, then the fragments have to arrive as 4, 3, 2, 1.
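
The ordering restriction can be sketched as a small receiver-side check. This is an illustration in Python, not the code in dist.c: the class `FragmentAssembler` and its methods are hypothetical, and keying pending sequences by SeqId is an assumption of the sketch.

```python
class FragmentAssembler:
    """Reassembles fragment sequences and enforces the FragNo countdown."""

    def __init__(self):
        # seq_id -> (next expected FragNo, chunks collected so far)
        self._pending = {}

    def start(self, seq_id, frag_no, chunk):
        """Handle a start-of-sequence header (tag 69)."""
        if frag_no < 1:
            raise ValueError("FragNo must be at least 1")
        if seq_id in self._pending:
            raise ValueError("sequence already in progress")
        return self._store(seq_id, frag_no, [chunk])

    def fragment(self, seq_id, frag_no, chunk):
        """Handle a follow-up fragment header (tag 70)."""
        if seq_id not in self._pending:
            raise ValueError("fragment without a start header")
        # The sequence is dropped if the fragment is out of order.
        expected, chunks = self._pending.pop(seq_id)
        if frag_no != expected:
            raise ValueError(f"expected fragment {expected}, got {frag_no}")
        chunks.append(chunk)
        return self._store(seq_id, frag_no, chunks)

    def _store(self, seq_id, frag_no, chunks):
        if frag_no == 1:
            return b"".join(chunks)  # last fragment: the message is complete
        self._pending[seq_id] = (frag_no - 1, chunks)
        return None  # more fragments expected
```

Feeding it fragments 3, 2, 1 of one sequence returns the joined bytes on the last call; feeding 4 and then 2 raises an error.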

In addition to these changes to the Erlang Distribution protocol, this PR also fixes several internal issues and adds optimizations:

Yield during process exit when sending many exit/down messages.
Change the dist inet driver to run in binary mode and fix dist.c so that it does not copy the payload unnecessarily.
Trap when sending distributed exit/1, exit/2 and monitor down messages.

NOTE: The documentation for the new distribution headers is not done yet.

michaelklishin · 2019-02-05T14:43:04Z

@garazdawi how will this work for mixed version clusters, e.g. when an OTP 22 node is connected to a 21.2 one?

garazdawi · 2019-02-05T14:44:19Z

The feature is only available if both nodes present the distribution flag indicating that they support fragmented messages.
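
As a rough illustration of that check, here is a Python sketch. The flag name `DFLAG_FRAGMENTS` and its bit value are assumptions that should be verified against the distribution protocol documentation; `may_fragment` is a hypothetical helper, not a function in the runtime.

```python
# Assumed name and bit value of the capability flag; verify against the docs.
DFLAG_FRAGMENTS = 0x800000


def may_fragment(local_flags, remote_flags):
    """Fragmentation is used only if both nodes advertise the flag."""
    return bool(local_flags & remote_flags & DFLAG_FRAGMENTS)
```

A node that does not present the flag would make `may_fragment` return `False`, so the connection falls back to the old, unfragmented distribution header.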

erts/doc/src/erl_dist_protocol.xml

OTP-15610

Before this change, the inet driver was in list mode, so the data from it had to be copied when received by the dist entry. This change puts the TCP port in binary mode and lets any refc binary it creates be used all the way to the process where it is decoded, which eliminates one copy of the entire message payload.

The dist messages EXIT, EXIT2 and MONITOR_DOWN have been updated with new versions that send the reason term as part of the payload of the message instead of as part of the control message. This allows the decode of the reason to be done by the receiving process instead of the dist entry which in turn makes it possible for multiple decodes to be done in parallel. This change is done in order to make it easier to fragment the potentially large payload of EXIT, EXIT2 and MONITOR_DOWN into multiple distribution messages. OTP-15611

OTP-15612


garazdawi added 2 commits February 5, 2019 14:40

erts: Limit binary printout for %.XT in erts_print

bcb79fa

erts: Refactor rbt _yielding to use reductions

5c8f2be

All of the Red-Black Tree _yielding functions have been updated to work with reductions returned by the called function instead of yielding on each element.
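
The idea of yielding on a reduction budget, rather than after every element, can be sketched in Python as follows. A plain list stands in for the in-order traversal of the tree, and `foreach_yielding`, `func`, and `reds` are illustrative names, not the signatures used in the C code.

```python
def foreach_yielding(items, func, start, reds):
    """Visit items[start:] until the reduction budget is used up.

    func(item) returns the number of reductions it consumed.
    Returns (next_index, reds_left); next_index == len(items) means done,
    anything smaller means the caller should yield and resume from there.
    """
    index = start
    while index < len(items) and reds > 0:
        # Charge at least one reduction per element so progress is bounded.
        reds -= max(1, func(items[index]))
        index += 1
    return index, max(reds, 0)
```

A caller would loop, handing the returned index back in as `start` with a fresh budget each time it is rescheduled, until the index reaches `len(items)`.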

garazdawi added the team:VM and feature labels Feb 5, 2019

garazdawi self-assigned this Feb 5, 2019

garazdawi requested review from rickard-green and sverker February 5, 2019 14:35

ferd reviewed Feb 5, 2019

erts/doc/src/erl_dist_protocol.xml (outdated)

kjnilsson mentioned this pull request Feb 12, 2019

Adjust batching numbers based on entry sizes rabbitmq/ra#29

Closed

garazdawi added 3 commits February 21, 2019 16:37

erts: Yield later during process exit and allow free procs to run

45c5725

OTP-15610

garazdawi force-pushed the lukas/erts/fragment-dist-messages branch from ccdb9c0 to f4c121b Compare February 22, 2019 08:29

garazdawi added 8 commits February 22, 2019 11:12

erts: Expand distribution protocol documentation

6686877

erts: Implement fragmentation of distrubution messages

f2c4f6f

erts: Make remote send of exit/2 trap

d191345

OTP-15612

erts: Add distr testcases for fragmentation

6493f5e

erts: Refactor ErtsSendContext to be ErtsDSigSendContext

1066040

This commit removed the general send context (which was used very little anyways) and only uses the distributed send context. This will make it easier to use the dist API at the cost of a little bit more code for the local send.

erts: Add ERL_NODE_BOOKKEEP to node tables refc

2bf27ec

erts: Implement trapping while sending distr exit/down

fc09673

The reason in EXIT and DOWN may be arbitrarily large, so we yield and allow other processes to execute while encoding and sending the signals over the distribution.

erts: Expand etp to look for free processes

c0c6f6b

garazdawi force-pushed the lukas/erts/fragment-dist-messages branch 2 times, most recently from 3ee8e06 to c0c6f6b Compare February 22, 2019 10:14

garazdawi merged commit c0c6f6b into erlang:master Feb 22, 2019

michaelklishin mentioned this pull request Oct 24, 2019

ra performance in large data transfer scenarios rabbitmq/ra#135

Closed

garazdawi deleted the lukas/erts/fragment-dist-messages branch October 24, 2019 08:23

ityonemo mentioned this pull request Feb 25, 2021

Kousa: Arch: Consider using Phoenix Tracker or Phoenix Presence instead of database benawad/dogehouse#407

Open
