empty frame in TCP/IP communication #3028

Closed
spirali opened this issue Sep 5, 2019 · 5 comments

@spirali
Contributor

commented Sep 5, 2019

Hi,

As I understand from the documentation, a Dask message sent over TCP/IP is framed as follows:

<count> <size_1> ... <size_count> <payload_1> ... <payload_count>

In Wireshark I see (Distributed version 2.3.2) the following pattern:

00000000  02 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ........ ........
00000010  4f 00 00 00 00 00 00 00  83 a2 6f 70 af 72 65 67   O....... ..op.reg
00000020  69 73 74 65 72 2d 63 6c  69 65 6e 74 a6 63 6c 69   ister-cl ient.cli
00000030  65 6e 74 d9 2b 43 6c 69  65 6e 74 2d 36 37 31 63   ent.+Cli ent-671c
00000040  62 63 61 34 2d 63 66 62  31 2d 31 31 65 39 2d 61   bca4-cfb 1-11e9-a
00000050  62 64 36 2d 39 63 62 36  64 30 62 62 36 36 64 39   bd6-9cb6 d0bb66d9
00000060  a5 72 65 70 6c 79 c2                               .reply.
    00000000  02 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ........ ........
    00000010  12 00 00 00 00 00 00 00  91 81 a2 6f 70 ac 73 74   ........ ...op.st
    00000020  72 65 61 6d 2d 73 74 61  72 74                     ream-sta rt
00000067  02 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ........ ........
00000077  23 00 00 00 00 00 00 00  92 81 a2 6f 70 ac 63 6c   #....... ...op.cl
00000087  6f 73 65 2d 63 6c 69 65  6e 74 81 a2 6f 70 ac 63   ose-clie nt..op.c
00000097  6c 6f 73 65 2d 73 74 72  65 61 6d                  lose-str eam

It seems that every message is composed of two frames (02 00 00 00 00 00 00 00) where the first frame is always empty (00 00 00 00 00 00 00 00).
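To make the observation above concrete, here is a minimal sketch that splits one captured message into its frames. It assumes the sizes are unsigned 64-bit little-endian integers (as the capture suggests); `split_frames` is a hypothetical helper written for illustration, not part of Distributed.

```python
import struct


def split_frames(data: bytes) -> list:
    """Split one message laid out as <count> <size_1..n> <payload_1..n>.

    Assumption: every count/size field is an unsigned 64-bit
    little-endian integer, matching the bytes seen in the capture.
    """
    (count,) = struct.unpack_from("<Q", data, 0)
    sizes = struct.unpack_from(f"<{count}Q", data, 8)
    offset = 8 + 8 * count  # payloads start right after the size table
    frames = []
    for size in sizes:
        frames.append(data[offset:offset + size])
        offset += size
    return frames


# The 'stream-start' reply, copied byte for byte from the dump above.
captured = bytes.fromhex(
    "0200000000000000"  # count = 2
    "0000000000000000"  # size_1 = 0 (the empty first frame)
    "1200000000000000"  # size_2 = 18
    "9181a26f70ac73747265616d2d7374617274"  # msgpack payload
)
frames = split_frames(captured)
```

Here `frames[0]` is the empty first frame and `frames[1]` holds the 18-byte msgpack payload.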

Is it a bug or is there reasoning behind it?

@mrocklin

Member

commented Sep 5, 2019

Interesting, we're just using Tornado IOStreams and the following code:

lengths = [nbytes(frame) for frame in frames]
length_bytes = [struct.pack("Q", len(frames))] + [
    struct.pack("Q", x) for x in lengths
]
if sum(lengths) < 2 ** 17:  # 128kiB
    b = b"".join(length_bytes + frames)  # small enough, send in one go
    stream.write(b)
else:
    stream.write(b"".join(length_bytes))  # avoid large memcpy, send in many
    for frame in frames:
        # Can't wait for the write() Future as it may be lost
        # ("If write is called again before that Future has resolved,
        #   the previous future will be orphaned and will never resolve")
        if not self._iostream_allows_memoryview:
            frame = ensure_bytes(frame)
        future = stream.write(frame)
        bytes_since_last_yield += nbytes(frame)
        if bytes_since_last_yield > 32e6:
            yield future
            bytes_since_last_yield = 0

We frame things up a bit with the number of frames, and the size of each frame, but then each frame should be sent independently. I don't see why we would get empty frames. I'd love to see why this is happening though and if there is anything that we can do to fix things. Is this something that you have an interest in looking into?
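As a rough illustration of the small-message path in the snippet above, the following sketch packs a list of frames the same way and shows that a zero-length first frame reproduces the prefix seen in the capture. `pack_message` is a hypothetical stand-in, and it pins the byte order to little-endian (`"<Q"`), whereas the original code uses the native-order `"Q"`; that choice is an assumption made so the output is the same on every platform.

```python
import struct


def pack_message(frames):
    """Pack frames as <count> <size_1..n> <payload_1..n> in a single buffer.

    Hypothetical helper mirroring only the 'small enough, send in one go'
    branch; little-endian is assumed here for reproducibility.
    """
    lengths = [len(frame) for frame in frames]
    length_bytes = [struct.pack("<Q", len(frames))] + [
        struct.pack("<Q", n) for n in lengths
    ]
    return b"".join(length_bytes + list(frames))


# An empty first frame followed by the msgpack payload from the capture.
payload = b"\x91\x81\xa2op\xacstream-start"
message = pack_message([b"", payload])
prefix_hex = message[:24].hex()
```

The first 24 bytes of `message` are the frame count (2), the zero size of the first frame, and the size of the payload (0x12), which matches the three 8-byte fields at the start of each captured message.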

@spirali

Contributor Author

commented Sep 5, 2019

I see now where it comes from. The first frame is a header for the payload, a place where compression can be specified. Since my captured messages have no headers, the first frames are empty. So an empty first frame is probably as intended. In the protocol it is just a small 8-byte overhead. I am sorry for opening a useless issue.

I am looking into this because I am thinking about creating an experimental (and very limited) reimplementation of the scheduler in Rust. One reason is to run some experiments with schedulers, and another is to estimate the speed difference between a Rust and a Python implementation. Therefore, I am looking into the communication layer to get a picture of what has to be implemented.

@mrocklin

Member

commented Sep 5, 2019

@mrocklin

Member

commented Sep 5, 2019

OK to close?

@spirali

Contributor Author

commented Sep 6, 2019

If I finish something, I will definitely let you know. Thank you for your response. (I hope no significant changes to the protocol are coming soon :)

@spirali spirali closed this Sep 6, 2019
