Make workers resilient to deserialize failures #13795
Conversation
malmaud commented Oct 27, 2015
- Add a boundary marker to the wire format of transmitting remote messages so errors in deserialization can be recovered from.
- Store the deserialization error in the proper refs so the client sees the error at the expected time.
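The boundary-marker idea in the description can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the names `send_msg_sketch`, `recv_msg_sketch`, `scan_past_boundary`, and `BOUNDARY_LEN` are all mine. Two 10-byte markers per message would also account for the ~20 extra bytes per message mentioned later in this thread.

```julia
using Serialization

const BOUNDARY_LEN = 10  # hypothetical constant; the thread discusses a 10-byte marker

# Write a random boundary before and after the serialized payload, so the
# receiver knows which byte sequence delimits this particular message.
function send_msg_sketch(io::IO, msg)
    boundary = rand(UInt8, BOUNDARY_LEN)
    write(io, boundary)      # announces the marker for this message
    serialize(io, msg)
    write(io, boundary)      # terminates the message
end

# Skip forward until `boundary` has been consumed, so a failed deserialize
# does not poison the rest of the stream.
function scan_past_boundary(io::IO, boundary::Vector{UInt8})
    window = zeros(UInt8, length(boundary))
    while !eof(io)
        popfirst!(window)
        push!(window, read(io, UInt8))
        window == boundary && return
    end
end

function recv_msg_sketch(io::IO)
    boundary = read(io, BOUNDARY_LEN)
    msg = try
        deserialize(io)
    catch err
        scan_past_boundary(io, boundary)            # resynchronize the stream
        return CapturedException(err, catch_backtrace())
    end
    read(io, BOUNDARY_LEN)                          # consume the trailing marker
    return msg
end
```

With this shape, a deserialization failure consumes bytes up to the next boundary and returns a captured exception instead of killing the connection; the next message still parses normally.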
```diff
@@ -786,7 +805,7 @@ end

 function wait_ref(rid, callee, args...)
     v = fetch_ref(rid, args...)
-    if isa(v, RemoteException)
+    if isa(v, Remot1eException)
```
deliberate?
I think you know the answer to that.
Force-pushed from e6e88a5 to 57dd49c
Looks good!
Note that this adds a minimum of 20 extra bytes to every message. While that is not much in general, it will worsen the already poor scalability of our existing DGC mechanism. That issue should be tackled by having write-once Futures (in most cases where RemoteRefs are currently used) and a complete removal of RemoteRefs from DArrays.
If the 20 bytes really is a concern, then @ssfrr's proposal to use https://en.wikipedia.org/wiki/Consistent_Overhead_Byte_Stuffing could be adopted. But I totally agree that if we're at the point where 20 bytes/message is a concern, that's a sign that too many small messages are being sent, which is more productively addressed through architectural changes than by byte-shaving.
I prefer this scheme to COBS, which will have quite a computational overhead for long messages, no?
Looking at Wikipedia's C reference implementation, it will be O(n) in the length of the message with a constant pretty close to 1.
How would COBS work for direct read/write to large array locations without any intermediate buffer?
COBS requires a fixed 256-byte buffer independent of the size of the stream it's encoding. So we could create a new IO type that has as members an internal buffer of `Vector{UInt8}(256)` and a reference to the underlying (TCP) stream. The calls to … But COBS introduces one byte of overhead for every 256 bytes in the original message. So by the time a message is 20*256 = 5120 bytes, the overhead is greater than with the random-10-byte boundary-marker scheme.
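Not part of this PR, but for reference, here is a minimal COBS encoder sketched from the scheme's public description (the name `cobs_encode` is mine). It runs in O(n): each input byte is handled once, and one code byte is emitted per zero byte and per full run of 254 nonzero bytes, which is where the small size overhead comes from.

```julia
# Minimal COBS encoder sketch. Each "block" of up to 254 nonzero bytes is
# prefixed by a code byte (block length + 1); zero bytes are never emitted,
# which is the whole point of the encoding.
function cobs_encode(data::Vector{UInt8})
    out = UInt8[]
    block = UInt8[]
    for b in data
        if b == 0x00
            push!(out, UInt8(length(block) + 1))  # code byte for this block
            append!(out, block)
            empty!(block)
        else
            push!(block, b)
            if length(block) == 254               # full block: flush with code 0xFF
                push!(out, 0xff)
                append!(out, block)
                empty!(block)
            end
        end
    end
    push!(out, UInt8(length(block) + 1))          # final block
    append!(out, block)
    return out
end
```

For example, `[0x11, 0x22, 0x00, 0x33]` encodes to `[0x03, 0x11, 0x22, 0x02, 0x33]`, and the output never contains a zero byte.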
What I meant was that the computational overhead will include the cost of a …
Ya, that's true.
Anyways, I'm definitely not advocating for COBS, just pointing it out. I'd vote for merging this as-is and then tackling the general latency and performance problems with routine …
I don't like the idea that there's even a remote chance of accidental incorrect decoding. That just seems like asking for trouble. But I also don't want to pick a scheme that precludes zero-copy I/O for arrays.
I think the risk is being overstated: if deserialization succeeds, it doesn't matter whether the serialized data contains the boundary bytes. That's the key thing to remember. The only risk is if deserialization throws an error and then, in the middle of the stream past where the error was thrown, the 10 boundary bytes occur (there's on the order of a 1e-24 probability of any given 10-byte sequence accidentally matching the boundary). And even in that vanishingly unlikely situation, which requires the confluence of deserialization failing and a ~1e-24 event occurring, the worker would just immediately throw a fatal error as it tries to read the next message in the stream. Cf. the widely used HTTP multipart encoding, which uses this same scheme except that it fails whenever the boundary bytes occur anywhere in the payload, error or not.
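For concreteness, the collision odds for a 10-byte marker can be computed directly (my own arithmetic, not from the thread):

```julia
# Probability that a particular 10-byte window of arbitrary data equals a
# uniformly random 10-byte boundary: one chance in 256^10 = 2^80.
p = big(1) / big(256)^10   # ≈ 8.3e-25, i.e. on the order of 1e-24
```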
That's a good argument.
What's the current thinking about this? Anything more I can do here?
Can you add more detailed comments on what is going on (for instance, the byte representation of a message stream)?
LGTM. It will be good to test the impact of this change with timing, say a 10^5 …
Testing locally with `addprocs(1)`:

```julia
@everywhere echo(x) = x
function test()
    for n = 1:10^5
        remotecall_fetch(echo, 2, :test)
    end
end
@time test()
@time test()
```

gives … So the regression is actually more serious than I thought.
Maybe instead of allocating and randomly generating the 10-byte boundary marker each time a message is sent, a single pre-generated boundary could be used for the entire Julia session.
Why not just pick a fixed 10-byte boundary? Then it's deterministic, which I like better. I guess that makes it vulnerable to intentional attacks though.
Ya, that's what I meant by 'pre-generated boundary'. The other obvious optimization is to hoist the allocation of the buffer that stores the boundary bytes out of the worker's message-processing loop.
I don't see how it increases vulnerability, since we are not depending on the boundary for message processing; we depend on it only for recovery from a deserialization error. We could have a random 10-byte boundary that is generated and sent once per connection, as part of the absolute first message sent from either side. It is then used for the duration of the connection, and we have a pair of them for every connection. For connections from master to worker, … So, for a connection between, say, workers 4 and 5, we have one 10-byte boundary for messages sent from 4 to 5 and a different boundary for messages sent from 5 to 4.
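A per-connection handshake along those lines could look like this. This is a sketch under my own naming (`setup_connection_sketch` is hypothetical); in practice the exchange would ride on the existing connection-setup messages.

```julia
# Each side generates the boundary it will use for its *outgoing* messages
# and sends it as the very first bytes on the connection; the first bytes it
# reads are the peer's boundary. This yields one boundary per direction.
function setup_connection_sketch(out_io::IO, in_io::IO)
    my_boundary = rand(UInt8, 10)
    write(out_io, my_boundary)       # absolute first bytes we ever send
    peer_boundary = read(in_io, 10)  # absolute first bytes the peer ever sends
    return my_boundary, peer_boundary
end
```

After setup, every outgoing message is framed with `my_boundary` and every incoming message is scanned against `peer_boundary`, with no per-message random generation.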
What's gained from that vs. a fixed Julia-wide 10-byte boundary?
Hmm, this whole PR kinda slipped my mind. I could probably look into it today. |
@amitmurthy I'm thinking of working on this during the hackathon today. Not sure it's worth the effort to formally rebase given the number of conflicts I'm getting, but I can reimplement the logic on top of current master without too much trouble.
Force-pushed from ced7520 to 1b7e3e1
As discussed, have rebased onto master. There is still a slowdown that needs to be tackled.
Great! Things to try: …
Is there a reason to not make …
Not really. Making stuff immutable didn't help. Will try writing the ints directly next.
Hmm, I pushed my own version of immutability but I'm not sure exactly what timing tests you're running.
Testing locally, running the same test as #13795 (comment).
Serializing 4 ints and not the type returns …
Better than master. What timings are you seeing with just making the header stack-allocatable?
No improvement. I didn't look at the codegen, though, to see if it's actually stack-allocating. It's also possible that even if the …
So it seems the right thing to do is to just send the raw ints? It's probably still good coding practice to change as many of the types in multi.jl to immutable as possible, but that can be a separate PR.
Just pushed the header-as-ints update. But this leads me to wonder what the actual overhead now is of each message itself being wrapped in a type.
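The header-as-raw-ints idea can be sketched as follows. The struct and field names here are illustrative, not necessarily the PR's: the point is that writing four `Int`s directly skips the serializer's per-object type tags, which is where the per-message overhead was going.

```julia
# Hypothetical four-int message header, written and read as raw machine
# ints (8 bytes each on 64-bit, no type tags), bypassing the serializer.
struct HeaderSketch
    response_whence::Int
    response_id::Int
    notify_whence::Int
    notify_id::Int
end

function write_header(io::IO, h::HeaderSketch)
    write(io, h.response_whence, h.response_id, h.notify_whence, h.notify_id)
end

read_header(io::IO) =
    HeaderSketch(read(io, Int), read(io, Int), read(io, Int), read(io, Int))
```

This fixes the header at 32 bytes on 64-bit systems and makes its cost independent of the serializer's dispatch and tagging machinery.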
Bypassing the serializer for …
Will clean up the code and push.
```diff
@@ -28,13 +28,8 @@ let REF_ID::Int = 1
 end

 immutable RRID
<<<<<<< Updated upstream
```
be sure to squash this
Force-pushed from ec16a4e to f317f5c
@malmaud, have pushed an update. I have optimized it a bit more and am seeing decent numbers now.
Please review and merge. On a different note, optimizing generic serialization of compound types would result in quite a large speedup of inter-process communication. But that is probably 0.6 work.
```julia
end
end
end
@eval begin
```
combine the two eval blocks?
Force-pushed from f317f5c to ff784f8
LGTM! Let's merge once CI passes unless anyone has any further objections. Thanks @amitmurthy for all the help in seeing this through.