
Fix concurrent large transfers by using dedicated channel #38

Merged
merged 1 commit into from Feb 24, 2016

Conversation

@antrik
Contributor

antrik commented Feb 21, 2016

As evidenced by the newly added "concurrent_senders" test case, the
previous approach never actually worked with concurrent large transfers:
when trying to "push back" packets to the "receiver" FD of the socket
pair, they do not show up on the "receiver" FD again, but rather the
"sender" FD.

This problem can be avoided entirely by automatically creating a
dedicated channel for each large transfer, so the follow-up fragments can
all be received in order, without interleaving with fragments from other
senders.

The first fragment of each large transfer is still sent through the main
channel, along with an FD for the dedicated channel; the remaining
fragments are then sent through the dedicated channel.

This approach adds a slight overhead for fragmented transfers when there
is no actual concurrency -- though it is way more efficient when actual
concurrency occurs than the push-back approach would have been...

With the dedicated channel, the follow-up fragments do not really need
the fragmentation headers anymore; and the header for the initial
fragment could be simplified as well. However, to keep changes minimal,
this commit doesn't change the header yet -- this is left for later
improvements.
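The mechanism described in the commit message can be sketched with plain in-process channels. This is a hypothetical simplification using `std::sync::mpsc` (all names here are invented for illustration); the real code instead sends the dedicated channel's file descriptor over the Unix socket alongside the first fragment:

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

const FRAGMENT_SIZE: usize = 4;

// A message on the main channel: the first fragment, plus (for large
// transfers only) the receiving end of a dedicated channel carrying
// the remaining fragments.
struct FirstFragment {
    data: Vec<u8>,
    rest: Option<Receiver<Vec<u8>>>,
}

fn send_message(main_tx: &Sender<FirstFragment>, msg: &[u8]) {
    if msg.len() <= FRAGMENT_SIZE {
        // Small message: goes out in one piece, no dedicated channel.
        main_tx
            .send(FirstFragment { data: msg.to_vec(), rest: None })
            .unwrap();
        return;
    }
    // Large message: create a dedicated channel, and hand its receiving
    // end to the peer along with the first fragment.
    let (dedicated_tx, dedicated_rx) = channel();
    main_tx
        .send(FirstFragment {
            data: msg[..FRAGMENT_SIZE].to_vec(),
            rest: Some(dedicated_rx),
        })
        .unwrap();
    // Follow-up fragments go through the dedicated channel, so they
    // cannot interleave with fragments from other senders.
    for chunk in msg[FRAGMENT_SIZE..].chunks(FRAGMENT_SIZE) {
        dedicated_tx.send(chunk.to_vec()).unwrap();
    }
}

fn recv_message(main_rx: &Receiver<FirstFragment>) -> Vec<u8> {
    let first = main_rx.recv().unwrap();
    let mut data = first.data;
    if let Some(rest) = first.rest {
        // Drain the dedicated channel; iteration ends when the sender
        // drops its end after the last fragment.
        for chunk in rest {
            data.extend(chunk);
        }
    }
    data
}

fn main() {
    let (main_tx, main_rx) = channel();
    // Two concurrent senders, each transferring a large (10-byte) message.
    let handles: Vec<_> = [b'a', b'b']
        .iter()
        .map(|&byte| {
            let tx = main_tx.clone();
            thread::spawn(move || send_message(&tx, &vec![byte; 10]))
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    // Each message reassembles intact, regardless of send interleaving.
    for _ in 0..2 {
        let msg = recv_message(&main_rx);
        assert_eq!(msg.len(), 10);
        assert!(msg.iter().all(|&b| b == msg[0]));
    }
    println!("ok");
}
```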

@antrik
Contributor Author

antrik commented Feb 21, 2016

According to @glennw this fixes hangs in WebRender.

iov_len: data_buffer.len() as size_t,
};
// Put this on the heap so address remains stable across function return.
let mut iovec = Box::new(iovec {


@emilio

emilio Feb 21, 2016

Member

Isn't this relying on the iovec not being freed? (so we're either using it after free or leaking it?)


@emilio

emilio Feb 21, 2016

Member

I don't know how critical it is, but maybe you could avoid this allocation, and the hackish iovec return, by passing an uninitialized &mut iovec to the closure.


@antrik

antrik Feb 21, 2016

Author Contributor

I tried this too; however, it can't actually be passed uninitialised as far as I can see (the compiler complains) -- so in the end it looked uglier to me than the Box approach.

(Especially since in truth the iovec field is actually a variable-sized array... Which I intend to make explicit in one of the follow-up commits.)


@emilio

emilio Feb 21, 2016

Member

There is std::mem::uninitialized() that allows you to do that :P

msg_flags: 0,
// Be sure to always return iovec -- whether the caller uses it or not --
// to prevent premature deallocation!
(msghdr, iovec)


@emilio

emilio Feb 21, 2016

Member

Whoops, forget that, just read this line.
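For reference, the lifetime pattern under discussion can be sketched in standalone form. This is a hypothetical simplification with invented struct names; the real code builds libc's `msghdr`/`iovec` for `sendmsg`:

```rust
// Simplified stand-ins for libc's iovec and msghdr (hypothetical names):
// `Msghdr` stores a raw pointer to the `Iovec`, so the `Iovec` itself
// must live at a stable address while the header is in use.
struct Iovec {
    base: *const u8,
    len: usize,
}

struct Msghdr {
    iov: *const Iovec,
    iovlen: usize,
}

fn construct_header(data: &[u8]) -> (Msghdr, Box<Iovec>) {
    // Put the iovec on the heap so its address remains stable across
    // the function return; a stack-allocated one would dangle.
    let iovec = Box::new(Iovec {
        base: data.as_ptr(),
        len: data.len(),
    });
    let msghdr = Msghdr {
        iov: &*iovec,
        iovlen: 1,
    };
    // Return the Box alongside the header -- whether the caller uses it
    // or not -- so the allocation isn't freed while msghdr points at it.
    (msghdr, iovec)
}

fn main() {
    let data = b"hello".to_vec();
    let (msghdr, _iovec_keepalive) = construct_header(&data);
    assert_eq!(msghdr.iovlen, 1);
    let iov = unsafe { &*msghdr.iov };
    let slice = unsafe { std::slice::from_raw_parts(iov.base, iov.len) };
    assert_eq!(slice, b"hello");
    println!("ok");
}
```

Dropping the returned `Box` before the header is consumed would be exactly the use-after-free emilio was worried about.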

*cmsg_padding_ptr = *DEV_NULL;
cmsg_padding_ptr = cmsg_padding_ptr.offset(1);
}
let construct_header = |channels: &Vec<UnixChannel>, data_buffer: &Vec<u8>| {


@emilio

emilio Feb 21, 2016

Member

nit: You can prevent the extra indirection using &[UnixChannel] instead of &Vec<UnixChannel>, and &[u8] instead of &Vec<u8>.


@antrik

antrik Feb 21, 2016

Author Contributor

I wonder whether this would improve readability and/or performance?

(BTW, I intend to make major changes to this code in further iterations -- so probably not worthwhile to spend too much time polishing details... But I can change it of course if it's generally considered better style.)


@emilio

emilio Feb 21, 2016

Member

I think it improves both (although performance-wise it's probably optimized away in release). It's fine for me if you don't change it just yet though.


@antrik

antrik Feb 21, 2016

Author Contributor

OK -- being a newbie at Rust, I'll just trust you on this :-)
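The nit being discussed is a standard Rust idiom: a function taking `&[u8]` avoids the double indirection of `&Vec<u8>`, and callers holding a `Vec` need no changes because of implicit deref coercion. A minimal illustration (hypothetical function name):

```rust
// Taking `&[u8]` instead of `&Vec<u8>` drops one level of indirection
// and accepts more argument types; a `&Vec<u8>` still coerces to `&[u8]`
// implicitly at the call site, so callers don't change.
fn checksum(data: &[u8]) -> u32 {
    data.iter().map(|&b| b as u32).sum()
}

fn main() {
    let owned: Vec<u8> = vec![1, 2, 3];
    assert_eq!(checksum(&owned), 6); // &Vec<u8> coerces to &[u8]
    assert_eq!(checksum(&owned[1..]), 5); // slices work too, no Vec needed
    println!("ok");
}
```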

@@ -691,21 +709,19 @@ fn recv(fd: c_int, blocking_mode: BlockingMode)
return Ok((main_data_buffer, channels, shared_memory_regions))
}

// Reassemble fragments.
let mut leftover_fragments = Vec::new();
// Reassomble fragments.


@emilio

emilio Feb 21, 2016

Member

nit: Reassemble


@antrik

antrik Feb 21, 2016

Author Contributor

Whoops... Not sure how this slipped through. I usually reread my diffs thoroughly to prevent this kind of fail :-(

@emilio
Member

emilio commented Feb 21, 2016

If I understand this well, this works by creating an extra dedicated channel at send time, receiving that channel's receiving end at recv time, and then receiving the leftover fragments through it.

I don't know how performant it is, but at least it seems correct :)

FWIW, all the tests are passing locally for me in both debug and release mode.

r? @pcwalton

@emilio
Member

emilio commented Feb 21, 2016

@antrik Do you know if the max message size received in a single fragment is well defined?

Could the extra channel creation be avoided for small messages?

I guess it's not, but...

@antrik
Contributor Author

antrik commented Feb 21, 2016

@ecoal95 check out the rest of the function: fragmentation (and creation of the dedicated channel) is only done if sending in one piece failed. For small messages, this change shouldn't introduce any overhead whatsoever.
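The control flow antrik describes can be sketched as follows. This is a hypothetical simplification with invented names; the real code attempts an actual `sendmsg` and only falls back to fragmentation when the kernel rejects the oversized message:

```rust
// Fake transport limit, standing in for the kernel's datagram size cap.
const MAX_DATAGRAM: usize = 8;

enum SendError {
    TooLarge,
}

// Stands in for a sendmsg() call that fails (e.g. with EMSGSIZE) when
// the message doesn't fit in a single datagram.
fn try_send_whole(msg: &[u8]) -> Result<(), SendError> {
    if msg.len() > MAX_DATAGRAM {
        Err(SendError::TooLarge)
    } else {
        Ok(())
    }
}

fn send(msg: &[u8]) -> &'static str {
    match try_send_whole(msg) {
        // Early return: small messages never reach the fragmentation
        // path, so no dedicated channel is ever created for them.
        Ok(()) => "sent whole",
        Err(SendError::TooLarge) => {
            // Only now would the dedicated channel be created and the
            // message sent in fragments.
            "fragmented via dedicated channel"
        }
    }
}

fn main() {
    assert_eq!(send(b"hi"), "sent whole");
    assert_eq!(send(&[0u8; 100]), "fragmented via dedicated channel");
    println!("ok");
}
```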

@emilio
Member

emilio commented Feb 21, 2016

@antrik Whoops, sorry, that's true (didn't read thoroughly, and missed the early return Ok(())).

Nice then! :)

@antrik antrik force-pushed the antrik:fix-concurrent-send branch from 813c62d to 0bf557e Feb 21, 2016
@antrik
Contributor Author

antrik commented Feb 21, 2016

Fixed the nits -- and also removed the now unused UnixCmsg.send() method, which I forgot to do earlier...

@pcwalton
Collaborator

pcwalton commented Feb 22, 2016

Looks good other than one stylistic nit. Thanks so much for looking into this!

@antrik
Contributor Author

antrik commented Feb 23, 2016

That was the other major stylistic doubt I had... ;-)

The only variable it closes over is shared_memory_regions. I considered passing it explicitly too for more uniformity... And I also tried declaring it as a "regular" function. It felt a bit awkward because of the additional type annotations needed with that syntax -- but I'm not really opposed to it.

However, if I understand you right, you would actually want that as an entirely separate function rather than a nested one? I'd rather not do that: this code is tightly coupled; I'm sure it will feel awkward when I make further changes to it later on...

@pcwalton
Collaborator

pcwalton commented Feb 24, 2016

OK, I don't really care too much either way. I think I'd prefer a nested function if it isn't too much trouble.

@antrik antrik force-pushed the antrik:fix-concurrent-send branch from 0bf557e to ce7c296 Feb 24, 2016
@antrik
Contributor Author

antrik commented Feb 24, 2016

Turned it into a "regular" nested function now.

@pcwalton
Collaborator

pcwalton commented Feb 24, 2016

@bors-servo
Contributor

bors-servo commented Feb 24, 2016

📌 Commit ce7c296 has been approved by pcwalton

@bors-servo
Contributor

bors-servo commented Feb 24, 2016

Testing commit ce7c296 with merge 8bd8d3e...

bors-servo added a commit that referenced this pull request Feb 24, 2016
Fix concurrent large transfers by using dedicated channel

@antrik
Contributor Author

antrik commented Feb 24, 2016

Weird: it turns out that this change actually slightly improves performance even in the non-concurrent case...

@bors-servo
Contributor

bors-servo commented Feb 24, 2016

☀️ Test successful - travis

@bors-servo bors-servo merged commit ce7c296 into servo:master Feb 24, 2016
1 of 3 checks passed
continuous-integration/appveyor/pr Waiting for AppVeyor build to complete
homu Testing commit ce7c296 with merge 8bd8d3e...
continuous-integration/travis-ci/pr The Travis CI build passed