Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use autoincrementing number for handler ids #105

Merged
merged 3 commits into from
Nov 10, 2016

Conversation

dlrobertson
Copy link
Contributor

@dlrobertson dlrobertson commented Oct 1, 2016

Use Uuid instead of i64 for the handler keys to ensure file descriptor number reuse doesn't cause errors, and make sure to close file descriptors.

NB: The scope of the PR has changed to change the handler id's from a platform specific i64 to an autoincrementing u64. macos has not yet been ported, and probably suffers from the same leak as the unix module did prior to this PR

@vvuk
Copy link
Contributor

vvuk commented Oct 14, 2016

Instead of uuid, can we just have a i64 id that we just atomically increment when we need one? Would also mean no changes to anything but the platform backends.

@dlrobertson
Copy link
Contributor Author

Valid point. We could do that, but imo if we did that we should definitely implement some sort of id reuse. The creation of the id reuse algorithm could take as much effort as switching Uuid, depending on the route taken.

@vvuk
Copy link
Contributor

vvuk commented Oct 18, 2016

Why do we need id reuse? If a program allocated a new id every millisecond, it would take ~585 million years to exaust 2^64. I think that's a pretty good definition of "someone else's problem" ;) (1 per ns gives us 585 years, which is slightly more concerning, but only slightly.)

@dlrobertson
Copy link
Contributor Author

dlrobertson commented Oct 18, 2016

Haha very vary true and a very good point... I though about that, and AFAIK you'd have to exhaust ((2 ^ 32) - 1) due to the use of i64 and the two's complement, so you'd end up with ~292 years which from what I've heard is about as long as it would take to generate a duplicate Uuid (from what I've read on the internets). If we switched to u64 for the id we'd be at ~585 to cause overflow. But at that point, the chances of generating a duplicate or causing overflow is pretty low.

If the consensus is to use an auto incrementing i64 I'm 100% okay with that. I'm just hyper paranoid about writing code that allows for overflow, but I suppose that is what checked_add is for.

@antrik
Copy link
Contributor

antrik commented Oct 19, 2016

It's 2^63-1 till integer overflow -- actual collision would indeed be exactly 2^64 I believe. You should be able to avoid the overflow, to squeeze out that last bit (literally), by using u64 internally, and casting to i64 as necessary.

(For production builds, it should produce the exact same machine code I think?)

Either way, I think it will be more efficient and less invasive than using UUIDs.

I'd say the actual FD closing should probably go into a separate patch. (The ID change lays the groundwork by decoupling the handles from the FDs; while the main change builds on top of this to fix the leak AIUI...)

Changing the handling of r_id to use Option<> can probably also go into a separate patch before the main ID handling change, for easier review... Unless it's too much hassle to untangle it I guess.

@dlrobertson
Copy link
Contributor Author

using u64 internally, and casting to i64 as necessary.

Good point. I'll use this in the update

I'd say the actual FD closing should probably go into a separate patch.

Np. Nothing git rebase can't handle.

Thanks @antrik and @vvuk! I'll have some time this weekend and push the update using u64.

@bors-servo
Copy link
Contributor

☔ The latest upstream changes (presumably #102) made this pull request unmergeable. Please resolve the merge conflicts.

Use an Option<i64> for r_id instead of a plain i64, and use None instead
of -1 to indicate the id does not exist in the handles.
@dlrobertson dlrobertson force-pushed the use-uuid branch 2 times, most recently from eb72e53 to b8fdf19 Compare November 7, 2016 00:38
@dlrobertson
Copy link
Contributor Author

Updated to use an auto-incrementing u64

Copy link
Contributor

@antrik antrik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks good :-) (Aside from minor optimisations -- see below.)

The PR title however doesn't fit any more; and perhaps more importantly, I'm a bit unhappy with the commit messages... The second commit in particular actually has three somewhat distinct changes: switching the type of the tokens returned by OsIpcReceiverSet.add()/select() to u64, to delay potential integer overflow; factoring out the counter code from the inprocess implementation into a generic Incrementor facility, so it can be reused by the other back-ends; and decoupling the tokens used in the unix implementation from the actual FDs, to allow closing the FDs without introducing race conditions due to FD reuse. While I don't feel strongly about actually splitting these into separate commits (I don't think it substantially affects readability in this case), I think it would help a lot at least to explicitly mention and explain each of these changes in the commit message...

The third commit could mention that the FDs are closed in select(); and that this is to avoid them leaking. Also might want to point out that this is safe now with the previous changes.

The first patch ironically has the most explicit commit message, although it's the simplest change :-) Still, could add that this is in preparation for changing the type to u64.

Sorry for picking on formalities like this. I'm just pedantic in general; and I really like a readable history :-) As usual, feel free to ignore these remarks if you think them silly...

On an unrelated note, it might be good to mention somewhere that the macos implementation probably suffers from a similar leak? Though maybe that would be better placed in the issue tracker...

for fd in hangups.iter() {
self.fdids.remove(fd);
unsafe {
libc::close(*fd);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I do understand that we need to delay purging the entries from pollfds, I can't think of any reason for delaying the FD closing?...

Also, I just realised that fdids doesn't really need to be a HashMap: you could just put the IDs in a positional vector parallel to pollfds. Or if you keep it a HashMap, you can drop the fdids entries immediately; and then you won't need hangups any more, as you can then just use fdids to check which entries need to be retained in pollfds. Not sure which is more efficient -- and with the mio patch hopefully getting merged soon, it's probably not worth spending too much thought on it... Just pick whatever seems easier I'd say :-)

@dlrobertson dlrobertson changed the title [WIP] Use Uuid instead of i64 for handler keys Use autoincrementing number for handler ids Nov 7, 2016
Use an auto incrementing u64 value as the OsIpcReceiver's id within the
OsIpcReceiverSet in the inprocess and unix modules.

  - Change the type returned by OsIpcReceiverSet select and add to u64,
    to delay overflow.
  - Factor out the implementation of an auto-incrementing id
    inprocess into os::Incrementor, to be used by other platforms.
  - Use the Incrementor in the unix module to decouple the id of the
    OsIpcReceiver from the file descriptor to avoid race conditions due
    to file descriptor reuse.
@dlrobertson
Copy link
Contributor Author

Thanks again, great critiques! I clearly need to work on better commit messages 😄.

}
// Avoid the use of `self` in closue to avoid errors due to the
// mutable borrow of `self` at line 483
let ids = &self.fdids;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The borrow checker has beaten me... I don't really like this, but this was the best workaround I could think of. If anyone can think of a better workaround, please let me know!

Either way... It's only temporary until the mio PR lands.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if instead of doing an inline closure in line 483, you make a let-binding of a closure and pass it in as an argument to retain?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea! but sadly no :(

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I understand the situation; I don't understand though why you consider it a problem?...

It's not exactly uncommon to create temporary bindings to get around borrow conflicts. I'm not sure this even deserves a comment: it doesn't seem like an obscure situation that requires much of an explanation?...

(BTW, including line numbers in a comment is a bad idea: these are bound to change :-) )

On the other hand, I for my part would be tempted to add a comment explaining that the entries that got removed from fdids, because the channels got closed, also need to be purged from pollfds. (And perhaps even mention that we can only do that after we finished iterating pollfds?...) These are the kind of things that are probably not immediately obvious even to an experienced Rust programmer looking at this code for the first time.

(I am aware that the existing ipc-channel code doesn't have much in the manner of such explanations... Which I personally consider a pity.)

On a more nit-picky note, I'd drop the blank line after the temporary binding -- these two lines very much belong together. Also, I think it would actually be preferable to name the temporary as fdids as well...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(BTW, including line numbers in a comment is a bad idea: these are bound to change :-) )

Ha true! good call... Removed the line numbers

I also added a comment above the retain detailing the reason for the placement.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, just for explanation (in case it isn't clear): the issue here isn't the borrow checker being dumb, but rather it's about the way how closures are handled. They automatically close over any binding used within the closure's code -- in this case the binding is self. The compiler doesn't try to figure out that you are actually accessing only one sub-field. If you want only the sub-field to be closed over, you have to create an explicit binding for it... Which is precisely what you did :-)

For unix OSes make sure to close the file descriptor on ChannelClosed.
The file descriptors should be closed in select to avoid leaking fds
after removal from pollfds (servo#96). After e06edbc this is safe and
avoids the previously seen race condition due to file descriptor
reuse.
@antrik
Copy link
Contributor

antrik commented Nov 10, 2016

Looks good to me :-) Now we just need someone to approve it...

@emilio
Copy link
Member

emilio commented Nov 10, 2016

@bors-servo r=antrik

@bors-servo
Copy link
Contributor

📌 Commit 2e1081c has been approved by antrik

@bors-servo
Copy link
Contributor

⌛ Testing commit 2e1081c with merge 357abb9...

bors-servo pushed a commit that referenced this pull request Nov 10, 2016
Use autoincrementing number for handler ids

Use `Uuid` instead of `i64` for the handler keys to ensure file descriptor number reuse doesn't cause errors, and make sure to close file descriptors.

**NB:** The scope of the PR has changed to change the handler id's from a platform specific `i64` to an autoincrementing `u64`. `macos` has not yet been ported, and probably suffers from the same leak as the `unix` module did prior to this PR
@bors-servo
Copy link
Contributor

☀️ Test successful - status-appveyor, status-travis

@bors-servo bors-servo merged commit 2e1081c into servo:master Nov 10, 2016
@dlrobertson
Copy link
Contributor Author

Cool beans! I'll move back onto #94

@dlrobertson dlrobertson deleted the use-uuid branch November 11, 2016 01:22
@emilio
Copy link
Member

emilio commented Nov 11, 2016

Thanks for doing this work, it's awesome :)

On Thu, Nov 10, 2016 at 05:22:13PM -0800, Dan Robertson wrote:

Cool beans! I'll move back onto #94

You are receiving this because you commented.
Reply to this email directly or view it on GitHub:
#105 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants