Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IO: Polish IO closing for MT #8733

Merged
merged 4 commits into from
Feb 11, 2020
Merged

Conversation

bcardiff
Copy link
Member

@bcardiff bcardiff commented Feb 3, 2020

For now IO#close is not thread-safe. You should not close the IO from concurrently.

Even though if that's the case, before this PR the LibEvent events could be freed twice and the awaiting readers/writers fibers could be enqueued again. This is now fixed by the changes in io/evented.cr.

I also align the exception raised when the awaiting reader is awakened with a EBADF / closed fd.

The IO::Error added is to match the description when a writer is awakened with a EBADF. These does not match 100% the error if the fd was closed from the beginning. This bugs me a little.

Also the socket.cr does not have this special handling of Errno in unbuffered_read/unbuffered_write.

The exact exception type and message might come back with Errno exceptions are revisited.

After this PR if IO#close is called concurrently from multiple threads one might fail with Errno.new("Error closing file") in Crystal::System::FileDescriptor#file_descriptor_close since the fd might be already closed. Without adding a lock for IO#close it is not possible to prevent this situation. The implementation of evented_close it is thread-safe now and that protects the runtime enough.

Copy link
Contributor

@RX14 RX14 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

each_and_clear seems an ugly primitive - can't there be some kind of sync method on thread_local and support recursive locking?

@RX14
Copy link
Contributor

RX14 commented Feb 7, 2020

I'd at least like a better name...

@bcardiff
Copy link
Member Author

bcardiff commented Feb 7, 2020

The lock in thread local value did not need to be exposed until now at least.

I agree the name is not the best, but it's private api and is only used in the lines that were changed in this PR. I see it more like a helper for the implementation.

Another name could be consume_each or something like that. But altough each_and_clear seems ugly, it's pretty clear what it does.

@RX14
Copy link
Contributor

RX14 commented Feb 8, 2020

I'd prefer consume_each actually, it seems more obvious to me what it does. But only a style issue, I'll ack this if you don't wish to change the name.

@bcardiff
Copy link
Member Author

Rebased on master and method renamed. Approval pending.

@bcardiff bcardiff merged commit a2bba01 into crystal-lang:master Feb 11, 2020
@bcardiff bcardiff deleted the mt/io-close branch February 12, 2020 12:18
@@ -6,7 +6,11 @@ module Crystal::System::FileDescriptor
private def unbuffered_read(slice : Bytes)
bytes_read = LibC._read(@fd, slice, slice.size)
if bytes_read == -1
raise Errno.new("Error reading file")
if Errno.value == Errno::EBADF
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File descriptors can be reused - if this if branch can be hit, then doesn't that imply that a write may happen to the wrong file descriptor because of a race condition?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how probably that reutilization.
The same check was added to #unbuffered_write in #6497.
Also with waj we mention that these checks might need to move to evented directly.

Copy link
Contributor

@RX14 RX14 Feb 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how probably that reutilization

Well they're sequential, so very probably.

If you open, close, open again, the second open will have exactly the same FD as the first. There must be no point after LibC.close is called that the FD is used again in another syscall (for the same IO::FileDescriptor, there's not much we can do if there exists two objects).

Copy link
Contributor

@rdp rdp Feb 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah looks like a possible race condition, tricky one... Appears that java avoids this by setting the internal "fd" to -1 before doing a close: https://github.com/openjdk-mirror/jdk7u-jdk/blob/f4d80957e89a19a29bb9f9807d2a28351ed7f7df/src/solaris/native/java/io/io_util_md.c#L93 they also synchronize the calls to close FWIW: https://github.com/openjdk-mirror/jdk7u-jdk/blob/master/src/share/classes/java/io/FileInputStream.java#L289

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants