IO: Polish IO closing for MT #8733

bcardiff · 2020-02-03T20:49:19Z

For now IO#close is not thread-safe. You should not close an IO concurrently from multiple threads.

Even so, before this PR a concurrent close could free the LibEvent events twice and enqueue the awaiting reader/writer fibers again. This is now fixed by the changes in io/evented.cr.

I also aligned the exception raised when an awaiting reader is awakened with an EBADF / closed fd.

The added IO::Error matches the description used when a writer is awakened with an EBADF. These do not match 100% the error raised if the fd was closed from the beginning. This bugs me a little.

Also, socket.cr does not have this special handling of Errno in unbuffered_read/unbuffered_write.

The exact exception type and message might come up again when Errno exceptions are revisited.

After this PR, if IO#close is called concurrently from multiple threads, one call might fail with Errno.new("Error closing file") in Crystal::System::FileDescriptor#file_descriptor_close, since the fd might already be closed. Without adding a lock to IO#close it is not possible to prevent this situation. The implementation of evented_close is thread-safe now, and that protects the runtime enough.
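The failure mode described above can be reproduced at the OS level. What follows is a minimal sketch in Python, used here only as a compact illustration; it is not the Crystal code from this PR, and `close_twice` is a hypothetical helper. It assumes a single-threaded process, so nothing can reuse the descriptor between the two calls. The second close stands in for the losing thread in a concurrent IO#close.

```python
import errno
import os


def close_twice(path=os.devnull):
    """Close one fd twice; return the errno of the second close, or None."""
    fd = os.open(path, os.O_RDONLY)
    os.close(fd)  # the first closer wins
    try:
        os.close(fd)  # the second closer finds the fd already gone
    except OSError as exc:
        return exc.errno
    return None


# The second close is expected to report EBADF ("bad file descriptor").
second_close_errno = close_twice()
```

In a multi-threaded program the second close is more dangerous than a plain error: if another thread has opened a file in between, the call can succeed and close that unrelated descriptor.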

RX14

each_and_clear seems an ugly primitive - can't there be some kind of sync method on thread_local and support recursive locking?

RX14 · 2020-02-07T16:52:57Z

I'd at least like a better name...

bcardiff · 2020-02-07T18:11:25Z

The lock in thread local value did not need to be exposed until now at least.

I agree the name is not the best, but it's private api and is only used in the lines that were changed in this PR. I see it more like a helper for the implementation.

Another name could be consume_each or something like that. But although each_and_clear seems ugly, it's pretty clear what it does.

RX14 · 2020-02-08T11:12:08Z

I'd prefer consume_each actually, it seems more obvious to me what it does. But only a style issue, I'll ack this if you don't wish to change the name.

bcardiff · 2020-02-10T18:22:31Z

Rebased on master and method renamed. Approval pending.

RX14 · 2020-02-13T16:52:20Z

src/crystal/system/win32/file_descriptor.cr

@@ -6,7 +6,11 @@ module Crystal::System::FileDescriptor
  private def unbuffered_read(slice : Bytes)
    bytes_read = LibC._read(@fd, slice, slice.size)
    if bytes_read == -1
-      raise Errno.new("Error reading file")
+      if Errno.value == Errno::EBADF


File descriptors can be reused - if this if branch can be hit, then doesn't that imply that a write may happen to the wrong file descriptor because of a race condition?

bcardiff · 2020-02-14

I'm not sure how probable that reuse is.
The same check was added to #unbuffered_write in #6497.
Also, waj and I discussed that these checks might need to move to evented directly.

RX14 · 2020-02-14

"I'm not sure how probable that reuse is"

Well, they're sequential, so very probable.

If you open, close, and open again, the second open will get exactly the same FD as the first. After LibC.close is called, there must be no point at which the FD is used again in another syscall (for the same IO::FileDescriptor; there's not much we can do if two objects exist).
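The reuse described above is easy to observe. What follows is a minimal sketch in Python rather than Crystal, with hypothetical names (`stale_fd_demo` and the two file names). It assumes a POSIX-like system, where open returns the lowest-numbered free descriptor, and a process with no other thread opening files. The write through the stale number is guarded, so the sketch does not fail on a platform that hands out a different number.

```python
import os
import tempfile


def stale_fd_demo():
    """Return (was the fd number reused?, bytes that landed in the second file)."""
    with tempfile.TemporaryDirectory() as tmp:
        first_path = os.path.join(tmp, "first.txt")
        second_path = os.path.join(tmp, "second.txt")

        stale = os.open(first_path, os.O_WRONLY | os.O_CREAT, 0o600)
        os.close(stale)  # from here on, `stale` is only a number

        fresh = os.open(second_path, os.O_WRONLY | os.O_CREAT, 0o600)
        reused = fresh == stale
        if reused:
            # A late write through the old number goes to second.txt.
            os.write(stale, b"meant for first.txt")
        os.close(fresh)

        with open(second_path, "rb") as f:
            return reused, f.read()


reused, landed_in_second = stale_fd_demo()
```

When the number is reused, the bytes intended for the first file end up in the second one without any error. That is the race being asked about in the review comment.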

rdp · 2020-02-15

Yeah, looks like a possible race condition, tricky one... It appears that Java avoids this by setting the internal "fd" to -1 before doing a close: https://github.com/openjdk-mirror/jdk7u-jdk/blob/f4d80957e89a19a29bb9f9807d2a28351ed7f7df/src/solaris/native/java/io/io_util_md.c#L93. They also synchronize the calls to close, FWIW: https://github.com/openjdk-mirror/jdk7u-jdk/blob/master/src/share/classes/java/io/FileInputStream.java#L289
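The "invalidate first, then close" idea can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the Java code linked above and not Crystal's IO::FileDescriptor. `ClosableFd` and its methods are hypothetical names.

```python
import errno
import os
import threading


class ClosableFd:
    """Hypothetical fd wrapper that forgets its fd before closing it."""

    def __init__(self, fd):
        self._fd = fd
        self._close_lock = threading.Lock()

    def close(self):
        """Return True if this call closed the fd, False if already closed."""
        with self._close_lock:
            # Swap in -1 first, so only one caller ever sees the real fd.
            fd, self._fd = self._fd, -1
        if fd == -1:
            return False
        os.close(fd)
        return True

    def write(self, data):
        fd = self._fd
        if fd == -1:
            # Fail locally instead of passing a stale number to the OS.
            raise OSError(errno.EBADF, "closed stream")
        return os.write(fd, data)
```

This makes a double close harmless and rejects operations that start after the close. It does not cover a writer that read the fd just before the swap and calls into the OS just after the close, so it narrows the race window rather than eliminating it.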

bcardiff added kind:bug topic:stdlib:concurrency topic:stdlib:networking labels Feb 3, 2020

bcardiff requested a review from waj February 5, 2020 16:56

bcardiff added this to the 0.33.0 milestone Feb 5, 2020

bcardiff requested a review from RX14 February 5, 2020 19:50

RX14 reviewed Feb 7, 2020

rdp mentioned this pull request Feb 10, 2020

Crystal::ThreadLocalValue(Deque(Fiber)), Thread... Server hangs, crash, Deadlock? #8714

Closed

bcardiff added 4 commits February 10, 2020 15:13

IO: Ensure thread local values are consumed by the same thread on close

a46ae09

ThreadLocalValue#each and ThreadLocalValue#clear were requesting a lock separately. If two threads were closing the IO the event could be freed twice.
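The difference between locking each and clear separately and doing both under one lock can be sketched as below. This is an illustrative Python sketch, not Crystal's ThreadLocalValue. `PerThreadValues`, `put`, and `close_from_many_threads` are hypothetical names, and only `consume_each` borrows its name from this PR.

```python
import threading


class PerThreadValues:
    """Hypothetical keyed store standing in for a thread-local value table."""

    def __init__(self):
        self._lock = threading.Lock()
        self._values = {}

    def put(self, key, value):
        with self._lock:
            self._values[key] = value

    def consume_each(self, fn):
        # Take every value and empty the store in ONE critical section,
        # so each value is handed to exactly one caller.
        with self._lock:
            taken = list(self._values.values())
            self._values.clear()
        for value in taken:
            fn(value)


def close_from_many_threads(closers=4):
    """Let several threads 'close' at once; return everything that was freed."""
    store = PerThreadValues()
    for i in range(3):
        store.put(i, f"event-{i}")

    freed = []
    freed_lock = threading.Lock()

    def free(value):
        with freed_lock:
            freed.append(value)

    threads = [threading.Thread(target=store.consume_each, args=(free,))
               for _ in range(closers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return freed
```

With separate each and clear calls, two closers could both iterate the same values before either cleared them, and each value would be freed twice. Here every value is freed once, no matter how many closers run.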

IO: Reorganize closing specs

94bf654

IO: Raise IO::Error when file descriptor is closed while waiting

98214d0

Rename each_and_clear to consume_each

e6ace56

bcardiff force-pushed the mt/io-close branch from 08cfa24 to e6ace56 Compare February 10, 2020 18:15

waj approved these changes Feb 11, 2020

bcardiff merged commit a2bba01 into crystal-lang:master Feb 11, 2020

bcardiff deleted the mt/io-close branch February 12, 2020 12:18

RX14 reviewed Feb 13, 2020

bcardiff mentioned this pull request Mar 2, 2020

IO: Set fd to -1 on closed File/Socket #8873

Merged

RX14 Feb 13, 2020

bcardiff Feb 14, 2020

RX14 Feb 14, 2020 (edited)

rdp Feb 15, 2020 (edited)
