IO#fsync doesn't trigger an fsync(2) syscall? #4073

Closed
marktriggs opened this Issue Aug 14, 2016 · 8 comments

Comments

Projects
None yet
2 participants
@marktriggs
Contributor

marktriggs commented Aug 14, 2016

Hi there,

When I make a call to IO#fsync on an open file handle, I expected to
see an fsync(2) syscall triggered (as MRI Ruby does), but in the
current JRuby master branch I don't see that.

I'm testing on Linux thweeble 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt20-1+deb8u4 (2016-02-29) x86_64 GNU/Linux as follows:

 # MRI Ruby
 #
 $ ruby --version
 ruby 2.1.5p273 (2014-11-13) [x86_64-linux-gnu]

 $ strace -qq -e fsync,fdatasync -f ruby -e 'File.open("/tmp/testfile", "w") do |fh| fh.write("test\n"); fh.fsync; end'
 [pid 13600] fsync(7)                    = 0

 # JRuby built from commit 0ebd63589cf430a171f4224da4226e8c86e52203
 #
 $ strace -qq -e fsync,fdatasync -f java -cp lib/jruby.jar org.jruby.Main -e 'File.open("/tmp/testfile", "w") do |fh| fh.write("test\n"); fh.fsync; end'
 [pid 13702] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0} ---
 [pid 13702] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0xc} ---
 [pid 13722] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x7ff5988cf700} ---
 [pid 13708] <... futex resumed> )       = ? <unavailable>

Does this seem like a potential bug? I couldn't find any other issues
referencing this.

Thanks!

Mark

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Aug 24, 2016

Member

Well let's see what our fsync does. It is most likely just calling JDK classes, which may sync in a different way...

Member

headius commented Aug 24, 2016

Well let's see what our fsync does. It is most likely just calling JDK classes, which may sync in a different way...

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Aug 24, 2016

Member

So on master, it appears that fsync is implemented to only do the flush to OS, not the subsequent OS sync. That could indeed be a bug. It appears part of the MRI code I ported did not make it over.

Interestingly, though, I see that 1.7 also does not make any call which would eventually sync at the OS level. So if this is a bug, it's an undiscovered bug for quite some time.

Member

headius commented Aug 24, 2016

So on master, it appears that fsync is implemented to only do the flush to OS, not the subsequent OS sync. That could indeed be a bug. It appears part of the MRI code I ported did not make it over.

Interestingly, though, I see that 1.7 also does not make any call which would eventually sync at the OS level. So if this is a bug, it's an undiscovered bug for quite some time.

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Aug 24, 2016

Member

Ok, I'm rigging a patch that will actually fsync anything that has a FileChannel and anything in a NativeSelectableChannel from jnr-enxio. I don't know how to do the same for e.g. JDK SocketChannel, pipes, and other non-file streams.

Member

headius commented Aug 24, 2016

Ok, I'm rigging a patch that will actually fsync anything that has a FileChannel and anything in a NativeSelectableChannel from jnr-enxio. I don't know how to do the same for e.g. JDK SocketChannel, pipes, and other non-file streams.

@marktriggs

This comment has been minimized.

Show comment
Hide comment
@marktriggs

marktriggs Aug 24, 2016

Contributor

Sounds good to me, thanks @headius :)

Contributor

marktriggs commented Aug 24, 2016

Sounds good to me, thanks @headius :)

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Aug 24, 2016

Member

Hmm...my patch seems like it should be doing the right thing, but I'm not sure if it's working. Can you test it out?

diff --git a/core/src/main/java/org/jruby/RubyIO.java b/core/src/main/java/org/jruby/RubyIO.java
index 1f54e7e..aaa2bb9 100644
--- a/core/src/main/java/org/jruby/RubyIO.java
+++ b/core/src/main/java/org/jruby/RubyIO.java
@@ -1754,10 +1754,18 @@ public class RubyIO extends RubyObject implements IOEncodable {
         if (fptr.io_fflush(context) < 0)
             throw runtime.newSystemCallError("");

-//        # ifndef _WIN32  /* already called in io_fflush() */
-//        if ((int)rb_thread_io_blocking_region(nogvl_fsync, fptr, fptr->fd) < 0)
-//            rb_sys_fail_path(fptr->pathv);
-//        # endif
+        if (!Platform.IS_WINDOWS) { /* already called in io_fflush() */
+            try {
+                if (fptr.fileChannel() != null) fptr.fileChannel().force(true);
+                if (fptr.fd().chNative != null) {
+                    int ret = runtime.getPosix().fsync(fptr.fd().chNative.getFD());
+                    if (ret < 0) throw runtime.newErrnoFromInt(runtime.getPosix().errno());
+                }
+            } catch (IOException ioe) {
+                throw runtime.newIOErrorFromException(ioe);
+            }
+        }
+        
         return RubyFixnum.zero(runtime);
     }
Member

headius commented Aug 24, 2016

Hmm...my patch seems like it should be doing the right thing, but I'm not sure if it's working. Can you test it out?

diff --git a/core/src/main/java/org/jruby/RubyIO.java b/core/src/main/java/org/jruby/RubyIO.java
index 1f54e7e..aaa2bb9 100644
--- a/core/src/main/java/org/jruby/RubyIO.java
+++ b/core/src/main/java/org/jruby/RubyIO.java
@@ -1754,10 +1754,18 @@ public class RubyIO extends RubyObject implements IOEncodable {
         if (fptr.io_fflush(context) < 0)
             throw runtime.newSystemCallError("");

-//        # ifndef _WIN32  /* already called in io_fflush() */
-//        if ((int)rb_thread_io_blocking_region(nogvl_fsync, fptr, fptr->fd) < 0)
-//            rb_sys_fail_path(fptr->pathv);
-//        # endif
+        if (!Platform.IS_WINDOWS) { /* already called in io_fflush() */
+            try {
+                if (fptr.fileChannel() != null) fptr.fileChannel().force(true);
+                if (fptr.fd().chNative != null) {
+                    int ret = runtime.getPosix().fsync(fptr.fd().chNative.getFD());
+                    if (ret < 0) throw runtime.newErrnoFromInt(runtime.getPosix().errno());
+                }
+            } catch (IOException ioe) {
+                throw runtime.newIOErrorFromException(ioe);
+            }
+        }
+        
         return RubyFixnum.zero(runtime);
     }
@marktriggs

This comment has been minimized.

Show comment
Hide comment
@marktriggs

marktriggs Aug 24, 2016

Contributor

Yep! I'm seeing the call now:

$ strace -qq -e fsync,fdatasync -f java -cp lib/jruby.jar org.jruby.Main -e 'File.open("/tmp/testfile", "w") do |fh| fh.write("test\n"); fh.fsync; end'
[pid 28139] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0} ---
[pid 28139] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0xc} ---
[pid 28149] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x7fdb4d36c480} ---
[pid 28150] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x7fdb4d36c680} ---
[pid 28139] fsync(14)                   = 0

Unfortunately I don't have a full app to use to check it more thoroughly, but at least it matches MRI Ruby now :)

Thanks again,
Mark

Contributor

marktriggs commented Aug 24, 2016

Yep! I'm seeing the call now:

$ strace -qq -e fsync,fdatasync -f java -cp lib/jruby.jar org.jruby.Main -e 'File.open("/tmp/testfile", "w") do |fh| fh.write("test\n"); fh.fsync; end'
[pid 28139] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0} ---
[pid 28139] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0xc} ---
[pid 28149] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x7fdb4d36c480} ---
[pid 28150] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x7fdb4d36c680} ---
[pid 28139] fsync(14)                   = 0

Unfortunately I don't have a full app to use to check it more thoroughly, but at least it matches MRI Ruby now :)

Thanks again,
Mark

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Aug 24, 2016

Member

Excellent, I'll pull the change in. Thanks!

Member

headius commented Aug 24, 2016

Excellent, I'll pull the change in. Thanks!

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Aug 24, 2016

Member

Fixed for 9.1.3.0 and 1.7.26.

Member

headius commented Aug 24, 2016

Fixed for 9.1.3.0 and 1.7.26.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment