Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IO#fsync doesn't trigger an fsync(2) syscall? #4073

Closed
marktriggs opened this issue Aug 14, 2016 · 8 comments
Closed

IO#fsync doesn't trigger an fsync(2) syscall? #4073

marktriggs opened this issue Aug 14, 2016 · 8 comments

Comments

@marktriggs
Copy link
Contributor

@marktriggs marktriggs commented Aug 14, 2016

Hi there,

When I make a call to IO#fsync on an open file handle, I expected to
see an fsync(2) syscall triggered (as MRI Ruby does), but in the
current JRuby master branch I don't see that.

I'm testing on Linux thweeble 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt20-1+deb8u4 (2016-02-29) x86_64 GNU/Linux as follows:

 # MRI Ruby
 #
 $ ruby --version
 ruby 2.1.5p273 (2014-11-13) [x86_64-linux-gnu]

 $ strace -qq -e fsync,fdatasync -f ruby -e 'File.open("/tmp/testfile", "w") do |fh| fh.write("test\n"); fh.fsync; end'
 [pid 13600] fsync(7)                    = 0

 # JRuby built from commit 0ebd63589cf430a171f4224da4226e8c86e52203
 #
 $ strace -qq -e fsync,fdatasync -f java -cp lib/jruby.jar org.jruby.Main -e 'File.open("/tmp/testfile", "w") do |fh| fh.write("test\n"); fh.fsync; end'
 [pid 13702] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0} ---
 [pid 13702] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0xc} ---
 [pid 13722] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x7ff5988cf700} ---
 [pid 13708] <... futex resumed> )       = ? <unavailable>

Does this seem like a potential bug? I couldn't find any other issues
referencing this.

Thanks!

Mark

@headius
Copy link
Member

@headius headius commented Aug 24, 2016

Well let's see what our fsync does. It is most likely just calling JDK classes, which may sync in a different way...

@headius
Copy link
Member

@headius headius commented Aug 24, 2016

So on master, it appears that fsync is implemented to only do the flush to OS, not the subsequent OS sync. That could indeed be a bug. It appears part of the MRI code I ported did not make it over.

Interestingly, though, I see that 1.7 also does not make any call which would eventually sync at the OS level. So if this is a bug, it's an undiscovered bug for quite some time.

@headius
Copy link
Member

@headius headius commented Aug 24, 2016

Ok, I'm rigging a patch that will actually fsync anything that has a FileChannel and anything in a NativeSelectableChannel from jnr-enxio. I don't know how to do the same for e.g. JDK SocketChannel, pipes, and other non-file streams.

@marktriggs
Copy link
Contributor Author

@marktriggs marktriggs commented Aug 24, 2016

Sounds good to me, thanks @headius :)

@headius
Copy link
Member

@headius headius commented Aug 24, 2016

Hmm...my patch seems like it should be doing the right thing, but I'm not sure if it's working. Can you test it out?

diff --git a/core/src/main/java/org/jruby/RubyIO.java b/core/src/main/java/org/jruby/RubyIO.java
index 1f54e7e..aaa2bb9 100644
--- a/core/src/main/java/org/jruby/RubyIO.java
+++ b/core/src/main/java/org/jruby/RubyIO.java
@@ -1754,10 +1754,18 @@ public class RubyIO extends RubyObject implements IOEncodable {
         if (fptr.io_fflush(context) < 0)
             throw runtime.newSystemCallError("");

-//        # ifndef _WIN32  /* already called in io_fflush() */
-//        if ((int)rb_thread_io_blocking_region(nogvl_fsync, fptr, fptr->fd) < 0)
-//            rb_sys_fail_path(fptr->pathv);
-//        # endif
+        if (!Platform.IS_WINDOWS) { /* already called in io_fflush() */
+            try {
+                if (fptr.fileChannel() != null) fptr.fileChannel().force(true);
+                if (fptr.fd().chNative != null) {
+                    int ret = runtime.getPosix().fsync(fptr.fd().chNative.getFD());
+                    if (ret < 0) throw runtime.newErrnoFromInt(runtime.getPosix().errno());
+                }
+            } catch (IOException ioe) {
+                throw runtime.newIOErrorFromException(ioe);
+            }
+        }
+        
         return RubyFixnum.zero(runtime);
     }
@marktriggs
Copy link
Contributor Author

@marktriggs marktriggs commented Aug 24, 2016

Yep! I'm seeing the call now:

$ strace -qq -e fsync,fdatasync -f java -cp lib/jruby.jar org.jruby.Main -e 'File.open("/tmp/testfile", "w") do |fh| fh.write("test\n"); fh.fsync; end'
[pid 28139] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0} ---
[pid 28139] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0xc} ---
[pid 28149] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x7fdb4d36c480} ---
[pid 28150] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x7fdb4d36c680} ---
[pid 28139] fsync(14)                   = 0

Unfortunately I don't have a full app to use to check it more thoroughly, but at least it matches MRI Ruby now :)

Thanks again,
Mark

@headius
Copy link
Member

@headius headius commented Aug 24, 2016

Excellent, I'll pull the change in. Thanks!

@headius
Copy link
Member

@headius headius commented Aug 24, 2016

Fixed for 9.1.3.0 and 1.7.26.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants
You can’t perform that action at this time.