Add a system sync() to start_worker() for aggressive SGE buffering #12439

davidavdav · 2015-08-03T11:25:30Z

As reported here, on a very large SGE cluster the I/O buffering for stdout/stderr can be so aggressive that the (port, worker-ip) pair is never flushed to file, the very file that the master reads to find out which worker machines to connect to. As a result, the workers time out and the connection cannot be established. Effectively, addprocs_sge() doesn't work on such a cluster.

The only solution I could get working is to include a sync() in the workers, after the flush(), to ensure that stdout (which SGE redirects to a file) is actually written out.
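To make the failure mode concrete, here is a minimal sketch, in current Julia syntax, of a master-side loop that polls a worker's redirected output file for its address line. The helper name `wait_for_worker_line`, the timeout values, and the `julia_worker:PORT#HOST` line shape are assumptions made for illustration; this is not the code addprocs_sge() uses.

```julia
# Hypothetical helper: poll `path` until a line shaped like
# "julia_worker:PORT#HOST" appears, or give up after `timeout` seconds.
function wait_for_worker_line(path::AbstractString; timeout::Real=5.0, interval::Real=0.1)
    deadline = time() + timeout
    while time() < deadline
        if isfile(path)
            for line in eachline(path)
                m = match(r"^julia_worker:(\d+)#(.+)$", line)
                # Return (host, port) as soon as a matching line is visible.
                m === nothing || return (String(m[2]), parse(Int, m[1]))
            end
        end
        sleep(interval)
    end
    return nothing  # timed out: the line never became visible in the file
end
```

If the worker's output stays in a buffer somewhere between the worker and the file, a loop like this only ever sees an empty or missing file and returns `nothing`, which matches the timeout described above.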

StefanKarpinski · 2015-08-03T14:04:20Z

base/multi.jl

@@ -980,6 +980,7 @@ function start_worker(out::IO)
    print(out, LPROC.bind_addr)
    print(out, '\n')
    flush(out)
+    ccall((:sync, "libc"), Void, ())   # necessary for SGE, which redirects and buffers stdout


I wonder if the flush function shouldn't do this automatically. In my experience you either don't care about buffering, in which case you wouldn't call flush anyway, or you do and you want to flush everything as thoroughly as possible.

davidavdav · 2015-08-03

The whole point is that flush() doesn't do this in this case.

In the case of addprocs_sge(), out == STDOUT. The worker flushes STDOUT, but apparently SGE redirects stdout to a file locally at the worker and somehow adds its own buffering in that process. This makes the flush() useless, but the sync() seems to circumvent the problem, because STDOUT has turned into a file on the filesystem.

StefanKarpinski · 2015-08-03

My point was that perhaps flush should do this.

davidavdav · 2015-08-03T14:12:37Z

Looking at the AppVeyor build, I suppose libc and/or sync do not exist on Windows. So the sync should be surrounded by the proper ifdefs.
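One way such a guard might look, as a minimal sketch in current Julia syntax (the Julia version this PR targets spelled the platform check and the `Void` return type differently). The function name `sync_if_unix` is hypothetical and not part of Base.

```julia
# Hypothetical wrapper: call libc's sync() only on Unix-like systems.
function sync_if_unix()
    if Sys.isunix()
        # sync() takes no arguments and returns nothing; it asks the kernel
        # to write out all buffered filesystem changes, system-wide.
        ccall(:sync, Cvoid, ())
        return true
    end
    return false  # no-op elsewhere, e.g. on Windows
end
```

Returning a Bool lets the caller see whether the sync was actually attempted on the current platform.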

tkelman · 2015-08-03T14:30:23Z

Is this a problem on master? Would much rather have the PR submitted against master first, and it could be considered for backporting at least several days after being merged.

davidavdav · 2015-08-03T16:18:51Z

This would also apply to 0.4-dev. However, I can't compile 0.4-dev on this large cluster, for all kinds of reasons. The latest reason is a "could not allocate pools" error (I do not have virtual address space limits set, afaik). So I won't be able to debug.

tkelman · 2015-08-03T16:25:05Z

That's #10390, there should be a line of code that you can change to hopefully fix that one.

amitmurthy · 2015-08-04T05:55:57Z

By googling a bit on flush vs sync, I have the following reservations:

- sync flushes all buffered filesystem changes to disk, system-wide.
- sync(fd) flushes all buffered changes to the filesystem in which fd resides. Not applicable here: although the fd is STDOUT, it is the external redirection that causes the write to disk.
- It is not portable (as mentioned above).
- SGE-specific stuff should really be in the SGE ClusterManager in ClusterManagers.jl.
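For contrast with the system-wide sync(), here is a minimal sketch of a per-file alternative in current Julia syntax: flush Julia's own buffer, then call POSIX fsync() on that one descriptor. The helper name `flush_and_fsync` is hypothetical, and it assumes an IOStream the process opened itself. As the reservations above point out, it may well not help in the SGE case, where the buffering happens in the external redirection rather than in the worker.

```julia
# Hypothetical helper: push one file's data out without a system-wide sync().
function flush_and_fsync(io::IOStream)
    flush(io)  # Julia's user-space buffer -> kernel
    if Sys.isunix()
        # fsync() asks the kernel to commit this descriptor's data to storage.
        rc = ccall(:fsync, Cint, (Cint,), fd(io))
        rc == 0 || error("fsync failed")
    end
    return nothing
end
```

Usage would be along the lines of `io = open("out.txt", "w"); write(io, "x"); flush_and_fsync(io)`. The cost is limited to the one file, whereas sync() touches every filesystem on the machine.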

davidavdav · 2015-08-04T06:16:18Z

I agree on all points, but the sgemanager needs to be rewritten, probably as you suggested in https://groups.google.com/forum/#!topic/julia-dev/Ms9pTNGoIvA . I am not convinced I'm the one to do that, as I don't really understand the intricacies between julia and clustermanager. The general solution http://docs.julialang.org/en/latest/manual/parallel-computing/#clustermanagers obviously doesn't work for me here, because of the buffering.

Add a system sync() to start_worker() for aggressive SGE buffering

5b8c794

StefanKarpinski reviewed Aug 3, 2015

davidavdav closed this Aug 4, 2015
