Add a system sync() to start_worker() for aggressive SGE buffering #12439

Closed
wants to merge 1 commit into from

Conversation

davidavdav
Contributor

As reported here, on a very large SGE cluster, the I/O buffering for stdout/stderr can be so aggressive that the (port, worker-ip) pair is never flushed to file, the very file that the master uses to find out which worker machines to connect to. As a result, the workers time out and the connection cannot be established. Effectively, addprocs_sge() doesn't work on such a cluster.

The only solution I could get working is to include a sync() in the workers, after the flush(), to ensure that stdout, which SGE redirects to a file, is actually flushed.

@@ -980,6 +980,7 @@ function start_worker(out::IO)
     print(out, LPROC.bind_addr)
     print(out, '\n')
     flush(out)
+    ccall((:sync, "libc"), Void, ()) # necessary for SGE, which redirects and buffers stdout
Sponsor Member

I wonder if the flush function shouldn't do this automatically. In my experience you either don't care about buffering, in which case you wouldn't call flush anyway, or you do and you want to flush everything as thoroughly as possible.

Contributor Author

The whole point is that flush() doesn't do this in this case.

In the case of addprocs_sge(), out == STDOUT. The worker flushes STDOUT, but apparently SGE redirects stdout to a file locally at the worker, and somehow adds its own buffering again in that process. This makes the flush() useless, but the sync() seems to circumvent the problem, because STDOUT has turned into a file on the filesystem.

Sponsor Member

My point was that perhaps flush should do this.

@davidavdav
Contributor Author

Looking at the AppVeyor build, I suppose libc and/or sync do not exist on Windows. So the sync call should be wrapped in a proper platform guard.
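
A guard of that kind could look like the minimal sketch below. It is written in current Julia syntax (`Cvoid`, `Sys.isunix()`), not the 0.3-era syntax of the patch, where the equivalents would be `Void` and the `@unix_only` macro. The helper name `sync_if_unix` is hypothetical and not part of Base.

```julia
# Hypothetical helper (not part of Base): call sync(2) only on Unix-like systems.
function sync_if_unix()
    @static if Sys.isunix()
        # sync() takes no arguments and returns nothing; it asks the kernel
        # to write all dirty filesystem buffers to disk, system wide.
        ccall(:sync, Cvoid, ())
        return true
    else
        # Assumption: on Windows we simply skip the call.
        return false
    end
end
```

The return value only reports whether the call was attempted, so a caller such as start_worker() could use the helper unconditionally after flush(out).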

@tkelman
Contributor

tkelman commented Aug 3, 2015

Is this a problem on master? Would much rather have the PR submitted against master first, and it could be considered for backporting at least several days after being merged.

@davidavdav
Contributor Author

This would also apply to 0.4-dev. However, I can't compile 0.4-dev on this large cluster, for all kinds of reasons. The latest reason is a "could not allocate pools" error (I do not have virtual address space limits set, afaik). So I won't be able to debug.

@tkelman
Contributor

tkelman commented Aug 3, 2015

That's #10390, there should be a line of code that you can change to hopefully fix that one.

@amitmurthy
Contributor

By googling a bit on flush vs sync, I have the following reservations.

  • sync flushes all buffered filesystem changes to disk - system wide
  • sync(fd) flushes all buffered changes to the filesystem in which fd resides. Not applicable here: although the fd is STDOUT, it is the external redirection that performs the write to disk.
  • Not portable (as mentioned above)
  • SGE specific stuff should really be in the SGE ClusterManager in ClusterManagers.jl
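
For comparison with the system-wide sync() in the patch, the narrower per-descriptor call can be sketched as follows. POSIX `fsync(fd)` flushes a single open file, and Linux additionally offers `syncfs(fd)`, which flushes the whole filesystem containing `fd`. The sketch uses current Julia syntax, the helper name `fsync_stdout` is hypothetical, and descriptor 1 is assumed to be the worker's stdout. Whether this would help under SGE was not tested in this thread; per the reservation above, it would not if the write to disk happens in a separate redirecting process.

```julia
# Hypothetical helper: flush Julia's own stdout buffer, then ask the kernel
# to commit file descriptor 1 (assumed to be stdout) to stable storage.
function fsync_stdout()
    flush(stdout)
    @static if Sys.isunix()
        # fsync returns 0 on success and -1 on failure, for example when
        # the descriptor is a pipe or socket that cannot be synchronized.
        return ccall(:fsync, Cint, (Cint,), 1)
    else
        # Assumption: no equivalent is attempted on Windows in this sketch.
        return Cint(-1)
    end
end
```

A return value of -1 is not necessarily an error worth aborting on; it can simply mean that stdout is not a regular file in the current environment.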

@davidavdav
Contributor Author

I agree on all points, but the SGE manager needs to be rewritten, probably as you suggested in https://groups.google.com/forum/#!topic/julia-dev/Ms9pTNGoIvA . I am not convinced I'm the one to do that, as I don't really understand the intricacies of how Julia and the cluster manager interact. The general solution http://docs.julialang.org/en/latest/manual/parallel-computing/#clustermanagers obviously doesn't work for me here, because of the buffering.

@davidavdav davidavdav closed this Aug 4, 2015
4 participants