I'm seeing weird crashes during LEMON writer construction, but so far only with the 16x4 parallelization. It does not occur for every single run...
# Writing gauge field to .conf.tmp.
# Constructing LEMON writer for file .conf.tmp for append = 0
Abort(1) on node 0 (rank 0 in comm 1140850688): Fatal error in PMPI_Bcast: Invalid root, error stack:
PMPI_Bcast(1478): MPI_Bcast(buf=0x1fbfffb064, count=1, MPI_INT, root=-1076188541, comm=0x84000007) failed
PMPI_Bcast(1440): Invalid root (value given was -1076188541)
2012-09-11 15:52:37.265 (WARN ) [0x400011a8b10] :81834:ibm.runjob.client.Job: terminated by signal 6
2012-09-11 15:52:37.265 (WARN ) [0x400011a8b10] :81834:ibm.runjob.client.Job: abnormal termination by signal 6 from rank 0
I haven't investigated what could be causing this yet, but I'm posting the issue in case someone else comes across the problem.
Still a problem?
It hasn't happened since. Since this was at the very beginning of testing on BG/Q, it might have had to do with the incomplete state of the I/O at the time.
I've had a look at the code to see where this could originate, but it seems the MPI_Bcast is indeed happening somewhere in the I/O implementation. The root number is obviously nonsense, so this does look like a configuration issue. I would vote to change this from a bug to something like Won't fix...
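If this ever shows up again, a guard in front of the broadcast might make it easier to track down. The following is only a minimal sketch, not code from LEMON or tmLQCD: `root_is_valid` and `check_bcast_root` are hypothetical helpers, and the communicator size of 64 in the usage note is an illustrative value, not the actual rank count of the failing job. In real code the size would come from `MPI_Comm_size` on the communicator passed to `MPI_Bcast`.

```c
#include <stdio.h>

/* Hypothetical helper (not part of LEMON or tmLQCD): returns 1 if root is a
   valid rank for a communicator with comm_size processes, 0 otherwise. */
static int root_is_valid(int root, int comm_size)
{
  return comm_size > 0 && root >= 0 && root < comm_size;
}

/* Hypothetical wrapper: reports a bad root on stderr before it would reach
   MPI_Bcast. Returns 0 if the root is usable, -1 otherwise. */
static int check_bcast_root(int root, int comm_size, const char *where)
{
  if (!root_is_valid(root, comm_size)) {
    fprintf(stderr, "%s: invalid broadcast root %d (communicator size %d)\n",
            where, root, comm_size);
    return -1;
  }
  return 0;
}
```

Calling `check_bcast_root(-1076188541, 64, "writer construction")` with the root value from the log above returns -1 and names the call site, instead of aborting inside `PMPI_Bcast`.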