Traceback from Hyak:
Traceback (most recent call last):
  File "/gscratch/stf/danylo/lcss/lib/main.py", line 25, in <module>
    main()
  File "/gscratch/stf/danylo/lcss/lib/main.py", line 20, in main
    scheduler.main()
  File "/gscratch/stf/danylo/lcss/lib/scheduler.py", line 528, in main
    scheduler.spin()
  File "/gscratch/stf/danylo/lcss/lib/scheduler.py", line 412, in spin
    status = self.status_msg[i].receive('newest')
  File "/gscratch/stf/danylo/lcss/lib/tools.py", line 108, in receive
    msg_available,data = self.req.test()
  File "mpi4py/MPI/Request.pyx", line 243, in mpi4py.MPI.Request.test
  File "mpi4py/MPI/msgpickle.pxi", line 434, in mpi4py.MPI.PyMPI_test
  File "mpi4py/MPI/msgpickle.pxi", line 404, in mpi4py.MPI.PyMPI_load
  File "mpi4py/MPI/msgpickle.pxi", line 111, in mpi4py.MPI.Pickle.load
  File "mpi4py/MPI/msgpickle.pxi", line 101, in mpi4py.MPI.Pickle.cloads
_pickle.UnpicklingError: invalid load key, '\x00'.
n2149.hyak.local.28388 Exhausted 1048576 MQ irecv request descriptors, which usually indicates a user program error or insufficient request descriptors (PSM2_MQ_RECVREQS_MAX=1048576)
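For context: the UnpicklingError above is consistent with the receive path being handed a corrupted or truncated buffer once PSM2 runs out of request descriptors. A minimal sketch of the kind of pattern that could exhaust the limit, assuming a hypothetical worker loop (make_status, the tag, and the rank layout are illustrative, not the actual lcss code); the hypothesis in the comment below is that each unread message ties up a PSM2 MQ request descriptor on the scheduler's node, so past PSM2_MQ_RECVREQS_MAX the job aborts:

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD

def make_status():
    # Hypothetical payload; in lcss this would be the worker's status message.
    return {"rank": comm.Get_rank(), "state": "running"}

# Anti-pattern: a worker posts isend after isend without ever completing
# them or giving the scheduler a chance to drain the queue, so unread
# messages pile up on the receiving node until the descriptor pool
# (PSM2_MQ_RECVREQS_MAX, 1048576 here) is exhausted.
if comm.Get_rank() != 0:
    for _ in range(2_000_000):  # more than PSM2_MQ_RECVREQS_MAX
        comm.isend(make_status(), dest=0, tag=1)  # never waited on, never throttled
```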
Possibly fixed this issue by adding a MAX_ASYNC_SEND limit and implementing the corresponding logic in tools.py:MPICommunicator:send(): once a worker has accumulated MAX_ASYNC_SEND unread messages, send() blocks until the scheduler has read them. This ensures that at most MAX_ASYNC_SEND * (MPI.COMM_WORLD.Get_size() - 1) messages are ever in the MPI message pipe, which should keep the queue small enough to prevent the overflow of messages that (I think) is causing the above issue; a sketch of the idea follows below.
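A minimal sketch of that throttling idea, assuming a hypothetical MPICommunicator wrapper (the class body, the _pending list, and the tag handling below are illustrative; only the MAX_ASYNC_SEND scheme comes from the comment above):

```python
from mpi4py import MPI

MAX_ASYNC_SEND = 64  # illustrative cap on outstanding sends per worker

class MPICommunicator:
    """Hypothetical wrapper: throttles nonblocking sends so that at most
    MAX_ASYNC_SEND requests per worker are in flight at any time."""

    def __init__(self, comm=MPI.COMM_WORLD, dest=0, tag=0):
        self.comm = comm
        self.dest = dest
        self.tag = tag
        self._pending = []  # isend requests not yet known to be complete

    def send(self, msg):
        # Drop requests that have already completed.
        self._pending = [r for r in self._pending if not r.Test()]
        # Buffer full: block until the scheduler has read at least one of
        # this worker's messages before posting another isend.
        while len(self._pending) >= MAX_ASYNC_SEND:
            self._pending.pop(0).Wait()
        self._pending.append(self.comm.isend(msg, dest=self.dest, tag=self.tag))
```

Since each of the size - 1 workers blocks once it has MAX_ASYNC_SEND outstanding sends, the scheduler never faces more than MAX_ASYNC_SEND * (MPI.COMM_WORLD.Get_size() - 1) unread messages, which should stay far below PSM2_MQ_RECVREQS_MAX.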