Problem discovered in libfabric 1.1.0 on 10/31/2016
For reasons tangential to the problem at hand, I wish to send a message from one unconnected endpoint to another using the address vector in an “insert, send, remove” pattern.
After performing one insert, send, remove iteration, the second fi_av_insert causes the sockets provider to throw a debug message saying that the sockets address given is invalid (even though it was successfully used to send a message on the last iteration). Additionally, the fi_av_insert call yields a fi_addr_t equal to FI_ADDR_NOTAVAIL.
I held two theories for the cause of this behavior:
1. The fi_av_insert call mangles the address we give it, perhaps setting it to NULL or freeing it.
2. The fid_av has latent state, either unknown or errant, which causes the behavior.
I did some digging and discovered that:
If I ensure that I only ever give fresh, dedicated copies of my raw addresses to the fi_av_insert call, the symptoms persist. The first message is sent successfully, but the second AV insert fails. This seems to rule out theory 1.
If, before every send operation, I create and bind a new AV, and afterward I close it, then the allegedly errant behavior is eliminated: both messages are correctly transmitted. This seems to support theory 2.