Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Windows][melodic-devel] workaround WSAPoll doesn't report failed connections. #1816

Merged
merged 2 commits into from
Feb 3, 2020

Conversation

seanyen
Copy link
Contributor

@seanyen seanyen commented Sep 26, 2019

This fixes ms-iot/ROSOnWindows#151

The original problem manifests as ros::master::check() never return back when the master is not running. It turns out that it is waiting for WSAPoll to be returned, which is at this below line:

int nEvents = poll(&fds[0], source_cnt, (timeout_ms < 0) ? -1 : timeout_ms);

However, for a non-blocking socket which has a failed connection, WSAPoll exhibits a behavior which is not to signal this error for the caller, and with the infinite timeout, it just waits forever. There are many discussions about this behavior.

To workaround this issue, I am proposing for Windows platform adding a connection readiness pre-checks for writable sockets, which can be done by calling select and it will come back with a result immediately whether it is an connection-error socket or writable-ready socket. After we have the result, we can decide to error out or go ahead with the original WSAPoll to finish its job.

@seanyen
Copy link
Contributor Author

seanyen commented Jan 14, 2020

@mikepurvis This is ready for review and merge. Thanks!

else
{
return WSAPoll(pfd, nfds, timeout);
}
}

# define USE_FTIME
Copy link
Member

@mikepurvis mikepurvis Jan 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it's not directly related here, but in the process of reviewing this, I noticed some if/endif mismatching going on. The #endif at the bottom of the file is labeled for USE_FTIME but I believe actually corresponds to WINDOWS.

@seanyen Do you mind looking into this and correcting the comments if applicable?

Copy link
Contributor Author

@seanyen seanyen Jan 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By checking the commit history, it seems to me that USE_FTIME was first introduced because Windows doesn't have gettimeofday so it ended up with diverged implementations. However, two years ago, a PR was merged and now non-Windows implementation is in favor of ros::ros_steadytime.

I would say perhaps we should see if there is any opportunity to converge the implementations now, since ros::ros_steadytime is a portable thing and theoretically it should behave the same across the platforms.

@mikepurvis, what do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is orthogonal please follow up on this separately form this PR.

@dirk-thomas
Copy link
Member

Thanks for the patch.

@dirk-thomas dirk-thomas merged commit 1d5c217 into ros:melodic-devel Feb 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ros::master::check() hanging
3 participants