-
Notifications
You must be signed in to change notification settings - Fork 65
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Windows Socket IO causes VM crash (Issue #495) #497
Conversation
Socket IO on Windows can crash the VM with an access violation due to a race condition on memory freeing in aioWin.c. The sequence of events that can lead to the crash is: allHandles is malloc'd in aioPoll() and passed to sliceWaitForMultipleObjects(). sliceWaitForMultipleObjects() stores a pointer to allHandles in sliceData->handles. waitHandlesThreadFunction() is then called from one or more threads and copies the data from allHandles. aioPoll() then waits for an event using WaitForMultipleObjectsEx(). Once WaitForMultipleObjectsEx() returns it checks the results and frees allHandles. However this assumes that every thread gets a chance to run prior to WaitForMultipleObjectsEx() returning. But WaitForMultipleObjectsEx() will return either after a timeout or after the first thread indicates an event - since WaitForMultipleObjectsEx() is called with bWaitAll = FALSE. On a machine with only a couple of cores and a large number of sockets open allHandles can be freed prior to all threads getting enough CPU time, causing an access violation. The solution is to allocate a buffer per thread and copy the relevant portion of allHandles in to it prior to spawning the thread. As not all threads get a chance to complete prior to WaitForMultipleObjectsEx() returning, it also means that socket IO may not be recognised, and the socket not read (or written). Polling all sockets even on timeout ensures that all IO is recognised. The time for checkEventsInHandles() is less than 1mS, so the overhead is minimal, with basically no work being done if no handles are registered for asynchronous IO. Fixes: pharo-project#495
@@ -506,16 +497,6 @@ EXPORT(long) aioPoll(long microSeconds){ | |||
CloseHandle(waitingHandles[i]); | |||
} | |||
|
|||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which is the reason to not having the quick return when the timeout has arrived? Because we are testing the handlers when we have already testing them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Pablo,
Because not all the threads may have had a chance to execute before the timeout arrives. If the quick return is used there is a good chance that sockets will incorrectly not be signaled as having IO complete - we saw this happen quite regularly.
So the WaitForMultipleObjectsEx()
call will return early if there is an event, and may return early if there is IO, allowing the event or IO to be processed, but if it times out it isn't really giving any information about IO completion. And checking whether IO has occurred is very quick, 1000 sockets are checked in less than a millisecond (from memory).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, perfect, thanks for the explanation
Socket IO on Windows can crash the VM with an access violation due to a race condition on memory freeing in aioWin.c.
The sequence of events that can lead to the crash is:
However this assumes that every thread gets a chance to run prior to WaitForMultipleObjectsEx() returning. But WaitForMultipleObjectsEx() will return either after a timeout or after the first thread indicates an event - since WaitForMultipleObjectsEx() is called with bWaitAll = FALSE.
On a machine with only a couple of cores and a large number of sockets open allHandles can be freed prior to all threads getting enough CPU time, causing an access violation.
The solution is to allocate a buffer per thread and copy the relevant portion of allHandles in to it prior to spawning the thread.
As not all threads get a chance to complete prior to WaitForMultipleObjectsEx() returning, it also means that socket IO may not be recognised, and the socket not read (or written).
Polling all sockets even on timeout ensures that all IO is recognised. The time for checkEventsInHandles() is less than 1mS, so the overhead is minimal, with basically no work being done if no handles are registered for asynchronous IO.
Fixes: #495