Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AccessViolationException in AsyncIO #12

Open
msykora opened this issue Apr 10, 2017 · 24 comments
Open

AccessViolationException in AsyncIO #12

msykora opened this issue Apr 10, 2017 · 24 comments

Comments

@msykora
Copy link

msykora commented Apr 10, 2017

Hello,

I get this intermittently when I'm trying to load test my communication component based on NetMQ 3.3.3.4, sending messages from thousands of router sockets on a single machine, connected to a single router socket on the "server":
Unhandled Exception: System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
at System.Runtime.InteropServices.Marshal.ReadInt32(IntPtr ptr, Int32 ofs)
at AsyncIO.Windows.Overlapped.get_Success()
at AsyncIO.Windows.CompletionPort.HandleCompletionStatus(CompletionStatus& completionStatus, IntPtr overlappedAddress, IntPtr completionKey, Int32 bytesTransferred)
at AsyncIO.Windows.CompletionPort.GetMultipleQueuedCompletionStatus(Int32 timeout, CompletionStatus[] completionStatuses, Int32& removed)
at NetMQ.Core.Utils.Proactor.Loop()
at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.ThreadHelper.ThreadStart()

I can't really include my code, but it seems like there is a possible bug in AsyncIO library?

Martin

@southie
Copy link

southie commented Apr 17, 2017

I noticed this while tracking down access violations in my project. Interesting to see someone else with a similar problem.

I believe this can happen if the socket is aborted while an overlapped IO is still in progress. The overlapped buffer gets marked as disposed, then freed and can be garbage collected before the Success field is checked. I made the following modifications, I haven't made a patch yet, because I am not convinced it is the source of my errors.

My Windows.Overlapped.cs:

        public bool Success { get; private set; }
        //{
        //    get { return Marshal.ReadIntPtr(m_address).Equals(IntPtr.Zero); }
        //}
         public static Overlapped CompleteOperation(IntPtr overlappedAddress)
        {
            IntPtr managedOverlapped = Marshal.ReadIntPtr(overlappedAddress, MangerOverlappedOffset);

            GCHandle handle = GCHandle.FromIntPtr(managedOverlapped);

            Overlapped overlapped = handle.Target as Overlapped;
            overlapped.InProgress = false;

            if (overlapped.Disposed)
            {
                overlapped.Free();
                overlapped.Success = false;
            }
            else
            {
                overlapped.Success = Marshal.ReadIntPtr(overlapped.m_address).Equals(IntPtr.Zero);
            }
            return overlapped;          
        }        

Note that CompletionPort.cs needs to also check Disposed in this case, as it can see a Success = false, and try to get a Socket error, and retrieve a SocketError.Succcess code, which is possible if the socket was aborted but the operation was completed successfully.

I did something like this:

 if (!overlapped.Disposed)
                        {
                            bool operationSucceed = UnsafeMethods.WSAGetOverlappedResult(overlapped.AsyncSocket.Handle,
                                overlappedAddress,
                                out bytesTransferred, false, out socketFlags);
                            if (!operationSucceed)
                            {
                                socketError = (SocketError)Marshal.GetLastWin32Error();
                            }
                        }
                        else
                        {
                            socketError = SocketError.OperationAborted;
                        }

@msykora
Copy link
Author

msykora commented Apr 18, 2017

Would it be possible to path anyway if it's a clear error scenario? Thing is, this causes an exception from a background thread that takes down my entire application... I can't really catch it, so it's quite a big issue, only alternative I am seeing now is not running poller as async and instead permanently block a thread for it so I can run it in try/catch, which I don't like much... Or try NetMQ 4 which is no small deal and might not help anyway...

@southie southie mentioned this issue Apr 18, 2017
@southie
Copy link

southie commented Apr 18, 2017

I forked and created a pull request. I had trouble setting up the environment for the whole project on my box ((I have been using a private fork for my work because of this).

If you'd like, you could use my fork of AsyncIO to test things out and see if it fixes your problem. This issue came up for me (I believe) when I had a heartbeat close non-responsive sockets while operations were still outstanding.

@msykora
Copy link
Author

msykora commented Apr 27, 2017

Sorry, I couldn't test it as I'd hoped as I'm putting out several fires at once, I only know it happens very intermittently... And when I tried to replicate the scenario simply - connecting a socket, sending a frame with sendmore flag and then closing the server socket while a message has not finished receiving, I got a different exception, from NetMQ (also from a background thread so also quite bad):
System.Net.Sockets.SocketException occurred
HResult=0x80004005
Message=An existing connection was forcibly closed by the remote host
Source=System
StackTrace:
at System.Net.Sockets.Socket.Receive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags)
at NetMQ.Core.Mailbox.TryRecv(Int32 timeout, Command& command)
at NetMQ.Core.SocketBase.ProcessCommands(Int32 timeout, Boolean throttle)
at NetMQ.Core.SocketBase.GetSocketOption(ZmqSocketOption option)
at NetMQ.Core.Utils.Selector.Select(SelectItem[] items, Int32 itemsCount, Int64 timeout)
at NetMQ.NetMQPoller.Run()
at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.ThreadHelper.ThreadStart()

@msykora
Copy link
Author

msykora commented Apr 27, 2017

Is it possible to path this and release new version if I am not able to test the fix from my side?

@southie
Copy link

southie commented Apr 27, 2017

I believe you may be running into a separate issue I had, where on UWP, functions were returning failure from the socket API, but with an error code of SocketError.Success.
Which framework are you targeting?
I have a branch on my fork that addresses it. I thought it was only an issue with the .Net Native socket implementation when targeting UWP. I posted a comment about it under NetMQ Issue 105

Also, are you running NetMQ-3 or 4? I noticed you posted an issue under NetMQ-3. I am less familiar with 3, sorry.

@pnrajesh
Copy link

pnrajesh commented Nov 2, 2017

Hi, I am using NetMQ 4.0.0.1 and AsyncIO 0.1.26 and I am getting the AccessViolationException exception:

Unhandled Exception: System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
at System.Runtime.InteropServices.Marshal.ReadInt32(IntPtr ptr, Int32 ofs)
at AsyncIO.Windows.Overlapped.get_Success()
at AsyncIO.Windows.CompletionPort.HandleCompletionStatus(CompletionStatus& completionStatus, IntPtr overlappedAddress, IntPtr completionKey, Int32 bytesTransferred)
at AsyncIO.Windows.CompletionPort.GetMultipleQueuedCompletionStatus(Int32 timeout, CompletionStatus[] completionStatuses, Int32& removed)
at NetMQ.Core.Utils.Proactor.Loop()
at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.ThreadHelper.ThreadStart()

My application would close the NETMQ socket on idle and then reopen if new data has to be transmitted and I see this exception causing the application to crash once in a while.

@southie / @somdoron , can you please confirm if the fix is part of the 0.1.26 version of AsyncIO. Github shows the fix being commited on Apr 29 and the 0.1.26 version was actually released on Jul 18

Regards,
Rajesh

@somdoron
Copy link
Owner

somdoron commented Nov 2, 2017

i think it was included but I think i still see a problem.

I can try and provide a fix, would you be able to test it?

@somdoron
Copy link
Owner

somdoron commented Nov 2, 2017

sorry, it was not included, it was april this year, version released last year.
I will push a new version next week. For now you can build from source and use it.

@pnrajesh
Copy link

pnrajesh commented Nov 2, 2017

That would be great. I have trouble building the source with Visual Studio 2017 (some targetframework not getting recognized and processor architecture mismatch errors). I will try to resolve the build issues and try the fix. If I am not successful, I will take the fix after you make the release next week.

Thanks a lot @somdoron !

@somdoron
Copy link
Owner

somdoron commented Nov 4, 2017

@pnrajesh new AsyncIO version released.
Going to release a NetMQ prerelease after the PR is meged

@pnrajesh
Copy link

Hi @somdoron
I got the new AsyncIO version from NuGet, but the existing NetMQ version 4.0.0.1 throws the below exception when the application is run. Not sure how to resolve it. Would you be releasing a new NetMQ version that would work with the new AsyncIO 0.1.40?

Exception : Could not load file or assembly 'AsyncIO, Version=0.1.25.0'

@msykora
Copy link
Author

msykora commented Nov 16, 2017

I know you're focusing mostly on NetMQ4, but there be any chance of releasing a NetmQ 3 version that can actually upgrade AsyncIO? I cannot get any of these fixes as current NetMQ 3 has somewhere baked in dependency on AsyncIO 0.1.20.0, I tried everything, redirect doesn't work, editing the package dependency doesn't help... I would appreciate much if NetMQ could use the latest AsyncIO.

@somdoron
Copy link
Owner

somdoron commented Nov 16, 2017 via email

@msykora
Copy link
Author

msykora commented Nov 16, 2017

OK I can do that :)

@somdoron
Copy link
Owner

somdoron commented Nov 16, 2017 via email

@somdoron
Copy link
Owner

somdoron commented Nov 16, 2017 via email

@msykora
Copy link
Author

msykora commented Nov 16, 2017

siki

@msykora
Copy link
Author

msykora commented Nov 16, 2017

I would need to make a pull request for NetMQ 3 though, right?

@somdoron
Copy link
Owner

somdoron commented Nov 16, 2017 via email

@msykora
Copy link
Author

msykora commented Nov 16, 2017

Hhmm can't update to AsyncIO since part of NetMQ targets framework 3.5. Should I just leave that part on the old AsyncIO?

BTW not all unit tests are passing :-( how can I determine updating to AsyncIO 1.40 won't break anything when I can't trust the tests?

@msykora
Copy link
Author

msykora commented Nov 16, 2017

Is the 3.5 NetMQ still relevant? If it is in the same nuget package and has the same nuspec, I can't set the dependency to AsyncIO 1.40...
BTW when I tried to use AsyncIO 1.26 for NetMQ 3.5 it just wouldn't build and always be looking for 1.25 instead... Not sure what causes it, I hope this won't ruin attempts to use higher AsyncIO versions with NetMQ the same way as now, I suspect packaging both NetMQ versions together i sgoing to cause problems.

Would you be OK with releasing a new NetMQ package 3.3.3.5 with only the 4.0 target and newest AsyncIO?

@msykora
Copy link
Author

msykora commented Nov 16, 2017

Tried to push a branch to remote but I don't have rights :)

@msykora
Copy link
Author

msykora commented Dec 22, 2017

So how can I make that pull request?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants