Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NullReferenceExceptions in OnClientSocketReceive and unhandeled SocketExceptions in async process #243

Closed
sprehn-ero opened this issue Feb 7, 2024 · 15 comments

Comments

@sprehn-ero
Copy link

sprehn-ero commented Feb 7, 2024

Hi TalAloni,

I can confirm, that with release 1.5.1.1 the ArgumentNullException has been resolved.
image

We are still seeing some NullReferenceExceptions in
void SMB2Client.OnClientSocketReceive(IAsyncResult ar)
image

We are also getting regularly SocketExceptions, which I am not sending to Sentry, cause it is waisting our quota.

Unfortunately I don't understand on which line, or which reference is set to null here.

I think this happens because the callback OnClientSocketReceive is still registered on the socket, even after Disconnect is called. It has somewhat todo with the Issue I raised earlier: #225 . We never found a root cause or solution to it.

Is there a specific reason you are calling Disconnect instead of Close? In line 462 close is called btw.

Documentation on BeginRecieve states this about Close (which essentially calls Dispose on the Socket):
https://learn.microsoft.com/en-US/dotnet/api/system.net.sockets.socket.beginreceive?view=net-8.0 :

Close the Socket to cancel a pending BeginReceive. When the Close method is called while an asynchronous operation is in progress, the callback provided to the BeginReceive method is called.

{
    if (m_isConnected)
    {
        m_clientSocket.Disconnect(reuseSocket: false); // in line 462 Close() is called, which is mentioned in the documentation as well.
        m_connectionState.ReceiveBuffer.Dispose();
        m_isConnected = false;
        m_messageID = 0u;
        m_sessionID = 0uL;
        m_availableCredits = 1;
    }
}```



@sprehn-ero
Copy link
Author

Difference between Disconnect and Socket was asked and answered on Stack Overflow
https://stackoverflow.com/questions/31031835/socket-disconnect-vs-socket-close

Does the answer make sense to you?

@sprehn-ero
Copy link
Author

sprehn-ero commented Feb 8, 2024

dms-app  | Unhandled exception. System.NullReferenceException: Object reference not set to an instance of an object.
dms-app  |    at SMBLibrary.NetBios.NBTConnectionReceiveBuffer.get_AvailableLength()
dms-app  |    at SMBLibrary.Client.SMB2Client.OnClientSocketReceive(IAsyncResult ar)
dms-app  |    at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
dms-app  | --- End of stack trace from previous location ---
dms-app  |    at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
dms-app  |    at System.Threading.Tasks.AwaitTaskContinuation.RunCallback(ContextCallback callback, Object state, Task& currentTask)
dms-app  | --- End of stack trace from previous location ---
dms-app  |    at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__128_1(Object state)
dms-app  |    at System.Threading.ThreadPoolWorkQueue.Dispatch()
dms-app  |    at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
dms-app  | Aborted (core dumped)
dms-app exited with code 0

this actually crashes the whole api

@sprehn-ero
Copy link
Author

SMB2Client
private void OnClientSocketReceive(IAsyncResult ar)
line 433 calls getter on property AvailableLength:
clientSocket.BeginReceive(receiveBuffer.Buffer, receiveBuffer.WriteOffset, receiveBuffer.AvailableLength, SocketFlags.None, OnClientSocketReceive, connectionState);

NBTConnectionReceiveBuffer
line 31
public int AvailableLength => m_buffer.Length - (m_readOffset + m_bytesInBuffer);

m_buffer must be null at this point (which means the receive buffer is already disposed at this point)

I believe it is therefore important to shutdown, and close the socket before disposing the buffer.
SMBClient2 lines 212 and 213

         m_clientSocket.Disconnect(reuseSocket: false);
         m_connectionState.ReceiveBuffer.Dispose();

@TalAloni
Copy link
Owner

Thanks,
I was not able to reproduce this, but calling Close on the socket is consistent with the server implementation, I have update the client code to invoke Close before disposing of the buffer - this change is included in v1.5.1.2, please update if there is any further issue.

@sprehn-ero
Copy link
Author

Hi,
wanted to upgrade to 1.5.1.2 and test it, but don't find the version yet. Are you sure 1.5.1.2 is released?

@TalAloni
Copy link
Owner

Sorry, forgot to hit the button, 1.5.1.2 is out now.

@argentini
Copy link

@TalAloni I'm using 1.5.1.3 and this is still happening randomly... sometimes when I'm just recursing a directory path, or creating a file or directory. Is there a recommended way to handle the lifecycle of client requests? I've been using try/catch/finally, and disconnecting the client and then the filestore in the "finally" block.

@TalAloni
Copy link
Owner

TalAloni commented May 5, 2024

@argentini adding information about a 1.5.1.1 issue that was already resolved and presenting it as an 1.5.1.3 issue is a sure way for miscommunication and I suspect it will not lead to any resolution.
Please create a new issue and explain exactly what are you doing and what are the symptoms and how to recreate. Thanks!

@argentini
Copy link

@TalAloni Understood. I assumed this was the same issue so I thought it made sense to add a comment here. Since the exception cannot be caught I cannot inspect it, and I'm not sure how to reproduce it. The large file read/write client examples appear to be incorrect. They use a streams but do not close the streams or declare them in a using block. So all this is why I was asking about the lifecycle. I want to make sure I'm using the library correctly to see if the issue reappears so I can try to find replication steps. Otherwise you won't stand a chance at fixing it.

@sprehn-ero
Copy link
Author

I am running 1.5.1.3 and we still see System.NullReferenceException
image

It has to do with some background task still trying to process a response, even after we initiated the disconnect.

@TalAloni
Copy link
Owner

@sprehn-ero
Thanks, any further information you can provide on how to reproduce this would be extremely helpful.

@sprehn-ero
Copy link
Author

private async Task<T?> ExecuteWithFileStoreAsync<T>(Func<SMB2Client, ISMBFileStore, Task<T>> operation)
{
    var client = new SMB2Client();
    ISMBFileStore? fileStore = null;
    try
    {
        if (!client.Connect(_smbHostname, SMBTransportType.DirectTCPTransport))
        {
            _logger.LogError("Failed to connect to SMB server {hostname}", _smbHostname);
            return default;
        }

        if (client.Login(_smbDomain, _smbUsername, _smbPassword) != NTStatus.STATUS_SUCCESS)
        {
            _logger.LogError("Failed to authenticate on SMB server {hostname} as {domain} {username}", _smbHostname, _smbDomain, _smbUsername);
            return default;
        }

        fileStore = client.TreeConnect(_smbShare, out NTStatus status);
        if (status != NTStatus.STATUS_SUCCESS)
        {
            _logger.LogError("client.TreeConnect returned with status: {status}", status);
            return default;
        }

        if (fileStore == null)
        {
            _logger.LogError("client.TreeConnect returned null");
            return default;
        }
        return await operation.Invoke(client, fileStore);

    }
    catch (SocketException ex)
    {
        _logger.LogError(ex, "Failed to connect to SMB server {hostname}", _smbHostname);
        return default;
    }
    finally
    {
        try
        {
            if (client.IsConnected && fileStore != null && fileStore.Disconnect() != NTStatus.STATUS_SUCCESS) _logger.LogWarning("Failed to disconnect from fileStore on SMB server {hostname}", _smbHostname);
            if (client.IsConnected && client.Logoff() != NTStatus.STATUS_SUCCESS) _logger.LogWarning("Failed to log off from SMB server {hostname}", _smbHostname);
            if (client.IsConnected) client.Disconnect();
        }
        catch (SocketException ex)
        {
            _logger.LogWarning(ex, "Failed to cleanly disconnect from SMB server {hostname}", _smbHostname);
        }
    }
}

We use this method as a wrapper arround operations which make use of SMBClient:

for example

await ExecuteWithFileStoreAsync(async (client, fileStore) =>
{
    EnsureDocumentFolderExists(fileStore, folder);
    DateTime? result = GetLastModifiedDate(fileStore, filePath);

    if (!(result == null || docMeta.ModificationDate > result)) return false;
  
     using var stream = await httpClient.GetStreamAsync($"{_configuration["DMS:Url"]!}dms/v1/documents/{docMeta.ID}/data");
    UploadFile(client, fileStore, filePath, stream);
    RemoveAnyOtherFileInFolder(client, fileStore, folder, docMeta.Name);
  
    return true;
});

I think implementation of EnsureDocumentFolderExists, UploadFile, RemoveAnyOtherFileInFolder is not relevant.
But ExecuteWithFileStoreAsync gets called in a sequential batch operations couple thousand times. I suspect to reproduce the exception, you just need to execute one of your examples really often in a loop.

Somehow the receiveBuffer is already released and some background task is still trying to process a response

@TalAloni
Copy link
Owner

@sprehn-ero
I tried various loops and I'm not able to reproduce.

I tried to analyze the stack trace you provided: Getting NullReferenceException in OnClientSocketReceive when calling clientSocket.BeginReceive (in SMB2Client line 404) means that the connection buffer was disposed while receiving a packet.
I'm not sure how this can happen unless two threads are working with the same client simultaneously.
if you can help repro it will be very helpful.

@TalAloni
Copy link
Owner

#257 was created to track the NullReferenceException issue in v1.5.13

@TalAloni
Copy link
Owner

TalAloni commented Jun 5, 2024

The follow-up issue that has been tracked by #257 has been resolved in v1.5.2
Thank you guys for reporting the original issue and the follow-up issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants