Skip to content

Conversation

@skigrinder
Copy link
Contributor

Exception handling & memory leak fixes

Description

Exception handling & memory leak fixes

Motivation and Context

nanoFramework libraries Net and Net.Http were not handling basic exceptions and were causing applications to crash.
These libraries were also responsible for a memory leak that would cause ESP32 failures after a period of time.
No nanoFramework issues were opened for these problems

How Has This Been Tested?

Modifications tested with multiple ESP32 units sending & receiving http requests.
Verified that the Net and Net.Http libraries are no longer causing nanoFramework applications to crash.
Verified that the memory leak has been fixed.

Screenshots

Types of changes

  • Improvement (non-breaking change that improves a feature, code or algorithm)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Config and build (change in the configuration and build system, has no impact on code or features)
  • Dependencies (update dependencies and changes associated, has no impact on code or features)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@nfbot nfbot added the Type: bug Something isn't working label Dec 11, 2020
@dnfadmin
Copy link

dnfadmin commented Dec 11, 2020

CLA assistant check
All CLA requirements met.

Copy link
Member

@josesimoes josesimoes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@skigrinder please read my comments. Basically, all these methods are meant to throw exceptions when things go wrong. It's up to the caller to deal with those.

Wrapping them in try catch blocks does allow the caller app to continue, but it's not how it's meant to work. Exceptions aren't just evil they are supposed to provide information on what when wrong and allow the app to take steps to deal with them.

/// </remarks>
public Socket(AddressFamily addressFamily, SocketType socketType, ProtocolType protocolType)
{
m_Handle = NativeSocket.socket((int)addressFamily, (int)socketType, (int)protocolType);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is meant to be this way, in case there is an exception when creating a socket that is supposed to be returned to the caller. See the documentation here
SocketException is supposed to be thrown and it's up the caller to deal with it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood and agreed.
SocketException is now being passed on to the caller

public void Connect(EndPoint remoteEP)
{
if (m_Handle == -1)
try
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar comment. This is supposed to throw exceptions when something goes wrong. See here

Again: we can hide those, it's up to the caller to handle them.
Otherwise we are are breaking the compatibility/behaviour with the full .NET framework.

{
// if we are on blocking connect
//Poll(-1, SelectMode.SelectWrite); // skigrinder - old code would poll until connection was established or exception thrown - this hung nanoFramework applications
if (!Poll(1500000, SelectMode.SelectWrite)) // 'microseconds' values (1st param) - 1500000 is 1.5 seconds;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the socket is set to blocking, the call is supposed to be blocked until the connection happens.
The way it's implemented it shouldn't prevent the other threads from running. If that's the case than it's an issue with the interpreter not here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we leave this timeout in until the interpreter is fixed??
Hanging the application here causes the .Net & .Net.Http libraries to be unusable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How's that? The -1 here is exactly to make the execution stop until a connection is established (or the default timeout occurring)... Isn't that happening now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. My applications were hanging here. But I haven't tested the -1 option since the recent interpreter/extension changes were made.

/// </remarks>
public bool Poll(int microSeconds, SelectMode mode)
{
if (m_Handle == -1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. See the official documentation here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SocketException now being thrown


return NativeSocket.recv(this, buffer, offset, size, (int)socketFlags, m_recvTimeout);
//return NativeSocket.recv(this, buffer, offset, size, (int)socketFlags, m_recvTimeout);
return NativeSocket.recv(this, buffer, offset, size, (int)socketFlags, 1500);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't override the m_recvTimeout field with a fixed value, instead the requested value should be honoured.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, this was a test case that I forgot to change back. I'm still getting CLR_E_FAIL errors on NativeSocket.recv()
Is this something that can be fixed in the interpreter/extension?

++++ Exception System.Net.Sockets.SocketException - CLR_E_FAIL (5) ++++
++++ Message: 
++++ System.Net.Sockets.NativeSocket::recv [IP: 0000] ++++
++++ System.Net.Sockets.Socket::Receive [IP: 0019] ++++
++++ System.Net.Sockets.NetworkStream::Read [IP: 0062] ++++
++++ System.Net.InputNetworkStreamWrapper::RefillInternalBuffer [IP: 0038] ++++
++++ System.Net.InputNetworkStreamWrapper::Read_HTTP_Line [IP: 004b] ++++
++++ System.Net.HttpWebRequest::ParseHTTPResponse [IP: 002e] ++++
++++ System.Net.HttpWebRequest::GetResponse [IP: 0040] ++++
++++ ESP32server.ESP32server::processOutgoingAPIrequests [IP: 0084] ++++
++++ ESP32server.ESP32server+<>c::<Main>b__24_0 [IP: 0003] ++++
++++ Exception System.Net.WebException - 0x00000000 (5) ++++
++++ Message: GetResponse() failed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the exceptions being thrown in native code look correct. Nothing obvious is misplaced.
Can you catch the SocketException and see is the error code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've modified my application so that HttpWebRequest.Timeout is not set (i.e. I commented it out).
The application no longer gets the CLR_E_FAIL exceptions from NativeSocket.recv()
I suspect the exceptions had something to do with timeouts not being properly bubbled up to the user code.
Not sure if this is something that needs to be fixed immediately, but it would be nice to document the workaround.
I.e. "Don't set HttpWebRequest.Timeout before calling GetResponse()"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting... Follows my thoughts about this.

  1. m_timeout in HttpWebRequest is only used to setup a TimerCallback it never reaches the underlying socket.
  2. It's default is WebRequest.DefaultTimeout.
  3. If you're not getting an exception anymore, that's because everything happens as expected and the flow and sequence don't misbehave.
  4. If you get an exception when setting the timeout, it's likely that's happening inside the TimerCallback handler code, when it does something unexpected or invalid with the underlying socket. There are 2 calls to ParseHTTPResponse inside that handler, so this seems reasonable. It could be that extra checks are required there (or try catchs) to handle situations on which the expected sequence is processed gracefully but as the timer callback would be running, at expiry, it will wrongly try to do things with the socket that don't make sense anymore because the response has already been received and processed.

Just a theory, I haven't done anything to prove it, just looked at the code.

Copy link
Member

@josesimoes josesimoes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@josesimoes
Copy link
Member

@skigrinder I'm good with merging this, as the current issues are at HTTP library and don't seem to point here...
Anything against it?

@skigrinder
Copy link
Contributor Author

@josesimoes , Sounds good. Thanks for helping me to understand .NET exception handling better!!

@josesimoes josesimoes merged commit 2f66aa9 into nanoframework:develop Dec 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Type: bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants