Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add std::nothrow and null checks to netsocket lib #14223

Closed
wants to merge 1 commit into from

Conversation

pea-pod
Copy link
Contributor

@pea-pod pea-pod commented Feb 1, 2021

Summary of changes

This changes adds std::nowthrow and the checks for allocation failure to all places in the netsocket portion that lacked it. This change addresses #14210.

Impact of changes

add_event_listener in NetworkInterface now returns an error if the method fails. Previous attempts to add the event listener would attempt to use an unchecked standard dynamically allocated ns_list_* item.

In other cases, the dynamically allocated items will now be checked, and if unsuccessful, will return after cleaning up any outstanding issues.

TCPSocket::accept will now check that its own internally allocated new TCPSocket call will succeed, and if not, will clean up the stack resources. This should help when memory is low but an incoming connection requests a connection when the TCPSocket is listening.

Migration actions required

As new calls are now handled, code that did not check against this failure may now check for failure and handle it at the application layer.

add_event_listener now returns nsapi_error_t instead of void. The two return values possible are NSAPI_ERROR_OK and NSAPI_ERROR_NO_MEMORY in the case of memory allocation failure.

Documentation

NetworkInterface.h now has changes to the add_event_listener method. This method will now return nsapi_error_t instead of void. The docstring for the interface has been updated.


Pull request type

[] Patch update (Bug fix / Target update / Docs update / Test update / Refactor)
[] Feature update (New feature / Functionality change / New API)
[X] Major update (Breaking change E.g. Return code change / API behaviour change)

Test results

[] No Tests required for this change (E.g docs only update)
[X] Covered by existing mbed-os tests (Greentea or Unittest)
[] Tests / results supplied as part of this PR

Reviewers


@pea-pod
Copy link
Contributor Author

pea-pod commented Feb 1, 2021

@0xc0170 I had a question: I changed a return value from void to nsapi_error_t. This should not be a breaking change, but it is also an API change. Right now, I have it listed as a patch update because all existing code should still be able to call the function and ignore the return code but now with a warning. I can change this to a major change, since it does change the API and change a return value.

@ciarmcom ciarmcom added the release-type: patch Indentifies a PR as containing just a patch label Feb 1, 2021
@ciarmcom ciarmcom requested review from a team February 1, 2021 15:00
@ciarmcom
Copy link
Member

ciarmcom commented Feb 1, 2021

@pea-pod, thank you for your changes.
@ARMmbed/mbed-os-connectivity @ARMmbed/mbed-os-maintainers please review.

Copy link
Member

@pan- pan- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the submission @pea-pod .

I think that is a good change but any change to an API signature is a breaking change.
For example, the following code will stop to work:

void sink(void (NetworkInterface::*)(mbed::Callback<void(nsapi_event_t, intptr_t)>));

void foo() { 
    // Error, the signature was void(mbed::Callback<void(nsapi_event_t, intptr_t)>)
    // and is now nsapi_error_t(mbed::Callback<void(nsapi_event_t, intptr_t)>)
    sink(&NetworkInterface::add_event_listener);
}

These type of code are rare but will break with the proposed change.
There is also a change from a behavior perspective. The pointers were not unchecked; they were simply not used because the system abort the execution of the program if new returns a nullptr.

To avoid backward compatibility hell, I'd suggest adding a compilation flag that allows the application to select the new behavior and signature. The void signature can exhibit include a deprecation notice explaining that it will be removed and which compiler flag should be enabled to make it go away.

Internally, the implementation can retain uses of std::nothrow and call mbed_error to replicate the old behavior or return the error for the new behavior.

When a major release is made, we can remove the compilation flag and associated wards in the code.

query->socket_cb_data = new (std::nothrow) SOCKET_CB_DATA;
if (!query->socket_cb_data) {
nsapi_dns_query_async_resp(query, NSAPI_ERROR_NO_MEMORY, NULL);
return;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not return an error ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I won't pretend I understand this portion too much, the function signature here is static void nsapi_dns_query_async_create(void *ptr). This function is a helper for the async DNS query, and I believe the nsapi_dns_query_async_resp call takes care of the error return call.

I did not believe I looked hard enough, however, because I believe that I did not clean up the UDP socket and the appropriate call to the stack created earlier in the same function call.

@pea-pod
Copy link
Contributor Author

pea-pod commented Feb 1, 2021

Thanks for the submission @pea-pod .

I think that is a good change but any change to an API signature is a breaking change.
For example, the following code will stop to work:

void sink(void (NetworkInterface::*)(mbed::Callback<void(nsapi_event_t, intptr_t)>));

void foo() { 
    // Error, the signature was void(mbed::Callback<void(nsapi_event_t, intptr_t)>)
    // and is now nsapi_error_t(mbed::Callback<void(nsapi_event_t, intptr_t)>)
    sink(&NetworkInterface::add_event_listener);
}

These type of code are rare but will break with the proposed change.

I will admit that I never really conceived of this level of indirection. Regarding your example specifically, is this actually possible as is? I am not trying to nitpick, but don't you also need need a reference to the instance? Assuming that change went into your example, would it change anything? I think not, but I am not nearly as knowledgeable here.

There is also a change from a behavior perspective. The pointers were not unchecked; they were simply not used because the system abort the execution of the program if new returns a nullptr.

I understand what you are getting at in a low level, but I would not consider the device trapping due to attempting to set values at location 0x00000000 an intentional behavior of the underlying software. Unless I am mistaken, the drivers, etc., should not crash the system if a function call or operation cannot be completed, assuming the input was valid to begin with. This point here may be moot due to the return value change making this a breaking/new API change.

To avoid backward compatibility hell, I'd suggest adding a compilation flag that allows the application to select the new behavior and signature. The void signature can exhibit include a deprecation notice explaining that it will be removed and which compiler flag should be enabled to make it go away.

Internally, the implementation can retain uses of std::nothrow and call mbed_error to replicate the old behavior or return the error for the new behavior.

When a major release is made, we can remove the compilation flag and associated wards in the code.

I agree with the idea compilation flags for the add_event_listener. However, I do not with the other ones. Attempting to use a nullptr as a valid struct (or object, etc.) reference should be considered a bug, and thus the changes to fix it a bug fix. By this, I mean those changes which now handle nullptr returns should not require a compilation flag.

@pan-
Copy link
Member

pan- commented Feb 1, 2021

Of course, the code I provided is possible as is. You can see the compilation fail here: https://godbolt.org/z/j3Gb64

Unless I am mistaken, the drivers, etc., should not crash the system if a function call or operation cannot be completed, assuming the input was valid to begin with. This point here may be moot due to the return value change making this a breaking/new API change.

I agree with you and that is why I think that PR is important, however I wanted to point out that the system was not crashing due to an access to a nullptr. It crashes in the allocation routine which is - in a way - less damaging especially on system where 0x00000000 can be dereferenced.

That's something we lose in the new implementation of nsapi_dns_query_async_create. It just return, silently omitting something went wrong. This can lead to unexpected behavior further down the road.

@pea-pod
Copy link
Contributor Author

pea-pod commented Feb 2, 2021

Of course, the code I provided is possible as is. You can see the compilation fail here: https://godbolt.org/z/j3Gb64

Unless I am mistaken, the drivers, etc., should not crash the system if a function call or operation cannot be completed, assuming the input was valid to begin with. This point here may be moot due to the return value change making this a breaking/new API change.

I agree with you and that is why I think that PR is important, however I wanted to point out that the system was not crashing due to an access to a nullptr. It crashes in the allocation routine which is - in a way - less damaging especially on system where 0x00000000 can be dereferenced.

That's something we lose in the new implementation of nsapi_dns_query_async_create. It just return, silently omitting something went wrong. This can lead to unexpected behavior further down the road.

I apologize. I was not very clear in my response.

First, what I meant with the question about it working was simply that your example seemed to lack a context pointer in the sink function you created. Your linked example did not have it either. This was partially for my understanding. I am still unexperienced at C++ and I want to make sure I understand things correctly. I do appreciate the linked example, however, because it helped me understand your point.

Second, I do agree with adding the compiler checks to the add_event_handler function. Your example does prove it will not work without some modification to the user's application source.

My final point of clarification is how to deal with the other changes I made. I am not sure if you were saying that all changes should have the conditional compilation, or just the add_event_handler function itself. I think the correct way is to only add the compilation guards to the add_event_handler but as before, I am happy to consider alternatives or understand where I was wrong.

pan-
pan- previously requested changes Feb 2, 2021
Copy link
Member

@pan- pan- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To move this PR forward I think the following points could be improved:

  • add_event_listener should have different signatures based on a compile time parameter. This will preserve backward compatibility.
  • TLSSocketWrapper::print_mbedtls_error can use a buffer allocated on the stack. It is just 128 bytes large.
  • The change in nsapi_dns_query_async_create does not require to forward the error in a return statement as the call to nsapi_dns_query_async_resp does it. However, the socket created above in the function must be deleted.

@@ -427,7 +435,11 @@ void TLSSocketWrapper::print_mbedtls_error(MBED_UNUSED const char *name, MBED_UN
{
// Avoid pulling in mbedtls_strerror when trace is not enabled
#if defined FEA_TRACE_SUPPORT && defined MBEDTLS_ERROR_C
char *buf = new char[128];
char *buf = new (std::nothrow) char[128];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be a buffer on the stack: char buf[128]. We don't need memory allocation for 128 bytes.

@mergify mergify bot dismissed pan-’s stale review February 2, 2021 23:01

Pull request has been modified.

@pea-pod
Copy link
Contributor Author

pea-pod commented Feb 3, 2021

After the updates, I now fail spell checking in several locations, but none of the words were misspelled. I even put one instance in backticks but the spellchecker flagged it anyway.

I do not know how to proceed forward at this point.

@pea-pod
Copy link
Contributor Author

pea-pod commented Feb 3, 2021

@pan- @0xc0170

Since my issue is with the CI process, would one of you be able to help me with it soon? I don't mind fixing things, but this one has got me stumped.

Thanks.

@pan-
Copy link
Member

pan- commented Feb 5, 2021

@0xc0170 What's the error here ? Is it the use of nothrow ?

@pea-pod
Copy link
Contributor Author

pea-pod commented Feb 5, 2021

Per the build error on the CI:

Errors: 
405: nothrow
64: conf
66: conf
67: conf
78: conf
408: conf
531: conf
535: conf

It does choke on "std::nothrow" but it also fails at several places that I wrote "configure" or "configuration" (etc.) instead of "conf".

@pea-pod
Copy link
Contributor Author

pea-pod commented Feb 10, 2021

@0xc0170 @pan- @adbridge: Could someone look at this please? It's been open for 9 days, and last change 8 days ago, and the last mbed person commenting 5 days ago.

I am waiting, presumably, for someone to tell me where the error is or even just what I need to do in order for this to go in.

@ciarmcom ciarmcom added the stale Stale Pull Request label Feb 14, 2021
@ciarmcom
Copy link
Member

This pull request has automatically been marked as stale because it has had no recent activity. @pea-pod, please carry out any necessary work to get the changes merged. Thank you for your contributions.

@pea-pod
Copy link
Contributor Author

pea-pod commented Feb 14, 2021

I have changed the PR to reflect the return code change, but given that I have #if guards, I am not certain that this is correct. Also, this has been marked stale because I am awaiting a response from someone in the mbed organization with understanding of why my PR is failing the spell check.

@pea-pod
Copy link
Contributor Author

pea-pod commented Feb 24, 2021

@0xc0170 Do you now have time to look at the CI error?

@adbridge
Copy link
Contributor

@pea-pod I've pinged @0xc0170 directly

Copy link
Contributor

@0xc0170 0xc0170 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These 2 should fix the Travis failures.

Sorry for the delay

* Application may only use attach() or add_event_listener() interface. Mixing usage
* of both leads to undefined behavior.
*
* @warning This version of the function does not use the `std::nothrow` feature. Subsequently,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To fix the spellchecker:

add nothrow to mbed-os/tools/test/travis-ci/doxy-spellchecker/ignore.en.pws

* @warning This version of the function does not use the `std::nothrow` feature. Subsequently,
* the function may fail to allocate memory and cause a system error. To use the new
* version with the changes, set "nsapi.add-event-listener-return-change": 1 in the
* target overrides section in your mbed_app.conf file.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be mbed_app.json (not conf)

@0xc0170 0xc0170 requested a review from pan- March 11, 2021 08:18
@pea-pod
Copy link
Contributor Author

pea-pod commented Mar 15, 2021

@0xc0170 Thank you. I should be able to get to these changes this week.

I believe that since my changes are guarded with #ifdef guards, this should be a patch. When the guards get removed, it should be a major update. I am going to change it back to patch for now. Whenever you go from 6.x to 7.0, I (or someone) can adjust for this.

This isn't the right place for discussion, but is someone collecting a punch list of changes that should go into a 7.0? I would not mind even starting a PR that would get shelved, but I think it would be in danger of becoming unable to merge at some point. A list would be an compromise between no changes and a shelved PR.

@chrisswinchatt-arm
Copy link
Contributor

chrisswinchatt-arm commented Apr 1, 2021

Hi @pea-pod,

Is the intention here that the API consumer be able to handle the OOM condition, or is there just a concern that the allocation would silently return nullptr? Because if you look at platform/source/mbed_retarget.cpp, all the operator new variants, with the exception of the std::nothrow variants, handle allocation failure internally by calling mbed_error and crashing the program. Only the nothrow versions can actually return nullptr.

I can pick this PR up and get it merged if there is a desire to return the OOM code from the API so the consumer can handle it somehow (e.g. delete some cached objects) but if it is just to prevent silent failures, then calling the nothrow versions and calling mbed_error if they return nullptr feels redundant.

@0xc0170
Copy link
Contributor

0xc0170 commented May 4, 2021

I'll close this as it has not been updated (as proposed the new PR can be created once we get more details as requested above)

@0xc0170 0xc0170 closed this May 4, 2021
@mergify mergify bot removed needs: work release-type: patch Indentifies a PR as containing just a patch stale Stale Pull Request labels May 4, 2021
chrisswinchatt-arm added a commit to chrisswinchatt-arm/mbed-os that referenced this pull request Jun 7, 2021
chrisswinchatt-arm added a commit to chrisswinchatt-arm/mbed-os that referenced this pull request Jun 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants