Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: fix lwip polling to unconditionally return POLLHUP/ERR/NVAL #5222

Conversation

dpgeorge
Copy link
Member

See #4290 and #5172

@dpgeorge
Copy link
Member Author

The hard part with this is to find tests that test the error paths. I've found 2 such tests (submitted here) for 2 of the cases returning HUP/ERR/NVAL (poll on a new socket, and poll after socket closure). A third WIP test is added for another case (peer did a RST). And there's still a 4th case which I did not fix yet because I couldn't find a test for it (other error not within one of the previous 3 classes).

Note that this PR also adds POLLNVAL internally to get compliance with POSIX behaviour.

@dpgeorge
Copy link
Member Author

@t35tB0t it would be great if you could test this.

@t35tB0t
Copy link

t35tB0t commented Oct 17, 2019 via email

@dpgeorge
Copy link
Member Author

@t35tB0t yes there are 5 paths to test. I tried to write minimal tests for these but it's not easy. It'd be great if you could try to test them in whatever way you have available.

  1. This is easy, the modified tests/net_hosted/connect_poll.py in this PR tests this path.
  2. As you say, this currently will not return HUP/ERR/NVAL, and probably it shouldn't, although if you can find a test that should (eg by using CPython, or unix MicroPython) then that would be valuable (but I wouldn't waste time on it).
  3. I'm confident this part is now correct but I couldn't find a simple and reliable test for it, to show it's correct; tests/net_hosted/poll_errors.py in this PR is my attempt at a test.
  4. This is a new else-if path that I added specifically for the case of EBADF, and the test is easy, see modification to tests/extmod/uselect_poll_basic.py in this PR. Probably this case doesn't affect the issues you were seeing.
  5. I didn't fix this path (make it return POLLERR unconditionally) because I couldn't find a way to trigger this path with a test. If you have any ideas, please let us know!

@t35tB0t
Copy link

t35tB0t commented Oct 18, 2019 via email

@t35tB0t
Copy link

t35tB0t commented Oct 18, 2019 via email

@dpgeorge
Copy link
Member Author

However, not returning an err on condition #5 is concerning.
...
It might be safer to either return the error condition
...
If it isn't ever supposed to happen, then it would be a very bad thing - IMHO we still should trap it here and report it up the call stack.

Yes I agree that (5) should be fixed as well (to unconditionally return an error), and I'm happy to fix it even without having a test for it. It would just be nice to find such a test if possible (although not critical to move forward with this fix).

@t35tB0t
Copy link

t35tB0t commented Oct 19, 2019

@dpgeorge - Testing confirms unsolicited errors are returned when remote connections are abandoned or reset. It is important to note that unsolicited errors must now be handled in user code (this includes uasyncio and other modules). Specifically, existing modules have race conditions and missing exception handling which will now require fixing. Previously, these modules were prone to hanging when the socket errors were not returned by modlwip. This PR fixes the hanging and (with proper error handling), user apps can now tolerate fairly poor networking connections.

The POLLHUP response when socket is reset was tested with the server (mPy PR5222 on STM32) running the server script below. The remote client was running a variable delay connection open/close using the SO_LINGER socket option (see pollhup_client.py). We are looking for a simple way to stimulate the fifth error condition (negative state) in modlwip. In the meantime, it seems prudent to modify the fifth clause in modlwip. Providing complete condition coverage will prevent applications from hanging under any socket error state.

pollhup_server.zip
pollhup_client.zip

@t35tB0t
Copy link

t35tB0t commented Oct 20, 2019 via email

@dpgeorge
Copy link
Member Author

@t35tB0t thanks for the testing. I managed to use your SO_LINGER test to reliably trigger a TCP RST after connection, thus providing a test for case #3 above. In my tests, comparing to CPython behaviour, I found that poll should return (unsolicited) both of POLLHUP and POLLERR.

The POLLHUP response when socket is reset was tested with the server (mPy PR5222 on STM32) running the server script below. The remote client was running a variable delay connection open/close using the SO_LINGER socket option (see pollhup_client.py).

Ok, so we agree on this case #3 then. I've pushed a commit to add POLLERR to the return of case #3 and tested that it works with these test scripts of yours.

(I also found some other minor bugs with lwip, like abandoning queued incoming data when it gets a RST, but that's independent to the poll issues here and can be looked at later.)

@dpgeorge
Copy link
Member Author

Using the sandbox.zip tests above I can trigger cases #2 and #3, and according to this test these cases are now handled correctly by the commits in this PR.

@dpgeorge
Copy link
Member Author

And I've also pushed a final commit to this PR to unconditionally return POLLERR for case number #5 (general socket error). Even though we don't have a test for this I'm confident with this change based on how poll should behave.

@dpgeorge
Copy link
Member Author

I slightly modified this PR when handling case #3, to not add a return of POLLERR, because in some cases it should not be returned in this path.

Merged in 71401d5 through 26d8fd2

@dpgeorge dpgeorge closed this Oct 31, 2019
@dpgeorge dpgeorge deleted the extmod-lwip-poll-unconditional-error branch October 31, 2019 02:47
tannewt pushed a commit to tannewt/circuitpython that referenced this pull request Aug 27, 2021
…tout

Make the board ID available in board and boot_out
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants