Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure orderly shutdown of ssl socket #7291

Merged
merged 3 commits into from
Dec 8, 2022
Merged

Conversation

jepler
Copy link
Member

@jepler jepler commented Dec 2, 2022

A crash would occur if an SSL socket was not shut down before gc_deinit().

I do not fully understand the root cause, but some object deinitialization / deallocation prior to gc_deinit leaves the SSL object in an inconsistent state.

Rather than resolve the root cause, instead ensure that the closing of the user socket also closes the SSL socket.

Testing performed: on Feather ESP32S3 TFT run the reproducer script >10 times. Before, it crashed after 1 time. However, running 10 times helps show I'm not using up socket file descriptors accidentally.

This feels a bit voodoo so I'd love it if instead of incorporating this blindly, someone else went deeper and found the truth.

Closes: #6502

A crash would occur if an SSL socket was not shut down before
`gc_deinit()`.

I do not fully understand the root cause, but some object deinitialization
/ deallocation prior to `gc_deinit` leaves the SSL object in an
inconsistent state.

Rather than resolve the root cause, instead ensure that the closing of
the user socket also closes the SSL socket.

Closes: adafruit#6502
@dhalbert
Copy link
Collaborator

dhalbert commented Dec 2, 2022

I am going to spend a little time seeing if I can find the root cause.

@dhalbert
Copy link
Collaborator

dhalbert commented Dec 2, 2022

I do not understand the flow all that well, but this seems a bit fishy to me:

void socketpool_socket_reset(socketpool_socket_obj_t *self) {
if (self->base.type == &socketpool_socket_type) {
return;
}
self->base.type = &socketpool_socket_type;
self->connected = false;
self->num = -1;
}

Under what circumstances is the passed in socket not a Socket object? Could it be an SSLSocket? If so then fields in it are set willy-nilly. Also the SSLSocket would not be shut down. And if it's a plain Socket, it's kind of being rendered unfunctional, but the underlying lwip socket isn't decommissioned in any way.

Copy link
Collaborator

@dhalbert dhalbert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the basic idea of this: it is more robust than the previous code.

One q on a specific change, and also a question in the main comments thread.

ports/espressif/common-hal/socketpool/Socket.h Outdated Show resolved Hide resolved
@jepler
Copy link
Member Author

jepler commented Dec 7, 2022

I had a theory that for an ssl socket sss created from a regular socket s, closing s first then sss could provoke the same problem without going through interpreter shutdown. However, I was not able to make this happen in practice.

@jepler jepler requested a review from dhalbert December 7, 2022 21:51
@jepler
Copy link
Member Author

jepler commented Dec 7, 2022

The program I tried:

import wifi, socketpool, ssl, time, gc
wifi.radio.connect('<omitted>')
import socketpool
socket = socketpool.SocketPool(wifi.radio)
ctx = ssl.create_default_context()

b = bytearray(24)
for i in range(8):
    print(i)
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sss = ctx.wrap_socket(s)
    sss.connect(('finance.yahoo.com', 443))
    print(sss.send(b"GET /quote/%5EIXIC HTTP/1.0\r\nHost: finance.yahoo.com\r\n\r\n"))
    print(sss.recv_into(b))
    print(bytes(b))
    s.close()
    gc.collect()
    sss.close()
    gc.collect()

@jepler
Copy link
Member Author

jepler commented Dec 7, 2022

Under what circumstances is the passed in socket not a Socket object? Could it be an SSLSocket? If so then fields in it are set willy-nilly. Also the SSLSocket would not be shut down. And if it's a plain Socket, it's kind of being rendered unfunctional, but the underlying lwip socket isn't decommissioned in any way.

Scott answered this in a voice chat. The routine is actually initializing a socket object for use by the web workflow. The name is not very self-documenting, so changing it and adding appropriate comments would be welcome as a separate PR.

Copy link
Collaborator

@dhalbert dhalbert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@dhalbert dhalbert merged commit 44af052 into adafruit:main Dec 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants