Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tls transport StoreProhibited exception when connection closed by server (IDFGH-7307) #8895

Closed
MarcGeh opened this issue May 4, 2022 · 14 comments
Labels
Resolution: Done Issue is done internally Status: Done Issue is done internally

Comments

@MarcGeh
Copy link

MarcGeh commented May 4, 2022

Environment

  • Development Kit: ESP32-Wrover-Kit
  • Kit version: v3
  • Module or chip used: ESP32-WROVER-E
  • IDF version: v4.4.1
  • Build System: idf.py
  • Compiler version: xtensa-esp32-elf-gcc (crosstool-NG esp-2021r2-patch3) 8.4.0
  • Operating System: Linux
  • Using an IDE?: No
  • Power Supply: USB

Problem Description

Hi guys, I'm currently trying to bring up the azure-iot-middleware-freertos to connect the ESP32 to azure. I'm using tls sockets for the communication based on the esp32 example iot-middleware-freertos-samples

Once the server is closing the connection, the firmware crashes with the unhandled exception "StoreProhibited".
I'm using esp_transport_ssl_init() to initialize the context for the transaction.
esp_transport_connect and subsequent calls to esp_transport_read/esp_transport_write are successful as well, until the server closes the connection, then I get the exception on a call of esp_transport_read.

I've been tracing the issue back to transport.c, where in the function capture_tcp_transport_error the err_handle from the esp_transport_handle_t is used to save the error code of the transaction. However, the err_handle was never initialized and thus a write into a memory region referenced by a NULL pointer is attempted, which causes the problem. I'm writing this issue since it makes the impression onto me, that esp_transport_ssl_init does not initialize the context properly, since the esp_transport_init function called by the esp_transport_ssl_init function never initializes the underlying transport foundation which is referenced by the capture_tcp_transport_error function.

Steps to reproduce

You can try to run the example from iot-middleware-freertos-samples repository, but setting it up is quite a hassle.

Debug Logs

I (10454) MQTT: Packet received. ReceivedBytes=2.
I (10454) MQTT: CONNACK session present bit not set.
I (10454) MQTT: Connection accepted.
I (10474) MQTT: Received MQTT CONNACK successfully from broker.
I (10474) MQTT: MQTT connection established with the broker.
I (10484) IOT: AzureIoTProvisioning established an MQTT connection with global.azure-devices-provisioning.net
I (14484) IOT: AzureIoTProvisioning attempting to subscribe to the MQTT topic: devices/+/messages/devicebound/#
I (14654) MQTT: Packet received. ReceivedBytes=3.
I (18964) MQTT: Packet received. ReceivedBytes=192.
I (18964) MQTT: De-serialized incoming PUBLISH packet: DeserializerResult=MQTTSuccess.
I (18964) MQTT: State record updated. New state=MQTTPublishDone.
Guru Meditation Error: Core  1 panic'ed (StoreProhibited). Exception was unhandled.

Core  1 register dump:
PC      : 0x4022d1e3  PS      : 0x00060e30  A0      : 0x8022da42  A1      : 0x3fff0710  
0x4022d1e3: capture_tcp_transport_error at /home/marc/esp/esp-idf_4_4/components/tcp_transport/transport.c:332

A2      : 0x3fff2698  A3      : 0x00008008  A4      : 0x00000001  A5      : 0x3f41212e  
A6      : 0x00000001  A7      : 0x0000000f  A8      : 0x8022d1b4  A9      : 0x3fff06e0  
A10     : 0x00000000  A11     : 0x3287c132  A12     : 0x3287c132  A13     : 0x3fff2740  
A14     : 0x00000001  A15     : 0x00000000  SAR     : 0x0000000a  EXCCAUSE: 0x0000001d  
EXCVADDR: 0x00000000  LBEG    : 0x4000c2e0  LEND    : 0x4000c2f6  LCOUNT  : 0xffffffff  


Backtrace:0x4022d1e0:0x3fff07100x4022da3f:0x3fff0740 0x4022cfb2:0x3fff0780 0x4010851f:0x3fff07b0 0x4010c3cd:0x3fff07f0 0x4010ab5d:0x3fff0820 0x4010ac9c:0x3fff0850 0x40109574:0x3fff0880 0x40108f6e:0x3fff08b0 0x40109264:0x3fff08f0 0x401002be:0x3fff0920 0x401003f1:0x3fff0a90 0x4010041f:0x3fff0ac0 0x40287ff9:0x3fff0af0 0x400d3630:0x3fff0b10 0x40096402:0x3fff0b40 
0x4022d1e0: capture_tcp_transport_error at /home/marc/esp/esp-idf_4_4/components/tcp_transport/transport.c:332

0x4022da3f: ssl_read at /home/marc/esp/esp-idf_4_4/components/tcp_transport/transport_ssl.c:241

0x4022cfb2: esp_transport_read at /home/marc/esp/esp-idf_4_4/components/tcp_transport/transport.c:190
@espressif-bot espressif-bot added the Status: Opened Issue is new label May 4, 2022
@github-actions github-actions bot changed the title tls transport StoreProhibited exception when connection closed by server tls transport StoreProhibited exception when connection closed by server (IDFGH-7307) May 4, 2022
@david-cermak
Copy link
Collaborator

Hi @MarcGeh

Thanks for the report!
This is a bug in tcp_transport component, the reason why it's not initialized is probably that the project uses the component without a list (internal container for multiple transport layers). This scenario is supported and should work properly, but it's not so widely used. Internally we use the transport only in lists, for example:

client->transport_list = esp_transport_list_init();
ESP_WS_CLIENT_MEM_CHECK(TAG, client->transport_list, goto _websocket_init_fail);
esp_transport_handle_t tcp = esp_transport_tcp_init();
ESP_WS_CLIENT_MEM_CHECK(TAG, tcp, goto _websocket_init_fail);
esp_transport_set_default_port(tcp, WEBSOCKET_TCP_DEFAULT_PORT);
esp_transport_list_add(client->transport_list, tcp, "_tcp"); // need to save to transport list, for cleanup

Sorry for the inconvenience, will fix it soon (similar issue has been fixed recently in e794786, again partly initialized transport, using one layer separately)

@MarcGeh
Copy link
Author

MarcGeh commented May 4, 2022

Hi @david-cermak thanks for the fast response. I need to use websockets in the future anyways, so I might switch over to use the websocket client right away to avoid this issue.

@tmaltesen
Copy link

Is there a workaround for this issue. Alle esp32 examples seems to be based on the same initialization using TLS. I am not sure how to fix this.

@euripedesrocha
Copy link
Collaborator

@tmaltesen for now you should use all transports from a transport_list. This will guarantee that all initialization is made correctly. The fix to allow usage without the list should land soon.

@tmaltesen
Copy link

Thanks a lot for the info.
I am new to this environment and are not exactly sure how to do it. Should I create a transport_list myself... can you give a little details. What is in the list, how to register it etc.

@euripedesrocha
Copy link
Collaborator

Look into Cermak 's answer for an example. You may also open a new issue with the description of the problem you are facing, presenting a short example on how you are using tcp_transport and the debug logs.

@tmaltesen
Copy link

Thanks again :-)
My problem is am an not using it directly, but through the "Azure IoT middleware for freertos" layer.
So it is deep into some code I have not deep insight into. And of course I need to get it running "yesterday" ;-(

@MarcGeh
Copy link
Author

MarcGeh commented Jun 23, 2022

@tmaltesen Unfortunately you will not make this work unless you are willing to make some minor changes to the transport_tls_esp32.c file. If this is an option for you I can guide you through this.

@tmaltesen
Copy link

tmaltesen commented Jun 23, 2022 via email

@tmaltesen
Copy link

tmaltesen commented Jun 23, 2022 via email

@MarcGeh
Copy link
Author

MarcGeh commented Jun 23, 2022

Change the NetworkContext definition to have an esp_transport_list_handle_t and a scheme name instead of the esp_transport_handle_t so it looks like this:

struct NetworkContext {
  const char* scheme;
  esp_transport_list_handle_t xTransportList;
  uint32_t ulReceiveTimeoutMs;
  uint32_t ulSendTimeoutMs;
};

Now in each function where pNetworkContext->xTransport is used, replace it by a local variable initialized like this:

  esp_transport_handle_t xTransport = esp_transport_list_get_transport(pNetworkContext->xTransportList, pNetworkContext->scheme);
  if (xTransport == NULL) {
    ESP_LOGE(TAG, "Failed to get transport handle for scheme %s", pNetworkContext->scheme);
    return eTLSTransportInvalidParameter;
  }

You still have to properly initialize the network struct and hand it over to the IoT client as part of the AzureIoTTransportInterface_t. E.g. my initialization of it looks like this:

  memset(&network_context_, 0, sizeof(NetworkContext_t));
  network_context_.xTransportList = esp_transport_list_init();
  network_context_.scheme = "_mqtt";
  if (network_context_.xTransportList == NULL) {
    ESP_LOGE(TAG, "Failed to create transport list");
  } else {
    esp_transport_handle_t mqtt = esp_transport_ssl_init();
    if (mqtt == NULL) {
      ESP_LOGE(TAG, "Failed to create transport handle");
    } else {
      esp_transport_list_add(network_context_.xTransportList, mqtt, network_context_.scheme);
    }
  }

Keep in mind that in the original example the call to esp_transport_ssl_init() was inside the TLS_Socket_Connect function. Now the initialization is handled outside of that module, so you have to take care to also tear it down once you are done with it.
For completeness, this is how your functions inside transport_tls_esp32.c should look like:

/*-----------------------------------------------------------*/

TlsTransportStatus_t TLS_Socket_Connect(NetworkContext_t *pNetworkContext, const char *pHostName, uint16_t usPort,
                                        const NetworkCredentials_t *pNetworkCredentials, uint32_t ulReceiveTimeoutMs, uint32_t ulSendTimeoutMs) {
  TlsTransportStatus_t xReturnStatus = eTLSTransportSuccess;

  if ((pNetworkContext == NULL) || (pHostName == NULL) || (pNetworkCredentials == NULL)) {
    ESP_LOGE(TAG,
             "Invalid input parameter(s): Arguments cannot be NULL. pNetworkContext=%p, "
             "pHostName=%p, pNetworkCredentials=%p.",
             pNetworkContext, pHostName, pNetworkCredentials);
    return eTLSTransportInvalidParameter;
  }

  esp_transport_handle_t xTransport = esp_transport_list_get_transport(pNetworkContext->xTransportList, pNetworkContext->scheme);
  if (xTransport == NULL) {
    ESP_LOGE(TAG, "Failed to get transport handle for scheme %s", pNetworkContext->scheme);
    return eTLSTransportInvalidParameter;
  }
  pNetworkContext->ulReceiveTimeoutMs = ulReceiveTimeoutMs;
  pNetworkContext->ulSendTimeoutMs = ulSendTimeoutMs;
  if (pNetworkCredentials->ppcAlpnProtos) {
    esp_transport_ssl_set_alpn_protocol(xTransport, pNetworkCredentials->ppcAlpnProtos);
  }

  if (pNetworkCredentials->xDisableSni) {
    esp_transport_ssl_skip_common_name_check(xTransport);
  }

  if (pNetworkCredentials->pucRootCa) {
    esp_transport_ssl_set_cert_data(xTransport, pNetworkCredentials->pucRootCa, strlen(pNetworkCredentials->pucRootCa));
  }

  if (pNetworkCredentials->pucClientCert) {
    esp_transport_ssl_set_client_cert_data(xTransport, pNetworkCredentials->pucClientCert, strlen(pNetworkCredentials->pucClientCert));
  }

  if (pNetworkCredentials->pucPrivateKey) {
    esp_transport_ssl_set_client_key_data(xTransport, pNetworkCredentials->pucPrivateKey, strlen(pNetworkCredentials->pucPrivateKey));
  }

  if (esp_transport_connect(xTransport, pHostName, usPort, 10000) < 0) {
    ESP_LOGE(TAG, "Failed establishing TLS connection (esp_transport_connect failed)");
    xReturnStatus = eTLSTransportConnectFailure;
  } else {
    xReturnStatus = eTLSTransportSuccess;
  }

  /* Clean up on failure. */
  if (xReturnStatus != eTLSTransportSuccess) {
    if (pNetworkContext != NULL) {
      esp_transport_close(xTransport);
    }
  } else {
    ESP_LOGI(TAG, "(Network connection %p) Connection to %s established.", pNetworkContext, pHostName);
  }

  return xReturnStatus;
}
/*-----------------------------------------------------------*/

void TLS_Socket_Disconnect(NetworkContext_t *pNetworkContext) {
  if ((pNetworkContext == NULL)) {
    ESP_LOGE(TAG, "Invalid input parameter(s): Arguments cannot be NULL. pNetworkContext=%p.", pNetworkContext);
    return;
  }

  esp_transport_handle_t xTransport = esp_transport_list_get_transport(pNetworkContext->xTransportList, pNetworkContext->scheme);
  if (xTransport == NULL) {
    ESP_LOGE(TAG, "Failed to get transport handle for scheme %s", pNetworkContext->scheme);
    return;
  }

  /* Attempting to terminate TLS connection. */
  esp_transport_close(xTransport);
}
/*-----------------------------------------------------------*/

int32_t TLS_Socket_Recv(NetworkContext_t *pNetworkContext, void *pBuffer, size_t xBytesToRecv) {
  int32_t tlsStatus = 0;

  if ((pNetworkContext == NULL) || (pBuffer == NULL) || (xBytesToRecv == 0)) {
    ESP_LOGE(TAG,
             "Invalid input parameter(s): Arguments cannot be NULL. pNetworkContext=%p, "
             "pBuffer=%p, xBytesToRecv=%d.",
             pNetworkContext, pBuffer, xBytesToRecv);
    return eTLSTransportInvalidParameter;
  }

  esp_transport_handle_t xTransport = esp_transport_list_get_transport(pNetworkContext->xTransportList, pNetworkContext->scheme);
  if (xTransport == NULL) {
    ESP_LOGE(TAG, "Failed to get transport handle for scheme %s", pNetworkContext->scheme);
    return eTLSTransportInvalidParameter;
  }

  tlsStatus = esp_transport_read(xTransport, pBuffer, xBytesToRecv, pNetworkContext->ulReceiveTimeoutMs);
  if (tlsStatus < 0) {
    ESP_LOGE(TAG, "Reading failed, errno= %d", errno);
    return ESP_FAIL;
  }

  return tlsStatus;
}
/*-----------------------------------------------------------*/

int32_t TLS_Socket_Send(NetworkContext_t *pNetworkContext, const void *pBuffer, size_t xBytesToSend) {
  int32_t tlsStatus = 0;

  if ((pNetworkContext == NULL) || (pBuffer == NULL) || (xBytesToSend == 0)) {
    ESP_LOGE(TAG,
             "Invalid input parameter(s): Arguments cannot be NULL. pNetworkContext=%p, "
             "pBuffer=%p, xBytesToSend=%d.",
             pNetworkContext, pBuffer, xBytesToSend);
    return eTLSTransportInvalidParameter;
  }

  esp_transport_handle_t xTransport = esp_transport_list_get_transport(pNetworkContext->xTransportList, pNetworkContext->scheme);
  if (xTransport == NULL) {
    ESP_LOGE(TAG, "Failed to get transport handle for scheme %s", pNetworkContext->scheme);
    return eTLSTransportInvalidParameter;
  }

  tlsStatus = esp_transport_write(xTransport, pBuffer, xBytesToSend, pNetworkContext->ulSendTimeoutMs);
  if (tlsStatus < 0) {
    ESP_LOGE(TAG, "Writing failed, errno= %d", errno);
    return ESP_FAIL;
  }

  return tlsStatus;
}
/*-----------------------------------------------------------*/

@tmaltesen
Copy link

tmaltesen commented Jun 23, 2022 via email

@MarcGeh
Copy link
Author

MarcGeh commented Jun 23, 2022

Make it on a as high as possible layer, since it is actually encouraged to reuse the same handle for as long as possible.

@tmaltesen
Copy link

tmaltesen commented Jun 23, 2022 via email

@espressif-bot espressif-bot added Resolution: Done Issue is done internally Status: Done Issue is done internally and removed Status: Opened Issue is new labels Jan 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Resolution: Done Issue is done internally Status: Done Issue is done internally
Projects
None yet
Development

No branches or pull requests

5 participants