Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http_client (GET / POST) stability problems err no x7004 ESP_ERR_HTTP_FETCH_HEADER (IDFGH-5271) #7036

Closed
pdemianczuk07 opened this issue May 17, 2021 · 27 comments
Labels
Resolution: Done Issue is done internally Status: Done Issue is done internally

Comments

@pdemianczuk07
Copy link

pdemianczuk07 commented May 17, 2021

I have similar problem like in #2684 but the issue is closed.
I use ETH with W5500 module by SPI (but on wifi I also see the problem but rally rare).
My communication with server breaks sometimes.
I cannt recievd all fragments and in next attempt i see error log ESP_ERR_HTTP_FETCH_HEADER.
I have a little describe this problem in topic:
#2684 (comment)

I suspect a problem with the LWIP configuration and I have tried to tackle the problem on forum
https://www.esp32.com/viewtopic.php?f=13&t=20859&p=76493#p76493
but any one respond me.
I used ESP-IDF v4.4-dev-1254-g639e7ad49-dirty

esp_http_client_close() and/or esp_http_client_cleanup() dosnt solve the problem.
I used esp_restart() and initalization all peraph and connection and new connection.

Environment

Own product pcb using:

  • Module or chip used: [ESP32-WROOM-32D|]
  • IDF version (run git describe --tags to find it): ESP-IDF v4.4-dev-1254-g639e7ad49-dirty
  • Build System: [idf.py]
  • Compiler version (run xtensa-esp32-elf-gcc --version to find it): xtensa-esp32-elf-gcc (crosstool-NG esp-2019r2) 8.2.0
    // 1.22.0-80-g6c4433a
  • Operating System: [Linux]
  • Using an IDE?: [Yes (Eclipse]
  • Power Supply: [external 3.3V]

Problem Description

Problem with received all fragments of response in POST method.

Expected Behavior

Stabilibty connection with server.

Actual Behavior

I (00:07:05.830) HTTP_CLIENT: HTTP_EVENT_ON_HEADER, key=content-length, value=48000
I (00:07:05.860) HTTP_CLIENT: 8192 ---> 8192
I (00:07:05.869) HTTP_CLIENT: 8192 ---> 16384
I (00:07:05.872) HTTP_CLIENT: 8192 ---> 24576
I (00:07:05.875) HTTP_CLIENT: 8192 ---> 32768
I (00:07:05.877) HTTP_CLIENT: 3940 ---> 36708
--lost communication after timeout
and next attempt with log
E (00:07:18.005) HTTP_CLIENT: HTTP POST request failed: ESP_ERR_HTTP_FETCH_HEADER no 28676 (7004)

I have tried use the same connection for GET i POST method.

Code to reproduce this issue

void test_connect()
{
esp_http_client_config_t config = {
.event_handler = _http_event_handler,
.buffer_size = 20480,
//.method=HTTP_METHOD_GET,
};

http_client_set_config(&config); 
esp_http_client_handle_t client = esp_http_client_init(&config);
   err_post = http_client_post(&request_param, &client);
if (REQUEST_CODE_OK != (err_post>>16) ) {
		printf("[ERR][ODPOW]  CODE %d: \n %.*s \n  *Breaking...\n", (err_post>>16) , HOW_MANY_CHAR, response);
		printf("esp_http_client_close() \n");
		esp_http_client_close(client);
		//	printf("esp_http_client_cleanup() \n");
		//esp_http_client_cleanup(client);
		delay(5000);
		esp_restart();
		break;
}

}

void http_client_set_config (esp_http_client_config_t * com_config)
{
com_config->path = (get_info_node()->addr_url);
com_config->timeout_ms = 60001; //!< Network timeout in milliseconds */
com_config->port = ADRES_PORT;
com_config->transport_type = HTTP_TRANSPORT_OVER_TCP;
com_config->buffer_size_tx = 1024,
printf("PREPARE TO connect to host \t path \t :port \n");
printf("%s%s port:%d\n", com_config->host, com_config->path, com_config->port);
}

int http_client_post(void* data, esp_http_client_handle_t * client3 )
{
printf("\n f(): %s\n", FUNCTION);
esp_err_t err;
int ret_val = 0;
const int TX_SIZE = REQ_ENDLIST*2;

total_len = 0;
esp_http_client_set_header(*client3, "device-secret", DEVICE_KEY);
err = esp_http_client_set_post_field(*client3, (char *) data, TX_SIZE);

if (err == ESP_OK){

}
else {
	ESP_LOGE(TAG, "HTTP POST esp_http_client_set_post_field request failed: %s  no %d (%x)", esp_err_to_name(err), err, err);
}
err = esp_http_client_perform(*client3);

http_status =  esp_http_client_get_status_code(*client3);
	http_len =  esp_http_client_get_content_length(*client3);
if (err == ESP_OK) {
        ESP_LOGI(TAG, "HTTP POST Status = %d, content_length = %d",	http_status, http_len);
} else {
        ESP_LOGE(TAG, "HTTP POST request failed: %s  no %d (%x)", esp_err_to_name(err), err, err);
        ret_val = (short) err;
}

ret_val |=  (http_status<<16);
if (200 != http_status ){
	printf("   [ODPOW HTTP_CLIENT_POST] CODE %d: \t len %d \n", http_status , http_len);
}
    return ret_val;

}

@espressif-bot espressif-bot added the Status: Opened Issue is new label May 17, 2021
@github-actions github-actions bot changed the title http_client (GET / POST) stability problems err no x7004 ESP_ERR_HTTP_FETCH_HEADER http_client (GET / POST) stability problems err no x7004 ESP_ERR_HTTP_FETCH_HEADER (IDFGH-5271) May 17, 2021
@espressif-bot espressif-bot added Status: In Progress Work is in progress and removed Status: Opened Issue is new labels Jun 21, 2021
@ESP-YJM
Copy link
Collaborator

ESP-YJM commented Jun 21, 2021

I think ESP_ERR_HTTP_FETCH_HEADER error occurs as server has closed the connection that esp_transport_read returns -1.
Could you please help check with it.

@AxelLin
Copy link
Contributor

AxelLin commented Jun 22, 2021

Hi @ESP-YJM
If it's closed by peer, I would expect got ENOTCONN errno (Socket is not connected).
However, sometimes I also got EINPROGRESS (Connection already in progress). Why I got such errno?
(I send data every 5 minutes, so it's a surprise to get EINPROGRESS errno).

@ESP-YJM
Copy link
Collaborator

ESP-YJM commented Jun 22, 2021

@AxelLin EINPROGRESS always happend in non-blocking TCP connect. I'm confused why you got the EINPROGRESS. Could you please show when and where you print the errno value. Or you can add log in LWIP where the err set to ERR_INPROGRESS.

@AxelLin
Copy link
Contributor

AxelLin commented Jun 22, 2021

I simply call esp_http_client_open (with is_async=0)
It's really strange that I got EINPROGRESS when is_async is 0.

        err = esp_http_client_perform(my_client);
        if (err != ESP_OK) {
            ESP_LOGE(TAG, "esp_http_client_perform failed: %s, %d (%s)", esp_err_to_name(err), errno, strerror(errno));
            goto fail;
        }

sometimes I got below error:
esp_http_client_perform failed: ESP_ERR_HTTP_FETCH_HEADER, 119 (Connection already in progress)

@AxelLin
Copy link
Contributor

AxelLin commented Jun 23, 2021

@AxelLin EINPROGRESS always happend in non-blocking TCP connect. I'm confused why you got the EINPROGRESS. Could you please show when and where you print the errno value. Or you can add log in LWIP where the err set to ERR_INPROGRESS.

Hi @ESP-YJM

It's indeed non-blocking:

When client->is_async is false, esp_http_client_connect() calls esp_transport_connect().

esp_http_client_connect()
  esp_transport_connect()
    t->_connect()

In transport_tcp.c

In the tcp_connect() always set socket to non-blocking:
https://github.com/espressif/esp-idf/blob/release/v4.3/components/tcp_transport/transport_tcp.c#L125

My follow up question is: what is the suggest action after hitting EINPROGRESS?

@AxelLin
Copy link
Contributor

AxelLin commented Jun 23, 2021

BTW, I check my debug message, I have below tcp_poll_read select error just before esp_http_client_perform failure:

E (11117856) TRANS_TCP: tcp_poll_read select error 104, errno = Connection reset by peer, fd = 52
E (11117864) httpc: esp_http_client_perform failed: ESP_ERR_HTTP_FETCH_HEADER, 119 (Connection already in progress)

What make me confused is why then errno changed from 104 to 119.

I have http-keep-alive set, it looks like the sever closed the session.
But I would expect "Connection reset by peer" rather than "Connection already in progress".

@ESP-YJM
Copy link
Collaborator

ESP-YJM commented Jun 23, 2021

@AxelLin EINPROGRESS always happend in non-blocking TCP connect. I'm confused why you got the EINPROGRESS. Could you please show when and where you print the errno value. Or you can add log in LWIP where the err set to ERR_INPROGRESS.

Hi @ESP-YJM

It's indeed non-blocking:

When client->is_async is false, esp_http_client_connect() calls esp_transport_connect().

esp_http_client_connect()
  esp_transport_connect()
    t->_connect()

In transport_tcp.c

In the tcp_connect() always set socket to non-blocking:
https://github.com/espressif/esp-idf/blob/release/v4.3/components/tcp_transport/transport_tcp.c#L125

My follow up question is: what is the suggest action after hitting EINPROGRESS?

You can follow the flow like https://github.com/espressif/esp-idf/blob/release/v4.3/components/tcp_transport/transport_tcp.c#L134. Use select and SO_ERROR to check whether the connect is correct.

@AxelLin
Copy link
Contributor

AxelLin commented Jun 23, 2021

My follow up question is: what is the suggest action after hitting EINPROGRESS?

You can follow the flow like https://github.com/espressif/esp-idf/blob/release/v4.3/components/tcp_transport/transport_tcp.c#L134. Use select and SO_ERROR to check whether the connect is correct.

The esp_http_client API does not allow the user to access socket.
I just want to detect "closed by peer" case.
Checking "errno==ENOTCONN" is not enough, sometimes I got "errno==EINPROGRESS" and I have no idea why I got EINPROGRESS.

@ESP-YJM
Copy link
Collaborator

ESP-YJM commented Jun 23, 2021

My follow up question is: what is the suggest action after hitting EINPROGRESS?

You can follow the flow like https://github.com/espressif/esp-idf/blob/release/v4.3/components/tcp_transport/transport_tcp.c#L134. Use select and SO_ERROR to check whether the connect is correct.

The esp_http_client API does not allow the user to access socket.
I just want to detect "closed by peer" case.
Checking "errno==ENOTCONN" is not enough, sometimes I got "errno==EINPROGRESS" and I have no idea why I got EINPROGRESS.

Maybe you execute esp_http_client_perform and the client state is HTTP_STATE_INIT, will call esp_http_client_connect. In this case, maybe esp_http_client_connect will return EINPROGRESS .

@AxelLin
Copy link
Contributor

AxelLin commented Jun 23, 2021

Maybe you execute esp_http_client_perform and the client state is HTTP_STATE_INIT, will call esp_http_client_connect. In this case, maybe esp_http_client_connect will return EINPROGRESS .

@ESP-YJM

O_NONBLOCK is clear immediately after connect() returned.
https://github.com/espressif/esp-idf/blob/release/v4.3/components/tcp_transport/transport_tcp.c#L175

I don't really understand why calling esp_http_client_perform(my_client); can get ENOTCONN.
Remember that the err is ESP_ERR_HTTP_FETCH_HEADER.
It's clearly not returned by esp_http_client_connect().

@ESP-YJM
Copy link
Collaborator

ESP-YJM commented Jun 23, 2021

Maybe you execute esp_http_client_perform and the client state is HTTP_STATE_INIT, will call esp_http_client_connect. In this case, maybe esp_http_client_connect will return EINPROGRESS .

@ESP-YJM

O_NONBLOCK is clear immediately after connect() returned.
https://github.com/espressif/esp-idf/blob/release/v4.3/components/tcp_transport/transport_tcp.c#L175

I don't really understand why calling esp_http_client_perform(my_client); can get ENOTCONN.
Remember that the err is ESP_ERR_HTTP_FETCH_HEADER.
It's clearly not returned by esp_http_client_connect().

I think we both have confused the error values obtained by errno and SO_ERROR. errno value is updated by API sock_set_errno and SO_ERROR value is updated by err_tcp. And err_tcp will send a signal for userspace who select the errorfd. While errno will no signal. Only record current socket error value. err_tcp contains ERR_RST(ECONNRESET), ERR_CLSD(ENOTCONN) and ERR_ABRT(ECONNABORTED). The two values do not interfere with each other and will not be overwritten and modified by the other party. So your confuse that errno changed from 104 to 119, i think errno is always 119, while 104 is belong to SO_ERROR value. The value is not changed.

@AxelLin
Copy link
Contributor

AxelLin commented Jun 26, 2021

Hi @ESP-YJM

  1. Regarding errno:
    The issue is after esp_http_client_perform failed, usually I got 104 errno but sometimes I got errno 119.
    It's errno because I print the errno to check. (no confusion)
    BTW, I understand the errno could be overwritten if another error happen. But seems there is no better way to know if a http connection is closed by peer.

  2. Regarding err_tcp
    When I check the return value of esp_http_client_open() or esp_http_client_perform(),
    the err is not err_tcp.
    The ep_http_client API seems does not allow getting the current socket error value.

@ESP-YJM
Copy link
Collaborator

ESP-YJM commented Jun 30, 2021

@AxelLin Is it convenient to provide examples to reproduce the problem?

@ESP-YJM
Copy link
Collaborator

ESP-YJM commented Jun 30, 2021

Hi @ESP-YJM

1. Regarding errno:
   The issue is after esp_http_client_perform failed, usually I got 104 errno but sometimes I got errno 119.
   It's errno because I print the errno to check. (no confusion)
   BTW, I understand the errno could be overwritten if another error happen. But seems there is no better way to know if a http connection is closed by peer.

2. Regarding err_tcp
   When I check the return value of esp_http_client_open() or esp_http_client_perform(),
   the err is not err_tcp.
   The ep_http_client API seems does not allow getting the current socket error value.

@AxelLin I can reproduce the issue now. But i get the errno is different from you. When first esp_http_client_perform , it is successful, and the errno is 119. After 60s (HTTP Keep-alive, perr will send FIN to close the socket), and call esp_http_client_perform, errno is 128, no select error got. And then call esp_http_client_perform again, the errno is 128 and select error is 104 (RST by peer).

@ESP-YJM
Copy link
Collaborator

ESP-YJM commented Jul 2, 2021

@pdemianczuk07 @AxelLin Could you please modify code as follow to test. When we use esp_http_client_perform to send data in the same connection, maybe due to timeout that server will send FIN to close the connection. No userspace code can detect this util http client esp_http_client_fetch_headers to read data from LWIP receive buffer. So when we read length equal to 0 and errno is ENOTCONN, which means server close the connection. We need also close the connection.

--- a/components/esp_http_client/esp_http_client.c
+++ b/components/esp_http_client/esp_http_client.c
@@ -1051,6 +1051,9 @@ esp_err_t esp_http_client_perform(esp_http_client_handle_t client)
                 if (esp_http_client_fetch_headers(client) < 0) {
                     if (client->is_async && errno == EAGAIN) {
                         return ESP_ERR_HTTP_EAGAIN;
+                    } else if (errno == ENOTCONN) {
+                        ESP_LOGI(TAG, "Close connection due to FIN received");
+                        esp_http_client_close(client);
                     }
                     return ESP_ERR_HTTP_FETCH_HEADER;
                 }

@AxelLin
Copy link
Contributor

AxelLin commented Jul 2, 2021

@ESP-YJM

I already checked the errno when esp_http_client_perform() fails and call esp_http_client_close() in my application code.
You just move the esp_http_client_close() call into esp_http_client_perform() error path.
So I don't see any difference with your change with my current code.

The point is errno may be not always correct as I mentioned.
I sometimes got errno == EINPROGRESS when it's closed by peer.
And that is why I ask you "what is the proper way to know if a session is closed by peer".

@ESP-YJM
Copy link
Collaborator

ESP-YJM commented Jul 2, 2021

@AxelLin There is no proper way to know peer close the connection with FIN. It is a right way to close the connection. The LWIP will not inform the upper layer that peer sends FIN. We only call read and it reutns 0 , errno is set to ENOTCONN and know peer close the connection.

@AxelLin
Copy link
Contributor

AxelLin commented Jul 2, 2021

--- a/components/esp_http_client/esp_http_client.c
+++ b/components/esp_http_client/esp_http_client.c
@@ -1051,6 +1051,9 @@ esp_err_t esp_http_client_perform(esp_http_client_handle_t client)
                 if (esp_http_client_fetch_headers(client) < 0) {
                     if (client->is_async && errno == EAGAIN) {
                         return ESP_ERR_HTTP_EAGAIN;
+                    } else if (errno == ENOTCONN) {
+                        ESP_LOGI(TAG, "Close connection due to FIN received");
+                        esp_http_client_close(client);
                     }
                     return ESP_ERR_HTTP_FETCH_HEADER;
                 }

If you call esp_http_client_close() when errno == ENOTCONN.
What do you expect the caller of esp_http_client_perform() to do with other errno?

If in the end, the application code needs to call esp_http_client_close() anyway for other errors.
Why you need to add this special case for ENOTCONN?
(I guess most user won't check the errno and just check the return value of esp_http_client_perform().)

@ESP-YJM
Copy link
Collaborator

ESP-YJM commented Jul 2, 2021

@AxelLin esp_http_client_perform return ESP_ERR_HTTP_FETCH_HEADER now have two situations. One is peer close the connection, need close the connection. Another is read data timeout and the connection is on, maybe call esp_http_client_perform again can fetch the header. So need add errno value to judge which situation belongs to, or add another return value to judge it.

@AxelLin
Copy link
Contributor

AxelLin commented Jul 2, 2021

@ESP-YJM
But the application already can check the errno even without your change.
Once you call esp_http_client_close() before return error the application code lost the real error reason.

@ESP-YJM
Copy link
Collaborator

ESP-YJM commented Jul 2, 2021

@AxelLin I think application no need care about the errno, only judge the return value. I can add a new return value to tell application that need close the connection.

@AxelLin
Copy link
Contributor

AxelLin commented Jul 2, 2021

@AxelLin I think application no need care about the errno

This assumption is wrong.

@espressif-bot espressif-bot added Status: Done Issue is done internally Resolution: Done Issue is done internally and removed Status: In Progress Work is in progress labels Jul 23, 2021
@AxelLin
Copy link
Contributor

AxelLin commented Jul 27, 2021

@ESP-YJM

The fix only checks the closed by peer in esp_http_client_fetch_headers error path.
What if closed-by-peed happened on esp_http_client_send_post_data(),
esp_http_client_request_send() or other function error paths?

@ESP-YJM
Copy link
Collaborator

ESP-YJM commented Jul 27, 2021

@AxelLin If peer send FIN when happened on esp_http_client_send_post_data(),
esp_http_client_request_send(), we still can send data to peer and will fall through to esp_http_client_fetch_header. And now we read FIN from peer and close socket.

@AxelLin
Copy link
Contributor

AxelLin commented Jul 27, 2021

@AxelLin If peer send FIN when happened on esp_http_client_send_post_data(),
esp_http_client_request_send(), we still can send data to peer and will fall through to esp_http_client_fetch_header. And now we read FIN from peer and close socket.

You mean even the tcp session is already closed-by-peer, esp_http_client_request_send() will still success?
This is different from what I observed.

@ESP-YJM
Copy link
Collaborator

ESP-YJM commented Jul 28, 2021

@AxelLin Yes, The premise is that the peer sends FIN to disconnect the socket. At this time, our tx connection is still on and can send data. I capture the packet and the result is the same. The ip 192.168.3.39 is my client.
Screenshot from 2021-07-28 10-20-35

@AxelLin
Copy link
Contributor

AxelLin commented Apr 1, 2022

@ESP-YJM

I test sending multiple POST requests to https server with HTTP keepalive enabled.
I notice esp_http_client_perform returns ESP_ERR_HTTP_FETCH_HEADER, but the
esp_transport_get_errno(client->transport) returns 0 rather than ENOTCONN.
(i.e. It does not detect the connection is already closed by peer)
I add debug code to ssl_read (in components/tcp_transport/transport_ssl.c),
found it calls capture_tcp_transport_error(t, ERR_TCP_TRANSPORT_CONNECTION_CLOSED_BY_FIN);
https://github.com/espressif/esp-idf/blob/master/components/tcp_transport/transport_ssl.c#L241
But I don't find any code patch to check this ERR_TCP_TRANSPORT_CONNECTION_CLOSED_BY_FIN error.
I don't find the relationship of ERR_TCP_TRANSPORT_CONNECTION_CLOSED_BY_FIN and esp_transport_get_errno().

BTW, I also have a question regarding the esp_transport_get_errno():
https://github.com/espressif/esp-idf/blob/master/components/esp_http_client/esp_http_client.c#L1148

The esp_transport_get_errno() not only returned the sock_errno, it also clear
error_handle->sock_errno. So the return value depends on the order of the call.
i.e. If someone calls esp_transport_get_errno() first, it got the actual errno but
the following up calls to esp_transport_get_errno() returns 0 unless a new error is set.

esp_http_client_get_errno() is a public API, how do you make sure the esp_transport_get_errno()
in esp_http_client.c will get correct errno? And since the esp_http_client already call it if esp_http_client_fetch_headers() failed,
the application calling esp_http_client_get_errno() will return 0.

BTW, below comment says connection has been closed cleanly
https://github.com/espressif/esp-idf/blob/master/components/tcp_transport/transport_ssl.c#L240
But if I print the errno here, it shows 119 (EINPROGRESS) rather than ENOTCONN.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Resolution: Done Issue is done internally Status: Done Issue is done internally
Projects
None yet
Development

No branches or pull requests

4 participants