
httpd possible a race condition for lru purge enable (IDFGH-2680) #4753

Closed
patrykkrz opened this issue Feb 12, 2020 · 5 comments

Comments

@patrykkrz
Contributor

esp-idf version - tag v4.0

When the server has many open connections (for testing, I run several browsers on several computers and connect them to the HTTP server), I expect the lru_purge_enable option to allow the next clients to connect. Unfortunately, this does not happen. The browser waits for a response, and on the server side I see the same logs printed over and over:

D (749094) httpd: httpd_server: processing listen socket 48
D (749104) httpd_sess: httpd_sess_close_lru: fd = 54
D (749104) httpd: httpd_server: doing select maxfd + 1 = 61
D (749114) httpd: httpd_server: processing ctrl message
D (749114) httpd: httpd_process_ctrl_msg: work
D (749124) httpd_sess: Skipping session close for 54 as it seems to be a race condition

My HTTP server configuration:

	config.backlog_conn = 0;
	config.lru_purge_enable = true;
	config.task_priority = 5;

These logs are printed when lru_counter is 0. Maybe in httpd_sess_close_lru, when searching for the smallest lru_counter, zero should not be considered?

This patch works for me, but I don't know if it's correct.

--- a/components/esp_http_server/src/httpd_sess.c
+++ b/components/esp_http_server/src/httpd_sess.c
@@ -342,7 +342,7 @@ esp_err_t httpd_sess_close_lru(struct httpd_data *hd)
         if (hd->hd_sd[i].fd == -1) {
             return ESP_OK;
         }
-        if (hd->hd_sd[i].lru_counter < lru_counter) {
+        if (hd->hd_sd[i].lru_counter < lru_counter && hd->hd_sd[i].lru_counter) {
             lru_counter = hd->hd_sd[i].lru_counter;
             lru_fd = hd->hd_sd[i].fd;
         }
@github-actions github-actions bot changed the title httpd possible a race condition for lru purge enable httpd possible a race condition for lru purge enable (IDFGH-2680) Feb 12, 2020
@shahpiyushv
Collaborator

@patrykkrz, we will check this, but what happens if you keep config.backlog_conn at its default value of 5? Is the observation the same?

@patrykkrz
Contributor Author

When backlog_conn is 5, I see the same effect, but I think I need more attempts to reproduce it.

@gecko242

gecko242 commented Feb 28, 2020

I can confirm I am experiencing the same issue. HTTPD config:
	httpd_config_t config = HTTPD_DEFAULT_CONFIG();
	config.uri_match_fn = httpd_uri_match_wildcard;
	config.task_priority = 8;
	config.lru_purge_enable = true; // purge least recently used request
	config.stack_size = 16384;

I see httpd using a lot of CPU time (though not triggering the WDT), with the following messages:

D (15:04:13.821) httpd_sess: Skipping session close for 57 as it seems to be a race condition
D (15:04:13.829) httpd: httpd_server: processing listen socket 51
D (15:04:13.835) httpd_sess: httpd_sess_close_lru: fd = 57
D (15:04:13.841) httpd: httpd_server: doing select maxfd+1 = 61
D (15:04:13.848) httpd: httpd_server: processing ctrl message
D (15:04:13.853) httpd: httpd_process_ctrl_msg: work

I will try @patrykkrz's patch and let you know.

EDIT:

With the patch I now get this:
W (15:18:31.313) httpd: httpd_server: error accepting new connection
D (15:18:31.320) httpd: httpd_server: doing select maxfd+1 = 61
D (15:18:31.327) httpd: httpd_server: processing listen socket 51
D (15:18:31.332) httpd_sess: httpd_sess_close_lru: fd = -1

Maybe something else is going on? I am also running a sort of DNS hijacker (for captive portal functionality), which may be interfering.

@GiulioDallaVecchia

GiulioDallaVecchia commented Apr 24, 2020

I can confirm I am experiencing the same issue. I'm also trying to use the patch by patrykkrz, but now I get this assert failure: "assertion "invalid state" failed: file "/home/gdvecchia/esp/esp-idf/components/lwip/lwip/src/core/tcp.c", line 1758, function: tcp_kill_state"

@patrykkrz
Contributor Author

I can confirm I am experiencing the same issue. I'm also trying to use the patch by patrykkrz, but now I get this assert failure: "assertion "invalid state" failed: file "/home/gdvecchia/esp/esp-idf/components/lwip/lwip/src/core/tcp.c", line 1758, function: tcp_kill_state"

That is a different problem: #1194

Try this: SLICAN@0622415

patrykkrz added a commit to SLICAN/esp-idf that referenced this issue May 22, 2020
patrykkrz pushed a commit to SLICAN/esp-idf that referenced this issue May 22, 2020
LRU counter should be started from 1, and not 0, so that all checks
work fine.

Closes espressif#4753
espressif-bot pushed a commit that referenced this issue Jun 15, 2020
LRU counter should be started from 1, and not 0, so that all checks
work fine.

Closes #4753
projectgus pushed a commit that referenced this issue Jul 14, 2020
LRU counter should be started from 1, and not 0, so that all checks
work fine.

Closes #4753
espressif-bot pushed a commit that referenced this issue Sep 10, 2020
LRU counter should be started from 1, and not 0, so that all checks
work fine.

Closes #4753
loganfin pushed a commit to Lumenaries/esp_http_server that referenced this issue Apr 23, 2024
LRU counter should be started from 1, and not 0, so that all checks
work fine.

Closes espressif/esp-idf#4753