Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Two inherited bugs in lwip tcp #365

Open
EFudd opened this issue Apr 15, 2017 · 5 comments
Open

Two inherited bugs in lwip tcp #365

EFudd opened this issue Apr 15, 2017 · 5 comments

Comments

@EFudd
Copy link

EFudd commented Apr 15, 2017

First bug: When attempting to use tcp_write() large data without making copies of the data: apiflags=0, and TCP_WRITE_FLAG_COPY configuration flag set to 0, the system crashes. The problem is that tcp_write() creates two linked pbufs, one for the header, and one with pointer to the data. Downstream in esp_interface low_level_output() contains a loop that writes the linked pbufs using sdk_ieee80211_output_pbuf() firmware code. sdk_ieee80211_output_pbuf() trashes the data in the 2nd pbuf. Evidence below shows pbufs before and after the call. Note sdk_ieee80211_output_pbuf() calls back to pbuf_ref() to update the reference count and the data in the 2nd pbuf has been corrupted. I also note that the low_level_output() loop in the official FREERTOS code looks suspicious (and does not work).

Pbuf: p: (0x3fff3e50) [00] next(0x3fff3ec8) payload(0x3fff3e8a) tlen: 0b7 len: 036 type:00 flags: 00 ref:01 eb(0xa5a5a5a5)
Pbuf: p->next (0x3fff3ec8) [00] next(0) payload(0x3fff3ee0) tlen: 081 len: 081 type:01 flags: 00 ref:01 eb(0xa5a5a5a5)
Pbuf:ref (0x3fff3e50) [00] next(0x3fff3ec8) payload(0x3fff3e8a) tlen: 0b7 len: 036 type:00 flags: 00 ref:01 eb(0xa5a5a5a5)
ref exit
Pbuf: loop next (0x3fff3ec8) [00] next(0) payload(0x3fff3ee0) tlen: 081 len: 081 type:01 flags: 00 ref:01 eb(0xa5a5a5a5) <- pbuf before sdk_ieee80211_output_pbuf
Pbuf:ref (0x3fff3ec8) [00] next(0x26000000) payload(0x9c893844) tlen: 0312e len: 03120 type:030 flags: 031 ref:05448 eb(0x312f5054) <-- trashed pbuf after sdk_ieee80211_output_pbuf
ref exit
Pbuf: loop next (0x26000000) [00] next(0xa65149c) payload(0xa65149c) tlen: 0149c len: 0a65 type:09c flags: 014 ref:0a65 eb(0xa65149c)

Second inherited bug: In api_lib.c
at netconn_write_partly()
the following two lines are obviously wrong. The "!=" and "==" should be swapped in these two lines.
LWIP_ERROR("netconn_write: invalid conn", (conn != NULL), return ERR_ARG;);
LWIP_ERROR("netconn_write: invalid conn->type", (conn->type == NETCONN_TCP), return ERR_VAL;);
This error is inheried from the official FREERTOS release.

@ourairquality
Copy link
Contributor

Looking at the code (disassembled) it appears to allocate an esf_buf via sdk_esf_buf_alloc which is passed the pbuf, this copies the payload pointer and the length of this buffer into the esf_buf slots and it later calls ppTxPkt with this esf_buf. Perhaps it sends one packet for each call, and can not build a packet from multiple pbufs?

I don't even know what the fields of the esf_buf are. The first slot gets the pbuf pointer, but the next slot points to an object into which the payload pointers is stored and seems to be statically allocated when these pools are initialized.

What would that mean for the lwip integration? Is it always wrong for low_level_output to send the linked pbufs as separate packets? Should it be copying them into one packet?

fwiw I have been working away slowly trying to get the code working with a recent lwip version and if you think that would help or you could contribute then please see https://github.com/ourairquality/esp-open-rtos

@EFudd
Copy link
Author

EFudd commented Apr 15, 2017 via email

@malachib
Copy link

@ourairquality thank you for your efforts. I dig lwip, and I hope your efforts can get it more solid in this project

@ourairquality
Copy link
Contributor

Fwiw tried to address this in #389 using the approach noted in the lwip v2 source code which uses pbuf_clone to create a single pbuf from multiple pbufs. Obviously this is not efficient, and deeper understanding and integration might be needed to implement something efficient, but would it happen to solve the bug?

@malachib
Copy link

Perhaps related to #322

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants