Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Did WiFiClient.connected recently change for small payloads? #5257

Closed
6 tasks done
marcelstoer opened this issue Oct 17, 2018 · 7 comments · Fixed by #5355
Closed
6 tasks done

Did WiFiClient.connected recently change for small payloads? #5257

marcelstoer opened this issue Oct 17, 2018 · 7 comments · Fixed by #5355
Assignees
Milestone

Comments

@marcelstoer
Copy link
Contributor

Basic Infos

  • This issue complies with the issue POLICY doc.
  • I have read the documentation at readthedocs and the issue is not addressed there.
  • I have tested that the issue is present in current master branch (aka latest git).
  • I have searched the issue tracker for a similar issue.
  • If there is a stack dump, I have decoded it.
  • I have filled out all fields below.

Platform

  • Hardware: ESP-WROOM-2
  • Core Version: 2..4.2
  • Development Env: Arduino IDE
  • Operating System: macOS

Settings in IDE

  • Module: ESP-WROOM-2
  • Flash Mode: qio
  • Flash Size: 4MB
  • lwip Variant: v2 Lower Memory
  • Reset Method: ck
  • Flash Frequency: 40Mhz
  • CPU Frequency: 80Mhz
  • Upload Using: SERIAL
  • Upload Speed: other

Problem Description

This is a follow-up to ThingPulse/esp8266-weather-station#140 which in turns follows up on as long thread in our support forum.

It appears that a (nested) HTTP/TCP parse loop that first checks for client.connected() and then for client.available() > 0 no longer works reliably for small payloads.

What seems to be happening on the networking stack is that "sometimes" the WiFiClient reports connected() == false immediately when the payload is sufficiently small. However, even though the client is no longer connected its internal buffer contains all the data and, hence, client.available() returns more than 0.

Can you confirm that such a double nested loop, which is a very common sketch template, should these days be changed to only check for client.available() > 0?

Sketch fragment

    // WiFiClient * client = http.getStreamPtr();
    // or something like
    // BearSSL::WiFiClientSecure client;
    // ...
    // if (client.connect(host, port))
    while(client.connected()) {
      while((size = client.available()) > 0) {
        // client->read() and do stuff
        // ...
        // give WiFi and TCP/IP libraries a chance to handle pending events
        yield();
      }
    }
  }
@d-a-v
Copy link
Collaborator

d-a-v commented Oct 18, 2018

It is true that semantic of client.connected() has changed some time ago.

Current semantic is:

  • client.connected() == true when the tcp link is still opened, meaning that:
    • we can write on it (remote is listening)
    • new data may be later received
  • client.available() == true means some data are currently buffered and available for reading now.

So client.available() can be true while client.connected() is not anymore, and that happens when remote sends data, then immediately closes before we read them all.

Thus a general inner communication loop could be:

    while (client.available() || client.connected())
    {
        if (client.available())
        {
            // client.available() bytes are immediately available for reading
            // warning: reading them *allows* peer to send more, so they should
            // be read *only* when they can immediately be processed, see below
            // for flow control
        }
        if (client.connected())
        {
            if (client.availableForWrite() >= N)
            {
                // TCP layer will be able to *bufferize* our N bytes,
                // and send them *asynchronously*, with by default
                // a small delay if those data are small
                // because Nagle is around (~200ms)
                // unless client.setNoDelay(true) was called.
                // 
                // In any case client.write() will *never* block here
            }
            else
            {
                // or we can send but it will take time because data are too
                // big to be asynchronously bufferized: TCP needs to receive
                // some ACKs to release its buffers.
                // That means that write() will block until it receives
                // authorization to send because we are not in a
                // multitasking environment

                // It is always OK to do this, client.availableForWrite() is
                // only needed when efficiency is a priority and when data
                // to send can wait where they currently are, especially
                // when they are in another tcp client.

                // Flow control:
                // It is also important to know that the ACKs we are sending
                // to remote are directly generated from client.read().
                // It means that:
                // Not immediately reading available data can be good for
                // flow control and avoid useless memory filling/overflow by
                // preventing peer from sending more data, and slow down
                // incoming bandwidth
                // (tcp is really a nice and efficient beast)
            }
        }
        
        // yield() may not be necessary:
        // - it is called in available()
        //   (when this order is respected: "while (available() || connected())")
        // - it is called when write() blocks
        yield();
    }

Historically, connected() was returning true even when tcp connection was just closed, and that caused some issues. I can find the exact origin of this change if that's desired.

@earlephilhower Is this this semantic the same when using BearSSL ?

We really should document that somewhere.
For core-2.5.0 we should revisit all examples and fix where relevant
(and also upgrade all axTLS examples to BearSSL).

@d-a-v d-a-v self-assigned this Oct 18, 2018
@devyte devyte added this to the 2.5.0 milestone Oct 18, 2018
@earlephilhower
Copy link
Collaborator

@d-a-v The SSL implementation of ::connected() is different from unencrypted in both axTLS and BearSSL.

For SSL "::connected()" means: "TCP is still connected OR there is some decrypted data in a SSL decrypted buffer."

axTLS:

uint8_t WiFiClientSecure::connected()
{
if (_ssl) {
if (_ssl->hasData()) {
return true;
}
if (_client && _client->state() == ESTABLISHED && _ssl->connected()) {
return true;
}
}
return false;
}

BearSSL copied this behavior:

uint8_t WiFiClientSecure::connected() {
if (available() || (_clientConnected() && _handshake_done)) {
return true;
}
return false;

@d-a-v
Copy link
Collaborator

d-a-v commented Oct 18, 2018

This change appeared in #4626.
If I remember well the issue happens when peer closes the connection while we are transmitting with data still available for reading.
In that case connected() was returning true, thus allowing to send data, and leading to an issue because the connection was closed.

This is a semantic change : connected() == allowed to write.
(it was: allowed to read and/or write)

Do you think it is wise to use the same semantic with BearSSL (and AxTLS) ?

Also, about availableForWrite() which is an addition (this one is only defined for Serial in arduino api),
do we know SSL overhead before knowing how much unencrypted data will passed to write() ?

@Bodmer
Copy link

Bodmer commented Oct 19, 2018

@d-a-v
I noticed lost data at the end of a json message parsing so deduced that there was still data available after disconnection and implemented the OR logic. However I do not understand why the calling order is important, ref your comment:

// available() must be tested first when using "(available() || connected())"

Under what conditions is this test order important.

@d-a-v
Copy link
Collaborator

d-a-v commented Oct 19, 2018

Under what conditions is this test order important.

Only if you don't call yield() or delay(0) yourself, because available() calls yield() but won't if tested after connected() which is true most of the time.

TD-er added a commit to TD-er/ESPEasy that referenced this issue Oct 19, 2018
TD-er added a commit to TD-er/ESPEasy that referenced this issue Oct 19, 2018
@marcelstoer
Copy link
Contributor Author

@d-a-v @Bodmer I propose we don't rely on such implementation details in our client implementation as they might change in the future w/o breaking API. Same goes for relying on how BearSSL and axTLS implemented connected().

The documented template should be robust enough to work reliably regardless of such subtleties (hence, yield() being part of the blue print).

@d-a-v
Copy link
Collaborator

d-a-v commented Nov 20, 2018

Made it a PR #5355.
Are something or some details missing ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants