
HTTP client fixes #2163

Merged
merged 11 commits into develop from fix/http-close-handling on Dec 4, 2020

Conversation

mikee47
Contributor

@mikee47 mikee47 commented Dec 1, 2020

  • Improvements to pipelining.
  • Improvements to retrying the current request (as is the case for URLs protected with digest authentication).
  • Fixed POST requests sending files and/or form data.
  • Small improvements to the handling of failed writes in TcpConnection.

@slaff
Contributor

slaff commented Dec 2, 2020

@mikee47 I added one more commit to this PR which addresses possible problems caused by pipelining: 3a248f9. Can you take a look at it and test it on your Windows or Linux VM?

@slaff slaff added this to the 4.2.0 milestone Dec 2, 2020
@mikee47
Contributor Author

mikee47 commented Dec 2, 2020

@slaff Looks fine, works well! Thank you.

@slaff slaff changed the title Fix handling of closed HTTP connection HTTP client fixes Dec 2, 2020
@@ -60,6 +60,7 @@ class MultipartStream : public MultiStream
Producer producer;
BodyPart bodyPart;
char boundary[16] = {0};
bool finished = false;
Contributor Author

This shadows MultiStream::finished...
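For context, this is the classic C++ member-shadowing pitfall the review comment is pointing at. A minimal sketch with hypothetical `StreamBase`/`StreamDerived` names (not the actual MultiStream classes): a derived-class data member hides the base-class member of the same name, so code working through the base class sees a different value than the one the derived class sets.

```cpp
#include <cassert>

// Hypothetical reduction of the shadowing issue; names are illustrative,
// not the real Sming classes.
struct StreamBase {
    bool finished = false;
};

struct StreamDerived : StreamBase {
    bool finished = false;              // shadows StreamBase::finished
    void markDone() { finished = true; } // updates only the derived copy
};

// Base-class code reads StreamBase::finished, which markDone() never touches.
inline bool baseSeesFinished(const StreamBase& s)
{
    return s.finished;
}
```

Renaming the derived member (as the PR later does with `finished` -> `footerSent`) removes the ambiguity entirely.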

@slaff
Contributor

slaff commented Dec 3, 2020

@mikee47 While trying to make the HttpClient work again I have noticed that the LimitedMemoryStream was broken in this PR #2119.

The change to isFinished(), shown below,

	bool isFinished() override
	{
		return (readPos >= capacity);
	}

causes an endless loop when trying to copy the data:

size_t ReadWriteStream::copyFrom(IDataSourceStream* source, size_t size)
{
	if(source == nullptr || size == 0) {
		return 0;
	}

	auto bufSize = std::min(size, maxBufferSize);
	char buffer[bufSize];
	size_t total = 0;
	size_t count;
	while(!source->isFinished()) {
		count = source->readMemoryBlock(buffer, sizeof(buffer));
		if(count == 0) {
			continue;
		}
		auto written = write(reinterpret_cast<uint8_t*>(buffer), count);
		source->seek(written);
		total += count;
		if(written != count) {
			break;
		}
	}

	return total;
}

called from here:

int onDownload(HttpConnection& connection, bool success)
{
	debugf("\n=========[ URL: %s ]============", connection.getRequest()->uri.toString().c_str());
	debugf("RemoteIP: %s", connection.getRemoteIp().toString().c_str());
	debugf("Got response code: %d", connection.getResponse()->code);
	debugf("Success: %d", success);
	if(connection.getRequest()->method != HTTP_HEAD) {
		Serial.print(_F("Got content: "));
		auto stream = connection.getResponse()->stream;
		if(stream == nullptr || stream->available() == 0) {
			Serial.println("EMPTY!");
		} else {
			Serial.copyFrom(stream);

IMHO the isFinished method here should be just:

	bool isFinished() override
	{
		return (available() == 0);
	}

@mikee47 Can you create a separate PR with a fix?
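To make the failure mode above concrete, here is a hedged reduction (hypothetical `MiniStream`, not the real LimitedMemoryStream) showing both predicates side by side. With `readPos >= capacity`, a partially filled stream never reports finished even after all written data has been consumed, so `copyFrom()` keeps looping on `readMemoryBlock()` returning 0; the `available() == 0` form terminates correctly.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Illustrative model of a fixed-capacity memory stream that was only
// partially filled with data.
struct MiniStream {
    static constexpr size_t capacity = 64; // allocated buffer size
    size_t length = 16;                    // bytes actually written
    size_t readPos = 0;

    size_t available() const { return length - readPos; }

    // Predicate as broken in PR #2119: compares against the buffer
    // capacity, so it stays false once the written data runs out.
    bool isFinishedBroken() const { return readPos >= capacity; }

    // Proposed fix: finished when no readable data remains.
    bool isFinishedFixed() const { return available() == 0; }

    // Consume up to n bytes, returning how many were actually read.
    size_t read(size_t n)
    {
        size_t count = std::min(n, available());
        readPos += count;
        return count;
    }
};
```

After draining the 16 written bytes, the broken predicate still reports "not finished" while reads return 0 — exactly the spin described in the `copyFrom()` loop above.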

@@ -315,6 +315,7 @@ void HttpClientConnection::onClosed()
System.queueCallback([this]() {
cleanup();
init(HTTP_RESPONSE);
state = eHCS_Ready;
Contributor Author

Should this be done in init() ?

Contributor

Should this be done in init() ?

Yes, indeed. I will add this correction in a second...

@slaff
Contributor

slaff commented Dec 3, 2020

@mikee47 There is a problem with SSL and sending data in TcpConnection. I will also address this, but it will be part of a separate PR.

mikee47 and others added 6 commits December 3, 2020 13:37
There appear to be a couple of things at play here:

1. Need to move items off execution back onto waiting queue - cleanup() does this
2. Don't restart immediately, but defer using the task queue - lets the TCP stack sort itself out first
…s that don't support keep-alive connections.

Initial conditions
------------------
3 consecutive GET requests + 1 POST request go to the same HTTP server.

Before this change
------------------
The HTTP client starts using HTTP pipelining (https://en.wikipedia.org/wiki/HTTP_pipelining) without making sure that the HTTP server supports it.
It will try to send the first 3 HTTP requests in one go. If the HTTP server does not support pipelining, it will either return a bad request response or a normal response
with a Connection: close header. The first request will be successful, but then the connection will be closed and the remaining two requests will be moved back to the waiting queue.
They will end up AFTER the POST request, which is a **problem**.

Once the POST request has completed, the two GET requests will be moved back and forth between the waiting and execution queues, which is another **problem**.

After this change
-----------------
The first GET request will be sent but the others will wait in the waiting queue.
Once the request has completed the Connection response header will be analyzed. 

a) If the header is Connection: close, then the connection will be closed and there will be no need to move anything from the execution queue back to the waiting queue,
as there will be 0 requests in execution and the rest will be in the waiting queue.

b) If there is no Connection: close header in the response, the HTTP client will assume that the HTTP server supports pipelining and will allow pipelining consecutive requests
in the same HTTP client connection. This way the remaining 2 GET requests will be pipelined and the POST request will be executed at the right time.
…in wrong multipart boundary headers and endless loop.

This commit removes the introduced problems.
Add BodyPart::bool() operator as it provides an opportunity to clarify how it's used
Rename `finished` -> `footerSent`
Retrying the current request will keep its order and not end up at the end of the execution queue.
Fixes a problem with requests that go to URLs protected with digest authentication (this requires the same request to be sent twice - first without authentication headers to get the tokens, and a second time with authentication and calculated checksums).
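The before/after pipelining behaviour described in the commit messages above boils down to a single check on the first response: only pipeline further requests if the server did not ask to close the connection. A minimal sketch under that assumption (hypothetical `allowPipelining` helper, not Sming's actual API):

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hedged sketch of the decision rule: after the first response on a
// connection, consecutive requests may be pipelined only when the
// server did not send "Connection: close". Header values are
// case-insensitive, so normalise before comparing.
inline bool allowPipelining(std::string connectionHeader)
{
    std::transform(connectionHeader.begin(), connectionHeader.end(),
                   connectionHeader.begin(),
                   [](unsigned char c) { return std::tolower(c); });
    return connectionHeader != "close";
}
```

With this rule, the first GET goes out alone; if the response carries Connection: close, nothing is ever queued behind it on that connection, so there is nothing to shuffle back to the waiting queue.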
@slaff
Contributor

slaff commented Dec 3, 2020

@mikee47 Can you compile the latest code from this PR and test if you still have the problems described in #2158?

@mikee47
Contributor Author

mikee47 commented Dec 3, 2020

@slaff I got the failed requests again, so re-tested with the previous working commit and still failing! So I increased the scanner fetchInterval from 250 to 500 (ms) and it started working again! However, behaviour with/without the task queue is the same; I re-checked in the debugger and the stack usage is minimal so looks like that isn't necessary and commit 8bb1312 is fine.

I then revisited the issue with using the task queue in ControlPoint - https://github.com/mikee47/Sming-UPnP/blob/3e66a17cbd9c9287485a4a2b177a79a0744d2f64/src/ControlPoint.cpp#L211 onwards. The issue persists, and is definitely related to this one. I need to improve the debug output so requests can be correlated with responses more easily; then perhaps the logs will shed more light on exactly what's happening...

@mikee47
Contributor Author

mikee47 commented Dec 3, 2020

So at present we can retry sending headers but the body stream also needs to be recovered. I started attempting to fix that by restoring and resetting the stream (see commit for StreamTransformer::seekFrom()) but of course by that time it's been destroyed by the call to freeStreams() in TcpClient::pushAsyncPart().

Is there some way we can pre-empt all this and avoid starting a new request if the connection is due to be closed?
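The retry problem described above can be illustrated with a hedged sketch (hypothetical `BodyStream`/`canRetry` names, not Sming's TcpClient internals): resending a request is only possible if the body stream is still alive and can be rewound to the start; once `freeStreams()` has destroyed it, there is nothing left to seek.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative model of a request body stream that may have been
// partially consumed by a failed send, or destroyed entirely.
struct BodyStream {
    size_t pos = 0;        // current read position
    size_t size = 100;     // total body length
    bool destroyed = false; // set once the connection frees its streams

    // Attempt to reposition; fails if the stream is gone or the
    // target position is out of range.
    bool seekTo(size_t newPos)
    {
        if (destroyed || newPos > size) {
            return false;
        }
        pos = newPos;
        return true;
    }
};

// A retry is only viable when the body can be rewound to the beginning.
inline bool canRetry(BodyStream& body)
{
    return body.seekTo(0);
}
```

This is why the comment asks about pre-empting the situation: deciding *before* sending that the connection will close avoids ever needing to rewind a stream that may already be gone.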

@slaff slaff force-pushed the fix/http-close-handling branch 2 times, most recently from 36bb982 to 198b488 Compare December 4, 2020 08:48
@slaff
Contributor

slaff commented Dec 4, 2020

@mikee47 Once Travis is ready I will merge this PR into develop. It does not solve your problem, but at least it has some useful fixes. Any idea how I can reproduce the problem that you're experiencing on my machine? That would help me step through with the debugger and correlate network traffic with a live Wireshark capture.

@mikee47
Contributor Author

mikee47 commented Dec 4, 2020

@slaff I got the same issue running the Basic_ControlPoint sample on an ESP8266. I'll re-check to confirm and get back to you.

@slaff slaff merged commit da225ba into SmingHub:develop Dec 4, 2020
@slaff
Contributor

slaff commented Dec 4, 2020

I'll re-check to confirm and get back to you.

Get the latest develop and rebase on it. Compile your sample using the following commands (adjust accordingly):

cd <sample-dir>
make config-clean
make dist-clean
make SMING_ARCH=Host ENABLE_GDB=1 DEBUG_VERBOSE_LEVEL=3 

@slaff
Contributor

slaff commented Dec 4, 2020

There is a problem with SSL and sending data in the TcpConnection. I will address also this but it will be part of a separate PR.

See #2173 .

@mikee47 mikee47 deleted the fix/http-close-handling branch December 4, 2020 15:00
@mikee47
Contributor Author

mikee47 commented Dec 4, 2020

@slaff I've rebuilt and re-tested as follows:

cd %SMING_HOME%
make dist-clean
make fetch UPnP
cd Libraries\UPnP\samples\Basic_ControlPoint
make -j20 SMING_ARCH=Host ENABLE_GDB=1 DEBUG_VERBOSE_LEVEL=3
make run

The attached log initial-run shows the results; all looks fine.

I then checked out the fix/http-debugging branch of UPnP d7ccee6e6f07b7742ae661bcd7a16978eda32e38:

make -j20 Sming-build all
make run

The results are in controlpoint-task-queue, and it should be visible that the response to the first action request gets lost in the HTTP connection closing business.

When a device has been discovered and created by ControlPoint, the client callback creates action requests and submits them.
The client should be able to successfully send action requests at any time, so deferring the callback is a good test for this.
I've considered delaying the callback for a fixed time (e.g. 2 seconds) to allow any connection closing to complete - that's more of a workaround, though, and would just mask the problem.

debug_logs.zip

The action requests are for ContentDirectory services on my Archer VR900 router - it has a USB stick with audio files on it. See https://www.tp-link.com/uk/support/download/archer-vr900/v1/. Here are the relevant DD fields:

        <deviceType>urn:schemas-upnp-org:device:MediaServer:1</deviceType>
        <manufacturer>Justin Maggard</manufacturer>
        <manufacturerURL>http://www.tp-link.com/</manufacturerURL>
        <modelDescription>MiniDLNA on Linux</modelDescription>
        <modelName>Windows Media Connect compatible (MiniDLNA)</modelName>
        <X_DLNADOC xmlns:dlna="urn:schemas-dlna-org:device-1-0">DMS-1.50</X_DLNADOC>

You could probably run MiniDLNA on another device to check this - it seems very unlikely that the router itself is the problem, but I'll see if I can find an alternative device to test against.
