Grizzly provider fails to handle HEAD with Content-Length header #150

cowtowncoder · 2012-10-12T22:16:14Z

I am trying to use Grizzly provider (v1.7.6), and noticed a timeout for simple HEAD request.
Since this is local test, with 15 second timeout, it looks like this is due to blocking.
Same does not happen with Netty provider.

My best guess to underlying problem is that Grizzly provider expects there to be content to read since Content-Length is returned. This would be incorrect assumption, since HTTP specification explicitly states that HEAD requests may contain length indicator, but there is never payload entity to return.

Looking at Netty provider code, I can see explicit handling for this use case, where connection is closed and any content flushes (in case server did send something).
I did not see similar handling in Grizzly provider, but since implementation code structure is very different it may reside somewhere else.

rlubke · 2012-10-15T19:14:00Z

I checked the Grizzly code, and there is logic to deal with HEAD requests and response bodies. I tested locally with AHC and didn't see any issues here.

Can you create a test case that I can run or perhaps get a thread dump when the hang occurs?

cowtowncoder · 2012-10-15T19:25:05Z

Just to be clear: my problem is not with HEAD with response bodies: I don't send response body (since spec forbids that). Rather, it's with no body, and Content-Length > 0, to indicate length one would get for GET.

I wish the project I am working on was public (it's at private github). AHC tests were pretty hard to follow last time I tried, but do you know of specific ones I could look at, as a starting point / template? I will go ahead myself and try to see -- challenge is that someone has refactored AHC codebase, so while it will be nicer separation, it is also quite different from last release (1.7.6)
I will try to see if I can find HEAD tests.

Hangup breakpoint might make sense, just need to raise timeout value to higher. Although with async operation it may be bit tricky.

rlubke · 2012-10-15T19:30:18Z

I understood what you meant. My response probably wasn't clear. I meant to say that it shouldn't expect content in the case of a HEAD request and so it shouldn't hang waiting for content.

You might look at api/src/test/java/com/ning/http/client/async/AsyncProvidersTest.asyncStatusHEADTest() as a template.

cowtowncoder · 2012-10-15T19:40:47Z

Ok gotcha. Just wanted to verify -- I often read bug reports too cursorily myself. :)

Thanks for the pointer & apologies for lack of test cases so far.

cowtowncoder · 2012-10-15T21:55:01Z

Looking at code I have, there is one other possible explanation: work-around for (old?) Netty prob with handling of chunk boundaries may be interfering. Hope to refactor code, then go back to trying out Grizzly provider.

Btw, what would be the latest stable grizzly version to use? AHC seems to use use "grizzly-websockets", 2.2.16.

rlubke · 2012-10-15T21:56:40Z

The latest is 2.2.19. AHC should work fine with it.

cowtowncoder · 2012-10-15T22:01:08Z

Ok. Tried upgrading to 2.2.19 (for some reason, websockets one is needed). Added explicit deps to grizzly-framework, grizzly-http, to reduce likelihood of transitive deps being to some old version.
Still get the timeout on HEAD, PUT, GET, DELETE work. Now need to figure out where to put the breakpoint when running test...

rlubke · 2012-10-15T22:05:04Z

The Grizzly WS library is needed for WS support within AHC. Grizzly WS is always released at the same time as the http and core frameworks, so the versions should always match.

Not sure if you need a breakpoint yet. If you extend the timeout, you could connect to the process using visualvm and obtain a thread dump.

cowtowncoder · 2012-10-15T22:17:59Z

This probably won't help a lot, but actual call is done using this part of API:

    final String path = resourcePath(server.resourceEndpoint(), contentId);
    try {
        BoundRequestBuilder reqBuilder = _httpClient.prepareHead(path);
        reqBuilder = addStandardParams(reqBuilder, config.getClientId());
        Future<Response> futurama = _httpClient.executeRequest(reqBuilder.build());
        Response resp;
        try {
            resp = futurama.get(timeout, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return new AHCHeadCallResult(CallFailure.timeout(server, startTime, System.currentTimeMillis()));
        }
        int statusCode = resp.getStatusCode();
        // one thing first: handle standard headers, if any?
        handleHeaders(server, resp, startTime);
        // call ok?
        if (!IOUtil.isHTTPSuccess(statusCode)) {
            // if not, why not? Any well-known problems? (besides timeout that was handled earlier)
            // then the default fallback
            String msg = getExcerpt(resp);
            return new AHCHeadCallResult(CallFailure.general(server, statusCode, startTime, System.currentTimeMillis(), msg));
        }
        FluentCaseInsensitiveStringsMap headers = resp.getHeaders();
        String lenStr = headers.getFirstValue(VagabondConstants.HTTP_HEADER_CONTENT_LENGTH);

and failure is via TimeoutException; timeout set at 15000 msecs (or whatever); timeout exception seems to be missing stack trace info during debugging.
Server does see HEAD request,

cowtowncoder · 2012-10-15T22:19:43Z

Oh Crivens! I can now make test pass by adding two things:

Changing timeout from 15000 msecs to 500 msecs
Adding 2 System.err.println() calls on server

so I suspect this may be a race condition of some kind. And those have been found from AHC codebase from time to time. :-(

For whatever it's worth, request is handled by a JAX-RS end point.

cowtowncoder · 2012-10-15T22:22:16Z

Oh. One realization on timeouts; this may be explained by fact that the FIRST HEAD call fails, but retry succeeds -- but this only happens when whole-operation timeout limit is not reached. With 15secs it is.
This would suggest that there is some state dependency in there.

cowtowncoder · 2012-10-16T02:34:28Z

I spent some more time ensuring that server implementation does not do something stupid, and as far as I can see it does not. I am also looking at AHC tests; one of HEAD tests is disabled, but it seems broken and not related to issue I am seeing.

At this point I think that the problem has to be related to reuse of the underlying HTTP connection; in my test I do a sequence of:

GET to see that resource does not exist (giving 404)
PUT resource
GET to verify contents
HEAD to verify access to headers (including Content-Length)

The fact that if retries are enabled and second try for HEAD works (with a "fresh" connection, I assume, since timeout probably forces closing of the first connection) suggests that this is somehow related to state handling.

I will next try to modify a test to do something like above.

cowtowncoder · 2012-10-16T03:55:18Z

Managed to write a unit test, similar Head302Test. After failing first, I managed to get it work; initial failure was not related (for HEAD, handler must still access the output stream). Test does GET, HEAD sequence with grizzly provider.

One additional difference is that test uses Async listener approach, whereas production code uses blocking executeRequest variant.

I will next try changing my HEAD handler to use bare async handler. Maybe I can at least work around the issue, even if I can not figure out what exactly is going on.

cowtowncoder · 2012-10-16T04:38:54Z

Change to pure async handler thing did not solve the problem, but it does give bit more visibility into state changes.
Callbacks are received as expected:

Status of 200
Headers, including Content-Length

but then I see two Throwables: first being java.io.IOException("Remotely Closed"), and second java.io.EOFException.
Both are passed to 'onThrowable', which is kind of interesting. I guess they are "normally" just eaten by a grue or something (ugh).
I'll check out stack traces, in case they might give some idea as to what is failing.

cowtowncoder · 2012-10-16T04:40:56Z

Here goes:

java.io.IOException: Remotely Closed
at com.ning.http.client.providers.grizzly.GrizzlyAsyncHttpProvider$AsyncHttpClientTransportFilter$1.failed(GrizzlyAsyncHttpProvider.java:740)
at org.glassfish.grizzly.filterchain.DefaultFilterChain.notifyFailure(DefaultFilterChain.java:559)
at org.glassfish.grizzly.filterchain.DefaultFilterChain.throwChain(DefaultFilterChain.java:443)
at org.glassfish.grizzly.filterchain.DefaultFilterChain.execute(DefaultFilterChain.java:162)
at org.glassfish.grizzly.filterchain.DefaultFilterChain.process(DefaultFilterChain.java:112)
at org.glassfish.grizzly.ProcessorExecutor.execute(ProcessorExecutor.java:78)
at org.glassfish.grizzly.nio.transport.TCPNIOTransport.fireIOEvent(TCPNIOTransport.java:770)
at org.glassfish.grizzly.strategies.AbstractIOStrategy.fireIOEvent(AbstractIOStrategy.java:112)
at org.glassfish.grizzly.strategies.WorkerThreadIOStrategy.run0(WorkerThreadIOStrategy.java:115)
at org.glassfish.grizzly.strategies.WorkerThreadIOStrategy.access$100(WorkerThreadIOStrategy.java:55)
at org.glassfish.grizzly.strategies.WorkerThreadIOStrategy$WorkerThreadRunnable.run(WorkerThreadIOStrategy.java:135)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:680)

and

java.io.EOFException
at org.glassfish.grizzly.nio.transport.TCPNIOTransport.read(TCPNIOTransport.java:875)
at org.glassfish.grizzly.nio.transport.TCPNIOTransportFilter.handleRead(TCPNIOTransportFilter.java:74)
at org.glassfish.grizzly.filterchain.TransportFilter.handleRead(TransportFilter.java:173)
at com.ning.http.client.providers.grizzly.GrizzlyAsyncHttpProvider$AsyncHttpClientTransportFilter.handleRead(GrizzlyAsyncHttpProvider.java:753)
at org.glassfish.grizzly.filterchain.ExecutorResolver$9.execute(ExecutorResolver.java:119)
at org.glassfish.grizzly.filterchain.DefaultFilterChain.executeFilter(DefaultFilterChain.java:265)
at org.glassfish.grizzly.filterchain.DefaultFilterChain.executeChainPart(DefaultFilterChain.java:200)
at org.glassfish.grizzly.filterchain.DefaultFilterChain.execute(DefaultFilterChain.java:134)
at org.glassfish.grizzly.filterchain.DefaultFilterChain.process(DefaultFilterChain.java:112)
at org.glassfish.grizzly.ProcessorExecutor.execute(ProcessorExecutor.java:78)
at org.glassfish.grizzly.nio.transport.TCPNIOTransport.fireIOEvent(TCPNIOTransport.java:770)
at org.glassfish.grizzly.strategies.AbstractIOStrategy.fireIOEvent(AbstractIOStrategy.java:112)
at org.glassfish.grizzly.strategies.WorkerThreadIOStrategy.run0(WorkerThreadIOStrategy.java:115)
at org.glassfish.grizzly.strategies.WorkerThreadIOStrategy.access$100(WorkerThreadIOStrategy.java:55)
at org.glassfish.grizzly.strategies.WorkerThreadIOStrategy$WorkerThreadRunnable.run(WorkerThreadIOStrategy.java:135)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:680)

so these are 2 exceptions caught during processing of problematic HEAD request.

cowtowncoder · 2012-10-16T04:46:48Z

Last comment of the day: timeout occurs for some reason 2 seconds after specified time (if I give 1500 msec, onThrowable gets called bit over 3500 msec after executing request). This on localhost, without real network access.
I guess there is probably some other additional internal timeout involved.

Also: in my test PUT plays a role; if it is commented out, failure does not occur. May try to add it in AHC test sequence tomorrow.

rlubke · 2012-10-16T04:47:20Z

OK, great - will dig in further on my side. Thanks!

rlubke · 2012-10-17T02:30:24Z

Could you please capture a wire trace of the http traffic for your failing case?

cowtowncoder · 2012-10-17T16:51:36Z

This is from unit tests on localhost. Wonder what would be the easiest way to capture it? (sorry it has been a while since I used tcpdump). I can easily run the test on Eclipse, or probably from command line.

rlubke · 2012-10-17T16:52:58Z

I'm a big fan of wireshark. Another good tool is ngrep.

cowtowncoder · 2012-10-17T20:22:58Z

Ok, I think I got stuff captured, exported as XML. But github issue tracker has no attachment capabilities.
So I'll need to find a place to upload it in... let's see:

https://jira.codehaus.org/browse/JACKSON-435

(see the attachment)

cowtowncoder · 2012-10-17T20:27:15Z

Data in question is just 12000 bytes of 0xAC (i.e. data is correct). But same happens with smaller payloads (120 bytes whatever).

cowtowncoder · 2012-10-18T02:24:56Z

One thing I started thinking is that perhaps I am not reading response that PUT is returning; and that could mess up handling.

rlubke · 2013-08-12T21:07:36Z

Closing this issue out as I'm unable to reproduce. Please feel free to re-open when you have time to revisit and have a test case we can run.

rlubke closed this as completed Aug 12, 2013

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Grizzly provider fails to handle HEAD with Content-Length header #150

Grizzly provider fails to handle HEAD with Content-Length header #150

cowtowncoder commented Oct 12, 2012

rlubke commented Oct 15, 2012

cowtowncoder commented Oct 15, 2012

rlubke commented Oct 15, 2012

cowtowncoder commented Oct 15, 2012

cowtowncoder commented Oct 15, 2012

rlubke commented Oct 15, 2012

cowtowncoder commented Oct 15, 2012

rlubke commented Oct 15, 2012

cowtowncoder commented Oct 15, 2012

cowtowncoder commented Oct 15, 2012

cowtowncoder commented Oct 15, 2012

cowtowncoder commented Oct 16, 2012

cowtowncoder commented Oct 16, 2012

cowtowncoder commented Oct 16, 2012

cowtowncoder commented Oct 16, 2012

cowtowncoder commented Oct 16, 2012

rlubke commented Oct 16, 2012

rlubke commented Oct 17, 2012

cowtowncoder commented Oct 17, 2012

rlubke commented Oct 17, 2012

cowtowncoder commented Oct 17, 2012

cowtowncoder commented Oct 17, 2012

cowtowncoder commented Oct 18, 2012

rlubke commented Aug 12, 2013

Grizzly provider fails to handle HEAD with Content-Length header #150

Grizzly provider fails to handle HEAD with Content-Length header #150

Comments

cowtowncoder commented Oct 12, 2012

rlubke commented Oct 15, 2012

cowtowncoder commented Oct 15, 2012

rlubke commented Oct 15, 2012

cowtowncoder commented Oct 15, 2012

cowtowncoder commented Oct 15, 2012

rlubke commented Oct 15, 2012

cowtowncoder commented Oct 15, 2012

rlubke commented Oct 15, 2012

cowtowncoder commented Oct 15, 2012

cowtowncoder commented Oct 15, 2012

cowtowncoder commented Oct 15, 2012

cowtowncoder commented Oct 16, 2012

cowtowncoder commented Oct 16, 2012

cowtowncoder commented Oct 16, 2012

cowtowncoder commented Oct 16, 2012

cowtowncoder commented Oct 16, 2012

rlubke commented Oct 16, 2012

rlubke commented Oct 17, 2012

cowtowncoder commented Oct 17, 2012

rlubke commented Oct 17, 2012

cowtowncoder commented Oct 17, 2012

cowtowncoder commented Oct 17, 2012

cowtowncoder commented Oct 18, 2012

rlubke commented Aug 12, 2013