Keep-Alive causing long delays #32

Closed
dgrijalva opened this Issue Sep 6, 2012 · 16 comments

4 participants

@dgrijalva

Using this most basic falcore app:

package main

import (
    "net/http"
    "github.com/ngmoco/falcore"
)

func main() {
    pipeline := falcore.NewPipeline()
    pipeline.Upstream.PushBack(falcore.NewRequestFilter(func(req *falcore.Request) *http.Response {
        return falcore.SimpleResponse(req.HttpRequest, 200, nil, "OK\n")
    }))

    server := falcore.NewServer(8081, pipeline)
    server.ListenAndServe()
}

A simple request is very fast:

Daves-MacBook-Air-2:ka_test dgrijalva$ time curl http://localhost:8081/
OK

real    0m0.015s
user    0m0.008s
sys     0m0.005s

If you add a Connection:Keep-Alive header, it always takes just over 5 seconds:

Daves-MacBook-Air-2:ka_test dgrijalva$ time curl http://localhost:8081/ -H"Connection:Keep-Alive"
OK

real    0m5.298s
user    0m0.005s
sys     0m0.003s
@dgrijalva

This is on OSX Lion and Snow Lion.

@dgrijalva

Smaller, but similar results on ubuntu:

dgrijalva@domU-12-31-39-16-28-95:~$ time curl localhost:9000
OK

real    0m0.010s
user    0m0.000s
sys     0m0.000s
dgrijalva@domU-12-31-39-16-28-95:~$ time curl localhost:9000 -H"Connection:Keep-Alive"
OK

real    0m0.241s
user    0m0.000s
sys     0m0.000s
@smw1218
smw1218 commented Sep 18, 2012

Does 23e3e56 fix this? Or is that another issue?

@dgrijalva

It does not. That commit fixes the issue that we were telling the client we were disconnecting, then not doing it.

It does allow for a workaround, since closing the connection force-flushes the connection. It does not improve performance when keepalive is enabled.

@smw1218
smw1218 commented Sep 18, 2012

It appears to be caused by:

    if e := syscall.SetsockoptInt(fd, syscall.IPPROTO_TCP, srv.sockOpt, 1); e != nil {
        return e
    }

in server_notwindows.go. This was part of the sendfile merge and I'm not sure what it does.

@dgrijalva

@gnanderson Got any ideas?

@dgrijalva

Okay. I got it. See this: http://www.baus.net/on-tcp_cork

We're setting TCP_CORK or TCP_NOPUSH depending on platform. That flag asks the kernel not to send packets until one of the following happens:

  • the flag is turned off
  • the maximum packet size is reached
  • the maximum timeout is reached (apparently 200ms on linux and 5s on mac)
  • the connection is going to close

It appears the appropriate behavior is to cycle this setting between messages to force-flush. Looking into that now.

Also, there are some platform specific differences in behavior, so that's a thing.
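For readers following along, the "cycle the setting to force-flush" idea above can be sketched roughly as follows. This is not the code from the fixing commit; the helper names (setCork, flushCork, corkedRoundTrip) are made up for illustration, and the demo uses syscall.TCP_CORK, so it builds on Linux only. On macOS/BSD the option would be syscall.TCP_NOPUSH, and whether clearing it flushes right away may differ, per the platform caveat above.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"syscall"
)

// setCork sets or clears a cork-style TCP option on conn. sockOpt is the
// platform's option number (syscall.TCP_CORK on Linux, syscall.TCP_NOPUSH on
// macOS/BSD), playing the role of srv.sockOpt in the snippet quoted earlier.
func setCork(conn *net.TCPConn, sockOpt int, on bool) error {
	raw, err := conn.SyscallConn()
	if err != nil {
		return err
	}
	v := 0
	if on {
		v = 1
	}
	var sockErr error
	ctrlErr := raw.Control(func(fd uintptr) {
		sockErr = syscall.SetsockoptInt(int(fd), syscall.IPPROTO_TCP, sockOpt, v)
	})
	if ctrlErr != nil {
		return ctrlErr
	}
	return sockErr
}

// flushCork turns the option off and back on. Turning it off asks the kernel
// to send whatever is queued; turning it on again re-corks the connection for
// the next response.
func flushCork(conn *net.TCPConn, sockOpt int) error {
	if err := setCork(conn, sockOpt, false); err != nil {
		return err
	}
	return setCork(conn, sockOpt, true)
}

// corkedRoundTrip (hypothetical demo) writes msg on a corked loopback
// connection, cycles the cork, and returns what the peer received.
func corkedRoundTrip(msg string) (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer ln.Close()

	client, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return "", err
	}
	defer client.Close()

	accepted, err := ln.Accept()
	if err != nil {
		return "", err
	}
	defer accepted.Close()
	server := accepted.(*net.TCPConn)

	// Linux-only constant; an assumption of this sketch.
	if err := setCork(server, syscall.TCP_CORK, true); err != nil {
		return "", err
	}
	if _, err := server.Write([]byte(msg)); err != nil {
		return "", err
	}
	if err := flushCork(server, syscall.TCP_CORK); err != nil {
		return "", err
	}

	buf := make([]byte, len(msg))
	if _, err := io.ReadFull(client, buf); err != nil {
		return "", err
	}
	return string(buf), nil
}

func main() {
	got, err := corkedRoundTrip("OK\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("received %q\n", got)
}
```

In a server, flushCork would be called once per response on a kept-alive connection, so the last partial packet does not sit in the kernel until the cork timeout expires.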

@dgrijalva

This should do the trick. @smw1218, thanks for identifying the issue.

@dgrijalva dgrijalva added a commit that closed this issue Sep 18, 2012
@dgrijalva dgrijalva fixes #32 42ba4fe
@dgrijalva dgrijalva closed this in 42ba4fe Sep 18, 2012
@gnanderson

Sorry, out of town right now. TCP_NODELAY should also stop any long waits for the TCP buffer to fill when using keep-alive; I suspect it may be a better (higher-throughput) option for small API-style requests. One caveat: a while back, when using TCP_NODELAY, I was sniffing some traffic on the wire and then looked inside http.Response.Write(). There are multiple call sites where a few extra syscalls might go out while writing the response if the writer argument to http.Response.Write(w io.Writer) is not a buffered writer.

It may make more sense to use TCP_NODELAY and toggle TCP_CORK/NOPUSH only for file serving, as it's really in conjunction with the sendfile syscall that you see the main benefit from it.
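The point about unbuffered writers can be checked with a small stand-alone sketch. It counts how many Write calls http.Response.Write makes on a plain writer versus one wrapped in bufio.Writer. The countingWriter type and the helper names are hypothetical, and the counter stands in for a real socket, where each Write would be a syscall (and, with TCP_NODELAY, potentially its own packet). For reference, Go's net.TCPConn already enables no-delay by default; (*net.TCPConn).SetNoDelay changes it.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net/http"
	"strings"
)

// countingWriter records how many times Write is called. It stands in for a
// socket in this sketch.
type countingWriter struct{ writes int }

func (c *countingWriter) Write(p []byte) (int, error) {
	c.writes++
	return len(p), nil
}

// newResponse builds a small response similar to the "OK\n" reply in this
// thread.
func newResponse() *http.Response {
	body := "OK\n"
	return &http.Response{
		StatusCode:    200,
		ProtoMajor:    1,
		ProtoMinor:    1,
		Header:        http.Header{"Content-Type": []string{"text/plain"}},
		Body:          io.NopCloser(strings.NewReader(body)),
		ContentLength: int64(len(body)),
	}
}

// countWrites serializes one response and reports how many Write calls
// reached the underlying writer, with or without a bufio.Writer in between.
func countWrites(buffered bool) (int, error) {
	sink := &countingWriter{}
	if !buffered {
		err := newResponse().Write(sink)
		return sink.writes, err
	}
	bw := bufio.NewWriter(sink)
	if err := newResponse().Write(bw); err != nil {
		return sink.writes, err
	}
	// Flush hands the whole serialized response to the sink at once.
	err := bw.Flush()
	return sink.writes, err
}

func main() {
	direct, err := countWrites(false)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	buffered, err := countWrites(true)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("unbuffered writes:", direct)
	fmt.Println("buffered writes:  ", buffered)
}
```

If TCP_NODELAY is used for small responses, wrapping the connection in a bufio.Writer and flushing once per response keeps the status line, headers, and body together in a single write.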

@dgrijalva
@gleecology

hi - i see this issue closed, but wanted to mention that K/A didn't work for me using the distro examples

startup (assumed: linux/ubuntu 12.04; go version 1.0.3 ):

go run examples/hello_world/hello_world.go

this works:
ab -n 200 -c 5 http://localhost:8000/
this does not:
ab -n 200 -c 5 -k http://localhost:8000/

Something missed here?

@dgrijalva

I believe ab doesn't use keep-alive. I could be wrong about that.

@smw1218
smw1218 commented Apr 3, 2013

No it does not. The keep-alive option never reuses a connection. If you run the test with more requests, you end up running out of sockets since all the connections will stay open but it opens a new one for every request. The -k option also, stupidly, doesn't actually add the keep alive header for you. You can try ab -n 200 -c 5 -k -H"Connection: Keep-Alive" http://localhost:8000/ but it will open 1000 connections instead of 5.
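Independent of ab, one way to check whether a server actually reuses connections is Go's own HTTP client with net/http/httptrace, which reports per request whether the connection came from the idle pool. The sketch below is an assumption-laden stand-in: it runs against a local httptest server rather than falcore, and reuseFlags and localReuseFlags are names invented here. Pointing reuseFlags at http://localhost:8000/ with http.DefaultClient would exercise the hello_world example instead.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httptrace"
)

// reuseFlags sends n sequential GET requests to url and records, for each,
// whether the client reused an existing keep-alive connection.
func reuseFlags(client *http.Client, url string, n int) ([]bool, error) {
	flags := make([]bool, 0, n)
	for i := 0; i < n; i++ {
		var reused bool
		trace := &httptrace.ClientTrace{
			GotConn: func(info httptrace.GotConnInfo) { reused = info.Reused },
		}
		req, err := http.NewRequest("GET", url, nil)
		if err != nil {
			return nil, err
		}
		req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))

		resp, err := client.Do(req)
		if err != nil {
			return nil, err
		}
		// Drain and close the body so the transport can return the
		// connection to its idle pool for the next request.
		_, err = io.Copy(io.Discard, resp.Body)
		resp.Body.Close()
		if err != nil {
			return nil, err
		}
		flags = append(flags, reused)
	}
	return flags, nil
}

// localReuseFlags runs reuseFlags against a throwaway local server that
// answers "OK\n", standing in for the falcore app.
func localReuseFlags(n int) ([]bool, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, "OK\n")
	}))
	defer srv.Close()
	return reuseFlags(srv.Client(), srv.URL, n)
}

func main() {
	flags, err := localReuseFlags(3)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for i, reused := range flags {
		fmt.Printf("request %d reused connection: %v\n", i+1, reused)
	}
}
```

With working keep-alive, every request after the first should report a reused connection; a server that closes or stalls the connection after each response would show no reuse or an error.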

@gleecology

not trolling, but ab doc says:

-k
Enable the HTTP KeepAlive feature, i.e., perform multiple requests within one HTTP session. Default is no KeepAlive.

More to the point, after the first conn, it fails (even whilst adding the Connection: header):

--debug--

ab -v 6 -n 3 -c 1 -k http://localhost:8000/

This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)...INFO: POST header ==

GET / HTTP/1.0
Connection: Keep-Alive
Host: localhost:8000
User-Agent: ApacheBench/2.3
Accept: */*


LOG: header received:
HTTP/1.1 200 OK
Content-Length: 12

hello world!
LOG: Response code = 200
apr_poll: The timeout specified has expired (70007)


...To this ignorant layman, KA looks broken.

@smw1218
smw1218 commented Apr 4, 2013

Opened a new issue #44

What is happening to you does not appear to be the same as what this issue describes so we should start fresh.
