Requests from an HTTP Connection Pool hang after a couple of hours of running #908

Closed
pradyuman opened this issue Feb 28, 2017 · 21 comments
Labels
1 - triaged (tickets that are safe to pick up for contributing in terms of likeliness of being accepted), bug, t:client (issues related to the HTTP Client), t:core (issues related to the akka-http-core module)

Comments

@pradyuman

I have an HTTP Connection Pool that hangs after a couple of hours of running:

private def createHttpPool(host: String): SourceQueue[(HttpRequest, Promise[HttpResponse])] = {
  val pool = Http().cachedHostConnectionPoolHttps[Promise[HttpResponse]](host)
  Source.queue[(HttpRequest, Promise[HttpResponse])](config.poolBuffer, OverflowStrategy.dropNew)
    .via(pool).toMat(Sink.foreach {
      case ((Success(res), p)) => p.success(res)
      case ((Failure(e), p)) => p.failure(e)
    })(Keep.left).run
}

I enqueue items with:

private def enqueue(uri: Uri): Future[HttpResponse] = {
  val promise = Promise[HttpResponse]
  val request = HttpRequest(uri = uri) -> promise

  queue.offer(request).flatMap {
    case Enqueued => promise.future
    case _ => Future.failed(ConnectionPoolDroppedRequest)
  }
}  

And resolve the response like this:

private def request(uri: Uri): Future[HttpResponse] = {
  def retry = {
    logger.info(s"retrying in ${config.dispatcherRetryInterval} milliseconds")
    akka.pattern.after(config.dispatcherRetryInterval.millis, using = actorSystem.scheduler)(request(uri))
  }

  logger.info("req-start")
  for {
    response <- enqueue(uri)

    _ = logger.info("req-end")

    finalResponse <- response.status match {
      case TooManyRequests => retry
      case OK => Future.successful(response)
      case _ => response.entity.toStrict(10.seconds).map(s => throw Error(s.toString, uri.toString))
    }
  } yield finalResponse
}

The result of this function is then always transformed if the Future is successful:

def get(uri: Uri): Future[Try[JValue]] = {
  for {
    response <- request(uri)
    json <- Unmarshal(response.entity).to[Try[JValue]]
  } yield json
}

Everything works fine for a while and then all I see in the logs are req-start and no req-end.

My akka configuration is like this:

akka {
  actor.deployment.default {
    dispatcher = "charon-dispatcher"
  }
}

charon-dispatcher {
  type = Dispatcher
  executor = "fork-join-executor"

  fork-join-executor {
    parallelism-min = 256
    parallelism-factor = 128.0
    parallelism-max = 1024
  }
}

akka.http {
  host-connection-pool {
    max-connections = 512
    max-retries = 5
    max-open-requests = 16384
    pipelining-limit = 1
  }
}

I'm not sure whether this is a configuration problem or a code problem. My parallelism and connection numbers are this high because without them I get a very poor req/s rate (I want to request as fast as possible; I have other rate-limiting code to protect the server).

This is a thread dump of the program state when the requests start hanging:
gist.github.com/pradyuman/bf83a8f3a293d8c679fcb6dc5f566a80

@pradyuman changed the title from "Requests hang after a couple of hours of running" to "Requests from an HTTP Connection Pool hang after a couple of hours of running" Feb 28, 2017
@jrudolph added the "1 - triaged" and "bug" labels Feb 28, 2017
@jrudolph
Member

Thanks, @pradyuman for this report. Can you tell us a bit more?

  • What does "hang" mean exactly?
  • What's your usual request rate?
  • How many requests fail with network problems?
  • Can you check with netstat if connections themselves are still alive?
  • If we could provide you with a custom akka-http version that logs more data, could you rerun and share the log with us?

@pradyuman
Author

What does "hang" mean exactly?
By hang I mean I enqueue the request into the pool but never get a response. My logs show "req-start" but no more "req-end" (these logs have some more context - I simplified the above example).

What's your usual request rate?
My usual request rate is 30 req/s each for two endpoints and 70 req/s for a third (130 req/s in total). I'm not sure if this is the max we can get - I tried reconfiguring the host pool multiple times but ended up settling on this configuration.

How many requests fail with network problems?
After a certain point all requests fail. Another internal service that depends on this one is then backpressured and stops sending requests.

Can you check with netstat if connections themselves are still alive?
I can check with netstat in a few hours - it's currently running in a docker container so I'll need to figure out how to use netstat with it.

If we could provide you with a custom akka-http version that logs more data, could you rerun and share the log with us?
I can definitely run my program with a custom akka-http version. I can also privately share the source code of the program if that will help (it's not very big - only 300-350 relevant lines of code).

@jrudolph
Member

Cool, thanks. I'll get back to you once I've added some more logging.

@jrudolph
Member

@pradyuman I improved logging in #913. I published this PR to our bintray repository as version 10.0.4+10-a01c084d. Can you run this with DEBUG logging enabled and share the log with us? You can send it to johannes.rudolph@lightbend.com.

resolvers += "akka bintray" at "https://dl.bintray.com/akka/maven/"

libraryDependencies += "com.typesafe.akka" %% "akka-http" % "10.0.4+10-a01c084d"

@pradyuman
Author

pradyuman commented Feb 28, 2017

I'll do this ASAP and send you the logs once it fails. Do you want me to remove any log messages from my code? Also, where do I specify DEBUG logging?

There seems to be an ECS outage right now (perhaps related to the S3 problems), so ECS can't launch any services and I can't test this until Amazon fixes it.

@pradyuman
Author

Update on this - I couldn't get sbt to resolve the dependency with the above configuration. Is there something I'm missing?

resolvers += "akka bintray" at "https://dl.bintray.com/akka/maven/"

libraryDependencies ++= Seq(
  "ch.qos.logback" % "logback-classic" % "1.1.9",
  "net.logstash.logback" % "logstash-logback-encoder" % "4.8",
  "com.typesafe.akka" %% "akka-actor" % "2.4.17",
  "com.typesafe.akka" %% "akka-contrib" % "2.4.17",
  "com.typesafe.akka" %% "akka-http" % "10.0.4+10-a01c084d",
//  "com.typesafe.akka" %% "akka-http" % "10.0.4",
  "org.spire-math" %% "jawn-ast" % "0.10.4"
)

@jrudolph
Member

jrudolph commented Mar 2, 2017

Hmm, strange. In the resolution error output it should show which URLs sbt tried to find the file. Can you post these?

@pradyuman
Author

[warn] 	module not found: com.typesafe.akka#akka-http_2.12;10.0.4+10-a01c084d
[warn] ==== local: tried
[warn]   /Users/Pradyuman/.ivy2/local/com.typesafe.akka/akka-http_2.12/10.0.4+10-a01c084d/ivys/ivy.xml
[warn] ==== public: tried
[warn]   https://repo1.maven.org/maven2/com/typesafe/akka/akka-http_2.12/10.0.4+10-a01c084d/akka-http_2.12-10.0.4+10-a01c084d.pom
[warn] ==== activator-launcher-local: tried
[warn]   /Users/Pradyuman/.activator/repository/com.typesafe.akka/akka-http_2.12/10.0.4+10-a01c084d/ivys/ivy.xml
[warn] ==== activator-local: tried
[warn]   /Users/Pradyuman/Downloads/activator-dist-1.3.12/repository/com.typesafe.akka/akka-http_2.12/10.0.4+10-a01c084d/ivys/ivy.xml
[warn] ==== typesafe-releases: tried
[warn]   http://repo.typesafe.com/typesafe/releases/com/typesafe/akka/akka-http_2.12/10.0.4+10-a01c084d/akka-http_2.12-10.0.4+10-a01c084d.pom
[warn] ==== typesafe-ivy-releasez: tried
[warn]   http://repo.typesafe.com/typesafe/ivy-releases/com.typesafe.akka/akka-http_2.12/10.0.4+10-a01c084d/ivys/ivy.xml
[warn] ==== akka bintray: tried
[warn]   https://dl.bintray.com/akka/maven/com/typesafe/akka/akka-http_2.12/10.0.4+10-a01c084d/akka-http_2.12-10.0.4+10-a01c084d.pom

@jrudolph
Member

jrudolph commented Mar 2, 2017

@pradyuman ah, I didn't publish a 2.12 version. Will run it for 2.12 and report back.
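
For context, sbt's `%%` operator appends the project's Scala binary version to the artifact name, which is why the resolver output above looks for `akka-http_2.12`. A sketch of the equivalence, assuming a 2.12.x `scalaVersion` (the exact patch version below is a guess, not taken from the thread):

```scala
// build.sbt (sketch; the scalaVersion value is an assumption)
scalaVersion := "2.12.1"

resolvers += "akka bintray" at "https://dl.bintray.com/akka/maven/"

// With a 2.12.x scalaVersion, `%%` resolves the artifact named akka-http_2.12 ...
libraryDependencies += "com.typesafe.akka" %% "akka-http" % "10.0.4+10-a01c084d"

// ... which is the same module as spelling out the suffix with a single `%`:
// libraryDependencies += "com.typesafe.akka" % "akka-http_2.12" % "10.0.4+10-a01c084d"
```

A build published only for Scala 2.11 therefore cannot be resolved from a 2.12 project, whichever resolver is configured.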

@jrudolph
Member

jrudolph commented Mar 2, 2017

I published the 2.12 version as well. Can you try again?

@pradyuman
Author

pradyuman commented Mar 2, 2017

It's able to find the pom now - I'll run it and get back to you once it stalls.

@pradyuman
Author

pradyuman commented Mar 2, 2017

I have this configuration but it doesn't seem like I'm logging any debug messages from akka:

akka {

  loglevel = "DEBUG"

  actor.deployment.default {
    dispatcher = "charon-dispatcher"
  }

}

EDIT: Never mind, I found out about stdout-loglevel and I can see the messages now.
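
For readers hitting the same thing: `akka.loglevel` and `akka.stdout-loglevel` are separate settings in Akka's reference configuration. A sketch of a configuration that raises both to DEBUG, on the assumption that this is what made the messages visible here (the thread does not show the final file):

```
akka {
  loglevel = "DEBUG"         # level applied to the configured Akka loggers
  stdout-loglevel = "DEBUG"  # level of the basic stdout logger used during ActorSystem startup and shutdown

  actor.deployment.default {
    dispatcher = "charon-dispatcher"
  }
}
```

If an SLF4J backend such as logback is in use, its own configuration may also need to allow DEBUG for the `akka` loggers.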

@jrudolph
Member

jrudolph commented Mar 8, 2017

Cool, any logs to share already?

@pradyuman
Author

Not yet, we had some issues with a downstream service so we couldn't hit the connection pool as fast as possible. We should be fixing things soon and then can debug the connection pool better.

@pradyuman
Author

I'm updating my akka-http version to 10.0.5, which should include the updated logging. I'm still keeping tabs on my service: we fixed our issues today, so if the problem resurfaces, it should be soon.

@ShaneDelmore

Please update the thread if you find a solution. I am experiencing a similar issue, also using 10.0.5.

@jrudolph
Member

We need logs for a stuck case to make any progress towards a solution. So if you suffer from the same issue, please gather DEBUG-level logs for the pool with Akka HTTP 10.0.5 and send the relevant parts to us.

@ShaneDelmore

My workaround for now was to map every request to toStrict, which resolved the issue. I definitely have some clients that read the headers and ignore the body if they get enough information from the resulting status code. Until there is a way to monitor connections, I am hesitant to re-enable the streaming mode.
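
This workaround is consistent with Akka HTTP's documented requirement that the entity of every client response be consumed or discarded, since an unread streamed entity keeps its pooled connection occupied. The thread never confirms that this caused the original hang, but the `TooManyRequests` branch of the `request` function in the report retries without reading the body. A minimal sketch of discarding the body on that path, assuming an implicit `Materializer` is in scope; `EntityHandling` and `shouldRetry` are hypothetical names, not from the thread:

```scala
import akka.http.scaladsl.model.{HttpResponse, StatusCodes}
import akka.stream.Materializer

object EntityHandling {
  // Hypothetical helper: decides whether a response should be retried and,
  // if so, drains its body first so the pooled connection can be reused.
  def shouldRetry(response: HttpResponse)(implicit mat: Materializer): Boolean =
    response.status match {
      case StatusCodes.TooManyRequests =>
        response.discardEntityBytes() // runs the entity stream to completion, ignoring the bytes
        true
      case _ =>
        false // the caller is expected to read the entity itself (e.g. via toStrict or Unmarshal)
    }
}
```

Compared with calling `toStrict` on everything, discarding avoids buffering bodies that are never used, at the cost of having to remember it on every path that ignores the entity.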

@greenhost87

I have the same issue on akka-http 10.0.7.

@abdolence

abdolence commented Jun 30, 2019

Is this still happening for anybody? I see similar suspicious behaviour even on the most recent version (10.1.8). The server occasionally hangs on a single request (it doesn't respond at all for the duration of idle-timeout), while the requests before and after it work just fine. This is odd because I have a very simple server that mostly serves static content for GET requests.

I should also note that this seems to happen only for HTTPS requests (with HTTP/2 enabled).

@jrudolph added the "t:client" and "t:core" labels Jul 3, 2019
@jrudolph
Member

jrudolph commented Jul 3, 2019

Closing. The new pool used in 10.1.x is more stable.

@abdolence this issue is about the client side while you seem to be talking about a server-side issue. If you can pinpoint the suspicious behavior and can provide logs or similar (and a reproducer in the best case), please open a new issue.

@jrudolph closed this as completed Jul 3, 2019
@jrudolph added this to the "will not fix" milestone Jul 3, 2019
5 participants