Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ember Parsing in N of prelude + header bytes #3605

Merged
merged 13 commits into from Aug 21, 2020

Conversation

ChristopherDavenport
Copy link
Member

@ChristopherDavenport ChristopherDavenport commented Jul 28, 2020

Built on top of #3603

This is optimized for the ideal and most probable case. Prelude and Headers are present in the first chunk of the request. It then get progressively worse from there as we concatenate arrays so we can continue in linear time. There may be a way to drop off what we have done so far, or two algorithms depending on state. Ideally then this operates in n iterations equal to exactly the number of bytes that form the prelude and headers for both requests and responses.

We generate special case information for chunked and content length, as profiling shows that going and looking for those headers afterwards becomes the most expensive operations, primarily because being rule following and checking all our parsers is slow. Presently Uri parsing is like 50% of the parsing algorithm cpu time used.

Each algorithm has throwable a variable. We exit the loop if we encounter throwables. There's no recovery on an error.

@rossabaker
Copy link
Member

rossabaker commented Aug 6, 2020

Is this one ready for a review yet? Our old friend scalafmt needs to be run.

Copy link
Member

@rossabaker rossabaker left a comment

👍 on scalafmt and merge

@rossabaker rossabaker added this to the 0.21.8 milestone Aug 12, 2020
@tpolecat
Copy link
Contributor

tpolecat commented Aug 19, 2020

I'm hitting the following exception running on this branch. I'll try to make a reproduction.

org.http4s.ember.core.Parser$Request$ReqPrelude$ParsePreludeError: Parse Prelude Error Encountered - Partially Decoded: None None None
	at org.http4s.ember.core.Parser$Request$ReqPrelude$.$anonfun$parsePrelude$1(Parser.scala:216)
	at org.http4s.ember.core.Parser$Request$ReqPrelude$.$anonfun$parsePrelude$1$adapted(Parser.scala:189)
	at fs2.Pull$.$anonfun$flatMap$1(Pull.scala:64)
	at fs2.internal.FreeC$$anon$1.cont(Algebra.scala:33)
	at fs2.internal.FreeC$ViewL$$anon$9$$anon$10.<init>(Algebra.scala:222)
	at fs2.internal.FreeC$ViewL$$anon$9.cont(Algebra.scala:222)
	at fs2.internal.FreeC$ViewL$$anon$9$$anon$10.cont(Algebra.scala:224)
	at fs2.internal.FreeC$ViewL$.mk(Algebra.scala:213)
	at fs2.internal.FreeC$ViewL$.apply(Algebra.scala:202)
	at fs2.internal.FreeC.viewL(Algebra.scala:86)
	at fs2.internal.FreeC$.go$1(Algebra.scala:399)
	at fs2.internal.FreeC$.$anonfun$compile$7(Algebra.scala:438)
	at fs2.internal.FreeC$.$anonfun$compile$1(Algebra.scala:415)
	at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:178)
	at cats.effect.internals.IORunLoop$RestartCallback.signal(IORunLoop.scala:414)
	at cats.effect.internals.IORunLoop$RestartCallback.apply(IORunLoop.scala:435)
	at cats.effect.internals.IORunLoop$RestartCallback.apply(IORunLoop.scala:373)
	at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:174)
	at cats.effect.internals.IORunLoop$RestartCallback.signal(IORunLoop.scala:414)
	at cats.effect.internals.IORunLoop$RestartCallback.apply(IORunLoop.scala:435)
	at cats.effect.internals.IORunLoop$RestartCallback.apply(IORunLoop.scala:373)
	at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:174)
	at cats.effect.internals.IORunLoop$RestartCallback.signal(IORunLoop.scala:414)
	at cats.effect.internals.IORunLoop$RestartCallback.apply(IORunLoop.scala:435)
	at cats.effect.internals.IORunLoop$RestartCallback.apply(IORunLoop.scala:373)
	at cats.effect.internals.IOShift$Tick.run(IOShift.scala:36)
	at cats.effect.internals.PoolUtils$$anon$2$$anon$3.run(PoolUtils.scala:52)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.lang.Throwable: Reached Ended of Stream Looking for Request Prelude
	at org.http4s.ember.core.Parser$Request$ReqPrelude$.$anonfun$parsePrelude$1(Parser.scala:213)
	... 29 more

@tpolecat
Copy link
Contributor

tpolecat commented Aug 19, 2020

Actually I see this on every request from Safari (it just takes a while) because it seems that all responses from Ember have Connection: keep-alive by default and eventually it times out. Maybe this is unrelated to this PR.

@hamnis
Copy link
Contributor

hamnis commented Aug 19, 2020

@tpolecat are they also missing content-length and|or transfer-encoding: chunked ?

@hamnis
Copy link
Contributor

hamnis commented Aug 19, 2020

If its an empty 200 OK, try adding a Content-Length: 0 header and see if that helps

@tpolecat
Copy link
Contributor

tpolecat commented Aug 19, 2020

@hamnis this is via Ok(xml)

HTTP/1.1 200 OK
Date: Wed, 19 Aug 2020 20:45:18 GMT
Connection: keep-alive
Content-Type: text/html
Content-Length: 687

<html>...

It's possible I'm misdiagnosing this. The browser isn't spinning on this one like it does with 404, but after sitting idle for a while the exception above pops out.

It does spin on 404 if I use .orNotFound.

$ curl -i http://localhost:8080/woozle
HTTP/1.1 404 Not Found
Date: Wed, 19 Aug 2020 20:48:21 GMT
Connection: keep-alive
Content-Type: text/plain; charset=UTF-8

[blocks for a while here]

Not foundHTTP/1.1 500 Internal Server Error

java.nio.channels.InterruptedByTimeoutException
        at java.base/sun.nio.ch.UnixAsynchronousSocketChannelImpl$1.run(UnixAsynchronousSocketChannelImpl.java:465)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)
$ 

@tpolecat
Copy link
Contributor

tpolecat commented Aug 19, 2020

Despite this issues above, the client works much better with this patch than without. In v21.7 the client will block until it fails with the above error, before returning the body content. With this patch the body content is returned immediately and client code continues as if all is well. The exception above happens on someone else's thread and doesn't seem to cause any ill effects. So I'm still 👍

@ChristopherDavenport ChristopherDavenport marked this pull request as ready for review Aug 20, 2020
@ChristopherDavenport
Copy link
Member Author

ChristopherDavenport commented Aug 20, 2020

@tpolecat If you could retest I would be very grateful.

@tpolecat
Copy link
Contributor

tpolecat commented Aug 20, 2020

I'm still seeing that error, but it may be due to an issue with when you're calling the withOnError handler. It seems to be getting called after the response it sent, and the alternative response that's computed is discarded. Here is an example.

Invoking with curl works as expected.

$ curl -i localhost:8080/foo
HTTP/1.1 404 Not Found
Date: Thu, 20 Aug 2020 16:03:02 GMT
Connection: keep-alive
Content-Type: text/plain; charset=UTF-8
Content-Length: 0

$ 

but the log shows

org.http4s.ember.core.Parser$Request$ReqPrelude$ParsePreludeError: Parse Prelude Error Encountered - Partially Decoded: None None None
	at org.http4s.ember.core.Parser$Request$ReqPrelude$.$anonfun$parsePrelude$1(Parser.scala:217)
	at org.http4s.ember.core.Parser$Request$ReqPrelude$.$anonfun$parsePrelude$1$adapted(Parser.scala:190)
	at fs2.Pull$.$anonfun$flatMap$1(Pull.scala:64)
	...
	at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.lang.Throwable: Reached Ended of Stream Looking for Request Prelude
	at org.http4s.ember.core.Parser$Request$ReqPrelude$.$anonfun$parsePrelude$1(Parser.scala:214)
	... 29 more

Here is the code I'm using.

package test

import cats.effect._
import org.http4s.ember.server.EmberServerBuilder
import org.http4s.HttpApp
import java.io.StringWriter
import java.io.PrintWriter
import org.http4s.Status
import org.http4s.EntityEncoder

object EmberTest extends IOApp {

  def run(args: List[String]): IO[ExitCode] =
    EmberServerBuilder
      .default[IO]
      .withPort(8080)
      .withHttpApp(HttpApp.notFound)
      .withOnError { t =>
        val sw = new StringWriter
        t.printStackTrace()
        t.printStackTrace(new PrintWriter(sw))
        org.http4s.Response[IO](
          status = Status.InternalServerError,
          body   = EntityEncoder[IO, String].toEntity(sw.toString).body
        )
      }
      .build
      .use(_ => IO.never)
      .as(ExitCode.Success)

}

My intent, by the way, is to ensure that 50x responses include some diagnostics. Maybe there is a better way to do this.

@ChristopherDavenport
Copy link
Member Author

ChristopherDavenport commented Aug 20, 2020

Does this work

org.http4s.Response[IO](
 status = Status.InternalServerError,
).withEntity(sw.toString)

The way you were writing that would not encode any content length nor chunked encoding. So it would encode to Content-Length 0 so you would ignore the body.

Likely messing with the next call reusing the socket.

hamnis
hamnis approved these changes Aug 21, 2020
@tpolecat
Copy link
Contributor

tpolecat commented Aug 21, 2020

👍 :shipit: this is working well for me, thanks!

@@ -26,7 +26,8 @@ atlassian-ide-plugin.xml
.metals
.bloop
project/metals.sbt

metals.sbt
Copy link
Member

@rossabaker rossabaker Aug 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These don't bother me, but things specific to your tooling rather than the project are a good fit for your personal gitignore.

@rossabaker rossabaker merged commit 2e3f555 into http4s:series/0.21 Aug 21, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants