Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add request decompression to server GZip middleware #7119

Open
wants to merge 1 commit into
base: series/0.23
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
85 changes: 81 additions & 4 deletions docs/docs/gzip.md
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
# GZip Compression
# GZip

Http4s provides [Middleware], named `GZip`, for allowing for the compression of the `Response`
body using GZip.
body and for the decompression of the incoming `Request` using `GZip`.

Examples in this document have the following dependencies.

Expand All @@ -28,6 +28,8 @@ import cats.effect.unsafe.IORuntime
implicit val runtime: IORuntime = cats.effect.unsafe.IORuntime.global
```

## Compressing Response

Let's start by making a simple service that returns a (relatively) large string
in its body. We'll use `as[String]` to examine the body.

Expand Down Expand Up @@ -63,7 +65,7 @@ bodyNormal.length

So far, there was no change. That's because the caller needs to inform us that
they will accept GZipped responses via an `Accept-Encoding` header. Acceptable
values for the `Accept-Encoding` header are **"gzip"**, **"x-gzip"**, and **"*"**.
values for the `Accept-Encoding` header are **"gzip"**, **"x-gzip"**, and **"\*"**.

```scala mdoc
val requestZip = request.putHeaders("Accept-Encoding" -> "gzip")
Expand All @@ -79,7 +81,82 @@ length. Also, there is a `Content-Encoding` header in the response with a value
of **"gzip"**.

As described in [Middleware], services and middleware can be composed such
that only some of your endpoints are GZip enabled.
that only some of your endpoints are `GZip` enabled.

## Decompressing Request

Analogically to compressing response, `GZip` middleware supports decompressing request.
Let's see how it works with a simple echo service, returning the body of incoming requests.

```scala mdoc:reset:invisible
import cats.effect._
import org.http4s._
import org.http4s.dsl.io._
import org.http4s.implicits._

import cats.effect.unsafe.IORuntime
implicit val runtime: IORuntime = cats.effect.unsafe.IORuntime.global
```

```scala mdoc:silent
val service = HttpRoutes.of[IO] {
case req =>
Ok(req.body)
}

val request = Request[IO](Method.POST, uri"/").withEntity("echo")
```

```scala mdoc
// Do not call 'unsafeRun' in your code - see note at bottom.
val response = service.orNotFound(request).unsafeRunSync()
val body = response.as[String].unsafeRunSync()
body.length
```

Now let's see what happens when we wrap the service with `GZip` middleware.
For the purpose of this example, let's create a compressed body using
`Compression` utility from [fs2](https://fs2.io) library.

```scala mdoc:silent
import fs2._
import fs2.compression._

val compressedEntity =
Stream
.emits(("I repeat myself when I'm under stress. " * 3).getBytes())
.through(Compression[IO].gzip())

val compressedRequest = request.withEntity(compressedEntity)
```

Now, similarly to before, let's wrap the service with the `GZip` middleware.

```scala mdoc:silent
import org.http4s.server.middleware._
val serviceZip = GZip(service)
```

```scala mdoc
// Do not call 'unsafeRun' in your code - see note at bottom.
val respNormal = serviceZip.orNotFound(compressedRequest).unsafeRunSync()
val bodyNormal = respNormal.as[String].unsafeRunSync()
bodyNormal.length
```

We can clearly see the middleware didn't do much and that's because the decompression will only be
triggered if the request contains `Content-Encoding` header with a value of **"gzip"** or **"x-gzip"**.

```scala mdoc
// Do not call 'unsafeRun' in your code - see note at bottom.
val validRequest = compressedRequest.putHeaders("Content-Encoding" -> "gzip")

val respDecompressed = serviceZip.orNotFound(validRequest).unsafeRunSync()
val bodyDecompressed = respDecompressed.as[String].unsafeRunSync()
bodyDecompressed.length
```

This time we can see that the request got decompressed as expected.

**NOTE:** In this documentation, we are calling `unsafeRunSync` to extract values out
of a service or middleware code. You can work with values while keeping them inside the
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -36,10 +36,11 @@ object GZip extends GZipPlatform {
isZippable: Response[G] => Boolean = defaultIsZippable[G](_: Response[G]),
): Http[F, G] =
Kleisli { (req: Request[G]) =>
req.headers.get[`Accept-Encoding`] match {
val unzippedRequest = unzipOrPass(req, bufferSize)
unzippedRequest.headers.get[`Accept-Encoding`] match {
case Some(acceptEncoding) if satisfiedByGzip(acceptEncoding) =>
http(req).map(zipOrPass(_, bufferSize, level, isZippable))
case _ => http(req)
http(unzippedRequest).map(zipOrPass(_, bufferSize, level, isZippable))
case _ => http(unzippedRequest)
}
}

Expand All @@ -51,6 +52,11 @@ object GZip extends GZipPlatform {
(contentType.get.mediaType == MediaType.application.`octet-stream`))
}

private def isZipped[F[_]](req: Request[F]): Boolean =
req.headers.get[`Content-Encoding`].map(_.contentCoding).exists { coding =>
coding === ContentCoding.gzip || coding === ContentCoding.`x-gzip`
}

private def satisfiedByGzip(acceptEncoding: `Accept-Encoding`) =
acceptEncoding.satisfiedBy(ContentCoding.gzip) || acceptEncoding.satisfiedBy(
ContentCoding.`x-gzip`
Expand All @@ -67,6 +73,15 @@ object GZip extends GZipPlatform {
case resp => resp // Don't touch it, Content-Encoding already set
}

private def unzipOrPass[F[_]: Compression](
request: Request[F],
bufferSize: Int,
): Request[F] =
request match {
case req if isZipped(req) => unzipRequest(bufferSize, req)
case req => req
}

private def zipResponse[F[_]: Compression](
bufferSize: Int,
level: DeflateParams.Level,
Expand All @@ -89,4 +104,16 @@ object GZip extends GZipPlatform {
.putHeaders(`Content-Encoding`(ContentCoding.gzip))
.pipeBodyThrough(compressPipe)
}

private def unzipRequest[F[_]: Compression](
bufferSize: Int,
req: Request[F],
): Request[F] = {
val decompressPipe = Compression[F].gunzip(bufferSize = bufferSize).andThenF(_.content)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the .handleErrorWith in #5546 (comment) is a good idea to throw MalformedMessageBodyFailure for streams that aren't valid gzip streams, but maybe it could be more specific (and I should have added a test for it).

logger.trace("GZip middleware decoding content").unsafeRunSync()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love running this effect like this, though I see that it is status quo.

Perhaps in 1.x we should move the logging out to a separate wrapper method. Something like,

object Gzip extends GzipPlatform {

  def withLogging[F[_]: Functor, G[_]: Compression: Sync](...):` Http[F, G]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of asking for Sync we should ask for a LoggerFactory. There are old mistakes but let's not repeat them in new code 😅

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. Makes sense.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like out of scope of this PR, doesn't it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing existing code is out-of-scope. Avoiding writing new code with anti-pattterns (established as they may be) is in-scope :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it! I'll change it a little

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change seems to have some repercussions. Biggest one being breaking compatibility but also when working on it I had an issue with GZipPlatform I couldn't resolve - I couldn't provide enough evidence to create GZip middleware.
How about removing those trace logs?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't provide enough evidence to create GZip middleware.

What's the error?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I change the signature of Gzip::apply to something like def apply[F[_]: LoggerFactory: Monad, G[_]: Compression](...) then the arguments of GZipPlatform::apply must change as well. Currently it's

def apply[F[_], G[_]](
      http: Http[F, G],
      bufferSize: Int,
      level: DeflateParams.Level,
      isZippable: Response[G] => Boolean,
      F: Functor[F],
      G: Sync[G],
  ): Http[F, G]

But F: Functor[F] is not enough. I tried various approaches but can't find anything that would work.

req
.removeHeader[`Content-Length`]
.removeHeader[`Content-Encoding`]
.pipeBodyThrough(decompressPipe)
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,7 @@ import org.http4s.syntax.all._
import org.scalacheck.Arbitrary.arbitrary
import org.scalacheck.Gen
import org.scalacheck.effect.PropF
import org.typelevel.ci._

import java.util.Arrays

Expand Down Expand Up @@ -88,6 +89,44 @@ class GZipSuite extends Http4sSuite {
resp.map(!_.headers.contains[`Content-Encoding`]).assert
}

test("decodes random content-type if content-encoding allows it") {
val request = "Request string"
val routes: HttpRoutes[IO] = HttpRoutes.of[IO] { case req @ POST -> Root => Ok(req.body) }
matkob marked this conversation as resolved.
Show resolved Hide resolved
val gzipRoutes: HttpRoutes[IO] = GZip(routes, isZippable = _ => false)

val req: Request[IO] = Request[IO](Method.POST, uri"/")
.putHeaders(Header.Raw(ci"Content-Encoding", "gzip"))
.withBodyStream(Stream.emits(request.getBytes()).through(Compression[IO].gzip()))

gzipRoutes.orNotFound(req).flatMap { response =>
response.body.compile
.to(Chunk)
.map { decoded =>
Arrays.equals(request.getBytes(), decoded.toArray)
}
.assert
}
}

test("doesn't decode request if content-encoding doesn't allow it") {
val request = "Request string"
val routes: HttpRoutes[IO] = HttpRoutes.of[IO] { case req @ POST -> Root => Ok(req.body) }
val gzipRoutes: HttpRoutes[IO] = GZip(routes, isZippable = _ => false)

val req: Request[IO] = Request[IO](Method.POST, uri"/")
.putHeaders(`Content-Encoding`(ContentCoding.identity))
.withBodyStream(Stream.emits(request.getBytes()))

gzipRoutes.orNotFound(req).flatMap { response =>
response.body.compile
.to(Chunk)
.map { decoded =>
Arrays.equals(request.getBytes(), decoded.toArray)
}
.assert
}
}

test("encoding") {
val genByteArray =
Gen.poisson(10).flatMap(n => Gen.buildableOfN[Array[Byte], Byte](n, arbitrary[Byte]))
Expand Down