Do not buffer entire body in Http Cache #4076

Merged · 16 commits into main · May 11, 2022

Conversation

@BoD (Contributor, author) commented May 3, 2022

See #3981

@netlify bot commented May 3, 2022

Deploy Preview for apollo-android-docs canceled (latest commit 127256a; deploy log: https://app.netlify.com/sites/apollo-android-docs/deploys/627a87465712b60009facd39).

Review thread on this hunk:

    }

    if (read == -1L) {
      // We've read fully, commit the cache edit

@BoD (author):

Not 100% sure this is needed.

@martinbonnin:

The DeferJvmTest tests still pass for me without the closeAndCommitCache(). Can you add one that fails?

@martinbonnin:

Did we ever find the reason behind this?

@BoD (author):

Sorry, I thought your previous comment was about the test passing without committing because it wasn't checking the presence of data in the cache, but now I'm wondering if you meant something else 😅

@martinbonnin:

Let me try again

@BoD (author):

Ah, got it (I thought you were commenting on the inside of closeAndCommitCache).

This path can be reached when an incomplete response is fully read: the JSON reader pulls more bytes, but there aren't any, so originalSource.read returns -1. In that case this bit of code commits early, but:

  1. it's actually not useful, because close will still be called after the parse exception occurs (and close commits at that point)
  2. I'm not 100% sure we actually want to commit in that case. On the one hand, it would be consistent with the current behavior (currently, the data is cached before it's parsed). On the other hand, if we know for sure that -1 here means an incomplete response, it may be nice to avoid caching it.

WDYT?
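
For context, a minimal sketch of the kind of teeing Source being discussed; CacheWritingSource, cacheSink, and commit are assumed names for illustration, not the PR's actual code:

    import okio.Buffer
    import okio.BufferedSink
    import okio.Source
    import okio.Timeout

    // Sketch only: stream the network body to the caller while copying the
    // same bytes into the HTTP cache, instead of buffering the whole body.
    class CacheWritingSource(
        private val originalSource: Source,
        private val cacheSink: BufferedSink,
        private val commit: () -> Unit, // assumed hook into the cache editor
    ) : Source {
      override fun read(sink: Buffer, byteCount: Long): Long {
        val read = originalSource.read(sink, byteCount)
        if (read == -1L) {
          // EOF: the caller has pulled all available bytes. This is the branch
          // discussed above; committing here is redundant since close() commits.
          return -1L
        }
        // Tee the bytes that were just appended to `sink` into the cache
        sink.copyTo(cacheSink.buffer, sink.size - read, read)
        return read
      }

      override fun close() {
        cacheSink.close()
        commit()
        originalSource.close()
      }

      override fun timeout(): Timeout = originalSource.timeout()
    }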

@martinbonnin:

Oh gosh, this is hard. There are 2 error cases:

  • truncated JSON with a 200 response, as you mentioned above. I think this is very unlikely to happen, but it's still something we need to handle
  • valid JSON with a 200 response containing GraphQL errors.

Ideally we don't want to HTTP-cache those, but there's no way to know this at the HTTP level. For now I'd suggest we cache everything that is read fully. The only way I can think of right now is something along these lines:

    override fun close() {
      if (!hasClosedAndCommitted) {
        try {
          sink.close()
          if (originalSource.read(Buffer(), 1) == -1L) {
            // The caller has read everything
            cacheEditor.commit()
          } else {
            cacheEditor.abort()
          }
        } catch (e: Exception) {
          // Silently ignore cache write errors
        } finally {
          hasClosedAndCommitted = true
        }
        originalSource.close()
      }
    }

This should cache "complete 200" responses. We can revisit later to add support for discarding GraphQL errors.

@martinbonnin:

Fell into this rabbit hole, and it turns out it's not too hard to implement. I gave it a try here: #4087. It is wildly untested. Feel free to include it in this PR, or I'll rebase and open a follow-up PR later.

@BoD (author):

Yes this looks good! 🙏 Merging into this one and I'll add 2 tests for it.
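
For reference, the two tests could look something along these lines (a sketch only; mockServer, apolloClient, query, and the response fixtures are assumed to be provided by the test harness, imports omitted):

    @Test
    fun completeResponseIsCached() = runTest {
      mockServer.enqueue(completeJsonResponse)
      apolloClient.query(query).execute()
      // A second execution served from the HTTP cache only should succeed
      val cached = apolloClient.query(query)
          .httpFetchPolicy(HttpFetchPolicy.CacheOnly)
          .execute()
      assertNotNull(cached.data)
    }

    @Test
    fun truncatedResponseIsNotCached() = runTest {
      mockServer.enqueue(truncatedJsonResponse)
      runCatching { apolloClient.query(query).execute() } // parsing fails
      // The truncated body must not have been committed to the cache
      assertFails {
        apolloClient.query(query)
            .httpFetchPolicy(HttpFetchPolicy.CacheOnly)
            .execute()
      }
    }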

@BoD (author):

Hmm, 2 tests failed. I'll have a look before merging :P

@BoD BoD marked this pull request as ready for review May 3, 2022 18:41

Review thread on this hunk:

    var lastEmitTime = currentTimeMillis()
    apolloClient.query(WithFragmentSpreadsQuery()).toFlow().collect {
      // Allow a 10% margin for inaccuracies
      assertTrue(currentTimeMillis() - lastEmitTime >= delay / 1.1)

@martinbonnin:

It'd be nice to have a lower-level MockServer API where we enqueue the chunks one after the other and read them as we go. That would remove the potential for flaky timing issues.
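
Something along these hypothetical lines (enqueueChunkedResponse and sendChunk are made-up names, not an existing API):

    // Push-style API sketch: the test controls exactly when each chunk is
    // sent, so assertions don't depend on timers.
    val response = mockServer.enqueueChunkedResponse(
        headers = mapOf("Content-Type" to "multipart/mixed; boundary=-"),
    )
    // Start collecting apolloClient.query(...).toFlow() in the background, then:
    response.sendChunk(chunk1)
    // ...assert on the first emission, no timing margins involved...
    response.sendChunk(chunk2, last = true)
    // ...assert on the second emission...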

@BoD (author):

Oh yeah 😊 I'll have a look!

Review thread on this hunk:

    @@ -47,25 +47,46 @@ fun createMultipartMixedChunkedResponse(
        responseDelayMillis: Long = 0,
        chunksDelayMillis: Long = 0,
        boundary: String = "-",
        waitForMoreChunks: Boolean = false,

@martinbonnin:

That's a lot of constructor arguments. I think MockResponse is due for a little refactoring.

We could make it take a Source as body:

    class MockResponse(
        val statusCode: Int = 200,
        val body: Source = Buffer(),
        val headers: Map<String, String> = emptyMap(),
    )

Then delay and chunks "just" become specific Source implementations:

    class DelayedSource(
        val delayMillis: Long,
        val wrappedSource: Source,
    ) : Source {
      private var firstRead = true

      override fun close() {
        wrappedSource.close()
      }

      override fun read(sink: Buffer, byteCount: Long): Long {
        if (firstRead) {
          firstRead = false
          // TODO: see if we can implement this more efficiently with timeouts
          // (POSIX usleep takes microseconds, Kotlin/Native)
          usleep((delayMillis * 1000).toUInt())
        }
        return wrappedSource.read(sink, byteCount)
      }

      override fun timeout(): Timeout {
        TODO("Not yet implemented")
      }
    }

    class ChunkedSource : Source {
      // ...
    }
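
A usage sketch under that design (assuming MockServer.enqueue accepts the new MockResponse):

    // Enqueue a response whose body only starts flowing after 500 ms
    mockServer.enqueue(MockResponse(
        statusCode = 200,
        body = DelayedSource(
            delayMillis = 500,
            wrappedSource = Buffer().writeUtf8("""{"data":{}}"""),
        ),
        headers = mapOf("Content-Length" to "11"),
    ))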

@BoD (author):

It's a cool idea, but the ChunkedSource plus a way to enqueue the chunks may be tricky to implement: we'd need a way to make reads blocking. I did it in StreamingNSURLSessionHttpEngine with pthread_cond, but we'd need something for Java (easy) and JS (not sure if easy!).
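
On the JVM, the "easy" blocking version could look like this (a sketch; BlockingChunkSource is a made-up name):

    import java.util.concurrent.LinkedBlockingQueue
    import okio.Buffer
    import okio.ByteString
    import okio.Source
    import okio.Timeout

    // JVM-only sketch: read() blocks until the test enqueues the next chunk,
    // playing the role pthread_cond plays in the Apple implementation.
    class BlockingChunkSource : Source {
      private val queue = LinkedBlockingQueue<ByteString>()

      fun send(chunk: ByteString) = queue.put(chunk)
      fun sendEof() = queue.put(ByteString.EMPTY) // empty chunk as EOF sentinel

      override fun read(sink: Buffer, byteCount: Long): Long {
        val chunk = queue.take() // blocks until a chunk is available
        if (chunk.size == 0) return -1L
        // Note: ignores byteCount for simplicity; fine for small test chunks
        sink.write(chunk)
        return chunk.size.toLong()
      }

      override fun close() {}
      override fun timeout(): Timeout = Timeout.NONE
    }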

@martinbonnin:

Could we do some kind of non-blocking/polling I/O with okio? Have read() return zero in the source implementation while waiting, and something > 0 when data is available again?

@BoD (author):

So I think 0 would make callers call again immediately, in a loop, but we could sleep for a few ms to not hog the CPU? Not ideal, but maybe enough for tests?
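
A rough JVM sketch of that polling idea (made-up names; note that returning 0 bends the okio Source contract, which is where the next comment picks up):

    import java.util.concurrent.ConcurrentLinkedQueue
    import okio.Buffer
    import okio.ByteString
    import okio.Source
    import okio.Timeout

    // Polling sketch: return 0 when no chunk is available yet, sleeping a few
    // ms first so callers that loop on read() don't hog the CPU.
    class PollingChunkSource(
        private val chunks: ConcurrentLinkedQueue<ByteString>,
    ) : Source {
      @Volatile var done = false // set by the producer after the last chunk

      override fun read(sink: Buffer, byteCount: Long): Long {
        val chunk = chunks.poll()
        return when {
          chunk != null -> {
            sink.write(chunk)
            chunk.size.toLong()
          }
          done -> -1L
          else -> {
            Thread.sleep(5) // not ideal, but maybe enough for tests
            0L
          }
        }
      }

      override fun close() {}
      override fun timeout(): Timeout = Timeout.NONE
    }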

@martinbonnin:

Right, okio.Source is not the right abstraction then. But we could write our own Source abstraction with coroutines support, to give JS land a chance to "sleep"/"delay".

@BoD (author):

What about Flow<ByteString>?

    class MockResponse(
        val statusCode: Int = 200,
        val body: Flow<ByteString> = emptyFlow(),
        val headers: Map<String, String> = emptyMap(),
    )

Example usage:

    fun MockServer.enqueue(string: String, delayMillis: Long = 0) {
      enqueue(MockResponse(
          statusCode = 200,
          body = flow {
            delay(delayMillis)
            emit(string.encodeUtf8())
          },
          headers = mapOf("Content-Length" to string.length.toString()),
      ))
    }

And for a chunked body, feed a Channel and expose it as the Flow:

    val chunks = Channel<String>(Channel.UNLIMITED)
    val body: Flow<ByteString> = chunks.receiveAsFlow().map { it.encodeUtf8() }

    // Later, from a coroutine:
    delay(200)
    chunks.send("chunk 1")
    delay(200)
    chunks.send("chunk 2")
    chunks.close()
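
On the MockServer side, consuming such a body could be as simple as collecting the flow into the connection's sink (a sketch; chunked transfer-encoding framing omitted):

    import kotlinx.coroutines.flow.*
    import okio.BufferedSink
    import okio.ByteString

    // Sketch: write each emitted chunk to the client as soon as it arrives
    suspend fun writeBody(sink: BufferedSink, body: Flow<ByteString>) {
      body.collect { chunk ->
        sink.write(chunk)
        sink.flush() // push the chunk to the client immediately
      }
    }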

@BoD (author):

If you're OK with it I can remove the MockServer changes from this PR and open a dedicated one 😅

@martinbonnin:

Sure thing!

@BoD BoD force-pushed the dont-buffer-whole-body-in-http-cache branch from 590902a to ef244eb Compare May 10, 2022 13:13
@martinbonnin left a review comment:

👍

@BoD BoD merged commit 0f50117 into main May 11, 2022
@BoD BoD deleted the dont-buffer-whole-body-in-http-cache branch May 11, 2022 08:29