Reduce CPU & network consumption of Facia JSON download #26338

Merged
merged 2 commits into from
Aug 9, 2023

rtyley
Member

@rtyley rtyley commented Jul 20, 2023

This PR looks to fix #26335 by improving the scalability of Facia's pressed-page-json loading code in FrontJsonFapi. Originally it focussed only on adopting AWS SDK v2 to allow non-blocking async code for the downloading, but it now additionally includes a more significant change: ETag-based caching.

ETag caching is a pretty wonderful standard part of the HTTP protocol, relying on these 3 components:

  • ETag HTTP response header - a hash of the content, returned by the server
  • If-None-Match HTTP request header - sent by the client, to indicate what content it already has, by sending its ETag
  • 304 Not Modified HTTP response code - returned by a server when content is unchanged from the supplied ETag, with an otherwise empty response (no need to return the body again!)
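
To make the exchange concrete, here is a minimal in-memory sketch of those three components working together. It is only an illustration: `Server`, `Client` and `etagOf` are hypothetical names, no real HTTP is involved, and using an MD5 hex digest as the ETag is an assumption (servers are free to generate ETags differently).

```scala
import java.security.MessageDigest

// Hypothetical, in-memory stand-ins for an HTTP server and a caching client.
object ETagDemo {
  sealed trait Response
  final case class Ok(etag: String, body: String) extends Response // like a 200 with a full body
  final case class NotModified(etag: String) extends Response      // like a 304 with no body

  // Assumed ETag scheme for this sketch: hex MD5 of the content.
  def etagOf(content: String): String =
    MessageDigest.getInstance("MD5")
      .digest(content.getBytes("UTF-8"))
      .map(b => "%02x".format(b))
      .mkString

  final class Server(var content: String) {
    // ifNoneMatch plays the role of the If-None-Match request header.
    def get(ifNoneMatch: Option[String]): Response = {
      val current = etagOf(content)
      if (ifNoneMatch.contains(current)) NotModified(current) else Ok(current, content)
    }
  }

  final class Client(server: Server) {
    private var cached: Option[(String, String)] = None // (etag, body)
    var downloads = 0                                    // counts full-body responses only

    def fetch(): String =
      (server.get(cached.map(_._1)), cached) match {
        case (Ok(etag, body), _) =>
          downloads += 1
          cached = Some((etag, body))
          body
        case (NotModified(_), Some((_, body))) => body // reuse what we already hold
        case (NotModified(_), None) =>
          throw new IllegalStateException("304 received without a cached copy")
      }
  }
}
```

A second `fetch()` against unchanged content gets `NotModified` and reuses the cached body, so `downloads` only increases when the server content actually changes.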

I've extracted the code needed for an ETagCache into a new library at https://github.com/guardian/etag-caching .

ETag Caching Benefits

  • Significant savings in terms of resources:
    • Network: PressedPages are only re-downloaded if the S3 content has changed
    • CPU: Cached PressedPages are only re-parsed if the S3 content has changed
  • Retains the 'stay current' behaviour of the old, non-caching solution: S3 queried with every request to ensure the ETag is up-to-date.
  • Also, given the cache is based on the (S)Caffeine caching library:
    • In-flight requests for a given key are unified, so 100 simultaneous requests for a key make just 1 fetch-and-parse
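
The 'unified in-flight requests' behaviour can be sketched with nothing more than a concurrent map of Futures. This is not how (S)Caffeine is implemented, and `CoalescingLoader` is a hypothetical name - it just shows the idea that callers arriving while a load is in progress share the same Future rather than starting their own fetch-and-parse.

```scala
import java.util.concurrent.ConcurrentHashMap
import scala.concurrent.Future

object CoalescingDemo {
  // Hypothetical helper: at most one load is started per key.
  // Unlike a real cache it never evicts or refreshes, and it would also
  // retain a failed Future - those concerns are left out of this sketch.
  final class CoalescingLoader[K, V](load: K => Future[V]) {
    private val entries = new ConcurrentHashMap[K, Future[V]]()

    // computeIfAbsent is atomic per key, so concurrent callers for the same
    // key all receive the Future created by the first caller.
    def get(key: K): Future[V] =
      entries.computeIfAbsent(key, k => load(k))
  }
}
```

With a loader that counts its invocations, 100 calls to `get("uk")` made before the first load completes should trigger only one load, and every caller gets the same Future back.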

The fetching and parsing code is also improved:

  • Updated to AWS SDK v2, using non-blocking async code to avoid blocking threads while downloading the Facia JSON data from S3
  • Directly transform bytes into JsValue, without constructing a String first

Memory requirements of caching

Currently facia runs on c6g.2xlarge instances (16GB RAM), with a JVM heap of 12GB. I'm going to suggest that it's reasonable for the ETagCache to consume up to 4GB of the 12GB heap.

There are ~300 fronts, each of which can have 4 different variants, so currently there could be up to 1200 different PressedPage objects. From heapdump analysis, the largest PressedPage retains ~22MB of memory (most instances are smaller, averaging at 4MB). With the 4GB budget, and assuming a worst case of 22MB per PressedPage, we can afford to set a max size on the cache of ~180 entries. Although it's disappointing we can't hold all of the PressedPages, the priorities of the eviction policy used by the Caffeine caching library should ensure that we get a good hit rate on the most in-demand fronts.
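
The arithmetic behind those numbers, as a small worked sketch. The only inputs are the figures quoted above; treating 1GB as 1024MB is an assumption of this sketch (using 1000MB gives 181 rather than 186, and either way ~180 is a cautious rounding-down).

```scala
object CacheSizing {
  val fronts = 300
  val variantsPerFront = 4
  val maxDistinctPages: Int = fronts * variantsPerFront // 300 * 4 = 1200

  val budgetMB: Int = 4 * 1024 // 4GB budget, assuming 1GB = 1024MB
  val worstCasePageMB = 22
  val averagePageMB = 4

  // Integer division: how many worst-case pages fit in the budget.
  val maxEntriesWorstCase: Int = budgetMB / worstCasePageMB // 4096 / 22 = 186

  // Even at the average size, holding every possible page would exceed the budget.
  val allPagesAtAverageMB: Int = maxDistinctPages * averagePageMB // 1200 * 4 = 4800
}
```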

Incidentally, heapdump analysis also shows that some structures within PressedPage are memory inefficient when considered from the perspective of holding many PressedPage objects in memory at once - using object pooling on model.Tag for instance, would probably lead to an 80% reduction of the total retained memory.

Removal of FutureSemaphore

FutureSemaphore, introduced in November 2017 with #18331, is removed in this change - as described in #26335 (comment) & #26336 (comment), the late-stage 32-concurrent-decoding-processes limit was harmful: it led to work (JSON download & String generation) being thrown away, without the result of that work being cached by the CDN.

Testing

Testing total request-response time on CODE:

$ ssm cmd -p frontend -t facia,frontend,code -c "curl -sS --head -w ',result,%{url_effective},%{time_total}\n' http://localhost:9000/uk | grep -e result -e Commit | tr -d '\r\n'" 2> /dev/null | grep Commit | cut -d' ' -f2 | cut -d',' -f1,3- | sort
34d8b3ac2752dc35bae52240c82ebd48945a3408,http://localhost:9000/uk,0.883647
34d8b3ac2752dc35bae52240c82ebd48945a3408,http://localhost:9000/uk,0.986826
34d8b3ac2752dc35bae52240c82ebd48945a3408,http://localhost:9000/uk,1.013991
34d8b3ac2752dc35bae52240c82ebd48945a3408,http://localhost:9000/uk,1.022075
eb1fdaeb1db72479be8b51e06351f905b72d479b,http://localhost:9000/uk,0.626043
eb1fdaeb1db72479be8b51e06351f905b72d479b,http://localhost:9000/uk,0.629217
eb1fdaeb1db72479be8b51e06351f905b72d479b,http://localhost:9000/uk,0.645948
eb1fdaeb1db72479be8b51e06351f905b72d479b,http://localhost:9000/uk,0.669940
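
As a rough summary of the timings above, the sketch below averages the four samples per commit and works out the relative reduction. The values are copied from the output; four requests per build is a small sample, so the result is indicative rather than a benchmark.

```scala
object TimingSummary {
  // time_total in seconds, per instance, copied from the curl output above
  val oldBuild: Seq[Double] = Seq(0.883647, 0.986826, 1.013991, 1.022075) // 34d8b3a (main)
  val newBuild: Seq[Double] = Seq(0.626043, 0.629217, 0.645948, 0.669940) // eb1fdae (this PR)

  def mean(xs: Seq[Double]): Double = xs.sum / xs.size

  // Fraction by which the mean response time dropped, e.g. 0.34 means 34% lower.
  val reduction: Double = 1 - mean(newBuild) / mean(oldBuild)
}
```

On these samples the mean drops from roughly 0.98s to roughly 0.64s, a reduction of about a third.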

@rtyley rtyley force-pushed the mob/facia-s3-aws-v2 branch 6 times, most recently from 7bcd556 to 9431a92 Compare August 2, 2023 14:24
@rtyley rtyley changed the title from "Switch to async AWS SDK v2 for Facia JSON download" to "Reduce CPU & network consumption of Facia JSON download" Aug 2, 2023
@rtyley rtyley force-pushed the mob/facia-s3-aws-v2 branch 4 times, most recently from 4ada4c3 to eb1fdae Compare August 2, 2023 21:50
@rtyley rtyley marked this pull request as ready for review August 3, 2023 14:23
@rtyley rtyley requested a review from a team as a code owner August 3, 2023 14:23
@rtyley rtyley assigned jamesgorrie and unassigned jamesgorrie Aug 3, 2023
@rtyley
Member Author

rtyley commented Aug 3, 2023

I've talked this through with @abeddow91, @arelra, @ioannakok, @georgeblahblah and @ParisaTork at the 2pm meeting today, and response was positive, no blocking objections!

Some evidence shared:

  • During the meeting: with the CODE Facia ASG 50/50 split between 4 old instances on the main branch with 34d8b3a, and 4 new ones on eb1fdae, the new code demonstrated consistently faster response times on single requests to the /uk front:
$ ssm cmd -p frontend -t facia,frontend,code -c "curl -sS --head -w ',result,%{url_effective},%{time_total}\n' http://localhost:9000/uk | grep -e result -e Commit | tr -d '\r\n'" 2> /dev/null | grep Commit | cut -d' ' -f2 | cut -d',' -f1,3- | sort
34d8b3ac2752dc35bae52240c82ebd48945a3408,http://localhost:9000/uk,0.883647
34d8b3ac2752dc35bae52240c82ebd48945a3408,http://localhost:9000/uk,0.986826
34d8b3ac2752dc35bae52240c82ebd48945a3408,http://localhost:9000/uk,1.013991
34d8b3ac2752dc35bae52240c82ebd48945a3408,http://localhost:9000/uk,1.022075
eb1fdaeb1db72479be8b51e06351f905b72d479b,http://localhost:9000/uk,0.626043
eb1fdaeb1db72479be8b51e06351f905b72d479b,http://localhost:9000/uk,0.629217
eb1fdaeb1db72479be8b51e06351f905b72d479b,http://localhost:9000/uk,0.645948
eb1fdaeb1db72479be8b51e06351f905b72d479b,http://localhost:9000/uk,0.669940
  • Captured prior to the meeting: Two snapshots of heap memory usage on a local instance of Frontend/DCR while testing with 50 concurrent requests to the /uk front:
    • Old code on main: heap rises quickly to 8GB/12GB, significant GC activity
      image
    • This PR: heap never goes higher than 1.5GB, GC activity is not visible
      image

@rtyley rtyley requested a review from ParisaTork August 3, 2023 14:39
@bryophyta
Contributor

> There are ~300 fronts, each of which can have 4 different variants, so currently there could be up to 1200 different PressedPage objects

Thanks for the detailed write-up @rtyley! Not sure if this has come up already, but (iirc) the only reason @ioannakok and I were able to find for having a lite version, when we looked into it, was to save some parsing overhead. I haven't followed this PR in detail, but if the new caching strategy tips the balance significantly in favour of having fewer page objects in memory then that might offset the overhead from having to parse the 'full' version of the page json when it is parsed.

Might be worth looking into if it hasn't been discussed yet?

@rtyley
Member Author

rtyley commented Aug 3, 2023

> @ioannakok and I were able to find for having a lite version, when we looked into it, was to save some parsing overhead.

Ah interesting - that would have been #18364 & #18365 ...

> I haven't followed this PR in detail, but if the new caching strategy tips the balance significantly in favour of having fewer page objects in memory then that might offset the overhead from having to parse the 'full' version of the page json when it is parsed.

That is a super interesting point @bryophyta ! Looking at the uncompressed file sizes for s3://aws-frontend-store/PROD/frontsapi/pressed/live/uk/fapi/:

5980936 pressed.v2.adfree.json
4025640 pressed.v2.lite.adfree.json

6063014 pressed.v2.json
4107718 pressed.v2.lite.json

...the lite versions aren't even dramatically smaller - they are still about 67% of the size of the full versions of those files.

So I think you're right - they're adding 67% to storage requirements (assuming that the Scala case-class memory representation is roughly proportional to the file size) and are actively harmful to the cache hit rate, now that we're using a cache.
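
The ratios can be checked directly from the byte counts listed above. This is only the arithmetic, and the assumption that in-memory size is proportional to file size is carried over from the comment rather than measured here.

```scala
object LiteVsFull {
  // Uncompressed sizes in bytes, copied from the S3 listing above
  val fullAdfree = 5980936L // pressed.v2.adfree.json
  val liteAdfree = 4025640L // pressed.v2.lite.adfree.json
  val full = 6063014L       // pressed.v2.json
  val lite = 4107718L       // pressed.v2.lite.json

  def ratio(liteBytes: Long, fullBytes: Long): Double = liteBytes.toDouble / fullBytes

  val adfreeRatio: Double = ratio(liteAdfree, fullAdfree) // roughly 0.673
  val standardRatio: Double = ratio(lite, full)           // roughly 0.678
}
```

Both ratios land between 67% and 68%, so keeping a lite copy alongside each full copy adds roughly two thirds again to the storage needed.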

Subsequent to this PR, in a new PR, I definitely agree: the 'lite' versions should be completely removed. The only thing to watch out for would be to see if the output of the endpoints that use the 'lite' versions (renderFrontHeadline(), renderFrontPressResult() & renderFrontJsonMinimal()) stays the same, depending on whether they perform their own trimming that will work on the full versions of PressedPage, or if they were relying on the trimmed nature of the 'lite' version at all. I've created #27143 to track this.

@bryophyta
Contributor

bryophyta commented Aug 3, 2023

Thanks for this!

> The only thing to watch out for would be to see if the output of the endpoints that use the 'lite' versions (renderFrontHeadline(), renderFrontPressResult() & renderFrontJsonMinimal()) stays the same, depending on whether they perform their own trimming that will work on the full versions of PressedPage, or if they were relying on the trimmed nature of the 'lite' version at all.

Agreed that we'd need to check implications for consumers 👍 Hopefully it should be easy enough to repurpose the existing code that creates the lite versions in the current implementation, and run it when those endpoints are hit?

From what I remember, it might take a little bit of detective work to find out who the consumers are though (I happen to know that one of the consumers of renderFrontJsonMinimal() is the Pressreader lambdas, but that's only because by coincidence I've been working on those on my new team.)

But ultimately up to WebX to decide whether it's something to prioritise of course 🙂

}

class FrontJsonFapiLive(val blockingOperations: BlockingOperations) extends FrontJsonFapi {
Contributor

@ioannakok ioannakok Aug 4, 2023

Since we're deleting BlockingOperations this config can also be removed.

Member Author

Ah, that bit of config is still used by the BlockingOperations class, and that hasn't been deleted (unfortunately?!) - we're no longer using it for the Fronts-JSON download code, but it's still used elsewhere, eg dfp.OrderAgent.

Contributor

@ioannakok ioannakok left a comment

Fantastic work @rtyley 👏 Thank you for doing this and for the great PR write-up!

@mchv
Member

mchv commented Aug 7, 2023

Excellent work @rtyley 💎

Context: #26335

In this change:

* Use AWS SDK v2 and non-blocking async code to avoid blocking a thread
  while downloading the Facia JSON data from S3
* Directly transform bytes into JsValue, without constructing a `String`
  first

Note that the work done between the two metrics `FrontDownloadLatency`
& `FrontDecodingLatency` has changed slightly - the conversion to the
basic JSON model (JsValue) now occurs in `FrontDownloadLatency`, rather
than the 'decoding' step.

We could get rid of the futureSemaphore and stop using the dedicated
blocking-operations pool here, but we'll leave that to another PR.

Co-authored-by: Ravi <7014230+arelra@users.noreply.github.com>
Co-authored-by: Ioanna Kokkini <ioanna.kokkini@guardian.co.uk>

This change introduces ETag-based caching with the new
https://github.com/guardian/etag-caching library, yielding
these benefits:

* Significant savings in terms of resources:
  * **Network**: `PressedPage`s are only re-downloaded if the S3 content has _changed_
  * **CPU**: Cached `PressedPage`s are only re-parsed if the S3 content has _changed_
* Retains the 'stay current' behaviour of the old, non-caching solution: S3 queried with every request to ensure the ETag is up-to-date.
* Also, given the cache is based on the ([S](https://github.com/blemale/scaffeine))[Caffeine](https://github.com/ben-manes/caffeine) caching library:
  * In-flight requests for a given key are unified, so 100 simultaneous requests for a key make just **1** fetch-and-parse

Currently `facia` runs on [`c6g.2xlarge`](https://instances.vantage.sh/aws/ec2/c6g.2xlarge) instances (16GB RAM), with a JVM heap of 12GB. I'm going to suggest that it's reasonable for the `ETagCache` to consume up to 4GB of the 12GB heap.

There are ~300 fronts, each of which can have [4 different variants](https://github.com/guardian/frontend/blob/f64c5b681f53a9c87ae1e76c529d08b3ad16ef6b/common/app/model/PressedPage.scala#L68-L86), so currently there could be up to 1200 different `PressedPage` objects. From [heapdump](https://drive.google.com/file/d/1yjfqLMpWqww6-3L8RBN0yrAf9hQHHpSC/view?usp=sharing) analysis, the largest `PressedPage` retains ~22MB of memory (most instances are smaller, averaging at 4MB). With the 4GB budget, and assuming a worst case of 22MB per `PressedPage`, we can afford to set a max size on the cache of ~180 entries. Although it's disappointing we can't hold _all_ of the `PressedPage`s, the [priorities of the eviction policy used by the Caffeine caching library](https://github.com/ben-manes/caffeine/wiki/Efficiency) should ensure that we get a good hit rate on the most in-demand fronts.

_Incidentally, heapdump analysis also shows that some structures within `PressedPage` are memory inefficient when considered from the perspective of holding many `PressedPage` objects in memory at once - using object pooling on `model.Tag` for instance, would probably lead to an 80% reduction of the total retained memory._
@rtyley
Member Author

rtyley commented Aug 9, 2023

Thanks all! I'm looking to merge this today, and deploy it to PROD - I would like to pair with a WebX person on the release? Would be good to keep an eye on the graphs as it goes out.

Some useful graphs:

@sophie-macmillan sophie-macmillan merged commit b663cd6 into main Aug 9, 2023
3 checks passed
@sophie-macmillan sophie-macmillan deleted the mob/facia-s3-aws-v2 branch August 9, 2023 10:51
@rtyley
Member Author

rtyley commented Aug 9, 2023

So far, I would say this looks healthy overall (and we're back to running with 3 EC2 instances!), tho' two metrics are perturbing, need follow up.

Good news

Average CPU utilisation in the EC2 ASG dropped when we deployed this PR - and rose when we halved the number of instances from 6 to 3 - but it did not quite double 👍 (graph):

image

More than half of S3 requests are now receiving NOT MODIFIED - and this proportion even rose slightly as we halved the number of instances from 6 to 3 (graph):

image

Decoding the bytes to PressedPages is now slightly faster - that, I can't quite explain! (graph):

image

Concerning metrics

Download time

Despite more than half of them being empty NOT MODIFIED responses, the overall average download time has risen (graph):

image

I can't easily explain this - I feel like either AWS SDK v2 is actually somehow slower than v1, or possibly it's down to the fact that we're using async and need to wait for a thread to process the result of the response?!

Garbage collection

Garbage collections have increased (graph) - this is kind of what you'd expect when you hold more objects in memory, but still not very nice to see. I think the main problem here is that the cache max size (180) is pretty small, and so this leads to quite a lot of cache thrashing - if we could hold more PressedPages in memory, we would see less garbage collection.

image

I feel like a lot of this bad stuff can be addressed by reducing the memory footprint of a PressedPage - something for a subsequent PR.

@prout-bot
Collaborator

Seen on FRONTS-PROD, ADMIN-PROD (created by @rtyley and merged by @sophie-macmillan 8 hours, 59 minutes and 39 seconds ago)

rtyley added a commit to guardian/facia-scala-client that referenced this pull request Aug 10, 2023
This change adds these improvements:

* Facia data is only re-downloaded & re-parsed if the S3 content has
  _changed_, thanks to ETag-caching - see https://github.com/guardian/etag-caching .
  This library has already been used in DotCom PROD with guardian/frontend#26338
* AWS SDK v2: the FAPI client itself now has a `fapi-s3-sdk-v2` artifact.

An example PR consuming this updated version of the FAPI client is at:

guardian/ophan#5506

Updated FAPI artifact layout
----------------------------

To use FAPI with the new AWS SDK v2 support, users must now have a
dependency on *two* FAPI artifacts:

* `fapi-s3-sdk-v2`
* `fapi-client-playXX`

Due to needing to support the matrix of:

* AWS SDK v1 & v2
* Play-JSON 2.7, 2.8, and eventually 2.9

...it's best not to try to produce an artifact that corresponds to
every single combination of those! Consequently, we provide
artifacts that are specific to the different versions of AWS SDK
(or at least, could do - if AWS SDK v1 was moved out of common code),
and artifacts that are specific to the different versions of
Play-JSON, and allow the user to combine them as needed. A
similar approach was used with `guardian/play-secret-rotation`:

guardian/play-secret-rotation#8

In order for the different artifacts to have interfaces they can
use to join together and become a single useful Facia client, we have
a `fapi-client-core` artifact. Any code that doesn't depend on the
JSON classes, or the actual AWS SDK version (which isn't much!), can
live in there. In particular, we have:

* `com.gu.facia.client.ApiClient`, an existing type that is now a
  trait, with 2 implementations - one that uses the existing
  `com.gu.facia.client.S3Client` abstraction on S3 behaviour
* `com.gu.facia.client.etagcaching.fetching.S3FetchBehaviour`,
  a new trait that exposes just enough interface to allow the
  conditional fetching used for ETag-based caching, but doesn't
  tie you to any specific version of the AWS SDK.
rtyley added a commit that referenced this pull request Aug 10, 2023
ETag Caching was introduced for Facia `PressedPage` JSON downloading
with #26338 in order to improve
scalability and address #26335,
but a limiting factor was the number of `PressedPage` objects that could
be stored in the cache.

With a max `PressedPage` size of 22MB and a memory budget of 4GB, a
cautious max cache size limit of only 180 `PressedPage` objects was set.
As a result, the cache hit rate was relatively low, and we saw elevated GC,
probably because of objects continually being evicted out of the small cache:

#26338 (comment)

The change in this new commit dramatically reduces the combined size of
the `PressedPage` objects held in memory, taking the average retained size
per `PressedPage` from 4MB to 0.5MB (based on a sample of 125 `PressedPage`
objects held in memory at the same time).

It does this by deduplicating the `Tag` objects held by the `PressedPage`s.
Previously, as the `Tag`s for different `PressedPage`s were deserialised
from JSON, many identical tags would be created over and over again, and held
in memory. After deduplication, those different `PressedPage`s will all
reference the _same_ `Tag` object for a given tag.

The deduplication is done as the `Tag`s are deserialised - a new cache
(gotta love caches!) holds `Tag`s keyed by their hashcode and tag id,
and if a new `Tag` is created with a matching key, it's thrown away, and
the old one is used instead. Thus we end up with just one instance of
that `Tag`, instead of many duplicated ones.
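
A minimal sketch of that deduplication idea, under stated assumptions: the `Tag` case class here is a hypothetical, much-reduced stand-in for the real `model.Tag`, `Interner` is an illustrative name, and an unbounded `TrieMap` is used in place of whatever cache the actual commit uses.

```scala
import scala.collection.concurrent.TrieMap

object TagInterning {
  // Hypothetical, much-reduced stand-in for the real model.Tag.
  final case class Tag(id: String, webTitle: String)

  final class Interner {
    // Keyed by (hashCode, id), mirroring the description above.
    private val canonical = TrieMap.empty[(Int, String), Tag]

    // Returns the first-seen instance for this key. A freshly deserialised
    // duplicate is simply dropped and becomes eligible for garbage collection.
    def intern(tag: Tag): Tag =
      canonical.putIfAbsent((tag.hashCode, tag.id), tag).getOrElse(tag)

    def size: Int = canonical.size
  }
}
```

Interning two equal `Tag` values returns the same object reference both times, so every `PressedPage` that passes its tags through `intern` while deserialising ends up sharing one instance per tag.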

See also:

* https://en.wikipedia.org/wiki/String_interning - a similar technique
  used by Java for Strings: https://www.geeksforgeeks.org/interning-of-string/
rtyley added a commit that referenced this pull request Sep 5, 2023
rtyley added a commit to guardian/facia-scala-client that referenced this pull request Jan 5, 2024
Development

Successfully merging this pull request may close these issues.

"Too many operations in progress" in FrontJsonFapi fetching
8 participants