Adjust size of in-memory block cache #47

Open
1 of 3 tasks
lidel opened this issue Feb 18, 2023 · 6 comments


lidel commented Feb 18, 2023

bifrost-gateway runs with an in-memory 2Q cache sized at 1024 blocks.

2Q is an enhancement over the standard LRU cache in that it tracks both frequently and recently used entries separately. This avoids a burst in access to new entries from evicting frequently used entries.

Current cache performance: ~50% cache HIT

Cache metrics from bifrost-stage1-ny after one day (~48%):

ipfs_http_blockstore_cache_hit 7.273594e+06
ipfs_http_blockstore_cache_requests 1.515003e+07

And a second sample from another day (~50%):

ipfs_http_blockstore_cache_hit 2.7508843e+07
ipfs_http_blockstore_cache_requests 5.4966088e+07

iiuc the above means that the in-memory "frecency" cache of 1024 blocks produces a cache HIT ~50% of the time.
This is not that surprising: every website will cause the same parent blocks to be read multiple times, once for every subresource on a page.
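For reference, the percentages quoted above are simply `cache_hit / cache_requests`. A small Go sketch of that arithmetic, using the counter values from the two samples (the helper name `hitRatio` is made up for this example):

```go
package main

import "fmt"

// hitRatio returns hits/requests as a fraction in [0, 1].
// It returns 0 when no requests were recorded, to avoid dividing by zero.
func hitRatio(hits, requests float64) float64 {
	if requests <= 0 {
		return 0
	}
	return hits / requests
}

func main() {
	// Counter values copied from the two metric samples above.
	samples := []struct {
		label          string
		hits, requests float64
	}{
		{"sample 1", 7.273594e+06, 1.515003e+07},
		{"sample 2", 2.7508843e+07, 5.4966088e+07},
	}
	for _, s := range samples {
		fmt.Printf("%s: %.1f%% cache HIT\n", s.label, 100*hitRatio(s.hits, s.requests))
	}
}
```

Note that these counters are cumulative since process start, so a ratio computed this way is an average over the whole uptime rather than a recent window.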

We run on machines that have 64GiB of memory and bifrost-gateway only utilizes ~5GiB.
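A rough way to reason about the memory headroom: the cache is bounded by block count, so its worst-case footprint is roughly the number of blocks times the largest block size. The sketch below assumes a 1 MiB per-block upper bound, which is a hypothetical figure chosen for illustration and not something measured on bifrost-gateway; real blocks are often much smaller.

```go
package main

import "fmt"

// assumedMaxBlockBytes is a hypothetical per-block upper bound (1 MiB).
// It is an assumption for this estimate, not a measured value.
const assumedMaxBlockBytes int64 = 1 << 20

// worstCaseBytes returns an upper bound on cache memory for a given
// number of cached blocks, ignoring per-entry bookkeeping overhead.
func worstCaseBytes(blocks int64) int64 {
	return blocks * assumedMaxBlockBytes
}

func main() {
	const gib = float64(1 << 30)
	// Cache sizes mentioned in this thread.
	for _, blocks := range []int64{1024, 5120, 16384} {
		fmt.Printf("%6d blocks -> at most %.1f GiB\n",
			blocks, float64(worstCaseBytes(blocks))/gib)
	}
}
```

Under that assumption, even the larger sizes stay well within a 64GiB machine, which is why experimenting with a bigger cache looks safe.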

Proposal: increase cache size

Improving the cache hit ratio here won't help with things like video seeking or fetching big files, but it will affect how fast popular websites and directory enumerations load, by avoiding thrashing of the most popular content.

Tasks

  • Refactor cache size configuration: remove the CLI parameter and use an ENV variable instead (to match the plan from "Update Saturn logger configuration" #43 and the new configuration convention agreed with George).
  • With the ability to tweak cache size via an env variable, run experiments on bifrost-stage1-ny: increase the block cache size, initially x5 (to 5120 blocks), and see whether it improves cache hit or produces diminishing returns.
  • Once we find the optimal cache size on staging, update the implicit default.
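The first task could look roughly like the sketch below: read the size from an environment variable and fall back to an implicit default. This is an illustration, not the actual bifrost-gateway code. The variable name `BLOCK_CACHE_SIZE` and the log line format are taken from later comments in this thread; the function name `blockCacheSize` and the validation rules are assumptions.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// defaultBlockCacheSize mirrors the current implicit default of 1024 blocks.
const defaultBlockCacheSize = 1024

// blockCacheSize parses the raw env value (hypothetical helper).
// An empty value means "not set" and yields the default.
func blockCacheSize(raw string) (int, error) {
	if raw == "" {
		return defaultBlockCacheSize, nil
	}
	n, err := strconv.Atoi(raw)
	if err != nil {
		return 0, fmt.Errorf("invalid BLOCK_CACHE_SIZE %q: %w", raw, err)
	}
	if n <= 0 {
		return 0, fmt.Errorf("BLOCK_CACHE_SIZE must be positive, got %d", n)
	}
	return n, nil
}

func main() {
	size, err := blockCacheSize(os.Getenv("BLOCK_CACHE_SIZE"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("Block cache size: %d\n", size)
}
```

With this shape, an experiment on staging is just a restart with a different value, for example `BLOCK_CACHE_SIZE=5120`, and the default only changes once the optimal size is known.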
@lidel lidel added this to the M0.3 milestone Feb 18, 2023

lidel commented Feb 24, 2023

I've restarted bifrost-stage1-ny with BLOCK_CACHE_SIZE increased from 1024 to 2048.

2023/02/24 00:57:10 Starting bifrost-gateway 2023-02-22-c0c8fa3
2023/02/24 00:57:10 Block cache size: 2048

I will check Friday EOD if cache hit ratio improved, or latency/error rate decreased in any significant way.


lidel commented Feb 24, 2023

after ~12h:

ipfs_http_blockstore_cache_hit 1.0137958e+07
ipfs_http_blockstore_cache_requests 1.9706903e+07

hits still at ~51%, which confirms that we could increase it further to save on more roundtrips.

I am setting it to 4096 now:

2023/02/24 13:03:22 Starting bifrost-gateway 2023-02-24-dca4ba9
2023/02/24 13:03:22 Block cache size: 4096


lidel commented Feb 24, 2023

in 6h 4096 produced "only" 40% hit ratio

ipfs_http_blockstore_cache_hit 1.297952e+06
ipfs_http_blockstore_cache_requests 3.199352e+06


lidel commented Feb 27, 2023

After the weekend, the BLOCK_CACHE_SIZE=4096 result on bifrost-stage1-ny was still around ~41%:

ipfs_http_blockstore_cache_hit 5.595748e+06
ipfs_http_blockstore_cache_requests 1.3580514e+07


lidel commented Apr 3, 2023

Cache hit % is pretty successful across instances, including staging where we deployed #61 with Graph API enabled:

[Screenshot: 2023-04-03_23-39]

I am going to double the cache size on staging to 8192 and check if it makes any difference for the graph backend.

Done:

2023/04/03 21:47:39 Starting bifrost-gateway 2023-04-03-e68f6ca


lidel commented Apr 4, 2023

8192 produces similar hit ratio:

[Screenshot 2023-04-04 at 21-14-52: bifrost-gw staging metrics, Project Rhea Grafana dashboard]

Memory usage is minimal:

[Screenshot 2023-04-04 at 21-21-29: bifrost-gw staging metrics, Project Rhea Grafana dashboard]

I've bumped it to 16384:

bifrost-gateway version 2023-04-04-575d307
2023/04/04 19:18:51 Starting bifrost-gateway 2023-04-04-575d307
[..]
2023/04/04 19:18:51 BLOCK_CACHE_SIZE: 16384
