Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add S3Fifo eviction for memory #303

Merged
merged 6 commits into from
Apr 8, 2024
Merged

Conversation

Ishiihara
Copy link
Contributor

@Ishiihara Ishiihara commented Apr 5, 2024

What's changed and what's your intention?

This PR adds S3Fifo eviction policy based on https://blog.jasony.me/system/cache/2023/08/01/s3fifo and https://github.com/matrixorigin/matrixone/blob/c582b7d3ce8f7c177f2bd268a37375ae1a5b18bd/pkg/fileservice/fifocache/fifo.go#L24. It use a small queue and a main queue. The small queue is also served as the ghost queue.

Checklist

  • I have written the necessary rustdoc comments
  • I have added the necessary unit tests and integration tests
  • I have passed make all (or make fast instead if the old tests are not modified) in my local environment.

Related issues or PRs (optional)

@Ishiihara
Copy link
Contributor Author

@MrCroxx Thank for you maintaining foyer. I found this library is very well designed and are currently evaluating if it could be used in https://github.com/chroma-core/chroma for various caching scenarios. Chroma is one of the most popular embedding stores for RAG.

Chroma will use both memory and SSD for caches for different scenarios. We will also need different eviction policies for different scenarios. S3Fifo is a very promising policy and that's why I would like to collaborate with you on this. The current PR is till a draft and are subject to change and improvements.

I would love to discuss with you more on how we can collaborate and discuss a bit more on how to use foyer as a hybrid cache.

Looking forward to speaking with you!

BTW, I used to work at PingCAP before joining Chroma.

@MrCroxx
Copy link
Owner

MrCroxx commented Apr 6, 2024

Hi @Ishiihara . Thank you for your attention and contribution to foyer. I apologize for just seeing your PR now as I was on call these days. It will be an honor to have foyer as a component of Chroma.

To be honest, I'm not very familiar with what the workload of Chroma is like. And foyer still lacks the ability to handle small KVs as efficiently as it could be. I'm willing to hear from you about where you will use foyer and how can we make foyer better. 🥰

I'm glad to review this PR carefully after my shift. And I glanced at it briefly, it is implemented very well. And just a reminder, you can run make fast in your local environment first as a quick test.

And I'm glad to meet my former colleague here. I'm looking forward to discussing more via IM or offline. But I'm really careful to leave my personal IM on Github. I'll send you an email with my contact.

Copy link
Owner

@MrCroxx MrCroxx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution @Ishiihara 🥰. This PR generally LGTM. There is only some code styles don't match foyer. I can help fix them if you like, cause I've also waiting for a S3Fifo implementation for a long time. I've not finished reviewing (I'm still on-calling). I'll finish it soon.

foyer-memory/src/eviction/s3fifo.rs Outdated Show resolved Hide resolved
foyer-memory/src/eviction/s3fifo.rs Outdated Show resolved Hide resolved
foyer-memory/src/eviction/s3fifo.rs Outdated Show resolved Hide resolved
Signed-off-by: MrCroxx <mrcroxx@outlook.com>
Signed-off-by: MrCroxx <mrcroxx@outlook.com>
@MrCroxx MrCroxx changed the title Add S3Fifo eviction for memory feat: add S3Fifo eviction for memory Apr 7, 2024
@MrCroxx MrCroxx added the feature New feature or request label Apr 7, 2024
Signed-off-by: MrCroxx <mrcroxx@outlook.com>
Copy link

codecov bot commented Apr 7, 2024

Codecov Report

Attention: Patch coverage is 88.65979% with 22 lines in your changes are missing coverage. Please review.

Project coverage is 77.40%. Comparing base (6bea810) to head (6f3c413).

❗ Current head 6f3c413 differs from pull request most recent head 9346e73. Consider uploading reports for the commit 9346e73 to get more accurate results

Files Patch % Lines
foyer-memory/src/eviction/s3fifo.rs 92.00% 12 Missing ⚠️
foyer-memory/src/cache.rs 76.74% 10 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #303      +/-   ##
==========================================
+ Coverage   77.10%   77.40%   +0.30%     
==========================================
  Files          61       62       +1     
  Lines        7478     7671     +193     
==========================================
+ Hits         5766     5938     +172     
- Misses       1712     1733      +21     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@MrCroxx
Copy link
Owner

MrCroxx commented Apr 7, 2024

Hi @Ishiihara , thanks again for your contribution. I've modified the related codes to make them more compatible with foyer.

This PR still needs some unit tests with s3fifo (e.g. https://github.com/MrCroxx/foyer/blob/main/foyer-memory/src/eviction/lfu.rs#L425-L576). Could you please add some? 🙏 Then the PR is ready to be merged.

Signed-off-by: MrCroxx <mrcroxx@outlook.com>
@MrCroxx
Copy link
Owner

MrCroxx commented Apr 7, 2024

Hi @xiaguan @Little-Wallace , maybe you are also interested in this PR. 🥰

@MrCroxx
Copy link
Owner

MrCroxx commented Apr 7, 2024

hit ratio bench result:

zif_exp, cache_size             fifo            lru             lfu             s3fifo          moka
0.90, 0.005                     16.24%          19.22%          32.37%          32.39%          33.50%
0.90, 0.01                      22.55%          26.20%          38.54%          39.20%          37.92%
0.90, 0.05                      41.05%          45.56%          55.37%          56.63%          55.25%
0.90,  0.1                      51.06%          55.68%          63.82%          65.06%          64.20%
0.90, 0.25                      66.81%          71.17%          76.21%          77.26%          77.12%
1.00, 0.005                     26.62%          31.10%          44.16%          44.15%          45.62%
1.00, 0.01                      34.38%          39.17%          50.63%          51.29%          50.72%
1.00, 0.05                      54.04%          58.76%          66.79%          67.85%          66.89%
1.00,  0.1                      63.15%          67.60%          73.93%          74.92%          74.38%
1.00, 0.25                      76.18%          79.95%          83.63%          84.39%          84.38%
1.05, 0.005                     32.67%          37.71%          50.26%          50.21%          51.85%
1.05, 0.01                      40.97%          46.10%          56.74%          57.40%          57.09%
1.05, 0.05                      60.44%          65.03%          72.04%          73.02%          72.28%
1.05,  0.1                      68.93%          73.12%          78.49%          79.37%          79.00%
1.05, 0.25                      80.38%          83.73%          86.78%          87.42%          87.41%
1.10, 0.005                     39.02%          44.50%          56.26%          56.20%          57.90%
1.10, 0.01                      47.60%          52.93%          62.61%          63.24%          63.05%
1.10, 0.05                      66.59%          70.95%          76.92%          77.76%          77.27%
1.10,  0.1                      74.24%          78.07%          82.54%          83.28%          83.00%
1.10, 0.25                      84.18%          87.10%          89.57%          90.06%          90.08%
1.50, 0.005                     81.17%          85.27%          88.90%          89.10%          89.89%
1.50, 0.01                      86.91%          89.87%          92.25%          92.56%          92.79%
1.50, 0.05                      94.77%          96.04%          96.96%          97.10%          97.07%
1.50,  0.1                      96.65%          97.50%          98.06%          98.14%          98.15%
1.50, 0.25                      98.36%          98.81%          99.04%          99.06%          99.09%

@xiaguan
Copy link
Contributor

xiaguan commented Apr 7, 2024

Thanks so much for your contribution. I've actually tried to code up an S3FIFO before and ended up failing miserably 😭 .
It's pretty awesome to see a RAG user around here 😄 .
I don't seem to catch the implementation of the ghost queue from the paper, or maybe the cache admission strategy? But I'm really surprised at how well it performed in the tests.
I think tinyufo from pingora could be a great reference for an admission strategy?

@xiaguan
Copy link
Contributor

xiaguan commented Apr 7, 2024

Regarding a more long-term integration with foyer. Could you briefly describe your workload, cache size, expected goals, etc?
I'm wondering if we'd be better off using a Btree for our disk storage since LSM trees have the write amplification issue?

@MrCroxx
Copy link
Owner

MrCroxx commented Apr 7, 2024

Regarding a more long-term integration with foyer. Could you briefly describe your workload, cache size, expected goals, etc? I'm wondering if we'd be better off using a Btree for our disk storage since LSM trees have the write amplification issue?

Hi @xiaguan . Are you talking about disk cache? If it is, foyer doesn't use LSM trees as the disk cache. We can discuss on it in another issue. (Or maybe I should transfer this kind of issues into github discussions. foyer repo has already enabled it but is not used properly.)

@xiaguan
Copy link
Contributor

xiaguan commented Apr 7, 2024

I tryied to impl ghost queue based on your code.
But it seems not wokr well 🤔 .
LGTM

@MrCroxx
Copy link
Owner

MrCroxx commented Apr 7, 2024

I tryied to impl ghost queue based on your code. But it seems not wokr well 🤔 . LGTM

Thanks @xiaguan . Can you send another PR with your changes? I'm interested in it. Let's find out why.

@xiaguan
Copy link
Contributor

xiaguan commented Apr 7, 2024

I tryied to impl ghost queue based on your code. But it seems not wokr well 🤔 . LGTM

Thanks @xiaguan . Can you send another PR with your changes? I'm interested in it. Let's find out why.

#306

Signed-off-by: MrCroxx <mrcroxx@outlook.com>
@MrCroxx MrCroxx marked this pull request as ready for review April 8, 2024 07:31
@MrCroxx MrCroxx merged commit 41eebba into MrCroxx:main Apr 8, 2024
9 checks passed
MrCroxx added a commit that referenced this pull request Apr 17, 2024
* Add S3Fifo eviction for memory

* fix: refine s3fifo, fix some bugs

Signed-off-by: MrCroxx <mrcroxx@outlook.com>

* fix: fix license

Signed-off-by: MrCroxx <mrcroxx@outlook.com>

* refactor: expose s3fifo, fix hakari, add s3fifo fuzzy test

Signed-off-by: MrCroxx <mrcroxx@outlook.com>

* bench: add s3fifo to hit ratio bench

Signed-off-by: MrCroxx <mrcroxx@outlook.com>

* test: add s3fifo uts

Signed-off-by: MrCroxx <mrcroxx@outlook.com>

---------

Signed-off-by: MrCroxx <mrcroxx@outlook.com>
Co-authored-by: MrCroxx <mrcroxx@outlook.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants