Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use tinyset to optimize in memory cache RAM usage #1507

Closed
wants to merge 2 commits into from
Closed

Use tinyset to optimize in memory cache RAM usage #1507

wants to merge 2 commits into from

Conversation

james7132
Copy link
Contributor

A good chunk of the overhead of in-memory cache comes from the large number of HashSets used in a lot of it's data structures.

tinyset provides a set type optimized for holding types that boil down to 64-bit integers. For a good number of cases, it will not allocate any heap memory until several elements are inserted (like small/arrayvec), and is only the size of a pointer otherwise.

This PR includes one unsafe function, because tinyset::Fits64 requires one unsafe fn as a part of the trait.

To likewise maximize memory savings, when converting to a u64 for use in Set64, Id<T> shuffles the snowflake parts around to shift most of the 1s towards the lower bits.

As an alternative, a roaring bitmap might be another alternative.

@github-actions github-actions bot added c-cache Affects the cache crate c-model Affects the model crate labels Feb 1, 2022
@james7132 james7132 added m-breaking change Breaks the public API. t-perf Improves performace and removed c-model Affects the model crate labels Feb 1, 2022
@7596ff
Copy link
Contributor

7596ff commented Feb 1, 2022

Could you provide benchmarks at a hundred, ten thousand, or a million guilds? Also, any unsafe code needs to be removed and CI needs to be addressed.

@github-actions github-actions bot added the c-model Affects the model crate label Feb 1, 2022
@james7132
Copy link
Contributor Author

The unsafe fn is required by the trait. The actual implementation is 100% safe. I'll set up a benchmark soon.

@james7132
Copy link
Contributor Author

Units are in bytes allocated by the system allocator.

guild count  | hashset | tinyset | change
1 | 196817 | 195697 | -0.57%
10 | 224218 | 214534 | -4.32%
100 | 416956 | 343188 | -17.69%
1000  | 1427224 | 983548 | -31.09%
10000 | 13006648 | 8395768 | -35.45%
100000 | 102821832 | 66121672 | -35.69%
1000000 | 1636599912 | 1049397352 | -35.88%

Benchmarking memory usage is a lot harder than it needs to be, orz.

@baptiste0928
Copy link
Member

I think this should be behind a feature flag in the model crate to avoid adding unnecessary dependencies for those who don't use the in-memory cache.

@@ -479,6 +509,26 @@ mod tests {
);
}

/// Test that creating an ID via [`Id::new`] with a value of zero panics.
Copy link
Contributor

@HTGAzureX1212 HTGAzureX1212 Feb 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test doesn't seem to test for panic on zero.

Documentation should be updated?

@7596ff
Copy link
Contributor

7596ff commented Feb 13, 2022

This hasn't had much movement since the initial discussion in Discord. Ultimately, we will not be using tinyset, and if any changes are to be made here, they should be made in a separate PR.

@7596ff 7596ff closed this Feb 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c-cache Affects the cache crate c-model Affects the model crate m-breaking change Breaks the public API. t-perf Improves performace
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants