Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement 8 and optionally 16 bit integers with build.rs generated enum #18

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

GnomedDev
Copy link

@GnomedDev GnomedDev commented Jan 5, 2024

This PR starts off by splitting the fundamental structure and functions into two different macros, to allow for a custom definition of the internals without having to rewrite all the helpers and trait implementations.

I then changed NonMaxU8, NonMaxI8 (and optionally NonMaxU16/NonMaxI16) to be custom definitions, using a custom enum with NonMax::MAX variants to allow us to choose where the niche is stored, preventing having to generate xor instructions for new and get, as tested and validated on godbolt.

16 bit enum repr is limited to a feature flag, as it extends build time by ~12 seconds on a 5800x3d, and is the reason that the new method is implemented as a transmute (using a match extends build times to minutes!)

@GnomedDev
Copy link
Author

GnomedDev commented Jan 5, 2024

I'm going to rewrite to use build.rs, to avoid bloating the codebase and make the codegen logic live in the repo instead of the Python script I used to generate the repr enums. Done!

@GnomedDev GnomedDev marked this pull request as draft January 5, 2024 13:55
@GnomedDev GnomedDev changed the title Manually implement 8 bit integers with an enum representation Implement 8 and optionally 16 bit integers with build.rs generated enum Jan 5, 2024
@GnomedDev GnomedDev marked this pull request as ready for review January 5, 2024 22:30
@LPGhatguy
Copy link
Owner

Hey, thanks for the PR! I've been travelling with bad internet and haven't had time until now to sit down and look at this.

This is a super interesting idea! It's neat that we can cut down on some instructions at runtime by pushing the work to build time. Controlling the representation of a niche by representing numbers as an enum is clever.

I am concerned about increasing the complexity of this crate's implementation. The build time increase for the 16-bit variants is a lot and I don't know if the special case for 8-bit integers is worthwhile on its own. I wonder if there is a compiler feature we could help with that would make defining custom niches easier. Conversely, I wonder if there are any edge cases we might hit in the compiler by having enums with so many variants.

What is the runtime performance difference of this change? Do you have an application where this change makes a difference on your runtime performance?

@GnomedDev
Copy link
Author

I don't have a situation in which this measurably improves runtime performance, I'm just really interested in overall software micro-optimization and this is an interesting efficiency improvement. This would make nonmax a truly pure-gain abstraction, at least for 8 bit and 16 bit, instead of a trade off of niches/validity and the extra CPU instruction per access or store.

We could use the std internal feature that allows for NonZero to exist, but that would lock it behind nightly. That could be implemented in a different PR under a nightly feature, but for now I want to make this improvement as open to everyone.

This definitely increases complexity in the nonmax repo, but it's mostly self-contained and not a very complex project to start with. This is already running into the compiler's inability to handle large enums with the 16 bit feature, but it causes no errors at this size, just a slow build time that I may look into improving if I get into working on the compiler soon.

@GnomedDev
Copy link
Author

I've run into a bit of an issue, this causes MSRV to be bumped to 1.57 since this is relying on const transmute (1.56) and const unreachable_unchecked (1.57). Shall I include the MSRV bump in this PR, open it separately, or do something else?

@GnomedDev
Copy link
Author

I've just managed to reduce build times from 12 seconds to around 7 to 8 seconds, I'm currently running rustc profiling tools to fix the compiler's handling of this, hopefully the feature should become redundant soon.

@GnomedDev
Copy link
Author

@LPGhatguy can we get this merged?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants