feature request - support for Zc* extensions #633
Comments
The standard I had a closer look at the
This is just my opinion. Any thoughts?
as for i'm not entirely sure what motivates bottom line --
I agree! But before we start implementing that, we should wait for GCC support. Unfortunately, the upcoming GCC 13(.1) does not include
Interesting results! Thanks for sharing! 35% would be quite amazing, but I'm not sure what the "cost" of that might be (additional hardware resources, impact on the critical path, etc.). But the NEORV32's execution stage is a multi-cycle architecture, so maybe the additional hardware overhead would be quite small. I think I'll need to have a closer look at this again.
Moreover, Zcb has become mandatory for the RVA2023 profile.
Oh, I did not expect that. However, RVA is the application-class profile (MMU, 64-bit, ...), which is out of scope of this project right now 🙈 I had another look at the Anybody volunteering to do a PR? 😅
The However, the big problem with these two instructions is that they do not decompress into a single 32-bit counterpart. Instead, they decompress into a sequence of several different instructions, which would require a lot of hardware overhead. So I think that the "costs" clearly exceed the benefits here. What do you think? 🤔
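To make the distinction above concrete, here is a minimal Python sketch (not taken from the NEORV32 sources). `expand_c_addi` follows the standard one-to-one mapping of the compressed `C.ADDI` onto a single 32-bit `ADDI`. `expand_push_like` is a purely hypothetical stand-in for a push-style instruction; its register names (`r0`, `r1`, ...) and stack layout are assumptions chosen for illustration, not the encoding of any ratified instruction.

```python
def expand_c_addi(half: int) -> list[int]:
    """Expand a 16-bit C.ADDI encoding into its single 32-bit ADDI counterpart."""
    assert (half & 0x3) == 0b01 and (half >> 13) == 0b000, "not a C.ADDI encoding"
    rd = (half >> 7) & 0x1F
    # 6-bit immediate: bit 12 holds imm[5], bits 6:2 hold imm[4:0]
    imm = ((half >> 2) & 0x1F) | (((half >> 12) & 0x1) << 5)
    if imm & 0x20:  # sign-extend the 6-bit immediate
        imm -= 0x40
    # ADDI rd, rd, imm: imm[11:0] | rs1 | funct3=000 | rd | opcode=0x13
    word = ((imm & 0xFFF) << 20) | (rd << 15) | (rd << 7) | 0x13
    return [word]  # always exactly one 32-bit instruction


def expand_push_like(num_regs: int) -> list[str]:
    """Hypothetical push-style instruction: one 16-bit word, many operations.

    Illustration only: the register names and frame layout are assumptions.
    """
    frame = 4 * num_regs
    ops = [f"sw r{i}, {-4 * (i + 1)}(sp)" for i in range(num_regs)]
    ops.append(f"addi sp, sp, -{frame}")
    return ops  # num_regs stores plus one stack-pointer adjustment


if __name__ == "__main__":
    print([hex(w) for w in expand_c_addi(0x0505)])  # c.addi a0, 1
    print(expand_push_like(3))
```

The first function is pure combinational translation, which is why a simple decompressor in front of the decode stage is enough for it. The second needs some form of sequencing (a counter or small state machine) to emit its operations one after another, which is where the extra hardware cost would come from.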
The main advantages from my biased point of view are reducing the load on the instruction fetch channel, less cache pollution, and it should have a positive impact on interrupt handler latency. However, the most valuable aspect is the reduction of byte-code size to at least the level of Cortex-M0. I think technical difficulties are unavoidable, and it's hard to objectively evaluate their value until they come into play :).
That's true! In its best case, this instruction saves up to 27 further 16-bit words from being fetched.
Also true. However, embedded single-core systems might not need any kind of caches if you use fast on-chip memory.
I'm not sure about this. Execution time would be identical. However, due to cache pollution / bus congestion there might be a relevant speedup.
Maybe, but technically such a complex instruction isn't "RISC" anymore, right? 😅
I think there are several benchmark examples provided by the people who invented these extended compressed instructions. The benefit (looking purely at code size and performance) is quite impressive!
The proposed Zc* extensions described here have recently been ratified. The Zca extension is of particular interest for "small cores" with limited memory resources. This is just a placeholder for what appears to be a non-trivial improvement.