decompression: optimize ExtractOffset for Arm #136

Merged (1 commit) on Aug 13, 2021

Conversation

Contributor

JunHe77 commented Aug 6, 2021

Inspired by kExtractMasksCombined, this patch replaces the table
lookup in ExtractOffset with shifts. On Arm the codegen is two shift
ops (lsl+lsr). Compared with the previous ldr, which has a 4-cycle
latency, the lsl+lsr pair needs only 2 cycles.
A slight uplift (~0.3%) was observed on N1, and ~3% on A72.

Signed-off-by: Jun He <jun.he@arm.com>
Change-Id: I5b53632d22d9e5cf1a49d0c5cdd16265a15de23b
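
For readers unfamiliar with the technique, the following is a minimal sketch of replacing a mask table lookup with a shift on a packed constant. It is not the code from this patch: the function names, the mask values (0, 0xFF, 0xFFFF, 0 for tag types 0 to 3), and the packed constant are assumptions chosen for illustration, and the instructions a compiler actually emits may differ from the lsl+lsr pair described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical table-based variant: one mask per 2-bit tag type.
// The mask values are an assumption made for this illustration.
// Reading the table costs a memory load on each call.
inline uint32_t ExtractOffsetTable(uint32_t val, size_t tag_type) {
  static constexpr uint32_t kMasks[4] = {0, 0xFF, 0xFFFF, 0};
  return val & kMasks[tag_type];
}

// Hypothetical shift-based variant: the same four masks, each stored in a
// 16-bit lane of one 64-bit constant. The mask is selected by shifting the
// constant right by 16 * tag_type and keeping the low 16 bits, so no load
// from a table is needed.
inline uint32_t ExtractOffsetShift(uint32_t val, size_t tag_type) {
  constexpr uint64_t kMasksCombined = 0x0000FFFF00FF0000ull;
  return val &
         static_cast<uint32_t>((kMasksCombined >> (tag_type * 16)) & 0xFFFF);
}

// Checks that both variants agree for every valid tag type (0 to 3).
inline void CheckVariantsAgree(uint32_t val) {
  for (size_t tag_type = 0; tag_type < 4; ++tag_type) {
    assert(ExtractOffsetTable(val, tag_type) ==
           ExtractOffsetShift(val, tag_type));
  }
}
```

Both variants assume tag_type is in the range 0 to 3. The shift form trades a load for a few register-only operations, which is where a latency saving of the kind reported above would come from.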

@google-cla google-cla bot added the cla: yes label Aug 6, 2021
Contributor

atdt commented Aug 10, 2021

@pwnall This looks good to me too.

@pwnall pwnall self-assigned this Aug 11, 2021
Member

pwnall commented Aug 11, 2021

@JunHe77 Thank you for this contribution!

I started the process of getting this PR merged through our internal repository. When the process completes, this PR will be merged.

Contributor Author

JunHe77 commented Aug 12, 2021

Thanks a lot for reviewing this, @pwnall, @atdt.

@pwnall pwnall merged commit 5c87bc6 into google:master Aug 13, 2021