Skip to content

Conversation

@bnjbvr
Copy link
Member

@bnjbvr bnjbvr commented Jan 27, 2021

No description provided.

Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this should be a nice improvement when the CPU supports it!

Could you add a lowering test with the lzcnt flag set?

Also, if it's not too much trouble, it looks like tzcnt also exists and I see the TODO under ctz below; would you be able to add that as well? I'm fine with doing this in a separate PR if you'd like to get this in first.

@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:x64 Issues related to x64 codegen labels Jan 27, 2021
@bnjbvr bnjbvr force-pushed the x64-lzcnt branch 2 times, most recently from 0515844 to e1d5306 Compare January 28, 2021 11:33
@bnjbvr
Copy link
Member Author

bnjbvr commented Jan 28, 2021

@cfallin done! Also done popcnt, same category of easy todos.

For Clz/Ctz, in theory we could use Tzcount/Lzcount for i128 too: that requires local control flow (e.g. for Clz, lzcount on high component, if it's non zero it's the right result, otherwise the result is lzcount on the low component), so a new Vcode inst. Could be done in a subsequent PR.

Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM; thanks!

@bnjbvr bnjbvr merged commit 2275519 into bytecodealliance:main Jan 29, 2021
@bnjbvr bnjbvr deleted the x64-lzcnt branch January 29, 2021 18:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cranelift:area:x64 Issues related to x64 codegen cranelift Issues related to the Cranelift code generator

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants