SIMD traces for mulhi usage #107

Closed
jfbastien opened this issue Nov 3, 2017 · 5 comments

Comments

@jfbastien
Member

Action item: Intel folks to check in their traces how the instructions are used (variables or constants as inputs).

@PeterJensen
Contributor

PeterJensen commented Nov 3, 2017

As I recall, the issue was whether to restrict just one of the source operands to be constant for this instruction:

  __m128i _mm_mulhi_epi16 (__m128i a, __m128i b);  (PMULHW)
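
For reference, here is a minimal scalar sketch of what one 16-bit lane of this instruction computes: a signed 16x16 multiply that keeps only the upper 16 bits of the 32-bit product. The helper names `floor_div_65536` and `mulhi_s16` are made up for illustration and are not part of any library; the real instruction applies this to all eight lanes at once.

```c
#include <stdint.h>

/* Hypothetical helper: floor(v / 65536) without relying on the
   implementation-defined right shift of a negative value. */
static int32_t floor_div_65536(int32_t v) {
    return v >= 0 ? v / 65536 : -((-v + 65535) / 65536);
}

/* Scalar model of one lane of PMULHW / _mm_mulhi_epi16:
   widen to 32 bits, multiply, keep the high 16 bits. */
static int16_t mulhi_s16(int16_t a, int16_t b) {
    int32_t product = (int32_t)a * (int32_t)b;
    return (int16_t)floor_div_65536(product);
}
```

With this model, `mulhi_s16(20091, 1000)` gives 306 and `mulhi_s16(-1, 1)` gives -1. Nothing in the arithmetic distinguishes a constant operand from a variable one.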

It would be helpful if someone (James @jzern ?) could point out where in the WebP benchmark this instruction is used.

EDIT: I did a search for mulhi in the https://github.com/webmproject/libwebp repo and got a bunch of hits in the dsp directory. Are those the right ones to look at?

@jzern

jzern commented Nov 3, 2017

On the portable-intrinsics branch there are examples for NEON, SSE2 and portable-intrinsics; the second argument in every call is a constant. The NEON half of the portable intrinsics could be refined like dec_neon.c; it's using the same constant values as SSE2 for convenience in the implementation.

https://chromium.googlesource.com/webm/libwebp/+/0af22e17d67e6b81fee6d42a53ce6f40aad416e1/src/dsp/dec_wasm.c#115
https://chromium.googlesource.com/webm/libwebp/+/0af22e17d67e6b81fee6d42a53ce6f40aad416e1/src/dsp/dec_neon.c#975
https://chromium.googlesource.com/webm/libwebp/+/0af22e17d67e6b81fee6d42a53ce6f40aad416e1/src/dsp/dec_sse2.c#88

@PeterJensen
Contributor

Thanks @jzern !

I was looking at the ARM NEON instruction manual for the VQDMULH instruction and didn't see that it requires one of the source operands to be constant. If both SSE and NEON support both operands being non-constant, a potential WASM instruction for mulhi might as well do that too, right? Maybe I didn't read the NEON documentation right. Here's the info I'm looking at:

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0489g/CJAJIIGG.html

(I couldn't find a permalink to the actual instruction, so you have to search for it :( )

@jzern

jzern commented Nov 4, 2017

I was looking at the ARM NEON instruction manual for the VQDMULH instruction and didn't see that it requires one of the source operands to be constant. If both SSE and NEON support both operands being non-constant, a potential WASM instruction for mulhi might as well do that too, right?

You're right, NEON doesn't. The intrinsics do offer a scalar variant, though. So two non-constant operands is an option; one thing that needs to be considered, however, is the range. Because of the doubling that NEON does, one vector is forced to 15 bits.
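
The range point can be illustrated with a scalar sketch that compares one lane of an SSE2-style mulhi with one lane of a VQDMULH-style saturating doubling multiply. This is one reading of the comment above, not something confirmed in the thread: because the doubled product is what gets truncated, matching the mulhi result means passing half the multiplier, which leaves that operand effectively 15 bits of range. The helper names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: floor(v / 65536), portable for negative v. */
static int64_t floor_div_65536(int64_t v) {
    return v >= 0 ? v / 65536 : -((-v + 65535) / 65536);
}

/* One lane of SSE2-style mulhi: high 16 bits of a * b. */
static int16_t mulhi_s16(int16_t a, int16_t b) {
    return (int16_t)floor_div_65536((int64_t)a * (int64_t)b);
}

/* One lane of a VQDMULH-style multiply: high 16 bits of 2 * a * b,
   saturated to the int16_t range. Only a == b == -32768 saturates. */
static int16_t qdmulh_s16(int16_t a, int16_t b) {
    int64_t high = floor_div_65536(2 * (int64_t)a * (int64_t)b);
    if (high > INT16_MAX) {
        high = INT16_MAX;
    }
    return (int16_t)high;
}
```

Under this model, `qdmulh_s16(a, k / 2)` equals `mulhi_s16(a, k)` for any even `k`; for example, both `mulhi_s16(1234, 20092)` and `qdmulh_s16(1234, 10046)` give 378. Passing the same operands to both gives different results: `mulhi_s16(16384, 16384)` is 4096 while `qdmulh_s16(16384, 16384)` is 8192.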

@dtig
Member

dtig commented Oct 4, 2022

SIMD proposal merged, closing as no longer relevant.

@dtig dtig closed this as completed Oct 4, 2022
fitzgen pushed a commit to fitzgen/meetings that referenced this issue Sep 14, 2023