Consolidate _mm_prefetch #547

Closed
jserv opened this issue Oct 22, 2022 · 1 comment · Fixed by #550
jserv commented Oct 22, 2022

The current _mm_prefetch does not behave as the Intel documentation states:

Fetch the line of data from memory that contains address p to a location in the cache hierarchy specified by the locality hint i.

We shall consolidate:

  1. Refine the function prototype, i.e., void _mm_prefetch(char const *p, int i)
  2. Provide the corresponding test cases. See test/x86/sse.c (function test_simde_mm_prefetch) in SIMDe.
  3. Handle the locality hint properly instead of ignoring it.

The implementation from SIMDe:

void simde_mm_prefetch (const void* p, int i) {
    switch(i) {
      case SIMDE_MM_HINT_NTA:
        __builtin_prefetch(p, 0, 0);
        break;
      case SIMDE_MM_HINT_T0:
        __builtin_prefetch(p, 0, 3);
        break;
      case SIMDE_MM_HINT_T1:
        __builtin_prefetch(p, 0, 2);
        break;
      case SIMDE_MM_HINT_T2:
        __builtin_prefetch(p, 0, 1);
        break;
      case SIMDE_MM_HINT_ENTA:
        __builtin_prefetch(p, 1, 0);
        break;
      case SIMDE_MM_HINT_ET0:
        __builtin_prefetch(p, 1, 3);
        break;
      case SIMDE_MM_HINT_ET1:
        __builtin_prefetch(p, 1, 2);
        break;
      case SIMDE_MM_HINT_ET2:
        __builtin_prefetch(p, 1, 1); /* write hint, like the other E* cases; the snippet as posted passed 0 here */
        break;
    }
}

Reference: SIMDe Issue #897.
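
As a rough illustration of items 1 and 3, here is a minimal sketch of a prefetch wrapper that takes the proposed `char const *p, int i` prototype and dispatches on the hint. Everything prefixed `sketch_` / `SKETCH_` is hypothetical, including the numeric hint encoding (bit 2 = write, low two bits = GCC-style locality); the real `_MM_HINT_*` values come from the target headers and may differ. The `switch` is needed because `__builtin_prefetch` expects compile-time constants for its second and third arguments.

```c
#include <assert.h>

/* Hypothetical hint encoding used only by this sketch:
 * bit 2 set  -> prefetch with intent to write
 * bits 0..1  -> temporal locality as understood by __builtin_prefetch
 *               (0 = non-temporal, 3 = keep in all cache levels). */
enum sketch_hint {
    SKETCH_HINT_NTA = 0,
    SKETCH_HINT_T2 = 1,
    SKETCH_HINT_T1 = 2,
    SKETCH_HINT_T0 = 3,
    SKETCH_HINT_ENTA = 4,
    SKETCH_HINT_ET2 = 5,
    SKETCH_HINT_ET1 = 6,
    SKETCH_HINT_ET0 = 7
};

/* Decode helpers, handy for checking the table in a test. */
static inline int sketch_hint_rw(int i) { return (i >> 2) & 1; }
static inline int sketch_hint_locality(int i) { return i & 3; }

/* Prototype follows the one proposed in this issue. */
static inline void sketch_mm_prefetch(char const *p, int i)
{
    switch (i) {
    case SKETCH_HINT_NTA:  __builtin_prefetch(p, 0, 0); break;
    case SKETCH_HINT_T2:   __builtin_prefetch(p, 0, 1); break;
    case SKETCH_HINT_T1:   __builtin_prefetch(p, 0, 2); break;
    case SKETCH_HINT_T0:   __builtin_prefetch(p, 0, 3); break;
    case SKETCH_HINT_ENTA: __builtin_prefetch(p, 1, 0); break;
    case SKETCH_HINT_ET2:  __builtin_prefetch(p, 1, 1); break;
    case SKETCH_HINT_ET1:  __builtin_prefetch(p, 1, 2); break;
    case SKETCH_HINT_ET0:  __builtin_prefetch(p, 1, 3); break;
    default: break; /* unknown hints are ignored in this sketch */
    }
}

/* Smoke test helper: issue every hint once and return how many were tried.
 * A prefetch has no observable result, so this only checks that each
 * hint decodes to a valid (rw, locality) pair and does not fault. */
static inline int sketch_prefetch_all_hints(char const *p)
{
    int tried = 0;
    for (int i = SKETCH_HINT_NTA; i <= SKETCH_HINT_ET0; i++) {
        assert(sketch_hint_rw(i) == 0 || sketch_hint_rw(i) == 1);
        assert(sketch_hint_locality(i) >= 0 && sketch_hint_locality(i) <= 3);
        sketch_mm_prefetch(p, i);
        tried++;
    }
    return tried;
}
```

Calling `sketch_prefetch_all_hints(buf)` on any valid buffer should return 8. Since prefetching does not change program-visible state, a test case along these lines can only act as a smoke test, which is roughly what the SIMDe test referenced above appears to aim for.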

howjmay added a commit to howjmay/sse2neon that referenced this issue Oct 30, 2022
jserv commented Oct 30, 2022

Evan Nemerson, the original author of SIMDe, commented as follows:

ARM C Language Extensions (ACLE) has __pld and, in 1.1+, __pldx. VS is the only ARM compiler I'm aware of targeting ARM which doesn’t support ACLE.

@howjmay, you should check if the generated code with __builtin_prefetch is identical to the counterpart with __pld or __pldx.
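
One way to set up that comparison is sketched below. The macro names and the opt-in switch `SKETCH_USE_ACLE` are hypothetical; `__pldx(access_kind, cache_level, retention_policy, addr)` is the ACLE form, and the code assumes a toolchain whose `<arm_acle.h>` actually provides it. The pairing of each `__builtin_prefetch` locality with a `__pldx` argument triple is my expectation for AArch64, not something confirmed in this thread, so the generated assembly should be inspected rather than trusted.

```c
#include <stddef.h>

#if defined(SKETCH_USE_ACLE) /* hypothetical opt-in, e.g. -DSKETCH_USE_ACLE */
#include <arm_acle.h>
/* access_kind 0 = read; cache_level 0/1/2 = L1/L2/L3;
 * retention_policy 0 = temporal (keep), 1 = streaming. */
#define SKETCH_PREFETCH_T0(p)  __pldx(0, 0, 0, (p))
#define SKETCH_PREFETCH_T1(p)  __pldx(0, 1, 0, (p))
#define SKETCH_PREFETCH_T2(p)  __pldx(0, 2, 0, (p))
#define SKETCH_PREFETCH_NTA(p) __pldx(0, 0, 1, (p))
#else
/* Default path: compiler builtin, read prefetch with locality 3..0. */
#define SKETCH_PREFETCH_T0(p)  __builtin_prefetch((p), 0, 3)
#define SKETCH_PREFETCH_T1(p)  __builtin_prefetch((p), 0, 2)
#define SKETCH_PREFETCH_T2(p)  __builtin_prefetch((p), 0, 1)
#define SKETCH_PREFETCH_NTA(p) __builtin_prefetch((p), 0, 0)
#endif

/* Small kernel to compile both ways: sums an array while prefetching
 * a few elements ahead. The prefetch must not change the result. */
static inline long sketch_sum_prefetched(const int *data, size_t n)
{
    long total = 0;
    for (size_t k = 0; k < n; k++) {
        /* Clamp the look-ahead index so it stays inside the array. */
        size_t ahead = (k + 16 < n) ? k + 16 : n - 1;
        SKETCH_PREFETCH_T0(&data[ahead]);
        total += data[k];
    }
    return total;
}
```

Compiling this file twice for an ARM target, once with and once without `-DSKETCH_USE_ACLE`, and diffing the output of `-S` would show whether the two forms produce the same prefetch instructions.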

howjmay added a commit to howjmay/sse2neon that referenced this issue Nov 2, 2022
@jserv jserv closed this as completed in #550 Nov 2, 2022