Skip to content

Conversation

@sayantn
Copy link
Contributor

@sayantn sayantn commented Nov 28, 2025

Fix some x86 intrinsics too

  • _mm_{u}comineq_sh (these should be unordered, i.e. should return true if either operand is nan)
  • _mm_mask_cvt{epi16_epi8, pd_ps, pd_epi32} (top 64 bits should be 0)

@sayantn sayantn force-pushed the intrinsic-test branch 3 times, most recently from 0ecf65e to f2e124b Compare November 29, 2025 06:34
@sayantn
Copy link
Contributor Author

sayantn commented Nov 29, 2025

r? @folkertdev

Just lmk before merging, will need to remove the last commit

@rustbot
Copy link
Collaborator

rustbot commented Nov 29, 2025

Failed to set assignee to folk: invalid assignee

Note: Only org members with at least the repository "read" role, users with write permissions, or people who have commented on the PR may be assigned.

@sayantn sayantn marked this pull request as ready for review November 29, 2025 12:27
@folkertdev
Copy link
Contributor

cc @tgross35 was there some caveat to this approach?

@tgross35
Copy link
Contributor

Not sure about the rest but just regarding formatting, these were added at a time that we didn't have Debug for f16

@folkertdev
Copy link
Contributor

Right just double-checking that there is not some edge case where we'd get incorrect outputs by upcasting to f32.

Copy link
Contributor

@folkertdev folkertdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be good to go then with the most recent commit removed

@folkertdev folkertdev added this pull request to the merge queue Nov 29, 2025
_mm512_castsi256_si512
# _mm512_conj_pch

# Clang bug
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we document what clang bug(s) these are?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I was thinking about putting some clang PRs to correct these. in summary,

  • _mm{256}_extract_epi{8,16} does (int) (unsigned {char,short}), but afaik C does sign-extension whenever the target type is signed, so this is getting sign-extended, whereas Intel specifies that it should zero-extend.
  • _mm512_mask_reduce_{max,min}_{ps,pd} uses {f32,f64}::{MIN,MAX} in place of masked-out values, but Intel specifies it should use {f32,f64}::{NEG_INFINITY,INFINITY}

Merged via the queue into rust-lang:main with commit c56916c Nov 29, 2025
73 checks passed
@sayantn sayantn deleted the intrinsic-test branch November 30, 2025 10:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants