ppc64-QVM: direct mov_sx_rx via mtvsrws+xscvspdp on POWER9+ #392

Merged
ec- merged 1 commit into ec-:main from runlevel5:mov_sx_rx_direct
May 4, 2026

Conversation

Contributor

@runlevel5 runlevel5 commented Apr 30, 2026

Replaces the stw+lfs memory round-trip with two register ops. The earlier mtvsrwz attempt failed because that instruction writes word element 1 of the VSR while xscvspdp reads word element 0; mtvsrws splats the GPR into all four word elements so the bit pattern is also at element 0.

Gated behind USE_ISA_3_0; older CPUs keep the memory path.
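
The word-element mismatch described above can be illustrated with a small host-side model. This is only a sketch for readers, not the code from this PR: the `vsr_t` type and the `model_*` helpers are hypothetical names, the VSR is modeled as four 32-bit words in the ISA's big-endian element numbering, and special cases such as signaling NaNs are ignored.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy model of a 128-bit VSR: four word elements, numbered 0..3
   in the ISA's big-endian element order. Hypothetical type. */
typedef struct { uint32_t w[4]; } vsr_t;

/* mtvsrwz: the GPR word is zero-extended into doubleword 0, so the
   value lands in word element 1 and word element 0 is zero.
   Doubleword 1 is modeled as zero (an assumption of this sketch). */
static vsr_t model_mtvsrwz(uint32_t gpr) {
    vsr_t v = { { 0u, gpr, 0u, 0u } };
    return v;
}

/* mtvsrws: the GPR word is splatted into all four word elements. */
static vsr_t model_mtvsrws(uint32_t gpr) {
    vsr_t v = { { gpr, gpr, gpr, gpr } };
    return v;
}

/* xscvspdp: read word element 0 as a single-precision value and
   widen it to double precision. */
static double model_xscvspdp(vsr_t v) {
    float f;
    memcpy(&f, &v.w[0], sizeof f);
    return (double)f;
}

/* Raw IEEE-754 bit pattern of a float, as it would sit in a GPR. */
static uint32_t float_bits(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Compare the two sequences on the bit pattern of 1.5f. */
static void check_model(void) {
    uint32_t bits = float_bits(1.5f);
    /* mtvsrwz leaves element 0 zero, so the conversion sees 0.0. */
    assert(model_xscvspdp(model_mtvsrwz(bits)) == 0.0);
    /* mtvsrws puts the pattern at element 0, so 1.5 comes back. */
    assert(model_xscvspdp(model_mtvsrws(bits)) == 1.5);
}
```

Calling `check_model()` from a test harness passes under these assumptions: the `mtvsrwz` model yields 0.0 because element 0 is empty, while the `mtvsrws` model recovers the original value.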

Currently being tested. Please do not review.

@runlevel5 runlevel5 marked this pull request as ready for review May 3, 2026 13:33
@runlevel5
Contributor Author

@ec- this one is ready for review

@ec- ec- merged commit ed1064f into ec-:main May 4, 2026
28 checks passed
@runlevel5 runlevel5 deleted the mov_sx_rx_direct branch May 5, 2026 02:05
