feat(vfpu): add missing instructions + prefix support (C000[X,Y,Z]) #160

Merged
merged 11 commits into from
Apr 17, 2024

@SK83RJOSH SK83RJOSH commented Jan 6, 2024

I generated most of the missing instructions using a tool I created and have confirmed the accuracy of the code gen via tests I'm writing for a vector math library. Though it would be wise to do a regression test of this commit against the rust-psp samples. 🙂

This table details the supported instructions, including their encoding/parameter formats for convenience. Anything not supported is likely best to be hand-rolled from this point on.

Instruction Support (99/113)

Status Instruction Operands Encoding
⬜ bvf vfpu-branch vfpu-branch
⬜ bvfl vfpu-branch vfpu-branch
⬜ bvt vfpu-branch vfpu-branch
⬜ bvtl vfpu-branch vfpu-branch
✅ lv.q vfpu-load16 vfpu-memory-quad
✅ lv.s vfpu-load4 vfpu-memory
⬜ lvl.q vfpu-load16 vfpu-memory-quad
⬜ lvr.q vfpu-load16 vfpu-memory-quad
✅ mfvc vfpu-control-gpr vfpu-gpr-control
✅ mtvc vfpu-control-gpr vfpu-gpr-control
✅ sv.q vfpu-store16 vfpu-memory-quad
✅ sv.s vfpu-store4 vfpu-memory
⬜ svl.q vfpu-store16 vfpu-memory-quad
⬜ svr.q vfpu-store16 vfpu-memory-quad
✅ vabs vector-unary vfpu-alu
✅ vadd vector-binary vfpu-alu
✅ vasin vector-unary vfpu-alu
✅ vavg vector-unary-reduce vfpu-alu
✅ vbfy1 vector-unary vfpu-alu
✅ vbfy2 vector-unary vfpu-alu
✅ vc2i vector-unary-expand4 vfpu-alu
⬜ vcmovf vfpu-condmove vfpu-condmove
⬜ vcmovt vfpu-condmove vfpu-condmove
⬜ vcmp vfpu-compare vfpu-alu-compare
✅ vcos vector-unary vfpu-alu
✅ vcrs vector-binary vfpu-alu
✅ vcrsp vector-binary vfpu-alu
✅ vcst vector-nullary-cst vector-imm5
✅ vdet vector-binary-reduce vfpu-alu
✅ vdiv vector-binary vfpu-alu
✅ vdot vector-binary-reduce vfpu-alu
✅ vexp2 vector-unary vfpu-alu
✅ vf2h vector-unary-reduce2 vfpu-alu
✅ vf2id vector-unary-scale vector-imm5
✅ vf2in vector-unary-scale vector-imm5
✅ vf2iu vector-unary-scale vector-imm5
✅ vf2iz vector-unary-scale vector-imm5
✅ vfad vector-unary-reduce vfpu-alu
✅ vfim vector-nullary-uimm16 vector-imm16
✅ vflush vfpu-static vfpu-fixedop
✅ vh2f vector-unary-expand2 vfpu-alu
✅ vhdp vector-binary-reduce vfpu-alu
✅ vhtfm2 vector-matrix-transform vfpu-alu-m1
✅ vhtfm3 vector-matrix-transform vfpu-alu-m1
✅ vhtfm4 vector-matrix-transform vfpu-alu-m1
✅ vi2c vector-unary-reduce vfpu-alu
✅ vi2f vector-unary-scale vector-imm5
✅ vi2s vector-unary-reduce2 vfpu-alu
✅ vi2uc vector-unary-reduce vfpu-alu
✅ vi2us vector-unary-reduce2 vfpu-alu
✅ vidt vector-nullary vfpu-alu
✅ viim vector-nullary-uimm16 vector-imm16
✅ vlgb vector-unary vfpu-alu
✅ vlog2 vector-unary vfpu-alu
✅ vmax vector-binary vfpu-alu
⬜ vmfvc vfpu-control-read vfpu-read-control
✅ vmidt matrix-nullary vfpu-alu
✅ vmin vector-binary vfpu-alu
✅ vmmov matrix-unary vfpu-alu
✅ vmmul matrix-binary vfpu-alu
✅ vmone matrix-nullary vfpu-alu
✅ vmov vector-unary vfpu-alu
✅ vmscl matrix-binary-scale vfpu-alu
⬜ vmtvc vfpu-control-write vfpu-write-control
✅ vmul vector-binary vfpu-alu
✅ vmzero matrix-nullary vfpu-alu
✅ vneg vector-unary vfpu-alu
✅ vnop vfpu-static vfpu-fixedop
✅ vnrcp vector-unary vfpu-alu
✅ vnsin vector-unary vfpu-alu
✅ vocp vector-unary vfpu-alu
✅ vone vector-nullary vfpu-alu
✅ vpfxd vfpu-prefix vfpu-prefix
✅ vpfxs vfpu-prefix vfpu-prefix
✅ vpfxt vfpu-prefix vfpu-prefix
✅ vqmul vector-binary vfpu-alu
✅ vrcp vector-unary vfpu-alu
✅ vrexp2 vector-unary vfpu-alu
✅ vrndf1 vector-nullary vfpu-alu
✅ vrndf2 vector-nullary vfpu-alu
✅ vrndi vector-nullary vfpu-alu
✅ vrnds vector-inullary vfpu-alu
✅ vrot vector-unary-rot vector-imm5
✅ vrsq vector-unary vfpu-alu
✅ vs2i vector-unary-expand2 vfpu-alu
✅ vsat0 vector-unary vfpu-alu
✅ vsat1 vector-unary vfpu-alu
✅ vsbn vector-binary vfpu-alu
✅ vsbz vector-unary vfpu-alu
✅ vscl vector-binary-scale vfpu-alu
✅ vscmp vector-binary vfpu-alu
✅ vsge vector-binary vfpu-alu
✅ vsgn vector-unary vfpu-alu
✅ vsin vector-unary vfpu-alu
✅ vslt vector-binary vfpu-alu
✅ vsocp vector-unary-expand2 vfpu-alu
✅ vsqrt vector-unary vfpu-alu
✅ vsrt1 vector-unary vfpu-alu
✅ vsrt2 vector-unary vfpu-alu
✅ vsrt3 vector-unary vfpu-alu
✅ vsrt4 vector-unary vfpu-alu
✅ vsub vector-binary vfpu-alu
✅ vsync vfpu-static vfpu-fixedop
✅ vt4444 vector-unary-reduce2 vfpu-alu
✅ vt5551 vector-unary-reduce2 vfpu-alu
✅ vt5650 vector-unary-reduce2 vfpu-alu
✅ vtfm2 vector-matrix-transform vfpu-alu
✅ vtfm3 vector-matrix-transform vfpu-alu
✅ vtfm4 vector-matrix-transform vfpu-alu
✅ vuc2ifs vector-unary-expand4 vfpu-alu
✅ vus2i vector-unary-expand2 vfpu-alu
⬜ vwbn vector-unary-mod vector-imm8
✅ vzero vector-nullary vfpu-alu

@SK83RJOSH
Contributor Author

@sajattack this would close #63, and we should create a new issue tracking the missing instructions :)

@SK83RJOSH
Contributor Author

SK83RJOSH commented Jan 6, 2024

I would also like to add that there's a possibility of implementing swizzling for all supported instructions, but it would require an additional TT muncher, or an additional macro that supports all combinations of: X, |X|, -|X|

I wasn't clever enough to do the muncher, and the latter would be a combinatorial explosion. I figured Saj or Potato might be better at macros though, so if either of you can come up with a clever way to do it I can go ahead and do another round of codegen.

EDIT: You already had a muncher for this, added prefix support to all arguments that support it :)
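For readers unfamiliar with the term, here is a minimal sketch of a TT muncher that accepts the three operand forms listed above (X, |X|, -|X|) without enumerating every combination. It is not the muncher rust-psp already has; `swizzle!`, `Comp`, and `Lane` are hypothetical names made up for this illustration, and the output is a plain array rather than an encoded prefix word.

```rust
/// Vector lane selected by a swizzle (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Comp { X, Y, Z, W }

/// One prefix element: a lane plus the modifier applied to it.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Lane {
    Plain(Comp),  // X
    Abs(Comp),    // |X|
    NegAbs(Comp), // -|X|
}

/// TT muncher: peels one element off the front of the token stream per
/// recursion step and appends the parsed result to the `[...]` accumulator.
macro_rules! swizzle {
    // Base case: nothing left to munch, emit the accumulated elements.
    (@munch [$($acc:tt)*]) => { [$($acc)*] };
    // -|X| is tried first so the leading `-` is consumed together with the bars.
    (@munch [$($acc:tt)*] - | $l:ident | , $($rest:tt)*) => {
        swizzle!(@munch [$($acc)* Lane::NegAbs(Comp::$l),] $($rest)*)
    };
    // |X|
    (@munch [$($acc:tt)*] | $l:ident | , $($rest:tt)*) => {
        swizzle!(@munch [$($acc)* Lane::Abs(Comp::$l),] $($rest)*)
    };
    // X
    (@munch [$($acc:tt)*] $l:ident , $($rest:tt)*) => {
        swizzle!(@munch [$($acc)* Lane::Plain(Comp::$l),] $($rest)*)
    };
    // Entry point: append a trailing comma so every element ends in `,`.
    ($($input:tt)+) => { swizzle!(@munch [] $($input)+ ,) };
}

/// Example use: three lanes, each with a different modifier.
pub fn example_prefix() -> [Lane; 3] {
    swizzle!(X, |Y|, -|Z|)
}
```

Each element costs one rule instead of one rule per combination, which is what avoids the combinatorial explosion. Input that matches none of the element rules falls through to the entry rule again and ends in a recursion-limit compile error, so a real macro would probably want a dedicated error arm.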

@SK83RJOSH SK83RJOSH changed the title feat(vfpu): add missing instructions feat(vfpu): add missing instructions + inline prefix support (C000[X,Y,Z]) Jan 6, 2024
@SK83RJOSH SK83RJOSH changed the title feat(vfpu): add missing instructions + inline prefix support (C000[X,Y,Z]) feat(vfpu): add missing instructions + prefix support (C000[X,Y,Z]) Jan 6, 2024
@SK83RJOSH
Contributor Author

@overdrivenpotato I think this PR is ready for review, if you'd do me the honors ❤️

@sajattack
Collaborator

@SK83RJOSH
VFPU math tests (using vsin and vcos) are failing
https://ci.mijalkovic.ca/teams/rust-psp/pipelines/rust-psp/jobs/run-tests-for-pr/builds/409.2

// cos(x) for x in radians, computed on the VFPU.
#[no_mangle]
pub unsafe extern "C" fn cosf(scalar: f32) -> f32 {
    let out: f32;
    vfpu_asm! (
        // Move the argument from the FPU register into VFPU register S000.
        "mfc1 {tmp}, {scalar}",
        "mtv {tmp}, S000",
        "nop",
        // Scale the argument by the VFPU_2_PI constant before vcos.
        "vcst.s S001, VFPU_2_PI",
        "vmul.s S000, S000, S001",
        "vcos.s S000, S000",
        // Move the result back into the FPU register.
        "mfv {tmp}, S000",
        "mtc1 {tmp}, {scalar}",
        "nop",
        scalar = inlateout(freg) scalar => out,
        tmp = out(reg) _,
        options(nostack, nomem),
    );
    out
}

// sin(x) for x in radians; same sequence as cosf, with vsin in place of vcos.
#[no_mangle]
pub unsafe extern "C" fn sinf(scalar: f32) -> f32 {
    let out: f32;
    vfpu_asm! (
        "mfc1 {tmp}, {scalar}",
        "mtv {tmp}, S000",
        "nop",
        "vcst.s S001, VFPU_2_PI",
        "vmul.s S000, S000, S001",
        "vsin.s S000, S000",
        "mfv {tmp}, S000",
        "mtc1 {tmp}, {scalar}",
        "nop",
        scalar = inlateout(freg) scalar => out,
        tmp = out(reg) _,
        options(nostack, nomem),
    );
    out
}

https://github.com/overdrivenpotato/rust-psp/blob/master/ci/tests/src/math_test.rs
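As a cross-check for what these two functions should return, here is a rough host-side model of the arithmetic. It assumes that `vsin`/`vcos` take their angle in quarter turns (they evaluate sin/cos of x·π/2) and that `VFPU_2_PI` is the constant 2/π, which is why the sequence multiplies by it first. The `*_model` functions are hypothetical helpers for illustration only, and the sketch says nothing about the hardware's actual precision.

```rust
use std::f32::consts::{FRAC_2_PI, FRAC_PI_2};

/// Host-side stand-in for `vcos.s`, assuming the input is in quarter turns.
pub fn vcos_model(quarter_turns: f32) -> f32 {
    (quarter_turns * FRAC_PI_2).cos()
}

/// Host-side stand-in for `vsin.s`, assuming the input is in quarter turns.
pub fn vsin_model(quarter_turns: f32) -> f32 {
    (quarter_turns * FRAC_PI_2).sin()
}

/// Mirrors the asm sequence: scale radians by 2/pi (VFPU_2_PI), then vcos.
pub fn cosf_model(radians: f32) -> f32 {
    vcos_model(radians * FRAC_2_PI)
}

/// Mirrors the asm sequence: scale radians by 2/pi (VFPU_2_PI), then vsin.
pub fn sinf_model(radians: f32) -> f32 {
    vsin_model(radians * FRAC_2_PI)
}
```

If the generated `vcst`, `vmul`, `vsin`, or `vcos` encodings were off, a comparison against a model like this (within a tolerance) is the kind of check that would catch it.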

@davidgfnet

BTW you could use the automated test generator at https://github.com/pspdev/vfpu-docs/ to generate your tests. It would need to be modified to output Rust code, but seems doable to me :)
Just my 2c really, I really hate writing tests so I only wrote a small subset of them (the ones that take more time to automate than just manually write).

@SK83RJOSH
Contributor Author

SK83RJOSH commented Jan 15, 2024

My plan was to write tests for a few instructions of each encoding, plus each of their flavors, since I think that should be rigorous enough to cover all of the codegen. But I'll most definitely take a look at that. 🙏

Ironically enough, this entire thing has led me down another rabbit hole atm, which is getting assertions/stack traces working correctly so I can make the user story for testing a bit less cumbersome... though after a week of plugging away at that I may have to accept what we have 😄
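The "few instructions of each encoding + each of their flavors" plan can be sketched as a small test-matrix generator. The mnemonic/encoding pairs are copied from the support table in this PR. Everything else is an assumption: the `s`/`p`/`t`/`q` size suffixes are taken to be the "flavors", and the function names are hypothetical. Not every mnemonic accepts every suffix, so a real harness would filter the list.

```rust
use std::collections::BTreeMap;

/// (mnemonic, encoding) pairs copied from the support table in this PR.
pub const SUPPORTED: &[(&str, &str)] = &[
    ("vabs", "vfpu-alu"),
    ("vadd", "vfpu-alu"),
    ("vsin", "vfpu-alu"),
    ("vcst", "vector-imm5"),
    ("vrot", "vector-imm5"),
    ("vfim", "vector-imm16"),
    ("viim", "vector-imm16"),
    ("vhtfm2", "vfpu-alu-m1"),
];

/// Size suffixes (single, pair, triple, quad). Assumption: these are the
/// "flavors"; a real harness must drop suffixes an instruction rejects.
pub const FLAVORS: &[&str] = &["s", "p", "t", "q"];

/// Pick up to `per_encoding` mnemonics for each encoding, in table order.
pub fn representatives(per_encoding: usize) -> BTreeMap<&'static str, Vec<&'static str>> {
    let mut picked: BTreeMap<&'static str, Vec<&'static str>> = BTreeMap::new();
    for &(inst, enc) in SUPPORTED {
        let bucket = picked.entry(enc).or_default();
        if bucket.len() < per_encoding {
            bucket.push(inst);
        }
    }
    picked
}

/// Expand the representatives into one test name per (instruction, flavor).
pub fn test_names(per_encoding: usize) -> Vec<String> {
    let mut names = Vec::new();
    for (enc, insts) in representatives(per_encoding) {
        for inst in insts {
            for flavor in FLAVORS {
                names.push(format!("{}::{}.{}", enc, inst, flavor));
            }
        }
    }
    names
}
```

Grouping by encoding rather than by mnemonic keeps the test count proportional to the number of codegen paths, which is the point of the plan described above.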

@SK83RJOSH
Contributor Author

SK83RJOSH commented Jan 21, 2024

Alrighty, added tests that should hit most if not all of the code gen, with the exception of fixedop + imm16, since those are pretty straightforward and near identical to other instructions. The only thing I haven't tested here, that I would like to, is matrix operations.

Hold off on merging this until I can validate those, since it would be very unfortunate if gum breaks.

Okey-dokey, I tested the samples + my own project for regressions in gum, only found one so this should be ready to merge. 🙂

@sajattack sajattack merged commit b8c9734 into overdrivenpotato:master Apr 17, 2024
1 check passed