Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AMDGPU][MC][GFX11] Lacking MC tests for several gfx11 instructions #56887

Closed
Sisyph opened this issue Aug 2, 2022 · 19 comments
Closed

[AMDGPU][MC][GFX11] Lacking MC tests for several gfx11 instructions #56887

Sisyph opened this issue Aug 2, 2022 · 19 comments
Assignees
Labels

Comments

@Sisyph
Copy link
Contributor

Sisyph commented Aug 2, 2022

We have tests for the dpp versions of these instructions, but not the base _e32 or _e64 versions.

These tests will be useful for catching regressions in planned work to support true16 instructions.
Currently, these instructions will not be emitted using registers above 127, but they can be assembled or disassembled with those high registers. The asm and disasm will be incorrect from the perspective of what the hardware will do. For example

Input:

v_ceil_f16 v0, v255

Output:

v_ceil_f16_e32 v0, v255                 ; encoding: [0xff,0xb9,0x00,0x7e]

True behavior of object code with 0xff,0xb9,0x00,0x7e :

v_ceil_f16    v0.l, v127.h

VOP2:
v_mul_f16
v_min_f16
v_max_f16
v_ldexp_f16
v_add_f16
v_sub_f16

VOP1:
V_CVT_F16_U16
V_CVT_F16_I16
V_CVT_U16_F16
V_CVT_I16_F16
V_RCP_F16
V_SQRT_F16
V_RSQ_F16
V_LOG_F16
V_EXP_F16
V_SIN_F16
V_COS_F16
V_FREXP_MANT_F16
V_FREXP_EXP_I16_F16
V_FLOOR_F16
V_CEIL_F16
V_TRUNC_F16
V_RNDNE_F16
V_FRACT_F16
V_CVT_NORM_I16_F16
V_CVT_NORM_U16_F16
V_NOT_B16
V_CVT_I32_I16

@dpreobra Can you please help with creating the tests?

@Sisyph Sisyph added backend:AMDGPU mc Machine (object) code labels Aug 2, 2022
@llvmbot
Copy link
Collaborator

llvmbot commented Aug 2, 2022

@llvm/issue-subscribers-backend-amdgpu

@dpreobra
Copy link
Collaborator

dpreobra commented Aug 2, 2022

Sure. I can also generate tests in true 16-bit syntax (SP3-like) - they can be disabled by XFAIL for now.
Do you mind if I split large existing tests (like gfx11_vop123.s) by formats to make them more manageable?

@Sisyph
Copy link
Contributor Author

Sisyph commented Aug 2, 2022

Thanks! I've got no issue with splitting large tests. Generating tests in true 16-bit syntax (.l/.h) will need to happen eventually, so if you have a clear idea how to do that, please go ahead.

@dpreobra
Copy link
Collaborator

dpreobra commented Aug 2, 2022

Do you prefer tests without VGPRs>127? These tests may be correctly assembled by SP3, for example:

v_ceil_f16    v1, v127

It is accepted by SP3 and interpreted as follows:

v_ceil_f16    v1.l, v127.l

If syntax w/o suffices will be supported by llvm assembler, such tests will require no modifications after switching to new syntax.

@Sisyph
Copy link
Contributor Author

Sisyph commented Aug 2, 2022

Yes, I would prefer tests without VGPRs>127 (in the _e32 versions of the instructions).

Unfortunately, it is not so easy to deduce whether a register is 16 or 32 bit in the AsmParser, so I think suffixes .l/.h will be required. Therefore, the current tests should have no suffixes, but when the True16 support is in place they will require modification (hopefully very mechanically) to be parsed correctly.

@dpreobra
Copy link
Collaborator

Are you planning to support .h/.l syntax for VOP3/VOP3P? And if yes, are you planning to drop op_sel/op_sel_hi modifiers support?

@Sisyph
Copy link
Contributor Author

Sisyph commented Aug 19, 2022

Are you planning to support .h/.l syntax for VOP3/VOP3P? And if yes, are you planning to drop op_sel/op_sel_hi modifiers support?

Good question, I don't have an answer to this yet. Yes I want to support the .h/.l syntax in VOP3/VOP3P.
At that point, the op_sel/op_sel_hi seem kind of redundant. I still need to analyze usage of op_sel/op_sel_hi through the backend. op_sel/op_sel_hi could be only changed in asm syntax, or references to it in codegen could be changed as well. It looks like only a nomenclature change to me, the concepts are unchanged. It should also be considered how packed instructions will be impacted. If you have some opinions on this I would welcome it.

@dpreobra
Copy link
Collaborator

Well, it seems to me that it would be feasible to support .h/.l syntax and also drop op_sel support, at least in MC layer. Not sure about Codegen.

I'm asking because I want to prepare test generator for new syntax before you start working on these features. Any estimations on when this may happen? Thanks.

@Sisyph
Copy link
Contributor Author

Sisyph commented Aug 23, 2022

I think it is feasible as well. Thanks for working on the test suite. I am working on the true16 features and would guess given the current priorities the first emission with that syntax would be in about a month.

@Sisyph
Copy link
Contributor Author

Sisyph commented Aug 25, 2022

I am almost done with what can be called True16 patch1. In this patch, I will restrict VOP1, VOP2, and VOPC instructions using 16 bit operands to only allow VGPRs <=127 for those operands.

In addition to what coverage we have now, I think there should be the following MC Test coverage.

  • For each vop1, vop2, vopc instructions with 16 bit operands
    • For each 16 bit operand in the instruction
      • Try to assemble the base mnemonic using a vgpr above 127
        • expect a legal _e64 (vop3) encoding
      • Try to assemble the base mnemonic using a vgpr above 127 and a dpp operand
        • expect a legal _e64_dpp encoding
      • DPP8 likewise
      • Try to assemble the mnemonic with forced _e32 suffix using a vgpr above 127
        • expect an error

Disassembly

  • For any vop1,2,c encoding using a vgpr above 127
    • This should never have been emitted by the compiler and will never be emitted.
    • It currently disassembles to legal (but incorrect) _e32 assembly
    • It should disassemble with error after my True16 Patch 1

Does any of that overlap with your planned work @dpreobra?

@dpreobra
Copy link
Collaborator

I planned to generate tests for these cases, but it will take some time.
Right now my scripts can only generate tests for true 16 bit operands using .l/.h suffices.
I suggest that you push your patch with a few basic tests, and then I'll generate tests for all cases described above. Will that work for you?

It should disassemble with error after my True16 Patch 1

I assume we do not need full-fledged tests for this case, do we? Because eventually (with full true 16 bit operand support) every encoded VGPR value would be valid. Maybe a few tests for this case would suffice?

@Sisyph
Copy link
Contributor Author

Sisyph commented Aug 25, 2022

Ok, thanks for letting me know.
In True16 Patch 1 we still use 32 bit VGPRs, does that make a difference to your script?
The .l/.h is actually orthogonal to those assembly cases above, because if you assumed every tests above without a suffix referred to .l, then we would want duplicates of all of them with .h

I will plan to generate all these tests that are required for the short term.

Good point about the disassembler tests not needing to be full-fledged because of the eventual change again.

@dpreobra
Copy link
Collaborator

In True16 Patch 1 we still use 32 bit VGPRs, does that make a difference to your script?

The script supports a ‘legacy’ mode with v0..v127 and no .h/.l and a ‘true16bit’ mode with 128 .h/.l registers. It would not be difficult to generate tests we discussed above. Does this have higher priority than GFX11 test coverage analysis and update?

@Sisyph
Copy link
Contributor Author

Sisyph commented Aug 25, 2022

In True16 Patch 1 we still use 32 bit VGPRs, does that make a difference to your script?

The script supports a ‘legacy’ mode with v0..v127 and no .h/.l and a ‘true16bit’ mode with 128 .h/.l registers. It would not be difficult to generate tests we discussed above. Does this have higher priority than GFX11 test coverage analysis and update?

From that description, neither of the modes directly supports the tests I suggested.
At this point I would prefer to generate those myself.

I think they are roughly equal in priority. Thanks for being flexible; no need to change your focus.

@dpreobra
Copy link
Collaborator

There is a comment in VOP3P.DPP tests stating that op_sel must be 0 and op_sel_hi must be all 1:

// For test purpose only. OP_SEL has to be set to all 0 and OP_SEL_HI has to be set to all 1

I saw this requirement in SPG but was unsure if it was a mistake or not. DOT opcodes do not support op_sel so this requirement is satisfied automatically. But fma_mix opcodes use op_sel in a special manner so I'm not sure.

SP3 does allow using arbitrary op_sel values, for example:

v_fma_mix_f32 v5, sel_hi(v1), sel_lo(v2), v3 dpp8:[7,6,5,4,3,2,1,0]

Do you think we shall enforce this requirement on fma_mix opcodes?

@Sisyph
Copy link
Contributor Author

Sisyph commented Aug 30, 2022

My comment is about what is done in code generation. AMDGPUDAGToDAGISel::SelectFMAD_FMA Says we only use these instructions when "there is actually an operand using the conversion from f16." @arsenm wrote that code and can confirm if that is the behavior

Do you think we shall enforce this requirement on fma_mix opcodes?

So I think its fine to test all op_sel combinations and accept them in the Assembler.

@dpreobra
Copy link
Collaborator

dpreobra commented Aug 30, 2022

Thanks, I see. Actually my question was about dpp versions of fma_mix opcodes, sorry that I did not make it clear.
Unfortunately, there are no codegen tests for dpp versions of these instructions.

Ok, I'll generate tests for dpp versions assuming that op_sel is supported.

@dpreobra
Copy link
Collaborator

Enclosed are tests for true 16 bit operands (VOP1, VOP2 and VOPC). They are intended as a drop-in replacement for existing tests when true 16bit support is implemented. Please let me know if you find any issues while using the tests.

test-update.zip

@Sisyph
Copy link
Contributor Author

Sisyph commented Oct 11, 2022

Enclosed are tests for true 16 bit operands (VOP1, VOP2 and VOPC). They are intended as a drop-in replacement for existing tests when true 16bit support is implemented. Please let me know if you find any issues while using the tests.

test-update.zip

I am using those test for my current True16 development. They are very useful, thanks! I think this ticket can be considered done, I will close it.

@Sisyph Sisyph closed this as completed Oct 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants