[AMDGPU][GFX10][DOC][NFC] Update assembler syntax description
Summary of changes:
- Update MUBUF lds syntax (see https://reviews.llvm.org/D124485).
- Add v_cvt_pkrtz_f16_f32_dpp, v_cvt_pkrtz_f16_f32_sdwa.
- Update SMEM syntax (see https://reviews.llvm.org/D127314).
- Enable op_sel for v_add_nc_u16, v_sub_nc_u16 (see https://reviews.llvm.org/D123594).
- Minor bug fixes and improvements.
dpreobra committed Jul 4, 2022
1 parent cce64e7 commit f90f0e8
Showing 18 changed files with 469 additions and 450 deletions.
764 changes: 384 additions & 380 deletions llvm/docs/AMDGPU/AMDGPUAsmGFX10.rst

Large diffs are not rendered by default.

42 changes: 21 additions & 21 deletions llvm/docs/AMDGPU/gfx10_hwreg.rst
@@ -41,27 +41,27 @@ or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.

Defined register *names* include:

==================== ==========================================
Name                 Description
==================== ==========================================
HW_REG_MODE          Shader writeable mode bits.
HW_REG_STATUS        Shader read-only status.
HW_REG_TRAPSTS       Trap status.
HW_REG_HW_ID1        Id of wave, simd, compute unit, etc.
HW_REG_HW_ID2        Id of queue, pipeline, etc.
HW_REG_GPR_ALLOC     Per-wave SGPR and VGPR allocation.
HW_REG_LDS_ALLOC     Per-wave LDS allocation.
HW_REG_IB_STS        Counters of outstanding instructions.
HW_REG_SH_MEM_BASES  Memory aperture.
HW_REG_TBA_LO        tba_lo register.
HW_REG_TBA_HI        tba_hi register.
HW_REG_TMA_LO        tma_lo register.
HW_REG_TMA_HI        tma_hi register.
HW_REG_FLAT_SCR_LO   flat_scratch_lo register.
HW_REG_FLAT_SCR_HI   flat_scratch_hi register.
HW_REG_XNACK_MASK    xnack_mask register.
HW_REG_POPS_PACKER   pops_packer register.
==================== ==========================================
============================== ==========================================
Name                           Description
============================== ==========================================
HW_REG_MODE                    Shader writeable mode bits.
HW_REG_STATUS                  Shader read-only status.
HW_REG_TRAPSTS                 Trap status.
HW_REG_HW_ID1                  Id of wave, simd, compute unit, etc.
HW_REG_HW_ID2                  Id of queue, pipeline, etc.
HW_REG_GPR_ALLOC               Per-wave SGPR and VGPR allocation.
HW_REG_LDS_ALLOC               Per-wave LDS allocation.
HW_REG_IB_STS                  Counters of outstanding instructions.
HW_REG_SH_MEM_BASES            Memory aperture.
HW_REG_TBA_LO                  tba_lo register.
HW_REG_TBA_HI                  tba_hi register.
HW_REG_TMA_LO                  tma_lo register.
HW_REG_TMA_HI                  tma_hi register.
HW_REG_FLAT_SCR_LO             flat_scratch_lo register.
HW_REG_FLAT_SCR_HI             flat_scratch_hi register.
HW_REG_XNACK_MASK              xnack_mask register.
HW_REG_POPS_PACKER             pops_packer register.
============================== ==========================================

Examples:

13 changes: 13 additions & 0 deletions llvm/docs/AMDGPU/gfx10_opt_0d447d.rst
@@ -0,0 +1,13 @@
..
**************************************************
* *
* Automatically generated file, do not edit! *
* *
**************************************************
.. _amdgpu_synid_gfx10_opt_0d447d:

opt
===

This is an optional operand. It must be used if and only if :ref:`lds<amdgpu_synid_lds>` is omitted.
@@ -5,7 +5,7 @@
* *
**************************************************
.. _amdgpu_synid_gfx10_opt:
.. _amdgpu_synid_gfx10_opt_847aed:

opt
===
@@ -5,16 +5,18 @@
* *
**************************************************
.. _amdgpu_synid_gfx10_soffset_c40a5a:
.. _amdgpu_synid_gfx10_soffset_73dae7:

soffset
=======

An offset added to the base address to get memory address.
An offset from the base address.

* If offset is specified as a register, it supplies an unsigned byte offset.
* If offset is specified as a 21-bit immediate, it supplies a signed byte offset.

Note that an *immediate* offset may be specified using either :ref:`simm21<amdgpu_synid_simm21>` operand or :ref:`offset21s<amdgpu_synid_smem_offset21s>` modifier, but not both.

*Size:* 1 dword.

*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`, :ref:`null<amdgpu_synid_null>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`simm21<amdgpu_synid_simm21>`
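
To illustrate the signed 21-bit immediate form described above, here is a minimal Python sketch of the range check and two's-complement packing. The helper names `fits_simm21`, `encode_simm21` and `decode_simm21` are hypothetical and are not part of LLVM; the sketch assumes only that a signed 21-bit field holds values from -2^20 to 2^20 - 1.

```python
SIMM21_MASK = (1 << 21) - 1  # low 21 bits


def fits_simm21(value: int) -> bool:
    """Return True if value is representable as a signed 21-bit integer."""
    return -(1 << 20) <= value <= (1 << 20) - 1


def encode_simm21(value: int) -> int:
    """Pack a signed byte offset into a 21-bit two's-complement field."""
    if not fits_simm21(value):
        raise ValueError(f"{value} does not fit in a signed 21-bit immediate")
    return value & SIMM21_MASK


def decode_simm21(field: int) -> int:
    """Sign-extend a 21-bit field back to a Python int."""
    field &= SIMM21_MASK
    # Bit 20 is the sign bit of a 21-bit two's-complement value.
    return field - (1 << 21) if field & (1 << 20) else field
```

A negative offset such as -4 round-trips through `encode_simm21` and `decode_simm21`, while a value of 2^20 is rejected as out of range.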
@@ -5,12 +5,14 @@
* *
**************************************************
.. _amdgpu_synid_gfx10_soffset_59fade:
.. _amdgpu_synid_gfx10_soffset_d01a5c:

soffset
=======

An unsigned 20-bit offset added to the base address to get memory address.
An unsigned offset from the base address. May be specified as either a register or a 20-bit immediate.

Note that an *immediate* offset may be specified using either :ref:`uimm20<amdgpu_synid_uimm20>` operand or :ref:`offset20u<amdgpu_synid_smem_offset20u>` modifier, but not both.

*Size:* 1 dword.
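
The "either operand or modifier, but not both" rule for the immediate form can be sketched as a small validation step. This is only an illustration under stated assumptions: `resolve_immediate_offset` is a hypothetical helper, not an LLVM API, and it models just the immediate case (a register offset is outside its scope).

```python
from typing import Optional

UIMM20_MAX = (1 << 20) - 1  # largest value of an unsigned 20-bit field


def resolve_immediate_offset(operand: Optional[int],
                             modifier: Optional[int]) -> int:
    """Pick the immediate offset from the operand form or the modifier form.

    Each argument is None when that form is absent from the instruction.
    """
    if operand is not None and modifier is not None:
        raise ValueError("immediate offset given as both operand and modifier")
    value = operand if operand is not None else modifier
    if value is None:
        raise ValueError("no immediate offset given")
    if not 0 <= value <= UIMM20_MAX:
        raise ValueError(f"{value} does not fit in an unsigned 20-bit immediate")
    return value
```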

@@ -5,7 +5,7 @@
* *
**************************************************
.. _amdgpu_synid_gfx10_vdata_c61803:
.. _amdgpu_synid_gfx10_vdata_0aba12:

vdata
=====
@@ -16,6 +16,6 @@ Optionally may serve as an output data:

* If :ref:`glc<amdgpu_synid_glc>` is specified, gets the memory value before the operation.

*Size:* 1 dword by default. :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
*Size:* 1 dword.

*Operands:* :ref:`v<amdgpu_synid_v>`
@@ -5,7 +5,7 @@
* *
**************************************************
.. _amdgpu_synid_gfx10_vdata_b2a787:
.. _amdgpu_synid_gfx10_vdata_16d321:

vdata
=====
@@ -16,6 +16,6 @@ Optionally may serve as an output data:

* If :ref:`glc<amdgpu_synid_glc>` is specified, gets the memory value before the operation.

*Size:* 2 dwords by default. :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
*Size:* 2 dwords.

*Operands:* :ref:`v<amdgpu_synid_v>`
@@ -5,7 +5,7 @@
* *
**************************************************
.. _amdgpu_synid_gfx10_vdata_325b78:
.. _amdgpu_synid_gfx10_vdata_35851e:

vdata
=====
@@ -16,10 +16,10 @@ Optionally may serve as an output data:

* If :ref:`glc<amdgpu_synid_glc>` is specified, gets the memory value before the operation.

*Size:* depends on :ref:`dmask<amdgpu_synid_dmask>` and :ref:`tfe<amdgpu_synid_tfe>`:
*Size:* depends on :ref:`dmask<amdgpu_synid_dmask>`:

* :ref:`dmask<amdgpu_synid_dmask>` may specify 1 data element for 32-bit-per-pixel surfaces or 2 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.


Note: the surface data format is indicated in the image resource constant but not in the instruction.
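
As a rough illustration of how the operand size follows from dmask, the sketch below counts the enabled data elements. It assumes, beyond what this page states, that dmask is a 4-bit mask in which each set bit enables one data element; `vdata_dwords` is a hypothetical helper, not an LLVM function.

```python
def vdata_dwords(dmask: int) -> int:
    """Number of dwords in vdata, assuming one dword per enabled element."""
    if not 0 <= dmask <= 0xF:
        raise ValueError("dmask is assumed to be a 4-bit mask")
    # Each set bit selects one data element; each element occupies 1 dword.
    return bin(dmask).count("1")
```

Under these assumptions a mask with one bit set yields 1 dword and a mask with two bits set yields 2 dwords, matching the 32-bit-per-pixel and 64-bit-per-pixel cases described above.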

@@ -5,7 +5,7 @@
* *
**************************************************
.. _amdgpu_synid_gfx10_vdata_87fb90:
.. _amdgpu_synid_gfx10_vdata_890652:

vdata
=====
@@ -16,6 +16,6 @@ Optionally may serve as an output data:

* If :ref:`glc<amdgpu_synid_glc>` is specified, gets the memory value before the operation.

*Size:* 4 dwords by default. :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
*Size:* 4 dwords.

*Operands:* :ref:`v<amdgpu_synid_v>`
@@ -5,7 +5,7 @@
* *
**************************************************
.. _amdgpu_synid_gfx10_vdata_4d8ecf:
.. _amdgpu_synid_gfx10_vdata_a9ff5a:

vdata
=====
@@ -16,10 +16,10 @@ Optionally may serve as an output data:

* If :ref:`glc<amdgpu_synid_glc>` is specified, gets the memory value before the operation.

*Size:* depends on :ref:`dmask<amdgpu_synid_dmask>` and :ref:`tfe<amdgpu_synid_tfe>`:
*Size:* depends on :ref:`dmask<amdgpu_synid_dmask>`:

* :ref:`dmask<amdgpu_synid_dmask>` may specify 2 data elements for 32-bit-per-pixel surfaces or 4 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.


Note: the surface data format is indicated in the image resource constant but not in the instruction.

@@ -5,7 +5,7 @@
* *
**************************************************
.. _amdgpu_synid_gfx10_vdst_48d3a8:
.. _amdgpu_synid_gfx10_vdst_2ea017:

vdst
====
@@ -14,9 +14,9 @@ Image data to load by an *image_gather4* instruction.

*Size:* 4 data elements by default. Each data element occupies either 32 bits or 16 bits depending on :ref:`d16<amdgpu_synid_d16>`.

:ref:`d16<amdgpu_synid_d16>` and :ref:`tfe<amdgpu_synid_tfe>` affect operand size as follows:
:ref:`d16<amdgpu_synid_d16>` affects operand size as follows:

* :ref:`d16<amdgpu_synid_d16>` specifies that data elements in registers are packed; each value occupies 16 bits.
* :ref:`tfe<amdgpu_synid_tfe>` adds one dword if specified.


*Operands:* :ref:`v<amdgpu_synid_v>`
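
The effect of d16 on the operand size can be worked through with a little arithmetic. The sketch below is an assumption-laden illustration, not LLVM code: `gather4_vdst_dwords` is a hypothetical name, and it takes the description above at face value (4 data elements, packed at 16 bits each when d16 is specified, 32 bits each otherwise).

```python
def gather4_vdst_dwords(d16: bool, elements: int = 4) -> int:
    """Dwords needed for the vdst operand of an image_gather4 instruction."""
    bits_per_element = 16 if d16 else 32
    total_bits = elements * bits_per_element
    # Round up to whole 32-bit registers.
    return (total_bits + 31) // 32
```

Without d16 the four elements need 4 dwords; with d16 they pack into 2 dwords.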
19 changes: 19 additions & 0 deletions llvm/docs/AMDGPU/gfx10_vdst_322561.rst
@@ -0,0 +1,19 @@
..
**************************************************
* *
* Automatically generated file, do not edit! *
* *
**************************************************
.. _amdgpu_synid_gfx10_vdst_322561:

vdst
====

Instruction output: data read from a memory buffer.

This is an optional operand. It must be used if and only if :ref:`lds<amdgpu_synid_lds>` is omitted.

*Size:* 1 dword.

*Operands:* :ref:`v<amdgpu_synid_v>`
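
The "if and only if" condition above means that exactly one of vdst and lds appears in a valid instruction. A minimal sketch of that check follows; `check_vdst_lds` is a hypothetical helper used only for illustration and does not correspond to any function in the LLVM assembler.

```python
def check_vdst_lds(has_vdst: bool, has_lds: bool) -> None:
    """Raise ValueError unless exactly one of vdst and lds is present."""
    if has_vdst == has_lds:
        if has_vdst:
            raise ValueError("vdst must be omitted when lds is specified")
        raise ValueError("vdst is required when lds is omitted")
```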
@@ -5,13 +5,13 @@
* *
**************************************************
.. _amdgpu_synid_gfx10_vdst_5d50a1:
.. _amdgpu_synid_gfx10_vdst_709347:

vdst
====

Instruction output: data read from a memory buffer.

*Size:* 1 dword by default. :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
*Size:* 1 dword.

*Operands:* :ref:`v<amdgpu_synid_v>`
21 changes: 0 additions & 21 deletions llvm/docs/AMDGPU/gfx10_vdst_719833.rst

This file was deleted.

@@ -5,13 +5,13 @@
* *
**************************************************
.. _amdgpu_synid_gfx10_vdst_f47754:
.. _amdgpu_synid_gfx10_vdst_81a6ed:

vdst
====

Instruction output: data read from a memory buffer.

*Size:* 4 dwords by default. :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
*Size:* 3 dwords.

*Operands:* :ref:`v<amdgpu_synid_v>`
@@ -5,13 +5,13 @@
* *
**************************************************
.. _amdgpu_synid_gfx10_vdst_a49b76:
.. _amdgpu_synid_gfx10_vdst_d71f1c:

vdst
====

Instruction output: data read from a memory buffer.

*Size:* 3 dwords by default. :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
*Size:* 2 dwords.

*Operands:* :ref:`v<amdgpu_synid_v>`
@@ -5,13 +5,13 @@
* *
**************************************************
.. _amdgpu_synid_gfx10_vdst_d7c57e:
.. _amdgpu_synid_gfx10_vdst_dd8a32:

vdst
====

Instruction output: data read from a memory buffer.

*Size:* 2 dwords by default. :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
*Size:* 4 dwords.

*Operands:* :ref:`v<amdgpu_synid_v>`
