Skip to content

Commit 3f7985e

Browse files
committed
[AMDGPU][MC][NFC][DOC] Updated AMD GPU assembler syntax description.
Summary of changes: - added description of MTBUF instructions and format modifier; - described limitations of f16 inline constants when used with integer operands; - updated description of gfx9+ flat global addressing modes; - v_accvgpr_write_b32 src0 corrections (gfx908); - minor bugfixing and improvements.
1 parent b488935 commit 3f7985e

File tree

154 files changed

+2570
-1995
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

154 files changed

+2570
-1995
lines changed

llvm/docs/AMDGPU/AMDGPUAsmGFX10.rst

Lines changed: 819 additions & 750 deletions
Large diffs are not rendered by default.

llvm/docs/AMDGPU/AMDGPUAsmGFX1011.rst

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -71,8 +71,8 @@ VOP3P
7171
**INSTRUCTION** **DST** **SRC0** **SRC1** **SRC2** **MODIFIERS**
7272
\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|
7373
v_dot2_f32_f16 :ref:`vdst<amdgpu_synid1011_vdst32_0>`, :ref:`src0<amdgpu_synid1011_src32_0>`::ref:`f16x2<amdgpu_synid1011_type_dev>`, :ref:`src1<amdgpu_synid1011_src32_1>`::ref:`f16x2<amdgpu_synid1011_type_dev>`, :ref:`src2<amdgpu_synid1011_src32_1>`::ref:`f32<amdgpu_synid1011_type_dev>` :ref:`neg_lo<amdgpu_synid_neg_lo>` :ref:`neg_hi<amdgpu_synid_neg_hi>` :ref:`clamp<amdgpu_synid_clamp>`
74-
v_dot2_i32_i16 :ref:`vdst<amdgpu_synid1011_vdst32_0>`, :ref:`src0<amdgpu_synid1011_src32_0>`::ref:`i16x2<amdgpu_synid1011_type_dev>`, :ref:`src1<amdgpu_synid1011_src32_1>`::ref:`i16x2<amdgpu_synid1011_type_dev>`, :ref:`src2<amdgpu_synid1011_src32_1>`::ref:`i32<amdgpu_synid1011_type_dev>` :ref:`clamp<amdgpu_synid_clamp>`
75-
v_dot2_u32_u16 :ref:`vdst<amdgpu_synid1011_vdst32_0>`, :ref:`src0<amdgpu_synid1011_src32_0>`::ref:`u16x2<amdgpu_synid1011_type_dev>`, :ref:`src1<amdgpu_synid1011_src32_1>`::ref:`u16x2<amdgpu_synid1011_type_dev>`, :ref:`src2<amdgpu_synid1011_src32_1>`::ref:`u32<amdgpu_synid1011_type_dev>` :ref:`clamp<amdgpu_synid_clamp>`
74+
v_dot2_i32_i16 :ref:`vdst<amdgpu_synid1011_vdst32_0>`, :ref:`src0<amdgpu_synid1011_src32_2>`::ref:`i16x2<amdgpu_synid1011_type_dev>`, :ref:`src1<amdgpu_synid1011_src32_3>`::ref:`i16x2<amdgpu_synid1011_type_dev>`, :ref:`src2<amdgpu_synid1011_src32_1>`::ref:`i32<amdgpu_synid1011_type_dev>` :ref:`clamp<amdgpu_synid_clamp>`
75+
v_dot2_u32_u16 :ref:`vdst<amdgpu_synid1011_vdst32_0>`, :ref:`src0<amdgpu_synid1011_src32_2>`::ref:`u16x2<amdgpu_synid1011_type_dev>`, :ref:`src1<amdgpu_synid1011_src32_3>`::ref:`u16x2<amdgpu_synid1011_type_dev>`, :ref:`src2<amdgpu_synid1011_src32_1>`::ref:`u32<amdgpu_synid1011_type_dev>` :ref:`clamp<amdgpu_synid_clamp>`
7676
v_dot4_i32_i8 :ref:`vdst<amdgpu_synid1011_vdst32_0>`, :ref:`src0<amdgpu_synid1011_src32_0>`::ref:`i8x4<amdgpu_synid1011_type_dev>`, :ref:`src1<amdgpu_synid1011_src32_1>`::ref:`i8x4<amdgpu_synid1011_type_dev>`, :ref:`src2<amdgpu_synid1011_src32_1>`::ref:`i32<amdgpu_synid1011_type_dev>` :ref:`clamp<amdgpu_synid_clamp>`
7777
v_dot4_u32_u8 :ref:`vdst<amdgpu_synid1011_vdst32_0>`, :ref:`src0<amdgpu_synid1011_src32_0>`::ref:`u8x4<amdgpu_synid1011_type_dev>`, :ref:`src1<amdgpu_synid1011_src32_1>`::ref:`u8x4<amdgpu_synid1011_type_dev>`, :ref:`src2<amdgpu_synid1011_src32_1>`::ref:`u32<amdgpu_synid1011_type_dev>` :ref:`clamp<amdgpu_synid_clamp>`
7878
v_dot8_i32_i4 :ref:`vdst<amdgpu_synid1011_vdst32_0>`, :ref:`src0<amdgpu_synid1011_src32_0>`::ref:`i4x8<amdgpu_synid1011_type_dev>`, :ref:`src1<amdgpu_synid1011_src32_1>`::ref:`i4x8<amdgpu_synid1011_type_dev>`, :ref:`src2<amdgpu_synid1011_src32_1>`::ref:`i32<amdgpu_synid1011_type_dev>` :ref:`clamp<amdgpu_synid_clamp>`
@@ -87,6 +87,8 @@ VOP3P
8787
AMDGPUAsmGFX10
8888
gfx1011_src32_0
8989
gfx1011_src32_1
90+
gfx1011_src32_2
91+
gfx1011_src32_3
9092
gfx1011_vdst32_0
9193
gfx1011_vsrc32_0
9294
gfx1011_type_dev

llvm/docs/AMDGPU/AMDGPUAsmGFX7.rst

Lines changed: 146 additions & 126 deletions
Large diffs are not rendered by default.

llvm/docs/AMDGPU/AMDGPUAsmGFX8.rst

Lines changed: 475 additions & 451 deletions
Large diffs are not rendered by default.

llvm/docs/AMDGPU/AMDGPUAsmGFX9.rst

Lines changed: 545 additions & 520 deletions
Large diffs are not rendered by default.

llvm/docs/AMDGPU/AMDGPUAsmGFX906.rst

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -64,8 +64,8 @@ VOP3P
6464
**INSTRUCTION** **DST** **SRC0** **SRC1** **SRC2** **MODIFIERS**
6565
\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|\ |---|
6666
v_dot2_f32_f16 :ref:`vdst<amdgpu_synid906_vdst32_0>`, :ref:`src0<amdgpu_synid906_src32_1>`::ref:`f16x2<amdgpu_synid906_type_dev>`, :ref:`src1<amdgpu_synid906_src32_2>`::ref:`f16x2<amdgpu_synid906_type_dev>`, :ref:`src2<amdgpu_synid906_src32_2>`::ref:`f32<amdgpu_synid906_type_dev>` :ref:`neg_lo<amdgpu_synid_neg_lo>` :ref:`neg_hi<amdgpu_synid_neg_hi>` :ref:`clamp<amdgpu_synid_clamp>`
67-
v_dot2_i32_i16 :ref:`vdst<amdgpu_synid906_vdst32_0>`, :ref:`src0<amdgpu_synid906_src32_1>`::ref:`i16x2<amdgpu_synid906_type_dev>`, :ref:`src1<amdgpu_synid906_src32_2>`::ref:`i16x2<amdgpu_synid906_type_dev>`, :ref:`src2<amdgpu_synid906_src32_2>`::ref:`i32<amdgpu_synid906_type_dev>` :ref:`clamp<amdgpu_synid_clamp>`
68-
v_dot2_u32_u16 :ref:`vdst<amdgpu_synid906_vdst32_0>`, :ref:`src0<amdgpu_synid906_src32_1>`::ref:`u16x2<amdgpu_synid906_type_dev>`, :ref:`src1<amdgpu_synid906_src32_2>`::ref:`u16x2<amdgpu_synid906_type_dev>`, :ref:`src2<amdgpu_synid906_src32_2>`::ref:`u32<amdgpu_synid906_type_dev>` :ref:`clamp<amdgpu_synid_clamp>`
67+
v_dot2_i32_i16 :ref:`vdst<amdgpu_synid906_vdst32_0>`, :ref:`src0<amdgpu_synid906_src32_3>`::ref:`i16x2<amdgpu_synid906_type_dev>`, :ref:`src1<amdgpu_synid906_src32_4>`::ref:`i16x2<amdgpu_synid906_type_dev>`, :ref:`src2<amdgpu_synid906_src32_2>`::ref:`i32<amdgpu_synid906_type_dev>` :ref:`clamp<amdgpu_synid_clamp>`
68+
v_dot2_u32_u16 :ref:`vdst<amdgpu_synid906_vdst32_0>`, :ref:`src0<amdgpu_synid906_src32_3>`::ref:`u16x2<amdgpu_synid906_type_dev>`, :ref:`src1<amdgpu_synid906_src32_4>`::ref:`u16x2<amdgpu_synid906_type_dev>`, :ref:`src2<amdgpu_synid906_src32_2>`::ref:`u32<amdgpu_synid906_type_dev>` :ref:`clamp<amdgpu_synid_clamp>`
6969
v_dot4_i32_i8 :ref:`vdst<amdgpu_synid906_vdst32_0>`, :ref:`src0<amdgpu_synid906_src32_1>`::ref:`i8x4<amdgpu_synid906_type_dev>`, :ref:`src1<amdgpu_synid906_src32_2>`::ref:`i8x4<amdgpu_synid906_type_dev>`, :ref:`src2<amdgpu_synid906_src32_2>`::ref:`i32<amdgpu_synid906_type_dev>` :ref:`clamp<amdgpu_synid_clamp>`
7070
v_dot4_u32_u8 :ref:`vdst<amdgpu_synid906_vdst32_0>`, :ref:`src0<amdgpu_synid906_src32_1>`::ref:`u8x4<amdgpu_synid906_type_dev>`, :ref:`src1<amdgpu_synid906_src32_2>`::ref:`u8x4<amdgpu_synid906_type_dev>`, :ref:`src2<amdgpu_synid906_src32_2>`::ref:`u32<amdgpu_synid906_type_dev>` :ref:`clamp<amdgpu_synid_clamp>`
7171
v_dot8_i32_i4 :ref:`vdst<amdgpu_synid906_vdst32_0>`, :ref:`src0<amdgpu_synid906_src32_1>`::ref:`i4x8<amdgpu_synid906_type_dev>`, :ref:`src1<amdgpu_synid906_src32_2>`::ref:`i4x8<amdgpu_synid906_type_dev>`, :ref:`src2<amdgpu_synid906_src32_2>`::ref:`i32<amdgpu_synid906_type_dev>` :ref:`clamp<amdgpu_synid_clamp>`
@@ -84,6 +84,8 @@ VOP3P
8484
gfx906_src32_0
8585
gfx906_src32_1
8686
gfx906_src32_2
87+
gfx906_src32_3
88+
gfx906_src32_4
8789
gfx906_vdst32_0
8890
gfx906_vsrc32_0
8991
gfx906_mad_type_dev

llvm/docs/AMDGPU/AMDGPUAsmGFX908.rst

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -95,8 +95,8 @@ VOP3P
9595
v_accvgpr_read_b32 :ref:`vdst<amdgpu_synid908_vdst32_0>`, :ref:`asrc<amdgpu_synid908_asrc32_0>`
9696
v_accvgpr_write_b32 :ref:`adst<amdgpu_synid908_adst32_0>`, :ref:`src<amdgpu_synid908_src32_3>`
9797
v_dot2_f32_f16 :ref:`vdst<amdgpu_synid908_vdst32_0>`, :ref:`src0<amdgpu_synid908_src32_1>`::ref:`f16x2<amdgpu_synid908_type_dev>`, :ref:`src1<amdgpu_synid908_src32_2>`::ref:`f16x2<amdgpu_synid908_type_dev>`, :ref:`src2<amdgpu_synid908_src32_2>`::ref:`f32<amdgpu_synid908_type_dev>` :ref:`neg_lo<amdgpu_synid_neg_lo>` :ref:`neg_hi<amdgpu_synid_neg_hi>` :ref:`clamp<amdgpu_synid_clamp>`
98-
v_dot2_i32_i16 :ref:`vdst<amdgpu_synid908_vdst32_0>`, :ref:`src0<amdgpu_synid908_src32_1>`::ref:`i16x2<amdgpu_synid908_type_dev>`, :ref:`src1<amdgpu_synid908_src32_2>`::ref:`i16x2<amdgpu_synid908_type_dev>`, :ref:`src2<amdgpu_synid908_src32_2>`::ref:`i32<amdgpu_synid908_type_dev>` :ref:`clamp<amdgpu_synid_clamp>`
99-
v_dot2_u32_u16 :ref:`vdst<amdgpu_synid908_vdst32_0>`, :ref:`src0<amdgpu_synid908_src32_1>`::ref:`u16x2<amdgpu_synid908_type_dev>`, :ref:`src1<amdgpu_synid908_src32_2>`::ref:`u16x2<amdgpu_synid908_type_dev>`, :ref:`src2<amdgpu_synid908_src32_2>`::ref:`u32<amdgpu_synid908_type_dev>` :ref:`clamp<amdgpu_synid_clamp>`
98+
v_dot2_i32_i16 :ref:`vdst<amdgpu_synid908_vdst32_0>`, :ref:`src0<amdgpu_synid908_src32_4>`::ref:`i16x2<amdgpu_synid908_type_dev>`, :ref:`src1<amdgpu_synid908_src32_5>`::ref:`i16x2<amdgpu_synid908_type_dev>`, :ref:`src2<amdgpu_synid908_src32_2>`::ref:`i32<amdgpu_synid908_type_dev>` :ref:`clamp<amdgpu_synid_clamp>`
99+
v_dot2_u32_u16 :ref:`vdst<amdgpu_synid908_vdst32_0>`, :ref:`src0<amdgpu_synid908_src32_4>`::ref:`u16x2<amdgpu_synid908_type_dev>`, :ref:`src1<amdgpu_synid908_src32_5>`::ref:`u16x2<amdgpu_synid908_type_dev>`, :ref:`src2<amdgpu_synid908_src32_2>`::ref:`u32<amdgpu_synid908_type_dev>` :ref:`clamp<amdgpu_synid_clamp>`
100100
v_dot4_i32_i8 :ref:`vdst<amdgpu_synid908_vdst32_0>`, :ref:`src0<amdgpu_synid908_src32_1>`::ref:`i8x4<amdgpu_synid908_type_dev>`, :ref:`src1<amdgpu_synid908_src32_2>`::ref:`i8x4<amdgpu_synid908_type_dev>`, :ref:`src2<amdgpu_synid908_src32_2>`::ref:`i32<amdgpu_synid908_type_dev>` :ref:`clamp<amdgpu_synid_clamp>`
101101
v_dot4_u32_u8 :ref:`vdst<amdgpu_synid908_vdst32_0>`, :ref:`src0<amdgpu_synid908_src32_1>`::ref:`u8x4<amdgpu_synid908_type_dev>`, :ref:`src1<amdgpu_synid908_src32_2>`::ref:`u8x4<amdgpu_synid908_type_dev>`, :ref:`src2<amdgpu_synid908_src32_2>`::ref:`u32<amdgpu_synid908_type_dev>` :ref:`clamp<amdgpu_synid_clamp>`
102102
v_dot8_i32_i4 :ref:`vdst<amdgpu_synid908_vdst32_0>`, :ref:`src0<amdgpu_synid908_src32_1>`::ref:`i4x8<amdgpu_synid908_type_dev>`, :ref:`src1<amdgpu_synid908_src32_2>`::ref:`i4x8<amdgpu_synid908_type_dev>`, :ref:`src2<amdgpu_synid908_src32_2>`::ref:`i32<amdgpu_synid908_type_dev>` :ref:`clamp<amdgpu_synid_clamp>`
@@ -150,6 +150,8 @@ VOP3P
150150
gfx908_src32_1
151151
gfx908_src32_2
152152
gfx908_src32_3
153+
gfx908_src32_4
154+
gfx908_src32_5
153155
gfx908_vaddr_flat_global
154156
gfx908_vasrc32_0
155157
gfx908_vasrc64_0
Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,17 @@
1+
..
2+
**************************************************
3+
* *
4+
* Automatically generated file, do not edit! *
5+
* *
6+
**************************************************
7+
8+
.. _amdgpu_synid1011_src32_2:
9+
10+
src
11+
===========================
12+
13+
Instruction input.
14+
15+
*Size:* 1 dword.
16+
17+
*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`s<amdgpu_synid_s>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`lds_direct<amdgpu_synid_lds_direct>`, :ref:`iconst<amdgpu_synid_iconst>`, :ref:`literal<amdgpu_synid_literal>`
Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,17 @@
1+
..
2+
**************************************************
3+
* *
4+
* Automatically generated file, do not edit! *
5+
* *
6+
**************************************************
7+
8+
.. _amdgpu_synid1011_src32_3:
9+
10+
src
11+
===========================
12+
13+
Instruction input.
14+
15+
*Size:* 1 dword.
16+
17+
*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`s<amdgpu_synid_s>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`iconst<amdgpu_synid_iconst>`, :ref:`literal<amdgpu_synid_literal>`

llvm/docs/AMDGPU/gfx10_addr_mimg.rst

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -17,7 +17,7 @@ This operand may be specified using either :ref:`standard VGPR syntax<amdgpu_syn
1717
*Size:* 1-13 dwords. Actual size depends on syntax, opcode, :ref:`dim<amdgpu_synid_dim>` and :ref:`a16<amdgpu_synid_a16>`.
1818

1919
* If specified using :ref:`NSA VGPR syntax<amdgpu_synid_nsa>`, the size is 1-13 dwords.
20-
* If specified using :ref:`standard VGPR syntax<amdgpu_synid_vcc_lo>`, the size is 1, 2, 3, 4, 8 or 16 dwords. Note that assembler currently supports a limited range of register sequences.
20+
* If specified using :ref:`standard VGPR syntax<amdgpu_synid_v>`, the size is 1, 2, 3, 4, 8 or 16 dwords. Note that assembler currently supports a limited range of register sequences.
2121

2222

2323
*Operands:* :ref:`v<amdgpu_synid_v>`

0 commit comments

Comments
 (0)