[AMDGPU] Update buffer fat pointer docs for gfx1250, fix formatting #167818
@llvm/pr-subscribers-backend-amdgpu
Author: Krzysztof Drewniak (krzysz00)
Changes
Full diff: https://github.com/llvm/llvm-project/pull/167818.diff
1 Files Affected:
diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst
index b8b372d4113c1..d268a7b358e9d 100644
--- a/llvm/docs/AMDGPUUsage.rst
+++ b/llvm/docs/AMDGPUUsage.rst
@@ -1011,9 +1011,9 @@ supported for the ``amdgcn`` target.
bounds checking may be disabled, buffer fat pointers may choose to enable
it or not). The cache swizzle support introduced in gfx942 may be used.
- These pointers can be created by `addrspacecast` from a buffer resource
- (`ptr addrspace(8)`) or by using `llvm.amdgcn.make.buffer.rsrc` to produce a
- `ptr addrspace(7)` directly, which produces a buffer fat pointer with an initial
+ These pointers can be created by ``addrspacecast`` from a buffer resource
+ (``ptr addrspace(8)```) or by using `llvm.amdgcn.make.buffer.rsrc` to produce a
+ ``ptr addrspace(7)`` directly, which produces a buffer fat pointer with an initial
offset of 0 and prevents the address space cast from being rewritten away.
The ``align`` attribute on operations from buffer fat pointers is deemed to apply
@@ -1028,26 +1028,33 @@ supported for the ``amdgcn`` target.
**Buffer Resource**
The buffer resource pointer, in address space 8, is the newer form
for representing buffer descriptors in AMDGPU IR, replacing their
- previous representation as `<4 x i32>`. It is a non-integral pointer
- that represents a 128-bit buffer descriptor resource (`V#`).
+ previous representation as ``<4 x i32>``. It is a non-integral pointer
+ that represents a 128-bit buffer descriptor resource (``V#``).
Since, in general, a buffer resource supports complex addressing modes that cannot
be easily represented in LLVM (such as implicit swizzled access to structured
- buffers), it is **illegal** to perform non-trivial address computations, such as
- ``getelementptr`` operations, on buffer resources. They may be passed to
- AMDGPU buffer intrinsics, and they may be converted to and from ``i128``.
+ buffers), performing address computations such as ``getelementptr`` is not
+ recommended on ``ptr addrspace(8)``s (if such computations are performed, the
+ offset must be wavefront-uniform.) Note that such a usage of GEP is currently
+ **unimplemented** in the backend, as it would require a wrapping 48-bit
+ addition. Buffer resources may be passed to AMDGPU buffer intrinsics, and they
+ may be converted to and from ``i128``.
Casting a buffer resource to a buffer fat pointer is permitted and adds an offset
of 0.
Buffer resources can be created from 64-bit pointers (which should be either
- generic or global) using the `llvm.amdgcn.make.buffer.rsrc` intrinsic, which
+ generic or global) using the ``llvm.amdgcn.make.buffer.rsrc`` intrinsic, which
takes the pointer, which becomes the base of the resource,
the 16-bit stride (and swzizzle control) field stored in bits `63:48` of a `V#`,
the 32-bit NumRecords/extent field (bits `95:64`), and the 32-bit flags field
(bits `127:96`). The specific interpretation of these fields varies by the
target architecture and is detailed in the ISA descriptions.
+ On gfx1250, the base pointer is instead truncated to 57 bits and the NumRecords
+ field is 45 bits, which necessicated a change to ``make.buffer.rsrcs``'s arguments
+ in order to make that field an ``i64``.
+
When buffer resources are passed to buffer intrinsics such as
``llvm.amdgcn.raw.ptr.buffer.load`` or
``llvm.amdgcn.struct.ptr.buffer.store``, the ``align`` attribute on the
@@ -1079,9 +1086,9 @@ supported for the ``amdgcn`` target.
the stride is the size of a structured element, the "add tid" flag must be 0,
and the swizzle enable bits must be off.
- These pointers can be created by `addrspacecast` from a buffer resource
- (`ptr addrspace(8)`) or by using `llvm.amdgcn.make.buffer.rsrc` to produce a
- `ptr addrspace(9)` directly, which produces a buffer strided pointer whose initial
+ These pointers can be created by ``addrspacecast`` from a buffer resource
+ (``ptr addrspace(8)``) or by using ``llvm.amdgcn.make.buffer.rsrc`` to produce a
+ ``ptr addrspace(9)``` directly, which produces a buffer strided pointer whose initial
index and offset values are both 0. This prevents the address space cast from
being rewritten away.
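As a reading aid for the field layout described in the diff above (stride in bits `63:48`, NumRecords in bits `95:64`, flags in bits `127:96`), here is a minimal Python sketch that packs and unpacks a 128-bit value with that layout. It is not LLVM or driver code. The function names are hypothetical, and placing the base pointer in bits `47:0` is an assumption inferred from the stride starting at bit 48; the gfx1250 layout (57-bit base, 45-bit NumRecords) is deliberately not modeled.

```python
MASK_48 = (1 << 48) - 1


def pack_buffer_descriptor(base: int, stride: int, num_records: int, flags: int) -> int:
    """Pack the four fields into one 128-bit integer (pre-gfx1250 layout).

    Hypothetical helper for illustration only.
    """
    if not 0 <= stride < (1 << 16):
        raise ValueError("stride must fit in 16 bits")
    if not 0 <= num_records < (1 << 32):
        raise ValueError("num_records must fit in 32 bits")
    if not 0 <= flags < (1 << 32):
        raise ValueError("flags must fit in 32 bits")
    # Assumption: the base pointer occupies bits 47:0, so higher bits are dropped.
    base48 = base & MASK_48
    return base48 | (stride << 48) | (num_records << 64) | (flags << 96)


def unpack_buffer_descriptor(desc: int) -> tuple[int, int, int, int]:
    """Split a 128-bit value back into (base, stride, num_records, flags)."""
    base = desc & MASK_48
    stride = (desc >> 48) & 0xFFFF
    num_records = (desc >> 64) & 0xFFFFFFFF
    flags = (desc >> 96) & 0xFFFFFFFF
    return base, stride, num_records, flags
```

The flags value is treated as an opaque 32-bit integer here because, as the documentation text notes, its interpretation varies by target architecture.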
offset must be wavefront-uniform.) Note that such a usage of GEP is currently
**unimplemented** in the backend, as it would require a wrapping 48-bit
addition. Buffer resources may be passed to AMDGPU buffer intrinsics, and they
may be converted to and from ``i128``.
Q: does the pointer support other operations?
Which ones are you thinking of?
(The answer is "probably not" by virtue of "no one thought they needed them")
GEP is supported so I suppose it supports things like ptr[i]. How about address space cast?
GEP is "supported" in the sense that we know what its semantics should be, but, last I checked, everything explodes if you try to implement them.
You can't addrspacecast in, since we don't know what the flags would be.
Addrspacecasting out ... is also very much "patches welcome"
(Heck, so is the ptr addrspace(7) to ptr addrspace(1) cast, which could exist but doesn't)
No description provided.