[DirectX][docs] Document DXContainer format #90908

Merged
merged 2 commits into from
May 6, 2024

Conversation

llvm-beanz
Collaborator

This adds a document to describe the DXContainer format and the structures of data inside the file.

Resolves #88775

@llvmbot
Collaborator

llvmbot commented May 2, 2024

@llvm/pr-subscribers-backend-directx

Author: Chris B (llvm-beanz)

Changes

This adds a document to describe the DXContainer format and the structures of data inside the file.

Resolves #88775


Full diff: https://github.com/llvm/llvm-project/pull/90908.diff

2 Files Affected:

  • (added) llvm/docs/DirectX/DXContainer.rst (+395)
  • (modified) llvm/docs/DirectXUsage.rst (+4-1)
diff --git a/llvm/docs/DirectX/DXContainer.rst b/llvm/docs/DirectX/DXContainer.rst
new file mode 100644
index 00000000000000..61b056b76068a8
--- /dev/null
+++ b/llvm/docs/DirectX/DXContainer.rst
@@ -0,0 +1,395 @@
+=================
+DirectX Container
+=================
+
+.. contents::
+   :local:
+
+.. toctree::
+   :hidden:
+
+Overview
+========
+
+The DirectX Container (DXContainer) file format is the binary file format for
+compiled shaders targeting the DirectX runtime. The file format is also called
+the DXIL Container or DXBC file format. Because the file format can be used to
+include either DXIL or DXBC compiled shaders, the nomenclature in LLVM is simply
+DirectX Container.
+
+DirectX Container files are read by the compiler and associated tools as well as
+the DirectX runtime, profiling tools and other users. This document serves as a
+companion to the implementation in LLVM to more completely document the file
+format for its many users.
+
+Basic Structure
+===============
+
+A DXContainer file begins with a header, and is then followed by a sequence of
+"parts", which are analogous to object file sections. Each part contains a part
+header, and some number of bytes of data after the header in a defined format.
+
+DX Container data structures are encoded little-endian in the binary file.
+
+The LLVM versions of all data structures described and/or referenced in this
+file are defined in
+`llvm/include/llvm/BinaryFormat/DXContainer.h
+<https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/BinaryFormat/DXContainer.h>`_.
+Some pseudo code is provided in blocks below to ease understanding of this
+document, but reading it with the header available will provide the most
+clarity.
+
+File Header
+-----------
+
+.. code-block:: c
+
+  struct Header {
+    uint8_t Magic[4];
+    uint8_t Digest[16];
+    uint16_t MajorVersion;
+    uint16_t MinorVersion;
+    uint32_t FileSize;
+    uint32_t PartCount;
+  };
+
+The DXContainer header matches the pseudo-definition above. It begins with a
+four character code (magic number) with the value ``DXBC`` to denote the file
+format.
+
+The ``Digest`` is a 128-bit hash digest computed with a proprietary algorithm and
+encoded in the binary by the bytecode validator.
+
+The ``MajorVersion`` and ``MinorVersion`` encode the file format version
+``1.0``.
+
+The remaining fields encode 32-bit unsigned integers for the file size and
+number of parts.
+
+Following the file header is an array of ``PartCount`` 32-bit unsigned integers
+specifying the offsets of each part header.
+
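A minimal C sketch of reading the file header and the part offset table described above. The names ``dxc_header``, ``dxc_parse_header`` and ``dxc_part_offset`` are hypothetical and are not taken from the LLVM header. The 32-byte header size follows from the field widths in the pseudo-definition, and the sketch assumes the offset table starts immediately after it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Serialized header size: 4 + 16 + 2 + 2 + 4 + 4 bytes. */
#define DXC_HEADER_SIZE 32u

/* Hypothetical in-memory copy of the header fields (illustrative names). */
struct dxc_header {
  uint8_t digest[16];
  uint16_t major_version;
  uint16_t minor_version;
  uint32_t file_size;
  uint32_t part_count;
};

/* Little-endian readers, so the code works on any host byte order. */
static uint16_t read_u16_le(const uint8_t *p) {
  return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_u32_le(const uint8_t *p) {
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16) |
         ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the buffer is too small, the magic is not
   "DXBC", or the offset table would not fit in the buffer. */
static int dxc_parse_header(const uint8_t *buf, size_t len,
                            struct dxc_header *out) {
  assert(buf != NULL && out != NULL);
  if (len < DXC_HEADER_SIZE || memcmp(buf, "DXBC", 4) != 0)
    return -1;
  memcpy(out->digest, buf + 4, 16);
  out->major_version = read_u16_le(buf + 20);
  out->minor_version = read_u16_le(buf + 22);
  out->file_size = read_u32_le(buf + 24);
  out->part_count = read_u32_le(buf + 28);
  /* The offset table holds part_count 32-bit entries after the header. */
  if (out->part_count > (len - DXC_HEADER_SIZE) / 4)
    return -1;
  return 0;
}

/* Reads entry `index` of the offset table that follows the header. */
static int dxc_part_offset(const uint8_t *buf, const struct dxc_header *hdr,
                           uint32_t index, uint32_t *offset) {
  if (index >= hdr->part_count)
    return -1;
  *offset = read_u32_le(buf + DXC_HEADER_SIZE + 4 * (size_t)index);
  return 0;
}
```

Decoding byte by byte, rather than casting the buffer to a struct, avoids relying on host endianness or alignment.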
+Part Data
+---------
+
+.. code-block:: c
+
+  struct PartHeader {
+    uint8_t Name[4];
+    uint32_t Size;
+  };
+
+Each part begins with a part header. A part header includes the 4-character part
+name, and a 32-bit unsigned integer specifying the size of the part data. The
+part header is followed by ``Size`` bytes of data comprising the part.
+
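As a hedged illustration of the part layout, the C sketch below reads one part header at a given offset and bounds-checks the data that follows. ``dxc_part`` and ``dxc_read_part`` are hypothetical names, and the sketch assumes the offset is measured from the start of the file buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical view of one part (illustrative names). */
struct dxc_part {
  char name[5];        /* 4-character part name plus a terminating NUL */
  uint32_t size;       /* number of data bytes after the part header */
  const uint8_t *data; /* points into the caller's buffer */
};

static uint32_t read_u32_le(const uint8_t *p) {
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16) |
         ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the 8-byte part header or the Size bytes of
   data would run past the end of the buffer. */
static int dxc_read_part(const uint8_t *buf, size_t len, uint32_t offset,
                         struct dxc_part *out) {
  assert(buf != NULL && out != NULL);
  if (offset > len || len - offset < 8)
    return -1;
  memcpy(out->name, buf + offset, 4);
  out->name[4] = '\0';
  out->size = read_u32_le(buf + offset + 4);
  if (out->size > len - offset - 8)
    return -1;
  out->data = buf + offset + 8;
  return 0;
}
```

A reader would typically call this once per entry in the part offset table and dispatch on ``name``.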
+Part Formats
+============
+
+The part name indicates the format of the part data. DXC and FXC use the
+following 23 part names; a given container includes only a subset of them:
+
+#. `DXIL`_ - Stores the DXIL bytecode.
+#. `HASH`_ - Stores the shader MD5 hash.
+#. ILDB - Stores the DXIL bytecode with LLVM Debug Information embedded in the module.
+#. ILDN - Stores shader debug name for external debug information.
+#. `ISG1`_ - Stores the input signature for Shader Model 5.1+.
+#. ISGN - Stores the input signature for Shader Model 4 and earlier.
+#. `OSG1`_ - Stores the output signature for Shader Model 5.1+.
+#. OSG5 - Stores the output signature for Shader Model 5.
+#. OSGN - Stores the output signature for Shader Model 4 and earlier.
+#. PCSG - Stores the patch constant signature for Shader Model 5.1 and earlier.
+#. PDBI - Stores PDB information.
+#. PRIV - Stores arbitrary private data.
+#. `PSG1`_ - Stores the patch constant signature for Shader Model 6+.
+#. `PSV0`_ - Stores Pipeline State Validation data.
+#. RDAT - Stores Runtime Data.
+#. RDEF - Stores resource definitions.
+#. RTS0 - Stores compiled root signature.
+#. `SFI0`_ - Stores shader feature flags.
+#. SHDR - Stores compiled DXBC bytecode.
+#. SHEX - Stores compiled DXBC bytecode.
+#. SRCI - Stores shader source information.
+#. STAT - Stores shader statistics.
+#. VERS - Stores shader compiler version information.
+
+DXIL Part
+---------
+.. _DXIL:
+
+The DXIL part is comprised of three data structures: the ``ProgramHeader``, the
+``BitcodeHeader`` and the bitcode serialized LLVM IR Module.
+
+The ``ProgramHeader`` contains the shader model version and pipeline stage
+enumeration value. This identifies the target profile of the contained shader
+bitcode.
+
+The ``BitcodeHeader`` contains the DXIL version information and refers to the
+start of the bitcode data.
+
+HASH Part
+---------
+.. _HASH:
+
+The HASH part contains a 32-bit unsigned integer with the shader hash flags, and
+a 128-bit MD5 hash digest. The flags field can either have the value ``0`` to
+indicate no flags, or ``1`` to indicate that the file hash was computed
+including the source code that produced the binary.
+
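The HASH part layout can be sketched in C as a 4-byte flags field followed by 16 digest bytes. The struct and function names here are hypothetical, and the sketch only decodes the fields; it does not compute or verify the MD5.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical decoded form of the HASH part data (illustrative names). */
struct shader_hash {
  uint32_t flags;     /* 0 = no flags, 1 = hash includes the source code */
  uint8_t digest[16]; /* 128-bit MD5 digest */
};

static uint32_t read_u32_le(const uint8_t *p) {
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16) |
         ((uint32_t)p[3] << 24);
}

/* `data` points at the part data (just past the part header).
   Returns 0 on success, -1 if fewer than 20 bytes are available. */
static int parse_hash_part(const uint8_t *data, size_t size,
                           struct shader_hash *out) {
  assert(data != NULL && out != NULL);
  if (size < 20)
    return -1;
  out->flags = read_u32_le(data);
  memcpy(out->digest, data + 4, 16);
  return 0;
}

/* True when the flags value says the hash covers the shader source. */
static int hash_includes_source(const struct shader_hash *h) {
  return h->flags == 1;
}
```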
+Program Signature (SG1) Parts
+-----------------------------
+.. _ISG1:
+.. _OSG1:
+.. _PSG1:
+
+.. code-block:: c
+
+  struct ProgramSignatureHeader {
+    uint32_t ParamCount;
+    uint32_t FirstParamOffset;
+  };
+
+The program signature parts (ISG1, OSG1, & PSG1) all use the same data
+structures to encode inputs, outputs and patch information. The
+``ProgramSignatureHeader`` includes two 32-bit unsigned integers to specify the
+number of signature parameters and the offset of the first parameter.
+
+Beginning at ``FirstParamOffset`` bytes from the start of the
+``ProgramSignatureHeader``, ``ParamCount`` ``ProgramSignatureElement``
+structures are written. Following the ``ProgramSignatureElements`` is a string
+table of null terminated strings padded to 32-byte alignment. This string table
+matches the DWARF string table format as implemented by LLVM.
+
+Each ``ProgramSignatureElement`` encodes a ``NameOffset`` value which specifies
+the offset into the string table. A value of ``0`` denotes no name. The offsets
+encoded here are from the beginning of the ``ProgramSignatureHeader``, not the
+beginning of the string table.
+
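Because ``NameOffset`` is relative to the ``ProgramSignatureHeader`` rather than to the string table, name lookup can be sketched as below. ``sig_element_name`` is a hypothetical helper, and the sketch assumes the ``ProgramSignatureHeader`` sits at the very start of the part data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* `part_data` is assumed to point at the ProgramSignatureHeader.
   Returns the NUL-terminated name, or NULL when the offset is 0 (no name),
   out of range, or not terminated inside the part. */
static const char *sig_element_name(const uint8_t *part_data, size_t part_size,
                                    uint32_t name_offset) {
  assert(part_data != NULL);
  if (name_offset == 0 || name_offset >= part_size)
    return NULL;
  /* Require a terminator before the end of the part data. */
  if (memchr(part_data + name_offset, '\0', part_size - name_offset) == NULL)
    return NULL;
  return (const char *)(part_data + name_offset);
}
```

Checking for the terminator up front keeps later string functions from reading past the end of a truncated part.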
+The ``ProgramSignatureElement`` contains several enumeration fields which are
+defined in `llvm/include/llvm/BinaryFormat/DXContainerConstants.def <https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/BinaryFormat/DXContainerConstants.def>`_.
+These fields encode the D3D system value, the type of data and its precision
+requirements.
+
+PSV0 Part
+---------
+.. _PSV0:
+
+The Pipeline State Validation data encodes versioned runtime information
+structures. These structures use a scheme where in lieu of encoding a version
+number, they encode the size of the structure and each new version of the
+structure is additive. This allows readers to infer the version of the structure
+by comparing the encoded size with the size of known structures. If the encoded
+size is larger than any known structure, the largest known structure can validly
+parse the data represented in the known structure.
+
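The size-based versioning scheme can be illustrated with a small C helper that picks the newest known structure version that fits in the encoded size. ``infer_struct_version`` is a hypothetical name, and the caller supplies the table of known sizes; the real sizes come from the structures in the LLVM header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* `known_sizes` lists the sizes of the known structure versions in
   ascending order (v0 first). Returns the index of the newest known
   version whose size does not exceed `encoded_size`, or -1 when the
   encoded size is smaller than the base version. */
static int infer_struct_version(uint32_t encoded_size,
                                const uint32_t *known_sizes, size_t count) {
  assert(known_sizes != NULL);
  int version = -1;
  for (size_t i = 0; i < count; ++i) {
    if (known_sizes[i] <= encoded_size)
      version = (int)i;
  }
  return version;
}
```

When the encoded size exceeds every known size, the helper returns the newest known version, and the reader skips the trailing bytes it does not understand.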
+In LLVM we represent the versions of the associated data structures with
+versioned namespaces under the ``llvm::dxbc::PSV`` namespace (e.g. ``v0``,
+``v1``). Each structure in the ``v0`` namespace is the base version, the
+structures in the ``v1`` namespace inherit from the ``v0`` namespace, and the
+``v2`` structures inherit from the ``v1`` structures, and so on.
+
+The high-level structure of the PSV data is:
+
+#. ``RuntimeInfo`` structure
+#. Resource bindings
+#. Signature elements
+#. Mask Vectors (Output, Input, InputPatch, PatchOutput)
+
+Immediately following the part header for the PSV0 part is a 32-bit unsigned
+integer specifying the size of the ``RuntimeInfo`` structure that follows.
+
+Immediately following the ``RuntimeInfo`` structure is a 32-bit unsigned integer
+specifying the number of resource bindings. If the number of resources is
+greater than zero, another unsigned 32-bit integer follows to specify the size
+of the ``ResourceBindInfo`` structure. This is followed by the specified number
+of structures of the specified size (from which the version of the structure
+can be inferred).
+
+For version 0 of the data this ends the part data.
+
+PSV0 Signature Elements
+~~~~~~~~~~~~~~~~~~~~~~~
+
+The signature elements are conceptually a single unit, but the data is encoded
+in three different blocks. The first block is a string table, the second block
+is an index table, and the third block is the elements themselves, which in turn
+are separated by input, output and patch constant or primitive elements.
+
+Signature elements capture much of the same data captured in the :ref:`SG1
+<ISG1>` parts. The use of an index table allows deduplication of data for a more
+compact final representation.
+
+The string table begins with a 32-bit unsigned integer specifying the table
+size. This string table uses the DXContainer format as implemented in LLVM. This
+format prefixes the string table with a null byte so that offset ``0`` is a null
+string, and pads to 32-byte alignment.
+
+The index table begins with a 32-bit unsigned integer specifying the size of the
+table, and is followed by that many 32-bit unsigned integers representing the
+table. The index table may or may not deduplicate repeated sequences (both DXC
+and Clang do). The indices signify the indices in the flattened aggregate
+representation which the signature element describes. A single semantic may have
+more than one entry in this table to denote the different attributes of its
+members.
+
+For example given the following code:
+
+.. code-block:: c
+
+  struct VSOut_1
+  {
+      float4 f3 : VOUT2;
+      float3 f4 : VOUT3;
+  };
+
+
+  struct VSOut
+  {
+      float4 f1 : VOUT0;
+      float2 f2[4] : VOUT1;
+      VSOut_1 s;
+      int4 f5 : VOUT4;
+  };
+
+  void main(out VSOut o1 : A) {
+  }
+
+The semantic ``A`` gets expanded into 5 output signature elements. Those
+elements are:
+
+.. note::
+
+  In the example below, it is a coincidence that the rows match the indices. In
+  more complicated examples with multiple semantics this is not the case.
+
+#. Index 0 starts at row 0, contains 4 columns, and is float32. This represents
+   ``f1`` in the source.
+#. Indices 1, 2, 3, and 4 start at row 1, contain 2 columns each, and are
+   float32. They represent ``f2``, which spreads across rows 1 - 4.
+#. Index 5 starts at row 5, contains 4 columns, and is float32. This represents
+   ``f3`` in the source.
+#. Index 6 starts at row 6, contains 3 columns, and is float32. This represents
+   ``f4`` in the source.
+#. Index 7 starts at row 7, contains 4 columns, and is a signed 32-bit integer.
+   This represents ``f5`` in the source.
+
+The LLVM ``obj2yaml`` tool can parse this data out of the PSV and present it in
+human readable YAML. For the example above it produces the output:
+
+.. code-block:: YAML
+
+  SigOutputElements:
+    - Name:            A
+      Indices:         [ 0 ]
+      StartRow:        0
+      Cols:            4
+      StartCol:        0
+      Allocated:       true
+      Kind:            Arbitrary
+      ComponentType:   Float32
+      Interpolation:   Linear
+      DynamicMask:     0x0
+      Stream:          0
+    - Name:            A
+      Indices:         [ 1, 2, 3, 4 ]
+      StartRow:        1
+      Cols:            2
+      StartCol:        0
+      Allocated:       true
+      Kind:            Arbitrary
+      ComponentType:   Float32
+      Interpolation:   Linear
+      DynamicMask:     0x0
+      Stream:          0
+    - Name:            A
+      Indices:         [ 5 ]
+      StartRow:        5
+      Cols:            4
+      StartCol:        0
+      Allocated:       true
+      Kind:            Arbitrary
+      ComponentType:   Float32
+      Interpolation:   Linear
+      DynamicMask:     0x0
+      Stream:          0
+    - Name:            A
+      Indices:         [ 6 ]
+      StartRow:        6
+      Cols:            3
+      StartCol:        0
+      Allocated:       true
+      Kind:            Arbitrary
+      ComponentType:   Float32
+      Interpolation:   Linear
+      DynamicMask:     0x0
+      Stream:          0
+    - Name:            A
+      Indices:         [ 7 ]
+      StartRow:        7
+      Cols:            4
+      StartCol:        0
+      Allocated:       true
+      Kind:            Arbitrary
+      ComponentType:   SInt32
+      Interpolation:   Constant
+      DynamicMask:     0x0
+      Stream:          0
+
+The number of signature elements of each type is encoded in the
+``llvm::dxbc::PSV::v1::RuntimeInfo`` structure. If any of the element count
+values are non-zero, the size of the ``ProgramSignatureElement`` structure is
+encoded next to allow versioning of that structure. Today there is only one
+version. Following the size field is the specified number of signature elements
+in the order input, output, then patch constant or primitive.
+
+Following the signature elements is a sequence of mask vectors encoded as a
+series of 32-bit integers. Each 32-bit integer in the mask encodes values for 8
+input/output/patch or primitive elements. The mask vector is filled from least
+significant bit to most significant bit with each added element shifting the
+previous elements left. A reader needs to consult the total number of vectors
+encoded in the ``RuntimeInfo`` structure to know how to read the mask vector.
+
+If the shader has ``UsesViewID`` enabled in the ``RuntimeInfo`` an output mask
+vector will be included. The output mask vector is four arrays of 32-bit
+unsigned integers. Each of the four arrays corresponds to an output stream.
+Geometry shaders have a maximum of four output streams, all other shader stages
+only support one output stream. Each bit in the mask vector identifies one
+column of an output from the output signature that depends on the ViewID.
+
+If the shader has ``UsesViewID`` enabled, it is a hull shader, and it has patch
+constant or primitive vector elements, a patch constant or primitive vector mask
+will be included. It is identical in structure to the output mask vector. Each
+bit in the mask vector identifies one column of a patch constant output which
+depends on the ViewID.
+
+The next series of mask vectors are similar in structure to the output mask
+vector, but they contain an extra dimension.
+
+The output/input map is encoded next if the shader has inputs and outputs. The
+output/input mask encodes which outputs are impacted by each column of each
+input. The size for each mask vector is the size of the output mask vector * the
+number of inputs * 4 (for each component). Each bit in the mask vector
+identifies one column of an output and a column of an input. A value of 1 means
+the output is impacted by the input.
+
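The mask sizes described above can be sketched in C as follows. All names are hypothetical. The bit position used by ``mask_test`` (vector index * 4 + column, counted from the least significant bit of the first integer) is an assumption consistent with "8 elements per 32-bit integer", not something this document states explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of 32-bit integers in a mask covering `vectors` elements:
   each integer holds 8 elements of 4 columns each. */
static uint32_t mask_dwords(uint32_t vectors) {
  return (vectors + 7) / 8;
}

/* Size of the output/input map in 32-bit integers: one output mask per
   input column, with 4 columns per input. */
static uint32_t io_map_dwords(uint32_t output_vectors, uint32_t input_vectors) {
  return mask_dwords(output_vectors) * input_vectors * 4;
}

/* Tests one column of one element in a mask (assumed bit layout). */
static int mask_test(const uint32_t *mask, uint32_t vectors, uint32_t vector,
                     uint32_t column) {
  assert(mask != NULL && vector < vectors && column < 4);
  uint32_t bit = vector * 4 + column;
  return (int)((mask[bit / 32] >> (bit % 32)) & 1u);
}
```

For example, a shader with 9 output elements and 3 input elements would need 2 integers per output mask and 24 integers for the output/input map.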
+If the shader is a hull shader, and it has inputs and patch outputs, an input to
+patch map will be included next. This is identical in structure to the
+output/input map. The dimensions are defined by the size of the patch constant
+or primitive vector mask * the number of inputs * 4 (for each component). Each
+bit in the mask vector identifies one column of a patch constant output and a
+column of an input. A value of 1 means the output is impacted by the input.
+
+If the shader is a domain shader, and it has outputs and patch outputs, an
+output patch map will be included next. This is identical in structure to the
+output/input map. The dimensions are defined by the size of the patch constant
+or primitive vector mask * the number of outputs * 4 (for each component). Each
+bit in the mask vector identifies one column of a patch constant input and a
+column of an output. A value of 1 means the output is impacted by the primitive
+input.
+
+SFI0 Part
+---------
+.. _SFI0:
+
+The SFI0 part encodes a 64-bit unsigned integer bitmask of the feature flags.
+This denotes which optional features the shader requires. The flag values are
+defined in `llvm/include/llvm/BinaryFormat/DXContainerConstants.def <https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/BinaryFormat/DXContainerConstants.def>`_.
diff --git a/llvm/docs/DirectXUsage.rst b/llvm/docs/DirectXUsage.rst
index 79543e19bd34bb..b6bb0fa34bae87 100644
--- a/llvm/docs/DirectXUsage.rst
+++ b/llvm/docs/DirectXUsage.rst
@@ -14,6 +14,7 @@ User Guide for the DirectX Target
    :hidden:
 
    DirectX/DXILArchitecture
+   DirectX/DXContainer
 
 Introduction
 ============
@@ -78,9 +79,11 @@ both ``DXBC`` and ``DXIL`` outputs, and the ultimate goal is to support both as
 code generation targets in LLVM, the LLVM codebase uses a more neutral name,
 ``DXContainer``.
 
-The ``DXContainer`` format is sparsely documented in the functional
+The ``DXcontainer`` format is sparsely documented in the functional
 specification, but a reference implementation exists in the
 `DirectXShaderCompiler. <https://github.com/microsoft/DirectXShaderCompiler>`_.
+The format is documented in the LLVM project docs as well (see
+:doc:`DirectX/DXContainer`).
 
 Support for generating ``DXContainer`` files in LLVM, is being added to the LLVM
 MC layer for object streamers and writers, and to the Object and ObjectYAML

@@ -78,9 +79,11 @@ both ``DXBC`` and ``DXIL`` outputs, and the ultimate goal is to support both as
code generation targets in LLVM, the LLVM codebase uses a more neutral name,
``DXContainer``.

The ``DXContainer`` format is sparsely documented in the functional
The ``DXcontainer`` format is sparsely documented in the functional
Contributor

Is the change to lowercase "c" here intentional? It looks like it is capitalized everywhere else.

Comment on lines 89 to 90
The part name indicates the format of the part data. There are 23 part headers
used by DXC and FXC:
Contributor

I think it may be possible to interpret this as saying that all these parts will be present, but I don't think that's really the case, is it?

It might be nice to know which ones may be generated by DXC, FXC or LLVM.

.. _DXIL:

The DXIL part is comprised of three data structures: the ``ProgramHeader``, the
``BitcodeHeader`` and the bitcode serialized LLVM IR Module.
Contributor

LLVM IR Module

But it isn't LLVM IR that matches the version of LLVM that this file lives in, is it?


HASH Part
---------
.. _HASH:
Contributor

Is this the hash generated by dxil.dll?

Collaborator Author

The HASH part is just an MD5, the hash that is included in the header is the one generated by dxil.dll.

identifies one column of an output and a column of an input. A value of 1 means
the output is impacted by the input.

If the shader is a hull shader, and it has inputs and patch outputs, an input to
Contributor

Is there such a thing as a hull shader without inputs or outputs? Similar question for DS.

Collaborator Author

🤷 I have no idea if it would be useful, but I think you can define one. We have a pseudo-code explanation of the format here:
https://github.com/microsoft/DirectXShaderCompiler/blob/main/include/dxc/DxilContainer/DxilPipelineStateValidation.h#L836

Contributor

@dmpots dmpots left a comment

This looks good according to my understanding.

more complicated examples with multiple semantics this is not the case.

#. Index 0 starts at row 0, contains 4 columns, and is float32. This represents
to ``f1`` in the source.
Contributor

"This represents to" is probably supposed to be "This corresponds to" or "This represents" (without the "to")


Each part begins with a part header. A part header includes the 4-character part
name, and a 32-bit unsigned integer specifying the size of the part data. The
part header is followed by ``Size`` bytes of data comprising the part.
Contributor

Can we say anything about the alignment of parts? I feel like they are supposed to be dword aligned, but not sure if that is enforced anywhere or just the common case.

Collaborator Author

In Clang I've enforced that they are 32-bit aligned, I don't think we enforce that in DXC, which is probably why we get so many ubsan failures in the container reader/writer code.

@llvm-beanz llvm-beanz merged commit afeedd9 into llvm:main May 6, 2024
5 checks passed
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

[DirectX] Document DXContainer format
4 participants