Tensile is a tool for creating benchmark-driven backend libraries for GEMMs, GEMM-like problems (such as batched GEMM), and general N-dimensional tensor contractions on a GPU. The Tensile library is mainly used as a backend library for rocBLAS. Tensile acts as the performance backbone for a wide variety of 'compute' applications running on AMD GPUs.
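For readers new to the terminology, the problem classes named above can be sketched in plain Python. This is only a reference for the math, assuming the conventional BLAS definition of GEMM (`C = alpha * A * B + beta * C`); the function names `gemm`, `batched_gemm`, and `contract` are hypothetical and are not part of the Tensile or rocBLAS API.

```python
# Reference semantics only: plain Python, not Tensile or rocBLAS code.
# All function names here are illustrative assumptions.

def gemm(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c for matrices stored as nested lists."""
    m, k, n = len(a), len(b), len(b[0])
    return [
        [alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
         for j in range(n)]
        for i in range(m)
    ]

def batched_gemm(alpha, a_batch, b_batch, beta, c_batch):
    """Apply the same GEMM independently to each (A, B, C) triple in a batch."""
    return [gemm(alpha, a, b, beta, c)
            for a, b, c in zip(a_batch, b_batch, c_batch)]

def contract(a, b):
    """Tensor contraction over two indices: C[i][j] = sum_{k,l} A[i][k][l] * B[k][l][j]."""
    i_dim, k_dim, l_dim, j_dim = len(a), len(b), len(b[0]), len(b[0][0])
    return [
        [sum(a[i][k][l] * b[k][l][j] for k in range(k_dim) for l in range(l_dim))
         for j in range(j_dim)]
        for i in range(i_dim)
    ]

if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    zero = [[0, 0], [0, 0]]
    print(gemm(1, a, b, 0, zero))
    print(batched_gemm(1, [a, a], [b, b], 0, [zero, zero]))
```

A GEMM is the special case of a contraction with one summed index, and a batched GEMM adds an index that is carried through unsummed.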
Note
The published documentation is available at Tensile in an organized, easy-to-read format, with search and a table of contents. The documentation source files reside in the Tensile/docs/src folder of this repository. As with all ROCm projects, the documentation is open source. For more information on contributing to the documentation, see Contribute to ROCm documentation.