[QST] How to set tile size for fp64 in examples/36_gather_scatter_fusion? #612

@umiswing

Hello! I'm trying to make examples/36_gather_scatter_fusion work for fp64. A, B, C, and D are all fp64 and row-major, the problem size is (m, n, k) = (4, 4, 4), and index_size = 2.
I found examples/18_ampere_fp64_tensorop_affine2_gemm and used the same tile sizes as that example:

// This code section describes the tile size a thread block will compute
using ShapeMMAThreadBlock = cutlass::gemm::GemmShape<128, 128, 16>; 
// This code section describes tile size a warp will compute
using ShapeMMAWarp = cutlass::gemm::GemmShape<32, 64, 16>; 
// This code section describes the size of MMA op
using ShapeMMAOp = cutlass::gemm::GemmShape<8, 8, 4>; 
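To make the relationship between the three shapes concrete, here is a minimal compile-time sketch. It assumes the usual tiling hierarchy, where the threadblock tile is split evenly into warp tiles and each warp tile into MMA-op tiles, and that each warp has 32 threads. `ToyGemmShape`, `tiles_evenly`, and `warps_per_threadblock` are hypothetical stand-ins written for illustration, not CUTLASS types or functions.

```cpp
// Hypothetical stand-in for cutlass::gemm::GemmShape; not the CUTLASS type.
template <int M, int N, int K>
struct ToyGemmShape {
  static constexpr int kM = M;
  static constexpr int kN = N;
  static constexpr int kK = K;
};

// True when every dimension of Outer is a whole multiple of the same dimension of Inner.
template <typename Outer, typename Inner>
constexpr bool tiles_evenly() {
  return Outer::kM % Inner::kM == 0 &&
         Outer::kN % Inner::kN == 0 &&
         Outer::kK % Inner::kK == 0;
}

// Number of warp tiles that fit in one threadblock tile.
template <typename ThreadBlock, typename Warp>
constexpr int warps_per_threadblock() {
  return (ThreadBlock::kM / Warp::kM) *
         (ThreadBlock::kN / Warp::kN) *
         (ThreadBlock::kK / Warp::kK);
}

// The shapes from the snippet above.
using ToyThreadBlock = ToyGemmShape<128, 128, 16>;
using ToyWarp        = ToyGemmShape<32, 64, 16>;
using ToyOp          = ToyGemmShape<8, 8, 4>;

// Each level divides the level above it.
static_assert(tiles_evenly<ToyThreadBlock, ToyWarp>(),
              "threadblock tile must be a multiple of the warp tile");
static_assert(tiles_evenly<ToyWarp, ToyOp>(),
              "warp tile must be a multiple of the MMA op tile");

// 4 x 2 x 1 = 8 warps; with 32 threads per warp that is 256 threads.
static_assert(warps_per_threadblock<ToyThreadBlock, ToyWarp>() == 8,
              "expected 8 warps per threadblock");
```

Under these assumptions the shapes divide evenly and give 8 warps, that is 256 threads per threadblock. So divisibility between the three tile levels does not appear to be the problem with this configuration.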

But a compilation error occurs:

/home/me/cutlass/include/cutlass/transform/threadblock/predicated_tile_access_iterator.h(353): error: static assertion failed with "Vectors implied by the thread map must be divisible by the access type."
          detected during:
            instantiation of class "cutlass::transform::threadblock::PredicatedTileAccessIterator<Shape_, Element_, cutlass::layout::PitchLinear, AdvanceRank, ThreadMap_, AccessType_, Gather> [with Shape_=cutlass::layout::PitchLinearShape<16, 128>, Element_=double, AdvanceRank=0, ThreadMap_=cutlass::transform::PitchLinearWarpRakedThreadMap<cutlass::layout::PitchLinearShape<16, 128>, 256, cutlass::layout::PitchLinearShape<16, 2>, 1>, AccessType_=cutlass::Array<double, 2, true>, Gather=true]" 
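Reading the template arguments in this message, the thread map's last argument is 1 (one element per access), while the access type is `cutlass::Array<double, 2, true>` (two elements). The sketch below models the failing condition under the assumption that the assertion checks that the thread map's elements per access is a whole multiple of the access type's element count. `ToyThreadMap`, `ToyAccessType`, and `access_is_compatible` are hypothetical names used only for illustration; they are not the CUTLASS classes.

```cpp
// Hypothetical stand-in for a thread map: only the elements-per-access value is modeled.
template <int ElementsPerAccess>
struct ToyThreadMap {
  static constexpr int kElementsPerAccess = ElementsPerAccess;
};

// Hypothetical stand-in for an Array<T, N> access type: only the element count is modeled.
template <typename T, int N>
struct ToyAccessType {
  static constexpr int kElements = N;
};

// Assumed form of the check: the vector implied by the thread map must be
// a whole multiple of the access type's width.
template <typename ThreadMap, typename AccessType>
constexpr bool access_is_compatible() {
  return ThreadMap::kElementsPerAccess % AccessType::kElements == 0;
}

// Values read from the error message: 1 element per access vs. Array<double, 2>.
static_assert(!access_is_compatible<ToyThreadMap<1>, ToyAccessType<double, 2>>(),
              "1 % 2 != 0, so this combination is rejected");

// If both sides agreed (for example 1 and 1, or 2 and 2) the check would pass.
static_assert(access_is_compatible<ToyThreadMap<1>, ToyAccessType<double, 1>>(),
              "1 % 1 == 0");
static_assert(access_is_compatible<ToyThreadMap<2>, ToyAccessType<double, 2>>(),
              "2 % 2 == 0");
```

If this reading is right, the mismatch is between the per-thread access width chosen for the thread map and the width of the access type. Neither of those values is a tile dimension, which may be why changing the tile shapes alone does not make the error go away.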

Based on #566, I guessed that I should set the tile size to half of the tile size used for fp32, so I tried:

using ShapeMMAThreadBlock =
    cutlass::gemm::GemmShape<128, 128, 8>; 
// This code section describes tile size a warp will compute
using ShapeMMAWarp = cutlass::gemm::GemmShape<64, 64, 8>; 
// This code section describes the size of MMA op
using ShapeMMAOp = cutlass::gemm::GemmShape<16, 8, 4>;

But a similar compilation error occurs. :(

What tile size works for fp64? Are there any rules I can refer to for choosing a correct tile size? Also, is it possible to change the tile size to improve performance?
