Matrix B skips LDS #264

Closed
ltqin wants to merge 36 commits into develop from bmatrix_skip_lds

Collaborator

@ltqin ltqin commented May 30, 2022

No description provided.

@ltqin ltqin requested a review from zjing14 May 30, 2022 11:28

// return block_id to C matrix tile idx (m0, n0) mapping
__host__ __device__ static constexpr auto
MakeDefaultBlock2CTileMap(const CGridDesc_M_N& c_grid_desc_m_n, index_t M01, index_t N01)
Contributor

This function has been moved to its own classes. Please see PR #235 for reference.

do
{
a_blockwise_copy.RunRead(a_grid_desc_k0_m_k1, a_grid_buf);
b_threadwise_copy.Run(b_grid_desc_k0_k1_k2_n0_n1_n2_n3_k3,
Contributor

I have a feeling we could configure the B BlockwiseCopy so that it simply skips the load to LDS. That way we could reuse the same GridwiseGemmPipeline_v1 and BlockwiseGemm. Is that doable?
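To make the suggestion concrete, here is a minimal host-side sketch of the idea, not code from this PR or from the library. It assumes a hypothetical compile-time flag `SkipLds` on a copy stage; `BlockCopySketch`, `Operand`, and `DotPipelineSketch` are invented names for illustration, and plain arrays stand in for global memory, registers, and LDS. The point it shows is that one pipeline body can serve both cases when the LDS write is compiled out by the copy stage itself. It needs C++17 for `if constexpr`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a block-level copy stage (not the library's API).
// SkipLds is an assumed compile-time switch: when true, data stays in registers.
template <bool SkipLds, std::size_t N>
struct BlockCopySketch
{
    std::array<float, N> regs{};  // models the per-thread register buffer
    std::size_t lds_writes = 0;   // counts how often the LDS stage actually ran

    // Global memory -> registers. Always performed.
    void RunRead(const std::array<float, N>& global) { regs = global; }

    // Registers -> LDS. Becomes a no-op when SkipLds is true.
    void RunWrite(std::array<float, N>& lds)
    {
        if constexpr(!SkipLds)
        {
            lds = regs;
            ++lds_writes;
        }
    }

    // The buffer the math stage should read from: registers or LDS.
    const std::array<float, N>& Operand(const std::array<float, N>& lds) const
    {
        if constexpr(SkipLds)
            return regs;
        else
            return lds;
    }
};

// One pipeline body for both configurations; only the B copy type differs.
template <bool SkipLdsB, std::size_t N>
float DotPipelineSketch(const std::array<float, N>& a_global,
                        const std::array<float, N>& b_global,
                        std::size_t& b_lds_writes)
{
    std::array<float, N> a_lds{};
    std::array<float, N> b_lds{};

    BlockCopySketch<false, N> a_copy;    // A always goes through LDS
    BlockCopySketch<SkipLdsB, N> b_copy; // B optionally bypasses LDS

    a_copy.RunRead(a_global);
    b_copy.RunRead(b_global);
    a_copy.RunWrite(a_lds);
    b_copy.RunWrite(b_lds);

    const auto& a = a_copy.Operand(a_lds);
    const auto& b = b_copy.Operand(b_lds);

    float acc = 0.0f;
    for(std::size_t i = 0; i < N; ++i)
        acc += a[i] * b[i];

    b_lds_writes = b_copy.lds_writes;
    return acc;
}
```

Calling `DotPipelineSketch<false>(a, b, writes)` and `DotPipelineSketch<true>(a, b, writes)` on the same inputs gives the same result, and only the first records a write to the modeled LDS buffer for B. Whether the real blockwise copy and blockwise GEMM can be parameterized this way depends on thread-to-data layout details this sketch does not model.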

__host__ __device__ static constexpr index_t
CalculateGridSize(const CGridDesc_M_N& c_grid_desc_m_n)
{
const auto M = c_grid_desc_m_n.GetLength(I0);
Contributor

This functionality has been moved to classes. See PR #235.

@ltqin ltqin closed this Jun 20, 2022
@illsilin illsilin deleted the bmatrix_skip_lds branch December 7, 2023 18:41

2 participants