Update augment branch to use kernels for zero-fill before layernorm and rmsnorm by cbcase · Pull Request #1 · augmentcode/TransformerEngine

cbcase · 2024-10-16T23:51:37Z

This PR updates the TE code to use a fill kernel instead of memset to zero out buffers before calling the fast fp8 layernorm and rmsnorm kernels. We are making this change because the memset ops introduce gaps in CUDA graph execution.

MarkusRabe

My only comment is that it might be helpful to keep the existing code runnable, just hidden behind a flag.

MarkusRabe · 2024-10-17T17:32:20Z

transformer_engine/common/layer_norm/ln_api.cpp

        params.barrier = reinterpret_cast<int*>(barrier->data.dptr);
    }

+    const char *envval = std::getenv("NVTE_FORCE_MEMSET");


The only nit is that it would be good to document the flag. It is a fallback to the original behavior of the library.
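The diff excerpt only shows the `std::getenv("NVTE_FORCE_MEMSET")` read, not how the value is interpreted. As a rough sketch of what a documented fallback flag could look like, the snippet below assumes that any value other than unset, empty, or `"0"` selects the original memset path. The names `ZeroFillPath`, `force_memset_enabled`, and `select_zero_fill_path` are hypothetical and do not come from the PR.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical enum (not from the PR): the two ways to zero the buffers.
enum class ZeroFillPath { kFillKernel, kMemset };

// NVTE_FORCE_MEMSET: fallback to the original behavior of the library.
// Assumption for this sketch: unset, empty, or "0" keeps the fill-kernel path;
// any other value forces memset. The real parsing rule is not shown in the excerpt.
inline bool force_memset_enabled(const char *envval) {
  return envval != nullptr && envval[0] != '\0' && std::strcmp(envval, "0") != 0;
}

// Reads the environment variable and picks a zero-fill path accordingly.
inline ZeroFillPath select_zero_fill_path() {
  const char *envval = std::getenv("NVTE_FORCE_MEMSET");
  return force_memset_enabled(envval) ? ZeroFillPath::kMemset
                                      : ZeroFillPath::kFillKernel;
}
```

Keeping the parsing in a small pure function makes the flag's semantics easy to state in a comment and easy to check without touching the process environment.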

cbcase added 3 commits October 16, 2024 23:32

Use zero-out kernel for layer_norm

ec31df8

zero_out for rmsnorm too

a46780d

update version string with git commit too

6b86380

MarkusRabe approved these changes Oct 17, 2024

MarkusRabe reviewed Oct 17, 2024

add back orig memset with envvar [no-ci]

7f56a78

MarkusRabe reviewed Oct 17, 2024

add a comment

fa15a2d

cbcase merged commit 6bdede8 into v0.13-augment Oct 17, 2024

cbcase deleted the v0.13-augment-zero-out branch October 17, 2024 20:44

cbcase merged 5 commits into v0.13-augment from v0.13-augment-zero-out
