Open
Description
Thanks for the tutorial! While studying fused_layer_norm, I noticed the cap on the maximum BLOCK_SIZE. I tried removing the constraint, and the results did become incorrect, but I could not understand why the constraint exists. Does it mean there is an upper bound on BLOCK_SIZE, beyond which the kernel behaves incorrectly?
MAX_FUSED_SIZE = 65536 // x.element_size()
BLOCK_SIZE = min(MAX_FUSED_SIZE, triton.next_power_of_2(N))
if N > BLOCK_SIZE:
    raise RuntimeError("This layer norm doesn't support feature dim >= 64KB.")
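To make the arithmetic in the snippet above concrete, here is a minimal pure-Python sketch of the same sizing logic. The names `next_power_of_2` and `pick_block_size` are hypothetical helpers written for this illustration: the first is assumed to behave like `triton.next_power_of_2` for positive integers, and the second takes the element size in bytes as a plain integer instead of calling `x.element_size()`.

```python
def next_power_of_2(n: int) -> int:
    """Pure-Python stand-in (assumption) for triton.next_power_of_2, for n >= 1."""
    return 1 if n <= 1 else 1 << (n - 1).bit_length()


def pick_block_size(n_cols: int, element_size: int) -> int:
    """Mirror the host-side sizing logic quoted above; element_size is in bytes."""
    max_fused_size = 65536 // element_size  # number of elements that fit in 64KB
    block_size = min(max_fused_size, next_power_of_2(n_cols))
    if n_cols > block_size:
        # The cap was hit, so one block can no longer cover the whole row.
        raise RuntimeError("This layer norm doesn't support feature dim >= 64KB.")
    return block_size


# Try a few row widths for 2-byte and 4-byte elements.
for element_size, name in [(2, "float16"), (4, "float32")]:
    for n_cols in (1000, 16384, 32768, 32769):
        try:
            print(name, n_cols, "->", pick_block_size(n_cols, element_size))
        except RuntimeError as err:
            print(name, n_cols, "-> rejected:", err)
```

With 2-byte elements the cap works out to 32768 elements, and with 4-byte elements to 16384. The guard only fires when N is larger than the capped BLOCK_SIZE, which suggests (this is an inference, not something the snippet states) that some part of the kernel expects a single block to span an entire row.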