import { Callout } from "nextra-theme-docs"
import { Tab, Tabs } from "nextra-theme-docs"

SubNorm (Attention)

Apply an additional layer normalization before the attention output projection.

PreNorm(
    Attention(
        dim=768,
        plugins=[
            SubNorm(dim=768)
        ],
    ),
    dim=768,
)

This plugin implements Sub-LN for Foundation Transformers. Note that Sub-LN presumes Pre-LN rather than Post-LN, so the attention layer should be wrapped in `PreNorm` as shown above.
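
To show where the two normalizations sit, here is a minimal pure-Python sketch of single-head self-attention with Sub-LN. It is an illustration under stated assumptions, not this library's implementation: the function names (`layer_norm`, `sub_ln_attention`) and weight arguments are hypothetical, learned gain and bias are omitted, and mapping the first normalization to `PreNorm` and the second to `SubNorm` is an assumed reading of the example above.

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalize one vector to zero mean and unit variance (no learned gain/bias)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def matvec(w, x):
    """Multiply matrix w (a list of rows) by vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sub_ln_attention(tokens, wq, wk, wv, wo):
    """Hypothetical single-head self-attention with Sub-LN and a residual connection."""
    dim = len(tokens[0])
    # Pre-LN: normalize the sublayer input before the q/k/v projections
    # (assumed to correspond to the PreNorm wrapper).
    normed = [layer_norm(t) for t in tokens]
    q = [matvec(wq, t) for t in normed]
    k = [matvec(wk, t) for t in normed]
    v = [matvec(wv, t) for t in normed]
    out = []
    for i, qi in enumerate(q):
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(dim) for kj in k]
        weights = softmax(scores)
        mixed = [sum(w * vj[d] for w, vj in zip(weights, v)) for d in range(dim)]
        # Sub-LN: the extra normalization right before the output projection
        # (assumed to correspond to the SubNorm plugin).
        projected = matvec(wo, layer_norm(mixed))
        # Residual connection uses the un-normalized input, as in Pre-LN.
        out.append([t + p for t, p in zip(tokens[i], projected)])
    return out

# Toy usage with identity weights and two 2-dimensional tokens.
identity = [[1.0, 0.0], [0.0, 1.0]]
tokens = [[1.0, 2.0], [3.0, 5.0]]
result = sub_ln_attention(tokens, identity, identity, identity, identity)
```

The residual path adds the raw input back, so the only normalized tensors are the input to the q/k/v projections and the input to the output projection.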