torchortho #24

@K-H-Ismail

KAT used the variance-preserving initialization, as formulated in Kaiming initialization, for learnable rational activations. This requires computing the second-order moment of a rational function, which has a complicated closed form. We show that this second-order moment is easy to compute when the activation is built from orthogonal functions. As examples, we used orthogonal polynomials (Hermite) and trigonometric functions (Fourier) and showed that they achieve better results in image classification on ImageNet with ConvNeXt and in next-token prediction on OpenWebText with GPT-2.
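To illustrate why orthogonality makes the second-order moment easy, here is a minimal NumPy sketch. It is not the torchortho implementation; it assumes a standard normal input and the probabilists' Hermite basis He_n, for which E[He_m(x) He_n(x)] = n! when m = n and 0 otherwise, so the second moment of f(x) = sum_n a_n He_n(x) reduces to sum_n a_n^2 n!. The function names are hypothetical, and the paper may use a different normalization.

```python
import math

import numpy as np
from numpy.polynomial import hermite_e as He


def second_moment_closed_form(coeffs):
    """E[f(x)^2] for f = sum_n a_n He_n(x), x ~ N(0, 1), via orthogonality."""
    coeffs = np.asarray(coeffs, dtype=float)
    factorials = np.array(
        [math.factorial(n) for n in range(len(coeffs))], dtype=float
    )
    return float(np.sum(coeffs**2 * factorials))


def second_moment_quadrature(coeffs, num_nodes=32):
    """Same quantity by Gauss-Hermite quadrature, as an independent check."""
    # hermegauss uses the weight exp(-x^2 / 2); its weights sum to sqrt(2*pi),
    # so dividing by sqrt(2*pi) turns the sum into a Gaussian expectation.
    nodes, weights = He.hermegauss(num_nodes)
    values = He.hermeval(nodes, coeffs)
    return float(np.sum(weights * values**2) / math.sqrt(2.0 * math.pi))


def rescale_to_unit_second_moment(coeffs):
    """Scale the coefficients so that E[f(x)^2] = 1 (assumed target)."""
    coeffs = np.asarray(coeffs, dtype=float)
    return coeffs / math.sqrt(second_moment_closed_form(coeffs))


if __name__ == "__main__":
    a = [0.5, 1.0, -0.3, 0.2]  # illustrative coefficients, not from the paper
    print(second_moment_closed_form(a), second_moment_quadrature(a))
    print(second_moment_closed_form(rescale_to_unit_second_moment(a)))
```

The closed form needs only the coefficients, whereas a rational activation would require integrating a ratio of polynomials against the Gaussian density. An analogous orthogonality argument applies to a Fourier basis, with a different weight and normalization.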

📄 Paper: Learnable Polynomial, Trigonometric, and Tropical Activations
💻 Code: torchortho on GitHub
