
LELU Activation Function: Proposal for PyTorch #165982

@falseywinchnet


Summary

We propose the introduction of a fixed-scale sigmoid activation function to be called LELU (the Logistic Error Linear Unit). LELU serves as a lower-cost, analytically consistent alternative to GELU, derived from the correspondence between the logistic sigmoid and the logistic distribution's cumulative distribution function (CDF) and probability density function (PDF).


Motivation

The Gaussian Error Linear Unit (GELU) is designed around the Gaussian CDF, offering smoother activation transitions than ReLU or ELU. However, the exact GELU requires an erf evaluation, and its common tanh form adds cubic terms while remaining only an approximation of the true Gaussian gate.

In contrast, LELU is grounded in the logistic CDF, which is analytically simpler, fully differentiable, and exhibits curvature similar to the Gaussian's at lower computational cost. The logistic sigmoid family is also a natural match for many of the normalized activation distributions observed in deep networks.
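
As a rough illustration of the curvature claim, the following minimal sketch (illustrative only, not part of the proposed API) compares the variance-matched logistic CDF that gates LELU against the exact Gaussian CDF that gates GELU:

import math
import torch

x = torch.linspace(-6.0, 6.0, steps=1001)

# Logistic CDF rescaled to unit variance (LELU's gate).
logistic_cdf = torch.sigmoid((math.pi / math.sqrt(3.0)) * x)

# Exact Gaussian CDF (exact GELU's gate).
gaussian_cdf = 0.5 * (1.0 + torch.erf(x / math.sqrt(2.0)))

# Largest pointwise gap between the two gates on this grid.
print((logistic_cdf - gaussian_cdf).abs().max())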


Derivation Overview

LELU is defined analogously to GELU:

LELU(x) = x * 0.5 * (1 + tanh(a * x)) = x * sigmoid(2a * x)

where a = π / (2√3) arises from matching the variance of the logistic distribution to that of the standard normal: the standard logistic distribution has variance π²/3, so rescaling by √3/π gives a unit-variance logistic CDF, sigmoid(πx / √3); rewriting that gate as 0.5 * (1 + tanh(πx / (2√3))) yields a = π / (2√3).
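
A minimal numerical check of this equivalence (illustrative only), confirming that the tanh form above matches the sigmoid form used in the reference implementation below:

import math
import torch

a = math.pi / (2.0 * math.sqrt(3.0))
x = torch.linspace(-8.0, 8.0, steps=401)

# tanh form from the definition above.
lelu_tanh = x * 0.5 * (1.0 + torch.tanh(a * x))

# Equivalent sigmoid form, since 0.5 * (1 + tanh(z)) == sigmoid(2 * z).
lelu_sigmoid = x * torch.sigmoid(2.0 * a * x)  # 2a = pi / sqrt(3)

print(torch.allclose(lelu_tanh, lelu_sigmoid))  # expected: True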


Proposal

Add torch.nn.LELU and torch.nn.functional.lelu

Documentation and tests can mirror the structure of torch.nn.GELU, with additional notes on logistic variance equivalence and performance observations.

import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class LELU(nn.Module):
    """Logistic Error Linear Unit: x * sigmoid((pi / sqrt(3)) * x)."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The gate is the CDF of a logistic distribution rescaled to unit
        # variance, mirroring GELU's use of the Gaussian CDF.
        return x * torch.sigmoid((math.pi / math.sqrt(3.0)) * x)
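
torch.nn.functional.lelu does not exist today; the following is a minimal sketch of what the functional counterpart could look like (a hypothetical lelu helper, reusing the imports and LELU module above), checked against the module form and compared with F.gelu:

def lelu(input: torch.Tensor) -> torch.Tensor:
    # Hypothetical functional form of the proposed activation.
    return input * torch.sigmoid((math.pi / math.sqrt(3.0)) * input)

x = torch.randn(4, 8)
print(torch.allclose(LELU()(x), lelu(x)))   # module and functional forms agree
print((lelu(x) - F.gelu(x)).abs().max())    # pointwise gap versus GELU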

cc @albanD @mruberry @jbschlosser @walterddr @mikaylagawarecki
