Register silu alias in ActivationOperationsConverter (P1) #484

@michalharakal

Description

Context

`ActivationOperationsConverter` currently registers the activation function as `swish` and lowers it to `x * sigmoid(x)`. Every Llama / Mistral / Qwen / Gemma family model names the same function `silu` (SiLU — Sigmoid Linear Unit) in their graph metadata, not `swish`. PyTorch's `F.silu` and HuggingFace's `ACT2FN["silu"]` both emit `silu`.

Today, a traced Llama model using the `silu` activation falls through to the converter registry's "no converter found" path and fails to convert.

This PR

Strictly additive, scoped to a couple of lines of code plus a small test addition:

  1. Add `"silu"` and `"SiLU"` to `ActivationOperationsConverter.supportedOperations`.
  2. Add the `"silu"` case to the `when` in `convert`, dispatching to the existing `convertSwish` (same lowering: `x * sigmoid(x)`).
  3. Extend `ActivationOperationsConverterTest`: assert `registry.isSupported("silu")` and that a graph using `silu` emits a module containing both `stablehlo.exponential` and `stablehlo.multiply` — the same assertions as the existing Swish test, just with the alias name.
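The alias dispatch described above can be sketched as follows. This is a minimal, self-contained Kotlin illustration: the scalar `convert`/`convertSwish` signatures are hypothetical stand-ins (the real converter operates on graph ops and emits StableHLO, not doubles), but the dispatch shape and the shared `x * sigmoid(x)` lowering match the change.

```kotlin
import kotlin.math.exp

// Illustrative scalar stand-ins; the real converter lowers graph ops.
fun sigmoid(x: Double): Double = 1.0 / (1.0 + exp(-x))

// Existing lowering, reused unchanged: swish(x) = x * sigmoid(x).
fun convertSwish(x: Double): Double = x * sigmoid(x)

fun convert(op: String, x: Double): Double = when (op) {
    // "silu" / "SiLU" are pure aliases dispatching to convertSwish.
    "swish", "silu", "SiLU" -> convertSwish(x)
    else -> error("no converter found for $op")
}

fun main() {
    println(convert("silu", 0.0))                           // prints 0.0
    println(convert("silu", 2.0) == convert("swish", 2.0))  // prints true
}
```

The follow-up test then only needs to assert that the alias is supported and yields the same result as the existing `swish` path.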

Why this is its own issue

Scoping the change to a single alias keeps it mechanically reviewable in under 30 seconds and keeps any follow-on work (proper GELU/gelu_tanh registration, a unified activation metadata table) decoupled from this trivial PR.

Out of scope

  • New activation functions.
  • Any other alias (gelu_new, relu6, etc.). Handle in follow-ups if traced models surface them.
