Add concat / slice / cast StableHLO converters (P1) #489

@michalharakal

Context

Follow-up to #483 (gather/embedding). Structural tensor ops that are load-bearing for LLM export but currently missing from the StableHLO emitter:

  • `concat` / `concatenate` / `cat` / `stack` — needed for KV-cache append, `rotate_half` inside RoPE, and any multi-head attention that concatenates per-head outputs.
  • `slice` — needed for KV-cache read, `rotate_half`'s `x[..., :half]` / `x[..., half:]` pattern, and any head-split.
  • `cast` / `convert` / `to` — needed for dtype transitions (FP16 activation → FP32 attention scores, int32 indices → whatever the embedding table is, etc.).

None of these exist in the emitter today. Every traced model that touches any of them fails at the converter registry's "no converter found" path.

Target lowerings

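All three map to single StableHLO ops (`stack` is the one exception: it inserts a new axis, so each operand needs a reshape before the concatenate):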

// concat along axis 0
%out = stablehlo.concatenate %a, %b, %c, dim = 0 : (tensor<..>, tensor<..>, tensor<..>) -> tensor<..>

// slice [start:limit:stride] along each dim
%out = stablehlo.slice %x [0:4:1, 0:8:1] : (tensor<8x16xf32>) -> tensor<4x8xf32>

// dtype conversion
%out = stablehlo.convert %x : (tensor<..xf32>) -> tensor<..xf16>
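
The lowerings above can be sketched as plain text emission with static shape inference. This is a minimal, language-neutral illustration in Python, not the project's converter API: the helper names (`tensor_type`, `emit_concatenate`, `emit_slice`, `emit_convert`) and their signatures are hypothetical, and only static shapes are handled.

```python
from typing import List, Optional, Sequence, Tuple


def tensor_type(shape: Sequence[int], dtype: str) -> str:
    """Render a ranked tensor type such as tensor<8x16xf32>."""
    dims = "x".join(str(d) for d in shape)
    return f"tensor<{dims}x{dtype}>" if shape else f"tensor<{dtype}>"


def emit_concatenate(result: str, operands: List[Tuple[str, Sequence[int]]],
                     dtype: str, axis: int = 0) -> str:
    """Emit stablehlo.concatenate; operands are (ssa_name, shape) pairs."""
    if not operands:
        raise ValueError("concatenate needs at least one operand")
    first = list(operands[0][1])
    rank = len(first)
    if axis < 0:
        axis += rank  # accept negative axes as many frontends do
    if not 0 <= axis < rank:
        raise ValueError(f"axis {axis} out of range for rank {rank}")
    out_shape = list(first)
    out_shape[axis] = 0
    for name, shape in operands:
        if len(shape) != rank:
            raise ValueError(f"{name}: rank mismatch")
        for d in range(rank):
            if d != axis and shape[d] != first[d]:
                raise ValueError(f"{name}: dim {d} differs off the concat axis")
        out_shape[axis] += shape[axis]
    names = ", ".join(name for name, _ in operands)
    in_types = ", ".join(tensor_type(shape, dtype) for _, shape in operands)
    return (f"{result} = stablehlo.concatenate {names}, dim = {axis} : "
            f"({in_types}) -> {tensor_type(out_shape, dtype)}")


def emit_slice(result: str, operand: str, shape: Sequence[int], dtype: str,
               starts: Sequence[int], limits: Sequence[int],
               strides: Optional[Sequence[int]] = None) -> str:
    """Emit a static stablehlo.slice; strides default to 1 per dimension."""
    strides = list(strides) if strides is not None else [1] * len(shape)
    if not (len(starts) == len(limits) == len(strides) == len(shape)):
        raise ValueError("start/limit/stride arrays must match the operand rank")
    out_shape = []
    for dim, start, limit, stride in zip(shape, starts, limits, strides):
        if stride <= 0 or not 0 <= start <= limit <= dim:
            raise ValueError("slice bounds outside the operand shape")
        out_shape.append((limit - start + stride - 1) // stride)  # ceil division
    ranges = ", ".join(f"{s}:{l}:{st}" for s, l, st in zip(starts, limits, strides))
    return (f"{result} = stablehlo.slice {operand} [{ranges}] : "
            f"({tensor_type(shape, dtype)}) -> {tensor_type(out_shape, dtype)}")


def emit_convert(result: str, operand: str, shape: Sequence[int],
                 src_dtype: str, dst_dtype: str) -> str:
    """Emit stablehlo.convert; the shape is unchanged, only the element type."""
    return (f"{result} = stablehlo.convert {operand} : "
            f"({tensor_type(shape, src_dtype)}) -> {tensor_type(shape, dst_dtype)}")
```

Calling `emit_slice("%out", "%x", [8, 16], "f32", [0, 0], [4, 8])` reproduces the slice line shown above. Computing the result type in the emitter keeps shape errors at export time rather than at StableHLO verification time.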

This PR

  1. Extend `ShapeOperationsConverter` with `concat` / `concatenate` / `cat` / `stack` and `slice` handling. The existing file already covers `reshape`, `flatten`, `squeeze`, and `unsqueeze`, so structural tensor ops fit there.
  2. Extend `MathOperationsConverter` with `cast` / `convert` / `to` handling. Cast is an elementwise type transformation, so it belongs with the other elementwise ops.
  3. Parameter reading:
    • concat: `axis` / `dim` (default 0), list of input operands.
    • slice: `start_indices` / `limit_indices` / `strides` arrays; default strides = 1 per dim.
    • cast: target `dtype` / `to_dtype` parameter.
  4. Unit tests in `ShapeOperationsConverterTest` and `MathOperationsConverterTest` covering each new op.
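
The parameter-reading rules in item 3 can be sketched as follows. This is an illustrative Python sketch under assumptions: the traced op's parameters are taken to be a plain string-keyed mapping, and the helper names (`first_present`, `read_concat_params`, `read_slice_params`, `read_cast_params`) are hypothetical rather than part of the existing converters.

```python
from typing import Any, Dict, Mapping, Sequence

_MISSING = object()  # sentinel so that None or 0 remain valid defaults


def first_present(params: Mapping[str, Any], names: Sequence[str],
                  default: Any = _MISSING) -> Any:
    """Return the value of the first alias found in params, else the default."""
    for name in names:
        if name in params:
            return params[name]
    if default is _MISSING:
        raise KeyError(f"missing required parameter, expected one of {list(names)}")
    return default


def read_concat_params(params: Mapping[str, Any]) -> Dict[str, int]:
    # `axis` and `dim` are aliases; the default axis is 0.
    return {"axis": int(first_present(params, ("axis", "dim"), 0))}


def read_slice_params(params: Mapping[str, Any], rank: int) -> Dict[str, list]:
    # start_indices and limit_indices are required; strides default to 1 per dim.
    starts = [int(v) for v in params["start_indices"]]
    limits = [int(v) for v in params["limit_indices"]]
    raw_strides = params.get("strides")
    strides = [1] * rank if raw_strides is None else [int(v) for v in raw_strides]
    if not (len(starts) == len(limits) == len(strides) == rank):
        raise ValueError("slice parameter arrays must have one entry per dimension")
    return {"start_indices": starts, "limit_indices": limits, "strides": strides}


def read_cast_params(params: Mapping[str, Any]) -> Dict[str, str]:
    # `dtype` and `to_dtype` are aliases; there is no sensible default target.
    return {"dtype": str(first_present(params, ("dtype", "to_dtype")))}
```

For example, `read_concat_params({"dim": 2})` yields `{"axis": 2}`, and `read_slice_params` fills in unit strides when `strides` is absent. Normalizing aliases in one place means the emitter only ever sees a single canonical key per parameter.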

Out of scope

  • `gather` / `scatter` — separate converter (#483 already landed `gather`).
  • Advanced slice forms (dynamic slice, strided slice with runtime bounds). The first PR targets static slice.
  • `cast` quantization-aware conversion — depends on further P0-1 work.
  • RoPE integration. Separate issue.
