sym_max support in ExecuTorch #11988

Open
@guangy10

🐛 Describe the bug

I hit this issue while debugging the gemma-3 error in optimum-executorch (CI link).

Inspecting the op_overload that the error complains about shows that the function object is actually sym_max, which is introduced by these lines:

https://github.com/huggingface/transformers/blob/1d45d90e5d1552eccb6d8cc9b7bba283ccefb808/src/transformers/cache_utils.py#L1725-L1732

        if self.is_sliding[layer_idx]:
            query_length = cache_position.shape[0]
            ...
            local_mask_kv_length = max(query_length, self.sliding_window)
            return local_mask_kv_length, local_mask_kv_offset

This code is in the HybridCache class of upstream transformers, and the sym_max is expected now that the cache_position dim is set to be dynamic in huggingface/optimum-executorch#73.
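To make the dependence on the dynamic dim concrete, here is a minimal pure-Python sketch of the `max(...)` expression quoted above, using plain ints. The helper name `kv_length_for_sliding_layer` and the window size of 4 are hypothetical and chosen only for illustration; this is not the transformers implementation.

```python
# Illustrative only: mirrors the max(...) expression quoted above with plain ints.
# The function name and the window size are hypothetical, not taken from transformers.

def kv_length_for_sliding_layer(query_length: int, sliding_window: int) -> int:
    """Return the local mask KV length for a sliding-window layer."""
    return max(query_length, sliding_window)


SLIDING_WINDOW = 4  # assumed value for the demo, not gemma-3's real setting

# Decode step: a single new token, so the window size wins.
decode_len = kv_length_for_sliding_layer(1, SLIDING_WINDOW)

# Prefill with a prompt longer than the window: the query length wins.
prefill_len = kv_length_for_sliding_layer(10, SLIDING_WINDOW)

print(decode_len, prefill_len)
```

Because the result changes with `query_length`, it cannot be fixed at trace time once that dimension is dynamic, which is consistent with the symbolic sym_max node described in this issue.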

The export itself works and leaves sym_max in the exported graph (test_hybrid_cache_exportability passes in transformers), but lowering the graph further to ExecuTorch fails in the to_executorch call.

After chatting with @kimishpatel, it seems we are missing an implementation of sym_max as an op.

Versions

trunk
