🐛 Describe the bug
Hit this issue while debugging the gemma-3 error in optimum-executorch (CI link).
Upon checking the op_overload that the error complains about, the function object is actually a sym_max, which is introduced via this line
    if self.is_sliding[layer_idx]:
        query_length = cache_position.shape[0]
        ...
        local_mask_kv_length = max(query_length, self.sliding_window)
        return local_mask_kv_length, local_mask_kv_offset
in the HybridCache class in upstream transformers. This is expected, since we now set the cache_position dim to be dynamic in huggingface/optimum-executorch#73.
Export works fine and keeps the sym_max in the exported graph (test_hybrid_cache_exportability passes in transformers), but when the graph is lowered further to executorch, the to_executorch call fails.
After chatting with @kimishpatel, it seems we are missing an implementation of sym_max as an op.
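To illustrate why a dynamic cache_position dim turns the max(...) call from the snippet above into a graph node, here is a toy, stdlib-only model. It is not how torch.export or executorch work internally: SymDim, SymMax, and mask_kv_length are hypothetical names made up for this sketch, and the sliding window value of 4 is an assumed example.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class SymDim:
    """Hypothetical stand-in for a dimension whose size is unknown at export time."""
    name: str


Operand = Union[SymDim, int]


@dataclass(frozen=True)
class SymMax:
    """Hypothetical graph node that records a deferred max(lhs, rhs)."""
    lhs: Operand
    rhs: Operand

    def evaluate(self, env: Dict[str, int]) -> int:
        # At "runtime", symbols are looked up in env and plain ints are used as-is.
        def resolve(value: Operand) -> int:
            return env[value.name] if isinstance(value, SymDim) else value

        return max(resolve(self.lhs), resolve(self.rhs))


def mask_kv_length(query_length: Operand, sliding_window: Operand):
    """Toy mirror of max(query_length, self.sliding_window)."""
    if isinstance(query_length, int) and isinstance(sliding_window, int):
        # Static shapes: the result is known now and folds into a constant.
        return max(query_length, sliding_window)
    # Dynamic shape: the comparison cannot be decided yet, so record a node
    # that every later lowering step has to know how to handle.
    return SymMax(query_length, sliding_window)


static_result = mask_kv_length(1, 4)                  # plain int, no node
dynamic_result = mask_kv_length(SymDim("seq_len"), 4)  # SymMax node
short_prompt = dynamic_result.evaluate({"seq_len": 1})
long_prompt = dynamic_result.evaluate({"seq_len": 8})
```

With a static query length the max disappears from the graph. With a dynamic one the node survives export, and a backend that has no implementation for it fails at lowering time, which matches the to_executorch failure described here.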
Versions
trunk