[QwixOdmlOp] Update odml_op to 1) handle every leaf of pytree output activations for fake quantization and 2) allow ops to veto quantization requests from previous ops if the op is excluded from quantization. #223

Merged
copybara-service[bot] merged 1 commit into main from test_877766241 on Mar 6, 2026

copybara-service[bot] commented on Mar 3, 2026

[QwixOdmlOp] Update odml_op to 1) handle every leaf of pytree output activations for fake quantization and 2) allow ops to veto quantization requests from previous ops if the op is excluded from quantization.

  1. The `_fake_quant_output` method now iterates through all `jax.Array` leaves within the provided output structure. This ensures that auxiliary data, such as activation flags and quantization rules, is applied to each individual array even when the output is a tuple or list, fixing issues such as untagged arrays within `down_block` outputs.

  2. The `_maybe_fake_quant` method now respects the current op's `act_qtype` when it is explicitly set to `None`: an op that is excluded from quantization vetoes quantization requests from previous ops.

  3. Adds extensive inline comments to `_maybe_fake_quant` for better readability.
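
The first two changes can be illustrated with a minimal sketch. This is not the code from `odml_op`: the names `resolve_act_qtype`, `fake_quant_int8`, `fake_quant_output`, and the `_UNSET` sentinel are hypothetical, the int8 scheme is a simple symmetric per-tensor one chosen for illustration, and the sketch applies fake quantization directly to each leaf instead of attaching auxiliary data to it. It only assumes that the output is a pytree and that an explicit `None` for the current op's `act_qtype` means "excluded from quantization".

```python
import jax
import jax.numpy as jnp

# Hypothetical sentinel: the current op expresses no preference of its own.
_UNSET = object()


def resolve_act_qtype(requested_qtype, op_act_qtype=_UNSET):
    """Decide which qtype, if any, applies to the current op's output."""
    if op_act_qtype is None:
        # Explicit None: the op is excluded, so it vetoes earlier requests.
        return None
    if op_act_qtype is _UNSET:
        # No preference: honor whatever a previous op requested.
        return requested_qtype
    return op_act_qtype


def fake_quant_int8(x):
    """Symmetric per-tensor int8 fake quantization (quantize, then dequantize)."""
    scale = jnp.maximum(jnp.max(jnp.abs(x)), 1e-8) / 127.0
    return jnp.clip(jnp.round(x / scale), -127.0, 127.0) * scale


def fake_quant_output(output, requested_qtype, op_act_qtype=_UNSET):
    """Fake-quantize every jax.Array leaf of a pytree output."""
    qtype = resolve_act_qtype(requested_qtype, op_act_qtype)
    if qtype is None:
        return output  # Vetoed or never requested: leave the output untouched.
    if qtype != "int8":
        raise ValueError(f"unsupported qtype in this sketch: {qtype!r}")

    def per_leaf(leaf):
        # Non-array leaves (Python scalars, for example) pass through unchanged.
        return fake_quant_int8(leaf) if isinstance(leaf, jax.Array) else leaf

    return jax.tree_util.tree_map(per_leaf, output)


# A tuple output with a nested list, similar in shape to a multi-array block output.
output = (jnp.array([0.1, 1.0]), [jnp.array([0.2, -2.0]), 3])
quantized = fake_quant_output(output, requested_qtype="int8")
vetoed = fake_quant_output(output, requested_qtype="int8", op_act_qtype=None)
```

In this sketch `quantized` keeps the tuple-and-list structure of `output` with both arrays fake-quantized, while `vetoed` is returned unchanged because the op's `act_qtype` was explicitly `None`.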

copybara-service[bot] changed the title from "[QwixOdmlOp] Tag all leaf arrays in output activations for fake quantization." to "[QwixOdmlOp] Update odml_op to 1) handle every leaf of pytree output activations for fake quantization and 2) allow ops to veto quantization requests from previous ops if the op is excluded from quantization." on Mar 5, 2026

PiperOrigin-RevId: 879782171
copybara-service[bot] merged commit da19ac0 into main on Mar 6, 2026
copybara-service[bot] deleted the test_877766241 branch on March 6, 2026 at 21:07