[Relax][Frontend][TFLite] Add NON_MAX_SUPPRESSION_V4 converter #19464
tlopex merged 1 commit into apache:main
Adds the missing TFLite NonMaxSuppressionV4 frontend handler. The underlying relax.op.vision.non_max_suppression already covers V4's behavior with soft_nms_sigma at the default 0.0 (hard-NMS path). The handler bridges TFLite's tensor format to the Relax op, following the same pattern as convert_nms_v5 (apache#19426) but without its soft-NMS branching.

Tests cover conversion and IR structural assertions; run them with `pytest tests/python/relax/test_frontend_tflite.py -k nms_v4`. E2E correctness runs on the nightly gate (CI_ENV_NIGHTLY).

Relates to apache#19412.
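For readers unfamiliar with the hard-NMS path mentioned above, here is a minimal pure-Python sketch of greedy hard NMS. It is an illustration only, not the code of relax.op.vision.non_max_suppression or of the TFLite kernel. The `[y1, x1, y2, x2]` box layout, the strict `score > score_threshold` test, and the helper names `iou` and `hard_nms` are assumptions made for this example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as [y1, x1, y2, x2] (assumed layout)."""
    inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def hard_nms(boxes, scores, max_output_size, iou_threshold, score_threshold):
    """Greedy hard NMS: keep the best-scoring box, drop boxes that overlap it too much."""
    # Visit candidates from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if len(keep) >= max_output_size:
            break
        # Assumption: boxes at or below the score threshold are never selected.
        if scores[i] <= score_threshold:
            break
        # Hard NMS: a candidate survives only if it overlaps no kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [[0.0, 0.0, 1.0, 1.0], [0.0, 0.1, 1.0, 1.1], [0.0, 2.0, 1.0, 3.0]]
scores = [0.9, 0.8, 0.7]
selected = hard_nms(boxes, scores, max_output_size=3, iou_threshold=0.5, score_threshold=0.0)
```

In this toy input the second box overlaps the first heavily and is suppressed, while the third box does not overlap either and is kept. Soft-NMS would instead decay the second box's score rather than discard it, which is the branching that V4 does not need.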
Code Review
This pull request introduces support for the NON_MAX_SUPPRESSION_V4 operator in the TFLite frontend for TVM Relax. The changes include the implementation of the convert_nms_v4 method and the addition of comprehensive unit tests covering IR verification and end-to-end correctness. Feedback was provided to address a potential shape mismatch issue that occurs when the requested max_output_size exceeds the number of input boxes, suggesting that the output indices should be padded to ensure a consistent tensor shape.
    selected_indices = relax.op.squeeze(nms_ret[0], axis=[0])
    selected_indices = relax.op.strided_slice(
        selected_indices, axes=[0], begin=[0], end=[max_output_size]
    )
The current implementation of strided_slice may result in an incorrect output shape when max_output_size is greater than the number of input boxes (num_boxes). In TFLite, NON_MAX_SUPPRESSION_V4 (and V5) typically pads the output to exactly max_output_size if pad_to_max_output_size is enabled (which is common).
Since relax.op.vision.non_max_suppression returns a tensor of shape [batch_size, num_anchors], the squeezed selected_indices has length num_boxes. If max_output_size > num_boxes, strided_slice with end=[max_output_size] will clip to num_boxes, resulting in a shape mismatch if the rest of the graph expects max_output_size.
Consider padding the indices to ensure the output shape is consistently max_output_size regardless of the input size.
    - selected_indices = relax.op.squeeze(nms_ret[0], axis=[0])
    - selected_indices = relax.op.strided_slice(
    -     selected_indices, axes=[0], begin=[0], end=[max_output_size]
    - )
    + num_boxes = int(self.get_tensor_shape(input_tensors[0])[0])
    + if max_output_size > num_boxes:
    +     selected_indices = relax.op.nn.pad(
    +         selected_indices, pad_width=[0, max_output_size - num_boxes], pad_value=-1
    +     )
    + selected_indices = relax.op.strided_slice(
    +     selected_indices, axes=[0], begin=[0], end=[max_output_size]
    + )
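The shape concern raised in this review can be reproduced without TVM. The sketch below uses plain Python lists, whose slicing clips at the end of the sequence much as the review describes for strided_slice. The helper name `slice_then_pad` and the sample indices are hypothetical, and the pad value of -1 simply mirrors the suggestion; whether the TFLite kernel uses that fill value is not confirmed here.

```python
def slice_then_pad(indices, max_output_size, pad_value=-1):
    """Return (sliced, padded) to contrast clipping with fixed-length output."""
    # Slicing past the end clips silently: the result has at most len(indices) items.
    sliced = indices[:max_output_size]
    # Padding restores a length of exactly max_output_size.
    padded = sliced + [pad_value] * (max_output_size - len(sliced))
    return sliced, padded


# Three boxes, but the model asks for up to five selected indices.
sliced, padded = slice_then_pad([2, 0, 1], max_output_size=5)
```

With three input boxes and `max_output_size=5`, `sliced` has length 3 while `padded` has length 5, which is the mismatch a downstream consumer expecting a fixed shape would hit. When `max_output_size` is at most the number of boxes, the two results are identical and no padding is added.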