
Add NonMaxSuppression operator #735

Merged: kali merged 4 commits into sonos:main from cchudant:nonmaxsuppression on Jun 27, 2022

Conversation

@cchudant (Contributor) commented Jun 8, 2022

Hello!

This PR implements the ONNX NonMaxSuppression operator. (https://github.com/onnx/onnx/blob/main/docs/Operators.md#NonMaxSuppression)

I am not sure whether my TypedOp implementation is correct; I had to use a Symbol to make it work. The thing is, NonMaxSuppression outputs a tensor of shape [num_selected_indices, 3], where num_selected_indices can only be determined by actually evaluating the operator.
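For reference, the greedy selection loop that makes num_selected_indices data-dependent can be sketched standalone (this is an illustrative single-class sketch, independent of tract's types; Box2 and the field names are made up for the example, and the ONNX operator additionally handles batches, classes, score thresholds, and the center_point_box modes):

```rust
// A box in corner format (y1, x1, y2, x2), as in ONNX's default
// center_point_box = 0 convention.
#[derive(Clone, Copy)]
struct Box2 {
    y1: f32,
    x1: f32,
    y2: f32,
    x2: f32,
}

// Intersection-over-union of two corner-format boxes.
fn iou(a: Box2, b: Box2) -> f32 {
    let inter_w = (a.x2.min(b.x2) - a.x1.max(b.x1)).max(0.0);
    let inter_h = (a.y2.min(b.y2) - a.y1.max(b.y1)).max(0.0);
    let inter = inter_w * inter_h;
    let area_a = (a.x2 - a.x1) * (a.y2 - a.y1);
    let area_b = (b.x2 - b.x1) * (b.y2 - b.y1);
    if inter == 0.0 { 0.0 } else { inter / (area_a + area_b - inter) }
}

// Greedy NMS: visit boxes in decreasing score order, keep a box only if
// it does not overlap an already-kept box beyond iou_threshold. How many
// boxes survive depends on the *values* in the tensors, which is why the
// output row count is only known after evaluation.
fn non_max_suppression(boxes: &[Box2], scores: &[f32], iou_threshold: f32) -> Vec<usize> {
    let mut order: Vec<usize> = (0..boxes.len()).collect();
    order.sort_by(|&a, &b| scores[b].partial_cmp(&scores[a]).unwrap());
    let mut keep: Vec<usize> = vec![];
    for &i in &order {
        if keep.iter().all(|&k| iou(boxes[i], boxes[k]) <= iou_threshold) {
            keep.push(i);
        }
    }
    keep
}

fn main() {
    let boxes = [
        Box2 { y1: 0.0, x1: 0.0, y2: 1.0, x2: 1.0 },
        Box2 { y1: 0.0, x1: 0.1, y2: 1.0, x2: 1.1 }, // overlaps box 0 heavily
        Box2 { y1: 0.0, x1: 10.0, y2: 1.0, x2: 11.0 }, // disjoint from both
    ];
    let scores = [0.9, 0.8, 0.7];
    let kept = non_max_suppression(&boxes, &scores, 0.5);
    println!("{:?}", kept); // prints [0, 2]: box 1 is suppressed by box 0
}
```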

It currently does not support NNEF, should I add support for it?
To do that, if I understand correctly, I need to move my operator into tract_core, convert the ONNX operator into an expansion operator, and implement the wire function. Is that correct?

Also, I added node-1.10.1 to the harness reset-test-list script, and found out that the node-1.10.1.txt file was not up to date. This PR also updates it; let me know whether it was stale by mistake or deliberately.

Let me know if there is any code style / documentation issue :)

@kali (Collaborator) commented Jun 8, 2022

Looks pretty good! You were right about adding the symbol in output_facts, that's the way to do this.

I would prefer to have this operator also bound to NNEF, yeah; I try to do it systematically for ONNX ops. But not necessarily in core, maybe rather in onnx-opl. The *-opl crates are meant to contain operators that are a bit too specific to go into tract-core but that we still need at runtime. There are a few examples already, so it should be relatively straightforward (TreeEnsembleClassifier comes to mind).

@cchudant (Contributor, Author) commented Jun 8, 2022

Thanks! I will look into that then!

Is this something you would prefer I do before it gets merged, or can I come back to this in a bit?

@kali (Collaborator) commented Jun 8, 2022

I would prefer to have it right away, and it should not be too hard tbh: typically it takes minutes, you already did the hard work, and there are plenty of examples. If you hit a snag, I will help.

@cchudant (Contributor, Author) commented

Okay, so I've worked a bit on that.

@kali I have hit a little snag: on optim, the expansion returns this error:

thread 'node_1_10_1::optim::test_nonmaxsuppression_two_classes' panicked at 'called `Result::unwrap()` on an `Err` value: Translating node #5 "selected_indices" NonMaxSuppression ToTypedTranslator

Caused by:
    0: translating op NonMaxSuppression { optional_max_output_boxes_per_class_input: Some(2), optional_iou_threshold_input: Some(3), optional_score_threshold_input: Some(4), center_point_box: TwoPoints }
    1: Output mismatch after rewiring expansion for output #0: expected 4,3,I64 got n,3,I64', harness/onnx-test-suite/tests/onnx/mod.rs:146:52

This is because of this condition https://github.com/sonos/tract/blob/main/hir/src/ops/expandable.rs#L112 which, it seems, does not like unifying my TDim::Symbol n with 4.

Am I doing something wrong?

@kali (Collaborator) commented Jun 15, 2022

Mmmm... where do the N and the 4 respectively come from? We may just have an overzealous check somewhere.

@cchudant (Contributor, Author) commented

4 comes from the model, N comes from output_facts here https://github.com/sonos/tract/pull/735/files#diff-c042ebaaf9b9b7146570b8b638a6c53eebfe2bbec2d16f7b0d11d733c4134f66R223

From what I understand: replacing symbols with values does not happen in the solver. Where does it happen?

I also noticed I did not add any inference rule like s.equals(&outputs[0].shape[0], self.num_selected_indices_symbol.to_dim())?;, and when I add it, the error happens earlier, during solving.

[2022-06-16T08:21:41Z TRACE tract_hir::infer::rules::solver]   Applying rule outputs[0].shape[0] == Sym(Symbol('n', 1))

thread 'node_1_10_1::plain::test_nonmaxsuppression_identical_boxes' panicked at 'called `Result::unwrap()` on an `Err` value: Failed analyse for node #5 "selected_indices" NonMaxSuppression

Caused by:
    1: Applying rule outputs[0].shape[0] == Sym(Symbol('n', 1))
    2: Impossible to unify Val(1) with Sym(Symbol('n', 1)).', harness/onnx-test-suite/tests/onnx/mod.rs:135:34

@kali I am guessing the solver does not support unifying variables with constants yet. Is that correct?

@kali (Collaborator) commented Jun 16, 2022

OK, I'll reach out to you, we need to discuss this more interactively :)

@cchudant (Contributor, Author) commented

I think everything is fixed!

@kali (Collaborator) left a comment

So I think we have a problem with optional inputs handling. The rest is good.

Comment thread: onnx-opl/src/non_max_suppression.rs (Outdated)
}

fn eval(&self, mut inputs: TVec<Arc<Tensor>>) -> TractResult<TVec<Arc<Tensor>>> {
let mut inputs = inputs.drain(..);
@kali (Collaborator):

you can use let (boxes, scores) = args_2!(inputs)?;

Comment thread: onnx-opl/src/non_max_suppression.rs (Outdated)
inputs.next().map_or(Ok(0i64), |val| val.to_scalar::<i64>().copied())?;
let iou_threshold =
inputs.next().map_or(Ok(0f32), |val| val.to_scalar::<f32>().copied())?;
let score_threshold = inputs
@kali (Collaborator):

Ach. ok. maybe not.

@kali (Collaborator):

Mmm... Is this working when some inputs are skipped? Like when max_output_boxes_per_class is not there but iou_threshold is? Because in ONNX you can skip an input by using the empty string (ewww). tract does not allow such a thing :)

I usually proceed like here: have the optional input id as a field...
https://github.com/sonos/tract/blob/main/hir/src/ops/array/strided_slice.rs#L6

And then plug them in like this:
https://github.com/sonos/tract/blob/main/onnx/src/ops/array/slice.rs#L129
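The pattern linked above can be illustrated with a standalone sketch (not tract's actual code; the field names mirror the Debug output quoted earlier in this thread, and the resolution logic here is simplified to plain f32 slots for the example): the op records which input slot, if any, carries each optional input, so skipped ONNX inputs never shift the meaning of the remaining ones.

```rust
// Illustrative sketch: optional input slot ids stored as fields.
struct NonMaxSuppression {
    optional_max_output_boxes_per_class_input: Option<usize>,
    optional_iou_threshold_input: Option<usize>,
    optional_score_threshold_input: Option<usize>,
}

impl NonMaxSuppression {
    // Resolve each optional scalar from its recorded slot, falling back
    // to the ONNX-specified default (0) when the input was skipped.
    fn thresholds(&self, inputs: &[f32]) -> (i64, f32, f32) {
        let fetch =
            |slot: Option<usize>, default: f32| slot.map(|ix| inputs[ix]).unwrap_or(default);
        let max_boxes = fetch(self.optional_max_output_boxes_per_class_input, 0.0) as i64;
        let iou = fetch(self.optional_iou_threshold_input, 0.0);
        let score = fetch(self.optional_score_threshold_input, 0.0);
        (max_boxes, iou, score)
    }
}

fn main() {
    // iou_threshold is wired at slot 2; max_output_boxes_per_class and
    // score_threshold were skipped in the ONNX graph, so they take defaults.
    let op = NonMaxSuppression {
        optional_max_output_boxes_per_class_input: None,
        optional_iou_threshold_input: Some(2),
        optional_score_threshold_input: None,
    };
    let (max_boxes, iou, score) = op.thresholds(&[0.0, 0.0, 0.5]);
    println!("{} {} {}", max_boxes, iou, score); // prints "0 0.5 0"
}
```

The key point is that the slot indices travel with the op, so positional inputs stay unambiguous even when some optional inputs are absent.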

@kali (Collaborator):

All right :) I just saw the Expansion and... you actually found out about that, well done... But how does the -opl eval() know which optional input is which? Unless I'm mistaken, we're losing this information at Expansion::wire() time.

Leaving you with a couple options, I think:

  • either propagate the optional_*_input stuff in the -opl op
  • decide that these optional inputs have to be constant, and turn them into value fields in the -opl op
  • decide that they will be always here: we can wire a const (target.add_const in Expansion::wire) with the default value when they're missing

Please consider option #2 seriously: it may be less satisfactory as it is less generic, but one of my strategies to avoid tract bloating with useless code is to wait for evidence of usage of stuff that looks a bit exotic... (arguably this led to half-baked ops like Resize...).

@cchudant (Contributor, Author) commented Jun 17, 2022

I prefer the third approach; it's probably the cleanest, and changing the current code to use it shouldn't add more than one or two lines. If I understand correctly, it's also how the nnef code does optional arguments, right?

I can do the second approach if you want though.

@kali (Collaborator):

Go for the third approach, no problem.

Optional args in NNEF? They are handled by the syntax (as it has named parameters). Then it's a mix of accepting tensors as inputs or coercing to single integer values, depending on the operator, I would say.

@cchudant (Contributor, Author):

@kali I did the changes, let me know if they are ok or if you want anything else changed :)

Comment thread: onnx-opl/src/non_max_suppression.rs (Outdated)
@kali (Collaborator) commented Jun 27, 2022

LGTM, merging when CI is happy.

@kali kali merged commit 9dc07e4 into sonos:main Jun 27, 2022
@kali (Collaborator) commented Jun 27, 2022

🎉
