-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] NMS API Change #2535
Comments
I asked @kevinthesun to bring this up mainly to make sure we have put enough thoughts into the API naming. as per https://docs.tvm.ai/contribute/code_review.html#deliberate-on-api-and-data-structures Can we also do a brief survey of existing frameworks and their API on nms? Would also be helpful to get feedback from developers who worked on object detection before. cc @winstywang @liangfu @hlu1 @antinucleon |
From functionality-wise I feel good about the new proposal, but @tqchen is correct, we probably need to put some effort survey for example, tf(https://www.tensorflow.org/api_docs/python/tf/image/non_max_suppression) |
The TF's API is definitely something we should look into, and possibly adopt |
It looks like tf nms can be covered by current implementation. I'll take a look at tf nms implementation details to see whether there is any difference. |
Yes, I think we have a superset of APIs, which looks good to me |
It is good that we have a superset if API, however. We might want to make sure that the function and parameter names are consistent |
TF non_max_suppression_v3 op has the same functionality with current tam implementation. There is one different parameter naming. TVM uses topk which is from mxnet while tf uses max_output_size. |
Is it possible to also look at other libraries(keras, pytorch)? |
Here’s one implementation from keras: https://github.com/pierluigiferrari/ssd_keras/blob/master/models/keras_ssd512.py |
One implementation from pytorch: https://github.com/kuangliu/torchcv/blob/6291f3e1e4bbf6467fd6b1e79001d34a59481bb6/torchcv/utils/box.py#L88 It is similar to tf nms. tf and pytorch implementation returns a variable lengthindices of selected boxes. We can add an argument to choose return types. Keras nms has a different format of inputs. We need to do some preprocess while converting. The output format is the same as tam implementation. |
Can we summarize all the argument names(keras, tf, proposed) and types at the RFC post? |
API summary updated. |
@zhreshold @vinx13 @Laurawly Please share your thoughts on the API names. My feeling is that perhaps we should make the API as consistent as possible as Pytorch/TF (use name non_max_suppression). But use a different name or an additional argument that indicates -1 padding. |
@kevinthesun Is the shape of |
@kevinthesun where shall I find the API summary? |
@Laurawly I think it's updated on the top of this page. @kevinthesun have you checked the layout of the |
@vinx13 We have another op get_valid_counts to generate this tensor for tf/pt nms and mxnet box_nms operator. I separate it out from nms to make this API support different ways of generating this tensor. @liangdzou I think you are talking about score. This API supports [class_id, score, l, t, r, b], and should cover your case. However, different data layout, such as different order of axes, haven't been supported. |
@tqchen We can change the name to align with tf/pt. For type of return, I feel it's easier to support both mxnet/keras and tf/pt by adding another argument to indicate the return type. |
That sounds good, @kevinthesun can you update the proposed API? Perhaps we want to make the additional return type option mandatory to avoid confusion from tf/pt ops |
@tqchen API updated. |
If everyone is fine with proposed API, I'll go ahead updating current implementation. |
@kevinthesun What's the difference between |
@Laurawly Actually TF always return |
@zhreshold Yes |
@vinx13 @Laurawly @zhreshold @kevinthesun please comment if we have reach consensus on the API |
The new API LGTM. |
LGTM |
@kevinthesun Last comment on the API design, can you create a mapping between the TF/PT arguments and our current ones? Why are there still different ones? Is it possible to force most of the convention to be consistent with TF/PT? It is a bit hard for me to draw connections. I would also recommend putting non-overlapping arguments at the end of the argument list. |
'data' in tvm can be composed by the 'boxes' and 'score' in tf/pt.
|
I see, one thing that worries me is that the user might be confused on the divergence. Perhaps we should do |
@kevinthesun the argument |
@tqchen Are you talking about renaming the api itself to non_maximum_supression_return_indices? I think it's better to keep current name since we support returning either the box indices or boxes. @liangfu Currently we only support [class_id, score, bl, bt, br, bb] data layout. I'll mention this in docstring. For example, when you converting a gluoncv ssd model, if your original model set id_axis to be 1, it will return an error. However, we still need this argument to indicting when we want to ignore this axis. tf/pt nms inputs doesn't have this axis. In this case, we set id_axis=-1 to ignore it. |
My main concern is that |
We can make it optional and by default returning indices? |
As long as the default behavior is the most common one(as in Tf pt) it is fine |
@kevinthesun can you conclude the RFC by summarizing the discussed API and tag everyone for a quick consensus check |
API summary is updated. @tqchen @zhreshold @Laurawly @vinx13 @liangfu |
LGTM |
thanks for the update, it lgtm |
Thanks, @kevinthesun I think we can conclude this PFC and move on to implementation. Thanks for everyone's helpful discussion |
Thanks, everyone for a great discussion |
Hi, @kevinthesun , I saw you mentioned that when users use tf/pt models with tvm's non_max_suppression op, they can compose 'boxes' and 'score' to generate the 'data' parameter. |
To support gluoncv object detection model, nms operator api needs to be changed.
While the old api is nms(data, valid_count, overlap_threshold, force_suppress, topk), new api is non_max_suppression(data, valid_count, return_indices, iou_threshold, force_suppress, topk, id_axis, invalid_to_bottom).
This new api can support both mxnet legacy ssd model and gluoncv box_nms op.
Some investigation for nms implementation in other frameworks:
One key difference between tvm implementation and tf/pt implementation is tvm always returns a fixed shape output and pad invalid boxed with -1, while tf/pt returns a variable shape tensor denpending on input data values.
@zhreshold @tqchen @Laurawly @vinx13 Do you have concerns about naming or other aspects?
The text was updated successfully, but these errors were encountered: