Add bbox V2 as discussed in #1142 #1177
kornia/geometry/bbox_v2.py
Outdated
[0., 0., 0., 0., 0.]]])
"""
_check_bbox_dimensions(boxes)
mask = torch.zeros((*boxes.shape[:-2], height, width), dtype=boxes.dtype, device=boxes.device)
I don't know a clean way to support scripting for cases like *boxes.shape[:-2] without adding a lot of boilerplate code. There are several similar lines. How could I refactor them to be TorchScript friendly?
Just boxes.shape[0], boxes.shape[1] will do. It is 2D only, right?
It could be (B, N, 4, 2) or (N, 4, 2). A workaround would be to convert (N, 4, 2) to (1, N, 4, 2), make all calculations and then move back to (N, 4, 2). If there isn't another option, I would do it.
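The workaround described above (promote (N, 4, 2) to (1, N, 4, 2), compute, then drop the extra dimension) could look roughly like the sketch below. `zeros_mask_for_boxes` is a hypothetical helper written for illustration, not a function from this PR; it only shows the shape handling and indexes the shape explicitly instead of unpacking `*boxes.shape[:-2]`, which is the construct the comment above reports as hard to script.

```python
import torch


def zeros_mask_for_boxes(boxes: torch.Tensor, height: int, width: int) -> torch.Tensor:
    """Allocate a zero mask for boxes shaped (B, N, 4, 2) or (N, 4, 2).

    Hypothetical helper for illustration only; not part of kornia.
    """
    batched = boxes.dim() == 4
    # Promote (N, 4, 2) to (1, N, 4, 2) so the rest of the code sees one layout.
    if not batched:
        boxes = boxes.unsqueeze(0)

    # Explicit shape indices instead of star-unpacking `boxes.shape[:-2]`.
    mask = torch.zeros(
        [boxes.shape[0], boxes.shape[1], height, width],
        dtype=boxes.dtype,
        device=boxes.device,
    )

    # ... the actual mask filling would go here ...

    # Drop the temporary batch dimension again for unbatched input.
    return mask if batched else mask.squeeze(0)


print(zeros_mask_for_boxes(torch.zeros(3, 4, 2), 5, 6).shape)
print(zeros_mask_for_boxes(torch.zeros(2, 3, 4, 2), 5, 6).shape)
```

The cost of this approach is one `unsqueeze`/`squeeze` pair per call, which returns views and does not copy data.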
@shijianjian I have one doubt. After transforming a box, the order of the box coordinates (top-left, top-right, bottom-left, bottom-right) may no longer be valid, for example after a horizontal flip. The current kornia.geometry.bbox.transform_bbox returns them without correcting the coordinates. I think that …
Indeed. I had similar thoughts some time ago. The reason I did not update it is that we also support inverting the bbox transform operation. If we modified the points after an operation, it would no longer be possible to invert it. That might break the consistency of some of our features, like what we did in …
As it's a long response that could start a debate and may result in another PR, I answered it in the original issue. Here is the answer
docs/source/geometry.bbox_v2.rst
Outdated
@@ -0,0 +1,11 @@
kornia.geometry.bbox
What is the command to generate the docs locally? I don't know it, so I couldn't check whether this page appears in the kornia docs.
From the project root you can use make build-docs.
assert kornia_xywh_plus_1.shape == expected_box_xywh_plus_1.shape
assert_allclose(kornia_xywh_plus_1, expected_box_xywh_plus_1)

def test_gradcheck(self, device, dtype):
This test fails and I don't know how to fix it. It also failed for the old kornia.geometry.bbox code.
test/geometry/test_bbox_v2.py:122 (TestBbox2D.test_gradcheck[cpu-float32])
self = <test.geometry.test_bbox_v2.TestBbox2D object at 0x7f128c56ff90>
device = device(type='cpu'), dtype = torch.float32
def test_gradcheck(self, device, dtype):
boxes1 = torch.tensor([[[1.0, 1.0], [3.0, 1.0], [3.0, 2.0], [1.0, 2.0]]], device=device, dtype=dtype)
boxes1 = utils.tensor_to_gradcheck_var(boxes1)
boxes2 = utils.tensor_to_gradcheck_var(boxes1.detach().clone())
boxes_xyxy = torch.tensor([[1.0, 3.0, 5.0, 6.0]])
> assert gradcheck(infer_bbox_shape, (boxes1,), raise_exception=True)
test_bbox_v2.py:130:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../.dev_env/envs/venv/lib/python3.7/site-packages/torch/autograd/gradcheck.py:1245: in gradcheck
return _gradcheck_helper(**args)
../../.dev_env/envs/venv/lib/python3.7/site-packages/torch/autograd/gradcheck.py:1259: in _gradcheck_helper
rtol, atol, check_grad_dtypes, check_forward_ad=check_forward_ad, nondet_tol=nondet_tol)
../../.dev_env/envs/venv/lib/python3.7/site-packages/torch/autograd/gradcheck.py:931: in _gradcheck_real_imag
rtol, atol, check_grad_dtypes, nondet_tol)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
func = <function infer_bbox_shape at 0x7f128cde4f80>
func_out = (tensor([1.], dtype=torch.float64, grad_fn=<SelectBackward>), tensor([2.], dtype=torch.float64, grad_fn=<SelectBackward>))
tupled_inputs = (tensor([[[1., 1.],
[3., 1.],
[3., 2.],
[1., 2.]]], dtype=torch.float64, grad_fn=<CopyBackwards>),)
outputs = (tensor([1.], dtype=torch.float64, grad_fn=<SelectBackward>), tensor([2.], dtype=torch.float64, grad_fn=<SelectBackward>))
eps = 1e-06, rtol = 0.001, atol = 1e-05, check_grad_dtypes = False
nondet_tol = 0.0
def _slow_gradcheck(func, func_out, tupled_inputs, outputs, eps, rtol, atol, check_grad_dtypes,
nondet_tol, *, use_forward_ad=False, complex_indices=None, test_imag=False):
if not outputs:
return _check_no_differentiable_outputs(func, tupled_inputs, _as_tuple(func_out), eps)
numerical = _transpose(_get_numerical_jacobian(func, tupled_inputs, outputs, eps=eps, is_forward_ad=use_forward_ad))
if use_forward_ad:
analytical_forward = _get_analytical_jacobian_forward_ad(func, tupled_inputs, outputs, check_grad_dtypes=check_grad_dtypes)
for i, n_per_out in enumerate(numerical):
for j, n in enumerate(n_per_out):
a = analytical_forward[j][i]
if not _allclose_with_type_promotion(a, n.to(a.device), rtol, atol):
raise GradcheckError(_get_notallclose_msg(a, n, i, j, complex_indices, test_imag,
is_forward_ad=True))
else:
for i, o in enumerate(outputs):
analytical = _check_analytical_jacobian_attributes(tupled_inputs, o, nondet_tol, check_grad_dtypes)
for j, (a, n) in enumerate(zip(analytical, numerical[i])):
if not _allclose_with_type_promotion(a, n.to(a.device), rtol, atol):
> raise GradcheckError(_get_notallclose_msg(a, n, i, j, complex_indices, test_imag))
E torch.autograd.gradcheck.GradcheckError: Jacobian mismatch for output 0 with respect to input 0,
E numerical:tensor([[ 0.0000],
E [-0.5000],
E [ 0.0000],
E [-0.5000],
E [ 0.0000],
E [ 0.5000],
E [ 0.0000],
E [ 0.5000]], dtype=torch.float64)
E analytical:tensor([[ 0.],
E [-1.],
E [ 0.],
E [ 0.],
E [ 0.],
E [ 1.],
E [ 0.],
E [ 0.]], dtype=torch.float64)
../../.dev_env/envs/venv/lib/python3.7/site-packages/torch/autograd/gradcheck.py:978: GradcheckError
@shijianjian @dkoguciuk any idea about this?
I think functions like infer_bbox_shape should be non-differentiable. The gradient of a shape-inference function does not make much sense.
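One possible reading of the Jacobian mismatch in the traceback above (an interpretation, not something confirmed in the thread): the test box has tied coordinates (two corners share the minimum y and two share the maximum y), and a max-minus-min reduction is not differentiable at a tie. Autograd assigns the whole gradient to one of the tied elements, while a central finite difference sees the output change in only one direction and reports half of it for each tied element. The sketch below reproduces that pattern on a single coordinate column; `height` is a stand-in written for this example, not the PR's `infer_bbox_shape`.

```python
import torch

# One coordinate column with ties, like y = [1, 1, 2, 2] in the failing test box.
y = torch.tensor([1.0, 1.0, 2.0, 2.0], dtype=torch.float64, requires_grad=True)


def height(v: torch.Tensor) -> torch.Tensor:
    # Stand-in reduction: max minus min along a dimension.
    return v.max(dim=0).values - v.min(dim=0).values


# Analytical gradient from autograd: the full weight goes to one tied element.
(analytical,) = torch.autograd.grad(height(y), y)

# Numerical gradient by central differences, the scheme gradcheck relies on.
eps = 1e-6
numerical = torch.zeros_like(y)
with torch.no_grad():
    for i in range(y.numel()):
        step = torch.zeros_like(y)
        step[i] = eps
        numerical[i] = (height(y + step) - height(y - step)) / (2 * eps)

print("analytical:", analytical.tolist())
print("numerical: ", numerical.tolist())
```

The numerical values come out as roughly -0.5, -0.5, 0.5, 0.5, while the analytical gradient has a single -1 among the first two entries and a single +1 among the last two, which is the same shape of disagreement gradcheck reports. Using a test box without tied coordinates, or excluding such functions from gradcheck as suggested above, would both avoid it.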
@edgarriba @shijianjian this PR is ready for review. What's left is to come up with a better name for … Also, I couldn't test the latest changes with CUDA as I currently don't have access to a PC with it. I don't expect the tests to fail. If they do, I will get access to one.
LGTM. I thought we were also going for a BoundingBox class.
width: torch.Tensor,
height: torch.Tensor,
depth: torch.Tensor,
):
):
) -> torch.Tensor:
raise ValueError(f"3D bbox shape must be (N, 8, 3) or (B, N, 8, 3). Got {boxes.shape}.")


def _boxes_to_polygons(xmin: torch.Tensor, ymin: torch.Tensor, width: torch.Tensor, height: torch.Tensor):
def _boxes_to_polygons(xmin: torch.Tensor, ymin: torch.Tensor, width: torch.Tensor, height: torch.Tensor):
def _boxes_to_polygons(xmin: torch.Tensor, ymin: torch.Tensor, width: torch.Tensor, height: torch.Tensor) -> torch.Tensor:
This codebase looks good. I agree with @edgarriba that this API suite is still a "v1" version, which is still functional. We may merge this API into the current ones. Regarding "v2", I think we were discussing a class BBox(), class Polygon(), or similar.
return hexahedrons


def kornia_bbox_to_bbox(kornia_boxes: torch.Tensor, mode: str = "xyxy") -> torch.Tensor:
By "kornia_bbox", we are referring to the "polygon", right? kornia_bbox is a bit confusing to me.
Yes, we are referring to the polygon. polygon_to_bbox would be a good name. However, I find it a bit confusing that a bbox module requires applying the inverse operation (bbox_to_polygon) before its functions can be used.
I see your point. That is probably why we will need a "polygon.py" module. The higher-level bbox module should encapsulate such conversions.
I like the idea.
``depth = zmax - zmin + 1``.
* 'xyzwhd': boxes are assumed to be in the format ``xmin, ymin, zmin, width, height, depth`` where
``width = xmax - xmin``, ``height = ymax - ymin`` and ``depth = zmax - zmin``.
* 'xyzwhd_plus_1': like 'xyzwhd' where ``width = xmax - xmin + 1``, ``height = ymax - ymin + 1`` and
This probably needs a better name and explanation.
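A small worked example may help with the explanation part. The sketch below uses the 2D analogue of the docstring above and assumes the 2D mode names are 'xywh' and 'xywh_plus_1' by analogy with 'xyzwhd' and 'xyzwhd_plus_1'; it does not call any kornia function. The "+1" variant corresponds to treating xmax and ymax as inclusive pixel indices, so the size is a pixel count instead of a coordinate difference.

```python
import torch

# One axis-aligned box given as a quadrilateral with corners
# (xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax).
quad = torch.tensor([[1.0, 1.0], [3.0, 1.0], [3.0, 2.0], [1.0, 2.0]])

xmin, ymin = quad.min(dim=0).values
xmax, ymax = quad.max(dim=0).values

# 'xywh' (assumed name): width and height are plain coordinate differences.
xywh = torch.stack([xmin, ymin, xmax - xmin, ymax - ymin])

# 'xywh_plus_1' (assumed name): xmax and ymax are inclusive, so add one.
xywh_plus_1 = torch.stack([xmin, ymin, xmax - xmin + 1, ymax - ymin + 1])

print(xywh.tolist())         # [1.0, 1.0, 2.0, 1.0]
print(xywh_plus_1.tolist())  # [1.0, 1.0, 3.0, 2.0]
```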
kornia_boxes = kornia_boxes if batched else kornia_boxes.unsqueeze(0)

# Create boxes in xyxy format.
boxes = torch.stack([kornia_boxes.min(dim=-2).values, kornia_boxes.max(dim=-2).values], dim=-2).view(
Would ".values" break the gradients?
I don't know. The PyTorch docs have this warning: "This function produces deterministic (sub)gradients unlike min(dim=0)". Could it be the "dim=-2" part?
That warning is probably related to pytorch/pytorch#35699.
I didn't pursue BoundingBox, Boxes, Polygons or similar classes as I thought it would be a big change and they didn't seem clearly defined to me. So, I matched the "v1" code style. I agree that using a class would be ideal. I think it would mainly be a refactor of this code. However, I would need clear guidance on how to name it, where it would live, which methods it should have, and whether the class needs to support TorchScript. As for merging this code with "v1", it would break backward compatibility as they use different conventions. Finally, I see that some CUDA checks are failing. I'll fix them.
@hal-314 why would it not be backward compatible if we have implemented the "plus_1" format?
By backward compatibility I meant that users will need to update their code. I have changed almost every function, mostly to make them consistent with each other. I implemented "plus_1" because it was a common format before, so that users can more easily use kornia with their existing code bases.
@hal-314 If I understand correctly, there is no difference for our users if we set the default mode to "plus_1"? Later, when we have a more concrete idea of which direction to take, we may add a deprecation warning for the "plus_1" notation.
That change would ease the transition for users. Regardless, users will need to update their code. For example, bbox_generator no longer exists, and transform_bbox neither accepts nor returns boxes in the old xyxy format. It now expects and returns boxes as quadrilaterals, and its argument order has been changed to match other parts of the kornia functional API. The old and new APIs aren't compatible, and they cannot be if we want a more coherent API (as discussed in #1142).
We will need a wrapper to help users migrate painlessly, then. A proper deprecation process must include a deprecation warning over several releases. I am not against updating the code, but we should ease the learning curve and let users know how to update. Still, I think this should be merged into the v1 version. The v2 version should be bigger, e.g. with support for BBox classes, etc.
Agreed. We don't want users complaining about code being deprecated without a heads-up.
I agree with raising deprecation warnings, as it seems we will merge this code into "v1". I have a couple of doubts about this approach:
Finally, I don't know if it would be easier to abandon this PR and rewrite it using the Bbox class directly. I'll try to make a proposal for this class so we can see how much work it would be.
Not sure how big the differences are. Assuming they are big, we could do something like:
import warnings

def transform_bbox(*args, mode='old', **kwargs):
    # `mode` has to come before **kwargs to be valid Python.
    if mode == 'old':
        warnings.warn("deprecated", DeprecationWarning)
        return _transform_bbox_v1(*args, **kwargs)
    return _transform_bbox_v2(*args, **kwargs)
I made one for augmentation but removed it after the deprecation. We may create a proper one under …
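For reference, the dispatch pattern from the comment above can be exercised end to end with the standard library alone. This is a minimal sketch with stub implementations: `transform`, `_transform_v1` and `_transform_v2` are hypothetical names standing in for the real functions, and the "new" mode value is an assumption made for this example.

```python
import warnings


def _transform_v1(boxes):
    # Stand-in for the old behaviour.
    return ("v1", boxes)


def _transform_v2(boxes):
    # Stand-in for the new behaviour.
    return ("v2", boxes)


def transform(boxes, *, mode="old"):
    """Dispatch to the old or new implementation, warning on the old path."""
    if mode == "old":
        warnings.warn(
            "mode='old' is deprecated and will be removed in a future release",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not at this shim
        )
        return _transform_v1(boxes)
    return _transform_v2(boxes)


# Capture warnings to show that only the old path emits one.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_result = transform([0, 0, 1, 1])
    new_result = transform([0, 0, 1, 1], mode="new")

print(old_result[0], new_result[0], len(caught))  # v1 v2 1
```

Keeping the old path as the default means existing callers keep working and only see a warning, which matches the request for a deprecation period over several releases.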
@hal-314 Hey, do you expect to get this PR merged soon? We are preparing the next release. Let us know!
@shijianjian I tried to make this new API and the old one compatible. However, I found it difficult to do and hard to document. So, I am trying the V2 approach with objects. Locally, I have a working implementation for 2D boxes. When it is finished, it will supersede this PR. I hope to finish and open a PR within the next week or two.
I see. So, do you mean to close this PR and then open a new one? By any chance, are there any functions in this PR that we can cherry-pick and merge?
Yes. If you prefer, I'll close this PR without waiting for the new one.
I don't think so, as the box format was changed to remove the "+1" convention, function names changed, etc. Batch support involves changing several lines plus new tests. However, it's easy to add to the current codebase using this PR as a reference.
Closing this PR as #1304 supersedes it.
* Add boxes V2 as discussed in #1142
* Implememted some PR suggestions
* Update docs + add missing tests
* Fix docs + add rectangle checks
* Fix deepsource errors
* Update test/geometry/test_bbox_v2.py
* Implements comments + fix tests
* Disable gradients in Boxes3D.to_tensor
* Rename bbox_v2 to boxes
* Fix "plus" convention. Previously, it used an incorrect sign
* Add missing 3D test
* Change to use "+1" convention as discussed in #1398
* Apply suggestions from code review
Co-authored-by: Edgar Riba <edgar.riba@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Fix PR comments
* doctest fixes
Co-authored-by: Jian Shi <sj8716643@126.com>
Co-authored-by: Edgar Riba <edgar.riba@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Description
Add bbox V2 as discussed in #1142. I think the code is practically finished. See the TODO section below for the missing bits. They could be tackled in other PRs. Also, I don't think I will have much free time until the end of the summer, so I am opening this PR so that other contributors can continue this work during the summer if they want.
There are several new functions. I'm not sure whether their names follow kornia conventions. I tried as best I could.
Status
Ready to review
Differences from kornia.geometry.bbox:
General:
Boxes 3D:
index_fill with tensor regular indexing (ex: t[:, 4, 5])
Tests:
TODO (to do in future PRs):
Types of changes
PR Checklist
PR Implementer
This is a small checklist for the implementation details of this PR.
If there are any questions regarding code style or other conventions check out our summary.
make test
make build-docs
make lint
make mypy
KorniaTeam
KorniaTeam workflow
closes #IssueNumber at the bottom if not already in description)
Reviewer
Reviewer workflow
kornia design conventions?