Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add bbox V2 as discussed in #1142 #1177

Closed
wants to merge 4 commits into from
Closed

Conversation

hal-314
Copy link
Contributor

@hal-314 hal-314 commented Jul 26, 2021

Description

Add bbox V2 as discussed in #1142. I think that code is practically finished. See TODO section below for the missing bits. They could be tackle in other PRs. Also, I think I won't have much free time until the end of the summer. So, I open this PR so others contributors can continue this work during the summer if they want.

There are several new functions. I'm not sure if their names follows kornia convention. I tried as best as I could.

Status

Ready to review

Differences from kornia.geometry.bbox:

General:

  • Batch support.
  • Change bbox format to remove +1
  • Docs updated.
  • Fix support for dtype and device. Now, results will always match with dtype and device of the original boxes. The only exception is with bbox_to_kornia_bbox* when original boxes are integer. They will be cast to float.
  • Remove boxes validation inside bbox_to_mask* and infer_bbox_shape* to match with transform_bbox. I think it should be up to the user to do it or use the future Boxes structure. Only a basic shape check is done now.
  • Renamed bbox_generator* to bbox_to_kornia_bbox* and changed its arguments.
  • Created kornia_bbox_to_bbox* that converts boxes in kornia format into a more standard format like xyxy, etc.
    • I'm not sure if I can make it differentiable for cases of xywh and xyzwhd.
  • transform_bbox: align to transform wrap convention: first pass boxes, then the transformation matrix.
  • boxes_to_mask & boxes_to_mask3d:
    • probably, it could be made differentiable with a performance penalty to remove inplace modifications.
    • Simplified.

Boxes 3D:

  • Renamed bbox_to_mask3d to bbox3d_to_mask3d to keep the convention of bbox3d.
  • Replace index_fill with tensor regular indexing (ex: t[:, 4, 5])
  • Add transform_bbox3d for consistency.
  • Code has been simplified.
  • Changed bbox_to_mask3d signature to accept depth, width and height instead of size. Now, it matches 2D convention and bbox_generator3d.

Tests:

  • Lots of new tests. Now, there is 100% coverage.
  • Grad checks. Current kornia.geometry.bbox wouldn't pass them either. So, I don't know if this new code is expected to pass.

TODO to do in future PRs):

  • Port kornia containers to use them. At least, expose bbox_v2. Care should be taken since transforms assumes +1 convention. At least, it happens with random flips. Check this article.

Types of changes

  • [] Bug fix (non-breaking change which fixes an issue)
  • [] Breaking change (fix or new feature that would cause existing functionality to change)
  • New tests added to cover the changes
  • Docstrings/Documentation updated

PR Checklist

PR Implementer

This is a small checklist for the implementation details of this PR.

If there are any questions regarding code style or other conventions check out our
summary.

  • Did you discuss the functionality or any breaking changes before ?
  • Pass all tests: did you test in local ? make test
  • Unittests: did you add tests for your new functionality ?
  • Documentations: did you build documentation ? make build-docs
  • Implementation: is your code well commented and follow conventions ? make lint
  • Docstrings & Typing: has your code documentation and typing ? make mypy
  • Update notebooks & documentation if necessary

KorniaTeam

KorniaTeam workflow
  • Assign correct label
  • Assign PR to a reviewer
  • Does this PR close an Issue? (add closes #IssueNumber at the bottom if
    not already in description)

Reviewer

Reviewer workflow
  • Do all tests pass? (Unittests, Typing, Linting, Documentation, Environment)
  • Does the implementation follow kornia design conventions?
  • Is the documentation complete enough ?
  • Are the tests covering simple and corner cases ?

[0., 0., 0., 0., 0.]]])
"""
_check_bbox_dimensions(boxes)
mask = torch.zeros((*boxes.shape[:-2], height, width), dtype=boxes.dtype, device=boxes.device)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know a clean approach to support scripting and without creating a lot of boilderplate code for this case *boxes.shape[:-2] . There are several lines similar to this. ¿How could I refactor to be torchscript friendly?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just boxes.shape[0], boxes.shape[1] will do. It is 2D only right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be (B, N, 4, 2) or (N, 4, 2). A workaround would be to convert (N, 4, 2) to (1, N, 4, 2), make all calculations and then move back to (N, 4, 2). If there isn't another option, I would do it.

@hal-314
Copy link
Contributor Author

hal-314 commented Jul 30, 2021

@shijianjian I have one doubt. After transforming a box, the box coordinates order (top-left, top-right, bottom-left, bottom-right). could be no longer valid. For example, with an horizontal flip. Current kornia.geometric.bbox.transform_bbox returns them with correcting the coordinates. I think that transform_bbox should correct coordinates order to return a valid kornia bbox. Users could mimic the old behavior by treating bbox coordinates as points. If it's needed, we could provide a function to convert from kornia bbox to points and from points to kornia bbox

@shijianjian
Copy link
Member

@shijianjian I have one doubt. After transforming a box, the box coordinates order (top-left, top-right, bottom-left, bottom-right). could be no longer valid. For example, with an horizontal flip. Current kornia.geometric.bbox.transform_bbox returns them with correcting the coordinates. I think that transform_bbox should correct coordinates order to return a valid kornia bbox. Users could mimic the old behavior by treating bbox coordinates as points. If it's needed, we could provide a function to convert from kornia bbox to points and from points to kornia bbox

Indeed. I had similar thoughts back to some time ago. The reason I did not do anything to update it because we also have the functionality of inversing back the bbox transform operation. If we modified the points after an operation, it will be no longer possible to inverse it back. It might break the consistency of some of our features, like what we did in AugmentationSequential to inverse the masks, bboxes, etc. Let me know if you have some solid solutions for this~

@hal-314
Copy link
Contributor Author

hal-314 commented Aug 1, 2021

@shijianjian I have one doubt. After transforming a box, the box coordinates order (top-left, top-right, bottom-left, bottom-right). could be no longer valid. For example, with an horizontal flip. Current kornia.geometric.bbox.transform_bbox returns them with correcting the coordinates. I think that transform_bbox should correct coordinates order to return a valid kornia bbox. Users could mimic the old behavior by treating bbox coordinates as points. If it's needed, we could provide a function to convert from kornia bbox to points and from points to kornia bbox

Indeed. I had similar thoughts back to some time ago. The reason I did not do anything to update it because we also have the functionality of inversing back the bbox transform operation. If we modified the points after an operation, it will be no longer possible to inverse it back. It might break the consistency of some of our features, like what we did in AugmentationSequential to inverse the masks, bboxes, etc. Let me know if you have some solid solutions for this~

As it's a long response, it could create a debate and it may result in another PR, I answer it in the original issue. Here is the answer

@@ -0,0 +1,11 @@
kornia.geometry.bbox
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which is the command to generate docs locally? I don't know it and I couldn't check if it appear in kornia docs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from project root you can use make build-docs

assert kornia_xywh_plus_1.shape == expected_box_xywh_plus_1.shape
assert_allclose(kornia_xywh_plus_1, expected_box_xywh_plus_1)

def test_gradcheck(self, device, dtype):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fails. I don't know how to solve it. It failed for the old kornia.geometry.bbox code.

test/geometry/test_bbox_v2.py:122 (TestBbox2D.test_gradcheck[cpu-float32])
self = <test.geometry.test_bbox_v2.TestBbox2D object at 0x7f128c56ff90>
device = device(type='cpu'), dtype = torch.float32

    def test_gradcheck(self, device, dtype):
        boxes1 = torch.tensor([[[1.0, 1.0], [3.0, 1.0], [3.0, 2.0], [1.0, 2.0]]], device=device, dtype=dtype)
        boxes1 = utils.tensor_to_gradcheck_var(boxes1)
        boxes2 = utils.tensor_to_gradcheck_var(boxes1.detach().clone())
    
        boxes_xyxy = torch.tensor([[1.0, 3.0, 5.0, 6.0]])
    
>       assert gradcheck(infer_bbox_shape, (boxes1,), raise_exception=True)

test_bbox_v2.py:130: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../.dev_env/envs/venv/lib/python3.7/site-packages/torch/autograd/gradcheck.py:1245: in gradcheck
    return _gradcheck_helper(**args)
../../.dev_env/envs/venv/lib/python3.7/site-packages/torch/autograd/gradcheck.py:1259: in _gradcheck_helper
    rtol, atol, check_grad_dtypes, check_forward_ad=check_forward_ad, nondet_tol=nondet_tol)
../../.dev_env/envs/venv/lib/python3.7/site-packages/torch/autograd/gradcheck.py:931: in _gradcheck_real_imag
    rtol, atol, check_grad_dtypes, nondet_tol)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

func = <function infer_bbox_shape at 0x7f128cde4f80>
func_out = (tensor([1.], dtype=torch.float64, grad_fn=<SelectBackward>), tensor([2.], dtype=torch.float64, grad_fn=<SelectBackward>))
tupled_inputs = (tensor([[[1., 1.],
         [3., 1.],
         [3., 2.],
         [1., 2.]]], dtype=torch.float64, grad_fn=<CopyBackwards>),)
outputs = (tensor([1.], dtype=torch.float64, grad_fn=<SelectBackward>), tensor([2.], dtype=torch.float64, grad_fn=<SelectBackward>))
eps = 1e-06, rtol = 0.001, atol = 1e-05, check_grad_dtypes = False
nondet_tol = 0.0

    def _slow_gradcheck(func, func_out, tupled_inputs, outputs, eps, rtol, atol, check_grad_dtypes,
                        nondet_tol, *, use_forward_ad=False, complex_indices=None, test_imag=False):
        if not outputs:
            return _check_no_differentiable_outputs(func, tupled_inputs, _as_tuple(func_out), eps)
    
        numerical = _transpose(_get_numerical_jacobian(func, tupled_inputs, outputs, eps=eps, is_forward_ad=use_forward_ad))
    
        if use_forward_ad:
            analytical_forward = _get_analytical_jacobian_forward_ad(func, tupled_inputs, outputs, check_grad_dtypes=check_grad_dtypes)
    
            for i, n_per_out in enumerate(numerical):
                for j, n in enumerate(n_per_out):
                    a = analytical_forward[j][i]
                    if not _allclose_with_type_promotion(a, n.to(a.device), rtol, atol):
                        raise GradcheckError(_get_notallclose_msg(a, n, i, j, complex_indices, test_imag,
                                                                  is_forward_ad=True))
        else:
            for i, o in enumerate(outputs):
                analytical = _check_analytical_jacobian_attributes(tupled_inputs, o, nondet_tol, check_grad_dtypes)
    
                for j, (a, n) in enumerate(zip(analytical, numerical[i])):
                    if not _allclose_with_type_promotion(a, n.to(a.device), rtol, atol):
>                       raise GradcheckError(_get_notallclose_msg(a, n, i, j, complex_indices, test_imag))
E                       torch.autograd.gradcheck.GradcheckError: Jacobian mismatch for output 0 with respect to input 0,
E                       numerical:tensor([[ 0.0000],
E                               [-0.5000],
E                               [ 0.0000],
E                               [-0.5000],
E                               [ 0.0000],
E                               [ 0.5000],
E                               [ 0.0000],
E                               [ 0.5000]], dtype=torch.float64)
E                       analytical:tensor([[ 0.],
E                               [-1.],
E                               [ 0.],
E                               [ 0.],
E                               [ 0.],
E                               [ 1.],
E                               [ 0.],
E                               [ 0.]], dtype=torch.float64)

../../.dev_env/envs/venv/lib/python3.7/site-packages/torch/autograd/gradcheck.py:978: GradcheckError

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shijianjian @dkoguciuk any idea about this ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think functions like infer_bbox_shape should be non-differentiable. The gradients of shape inferring does not make too much sense.

@hal-314
Copy link
Contributor Author

hal-314 commented Aug 29, 2021

@edgarriba @shijianjian this PR is ready to review. What it's left is to come up with a better name for bbox_to_kornia_bbox and kornia_bbox_to_bbox.

Also, I couldn't test the lastest changes with cuda as I currently don't have access to a pc with it. I don't expect test to fail. If so, I would get one.

Copy link
Member

@edgarriba edgarriba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I thought also we were going for a BoundingBox class too.

width: torch.Tensor,
height: torch.Tensor,
depth: torch.Tensor,
):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
):
) -> torch.Tensor:

raise ValueError(f"3D bbox shape must be (N, 8, 3) or (B, N, 8, 3). Got {boxes.shape}.")


def _boxes_to_polygons(xmin: torch.Tensor, ymin: torch.Tensor, width: torch.Tensor, height: torch.Tensor):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def _boxes_to_polygons(xmin: torch.Tensor, ymin: torch.Tensor, width: torch.Tensor, height: torch.Tensor):
def _boxes_to_polygons(xmin: torch.Tensor, ymin: torch.Tensor, width: torch.Tensor, height: torch.Tensor) -> torch.Tensor:

Copy link
Member

@shijianjian shijianjian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This codebase looks good. Agree with @edgarriba, that this suite API is still in "v1" version which is still functional. We may probably merge this API to the current ones. Regarding "v2", I think we were discussing a class BBox(), class Polygon(), or similar.

return hexahedrons


def kornia_bbox_to_bbox(kornia_boxes: torch.Tensor, mode: str = "xyxy") -> torch.Tensor:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By "kornai_bbox", we are inferring the "polygon" , right? kornia_bbox is a bit confusing to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. We are inferring the polygon. polygon_to_bbox would be a good name. However, for me, it's a bit confusing that a bbox module requires to apply the inverse operation (bbox_to_polygon) to work with it's functions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point. That is probably the reason that why will we need a "polygon.py" module. The higher level bbox module shall encapsulate such conversions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like the Idea

``depth = zmax - zmin + 1``.
* 'xyzwhd': boxes are assumed to be in the format ``xmin, ymin, zmin, width, height, depth`` where
``width = xmax - xmin``, ``height = ymax - ymin`` and ``depth = zmax - zmin``.
* 'xyzwhd_plus_1': like 'xyzwhd' where ``width = xmax - xmin + 1``, ``height = ymax - ymin + 1`` and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably need a better naming and explanation.

kornia_boxes = kornia_boxes if batched else kornia_boxes.unsqueeze(0)

# Create boxes in xyxy format.
boxes = torch.stack([kornia_boxes.min(dim=-2).values, kornia_boxes.max(dim=-2).values], dim=-2).view(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would ".values" break the gradients?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know. Pytorch docs has this warning: This function produces deterministic (sub)gradients unlike min(dim=0). Could it be "dim=-2" part?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That warning probably related to pytorch/pytorch#35699.

@hal-314
Copy link
Contributor Author

hal-314 commented Aug 30, 2021

I didn't pursue BoundingBox, Boxes, Polygons or similar as I though it would be a big change and it didn't seem to me that they were clearly defined. So, I matched "v1" code style. I agree that using a class would be ideal. I think that it'll mainly a refactor of this code. However, I would need a clear guidance in how to name it, were it would live, which methods should have and if it need if the class needs to support torchscript.

About merging this code with "v1", it would break backward compatibility as they use different conventions.

Finally, I see that some cuda checks are failing. I'll fix them.

@shijianjian
Copy link
Member

@hal-314 why it does not have backward compatibility if we have had implemented the "plus_1" format?

@hal-314
Copy link
Contributor Author

hal-314 commented Aug 31, 2021

By backward compatibility I meant that users will need to update their code. I have changed almost every function, most of the time, to be coherent between them.

I implemented "plus_1" as this was a common format before. So, to help users to use kornia with their code base, I implemented it.

@shijianjian
Copy link
Member

@hal-314 If I understand it right, there is no differences to our users if we set the mode default to "plus_1"? Later, when we have a more concrete idea of which direction we will go, we may add a deprecation warning to deprecate "plus_1" notation.

@hal-314
Copy link
Contributor Author

hal-314 commented Aug 31, 2021

That change would ease users transition. Regardless, users will need to update their code. For example, there isn't bbox_generator anymore or transform_bbox neither accept or returns boxes in xyxy old format. Now, it expects and returns boxes as quadrilaterals and it's argument order has been changed to match with other parts of kornia functional api.

The old and new api aren't compatible as it's impossible to be if we want a more coherent api (we discussed in #1142).

@shijianjian
Copy link
Member

That change would ease users transition. Regardless, users will need to update their code. For example, there isn't bbox_generator anymore or transform_bbox neither accept or returns boxes in xyxy old format. Now, it expects and returns boxes as quadrilaterals and it's argument order has been changed to match with other parts of kornia functional api.

The old and new api aren't compatible as it's impossible to be if we want a more coherent api (we discussed in #1142).

We will need a wrapper to help users move painlessly then. The proper deprecation step must include a deprecation warning over several releases. I am not against updating code, but we shall ease the learning curve and let users know how to update.

Still, I think this shall be merged into v1 version. The v2 version shall be bigger, like with supports of BBox classes, etc.

@edgarriba
Copy link
Member

That change would ease users transition. Regardless, users will need to update their code. For example, there isn't bbox_generator anymore or transform_bbox neither accept or returns boxes in xyxy old format. Now, it expects and returns boxes as quadrilaterals and it's argument order has been changed to match with other parts of kornia functional api.
The old and new api aren't compatible as it's impossible to be if we want a more coherent api (we discussed in #1142).

We will need a wrapper to help users move painlessly then. The proper deprecation step must include a deprecation warning over several releases. I am not against updating code, but we shall ease the learning curve and let users know how to update.

Still, I think this shall be merged into v1 version. The v2 version shall be bigger, like with supports of BBox classes, etc.

agree here - we don't want users complaining for deprecating code without giving a shout out

@hal-314
Copy link
Contributor Author

hal-314 commented Sep 3, 2021

Agree with throwing depreciation warnings as seems that we will merge this code in "v1". I have a couple of doubts with this approach:

  • what to do with overlapping names that are incompatible? For example, transform_bbox or bbox_to_mask. I'm not see appending "v2" to be the right approach.
  • there is some decorator to raise depreciation warnings?

Finally, I don't know if it's easier to abandon this PR and rewrite it using the Bbox class directly. I'll try to make a proposal for this class so we can know how much work would be.

@shijianjian
Copy link
Member

  • what to do with overlapping names that are incompatible? For example, transform_bbox or bbox_to_mask. I'm not see appending "v2" to be the right approach.

Not sure how much differences there. Assuming its big, then we could do things like:

def transform_bbox(*args, **kwargs, mode='old'):
    if mode == 'old':
        warning.warn("deprecated")
        return _transform_bbox_v1(*args, **kwargs)
   return _transform_bbox_v2(*args, **kwargs)
  • there is some decorator to raise depreciation warnings?

I made one for augmentation but removed it after the deprecation. We may create a proper one under https://github.com/kornia/kornia/tree/master/kornia/utils.

@shijianjian
Copy link
Member

@hal-314 Hey, are you feeling like to get this PR merged soon? We are preparing the next release. Let us know!

@hal-314
Copy link
Contributor Author

hal-314 commented Sep 14, 2021

@shijianjian I tried to make this new api and the old one compatible. However, I found difficult to do. Also, it was hard to document. So, I trying the V2 approach with objects. Locally, I have a working implementation for 2D boxes. When I have it, it would superseed this. I hope to finish and make a PR by the next week or two.

@shijianjian
Copy link
Member

@shijianjian I tried to make this new api and the old one compatible. However, I found difficult to do. Also, it was hard to document. So, I trying the V2 approach with objects. Locally, I have a working implementation for 2D boxes. When I have it, it would superseed this. I hope to finish and make a PR by the next week or two.

I see. So, do you mean to close this PR then open a new one? By any chances, is there any functions in this PR that we can cherry-pick and merge?

@hal-314
Copy link
Contributor Author

hal-314 commented Sep 14, 2021

I see. So, do you mean to close this PR then open a new one?

Yes. If you prefer, I'll closed this PR without waiting to the need one.

By any chances, is there any functions in this PR that we can cherry-pick and merge?

I don't think so as box format was changed to remove "+1" convention, function names, etc. Batch support involves changing several lines + new tests. However, it's easy to add to the current codebase taking this PR as reference.

@hal-314
Copy link
Contributor Author

hal-314 commented Sep 19, 2021

Close this PR as #1304 superseeds it.

@hal-314 hal-314 closed this Sep 19, 2021
shijianjian added a commit that referenced this pull request Nov 17, 2021
* Add boxes V2 as discussed in #1142

* Implememted some PR suggestions

* Update docs + add missing tests

* Fix docs + add rectangle checks

* Fix deepsource errors

* Update test/geometry/test_bbox_v2.py

* Implements comments + fix tests

* Disable gradients in Boxes3D.to_tensor

* Rename bbox_v2 to boxes

* Fix "plus" convention. Previously, it used an incorrect sign

* Add missing 3D test

* Change to use "+1" convention as discussed in #1398

* Apply suggestions from code review

Co-authored-by: Edgar Riba <edgar.riba@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix PR comments

* doctest fixes

Co-authored-by: Jian Shi <sj8716643@126.com>
Co-authored-by: Edgar Riba <edgar.riba@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants