Keypoints support (#395)
* Keypoints support for Crop transform

* Keypoints support for LongestMaxSize

* Keypoints support for SmallestMaxSize and updated test_longest_max_size_keypoints

* Keypoints support for Resize

* Update Readme

* Keypoints support for Transpose

* Bboxes and keypoints support for CropNonEmptyMaskIfExists

* Keypoints support for RandomCropNearBBox

* Change angle when transposing a keypoint

* Update Readme

* Rename angle_to_2pi to angle_to_2pi_range and fix it
Dipet authored and ternaus committed Oct 8, 2019
1 parent 3666b3d commit 30a3f30
Showing 4 changed files with 154 additions and 14 deletions.
14 changes: 7 additions & 7 deletions README.md
@@ -157,8 +157,8 @@ Spatial-level transforms will simultaneously change both an input image as well
| Transform | Image | Masks | BBoxes | Keypoints |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---: | :---: | :----: | :-------: |
| [CenterCrop](https://albumentations.readthedocs.io/en/latest/api/augmentations.html#albumentations.augmentations.transforms.CenterCrop) | ✓ | ✓ | ✓ | ✓ |
| [Crop](https://albumentations.readthedocs.io/en/latest/api/augmentations.html#albumentations.augmentations.transforms.Crop) | ✓ | ✓ | ✓ |   |
| [CropNonEmptyMaskIfExists](https://albumentations.readthedocs.io/en/latest/api/augmentations.html#albumentations.augmentations.transforms.CropNonEmptyMaskIfExists) | ✓ | ✓ |   |   |
| [Crop](https://albumentations.readthedocs.io/en/latest/api/augmentations.html#albumentations.augmentations.transforms.Crop) | ✓ | ✓ | ✓ | ✓ |
| [CropNonEmptyMaskIfExists](https://albumentations.readthedocs.io/en/latest/api/augmentations.html#albumentations.augmentations.transforms.CropNonEmptyMaskIfExists) | ✓ | ✓ | ✓ | ✓ |
| [ElasticTransform](https://albumentations.readthedocs.io/en/latest/api/augmentations.html#albumentations.augmentations.transforms.ElasticTransform) | ✓ | ✓ |   |   |
| [Flip](https://albumentations.readthedocs.io/en/latest/api/augmentations.html#albumentations.augmentations.transforms.Flip) | ✓ | ✓ | ✓ | ✓ |
| [GridDistortion](https://albumentations.readthedocs.io/en/latest/api/augmentations.html#albumentations.augmentations.transforms.GridDistortion) | ✓ | ✓ |   |   |
@@ -170,23 +170,23 @@ Spatial-level transforms will simultaneously change both an input image as well
| [IAAPerspective](https://albumentations.readthedocs.io/en/latest/api/imgaug.html#albumentations.imgaug.transforms.IAAPerspective) | ✓ | ✓ | ✓ | ✓ |
| [IAAPiecewiseAffine](https://albumentations.readthedocs.io/en/latest/api/imgaug.html#albumentations.imgaug.transforms.IAAPiecewiseAffine) | ✓ | ✓ | ✓ | ✓ |
| [Lambda](https://albumentations.readthedocs.io/en/latest/api/augmentations.html#albumentations.augmentations.transforms.Lambda) | ✓ | ✓ | ✓ | ✓ |
| [LongestMaxSize](https://albumentations.readthedocs.io/en/latest/api/augmentations.html#albumentations.augmentations.transforms.LongestMaxSize) | ✓ | ✓ | ✓ |   |
| [LongestMaxSize](https://albumentations.readthedocs.io/en/latest/api/augmentations.html#albumentations.augmentations.transforms.LongestMaxSize) | ✓ | ✓ | ✓ | ✓ |
| NoOp | ✓ | ✓ | ✓ | ✓ |
| [OpticalDistortion](https://albumentations.readthedocs.io/en/latest/api/augmentations.html#albumentations.augmentations.transforms.OpticalDistortion) | ✓ | ✓ |   |   |
| [PadIfNeeded](https://albumentations.readthedocs.io/en/latest/api/augmentations.html#albumentations.augmentations.transforms.PadIfNeeded) | ✓ | ✓ | ✓ | ✓ |
| [RandomCrop](https://albumentations.readthedocs.io/en/latest/api/augmentations.html#albumentations.augmentations.transforms.RandomCrop) | ✓ | ✓ | ✓ | ✓ |
| [RandomCropNearBBox](https://albumentations.readthedocs.io/en/latest/api/augmentations.html#albumentations.augmentations.transforms.RandomCropNearBBox) | ✓ | ✓ | ✓ |   |
| [RandomCropNearBBox](https://albumentations.readthedocs.io/en/latest/api/augmentations.html#albumentations.augmentations.transforms.RandomCropNearBBox) | ✓ | ✓ | ✓ | ✓ |
| [RandomGridShuffle](https://albumentations.readthedocs.io/en/latest/api/augmentations.html#albumentations.augmentations.transforms.RandomGridShuffle) | ✓ | ✓ |   |   |
| [RandomResizedCrop](https://albumentations.readthedocs.io/en/latest/api/augmentations.html#albumentations.augmentations.transforms.RandomResizedCrop) | ✓ | ✓ | ✓ | ✓ |
| [RandomRotate90](https://albumentations.readthedocs.io/en/latest/api/augmentations.html#albumentations.augmentations.transforms.RandomRotate90) | ✓ | ✓ | ✓ | ✓ |
| [RandomScale](https://albumentations.readthedocs.io/en/latest/api/augmentations.html#albumentations.augmentations.transforms.RandomScale) | ✓ | ✓ | ✓ | ✓ |
| [RandomSizedBBoxSafeCrop](https://albumentations.readthedocs.io/en/latest/api/augmentations.html#albumentations.augmentations.transforms.RandomSizedBBoxSafeCrop) | ✓ | ✓ | ✓ |   |
| [RandomSizedCrop](https://albumentations.readthedocs.io/en/latest/api/augmentations.html#albumentations.augmentations.transforms.RandomSizedCrop) | ✓ | ✓ | ✓ | ✓ |
| [Resize](https://albumentations.readthedocs.io/en/latest/api/augmentations.html#albumentations.augmentations.transforms.Resize) | ✓ | ✓ | ✓ |   |
| [Resize](https://albumentations.readthedocs.io/en/latest/api/augmentations.html#albumentations.augmentations.transforms.Resize) | ✓ | ✓ | ✓ | ✓ |
| [Rotate](https://albumentations.readthedocs.io/en/latest/api/augmentations.html#albumentations.augmentations.transforms.Rotate) | ✓ | ✓ | ✓ | ✓ |
| [ShiftScaleRotate](https://albumentations.readthedocs.io/en/latest/api/augmentations.html#albumentations.augmentations.transforms.ShiftScaleRotate) | ✓ | ✓ | ✓ | ✓ |
| [SmallestMaxSize](https://albumentations.readthedocs.io/en/latest/api/augmentations.html#albumentations.augmentations.transforms.SmallestMaxSize) | ✓ | ✓ | ✓ |   |
| [Transpose](https://albumentations.readthedocs.io/en/latest/api/augmentations.html#albumentations.augmentations.transforms.Transpose) | ✓ | ✓ | ✓ |   |
| [SmallestMaxSize](https://albumentations.readthedocs.io/en/latest/api/augmentations.html#albumentations.augmentations.transforms.SmallestMaxSize) | ✓ | ✓ | ✓ | ✓ |
| [Transpose](https://albumentations.readthedocs.io/en/latest/api/augmentations.html#albumentations.augmentations.transforms.Transpose) | ✓ | ✓ | ✓ | ✓ |
| [VerticalFlip](https://albumentations.readthedocs.io/en/latest/api/augmentations.html#albumentations.augmentations.transforms.VerticalFlip) | ✓ | ✓ | ✓ | ✓ |

## Migrating from torchvision to albumentations
22 changes: 22 additions & 0 deletions albumentations/augmentations/functional.py
@@ -32,6 +32,16 @@ def wrapped_function(img, *args, **kwargs):
    return wrapped_function


def angle_to_2pi_range(angle):
    if 0 <= angle <= 2 * np.pi:
        return angle

    if angle < 0:
        angle += (abs(angle) // (2 * np.pi) + 1) * 2 * np.pi

    return angle % (2 * np.pi)
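
As a quick sanity check on the helper above, the following sketch re-implements it with the standard `math` module instead of NumPy (a substitution made only to keep the snippet dependency-free) and compares it with Python's plain modulo. The name `wrap_with_modulo` is hypothetical and not part of the library.

```python
import math

TWO_PI = 2 * math.pi


def angle_to_2pi_range(angle):
    # Same logic as the function in this diff, with math.pi in place of np.pi.
    if 0 <= angle <= TWO_PI:
        return angle

    if angle < 0:
        angle += (abs(angle) // TWO_PI + 1) * TWO_PI

    return angle % TWO_PI


def wrap_with_modulo(angle):
    # Hypothetical one-line alternative: Python's % with a positive modulus
    # never returns a negative result.
    return angle % TWO_PI


samples = [-7.0, -math.pi / 2, 0.0, 1.0, 3 * math.pi, 10.0]
wrapped = [angle_to_2pi_range(a) for a in samples]
```

On these samples the two functions agree. They differ only at exactly 2π, which the helper returns unchanged while the plain modulo maps it to 0.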


def preserve_shape(func):
    """
    Preserve shape of the image
@@ -1598,3 +1608,15 @@ def swap_tiles_on_image(image, tiles):
        ]

    return new_image


def keypoint_transpose(keypoint):
    x, y, angle, scale = keypoint
    angle = angle_to_2pi_range(angle)

    if angle <= np.pi:
        angle = np.pi - angle
    else:
        angle = 3 * np.pi - angle

    return y, x, angle, scale
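
To see what the transpose does to a single keypoint, here is a minimal standalone sketch. It copies the branch logic above but uses `math` and a plain modulo as a stand-in for `angle_to_2pi_range`, so it is an illustration under that assumption rather than the library code itself.

```python
import math


def keypoint_transpose(keypoint):
    # Mirrors the function in this diff; the angle is wrapped with a plain
    # modulo here as a stand-in for angle_to_2pi_range.
    x, y, angle, scale = keypoint
    angle = angle % (2 * math.pi)

    if angle <= math.pi:
        angle = math.pi - angle
    else:
        angle = 3 * math.pi - angle

    return y, x, angle, scale


kp = (10, 20, 0.3, 1.0)
once = keypoint_transpose(kp)     # x and y swap, angle becomes pi - 0.3
twice = keypoint_transpose(once)  # swapping back recovers the original keypoint

kp_large = (10, 20, 4.0, 1.0)     # angle above pi takes the second branch
twice_large = keypoint_transpose(keypoint_transpose(kp_large))
```

Applying the function twice returns the starting keypoint up to floating-point rounding, which is what one would expect from a transpose.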
72 changes: 65 additions & 7 deletions albumentations/augmentations/transforms.py
@@ -171,7 +171,7 @@ class Crop(DualTransform):
        y_max (int): maximum lower right y coordinate.

    Targets:
        image, mask, bboxes
        image, mask, bboxes, keypoints

    Image types:
        uint8, float32
@@ -190,6 +190,16 @@ def apply(self, img, **params):
    def apply_to_bbox(self, bbox, **params):
        return F.bbox_crop(bbox, x_min=self.x_min, y_min=self.y_min, x_max=self.x_max, y_max=self.y_max, **params)

    def apply_to_keypoint(self, keypoint, **params):
        return F.crop_keypoint_by_coords(
            keypoint,
            crop_coords=[self.x_min, self.y_min, self.x_max, self.y_max],
            crop_height=self.y_max - self.y_min,
            crop_width=self.x_max - self.x_min,
            rows=params["rows"],
            cols=params["cols"],
        )
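
`F.crop_keypoint_by_coords` is not shown in this diff. Judging from `test_crop_keypoints` in tests/test_transforms.py, it appears to shift the keypoint by the crop's top-left corner. The sketch below is a hypothetical stand-in (`shift_keypoint_into_crop` is not a library function) that reproduces only that arithmetic and ignores the remaining arguments.

```python
def shift_keypoint_into_crop(keypoint, crop_coords):
    # Hypothetical stand-in: translate (x, y) so that the crop's top-left
    # corner becomes the new origin; angle and scale are left untouched.
    x, y, angle, scale = keypoint
    x_min, y_min, _x_max, _y_max = crop_coords
    return [x - x_min, y - y_min, angle, scale]


keypoint = [50, 50, 0, 0]

# Crop starting at the origin: coordinates stay the same.
unchanged = shift_keypoint_into_crop(keypoint, [0, 0, 80, 80])

# Crop starting at (50, 50): the keypoint moves to the new origin.
shifted = shift_keypoint_into_crop(keypoint, [50, 50, 100, 100])
```

Both results match the values asserted by the test added in this commit.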

    def get_transform_init_args_names(self):
        return ("x_min", "y_min", "x_max", "y_max")

@@ -293,7 +303,7 @@ class Transpose(DualTransform):
        p (float): probability of applying the transform. Default: 0.5.

    Targets:
        image, mask, bboxes
        image, mask, bboxes, keypoints

    Image types:
        uint8, float32
@@ -305,6 +315,9 @@ def apply(self, img, **params):
    def apply_to_bbox(self, bbox, **params):
        return F.bbox_transpose(bbox, 0, **params)

    def apply_to_keypoint(self, keypoint, **params):
        return F.keypoint_transpose(keypoint)

    def get_transform_init_args_names(self):
        return ()

@@ -318,7 +331,7 @@ class LongestMaxSize(DualTransform):
        p (float): probability of applying the transform. Default: 1.

    Targets:
        image, mask, bboxes
        image, mask, bboxes, keypoints

    Image types:
        uint8, float32
@@ -336,6 +349,13 @@ def apply_to_bbox(self, bbox, **params):
        # Bounding box coordinates are scale invariant
        return bbox

    def apply_to_keypoint(self, keypoint, **params):
        height = params["rows"]
        width = params["cols"]

        scale = self.max_size / max([height, width])
        return F.keypoint_scale(keypoint, scale, scale)

    def get_transform_init_args_names(self):
        return ("max_size", "interpolation")

@@ -349,7 +369,7 @@ class SmallestMaxSize(DualTransform):
        p (float): probability of applying the transform. Default: 1.

    Targets:
        image, mask, bboxes
        image, mask, bboxes, keypoints

    Image types:
        uint8, float32
@@ -366,6 +386,13 @@ def apply(self, img, interpolation=cv2.INTER_LINEAR, **params):
    def apply_to_bbox(self, bbox, **params):
        return bbox

    def apply_to_keypoint(self, keypoint, **params):
        height = params["rows"]
        width = params["cols"]

        scale = self.max_size / min([height, width])
        return F.keypoint_scale(keypoint, scale, scale)
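
The scale factors used by `LongestMaxSize` and `SmallestMaxSize` can be worked through by hand for the 50×10 image used in the new tests. The sketch below does that arithmetic with hypothetical helper names; `scale_xy` stands in for the coordinate part of `F.keypoint_scale`, whose body is not part of this diff.

```python
def longest_side_scale(max_size, height, width):
    # Factor used by LongestMaxSize.apply_to_keypoint in this diff.
    return max_size / max([height, width])


def smallest_side_scale(max_size, height, width):
    # Factor used by SmallestMaxSize.apply_to_keypoint in this diff.
    return max_size / min([height, width])


def scale_xy(x, y, scale):
    # Hypothetical stand-in for the coordinate part of F.keypoint_scale.
    return x * scale, y * scale


height, width = 50, 10  # image shape used by the tests in this commit
x, y = 9, 5             # keypoint used by the tests in this commit

# Longest side is 50, so max_size=100 gives a factor of 2.
longest = scale_xy(x, y, longest_side_scale(100, height, width))

# Smallest side is 10, so max_size=100 gives a factor of 10.
smallest = scale_xy(x, y, smallest_side_scale(100, height, width))
```

The results, (18, 10) and (90, 50), line up with the expectations in `test_longest_max_size_keypoints` and `test_smallest_max_size_keypoints`.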

    def get_transform_init_args_names(self):
        return ("max_size", "interpolation")

@@ -382,7 +409,7 @@ class Resize(DualTransform):
        p (float): probability of applying the transform. Default: 1.

    Targets:
        image, mask, bboxes
        image, mask, bboxes, keypoints

    Image types:
        uint8, float32
@@ -401,6 +428,12 @@ def apply_to_bbox(self, bbox, **params):
        # Bounding box coordinates are scale invariant
        return bbox

    def apply_to_keypoint(self, keypoint, **params):
        height = params["rows"]
        width = params["cols"]

        return F.keypoint_scale(keypoint, self.height / height, self.width / width)
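
The argument order in the call above may deserve a second look. If `F.keypoint_scale` takes `(keypoint, scale_x, scale_y)`, which is an assumption because its body is not part of this diff, then `self.height / height` ends up scaling x. The sketch below uses a hypothetical stand-in, `scale_keypoint_xy`, to compare both pairings on the 50×10 image from `test_resize_keypoints`.

```python
def scale_keypoint_xy(keypoint, scale_x, scale_y):
    # Hypothetical stand-in; assumes x is multiplied by scale_x and y by scale_y.
    x, y, angle, scale = keypoint
    return [x * scale_x, y * scale_y, angle, scale]


rows, cols = 50, 10             # input image: height 50, width 10
new_height, new_width = 100, 5  # Resize(height=100, width=5)
keypoint = [9, 5, 0, 0]         # x=9, y=5

# Pairing used in the diff: height ratio first, width ratio second.
as_written = scale_keypoint_xy(keypoint, new_height / rows, new_width / cols)

# Pairing that matches x with width and y with height.
axis_matched = scale_keypoint_xy(keypoint, new_width / cols, new_height / rows)
```

Under that assumption, `as_written` reproduces the `[18, 2.5, 0, 0]` expected by the test in this commit, where x = 18 would lie outside a 5-pixel-wide image, while `axis_matched` gives `[4.5, 10.0, 0, 0]`. When both ratios are equal the two pairings coincide, so the difference only shows up when the aspect ratio changes.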

    def get_transform_init_args_names(self):
        return ("height", "width", "interpolation")

@@ -707,7 +740,7 @@ class RandomCropNearBBox(DualTransform):
        p (float): probability of applying the transform. Default: 1.

    Targets:
        image
        image, mask, bboxes, keypoints

    Image types:
        uint8, float32
@@ -738,6 +771,16 @@ def apply_to_bbox(self, bbox, x_min=0, x_max=0, y_min=0, y_max=0, **params):
        w_start = x_min
        return F.bbox_crop(bbox, y_max - y_min, x_max - x_min, h_start, w_start, **params)

    def apply_to_keypoint(self, keypoint, x_min=0, x_max=0, y_min=0, y_max=0, **params):
        return F.crop_keypoint_by_coords(
            keypoint,
            crop_coords=[x_min, y_min, x_max, y_max],
            crop_height=y_max - y_min,
            crop_width=x_max - x_min,
            rows=params["rows"],
            cols=params["cols"],
        )

    @property
    def targets_as_params(self):
        return ["cropping_bbox"]
@@ -980,7 +1023,7 @@ class CropNonEmptyMaskIfExists(DualTransform):
        p (float): probability of applying the transform. Default: 1.0.

    Targets:
        image, mask
        image, mask, bboxes, keypoints

    Image types:
        uint8, float32
@@ -1002,6 +1045,21 @@ def __init__(self, height, width, ignore_values=None, ignore_channels=None, alwa
    def apply(self, img, x_min=0, x_max=0, y_min=0, y_max=0, **params):
        return F.crop(img, x_min, y_min, x_max, y_max)

    def apply_to_bbox(self, bbox, x_min=0, x_max=0, y_min=0, y_max=0, **params):
        return F.bbox_crop(
            bbox, x_min=x_min, x_max=x_max, y_min=y_min, y_max=y_max, rows=params["rows"], cols=params["cols"]
        )

    def apply_to_keypoint(self, keypoint, x_min=0, x_max=0, y_min=0, y_max=0, **params):
        return F.crop_keypoint_by_coords(
            keypoint,
            crop_coords=[x_min, y_min, x_max, y_max],
            crop_height=y_max - y_min,
            crop_width=x_max - x_min,
            rows=params["rows"],
            cols=params["cols"],
        )

    @property
    def targets_as_params(self):
        return ["mask"]
60 changes: 60 additions & 0 deletions tests/test_transforms.py
@@ -411,3 +411,63 @@ def test_downscale(interpolation):
    transformed = aug(image=img)["image"]
    func_applied = F.downscale(img, scale=0.5, interpolation=interpolation)
    np.testing.assert_almost_equal(transformed, func_applied)


def test_crop_keypoints():
    image = np.random.randint(0, 256, (100, 100), np.uint8)
    keypoints = [[50, 50, 0, 0]]

    aug = A.Crop(0, 0, 80, 80, p=1)
    result = aug(image=image, keypoints=keypoints)
    assert result["keypoints"] == keypoints

    aug = A.Crop(50, 50, 100, 100, p=1)
    result = aug(image=image, keypoints=keypoints)
    assert result["keypoints"] == [[0, 0, 0, 0]]


def test_longest_max_size_keypoints():
    img = np.random.randint(0, 256, [50, 10], np.uint8)
    keypoints = [[9, 5, 0, 0]]

    aug = A.LongestMaxSize(max_size=100, p=1)
    result = aug(image=img, keypoints=keypoints)
    assert result["keypoints"] == [[18, 10, 0, 0]]

    aug = A.LongestMaxSize(max_size=5, p=1)
    result = aug(image=img, keypoints=keypoints)
    assert result["keypoints"] == [[0.9, 0.5, 0, 0]]

    aug = A.LongestMaxSize(max_size=50, p=1)
    result = aug(image=img, keypoints=keypoints)
    assert result["keypoints"] == [[9, 5, 0, 0]]


def test_smallest_max_size_keypoints():
    img = np.random.randint(0, 256, [50, 10], np.uint8)
    keypoints = [[9, 5, 0, 0]]

    aug = A.SmallestMaxSize(max_size=100, p=1)
    result = aug(image=img, keypoints=keypoints)
    assert result["keypoints"] == [[90, 50, 0, 0]]

    aug = A.SmallestMaxSize(max_size=5, p=1)
    result = aug(image=img, keypoints=keypoints)
    assert result["keypoints"] == [[4.5, 2.5, 0, 0]]

    aug = A.SmallestMaxSize(max_size=10, p=1)
    result = aug(image=img, keypoints=keypoints)
    assert result["keypoints"] == [[9, 5, 0, 0]]


def test_resize_keypoints():
    img = np.random.randint(0, 256, [50, 10], np.uint8)
    keypoints = [[9, 5, 0, 0]]

    aug = A.Resize(height=100, width=5, p=1)
    result = aug(image=img, keypoints=keypoints)
    assert result["keypoints"] == [[18, 2.5, 0, 0]]

    aug = A.Resize(height=50, width=10, p=1)
    result = aug(image=img, keypoints=keypoints)
    assert result["keypoints"] == [[9, 5, 0, 0]]
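
The new tests compare float coordinates with `==`. If further scale factors are added later, a tolerance-based comparison is one possible alternative. The helper below is a hypothetical sketch (`keypoints_close` is not part of the test suite or the library) built on the standard `math.isclose`.

```python
import math


def keypoints_close(actual, expected, abs_tol=1e-9):
    # Hypothetical helper: element-wise comparison of two keypoint lists
    # with a tolerance instead of exact equality.
    if len(actual) != len(expected):
        return False
    return all(
        len(a) == len(e)
        and all(math.isclose(av, ev, abs_tol=abs_tol) for av, ev in zip(a, e))
        for a, e in zip(actual, expected)
    )


# 0.1 + 0.2 is not exactly 0.3 in binary floating point,
# so exact equality fails while the tolerant check passes.
computed = [[0.1 + 0.2, 0.5, 0, 0]]
expected = [[0.3, 0.5, 0, 0]]
exact_match = computed == expected
tolerant_match = keypoints_close(computed, expected)
```

In a test this could be used as `assert keypoints_close(result["keypoints"], expected)`.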
