Is it possible to modify the labels according to the random rotation of images? #5215
Hi @WYBupup, thank you for reaching out.
Thanks for your reply. I have almost completed the pipeline following your guidance.
If you're already rotating the images, you can pass the output size explicitly to the rotation operator:

    import nvidia.dali as dali
    import nvidia.dali.fn as fn
    import PIL.Image
    import numpy as np

    @dali.pipeline_def(batch_size=1, num_threads=4, device_id=0)
    def mypipe():
        enc, _ = fn.readers.file(file_root=".", files=["alley.png"])
        img = fn.decoders.image(enc, device="mixed")
        # fit the image inside 256x256 while keeping the aspect ratio
        img = fn.resize(img, mode="not_larger", size=256)
        # without fill_value, the border pixels are repeated
        rep = fn.rotate(img, angle=90, size=(256, 256))
        # with fill_value=0, the uncovered area is filled with zeros
        pad = fn.rotate(img, angle=90, size=(256, 256), fill_value=0)
        return rep, pad

    pipe = mypipe()
    pipe.build()
    rep, pad = pipe.run()

The results:

    PIL.Image.fromarray(np.array(rep.as_cpu()[0]))
    PIL.Image.fromarray(np.array(pad.as_cpu()[0]))

If you need a color padding, you can, somewhat counterintuitively, use a crop with padding:

    rot = fn.rotate(img, angle=90)
    crop = fn.crop(rot, crop_pos_x=0.5, crop_pos_y=0.5, crop=(256, 256),
                   out_of_bounds_policy="pad", fill_values=[0x76, 0xb9, 0x00])
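To make the intended effect of the color padding concrete, here is a minimal NumPy sketch of centering an image on a canvas filled with a solid color. It is only an illustration of the result one would expect from the centered crop with padding above, not DALI's implementation; the helper name `center_pad` and the rounding of odd margins are assumptions of this sketch.

```python
import numpy as np

def center_pad(img, out_h, out_w, color):
    """Place an HWC image in the middle of an out_h x out_w canvas filled with `color`.

    Hypothetical helper: assumes the image already fits inside the canvas,
    as it would after a "not_larger" resize to the canvas size.
    """
    h, w, c = img.shape
    if h > out_h or w > out_w:
        raise ValueError("image must fit inside the canvas")
    # Start from a canvas that holds the padding color everywhere.
    canvas = np.empty((out_h, out_w, c), dtype=img.dtype)
    canvas[...] = np.asarray(color, dtype=img.dtype)
    # Odd margins are split with the smaller half first (a choice of this sketch).
    top = (out_h - h) // 2
    left = (out_w - w) // 2
    canvas[top:top + h, left:left + w] = img
    return canvas

# A 2x4 black image centered on a 4x4 canvas of the color used in the thread.
padded = center_pad(np.zeros((2, 4, 3), dtype=np.uint8), 4, 4, [0x76, 0xb9, 0x00])
```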
Thanks a lot! This is really helpful!
Also, if you're fine with bilinear resizing without antialiasing, then you can do all those transforms in one go with a single affine warp:

    import nvidia.dali as dali
    import nvidia.dali.fn as fn
    import PIL.Image
    import numpy as np

    @dali.pipeline_def(batch_size=1, num_threads=4, device_id=0)
    def mypipe():
        enc, _ = fn.readers.file(file_root=".", files=["alley.png"])
        img = fn.decoders.image(enc, device="mixed")
        # read the image shape from the encoded file header
        shape = fn.peek_image_shape(enc)
        h = shape[0]
        w = shape[1]
        size = fn.stack(w, h)
        # scale factor that fits the image inside 256x256
        scale = dali.math.min(256/w, 256/h)
        out_size = fn.cast(scale * size, dtype=dali.types.INT32)
        # use negative angle, since here we use source-to-destination matrix
        mtx = fn.transforms.rotation(angle=-90, center=size/2)
        mtx = fn.transforms.scale(mtx, scale=fn.stack(scale, scale))
        # center the result on the 256x256 canvas
        mtx = fn.transforms.translation(mtx, offset=(256.0 - out_size) // 2)
        warped = fn.warp_affine(img, size=(256, 256), matrix=mtx, fill_value=0, inverse_map=False)
        return warped

    pipe = mypipe()
    pipe.build()
    warped, = pipe.run()

The aliasing artifacts are quite obvious when you compare this image to the previous ones, but if that's OK for you, then this method will certainly be the most performant one. An added benefit is that you end up with a complete transformation matrix, so if your labels are in fact points, you can use this matrix to transform them. See this tutorial to learn how to use a transformation matrix to transform keypoints alongside images. The methods sorted in efficiency order:
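Regarding the transformation matrix mentioned above, here is a minimal NumPy sketch of how a source-to-destination affine matrix composed as rotation, then scale, then translation can be applied to keypoints. It mirrors the arithmetic of the pipeline but is not DALI code: the helper names `build_matrix` and `transform_points`, the flooring of the output size, and the sign convention of the rotation matrix are assumptions of this sketch and may need adjusting to match a particular library.

```python
import numpy as np

def build_matrix(w, h, angle_deg, canvas=256):
    """Compose a 2x3 source-to-destination matrix (hypothetical helper).

    Order: rotate around the image center, scale to fit the canvas,
    then translate so the result is centered on the canvas.
    """
    size = np.array([w, h], dtype=np.float64)
    scale = min(canvas / w, canvas / h)
    out_size = np.floor(scale * size)      # flooring is a choice of this sketch
    offset = (canvas - out_size) // 2
    cx, cy = size / 2

    a = np.deg2rad(angle_deg)
    c, s = np.cos(a), np.sin(a)
    to_origin = np.array([[1, 0, -cx], [0, 1, -cy], [0, 0, 1]], dtype=np.float64)
    rot = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]], dtype=np.float64)
    back = np.array([[1, 0, cx], [0, 1, cy], [0, 0, 1]], dtype=np.float64)
    scl = np.diag([scale, scale, 1.0])
    trn = np.array([[1, 0, offset[0]], [0, 1, offset[1]], [0, 0, 1]], dtype=np.float64)

    # Later transforms are multiplied on the left, so they are applied last.
    full = trn @ scl @ back @ rot @ to_origin
    return full[:2]

def transform_points(mtx, pts):
    """Apply a 2x3 affine matrix to an (N, 2) array of (x, y) points."""
    pts = np.asarray(pts, dtype=np.float64)
    return pts @ mtx[:, :2].T + mtx[:, 2]

# Keypoints of a 512x256 image: its center and its top-left corner.
mtx = build_matrix(512, 256, angle_deg=-90.0)
moved = transform_points(mtx, [[256.0, 128.0], [0.0, 0.0]])
```

With this composition the image center always lands on the canvas center, which is a quick sanity check for any matrix built this way.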
Describe the question.
I am working on a training framework for an image rotation recognition task (0, 90, 180, or 270 degrees). My dataset is so large that it is infeasible to store every image rotated to all four angles, because there is not enough disk space on the machine.
As a result, my approach is to randomly rotate each image to one of these four angles in the preprocessing step and change the label accordingly. However, this makes preprocessing the dominant part of the total time cost.
I want to use DALI to accelerate preprocessing, and I wonder whether I can randomly rotate the image and change the label accordingly within the pipeline?
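For reference, the approach described above can be sketched on the CPU in plain NumPy, which may be useful as a baseline when checking the output of an accelerated pipeline. This is only an illustration of the label convention (label `k` means a rotation by `90 * k` degrees); the helper name `random_rot90` is hypothetical, and the rotation direction is whatever `np.rot90` uses, which may or may not match another library.

```python
import numpy as np

def random_rot90(img, rng):
    """Rotate an HWC image by a random multiple of 90 degrees.

    Returns the rotated image and the label k in {0, 1, 2, 3},
    where the applied rotation is 90 * k degrees (np.rot90 convention).
    """
    k = int(rng.integers(0, 4))
    rotated = np.rot90(img, k=k, axes=(0, 1)).copy()
    return rotated, k

rng = np.random.default_rng(0)
img = np.arange(2 * 4 * 3, dtype=np.uint8).reshape(2, 4, 3)
rotated, label = random_rot90(img, rng)
```

Because the label is derived from the same random draw as the rotation, the two cannot get out of sync.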