updated transforms.ToPILImage, see #105 (#122)

Merged
merged 6 commits into pytorch:master on Mar 23, 2017

Conversation

@bodokaiser (Contributor) commented Mar 23, 2017

PR for #105. Changes should be backward compatible. Types are inferred from PIL.Image.mode when passing single-channel images. See PIL modes here.

How does ToPILImage handle types?

1-channel:
* FloatTensor -> ByteTensor -> 'L' Image (8-bit black and white)
* ShortTensor -> 'I;16' Image
* IntTensor -> 'I' Image
* analogous for the respective numpy types (exception: np.float32 becomes an 'F' Image)
* other types (e.g. signed ints) -> error

3-channel:
* same as above, with the addition that images with more than 8 bits of precision raise an error (PIL.Image cannot represent more than 8 bits per color channel).

How does ToTensor handle types?

* np.ndarrays -> FloatTensor of range [0, 1]
* 'I;16' Image -> ShortTensor
* 'I' Image -> IntTensor
* everything else -> ByteTensor
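
For illustration, here is a minimal sketch (not part of the PR diff) exercising the mapping above; the expected modes in the comments are the ones described in this PR:

```python
import numpy as np
import torch
from torchvision import transforms

to_pil = transforms.ToPILImage()

# 1-channel inputs: the PIL mode is inferred from the tensor/array type.
print(to_pil(torch.rand(1, 4, 4)).mode)                 # FloatTensor -> 'L'
print(to_pil(torch.ShortTensor(1, 4, 4).zero_()).mode)  # ShortTensor -> 'I;16'
print(to_pil(torch.IntTensor(1, 4, 4).zero_()).mode)    # IntTensor   -> 'I'
print(to_pil(np.zeros((4, 4, 1), np.float32)).mode)     # np.float32  -> 'F'
```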

@bodokaiser (Contributor, Author)

I want to note that the way I handle single-channel images with precision > 8 bits breaks with the convention of scaling tensors to [0, 1] and images to [0, 255].
In my opinion this is a necessary change, because otherwise we would lose precision and/or need to specify types on every transform call. Nevertheless, I would suggest introducing a breaking API change with the next major release that stops scaling images to [0, 255] and tensors to [0, 1]. This is something the user can easily implement with transforms.Lambda or transforms.Normalize, as sketched below.
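
A sketch of that user-side opt-in, assuming a ToTensor that returns an unscaled ByteTensor (the proposed behavior, not what the released ToTensor does today):

```python
from torchvision import transforms

# Restore the old behaviour by hand: convert to float and scale to [0, 1].
# Assumes ToTensor no longer divides by 255 itself.
to_unit_range = transforms.Compose([
    transforms.ToTensor(),
    transforms.Lambda(lambda img: img.float().div(255)),
])
```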


@fmassa (Member) commented Mar 23, 2017

I agree with you wrt keeping the original types and range of the image.
We don't always want images in the [0, 1] range (for example, when the target is an image), but we might still want to apply transforms to it and have it converted to a tensor. Currently, we need a transforms.Lambda for that, to avoid the division by 255.
But the thing is, for most users dividing by 255 is fine, as they don't have image targets.
Do you think it is a reasonable assumption that all the images in a dataset will have the same number of bytes per pixel? If yes, we could add a constructor argument for an optional normalization factor.

@bodokaiser (Contributor, Author)

👍 for the idea with the constructor. I would use nfactor=255 as the default to keep BC. Should we then disable normalization on nfactor=0 or nfactor=None?
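
A minimal sketch of the idea (hypothetical; this nfactor constructor never landed in torchvision):

```python
import torch

class ToTensor(object):
    """Hypothetical sketch of the discussed API, not torchvision's actual code."""

    def __init__(self, nfactor=255):
        # nfactor=255 keeps backward compatibility with the current
        # divide-by-255 behaviour; nfactor=None disables scaling and
        # keeps the tensor in its native type.
        self.nfactor = nfactor

    def __call__(self, pic):
        # pic: H x W x C numpy array (PIL.Image handling omitted for brevity).
        img = torch.from_numpy(pic.transpose(2, 0, 1).copy())
        if self.nfactor is not None:
            img = img.float().div(self.nfactor)
        return img
```

With this, ToTensor() would behave exactly as today, while ToTensor(nfactor=None) would hand back the unscaled tensor in its native type.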

@fmassa (Member) commented Mar 23, 2017

Yeah, maybe let the normalization factor default to 255, and only convert to float if the normalization factor is not None?
This means that if we convert to float, we also need to pass an "unnormalization" factor to ToPILImage, and maybe pass the target type as a second argument?

Also, maybe a more descriptive name, such as normalization_factor? Not sure on this one.

@bodokaiser (Contributor, Author)

> Yeah, maybe let the normalization factor default to 255, and only convert to float if the normalization factor is not None?

Sounds reasonable.

> This means that if we convert to float, we also need to pass an "unnormalization" factor to ToPILImage, and maybe pass the target type as a second argument?

I don't think we need to pass the type around. If normalization is disabled, we can infer the type as implemented in this PR. If normalization is applied, we simply expect a ByteTensor.
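
For illustration, a round trip with 16-bit data under these rules (a sketch; the modes and tensor types are the ones described in this PR):

```python
import numpy as np
from torchvision import transforms

pic = transforms.ToPILImage()(np.zeros((4, 4, 1), np.int16))  # -> 'I;16' image
img = transforms.ToTensor()(pic)  # -> ShortTensor; values are not rescaled
```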

@bodokaiser (Contributor, Author)

@soumith @apaszke any additional comments / thoughts?

@soumith merged commit 510f095 into pytorch:master on Mar 23, 2017
@soumith (Member) commented Mar 23, 2017

looks pretty good, thank you!
