support int16 grayscale images #105

Closed
bodokaiser opened this issue Mar 14, 2017 · 12 comments

@bodokaiser
Contributor

This is often the case with medical (MRI) data.

The required changes would be in ToTensor, probably something like:

# PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
if pic.mode == 'YCbCr':
  nchannel = 3
else:
  nchannel = len(pic.mode)
# handle PIL Image
buf = pic.tobytes()
if len(buf) > pic.width * pic.height * nchannel:
  img = torch.LongTensor(torch.LongStorage.from_buffer(buf))
else:
  img = torch.ByteTensor(torch.ByteStorage.from_buffer(pic.tobytes()))
img = img.view(pic.size[1], pic.size[0], nchannel)

as well as in ToPILImage (just remove normalization to [0, 255] here?).

However, I can't assess possible side effects. int16 support may not be very good in Pillow (e.g. plt.imshow(Image.fromarray(int16_np_array)) does not work), and there may be other transforms that depend on the [0, 255] byte range.
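
For context (this sketch is not part of the original thread), the buffer-length check in the snippet above works because a 16-bit grayscale PIL image ('I;16' mode) returns two bytes per pixel from tobytes(); note also that len(pic.mode) is 4 for 'I;16', so the channel-count heuristic would need a special case as well:

import numpy as np
from PIL import Image

# Build a synthetic 16-bit grayscale image in mode 'I;16' (little-endian uint16).
arr = (np.arange(16, dtype=np.uint16) * 4096).reshape(4, 4)
pic = Image.frombytes('I;16', (4, 4), arr.tobytes())

buf = pic.tobytes()
print(len(buf))                  # 32: two bytes per pixel
print(pic.width * pic.height)    # 16: one element per pixel
print(len(pic.mode))             # 4: 'I;16' would break the len(pic.mode) channel heuristic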

@fmassa
Member

fmassa commented Mar 14, 2017

I'd say as long as the returned tensor is properly converted to float and scaled to [0,1], things should be fine.
But we need to check whether standard image transforms (like rotating, cropping, etc.) work correctly in PIL for the int16 type.

Also, LongTensor is actually int64, you might be looking for a ShortTensor instead (which is signed).
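
To make the scaling suggestion concrete, here is a minimal sketch (not from the thread) of converting an unsigned 16-bit grayscale PIL image to a float tensor in [0, 1], going through NumPy rather than torch.*Storage.from_buffer; the 'I;16' mode and the 65535 divisor are assumptions for unsigned data:

import numpy as np
import torch

def to_float_tensor_16bit(pic):
    # Sketch only: assumes an unsigned 16-bit grayscale image ('I;16') whose
    # pixels Pillow exposes to NumPy as uint16 via np.asarray.
    arr = np.asarray(pic).astype(np.int32)        # widen so the uint16 values survive
    t = torch.from_numpy(arr).float() / 65535.0   # scale the 16-bit range to [0, 1]
    return t.unsqueeze(0)                         # 1 x H x W, matching ToTensor's C x H x W layout

For signed int16 data the divisor and offset would differ, which is part of the ambiguity discussed below.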

@bodokaiser
Contributor Author

That's the problem. I put together a minimal example where you can examine the problem.

from torchvision.transforms import Compose, ToPILImage, ToTensor
from matplotlib import pyplot as plt

import skimage.io
import numpy as np

img = skimage.io.imread('mr.tif')
print('img', img.shape, img.dtype)

plt.imshow(img)
plt.show()

transform = Compose([
    ToPILImage(),
    ToTensor(),
])

# add a channel dimension (H x W -> H x W x 1) so ToPILImage accepts the array
timg = transform(np.expand_dims(img, 2))

plt.imshow(timg[0].numpy())
plt.show()

[Screenshots: the original image and the transformed output]

Here is the corresponding TIFF file:
mr.tif.zip

@fmassa
Member

fmassa commented Mar 14, 2017

The example you posted shows that the current code is not adapted to int16 images. Or did you already try adding the modifications you mentioned?

@bodokaiser
Contributor Author

updated:

# PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
if pic.mode == 'YCbCr':
  nchannel = 3
else:
  nchannel = len(pic.mode)
# handle PIL Image
buf = pic.tobytes()
if len(buf) > pic.width * pic.height * nchannel:
  img = torch.ShortTensor(torch.ShortStorage.from_buffer(buf, 'native'))
else:
  img = torch.ByteTensor(torch.ByteStorage.from_buffer(pic.tobytes()))
img = img.view(pic.size[1], pic.size[0], nchannel)

This fails with RuntimeError: size '[466 x 394 x 1]' is invalid for input of with 367208 elements at /private/var/folders/y0/d4npmpd50971gpgqxtsvc25m0000gn/T/pip-_fraocf5-build/torch/lib/TH/THStorage.c:59. However, changing it to:

if len(buf) > pic.width * pic.height * nchannel:
  img = torch.ShortTensor(np.fromstring(buf, dtype=np.int16)[0::2])
else:
  img = torch.ByteTensor(torch.ByteStorage.from_buffer(pic.tobytes()))
img = img.contiguous().view(pic.size[1], pic.size[0], nchannel)

does the job. Furthermore we need to change:

class ToPILImage(object):
    """Converts a torch.*Tensor of range [0, 1] and shape C x H x W
    or numpy ndarray of dtype=uint8, range [0, 255] and shape H x W x C
    to a PIL.Image of range [0, 255]
    """

    def __call__(self, pic):
        npimg = pic
        mode = None
        if not isinstance(npimg, np.ndarray):
            npimg = pic.mul(255).byte().numpy()
            npimg = np.transpose(npimg, (1, 2, 0))

        if npimg.shape[2] == 1:
            npimg = npimg[:, :, 0]
            if npimg.dtype != np.int16:
                mode = "L"

        return Image.fromarray(npimg, mode=mode)

This works but is of course just a quick hack.
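
As an aside (not part of the thread), an alternative to the byte-skipping hack above is to let Pillow and NumPy do the element-size and endianness bookkeeping; this is only a sketch and assumes np.asarray(pic) yields the expected 16-bit dtype for the image's mode:

import numpy as np
import torch

def pil_16bit_to_tensor(pic):
    # Alternative sketch: let Pillow expose its pixel buffer through the
    # NumPy array interface instead of reinterpreting raw bytes by hand.
    arr = np.asarray(pic)             # e.g. uint16 for 'I;16', int32 for 'I'
    if arr.ndim == 2:                 # grayscale: add a trailing channel dimension
        arr = arr[:, :, None]
    if arr.dtype == np.uint16:
        # torch (at the time of this thread) has no uint16 tensor type,
        # so widen to int32 before handing the array to torch.
        arr = arr.astype(np.int32)
    return torch.from_numpy(np.ascontiguousarray(arr))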

@fmassa
Member

fmassa commented Mar 14, 2017

OK, cool.
But I was wondering: does PIL natively support image operations on int16 images, such as rotate or crop? If it doesn't, then even if we adapt ToPILImage and ToTensor, we still won't be able to perform these operations. Also, since ToTensor converts the image to float, there would be no way of knowing whether the original image was int16 or uint8, meaning that applying ToTensor() followed by ToPILImage() would not return the identity.
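
One way to answer the PIL-support question empirically (again, just a sketch, not something from the thread) is to run the relevant operations on a synthetic 16-bit image and see which of them raise:

import numpy as np
from PIL import Image

# Build a synthetic 16-bit grayscale image and try the common transforms on it.
arr = (np.arange(64, dtype=np.uint16) * 1000).reshape(8, 8)
pic = Image.frombytes('I;16', (8, 8), arr.tobytes())

for name, op in [('crop',   lambda im: im.crop((0, 0, 4, 4))),
                 ('rotate', lambda im: im.rotate(30)),
                 ('resize', lambda im: im.resize((4, 4)))]:
    try:
        op(pic)
        print(name, 'ok')
    except Exception as exc:
        print(name, 'failed:', exc)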

@bodokaiser
Contributor Author

bodokaiser commented Mar 14, 2017

According to this issue and this PR, it does, but only for grayscale images.

Regarding the behavior of ToTensor(), one way to solve this would be for ToTensor() to keep the data type from the PIL.Image but accept the target data type as an argument.
Alternatively, we could ignore the fact that ToPILImage(ToTensor()) does not return the identity; then we would have no API breaks, and I don't think we lose anything through this, do we?
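
For illustration, a rough sketch of the "target data type as argument" idea might look like the following; the class name ToTensorWithDtype is hypothetical, this is not the API that was eventually merged, and dtypes without a torch counterpart (such as uint16) would still need widening:

import numpy as np
import torch

class ToTensorWithDtype(object):
    # Hypothetical sketch of the "target data type as argument" idea;
    # the class name and keyword are made up for illustration.

    def __init__(self, dtype=None):
        self.dtype = dtype        # e.g. np.float32 or np.int16; None keeps the source dtype

    def __call__(self, pic):
        arr = np.asarray(pic)
        if self.dtype is not None:
            arr = arr.astype(self.dtype)
        if arr.ndim == 2:
            arr = arr[:, :, None]
        # H x W x C -> C x H x W, matching torchvision's tensor layout.
        arr = np.ascontiguousarray(arr.transpose(2, 0, 1))
        return torch.from_numpy(arr)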

@soumith
Member

soumith commented Mar 23, 2017

Bodo, it looks like you've been making a lot of progress already.

If you want to fire a few PRs to make torchvision work with int16 out of the box, I would love to have them. If not, I will eventually get to this for sure.

@bodokaiser
Contributor Author

bodokaiser commented Mar 23, 2017 via email

@soumith
Copy link
Member

soumith commented Mar 23, 2017

A range of 0 to 65535 sounds fine for int16/uint16. You can remove the image scaling if you want to; I don't have experience with this domain, so I'll let you make the call.

In the case of identity preservation, ToPILImage needs to take a kwarg of Int16=True or something for the identity round trip to happen. I don't see a better way. Same for ToTensor: taking the target data type as a kwarg seems good.
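
To make the kwarg suggestion concrete, here is a hypothetical sketch (the class name, the int16 keyword, and the 'I;16' output mode are all assumptions, not the merged API) of what an int16-aware ToPILImage could look like:

import numpy as np
from PIL import Image

class ToPILImageInt16(object):
    # Hypothetical sketch of the int16 kwarg idea; the class name and the
    # int16 keyword are made up for illustration, not the merged API.

    def __init__(self, int16=False):
        self.int16 = int16

    def __call__(self, pic):
        npimg = pic
        if not isinstance(npimg, np.ndarray):
            if self.int16:
                npimg = pic.numpy()                  # keep the raw 16-bit values
            else:
                npimg = pic.mul(255).byte().numpy()  # usual [0, 1] -> [0, 255] path
            npimg = np.transpose(npimg, (1, 2, 0))
        if npimg.shape[2] == 1:
            npimg = npimg[:, :, 0]
        if self.int16:
            # 'I;16' stores unsigned little-endian 16-bit pixels; negative
            # int16 values would wrap here, which is part of the ambiguity
            # discussed in this thread.
            h, w = npimg.shape
            return Image.frombytes('I;16', (w, h), npimg.astype('<u2').tobytes())
        return Image.fromarray(npimg)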

@bodokaiser
Contributor Author

@soumith @fmassa PR #122 is up for discussion!

@alykhantejani
Contributor

alykhantejani commented Sep 6, 2017

@fmassa I think this can now be closed as #122 was merged.

@fmassa
Member

fmassa commented Sep 6, 2017

Thanks @alykhantejani !
