Images of types other than unsigned short stay in 0-255 range #976

Closed
kino-dome opened this Issue Nov 22, 2017 · 6 comments

kino-dome (Contributor) commented Nov 22, 2017

Hi!

Expected Behavior

As mentioned in the pixel_traits comments in http://dlib.net/dlib/pixel.h.html, when using grayscale images with underlying types such as float or unsigned int, the min and max should be std::numeric_limits<T>::min() and std::numeric_limits<T>::max() respectively.

Current Behavior

This is not true when we load a jpg picture (puppy.jpg, https://upload.wikimedia.org/wikipedia/commons/thumb/6/6e/Golde33443.jpg/160px-Golde33443.jpg) with the code below:

```cpp
dlib::matrix<unsigned short> img;
dlib::load_image(img, getAssetPath("puppy.jpg").string());
```

Here img has a range of 0-255 for its values instead of 0-65535. Is it somehow connected to the source image?

Steps to Reproduce

```cpp
dlib::matrix<float> img;
dlib::load_image(img, getAssetPath("puppy.jpg").string());

const long width = img.nc();
const long height = img.nr();
const int numChannels = 1; // matrix<float> holds a single channel
const float* data = reinterpret_cast<const float*>(dlib::image_data(img));
for (long i = 0; i < width*height*numChannels; i++) {
    std::cout << *data << std::endl;
    data++;
}
```
  • Version: 19.0.7
  • Where did you get dlib: github master
  • Platform: macOS 10.12.6
  • Compiler: Apple LLVM 9.0
davisking (Owner) commented Nov 22, 2017
This is the intended behavior. The image loading functions don't mutate the values in the image. If a pixel has a value of 123 in the image file then it's going to have a value of 123 in the loaded image object.

As an aside, you don't need to do that reinterpret_cast. That's a really bad idea. Just say img(row,col), or use img.begin() to get an iterator pointing at the first pixel.


davisking closed this Nov 22, 2017

kino-dome (Contributor) commented Nov 22, 2017
Thanks Davis for the info. The thing is, when converting dlib images to the other library I'm working on (libcinder), I need to know the range of the values, since in some cases I need to map them to libcinder's desired range (0.0f-1.0f in the case of floats, for instance).
I feel the same as you: if an 8-bit image is loaded into an array2d in dlib, the values should stay in the same 0-255 range, but then it's on the user's shoulders to load truly 32-bit images into 32-bit containers and so on. So I can never know this for sure in order to always map the right ranges; the best thing I can do for floats, for example, is this:

```cpp
float cinderPixel = map(dlibPixel, std::numeric_limits<float>::min(), std::numeric_limits<float>::max(), 0.0f, 1.0f);
```

Or do you have a better idea? When loading a 32-bit EXR in dlib, for example, do the values span the whole available range (±3.4e38) or are they kept in another range?

Thanks a lot Davis for your help :)


davisking (Owner) commented Nov 22, 2017
Mapping floats to the range ±3.4e38 is a terrible idea. Numerically, if you then do simple operations like differencing pixels, the differencing does nothing, since subtracting floating point values that far apart in magnitude is a no-op (e.g. 1e30 - 1 == 1e30, which is not what you want). It would break essentially any image processing algorithm when applied to such images.

As for 16-bit images, just because an image is 16 bits doesn't mean it has 16 bits of information in it. I've encountered plenty of imaging sensors in my career that output images with 13 or 14 bits. These are naturally packed into uint16_t or something similar. You certainly would not want to assume the max value was 65535, because it isn't. You also wouldn't want to remap it to 0 to 65535. A common reason to use higher bit images is to allow you to do arithmetic on an image that needs higher precision because of some aspect of the intervening calculation. Maybe you are multiplying pixels by some number. But if you already saturated the range by stuffing 8 bits into the 8 high order bits of a 16-bit integer, now you have lost all room to create larger values. Again, not what you want.

I could go on all day about how this is a bad idea and what kind of trouble you will run into if you try to do this in real image processing code. Realistically, for images most people encounter, the range is always 0 to 255. Some libraries stuff things into floats and in those cases it's generally 0 to 1. But any time you are loading an image from disk it's almost certainly going to have pixels in the range 0 to 255 unless you are dealing with special sensors, which for most users is not the case, and even if it was, automatic scaling would just make your life harder.

I should also point out that I've encountered plenty of 16 bit image formats that, when you look at them, actually contain 8bit data. What happens when you load that kind of image into an 8bit representation if there is automatic scaling? Do all the pixels get set to 0 because the high order bits of the image are all 0? Again, not what you want.


davisking (Owner) commented Nov 22, 2017
Maybe I should also say that this is a good question since the right thing to do is not obvious :)

The long and short of it is that you need to know what kind of images you are working with, and you can't assume it based on something like the nominal bit depth of the source image format. But most image files are 0-255, which is what you should assume unless you know otherwise.


kino-dome (Contributor) commented Nov 22, 2017
Hi Davis, thanks so much for your thorough explanation and the time you took to answer. I know a lot more now. I think I'll make the conversion function accept source ranges but assume the 0-255 range unless users say otherwise.


davisking (Owner) commented Nov 22, 2017
