Finalize ColorType enum #855

Closed
fintelia opened this issue Jan 18, 2019 · 29 comments

@fintelia (Contributor)

The ColorType enum is used to describe the meaning of the subcomponents of Pixels and of decoder output. However, it has a few issues:

  • Many of its values are completely nonsensical and almost surely never used in practice. Why would anyone have an image with 157 bits per pixel BGR?
  • Parts of the library are unimplemented for certain color types. At the very least it should be possible to convert between any pair of color types (filling any missing values with some reasonable default).
  • Adding enum variants would be a breaking change to the library, yet some relatively useful/common formats (like floating point RGBA) are missing. This needs to be resolved before a 1.0 release.

One option is to replace the enum with something similar to the following (based on the supported modes in the Pillow library):

enum ColorType {
    // 8 bits per channel
    Palette,
    Luminance,
    LuminanceAlpha,
    RGB,
    RGBA,

    // Reversed channel order
    BGR,
    BGRA,

    // Exotic single channel bit depths
    U1,
    I32,
    F32,

    // High bit depth color images
    RGBA16,
    RGBA32F,
}

Questions

[ ] Do we want to include BGR / BGRA formats?
[ ] Should we include U1 (packed 1 bit per pixel, 8 pixels per byte)?
[ ] Should we define a smaller set of "core" types, say Luminance + RGB + RGBA + RGBA32F, and then focus on conversions to and between them?
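
To give a sense of how cheap the "fill any missing values with some reasonable default" conversions are, here is a minimal sketch (a hypothetical free function, not part of the proposal) that expands 8-bit luminance data to RGBA with an opaque default alpha:

// Expand 8-bit luminance samples to RGBA8; the missing alpha channel is
// filled with 255 as the "reasonable default". Hypothetical helper only.
fn luminance_to_rgba8(luma: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(luma.len() * 4);
    for &l in luma {
        out.extend_from_slice(&[l, l, l, 255]);
    }
    out
}

fn main() {
    assert_eq!(
        luminance_to_rgba8(&[0, 128]),
        vec![0, 0, 0, 255, 128, 128, 128, 255]
    );
}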

fintelia added this to the 1.0 milestone Jan 18, 2019
fintelia self-assigned this Jan 18, 2019
@HeroicKatora (Member) commented Jan 20, 2019

Could we differentiate between the format of a pixel, the color type, and the color space? For example, both RGBA(8) and BGRA(8) offer 8-bit deep RGBA data, and a planar image with those four color channels could offer the same. It should therefore be possible to trivially convert between these buffer layouts while preserving the color space, and independently of it. A YCbCr color type, however, would require knowing both color spaces for accurate conversion. Palette is somewhere in between: the underlying color type and color space may be the same, but we should still check the color space.

In any case, reinterpreting an image under a different color space, with the same color type and pixel format, should always be a trivial operation, so that decoders can supply color space information independently and the programmer can override it without effort.
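
A minimal sketch of keeping the three descriptions independent (every name here is made up for illustration, not a concrete proposal); overriding the color space never touches the sample data or the layout:

// Hypothetical descriptor keeping the three concerns separate.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum SampleLayout { InterleavedRgba8, InterleavedBgra8, PlanarRgba8 }

#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Channels { Rgba, YCbCr }

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ColorSpace { Srgb, Rec709 }

#[allow(dead_code)]
#[derive(Clone, Copy, Debug)]
struct PixelDescriptor {
    layout: SampleLayout,    // how the samples sit in memory
    channels: Channels,      // what the samples mean
    color_space: ColorSpace, // how the channel values are to be interpreted
}

impl PixelDescriptor {
    // Reinterpreting under a different color space touches neither the layout
    // nor the channels: a trivial, always-successful operation.
    fn with_color_space(self, color_space: ColorSpace) -> Self {
        PixelDescriptor { color_space, ..self }
    }
}

fn main() {
    let decoded = PixelDescriptor {
        layout: SampleLayout::InterleavedBgra8,
        channels: Channels::Rgba,
        color_space: ColorSpace::Srgb,
    };
    // A decoder reported sRGB; the programmer overrides it without effort.
    println!("{:?}", decoded.with_color_space(ColorSpace::Rec709));
}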

@fintelia (Contributor, Author)

I am in favor of keeping color space information separate from the pixel format and color type. At the same time, I'm reluctant to try to split pixel format from color type; I think it would end up being more trouble than it is worth. Keeping them together means we have only one enum and all possible variants make sense. Otherwise it would take some arcane type trickery to rule out nonsensical combinations like a 2-channel color type in BGR order, short of littering the code with unimplemented!() branches everywhere.

@fintelia (Contributor, Author)

Thinking about it more, the Palette type might also be out of place. All of the other color types allow direct interpretation of the color data, but that one needs additional metadata and would be very difficult to operate on without conversion. Doing that conversion could be (and maybe already is?) part of the decoding process when loading an image.

@HeroicKatora (Member)

> Doing that conversion could be (and maybe already is?) part of the decoding process when loading an image.

An interesting point. Doing the conversion during decoding is more troublesome to implement for decoders (and currently they decide their output format), while doing it strictly after reading the image has less coupling but more resource consumption (memory, CPU) because we store the full intermediate result.
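
Either way, the expansion itself is mechanical; a minimal sketch of the palette lookup (a hypothetical helper, wherever in the pipeline it ends up living):

// Expand palette indices into packed RGB8 samples. The palette has to be kept
// around until this point, which is where the coupling/memory trade-off sits.
fn expand_palette(indices: &[u8], palette: &[[u8; 3]]) -> Option<Vec<u8>> {
    let mut out = Vec::with_capacity(indices.len() * 3);
    for &i in indices {
        out.extend_from_slice(palette.get(i as usize)?);
    }
    Some(out)
}

fn main() {
    let palette = [[255, 0, 0], [0, 0, 255]];
    assert_eq!(
        expand_palette(&[1, 0], &palette),
        Some(vec![0, 0, 255, 255, 0, 0])
    );
}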

@dobkeratops commented Jan 20, 2019

(Minor questions: does anything use the Xbox 360's 7e3 RGB HDR format these days? How about packed 16-bits-per-pixel RGBA as 1555 / 565 / 4444? I think there are also 10-bit-per-channel R, G, B formats these days for modern monitors. Is the RGB16 fixed or floating point? There is a nice 16-bit float format. These might be unusual formats, but they are supported by various pieces of hardware.

Palette: would you support 4-bit, i.e. 16-color, as well? Hardly used today, but there are retro emulators.

It sounds from the original comment like the previous system might have allowed general-purpose encoding of the RGB fields; given all the options that have appeared, maybe that isn't such a bad idea.)

@fintelia (Contributor, Author)

@dobkeratops The current design isn't that general. It is just a layout plus a single u8 to indicate bit depth. Thus channels are assumed to be unsigned ints (no floating point formats), and all channels have to be the same depth (so RGB10A2, RGB565, etc. are not supported).

Currently, the main use for ColorType is to describe what the raw bytes produced by a decoder mean. Thus, the more variants we have, the more cases users of the library have to handle. That's already enough of a hassle that our own image loading code gives up when it encounters anything but the most common color types.
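
For context, the existing definition is roughly of this shape (see color.rs for the exact variants): one variant per channel layout, each carrying a single bit depth shared by all channels, which is what makes something like BGR(157) expressible in the first place:

// Rough shape of the current ColorType: layout variant + one bit-depth value.
pub enum ColorType {
    Gray(u8),
    GrayA(u8),
    Palette(u8),
    RGB(u8),
    RGBA(u8),
    BGR(u8),
    BGRA(u8),
}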

@HeroicKatora (Member) commented Jan 20, 2019

I think the biggest current flaw or constraint to overcome would be to allow pixels that are not packed in memory; packing is currently required by Pixel::from_slice etc. The constraint on our own encoder is mostly due to DynamicImage having this static set of enum variants. Imho, instead of interpreting the raw bytes from the Decoder ourselves, it might be better to make conversion to DynamicImage easier for external code and then request a DynamicImage from the decoder. Making sure that the color type in the header and the color type in the output are consistent is trivially done afterwards, and any mismatch is a bug in the decoder (i.e. we could cast the dynamic image down to the actual buffer through as_rgba().expect(…) where necessary).

@fintelia (Contributor, Author)

What do you mean by pixels not being packed? Are you talking about planar images that store the channels separately? We could probably accommodate those by adding PlanarXXX variants to ColorType. Converting to DynamicImage should also definitely be easier. I'd like there to be an infallible from_raw function for that which takes the image dimensions, a ColorType, and the raw bytes.

@HeroicKatora (Member) commented Jan 20, 2019

> We could probably accommodate those by adding PlanarXXX variants to ColorType.

Not without more changes. Planar variants could not implement Pixel, and that is a strong requirement of GenericImage, because of https://github.com/PistonDevelopers/image/blob/c3c7d5c9e72d51c7983d65d69d27d3e5c61edeb3/src/buffer.rs#L26 and the requirement to return pixels by &mut. https://github.com/PistonDevelopers/image/blob/c3c7d5c9e72d51c7983d65d69d27d3e5c61edeb3/src/buffer.rs#L72 also assumes pixels to be present as very strictly typed packed slices, not to speak of the pixel iterator PixelsMut...

> I'd like there to be an infallible from_raw function for that which takes the image dimensions, a ColorType, and the raw bytes.

Sounds like a recipe for disaster, but some (not necessarily internal) unsafe function may be permissible.

@fintelia (Contributor, Author)

> We could probably accommodate those by adding PlanarXXX variants to ColorType.

> Not without more changes. Planar variants could not implement Pixel, and that is a strong requirement of GenericImage, because of...

I think we're talking about different things. ColorType is an enum that isn't related to the Pixel trait. Are you thinking of the Rgb, Rgba, etc. structs?

@HeroicKatora (Member)

Yes, ColorType is not related to it in code. But there is one struct for each type, and converters for each variant on DynamicImage, for compatibility reasons. I would not see great use in a ColorType variant without a strictly typed corresponding struct for access.

@fintelia (Contributor, Author)

ColorType::Gray(1) has no corresponding struct, yet it can be produced by some decoders. Even though image cannot manipulate such images, it is useful to have because it means you don't have to roll your own file format parsing code to open 1-bit images in your program. An alternative strategy would be to have the decoder immediately unpack the image, but that would increase memory use 8x.
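
For reference, the unpacking in question is roughly this (one byte out per packed bit, hence the 8x); a hypothetical helper, not current API:

// Unpack 1-bit-per-pixel data (8 pixels per byte, most significant bit first)
// into one 8-bit luminance sample per pixel: 0 -> 0x00, 1 -> 0xFF.
fn unpack_1bit(packed: &[u8], pixels: usize) -> Vec<u8> {
    (0..pixels)
        .map(|i| {
            let bit = (packed[i / 8] >> (7 - i % 8)) & 1;
            if bit == 1 { 0xFF } else { 0x00 }
        })
        .collect()
}

fn main() {
    // 0b1010_0000: pixels 0 and 2 are set out of the first four.
    assert_eq!(unpack_1bit(&[0b1010_0000], 4), vec![0xFF, 0x00, 0xFF, 0x00]);
}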

I think we should have a large set of color types that decoders can produce and then a smaller set of "core" types that can be represented by DynamicImage (supporting all of them would be too much implementation effort).

@HeroicKatora (Member) commented Jan 21, 2019

> ColorType::Gray(1) has no corresponding struct, yet it can be produced by some decoders.

I believe that to be a symptom of the underlying problem.

> Even though image cannot manipulate such images, it is useful to have because it means you don't have to roll your own file format parsing code to open 1-bit images in your program. An alternative strategy would be to have the decoder immediately unpack the image, but that would increase memory use 8x.

What do you mean by this? No current image type can represent this. If it refers to the decoder's ability to write to a raw Vec<u8> instead of only being able to output to an image, then yes, it's indeed useful. But it would mean that we should think about how to represent this as a GenericImageView, imho, or at least provide a central means to describe such a buffer. That would make it clear which kinds of data can be unpacked into a usable image in case the memory tradeoff is desired.

Hence the three levels of descriptors:

  • format of the buffer (at the level of rgb565)
  • pixel type (rgb, with red green blue component)
  • color space information (based on one pixel type, e.g. rec.709)

Upgrading from one level to the next is then an explicitly, potentially costly and fallible operation. E.g. (disregarding potential future pixel types), to represent rgb565 as a mutable image you need to move its data to an RGB buffer, and trying to interpret the pixel type YCbCr for sRGB color data will fail. I'm not sure about the specifics yet.

But for every higher level of description there should be one 'preferred' format in the level below. While it is possible to use planar rgb data for sRGB (through FlatSamples), the preferred, and thus only mutable, alternative is one of the standard layouts with packed samples where each channel is its own value.
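
As a concrete example of the first upgrade step (buffer format up to pixel type), a minimal sketch of moving rgb565 samples into an 8-bit RGB buffer; the bit fiddling is the potentially costly part, and the color space is never consulted:

// Expand RGB565 samples into packed RGB8. The 5- and 6-bit fields are widened
// by bit replication so that full intensity maps to 255.
fn rgb565_to_rgb8(samples: &[u16]) -> Vec<u8> {
    let mut out = Vec::with_capacity(samples.len() * 3);
    for &s in samples {
        let r5 = ((s >> 11) & 0x1F) as u8;
        let g6 = ((s >> 5) & 0x3F) as u8;
        let b5 = (s & 0x1F) as u8;
        out.push((r5 << 3) | (r5 >> 2));
        out.push((g6 << 2) | (g6 >> 4));
        out.push((b5 << 3) | (b5 >> 2));
    }
    out
}

fn main() {
    // 0xFFFF is full intensity in every channel.
    assert_eq!(rgb565_to_rgb8(&[0xFFFF]), vec![255, 255, 255]);
}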

@fintelia (Contributor, Author)

> What do you mean by this? No current image type can represent this. If it refers to the decoder's ability to write to a raw Vec<u8> instead of only being able to output to an image, then yes, it's indeed useful.

Yes, exactly.

> Upgrading from one level to the next is then an explicitly, potentially costly and fallible operation.

I'd really like to have some way to do (potentially lossy) conversions that cannot fail. Even if we have to guess about the color space and do an expensive conversion, it would be really nice for users to have a way to just say "I don't care how, just give me an RGBA8 image I can display on the screen".

Do you have a sense of what you'd like the API to look like? At this point I'm envisioning something like:

// Output from image decoders, doesn't need to actually be grouped into a struct
struct RawOutput {
    width: u32,
    height: u32,
    color_type: ColorType, // enum from original post
    color_space: ColorSpace, // new enum: rec.709, etc.
    data: Vec<u8>,
}

struct DynamicImage(DynamicImageInner, Option<ColorSpace>);
enum DynamicImageInner {
    Rgb8(...),
    Rgba8(...),
    ...
}
impl DynamicImage {
    fn from_raw(raw: RawOutput) -> Self {
        // (a real check would also account for the bytes per pixel of color_type)
        assert_eq!(raw.width as usize * raw.height as usize, raw.data.len());
        ...
    }
}

@birktj (Member) commented Feb 11, 2019

What about some sort of two-tiered ColorType with two different ColorType enums?
One for color types that can be used in ImageBuffer without conversion or failure, and another for those that can maybe be converted, possibly with an expensive conversion.

This would allow ImageDecoder::colortype to return a larger space of possible colors, while keeping the rest of the library from having to support thousands of color types. It would also allow users to create their own Pixel implementation if they need a color type that is not supported at the first-tier level.

If something happens with #793 this will be especially important.
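
A minimal sketch of the split I have in mind (variant names and the exact sets are just placeholders):

// Hypothetical two-tier split: the small enum is what ImageBuffer/DynamicImage
// handle directly, the large one is what decoders may report.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum CoreColorType { L8, La8, Rgb8, Rgba8 }

#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ExtendedColorType {
    // Everything from the first tier...
    L8, La8, Rgb8, Rgba8,
    // ...plus formats only some decoders produce.
    L1, Rgb565, Bgr8, Bgra8,
}

impl ExtendedColorType {
    // First-tier types pass through; the rest need a (possibly expensive)
    // conversion or an UnsupportedColor-style error.
    fn to_colortype(self) -> Option<CoreColorType> {
        match self {
            ExtendedColorType::L8 => Some(CoreColorType::L8),
            ExtendedColorType::La8 => Some(CoreColorType::La8),
            ExtendedColorType::Rgb8 => Some(CoreColorType::Rgb8),
            ExtendedColorType::Rgba8 => Some(CoreColorType::Rgba8),
            _ => None,
        }
    }
}

fn main() {
    assert_eq!(ExtendedColorType::Rgb8.to_colortype(), Some(CoreColorType::Rgb8));
    assert_eq!(ExtendedColorType::Rgb565.to_colortype(), None);
}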

@fintelia (Contributor, Author) commented Feb 11, 2019

@birktj I'm not sure whether having decoders return more color types would be an improvement, since it creates more cases for users of the library to handle. Though with suitable conversion support, your idea could work. Each variant of ExtendedColorType (better name needed...) would have an associated core color type as well as a batch conversion function that converts a slice in the extended format into one in a core format. In this design, encoders would only support the standard core color types.
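
One of those batch conversion functions could be as small as this sketch (a hypothetical helper covering only the BGR8 case, to illustrate the shape):

// Convert a packed BGR8 slice into packed RGB8 by swapping the first and
// third sample of every pixel; one of the cheap, lossless batch conversions.
fn bgr8_to_rgb8(bgr: &[u8]) -> Vec<u8> {
    assert_eq!(bgr.len() % 3, 0);
    bgr.chunks_exact(3).flat_map(|p| [p[2], p[1], p[0]]).collect()
}

fn main() {
    assert_eq!(
        bgr8_to_rgb8(&[10, 20, 30, 40, 50, 60]),
        vec![30, 20, 10, 60, 50, 40]
    );
}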

@birktj (Member) commented Feb 11, 2019

My idea was more that there wouldn't really be any conversion support to begin with. You would probably have an ExtendedColorType::to_colortype(&self) -> ImageResult<Colortype>, but the main reason for the type would be to accommodate those who would like to use the decoders directly and work with rarer color types without getting in the way of the rest of the library.

I guess that if you want to use the decoders directly, one of the reasons would be to get access to the arbitrary byte data; in that case it may be useful to have a larger set of possible color types. If you need full library support for your color type, you would simply call to_colortype and maybe get some sort of UnsupportedColortype error.

This may also help with all of the unimplemented!() branches that are scattered around the current ColorType handling.

@fintelia (Contributor, Author)

What should the library do if it is asked to open an image file with a color type that isn't one of the core formats? It sounds like you are suggesting that we just return an error rather than try to convert it, but I'm not sure I understand why. There are tons of formats like 4-bit RGBA or RGBX that can very easily and losslessly be converted to core color types, so why shouldn't we just do that?

@birktj (Member) commented Feb 11, 2019

Of course we should convert if we can. But for some formats, adding conversion may be harder than for others, so I was thinking you could add the enum now and then implement conversions progressively. Given all of the different pixel formats out there, adding support for everything at once is a huge undertaking, so you want some simple, backwards-compatible way to roll out conversions, which I think this is.

@HeroicKatora (Member) commented Feb 11, 2019

Exactly this. We know that some exotic color formats exist, but offering them all in the same API suggests they are supported equally well, or at least comparably. Returning one of the exotic color types doesn't invalidate the usability of the decoder itself, but the library cannot necessarily support every operation on them and would sometimes return an Err(Error::UnsupportedColor), while for the core set of color types every operation should work without error.

@fintelia (Contributor, Author) commented Feb 12, 2019

I started trying to convert image to have two separate color type enums. I very quickly noticed that not just decoders but also encoders needed to operate on ExoticColorTypes to avoid loss of functionality (some encoders support 1-bit black/white images even though the rest of the library doesn't). The only other uses of ColorType in the library are for our "poor man's RTTI" on Pixel, and to indicate what type of data is stored in a DynamicImage, which is itself already an enum over the color types it supports...

@fintelia (Contributor, Author) commented Feb 14, 2019

@birktj @HeroicKatora I tried out implementing a split color type enum, but didn't find the result all that compelling. What are your thoughts?

Through this process, I'm leaning more towards defining a non-exhaustive enum of color types, with some programmatic way to see what functionality is supported for each color type, plus a required canonical form to convert each color type to/from, drawn from the set {L, LA, RGB, RGBA} x {u8, u16, f32, ...}.
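
Roughly the shape I'm picturing, as a sketch (the variant set and names are illustrative, not final):

// Sketch: one open-ended enum, each variant mapping to a canonical form drawn
// from {L, La, Rgb, Rgba} x {u8, u16, f32}.
#[allow(dead_code)]
#[non_exhaustive]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ColorType {
    L8, La8, Rgb8, Rgba8,
    L16, La16, Rgb16, Rgba16,
    Rgb32F, Rgba32F,
    // Formats only some codecs deal with directly:
    L1, Bgr8, Bgra8, Rgb565,
}

impl ColorType {
    /// The canonical type every decoder output can be converted to/from.
    pub fn canonical(self) -> ColorType {
        match self {
            ColorType::L1 => ColorType::L8,
            ColorType::Bgr8 | ColorType::Rgb565 => ColorType::Rgb8,
            ColorType::Bgra8 => ColorType::Rgba8,
            other => other, // already canonical
        }
    }
}

fn main() {
    assert_eq!(ColorType::Rgb565.canonical(), ColorType::Rgb8);
}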

@birktj (Member) commented Feb 14, 2019

I understand what you mean, CoreColorType isn't really used that much.

I think a non-exhaustive enum is good, but I would like some more variants; personally I would really like CMYK.

Could something like this be used to convert between representations? It should allow converting between representations that are not memory-backed while still having the potential to be reasonably performant.

trait PixelConvert<To: Pixel> {
    fn convert<R: Read, W: Write>(from: &mut R, to: &mut W) -> ImageResult<()>;
}

impl<T: Primitive> PixelConvert<Bgr<T>> for Rgb<T> {...}

impl ColorType {
    fn can_convert(&self, to: ColorType) -> bool;
    fn can_convert_list(&self) -> Vec<ColorType>; // To enumerate?
    fn convert<R: Read, W: Write>(&self, other: ColorType, from: &mut R, to: &mut W) -> ImageResult<()>;
}

@HeroicKatora (Member) commented Feb 15, 2019

@fintelia I think the reason it's not compelling enough is the lack of other color representations that do not fit the core type list. Add rgb555 etc. to the non-core type, and I think the picture is much clearer. For a (too) long list of candidates, maybe look at the OpenGL/Vulkan buffer formats.

@HeroicKatora (Member) commented Feb 15, 2019

Regarding converters, I would build the API with some ideas borrowed from ffmpeg.

struct Converter<From: Pixel, To: Pixel> { _priv: (), /* ... */ }

trait PixelConvert<To: Pixel>: Pixel {
    fn converter() -> Converter<Self, To>;
}

impl<From: Pixel, To: Pixel> Converter<From, To> {
    fn iterate<I: Iterator<Item=From>>(&self, iter: I) -> ToIter<I, To> /* impl Iterator<Item=To> */;
    fn io(&self, input: impl Read, out: impl Write) -> io::Result<usize>;
    ...
}

@fintelia (Contributor, Author)

@HeroicKatora I like that design, though it would have to be in addition to a dynamically typed version like the one @birktj described. (You frequently don't know what type an image will be until you open it...)

Regarding CoreColorType, I'm concerned that anywhere we use it in the library will only constrain functionality rather than expand it. Decoders and encoders can't be limited to only taking core color types. Similarly, making implementations of Pixel return a CoreColorType limits which implementations of Pixel we can have, an issue DynamicImage also runs into. Perhaps all of those cases should actually use a normal ColorType, and CoreColorType should be reserved for when we are doing conversions?

Another question: I've been assuming that the CoreColorType enum is exhaustive, meaning that users can match over all its variants, but it can't be expanded once we hit 1.0. Is this consistent with your thinking?

@HeroicKatora (Member)

> Decoders and encoders can't be limited to only taking core color types.

Not at all, but that would be a possible clean boundary for ImageError::UnsupportedColor(_). Meaning, the non-abstracted version of the decoder can return a buffer with a BroadColorType, but the DynamicImage representation will return an Err(ImageError::UnsupportedColor(_)) when trying to import the image with the generic method, presumably with the error message hinting at this restriction, and only when automatic conversion into one of the supported color types is infeasible.

> Another question: I've been assuming that the CoreColorType enum is exhaustive, meaning that users can match over all its variants, but it can't be expanded once we hit 1.0. Is this consistent with your thinking?

Yes, more or less. CoreColorType is exhaustive while the other is not. Note that, of course, this doesn't make it impossible to add mutable buffers for other types, just that those would not be part of the 1.0 equivalent of DynamicImage.

@fintelia (Contributor, Author)

So, to make sure I understand: you are imagining that CoreColorType is really an enumeration over the color types supported by DynamicImage? And that Pixel, and by extension GenericImage, would allow color types outside the core ones (constrained by which ones can provide the full interface required by the Pixel trait)?

@HeroicKatora (Member)

Oh right, making it non_exhaustive is a form of finalization ^^
