Finalize ColorType enum #855
Could it be a possibility to differentiate between the format of a pixel, the available color types, and the color space? In any case, reinterpreting an image under a different color space with the same color type and pixel format should always be a trivial operation, such that decoders can supply color space information independently and the programmer can override it without effort.
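A minimal sketch of that "trivial reinterpretation" idea, assuming a hypothetical Image struct that stores the color space as a separate tag alongside the pixel data:

struct Image {
    data: Vec<u8>,
    color_type: ColorType,
    color_space: ColorSpace,
}

impl Image {
    // Only the tag changes; no pixel data is touched, so this is
    // free and cannot fail. Converting between color types is the
    // only costly operation.
    fn reinterpret(mut self, space: ColorSpace) -> Image {
        self.color_space = space;
        self
    }
}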
I am in favor of keeping color space information separate from the pixel format and color type. At the same time I'm reluctant to try to divide pixel format and color type; I think it would end up being more trouble than it is worth. This way we have only one enum and all possible variants make sense. It would take some arcane type trickery to prevent nonsensical formats like 2-channel color types in BGR order, to avoid having to litter the code with handling for cases that can never occur.
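To illustrate that concern with a hypothetical split representation (these types are not proposed API): if layout and channel order were independent axes, the type system would happily accept combinations no decoder can produce.

// Purely illustrative.
struct PixelFormat { channels: u8, bit_depth: u8 }
enum ChannelOrder { Gray, Rgb, Bgr }

// (PixelFormat { channels: 2, bit_depth: 8 }, ChannelOrder::Bgr) type-checks
// but is meaningless, so every consumer would need a catch-all error arm.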
Thinking about it more, the Palette type might also be out of place. All of the other color types allow direct interpretation of the color data, but that one needs additional metadata and would be very difficult to operate on without conversion. Doing that conversion could be (and maybe already is?) part of the decoding process when loading an image.
An interesting point. Doing the conversion during decoding is more troublesome to implement for decoders (and currently they decide their output format), while doing it strictly after reading the image has less coupling but more resource consumption (memory, CPU) because we store the full intermediate result.
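For concreteness, the "strictly after reading" variant amounts to something like this sketch (function name hypothetical):

// Expand indexed pixels into packed RGB8 using the palette metadata.
fn expand_palette(indices: &[u8], palette: &[[u8; 3]]) -> Vec<u8> {
    let mut out = Vec::with_capacity(indices.len() * 3);
    for &i in indices {
        out.extend_from_slice(&palette[i as usize]);
    }
    out
}

The indices buffer here is exactly the intermediate result that costs the extra memory.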
(Minor questions: does anything use the Xbox 360's 7e3 RGB HDR format these days? How about packed 16-bit-per-pixel RGBA as 1555, 565, or 4444? I think there are some 10-bit-per-channel R, G, B formats as well these days for modern monitors. Is the RGB16 fixed or floating point? There is a nice 16-bit float format. It sounds from the original comment like the previous system might have allowed general purpose encoding of the RGB fields; given all the options that have appeared, maybe that isn't such a bad idea.)
@dobkeratops The current design isn't that general. It is just a layout plus a single u8 to indicate bit depth. Thus channels are assumed to be unsigned ints (no floating point formats), and all channels have to be the same depth (so RGB10A2, RGB565, etc. are not supported). Currently, the main use for ColorType is to describe what the raw bytes produced by a decoder mean, so the more variants we have, the more cases users of the library have to handle. That is already enough of a hassle that our own image loading code gives up when encountering anything but the most common color types.
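A rough sketch of the shape being described, with the single u8 carrying the per-channel bit depth (the variant set is assumed for illustration, not quoted):

enum ColorType {
    Gray(u8),
    GrayA(u8),
    RGB(u8),
    RGBA(u8),
    BGR(u8),
    BGRA(u8),
    Palette(u8),
}

Gray(8) or RGB(16) are expressible here, but RGB565 is not, because all channels must share one depth.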
I think the biggest current flaw, or constraint to overcome, would be to allow pixels not to be packed in memory. This is required by …
What do you mean by pixels not being packed? Are you talking about planar images that store the channels separately? We could probably accommodate those by adding PlanarXXX variants to ColorType. Converting to DynamicImage should also definitely be easier; I'd like there to be an infallible from_raw function for that which takes image dimensions, a ColorType, and the raw bytes.
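For readers unfamiliar with the distinction, the two layouts hold the same three RGB pixels in different byte orders:

// Interleaved (packed): R0 G0 B0 R1 G1 B1 R2 G2 B2
// Planar:               R0 R1 R2 G0 G1 G2 B0 B1 B2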
Not without more changes. These could not implement Pixel.
Sounds like a recipe for disaster, but some (not necessarily internal) …
I think we're talking about different things. ColorType is an enum that isn't related to the Pixel trait. Are you thinking of the Rgb, Rgba, etc. structs?
Yes, the Rgb, Rgba, etc. structs.
I think we should have a large set of color types that decoders can produce and then a smaller set of "core" types that can be represented by DynamicImage (supporting all of them would be too much implementation effort).
I believe that to be a symptom of the underlying problem.
What do you mean by this? No current image type can represent this. If this refers to the decoder's ability to write to a raw buffer, … Hence the three levels of descriptors: pixel format, color type, and color space.
Upgrading from one level to the next is then an explicit, potentially costly and fallible operation (disregarding potentially future supported pixel types). But for every higher level of description there should be one 'preferred' format in the level below, even where alternatives such as planar layouts are possible.
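A sketch of the three-level split, assuming the levels are pixel format, color type, and color space as discussed earlier in the thread (all names hypothetical):

// Level 1: raw memory layout of one pixel, no meaning attached.
enum PixelFormat { U8x1, U8x3, U8x4 }

// Level 2: a layout plus channel semantics.
enum ColorType { L8, Rgb8, Bgr8, Rgba8 }

// Level 3: channel semantics plus a color-space interpretation.
struct ColorDescriptor {
    color_type: ColorType,
    color_space: ColorSpace,
}

impl ColorType {
    // The one 'preferred' representation in the level below.
    fn preferred_format(self) -> PixelFormat {
        match self {
            ColorType::L8 => PixelFormat::U8x1,
            ColorType::Rgb8 | ColorType::Bgr8 => PixelFormat::U8x3,
            ColorType::Rgba8 => PixelFormat::U8x4,
        }
    }
}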
Yes, exactly.
I'd really like to have some way to do (potentially lossy) conversions that cannot fail. Even if we have to guess about the color space and do an expensive conversion, it would be really nice for users to have a way to just say "I don't care how, just give me an RGBA8 image I can display on the screen". Do you have a sense of what you'd like the API to look like? At this point I'm envisioning something like:

// Output from image decoders; doesn't need to actually be grouped into a struct
struct RawOutput {
width: u32,
height: u32,
color_type: ColorType, // enum from original post
color_space: ColorSpace, // new enum: rec.709, etc.
data: Vec<u8>,
}
struct DynamicImage(DynamicImageInner, Option<ColorSpace>);
enum DynamicImageInner {
    Rgb8(...),
    Rgba8(...),
    ...
}
impl DynamicImage {
    fn from_raw(raw: RawOutput) -> Self {
        // A real check would also account for the bytes per pixel
        // implied by raw.color_type.
        assert_eq!(raw.width as usize * raw.height as usize, raw.data.len());
        ...
    }
}
What about some sort of two-tiered ColorType, with possibly two different ColorType enums? If something happens with #793, this will be especially important.
@birktj I'm not sure whether having decoders return more colortypes would be an improvement, since it creates more possible cases for users of the library to handle. Though with suitable conversion support your idea could work. Each variant of …
My idea was more that there wouldn't really be any conversion support to begin with; you would probably get an error when a decoder produces a non-core format. I guess that if you want to use the decoders directly, one of the reasons would be to get access to the arbitrary byte data; in that case it may be useful to have a larger set of possible colortypes. If you need full library support for your colortype you would simply call a conversion first. This may also help with all of the …
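A sketch of the two-tier arrangement being described, using hypothetical names; the fallible step is the only bridge between the tiers:

// Everything decoders may emit; non-exhaustive so formats can be added.
#[non_exhaustive]
enum ExtendedColorType { L8, Rgb8, Rgba8, Bgr8, Rgb565, Rgba4 }

// The small set the rest of the library fully supports.
enum ColorType { L8, La8, Rgb8, Rgba8 }

impl ExtendedColorType {
    // Returns an error for exotic formats until a conversion is written.
    fn to_core(self) -> ImageResult<ColorType> { ... }
}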
What should the library do if it is asked to open an image file with a color type that isn't one of the core formats? It sounds like you are suggesting that we just return an error rather than try to convert it, but I'm not sure I understand why. There are tons of formats like 4-bit RGBA or RGBX that can very easily and losslessly be converted to core color types, so why shouldn't we just do that?
Of course we should convert if we can. But for some formats adding conversion may be harder than for others, so I was thinking you could add the enum now and then implement conversion progressively. Given all of the different pixel formats out there, adding support for everything at once is a huge undertaking, so you want some sort of simple backwards-compatible way to implement conversion, which I think this is.
Exactly this. We know that some exotic color formats exist, but offering all of them in the same API suggests they are supported equally well, or at least comparably. Returning one of the exotic color types doesn't invalidate the usability of the decoder itself, but the library can not necessarily map all operations onto them, sometimes returning an error instead.
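This progressive scheme leans on keeping the extended enum non-exhaustive: every match needs a wildcard arm, so conversions can be filled in variant by variant without breaking callers. A sketch (the helper functions and error variant are hypothetical):

match color_type {
    ExtendedColorType::Rgb8 => convert_rgb8(bytes),
    ExtendedColorType::Rgba8 => convert_rgba8(bytes),
    // Formats without a conversion yet report an error instead of
    // causing a compile failure when new variants appear.
    other => Err(ImageError::UnsupportedColor(other)),
}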
I started trying to convert image to have two separate color type enums. I very quickly noticed that not just decoders, but also encoders needed to operate on the extended set of color types.
@birktj @HeroicKatora I tried out implementing a split color type enum, but didn't find the result all that compelling. What are your thoughts? Through this process, I'm leaning more towards defining a non-exhaustive enum of color types with some programmatic way to see what functionality is supported for each color type, plus a required canonical form to convert each color type to/from, drawn from the set {L, LA, RGB, RGBA} x {u8, u16, f32, ...}.
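A sketch of that canonical-form requirement, with illustrative variant names:

impl ColorType {
    // Every color type names one core form from
    // {L, LA, RGB, RGBA} x {u8, u16, f32, ...} to convert to/from.
    fn canonical(self) -> ColorType {
        match self {
            ColorType::Bgr8 => ColorType::Rgb8,   // reorder channels
            ColorType::Rgba4 => ColorType::Rgba8, // widen channels
            ColorType::L1 => ColorType::L8,       // unpack bits
            other => other, // core forms map to themselves
        }
    }
}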
I understand what you mean. I think a non-exhaustive enum is good, but I would like some more variants. Could something like this be used to convert between representations?

trait PixelConvert<To: Pixel> {
fn convert<R: Read, W: Write>(from: &mut R, to: &mut W) -> ImageResult<()>;
}
impl<T: Primitive> PixelConvert<Bgr<T>> for Rgb<T> {...}
impl ColorType {
fn can_convert(&self, to: ColorType) -> bool;
fn can_convert_list(&self) -> Vec<ColorType>; // To enumerate?
fn convert<R: Read, W: Write>(&self, other: ColorType, from: &mut R, to: &mut W) -> ImageResult<()>;
}
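For illustration, calling such a trait might look like this (the input buffer rgb_bytes is assumed):

use std::io::Cursor;

let mut input = Cursor::new(rgb_bytes);  // raw RGB stream
let mut output: Vec<u8> = Vec::new();    // receives BGR bytes
<Rgb<u8> as PixelConvert<Bgr<u8>>>::convert(&mut input, &mut output)?;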
@fintelia I think the reason why it's not compelling enough is the lack of other color representations that do not fit in the core types' list.
Regarding converters, I would have built the API with some ideas from FFmpeg.
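For reference, FFmpeg's libswscale splits configuring a conversion (sws_getContext) from running it (sws_scale), so the setup cost is paid once and the converter is reused per frame. A sketch of that shape transplanted here, all names hypothetical:

struct Converter {
    from: ColorType,
    to: ColorType,
    // Precomputed lookup tables, dithering state, etc. would live here.
}

impl Converter {
    // Fails up front if no conversion path exists.
    fn new(from: ColorType, to: ColorType) -> ImageResult<Converter> { ... }

    // Cheap to call repeatedly once constructed.
    fn run(&self, input: &[u8], output: &mut [u8]) -> ImageResult<()> { ... }
}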
@HeroicKatora I like that design, though it would have to be in addition to a dynamically typed version like the one @birktj described. (You frequently don't know what type an image will be until you open it...)

Regarding CoreColorType, I'm concerned that anywhere we use it in the library will only constrain functionality rather than expand it. Decoders and encoders can't be limited to only taking core color types. Similarly, making implementations of Pixel return a CoreColorType limits which implementations of Pixel we can have, an issue DynamicImage also runs into. Perhaps all of those cases should actually use a normal ColorType, and CoreColorType should be reserved only for doing conversions?

Another question: I've been assuming that the CoreColorType enum is exhaustive, meaning that users can match over all its variants, but it can't be expanded once we hit 1.0. Is this consistent with your thinking?
Not at all, but that would be a possibly clean boundary for …
Yes, more or less.
So I understand, you are imagining that …
Oh right, making it …
The ColorType enum is used to represent the meaning of the subcomponents of Pixels and Decoders. However, it has a few issues: …
One option is to replace the enum with something similar to the following (based on the supported modes in the Pillow library):
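A plausible shape for such an enum, loosely following Pillow's mode names (the exact variants here are illustrative, not quoted from the proposal):

enum ColorType {
    L8,      // 8-bit luminance, Pillow "L"
    La8,     // luminance with alpha
    Rgb8,    // Pillow "RGB"
    Rgba8,   // Pillow "RGBA"
    L16,
    Rgb16,
    Rgba16,
    Rgba32F, // 32-bit float per channel
}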
Questions
[ ] Do we want to include BGR / BGRA formats?
[ ] Should we include U1 (packed 1 bit per pixel, 8 pixels per byte)?
[ ] Should we define a smaller set of "core" types, say Luminance + RGB + RGBA + RGBA32F, and then focus on conversions to and between them?