
Let's talk about Pixel #1523

Open

johannesvollmer opened this issue Aug 1, 2021 · 18 comments

@johannesvollmer
Contributor

johannesvollmer commented Aug 1, 2021

Hey! I've been trying to implement f32 support for a while now, but progress always stops when I try to modify Pixel in a backwards-compatible and sane way. While we could probably think of some way to add f32 support without breaking changes, I see no possibility that doesn't drive us further into technical debt. We can't keep adding yet another hack to the pixel trait forever; adding new features will eventually become impossible, and it's already hard. This is because the original Pixel trait design is not flexible enough.
I have been thinking about the Pixel type for a long time now, and I've seen #1099, #956, #1464, #1212, #1013, #1499, #1516. That's why I want to call for action now.

With the upcoming 0.24.0 release, which allows us to make breaking changes, we finally have the chance to improve the design of the trait in a sustainable manner. I propose we redesign the pixel trait right now! The goal of the redesign would be to maximise flexibility, such that the trait lasts a long time without major modifications.

Limitations of the current design

As far as I understand, these are currently the biggest problems with the pixel trait:

  1. It is not generic enough:
    (F32 Variant of Dynamic Image #1511, Add alpha count associated constant to Pixel trait #956, Pixel::map methods should map to arbitrary subpixel type #1464, Why is the FromColor trait not public? #1212).
    The purpose of a trait is to enable generic code.
    Implementing the pixel trait is problematic for multiple reasons.

    • Color conversion methods like to_rgb are not generic. Rust already has traits for conversions, From and Into, and those should be used if anything. Color conversion is more complex than a plain type conversion though, so it should probably be separated out.
  2. It has too many responsibilities:
    (F32 Variant of Dynamic Image #1511, Simplify Pixel trait #1099, implement std::convert::From<T: Pixel> for Pixel #1013, Why is the FromColor trait not public? #1212)
    Piling multiple responsibilities onto a trait is problematic, as each addition reduces its flexibility further. In the std library, by contrast, traits are split up atomically; many contain only a single method definition.
    I identified the following responsibilities in the Pixel trait:

    • Field access. Methods like as_slice, from_slice and friends exist to access the samples in the pixel. Granting access to the sample data should be a separate responsibility, just like image operations and codecs should be separated.
    • Color Type hints. COLOR_TYPE: ColorType is passed to the image encoders.
    • Color conversion. This will be reworked anyways. It should be removed from the pixel trait. This allows us to add color conversion without having to modify the pixel trait, which adds stability to the design.
    • Color operations. It contains some, such as invert and blend, but no other color operations.

    All of these could theoretically be separate traits, especially Blend, Invert, and the to_xyz color conversions. A sketch of that split follows below.
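
For illustration, this is how such atomic traits could look, mirroring the single-method traits in std (all names here are hypothetical, not existing API):

/// Invert each color channel. One trait, one responsibility.
pub trait Invert {
    fn invert(&mut self);
}

/// Blend another pixel into this one, honoring alpha if present.
pub trait Blend {
    fn blend(&mut self, other: &Self);
}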

Furthermore, there are other minor problems:

  • Many algorithms use the channels4 method of the pixel trait.
    • For example, see hue_rotate_in_place. In reality, the hue algorithm will only make sense for Rgb. These algorithms should probably not use the pixel trait in the first place.
    • horizontal_sample should probably use the generic slice methods and work with every possible number of channels. Even more idiomatic would be to treat the pixel as a whole, rather than adding up each sample of a pixel individually; the algorithm would use hypothetical operations such as blend, sum, or average on the pixel. A sketch of that idea follows this list.
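
As a sketch of that idea, a resampling inner loop could accumulate whole pixels through a generic slice API (SlicePixel here refers to the trait proposed in the example design further down; the Copy and Into<f32> bounds are assumed for illustration):

fn weighted_sum<P>(pixels: &[P], weights: &[f32]) -> Vec<f32>
where
    P: SlicePixel,
    P::Sample: Copy + Into<f32>,
{
    assert!(!pixels.is_empty());

    // one accumulator per channel, whatever the channel count is
    let channel_count = pixels[0].as_slice().len();
    let mut sums = vec![0.0f32; channel_count];

    for (pixel, weight) in pixels.iter().zip(weights) {
        for (sum, &sample) in sums.iter_mut().zip(pixel.as_slice()) {
            let sample: f32 = sample.into();
            *sum += sample * weight;
        }
    }

    sums
}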

How to start?

Of course, a redesign is a huge task, and even settling on a design will be challenging. Splitting the pixel up into atomic traits would probably help reduce bikeshedding, as color conversion and similar tasks could be added independently. To speed up the design process, it could also help to list the new requirements of the pixel trait(s).

Maybe we can start with the following list of requirements:

  • Generic over Sample Type (works with u8, u16, f32, or even f16 if the user wants to implement that)
  • Single Responsibility: Field Access. No Color Conversion. Helper functions and color type declarations are done with additional traits and extension traits.
  • Generic over Alpha Channel.
  • Supports RGB and Luma. Maybe even user-defined colors such as YCbCr? Can we have add and blend for those?
  • Contiguous memory: internal storage can be assumed to be a primitive array of fixed size. Therefore, all channels have the same type.
  • Maybe support converting between sample types? pixel.map(|f| (f * 255.0) as u8)?
  • Supports different color spaces already? What about sRGB vs LinearRGB? From my point of view, this is the only open question.

I propose we move color conversion out of the pixel trait into a separate module. We could keep the old conversion algorithms until real color spaces are introduced. A sketch of such a module follows.
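
A minimal sketch of what that module could expose (the trait name is hypothetical; Rgb and Luma are the tuple-struct colors from the example design below):

/// Color conversion as its own trait, outside of the pixel trait. New color
/// models can implement it without the pixel trait ever changing.
pub trait ConvertColor<Target> {
    fn convert_color(self) -> Target;
}

impl ConvertColor<Luma<u8>> for Rgb<u8> {
    fn convert_color(self) -> Luma<u8> {
        let Rgb([r, g, b]) = self;
        // integer approximation of the Rec. 601 luma weights (77 + 150 + 29 = 256)
        Luma([((r as u16 * 77 + g as u16 * 150 + b as u16 * 29) >> 8) as u8])
    }
}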

Example Design

This is by no means a full design; it should rather be understood as an example, showing an extreme variant of the trait separation. I typed this off the top of my head, so it probably doesn't factor in every requirement yet.

The design focuses on separating concerns and providing default implementations for most methods. The code does not make any assumptions about the sample type, except for it being Primitive. The pixel is never assumed to be in a specific color space. The only real assumption made is that there exists an underlying array of samples. Color space models can be introduced to the crate by adding more traits, probably even without changing the existing ones.

/// The most basic information about a Pixel.
/// Maybe the only information that `ImageBuffer` needs at all!
pub trait PrimitivePixel: Sized + Copy + Clone {

    /// The type of each channel in this pixel.
    type Sample: Primitive;

}

/// Access the samples of the pixel.
pub trait SlicePixel: PrimitivePixel {

    /// A slice of all the samples in this pixel, including alpha.
    fn as_slice(&self) -> &[Self::Sample];

    /// A slice of all the samples in this pixel, including alpha.
    fn as_slice_mut(&mut self) -> &mut [Self::Sample];
    
    fn from_iter(iter: impl Iterator<Item=Self::Sample>) -> Self where Self: Default {
        let mut result = Self::default();
        for (target, source) in result.as_slice_mut().iter_mut().zip(iter) {
            *target = source;
        }
        result
    }

    fn into_iter(self) -> SamplesIter<Self::Sample> { ... }

    fn apply(&mut self, mut mapper: impl FnMut(Self::Sample) -> Self::Sample) {
        for value in self.as_slice_mut() {
            *value = mapper(*value);
        }
    }

    fn apply2(&mut self, other: &Self, mut mapper: impl FnMut(Self::Sample, Self::Sample) -> Self::Sample) {
        for (own, other) in self.as_slice_mut().iter_mut().zip(other.as_slice()) {
            *own = mapper(*own, *other);
        }
    }

    fn map(&self, mapper: impl FnMut(Self::Sample) -> Self::Sample) -> Self {
        let mut cloned = self.clone();
        cloned.apply(mapper);
        cloned
    }

    fn map2(&self, other: &Self, mapper: impl FnMut(Self::Sample, Self::Sample) -> Self::Sample) -> Self {
        let mut cloned = self.clone();
        cloned.apply2(other, mapper);
        cloned
    }
}


/// Only implemented for Rgba and LumaA
pub trait AlphaPixel: PrimitivePixel {
    fn alpha(&self) -> &Self::Sample;
    fn alpha_mut(&mut self) -> &mut Self::Sample;
}

/// Implemented for all colors manually.
/// This example design is based on the assumption that
/// all colors with alpha will have alpha as the last channel.
/// The assumption is only needed to provide default implementations for the first two methods.
pub trait MaybeAlphaPixel: SlicePixel {
    const HAS_ALPHA: bool;

    fn alpha_channel_count() -> u8 { if Self::HAS_ALPHA { 1 } else { 0 } }

    fn color_channel_count() -> u8 where Self: ArrayPixel {
        Self::CHANNEL_COUNT - Self::alpha_channel_count()
    }

    fn as_color_and_maybe_alpha(&self) -> (&[Self::Sample], Option<&Self::Sample>) {
        let slice = self.as_slice();
        debug_assert_ne!(slice.len(), 0);

        if Self::HAS_ALPHA { (&slice[..slice.len()-1], slice.last()) }
        else { (slice, None) }
    }

    fn as_color_and_maybe_alpha_mut(&mut self) -> (&mut [Self::Sample], Option<&mut Self::Sample>) {
        let slice = self.as_slice_mut();

        if Self::HAS_ALPHA {
            let (last, color) = slice.split_last_mut().expect("zero samples found");
            (color, Some(last))
        }
        else { (slice, None) }
    }

    fn as_color_slice(&self) -> &[Self::Sample] {
        self.as_color_and_maybe_alpha().0
    }

    fn as_color_slice_mut(&mut self) -> &mut [Self::Sample] {
        self.as_color_and_maybe_alpha_mut().0
    }

    fn apply_without_alpha(&mut self, mut mapper: impl FnMut(Self::Sample) -> Self::Sample) {
        for value in self.as_color_slice_mut() {
            *value = mapper(*value);
        }
    }

    fn apply2_without_alpha(&mut self, other: &Self, mut mapper: impl FnMut(Self::Sample, Self::Sample) -> Self::Sample) {
        for (own, other) in self.as_color_slice_mut().iter_mut().zip(other.as_color_slice()) {
            *own = mapper(*own, *other);
        }
    }

    fn map_without_alpha(&self, mapper: impl FnMut(Self::Sample) -> Self::Sample) -> Self {
        let mut cloned = self.clone();
        cloned.apply_without_alpha(mapper);
        cloned
    }

    fn map2_without_alpha(&self, other: &Self, mapper: impl FnMut(Self::Sample, Self::Sample) -> Self::Sample) -> Self {
        let mut cloned = self.clone();
        cloned.apply2_without_alpha(other, mapper);
        cloned
    }
}

/// Add an alpha channel, for example go from RGB to RGBA, or L to LA. No color conversion is happening here.
/// Implemented by all colors.
pub trait IntoWithAlpha: PrimitivePixel {
    type WithAlpha: PrimitivePixel<Sample=Self::Sample>;
    fn with_alpha(self, alpha: Self::Sample) -> Self::WithAlpha;
}

/// Take away an alpha channel, for example go from RGBA to RGB, or LA to L. No color conversion is happening here.
/// Implemented by all colors.
pub trait IntoWithoutAlpha: PrimitivePixel {
    type WithoutAlpha: PrimitivePixel<Sample=Self::Sample>;
    fn without_alpha(self) -> Self::WithoutAlpha;
}

/// Used to detect the image type when encoding an image.
pub trait InherentColorType {
    const COLOR_TYPE: ColorType;
}

/// Implemented for all colors individually. 
/// Allows converting the sample to a different sample type.
pub trait MapSamples<NewSampleType>: PrimitivePixel {
    type NewPixel: PrimitivePixel<Sample = NewSampleType>;
    fn map_samples(self, mapper: impl FnMut(Self::Sample) -> NewSampleType) -> Self::NewPixel;
}

/// This alias might not be a good idea - it's very monolithic again, just like the old design.
// Most methods should only use the traits they actually need, instead of requiring all these traits.
pub trait Pixel:
    Default + 
    SlicePixel +
    MaybeAlphaPixel +
    IntoWithAlpha + 
    IntoWithoutAlpha
    // not InherentColorType, as it is only useful in a few select circumstances
{ }


// ### example color implementation ###

#[derive(Clone, Copy, Default)]
struct Rgb<T>([T; 3]);

impl<T: Primitive> PrimitivePixel for Rgb<T> { type Sample = T; }

impl<T: Primitive> SlicePixel for Rgb<T> {
    fn as_slice(&self) -> &[Self::Sample] { &self.0 }
    fn as_slice_mut(&mut self) -> &mut [Self::Sample] { &mut self.0 }
}

impl<T: Primitive> MaybeAlphaPixel for Rgb<T> {
    const HAS_ALPHA: bool = false;
}

// non-generic implementations for the color type, as not all sample types may be supported for a color
impl InherentColorType for Rgb<u8> { const COLOR_TYPE: ColorType = ColorType::Rgb8; }
impl InherentColorType for Rgb<u16> { const COLOR_TYPE: ColorType = ColorType::Rgb16; }
impl InherentColorType for Rgb<f32> { const COLOR_TYPE: ColorType = ColorType::Rgb32F; }

impl<T: Primitive> IntoWithAlpha for Rgb<T> {
    type WithAlpha = Rgba<T>;
    fn with_alpha(self, alpha: Self::Sample) -> Self::WithAlpha {
        let [r, g, b] = self.0;
        Rgba([r, g, b, alpha])
    }
}

impl<T: Primitive> IntoWithoutAlpha for Rgb<T> {
    type WithoutAlpha = Self;
    fn without_alpha(self) -> Self::WithoutAlpha { self }
}

impl<T: Primitive, D: Primitive> MapSamples<D> for Rgb<T> {
    type NewPixel = Rgb<D>;

    fn map_samples(self, mapper: impl FnMut(Self::Sample) -> D) -> Self::NewPixel {
        Rgb::from_iter(self.as_slice().iter().cloned().map(mapper))
    }
}
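
// Hypothetical usage of the MapSamples sketch above, converting
// normalized f32 samples to u8:
fn example_map_samples(float_pixel: Rgb<f32>) -> Rgb<u8> {
    float_pixel.map_samples(|f| (f * 255.0) as u8)
}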

// ### inside image buffer and dynamic image ###
struct ImageBuffer<Px, Container> { ... } // no trait bound on the struct definition

impl<Px, Container> ImageBuffer<Px, Container>
    where Px: PrimitivePixel, Container: AsRef<[Px::Sample]>
{
    pub fn from_raw(width: u32, height: u32, buf: Container) -> Option<ImageBuffer<Px, Container>>
        where Px: ArrayPixel // channel count required for the length check
    { ... }

    fn save(&self, write: impl Write) -> ImageResult<()> where Px: InherentColorType { ... }

    fn pixels(&self) -> Pixels<Px> where Px: ArrayPixel { ... }
}

// ### inside rgb algorithms ###
fn hue_rotate(image: impl GenericImage<Pixel = Rgb<f32>>) {
    // for every pixel `color` in the image:
    let [r, g, b] = color.0;
    // modify values
    let color = Rgb([r, g, b]);
}
@johannesvollmer
Contributor Author

johannesvollmer commented Aug 2, 2021

Going back a step and merging some of the unnecessarily split traits, this could be an alternative design:

/// A pixel containing a sample for each channel.
/// Grants access to the sample data, and does nothing more. No conversions.
/// Basically nothing but a trait for wrapped primitive arrays, which is aware of alpha channels.
pub trait Pixel: Sized + Copy + Clone + Default {

    /// The type of each channel in this pixel.
    type Sample: Primitive;
    
    const CHANNEL_COUNT: u8;
    
    const HAS_ALPHA: bool; // assumes no more than one alpha channel per pixel
        
    /// A slice of all the samples in this pixel, including alpha.
    fn as_slice(&self) -> &[Self::Sample];

    /// A slice of all the samples in this pixel, including alpha.
    fn as_slice_mut(&mut self) -> &mut [Self::Sample];
    

    // ### default implementations. these could also be put in a separate PixelExt trait ### 

    fn bytes_per_channel() -> usize { ... }
    fn bytes_per_pixel() -> usize { ... }
    fn alpha_channel_count() -> u8 { ... }
    fn color_channel_count() -> u8 { ... }
    fn from_iter(iter: impl IntoIterator<Item=Self::Sample>) -> Self { ... }
    fn apply(&mut self, mapper: impl FnMut(Self::Sample) -> Self::Sample) { ... }
    fn apply2(&mut self, other: &Self, mapper: impl FnMut(Self::Sample, Self::Sample) -> Self::Sample) { ... }
    fn map(&self, mapper: impl FnMut(Self::Sample) -> Self::Sample) -> Self { ... }
    fn map2(&self, other: &Self, mapper: impl FnMut(Self::Sample, Self::Sample) -> Self::Sample) -> Self { ... }
    fn as_color_slice_and_alpha(&self) -> (&[Self::Sample], Option<&Self::Sample>) { ... }
    fn as_color_slice_and_alpha_mut(&mut self) -> (&mut [Self::Sample], Option<&mut Self::Sample>) { ... }
    fn as_color_slice(&self) -> &[Self::Sample] { ... }
    fn as_color_slice_mut(&mut self) -> &mut [Self::Sample] { ... }
    fn apply_without_alpha(&mut self, mapper: impl FnMut(Self::Sample) -> Self::Sample) { ... }
    fn apply2_without_alpha(&mut self, other: &Self, mapper: impl FnMut(Self::Sample, Self::Sample) -> Self::Sample) { ... }
    fn map_without_alpha(&self, mapper: impl FnMut(Self::Sample) -> Self::Sample) -> Self { ... }
    fn map2_without_alpha(&self, other: &Self, mapper: impl FnMut(Self::Sample, Self::Sample) -> Self::Sample) -> Self { ... }
}


/// Add an alpha channel, for example go from RGB to RGBA, or L to LA. No color conversion is happening here.
/// Implemented by all colors.
// This is a separate trait because it is not simply data access and has an associated type
pub trait WithAlpha: Pixel {
    type WithAlpha: Pixel<Sample=Self::Sample>;
    fn with_alpha(self, alpha: Self::Sample) -> Self::WithAlpha;
}

/// Take away an alpha channel, for example go from RGBA to RGB, or LA to L. No color conversion is happening here.
/// Implemented by all colors.
// This is a separate trait because it is not simply data access and has an associated type
pub trait WithoutAlpha: Pixel {
    type WithoutAlpha: Pixel<Sample=Self::Sample>;
    fn without_alpha(self) -> Self::WithoutAlpha;
}

/// Used to detect the image type when encoding an image.
pub trait InherentColorType {
    const COLOR_TYPE: ColorType;
}

/// Implemented for all colors individually. 
/// Allows converting the sample to a different sample type.
// This is a separate trait because it has an associated type
pub trait MapSamples<NewSampleType>: Pixel {
    type NewPixel: Pixel<Sample = NewSampleType>;
    fn map_samples(self, mapper: impl FnMut(Self::Sample) -> NewSampleType) -> Self::NewPixel;
}



// ### example color implementation ###

#[derive(Clone, Copy, Default)]
struct Rgb <T> ([T; 3]);

impl<T> Pixel for Rgb<T> where T: Primitive { 
    type Sample = T; 
    const HAS_ALPHA: bool = false;
    const CHANNEL_COUNT: u8 = 3;
    
    fn as_slice(&self) -> &[Self::Sample] { &self.0 }
    fn as_slice_mut(&mut self) -> &mut [Self::Sample] { &mut self.0 }
}


// non-generic implementations for the color type, as not all sample types may be supported for a color
impl InherentColorType for Rgb<u8> { const COLOR_TYPE: ColorType = ColorType::Rgb8; }
impl InherentColorType for Rgb<u16> { const COLOR_TYPE: ColorType = ColorType::Rgb16; }
impl InherentColorType for Rgb<f32> { const COLOR_TYPE: ColorType = ColorType::Rgb32F; }


impl<T> WithAlpha for Rgb<T> where T: Primitive {
    type WithAlpha = Rgba<T>;
    fn with_alpha(self, a: Self::Sample) -> Self::WithAlpha {
        let [r,g,b] = self.0;
        Rgba([r,g,b,a])
    }
}

impl<T> WithoutAlpha for Rgb<T> where T: Primitive {
    type WithoutAlpha = Self;
    fn without_alpha(self) -> Self::WithoutAlpha { self }
}

impl<T, D> MapSamples<D> for Rgb<T> where T: Primitive, D: Primitive {
    type NewPixel = Rgb<D>;

    fn map_samples(self, mapper: impl FnMut(Self::Sample) -> D) -> Self::NewPixel {
        Rgb::from_iter(self.as_slice().iter().cloned().map(mapper))
    }
}
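
// Hypothetical usage of the merged design (assumes an analogous Rgba
// implementation exists):
fn example_usage(pixel: Rgb<u8>) {
    let _with_alpha: Rgba<u8> = pixel.with_alpha(255);
    let _as_float: Rgb<f32> = pixel.map_samples(|s| s as f32 / 255.0);
}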

@drewcassidy
Contributor

I wonder if it might make sense to not use any enumerations for color formats, and instead just have RGB8 et al. be type aliases. Const generics could also allow for >4 channel images. Perhaps the need to avoid changing the API is unfounded and it's time for a 1.0.0?

@fintelia
Contributor

@drewcassidy I'm not sure I follow. The reason we use enums for color formats is that users can load files at runtime that can be in any format. The PngDecoder::color_type method returns ColorType to signal what format the PNG is in, and that determines which variant of DynamicImage the higher-level methods like image::open would return.
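
To make that concrete, a sketch using the crate's existing API (only a subset of the variants is shown):

use image::DynamicImage;

fn handle(img: DynamicImage) {
    // the variant is only known at runtime, after the decoder has
    // inspected the file
    match img {
        DynamicImage::ImageRgb8(_buffer) => { /* 8-bit RGB path */ }
        DynamicImage::ImageRgba8(_buffer) => { /* 8-bit RGBA path */ }
        _ => { /* other runtime-determined formats */ }
    }
}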

@ripytide
Member

ripytide commented Jun 1, 2024

Further to the excellent suggestions so far, we could move the pixel trait and types into a separate crate so other crates in the ecosystem can also make use of them.

I'd like to work on writing up this V2 pixel trait in a new crate and then attempting to refactor the image crate onto it, @fintelia are you able to create a new repository in the image github organization for it?

At the moment I've started work on my pixeli repo, but I'd love to transfer that repo to the image org if possible.

@fintelia
Contributor

fintelia commented Jun 2, 2024

This is certainly a messy part of the API. If you especially want to work on this, I'd encourage you to read through the existing issues on Pixel and GenericImage and so forth, as well as the discussion on color spaces and on color type conversions. Without factoring in those considerations, we'd likely end up with a lot of the same issues as the current API.

TBH I'm kind of burned out trying to fix this part of the API. It is a huge undertaking that doesn't have any clear best solution. Plus, even small changes like removing BGR component order can turn out to be quite disruptive for users. (At the time we didn't even know anyone was using that!) Not to mention that there's a big risk of things stalling out before any changes have landed.

@ripytide
Member

ripytide commented Jun 2, 2024

I have read through a lot of the issues discussed, and while I can't guarantee that my current approach (based on the suggestions given above) will solve all of them, I think it is a definite improvement. Moving all pixel-related code into a separate crate will also make this crate a lot cleaner.

I'll try to send a PR switching this crate to use pixeli tomorrow.

@johannesvollmer
Contributor Author

The difficulties mentioned by fintelia are real. I think a good first step might be to come up with a strategy to tackle them.

Wasn't there a plan to move the image processing out of this library into a separate one, and make this library I/O only? Is this still a goal?

@fintelia
Contributor

fintelia commented Jun 8, 2024

Wasn't there a plan to move the image processing out of this library into a separate one, and make this library I/O only? Is this still a goal?

At one point there was talk of splitting out ImageDecoder and some of the other related traits and enums into an image-core crate. The idea was that that portion could rapidly iterate to 1.0, and then at least part of the API would be fully stable. We kinda gave up on that when we realized that every breaking release of image-core would require a breaking release of image, and thus be even worse than gradually improving those items the way we already were.

More recently, there's been some talk about moving some of the more advanced image processing methods to imageproc. It hasn't been a priority because those methods mostly just live off in their own module and don't require much maintenance, but there isn't much harm either. Under this plan, core image processing types/methods like Pixel or crop would stay in this crate.

@ripytide
Member

ripytide commented Jun 8, 2024

Wasn't there a plan to move the image processing out of this library into a separate one, and make this library I/O only? Is this still a goal?

It's being tracked in #2238, progress is slow but hopefully steady.

@HeroicKatora
Member

HeroicKatora commented Jun 8, 2024

From #2255 (comment) @ripytide

The two key points missing are awareness of color spaces and single responsibility. Instead, this design encodes even more information into the type system and it will run into the same troubles.

Good point, I didn't tackle the color space issue at the same time. In that regard, the four choices I can think of for attaching a color space to pixels are:

1. Enum in Pixel types:
enum ColorSpace {
    Unknown,
    Linear,
...
}
struct Rgba<T> {
    r: T,
    g: T,
    b: T,
    a: T,
    color_space: ColorSpace,
}
2. Enum in Image Type
struct ImageBuffer<P> {
    pixels: Vec<P>,
    color_space: ColorSpace,
}
3. Same as 1, but using a `PhantomData<U>` with a generic tag-type, like the `euclid` [`Point`](https://docs.rs/euclid/latest/euclid/struct.Point2D.html) types.

4. Same as 2, but using `PhantomData` tags as well.

As you say, the type-system-heavy approaches in 3 and 4 are harder to reason about and learn (is that similar to how gfx did things?). 1 seems terrible for storage space, which leaves option 2 as probably the best given all the tradeoffs, I think, but I don't know if you had something else in mind?

With option 2, I think FromPixelCommon would change to something like:

trait FromPixelCommon<P> {
    fn from_pixel_common(pixel: P, source_color_space: ColorSpace, destination_color_space: ColorSpace) -> Self;
}

Am I understanding correctly?

On the single-responsibility aspect, which bit of this design did you see going against that principle? Are you suggesting moving the helper functions such as component_array(), color_component_array(), and the map_*() functions to an extension trait? I think I'd probably be fine with that, but I'm not sure about the advantages/disadvantages of separating them either.

(1.) doesn't work if pixels are to be in-place in a buffer. (2.) implies that the current trait methods for Pixel all need this additional context. Not impossible but definitely requires design work. You might enjoy #1718 as an approach going in this direction. (3.) and (4.) are the rightfully critiqued and unwieldy options that didn't work out in gfx.


What do we have the Pixel trait and strong typing for? It is mostly a matter of beginner-friendly input-output design. One person with a design in mind needs a simple way of defining pixels; another needs a simple way of getting pixels out in some workable format to pass on (to textures). In both of these cases the expert users will look for texel byte buffers; I don't think we need to worry about them too much when designing the indexing-based ImageBuffer.

However, we don't need to break too much either. If we constrain the roles of existing generic buffer types to the 'simplified user interface', we shouldn't have too many problems with giving well-defined constructors for the powerful buffers that will need care and runtime representations.

I'd like to propose the following stance, since I think it can be done incrementally:

  • ImageBuffer and DynamicImage are seen as types to give semantics to a pixel matrix, and nothing more. In particular we'd solve metadata, image sequences, color spaces in a separate type with constructions from each of these.
  • We stick with the above design for it, and in particular add "Add missing map2/apply2 functions to Pixel trait" #2241. These are well-defined operations for all sRGB-like representations, and definitely useful.
  • The current implementations of Pixel strongly imply sRGB as a dynamic invariant for their types; that is something operations on them should assume. This allows the trait to be closed and non-generic, resolving a few of the design conflicts. For Pixel<f32> we try not to clamp. Out-of-range components should be treated via identity for all transforms (unless someone knows of a standard with another approach).
  • The Pixel type stays used for ImageBuffer, not necessarily in the interface beyond. The main impact is imageops. This is okay, we don't intend imageops to be usable for any kind of color matrix. I don't think we need to deprecate anything in a single release here.
  • We try to contain the generic arguments and particularly those with a Pixel bound to within ImageBuffer (and flat:: for zero-copy mapping but might re-evaluate the need here).

Beyond Pixel, I'd like to just demonstrate that the plan of simplifying responsibility creates breathing room. If you want to debate any of these, different issues might be appropriate:

  • Rename ImageBuffer to PixelImage or PixelBuffer or PixelMatrix, and DynamicImage accordingly. This makes room for ImageBuffer to be a new type that contains an opaque, non-generic pixel representation. Basically the expert type, but similar to PathBuf/OsString: we don't need to lie about the complexity to the user here either.
  • The ImageDecoder trait gets to return the descriptor object, including metadata. It should be constructible from size and ColorType but not necessarily exclusively from these. It should be possible to fallibly allocate both DynamicImage and ImageBuffer from this descriptor.
  • The descriptor internals are not added as a public interface for the foreseeable future. With awareness that I'm pushing my own crate here: image-canvas already follows these needs and just requires a wrapper following the existing PR. In particular, I hope we can thus design a better descriptor, including interaction with real-world workable solutions such as 4cc, in a separate crate for now, also relieving a bit of pressure here. I'm also hoping to reduce some pressure from having to deal with I/O and operations in image itself.
  • When the new opaque pixel container is added, it must have a conversion to DynamicImage but doesn't promise this to fully preserve semantics. Also it must be constructible from a DynamicImage. It should also be constructible from the descriptor type returned from the decoder.

@ripytide
Member

ripytide commented Jun 9, 2024

I'd agree with all you've said.

The pixel trait is used mostly in imageops and imageproc, from what I've seen, for writing functions which are generic over grayscale and color images (like filter()). In fact, most of the functions in imageproc are pixel-based, hence the strong dependency on the Pixel trait and ImageBuffer types. I don't see the harm in also adding the non-pixel-based canvas which, if I'm understanding correctly, is more heavily used in image encoding/decoding and GPU interaction; that seems like a great use case too.

I'm thinking such a large change as we are discussing might need quite a bit of planning on how best to split it up incrementally, as you suggested. Maybe a tracking issue or project board would be the best way to organize the ordering of all the required changes, with an issue created for each individual minimal change, like renaming a type.

@kornelski
Contributor

kornelski commented Jun 10, 2024

I've seen the talk on problems in gfx-rs, but I don't think it applies to plain pixel types. It seems gfx tried to model more complex types, with the harder constraint of not being in charge of its own API, due to zero-cost wrapping of existing non-Rust ones.

I've successfully written large image processing pipelines using strongly typed pixels. I don't think they are "for beginners". They ensure correct handling of bitmaps, and help with monomorphising code for various layouts (and there's always casting to bytes for code in the style of ptr += bpp).

[u8] in particular is a pain for working with 16-bit data. It has unspecified endianness and insufficient alignment, which prevents casting to [u16].
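
A sketch of the alignment hazard (illustrative code, not from this crate):

let bytes: &[u8] = &[0, 1, 2, 3, 4, 5];
// a byte slice carries no alignment guarantee, so viewing it as u16
// requires a checked reinterpretation; align_to splits off any
// unaligned prefix and suffix
let (prefix, shorts, suffix) = unsafe { bytes.align_to::<u16>() };
// only `shorts` is valid u16 data, and in platform-dependent endianness;
// `prefix` and `suffix` hold the unaligned remainders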

I think types like [RGBA<T>] work well. There is, however, awkwardness around types like DynamicImage:

  • Lack of fat pointers for 2D slices, so a slice of DynamicImage must be another type.
  • There's no nice way to subset an enum for situations like JPEG returning a definitely-opaque image.
  • Can't extend an enum for new formats either, without burdening all users with exotic ones (YUV, planar).
  • Going from enum to trait-based dispatch requires a boilerplate match.

But the alternative of bytes plus some ColorFormat enum has the same problems with enums and traits, minus type safety.

So I think pixel types are fine. A good Pixel trait could help write correct image processing code, beyond 8-bit.

I'm missing a better DynamicImage. The Canvas could be a good replacement, although "just upload to a GPU" is a bit too futuristic for me still, and I'd still like proper APIs for good ol' CPU processing.

I usually need a separate notion of an image file with all the metadata, color profiles, anim frames, and custom extras like file modification timestamps, original file format, and codec version. Then a thinner anim frame that knows its color profile but isn't burdened by non-pixel data, and then 2D slices of pixels at the lowest level for the actual byte pushing.
AFAIK this crate lacks the full-fat image type to wrap DynamicImage in, and image decoders serve as a substitute source of metadata/color profiles.

@johannesvollmer
Contributor Author

johannesvollmer commented Jun 17, 2024

Thanks for your insights @kornelski :)

This again seems to double down on the theory that a large part of the pixel complexity comes from processing. In my personal experience, the value of image-rs is the ability to decode many formats and unify their contents using a single crate dependency. The pixel processing could have lived in a separate crate for all my personal projects so far, without any downside.

The decoding/encoding library could work with a rather dynamic system (an enum for pixel formats), and the imageproc library could provide a statically typed layer on top of the dynamically decoded data. This would also allow users to load obscure data (such as arbitrary exr channels like depth and motion vectors) without the decoder library having to provide static types for these rare use cases.

I really think the most important step towards keeping the library maintainable in the long term would be to pivot and define one single responsibility: load various image files using a unified pixel API. Currently, there seem to be two responsibilities, which compound each other's complexity when combined (the first is de/encoding, the second is pixel processing). I think it will be practically impossible to come to any conclusion in this discussion without separating these two responsibilities. To go even further, I propose we shift the discussion to this topic of separating the two concerns.

The idea would be the following (up for discussion):

  • The image crate provides a runtime-typed container type for all the possible image data, relying on runtime mechanisms. It uses no static typing and offers no pixel processing (similar to the current byte-buffer-based ImageBuffer). Maybe not even a DynamicImage enumeration. Maybe it offers conversion from various color formats into a specific requested format? (A sketch follows below.)
  • The image-proc crate provides a completely separate image container type, using static types and supporting only the most commonly used color types. It definitely includes a DynamicImage type.

While in theory, these two container types could also remain in this one crate, separating into two crates will help us to maintain a clean boundary, and not be tempted to blur the separation again.
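
A hypothetical sketch of the first container (all names are illustrative, not an existing API):

/// Runtime-typed decode result: raw bytes plus a descriptor, no generics.
enum SampleFormat { U8, U16, F32 }

struct RawImage {
    width: u32,
    height: u32,
    channel_count: u8,
    sample_format: SampleFormat,
    /// Interleaved samples; interpreting them requires the fields above.
    data: Vec<u8>,
}

The statically typed crate would then offer fallible conversions from such a container into its own strongly typed buffers.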

@kornelski
Contributor

kornelski commented Jun 18, 2024

Mingling imageproc with this crate is, I think, a separate problem. Even when all the image processing functionality is removed, this crate should still care about pixel types.

If by runtime-typed you mean returning Vec<u8> and some color type description, then this is ill-suited for 16-bit and floating point types due to lack of alignment. It's UB to cast these to Vec<u16> or &[f32], or variations of these like &[RGBA<u16>].

@johannesvollmer
Contributor Author

Yes, we still want some kind of Pixel type, but I still think removing processing would reduce a lot of complexity for the Pixel trait.

sidenote about u8 buffer alignment

@kornelski

If by runtime-typed you mean returning Vec<u8> and some color type description, then this is ill-suited for 16-bit and floating point types due to lack of alignment

If I am not mistaken, the image crate currently handles this challenge correctly. Here's how: when allocating a u8 buffer for an image, the color type is checked first, then a strongly typed buffer is allocated (e.g. [u16]), and this is then cast to a u8 buffer.

@kornelski
Contributor

kornelski commented Jun 26, 2024

then a strong-typed buffer is allocated (e.g. [u16]), and then this is cast to a u8 buffer.

If you do this with Vec<u8>, it is UB: https://doc.rust-lang.org/stable/std/alloc/trait.GlobalAlloc.html#tymethod.dealloc

layout must be the same layout that was used to allocate that block of memory.

Layout contains alignment, and Vec<u8> will always set it to 1.
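
Concretely, the pattern being warned against looks like this (a sketch; this is undefined behavior, shown only to illustrate the point):

let mut typed: Vec<u16> = vec![0; 4];
let ptr = typed.as_mut_ptr() as *mut u8;
let (len, cap) = (typed.len() * 2, typed.capacity() * 2);
std::mem::forget(typed);
// this Vec<u8> will deallocate with align 1, but the memory was
// allocated with align 2; that violates the contract quoted above
let _bytes: Vec<u8> = unsafe { Vec::from_raw_parts(ptr, len, cap) };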

@johannesvollmer
Contributor Author

johannesvollmer commented Jun 26, 2024

So then it's only a problem if you don't explicitly handle deallocation of the buffer, which would definitely be possible. If I've made another mistake in my thinking, I suggest we continue this elsewhere to avoid cluttering up the discussion :)

@HeroicKatora
Member

HeroicKatora commented Jun 26, 2024

The image crate only casts slices behind a reference, not the Vec itself, I believe. That is a sound cast. As teased, one could ensure that the buffer is reinterpreted back to its original type before it is dropped and that no mutable reference leaks out, but this is error-prone and does not provide much beyond the slice cast. In comparison, image-canvas normalizes the type to a highly aligned buffer in all cases; in the future (on a higher Rust version), an optimization with an aware allocator might ensure this normalization is zero-copy in most cases.
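
For reference, a sound version keeps ownership with the typed Vec and only borrows it as bytes (bytemuck::cast_slice is one widely used way to do this; a manual cast with an alignment check also works):

let data: Vec<u16> = vec![1, 2, 3];
// reinterpret behind a reference only: the Vec still owns the allocation,
// so it is dropped with the same Layout it was allocated with
let bytes: &[u8] = bytemuck::cast_slice(&data);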
