-
Notifications
You must be signed in to change notification settings - Fork 6.2k
8352407: PixelInterleavedSampleModel with unused components throws RasterFormatException: Incorrect pixel stride #24111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…sterFormatException: Incorrect pixel stride
|
👋 Welcome back ngubarkov! A progress list of the required criteria for merging this PR into |
|
❗ This change is not yet ready to be integrated. |
Webrevs
|
Just align the buffer size to the pixel stride instead, should be better.
|
|
||
| // Align to the pixel stride. | ||
| size = (size + pixelStride - 1) / pixelStride * pixelStride; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some thoughts I’d like to mention to ensure I understand the problem:
This method should return the number of bytes required to store the image with given parameters—width, height, pixel stride, and scanline. It seems that the calculation could follow this approach:
starting offset + pointer to the last pixel + pixelStride
Is that a correct assumption?
I have doubts that the logic in this method actually calculates the size properly, especially this line:
int val = pixelStride * (width - 1);
size += val;It seems like it should be:
int val = pixelStride + pixelStride * (width - 1);or simply:
int val = pixelStride * width;Or am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I understand, an important thing here is that this calculation needs to be correct for any kind of pixel/scanline/band interleaving.
What if our channels are stored separately (complete image of only R followed by complete image of G and so on) - your formula would only work if we change "pointer to the last pixel" -> "pointer to the last band of the last pixel". But then it would be wrong for the pixel-interleaving case (RGBA for each pixel stored together), because adding pixelStride=4 to the "last band of the last pixel" would actually be 3 bytes more than needed.
The idea of the current approach, as I understand it is that we find the last band of the first pixel, be it 3 for pixel-interleaved RGBA, or 30000 for band-interleaved image, and then add 1 to it, meaning the size sufficient to fit that last band:
int size = maxBandOff + 1;
At this point we can already fit the first pixel, so what's left is to fit the remaining first scanline (hence width - 1):
int val = pixelStride * (width - 1);
size += val;
And the remaining rows (hence height - 1):
val = scanlineStride * (height - 1);
size += val;
This gets us to our subject:
As current calculation only considers used bands by finding maxBandOff, it doesn't really care when we have pixel interleaving and pixelStride > numBands, it will happily cut any unused space at the end of the last pixel, thus causing various problems when we just want to, say, make an RGBx (pixelStride=4, bandOffsets = {0, 1, 2}) raster.
In my approach I just fixed that space which was cut by the previous calculation by aligning the size up to pixelStride. Not that I find this particularly elegant, but it seems to work and not cause regressions in non-pixel interleaved variants.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea of the current approach, as I understand it is that we find the last band of the first pixel, be it 3 for pixel-interleaved RGBA, or 30000 for band-interleaved image, and then add 1 to it, meaning the size sufficient to fit that last band:
int size = maxBandOff + 1;
At this point we can already fit the first pixel, so what's left is to fit the remaining first scanline (hence width - 1):
This looks suspicious. Let's try calculating the size or pointer to the last pixel using the formula:
"starting offset + pointer to the last pixel + pixelStride" but in reverse order.
// This moves the pointer to the start of the last row.
size = scanlineStride * (height - 1);
// This moves the pointer to the start of the last pixel in the last row.
size += pixelStride * (width - 1);
Note: At this point, the size is not necessarily aligned to pixelStride because scanlineStride may include a gap larger than pixelStride at the end.
Now, we need to account for the last pixel in the last row. In a theoretical case where the offset is 0, simply adding pixelStride one more time would be sufficient.
However, in our case, we add maxBandOff + 1, which seems odd since maxBandOff represents the offset of the largest component, placing it somewhere within the pixelStride.
To fix this, we should actually add the second part of the pixelStride only if maxBandOff is smaller than pixelStride.
btw, I'm not sure why we can't use the first component instead of the last one:
int minBandOff = bandOffsets[0];
for (int i = 1; i < bandOffsets.length; i++) {
minBandOff = Math.min(minBandOff, bandOffsets[i]);
}
long size = 0;
if (minBandOff >= 0)
size += minBandOff;
if (pixelStride > 0)
size += pixelStride * width;
if (scanlineStride > 0)
size += scanlineStride * (height - 1);
return (int) size;
...Still investigating this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe we should modify the code that relies on the current size?
Why does our code work fine when skipping the end of the scanline in cases where there is a gap, but it cannot skip the pixelStride when there is a gap as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why we can't use the first component instead of the last one
We definitely cannot do that because of the band-interleaved case: imagine if there are like 10000 bytes of red samples followed by another 10000 bytes of green samples - with pixelStride=1 and scanlineStride=100 if we take the first component offset (0), our calculated size would only fit the red channel.
In your calculations using "starting offset + pointer to the last pixel + pixelStride", you assume only pixel-interleaved variant, but this calculation would break with other interleaving modes. Actually, for pixel-interleaved layout there is a simple commonly-used formula, like: size = startingOffset + scanlineStride * height. But the problem is that we don't really know the startingOffset, we only know offsets of the individual components, which leads to the other issue...
the size is not necessarily aligned to pixelStride because scanlineStride may include a gap larger than pixelStride at the end.
Good point, and because of the way pixelStride and scanlineStride are formulated (offsets between same-band samples of the adjacent pixels/scanlines), they don't tell us anything about the alignment. And moreover, in the case of unused components, where our bandOffsets don't fully fill the pixelStride, there is an ambiguity:
Consider bandOffsets = {1, 2, 3} and pixelStride=4, is that a xRGB layout with startingOffset=0, or RGBx with startingOffset=1? The answer is - we don't have enough info as far as I'm concerned...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In your sample you provided an example, when raster fits whole pixels (scanline * (height - 1) + pixelStride * width), which, I agree, is a perfectly fine case and must be supported by accelerated pipelines. But the question is: should partial pixels be handled by our pipelines too (scanline * (height - 1) + pixelStride * (width - 1) + maxBandOffset + 1 where maxBandOffset + 1 < pixelStride), or should whole pixels be enforced? I am for the latter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, this is not a case of a "partial pixel"- the data for the pixel itself is present. Therefore, the user can still manually create a buffer that discards the tail of the image and it will be accepted:
DataBuffer manualBuffer = new DataBufferByte(
scanlineStride * (SIZE - 1) + pixelStride * (SIZE - 1) + 2/*maxBandOffset*/ + 1 /*size of last component*/
);The current logic seems to imply that:
- The scanline stride doesn't matter for a single-line image, or for the last line in the image.
- The pixel stride is irrelevant if there's only one pixel in the image, or for the last pixel in a row.
This behavior is validated in src/java.desktop/share/classes/sun/awt/image/ByteComponentRaster.java#verify, with the exception of the single-pixel image we mentioned earlier.
I believe that exception for the 1-pixel image is a bug (pixelStride > data.length) that was overlooked during the work on commit 86104df, where a similar check for scanlineStride > data.length was fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I found another argument as to why I think we should allocate memory according to pixelStride, including padding. I am currently optimizing the Vulkan blit and specifically reducing the amount of extra copies, so the issue is real.
In order to do draw an image, we need a Vulkan image. To bring raster data into Vulkan image we are currently doing 2 copies: software raster -> VkBuffer -> VkImage. We could get rid of one copy if we imported raster memory as an external allocation and bind VkBuffer to it directly: VkBuffer(software raster) -> VkImage. In theory we could even wrap host memory as an image and use it directly, but I doubt that would be more performant.
So, if we try to import raster data as an external allocation, it becomes useless for the purposes of copying or sampling, because Vulkan requires the whole memory region to be resident, it will copy height rows of rowLength bytes and there is no way it could work in that strange scenario when we have RGBx raster with 1 byte of padding in the end, mapping to RGBA Vulkan format, and there is just a single missing byte in the end of the whole raster ruining our optimization, forcing us to memcpy the whole raster into a temporary buffer, which is just 1 byte larger so that we could properly copy that into Vulkan image.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But even if we start allocating raster memory with padding internally, you will still need to support both cases anyway, right? So for now, maybe it's best to implement two versions of the blit: one optimized (assuming properly padded memory) and one fallback (handling the edge case with the missing padding).
At runtime, we can select the appropriate version depending on the actual memory layout. Then we can compare performance and see whether supporting both is actually worth it.
If the performance difference is significant, we can update our internal allocation strategy to always use the optimized layout, while still supporting custom rasters (with non-standard strides) using the slower path.
Just to clarify - are we talking only about padding per pixel, or do we also need full row padding to meet Vulkan’s alignment requirements?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
even if we start allocating raster memory with padding internally, you will still need to support both cases anyway, right?
I don't think so. If we accept a user-created raster, it will not have a native structs initialized due to #24378 anyway, so it will be converted into a known type first, before being used in Vulkan loops. So if we create rasters with optimal layouts internally, we should mostly be fine. Similarly for accepting external rasters (#24378), we could only initialize native data if the memory layout matches the requirements of our accelerated loops.
are we talking only about padding per pixel, or do we also need full row padding to meet Vulkan’s alignment requirements?
Sorry, not "it will copy height rows of rowLength", but "it will copy height rows of width texels with rowLength steps".
We are talking about per-pixel padding, not full row one.
Here's the spec on that: https://registry.khronos.org/vulkan/specs/latest/html/vkspec.html#copies-buffers-images
There is also an alignment limitation: buffer offset and row length must be multiple of texel block size - for our rasters it means that startingOffset and scanlineStride must be aligned to pixelStride, in order to be able to copy from this memory into Vulkan image directly. Again, this requirement is easily fulfilled when creating rasters internally, and can be checked when accepting external rasters.
|
One definite issue with this fix is that it assumes pixelStride !=0 So this following code is broken with this change import java.awt.image.*; public class Z { % java Z |
|
@YaaZ This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply issue a |
|
/touch me, how can it be? I'm gonna find some time to refresh all the details, consider all concerns and move this and #24378 further |
|
@YaaZ The pull request is being re-evaluated and the inactivity timeout has been reset. |
|
@YaaZ This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply issue a |
|
@YaaZ This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the |
Progress
Error
Issue
Reviewing
Using
gitCheckout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/24111/head:pull/24111$ git checkout pull/24111Update a local copy of the PR:
$ git checkout pull/24111$ git pull https://git.openjdk.org/jdk.git pull/24111/headUsing Skara CLI tools
Checkout this PR locally:
$ git pr checkout 24111View PR using the GUI difftool:
$ git pr show -t 24111Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/24111.diff
Using Webrev
Link to Webrev Comment