Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update ImageUtils to support resizing one, three, or four channel images #94

Merged
merged 13 commits into from Jan 30, 2018

Conversation

smurching
Copy link
Collaborator

@smurching smurching commented Jan 24, 2018

This PR:

  • Updates conversions from row image to/from Buffered image (spImageToBufferedImage and spImageFromBufferedImage) to support one, three, and four channel images
  • Updates resizeImage to use the tgtChannels parameter to determine the number of channels in the output image instead of defaulting to three output channels
  • Updates existing tests to verify that resizing, conversions to/from BufferedImage work for one, three, and four-channel images

Copy link
Collaborator Author

@smurching smurching left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is ready for a pass, there are some issues described in my review comments

@@ -72,15 +90,40 @@ private[sparkdl] object ImageUtils {
image
}

/** Returns the number of channels in the passed-in buffered image. */
Copy link
Collaborator Author

@smurching smurching Jan 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method and getOCVType invert the mapping done in ImageSchema.scala

val color = new Color(image.getRGB(w, h), image.getColorModel.hasAlpha)
channels match {
case 1 =>
decoded(offset) = color.getBlue.toByte
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: Unfortunately this doesn't always yield the original blue byte written in spImageToBufferedImage. In general the setRGB and getRGB methods don't yield consistent results when working with grayscale BufferedImages (BufferedImages with image type TYPE_BYTE_GRAY), see this code snippet:

https://gist.github.com/smurching/d6e918a84c79155b616f9339b30fdfca

It makes sense to me that this happens, since it doesn't really make sense to tell a grayscale image "set the pixel at 0, 0 to RGB (x, y, z)" and expect the RGB channel values to actually be (x, y, z) - if this were the case, the image might not be gray. I'm not sure how to work around it...

Copy link
Collaborator Author

@smurching smurching Jan 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I think this can be fixed by stealing some code from ImageSchema that directly manipulates the Image raster, will push an update...

val bufferedImage = ImageUtils.spImageToBufferedImage(spImage)
val testImage = ImageUtils.spImageFromBufferedImage(bufferedImage)
assert(spImage === testImage, "Image changed during conversion.")
for (channels <- Seq(3, 4)) {
Copy link
Collaborator Author

@smurching smurching Jan 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: This test will fail for channels = 1, see my comment in spImageFromBufferedImage

val testImage = ImageUtils.resizeImage(tgtHeight, tgtWidth, tgtChannels, biggerImage)
assert(testImage === smallerImage, "Resizing image did not produce expected smaller image.")
for (channels <- Seq(1, 3, 4)) {
val smallerImage = ImageUtilsSuite.getImageRow(getImagePath("small", channels))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I cheated here & had the test compare the original image to a smaller image generated via the resizeImage code path (smallerImage is an image file generated by saving the result of calling resizeImage on biggerImage). @MrBago do you remember how you generated the original image pair for this test?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean you compare against image resized using BufferedImage or ImageUtils.resizeImage?

Just a suggestion, why don't we store only the original image, and compare against the image resized in java using BufferedImage on the fly?

@codecov-io
Copy link

codecov-io commented Jan 25, 2018

Codecov Report

Merging #94 into master will increase coverage by 0.13%.
The diff coverage is 95%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #94      +/-   ##
==========================================
+ Coverage   82.44%   82.57%   +0.13%     
==========================================
  Files          34       34              
  Lines        1937     1946       +9     
  Branches       36       40       +4     
==========================================
+ Hits         1597     1607      +10     
+ Misses        340      339       -1
Impacted Files Coverage Δ
...main/scala/com/databricks/sparkdl/ImageUtils.scala 92.18% <95%> (+1.27%) ⬆️
.../scala/org/apache/spark/ml/image/ImageSchema.scala 80.64% <0%> (+1.61%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 12b2697...3c757ca. Read the comment docs.

val isGray = image.getColorModel.getColorSpace.getType == ColorSpace.TYPE_GRAY
val hasAlpha = image.getColorModel.hasAlpha
var offset = 0
// Grayscale images in Java require special handling to get the correct intensity
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic here is basically the inverse of Spark's logic for reading Grayscale images into a data array (this code sets the pixels of a grayscale image from a data array): https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/image/ImageSchema.scala#L159

For context, if we set grayscale image data using image.setRGB(w, h, rgbInt), calling image.getRGB(w, h) won't necessarily return the same result (see this gist). Hence we set grayscale image data directly in the raster here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work , thanks for the explanation! Did you consider making the 3/4 channel images also use raster directly? The code would be more consistent that way.

@smurching
Copy link
Collaborator Author

Tests failed for Python 3.5 because curling travis failed, retriggering the tests...

Copy link
Collaborator

@tomasatdatabricks tomasatdatabricks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! I left few comments.

  1. The cheating test should be fixed (if I understood the comment :)

  2. I think we should check that the image is one of the supported OpenCvTypes and throw meaningful exception otherwise.

  3. I would consider using the direct raster access for both grayscale and colored images.

require(false, s"`Channels` must be 1 or 3, got $channels.")
}
val image = channels match {
case 1 => new BufferedImage(width, height, BufferedImage.TYPE_BYTE_GRAY)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically there can be images with 2 channels. We should throw UnsupportedOperationException or something like that and inform user that resize does not work for 2 channel images.

Actually, thinking about it, we should probably check that the open cv type is one of the supported ones - e.g. one of CV_8UC{1,3,4}, so how about something something like this :

val mode = ImageSchema.getMode(rowImage)

val image = mode match {
case imageSchema.ocvTypes("CV_8UC1") => new BufferedImage(width, height, BufferedImage.TYPE_BYTE_GRAY)
....
case _ => throw new UnsupportedOperationException("Can not resize images with mode = " + mode)
}

val isGray = image.getColorModel.getColorSpace.getType == ColorSpace.TYPE_GRAY
val hasAlpha = image.getColorModel.hasAlpha
var offset = 0
// Grayscale images in Java require special handling to get the correct intensity
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work , thanks for the explanation! Did you consider making the 3/4 channel images also use raster directly? The code would be more consistent that way.

image
}

/** Returns the number of channels in the passed-in buffered image. */
private def getNumChannels(img: BufferedImage): Int = {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not call image.getColorModel.getNumComponents?

for (h <- 0 until height) {
for (w <- 0 until width) {
val color = new Color(image.getRGB(w, h), hasAlpha)
decoded(offset) = color.getBlue.toByte
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, I wonder if it would be more consistent to use raster directly for both grayscale and BGR(A).

val testImage = ImageUtils.resizeImage(tgtHeight, tgtWidth, tgtChannels, biggerImage)
assert(testImage === smallerImage, "Resizing image did not produce expected smaller image.")
for (channels <- Seq(1, 3, 4)) {
val smallerImage = ImageUtilsSuite.getImageRow(getImagePath("small", channels))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean you compare against image resized using BufferedImage or ImageUtils.resizeImage?

Just a suggestion, why don't we store only the original image, and compare against the image resized in java using BufferedImage on the fly?

val tgtImg = new BufferedImage(tgtWidth, tgtHeight, BufferedImage.TYPE_3BYTE_BGR)
val tgtImgType = tgtChannels match {
case 1 => BufferedImage.TYPE_BYTE_GRAY
case 3 => BufferedImage.TYPE_3BYTE_BGR
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add
case _ => throw meaningful exception.

@smurching
Copy link
Collaborator Author

smurching commented Jan 26, 2018

Just pushed an update that hopefully addresses the ^comments :)

The cheating test should be fixed (if I understood the comment :)

Addressed by adding a helper method that performs an image resize using as few spark-deep-learning utilities as possible (we still call into spImageFromBufferedImage) and comparing ImageUtils.resizeImage output to the result of the helper.

I think we should check that the image is one of the supported OpenCvTypes and throw meaningful exception otherwise.

Done

I would consider using the direct raster access for both grayscale and colored images.

Updated the spImagetoBufferedImage, spImagetoBufferedImage methods to use the raster code path

Also another clarifying comment: the test images 1_channels/00074201.png, 3_channels/00074201.png, and 4_channels/00074201.png are variants of the three-channel 00074201.jpg image. The alpha channel of 4_channels/00074201.png is set to random values between 122 and 255.

require(false, s"`Channels` must be 1 or 3, got $channels.")
}
val imageType = openCVModeToImageType.getOrElse(mode,
throw new UnsupportedOperationException("Cannot convert row image with " +
Copy link
Collaborator Author

@smurching smurching Jan 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tomasatdatabricks As of right now, this error can only occur if users try to pass an image row with an unsupported OpenCV mode to DeepImageFeaturizer. I wonder if the current error message is vague/unhelpful to the user in this case...maybe the image validation should be pushed into DeepImageFeaturizer (but it also seems useful to validate the row image here)?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I think it should definitely be here as well. I believe you should always try to throw informative exception. Deep ImageFeaturizer is just one of the possible users of this function. Also, it does not have to have the same restrictions.

Copy link
Collaborator

@tomasatdatabricks tomasatdatabricks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jus one nit - extra blank line, otherwise LGTM.

}
Row(origin, height, width, channels, ImageSchema.ocvTypes("CV_8UC3"), decoded)
Row(origin, height, width, channels, getOCVType(image), decoded)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: extra blank line

Copy link
Collaborator

@sueann sueann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. just one question on rasterizing.

val image = new BufferedImage(width, height, imageType)
var offset = 0
val raster = image.getRaster
// NOTE: This code assumes the raw image data in rowImage directly corresponds to the
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this change the data compared to the previous way?

Copy link
Collaborator Author

@smurching smurching Jan 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good Q, it doesn't change the data layout compared to the previous way. The BufferedImage types supported by our utilities (TYPE_BYTE_GRAY, TYPE_3BYTE_BGR, TYPE_4BYTE_AGBR) have a fixed BGR(A) channel ordering so this approach should also be consistent across Java versions (i.e. the internal raster representation of the BufferedImage data won't suddenly change on us, see BufferedImage docs).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I was wondering more about the content, whether what we're doing manually produced the exact same result as Color & Color.getRGB. The tests look like they maintain the same resizing results as before the code change here but wanted to confirm, since doing anything different to the image can affect the results of applying models to it. (Part of it is, I'm not sure how reasonable this assumption stated here is.)

Copy link
Collaborator Author

@smurching smurching Jan 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh I see. Actually the resize results may change when feeding grayscale images to DeepImageFeaturizer, but not for three-channel images (our DIF tests only run against three-channel images so they don't catch the change).

The difference for grayscale images is that running

val b = rowImageData(offset)
val color = new Color(b, b, b)
image.setRGB(x, y, color.getRGB)

as the original code did doesn't necessarily set the grayscale pixel at (x, y) to b. This happens because Java takes a weighted combination of the (r, g, b) channels when setting the grayscale pixel value. However with the raster approach the pixel at (x, y) is actually directly set to b. Given that b was likely obtained by calling raster.getSample (how ImageSchema.readImages reads the data of a grayscale image), the raster approach seems ok to me.

That being said I could see changes in model behavior being sufficiently undesirable that we wouldn't want to make this change.

Another thing: just realized this PR no longer validates that the input images to DIF are either 1 or 3 channels, I'm assuming we should add this validation back? The practical implication being that as it stands, this PR allows users to feed 4-channel images to DIF without hitting an error.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the direct raster access is the the right way, ImageSchema.decode does the same and using it for the BGR(A) images makes the code more consistent.

Why do you want to restrict the number of channels on DIF?

Copy link
Collaborator Author

@smurching smurching Jan 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tomasatdatabricks mainly I just didn't want to inadvertently release support for four-channel inputs in DIF without adding unit tests :P. Although I suppose we already supported one-channel inputs without specifically testing DIF on one-channel images.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok the current logic looks okay to me. the most important thing is not to change the logic if possible so there is no breaking change for users. i doubt this is a problem at this point.

for the one- and four- channel input stuff -- if we want any testing, it seems enough to have unit tests to make sure this resizing function returns the correct number of channels given targetChannels. Just checking the number of channels seems good enough - we could also check the content of the image if we want to catch any breaking changes in the future.

Copy link
Collaborator Author

@smurching smurching Jan 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM. Should we add the unit test in this PR or in a follow-up? I'm happy to do either, it's a pretty small test. Thanks again for the reviews :)

@sueann
Copy link
Collaborator

sueann commented Jan 30, 2018

@smurching I'm happy to merge this. You can add the tests in this PR if it's useful for your Databricks PR, but can delay to another PR if it's not.

@smurching
Copy link
Collaborator Author

smurching commented Jan 30, 2018

@sueann thanks, let's go ahead & merge this and add the tests in a future PR :)

@sueann sueann merged commit 9273b0f into databricks:master Jan 30, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants