ImageRecordReader crashes JVM with loaded Keras model in 1.0.0-beta7 #8976
Comments
Any update on this? Would you like more information for reproducing it? |
If you can provide a small demo project that we can clone and run directly, in order to reproduce the crash, it would be very helpful. |
So it turns out the problem is somewhat platform dependent (it didn't occur on Mac). A fix for this is to duplicate the array: Would you have any idea why duplicating the input array would stop the crash? |
I've made a simple Gradle project to demonstrate this and help you reproduce it. Instructions
In Hopefully this can be reproduced on your machine, let me know if there's any other info you'd like :) Attached Files |
Mine's crashing on 1.0.0-beta7 with certain images, regardless of whether CPU or GPU is used. Try reproducing by grabbing the dataset:
Put the first four pets into separate subfolders and move their corresponding images. Use the AnimalClassifier image classification example and replace the download path with a local path pointing to the pets folder. Change: in order to disable the max path per label.
Partial crash dump below.
A fatal error has been detected by the Java Runtime Environment:
EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x00007ff94c81a799, pid=14192, tid=30636
JRE version: Java(TM) SE Runtime Environment (14.0+36) (build 14+36-1461)
Java VM: Java HotSpot(TM) 64-Bit Server VM (14+36-1461, mixed mode, sharing, tiered, compressed oops, g1 gc, windows-amd64)
Problematic frame:
C [KERNELBASE.dll+0x3a799]
Current thread (0x000002104c301800): JavaThread "main" [_thread_in_native, id=30636, stack(0x000000b1c6500000,0x000000b1c6600000)]
Stack: [0x000000b1c6500000,0x000000b1c6600000], sp=0x000000b1c65fbe50, free space=1007k
Java frames: (J=compiled Java code, j=interpreted, Vv=VM code) |
This crash means OpenCV has thrown a C++ exception that wasn't caught properly. The C interface can't process those. |
Thanks. I looked into this further. I can confirm that, in my case, images with 8-bit color depth cause this line in opencv_core to fail (the other images were 24-bit), ultimately causing a KERNELBASE.dll crash.
org.bytedeco.opencv.global.opencv_imgproc.cvtColor(Lorg/bytedeco/opencv/opencv_core/Mat;Lorg/bytedeco/opencv/opencv_core/Mat;I)V+0
Sample 8-bit image causing the crash:
I converted the 8-bit image to 24-bit and the crash on loading went away. This is a different issue from the topic, but the takeaway is that we should consider adding color depth checking to the ImageRecordReader class and warning the user about incompatible images, along with the offending filename, until either OpenCV fixes this issue or ImageRecordReader is updated to support different color depths by calling the appropriate OpenCV function, if there is one. |
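The color depth pre-check suggested in the comment above could look roughly like the following. This is only a sketch using the JDK's ImageIO, not anything that exists in ImageRecordReader; the class name ColorDepthCheck, its methods, and the rule that only 24-bit and 32-bit images count as "expected" are assumptions made for illustration.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper for illustration; not part of DataVec or OpenCV.
class ColorDepthCheck {

    // Bits per pixel as reported by the image's color model.
    static int bitsPerPixel(BufferedImage image) {
        return image.getColorModel().getPixelSize();
    }

    // Assumption for this sketch: only 24-bit RGB and 32-bit RGBA are "expected".
    static boolean isExpectedDepth(BufferedImage image) {
        int bits = bitsPerPixel(image);
        return bits == 24 || bits == 32;
    }

    // Warns (with the offending filename) and returns false for suspect files.
    static boolean checkFile(File file) throws IOException {
        BufferedImage image = ImageIO.read(file); // null if no reader can decode it
        if (image == null) {
            System.err.println("WARN: no ImageIO reader could decode " + file);
            return false;
        }
        if (!isExpectedDepth(image)) {
            System.err.println("WARN: " + file + " has "
                    + bitsPerPixel(image) + " bits per pixel");
            return false;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Pass image paths on the command line to scan them before training.
        for (String path : args) {
            String verdict = checkFile(new File(path)) ? "ok" : "suspect";
            System.out.println(path + " -> " + verdict);
        }
    }
}
```

Running this over a dataset folder before handing it to the record reader would at least name the files that differ from the rest, which is the part of the suggestion that does not depend on any OpenCV change.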
@raver119 I'm pretty sure C++ exceptions are getting caught and rethrown as Java exceptions. Something else is going on here... |
@saudet After more investigation: I mistakenly thought that 8-bit images were the issue, but the problem appears to be related to some GIF images that cause OpenCV to segfault, especially the animated ones. I still can't work out what's wrong with the cat GIF, although when trying to read the first pixel, OpenCV crashes, probably because it is reading the wrong memory address. Not a biggie, but if anyone wants to have a crack at working out why the above GIF crashes, here's an isolated test:
|
Can you please share the image as well?
With best regards,
raver119
On June 14, 2020, 17:13 +0300, phong-phuong <notifications@github.com> wrote:
@saudet After more investigation: I mistakenly thought that 8-bit images were the issue, but the problem appears to be related to some GIF images that cause OpenCV to segfault, especially the animated ones. I still can't work out what's wrong with the cat GIF, although when trying to read the first pixel, OpenCV crashes, probably because it is reading the wrong memory address.
Not a biggie, but if anyone wants to have a crack at working out why the above GIF crashes, here's an isolated test:
import java.io.File;
import java.io.IOException;

import org.datavec.image.loader.NativeImageLoader;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

public class OpenCVTest {
    public static void main(String[] args) throws IOException {
        NativeImageLoader imageLoader = new NativeImageLoader();
        String filenameAndPath = "cat.gif";

        // Dimensions of the target array: channels x height x width.
        int channels = 4;
        int imageHeight = 202;
        int imageWidth = 250;
        INDArray view = Nd4j.create(new int[] {channels, imageHeight, imageWidth});

        // Loading the GIF into the preallocated view is the call under test.
        File file = new File(filenameAndPath);
        imageLoader.asMatrixView(file, view);
        System.out.println(view);
    }
}
|
@raver119 It's above, a few posts up: the cat one. It's labelled as a JPG, but it's really a GIF. |
Thanks |
Ah, I see, a GIF file, that's not actually supported by OpenCV, but it looks like imread() returns an empty array instead of null, so we should check for that... |
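Since the crashing file carried a .jpg name but was really a GIF, one way to catch such files before they reach the loader is to look at the file signature instead of the extension. The sketch below assumes only the standard GIF signatures ("GIF87a" and "GIF89a" in the first six bytes); the class GifSniffer and its method names are hypothetical and not part of DataVec or OpenCV.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of DataVec or OpenCV.
class GifSniffer {

    // True if the bytes start with one of the two GIF signatures.
    static boolean hasGifSignature(byte[] header) {
        if (header.length < 6) {
            return false;
        }
        String magic = new String(header, 0, 6, StandardCharsets.US_ASCII);
        return magic.equals("GIF87a") || magic.equals("GIF89a");
    }

    // Reads up to six bytes from the file and checks them, ignoring the extension.
    static boolean isGifFile(Path file) throws IOException {
        byte[] header = new byte[6];
        int filled = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (filled < header.length) {
                int read = in.read(header, filled, header.length - filled);
                if (read < 0) {
                    break; // file is shorter than the signature
                }
                filled += read;
            }
        }
        return hasGifSignature(Arrays.copyOf(header, filled));
    }

    public static void main(String[] args) throws IOException {
        // Pass image paths on the command line to list disguised GIFs.
        for (String path : args) {
            if (isGifFile(Paths.get(path))) {
                System.out.println("GIF content detected: " + path);
            }
        }
    }
}
```

A check like this only filters inputs; it does not explain or fix the native crash discussed in this thread.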
It does check for empty arrays, so it's using Leptonica to load this. It's probably related to issue #8785. |
Closing in favor of sam's linked issue with more details: #8785 |
Issue Description
I encountered a strange problem in 1.0.0-beta7 while trying to run a Keras model loaded from a .h5 file (e.g., VGG16.h5 from here). This model previously ran fine in 1.0.0-beta6.
Calling computationGraph.feedForward(features, false) would crash the JVM (error log), using this code snippet:
The crash would happen specifically within the ComputationGraph class at line 1976; I figured this out by stepping through the code in IntelliJ.
Strangely though, the code snippet above runs fine if you use a random numpy array of the same shape (so the issue isn't caused by the features shape). Looking into the values of the features given by the DataSetIterator, there aren't any NaNs or weird values (all are between 0 and 1).
Also interesting to note is that the .h5 model can be saved in beta6 to a zip using model.save(new File("VGG.zip")), then loaded in beta7, and the above snippet works fine (swapping the KerasModelImport... for ComputationGraph.load(new File("beta6KerasVGG.zip"), true);).
Another note: the above snippet works fine if using a different model (e.g., ResNet50.h5), so this problem doesn't occur with all Keras models.
Conclusion
On one hand, it seems like the problem is caused by updates to the KerasModelImport process: a .h5 file which loaded and ran fine in 1.0.0-beta6 no longer works in 1.0.0-beta7. Additionally, saving a .zip file of the beta6 version and loading a new ComputationGraph in beta7 circumvents the above problem.
However, it also seems like the ImageRecordReader or DataSetIterator could be the culprit: when those are taken out of the equation (by using a random INDArray), no errors occur.
Attached files
Version Information
Please indicate relevant versions, including, if relevant: