DL4J Keras model import with CUDA backend strange behavior (0.9.2-SNAPSHOT) #4558

emc5ud · 2018-01-26T17:23:30Z

This issue is a little strange.

So I trained a model in Keras with TensorFlow as the backend and would like to import it into DL4J. This works when I set my backend to "nd4j-native", but fails when I set my backend to "nd4j-cuda-8.0".

I create the model in the following way: unet.py (training omitted), and load it into DL4J like so:

  def getModel() = {
    KerasModelImport.importKerasModelAndWeights("src/main/resources/unet.h5", false)
  }

When I load the model in DL4J with GPUs enabled, the process hangs at the above step. I've waited for about 10 minutes with no luck, and all the while the process uses a steady 6-7 GB of my GPU's memory.
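For anyone trying to reproduce this, a minimal sketch of one way to bound the wait so the hang shows up as a reported timeout rather than a stuck process. The object name ImportTimeout and the helper runWithTimeout are hypothetical and not part of DL4J; the sketch uses only the Scala standard library, and the DL4J call appears only in a comment.

```scala
import java.util.concurrent.TimeoutException
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.FiniteDuration

// Hypothetical helper, not a DL4J API.
object ImportTimeout {
  // Runs `body` on another thread and waits at most `limit` for its result.
  // Returns Left(message) on timeout, Right(value) on success.
  // Exceptions thrown by `body` are rethrown to the caller.
  // Note: on timeout the worker thread is NOT stopped; it keeps running
  // (and keeps whatever GPU memory it holds) until the JVM exits.
  def runWithTimeout[T](limit: FiniteDuration)(body: => T): Either[String, T] =
    try Right(Await.result(Future(body), limit))
    catch {
      case _: TimeoutException => Left(s"no result after $limit")
    }
}

// Intended use with the import from this issue (assumes DL4J on the classpath):
//   import scala.concurrent.duration._
//   ImportTimeout.runWithTimeout(2.minutes) {
//     KerasModelImport.importKerasModelAndWeights("src/main/resources/unet.h5", false)
//   }
```

A Left result only says the import did not finish in time; a thread dump (for example with jstack) taken while it is stuck would still be needed to see where it hangs.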

Now the strange part is that when I create the model in Keras with "tf" ordering (aka channels last), the hang no longer happens and everything works (unet_tf_ordering.py). With the CPU backend, it doesn't matter which ordering I use.

Relevant python libraries:

keras==1.2.2
tensorflow-gpu==1.4

Relevant part of my build.sbt:

libraryDependencies ++= Seq(
  "org.deeplearning4j" % "deeplearning4j-core" % "0.9.2-SNAPSHOT",
  "org.deeplearning4j" % "deeplearning4j-modelimport" % "0.9.2-SNAPSHOT",
  "org.nd4j" % "nd4j-cuda-8.0" % "0.9.2-SNAPSHOT" classifier "" classifier "linux-x86_64",
  "org.bytedeco.javacpp-presets" % "cuda" % "8.0-6.0-1.3" classifier "" classifier "linux-x86_64"
//  "org.nd4j" % "nd4j-native" % "0.9.2-SNAPSHOT" classifier "" classifier "linux-x86_64",
//  "org.bytedeco.javacpp-presets" % "mkl" % "2017.3-1.3" classifier "" classifier "linux-x86_64",
//  "org.bytedeco.javacpp-presets" % "openblas" % "0.2.20-1.3" classifier "" classifier "linux-x86_64"
)


maxpumperla · 2018-03-29T14:31:01Z

Hey @emc5ud we had an HDF5 fix recently (here: #4870), would you mind checking the import again with CUDA? I have a hunch this will fix it.

maxpumperla · 2018-04-19T15:59:08Z

@emc5ud could you please give it a try on your end? thanks

emc5ud · 2018-04-19T17:08:07Z

Hi Max, sorry for the delay. I'll give it a try this weekend.

maxpumperla · 2018-05-15T07:59:28Z

@emc5ud hey, any feedback for me on this? This seems resolved, but feel free to reopen, OK? Thanks

lock · 2018-09-22T01:24:32Z

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

emc5ud changed the title ~~DL4J Keras Model Import with CUDA backend strange behavior (0.9.2-SNAPSHOT)~~ DL4J Keras model import with CUDA backend strange behavior (0.9.2-SNAPSHOT) Jan 26, 2018

maxpumperla self-assigned this Jan 26, 2018

maxpumperla added the "DL4J Keras" label (Issues related to Keras import) Jan 26, 2018

maxpumperla added the "Bug" label (Bugs and problems) May 15, 2018

maxpumperla closed this as completed May 15, 2018

lock bot locked and limited conversation to collaborators Sep 22, 2018

eclipsewebmaster unassigned maxpumperla Jun 12, 2019
