Question about applying trained model over image and its performance for real time processing #351

archenroot · 2017-01-23T10:36:57Z

Hi guys,

I trained a network for AnimalClassification from examples, the result is:

-rwxr-xr-x 1 zangetsu users  98973270 23. Jan 11.44 org.deeplearning4j.nn.multilayer.MultiLayerNetwork@aa0aa6.bin
-rwxr-xr-x 1 zangetsu users     23818 23. Jan 11.44 org.deeplearning4j.nn.multilayer.MultiLayerNetwork@aa0aa6-conf.json
-rwxr-xr-x 1 zangetsu users 198915443 23. Jan 11.44 org.deeplearning4j.nn.multilayer.MultiLayerNetwork@aa0aa6updators.bin

How can I apply this network to an image taken from OpenCV, for example an object of type Frame or Mat?

Thanks a lot for this amazing work. I went through Torch and TensorFlow, but as a Java developer I find this by far the easiest to adopt.


archenroot · 2017-01-23T11:37:40Z

Ok, I found this article, so I will try to adapt it:
http://stackoverflow.com/questions/37318692/using-dl4j-for-evaluating-an-image-kind-of-like-in-alphago

archenroot · 2017-01-23T15:46:06Z

Ok, I figured it out myself, by first storing the model in a ZIP file:

log.info("Save model....");
// Where to save the network. Note: the file is in .zip format - can be opened externally
File locationToSave = new File("MyMultiLayerNetwork.zip");

// Updater: i.e., the state for Momentum, RMSProp, Adagrad etc. Save this if you want to train your network more in the future
boolean saveUpdater = true;

ModelSerializer.writeModel(network, locationToSave, saveUpdater);

Instead of dropping the files into a folder with the following code from one of the examples:

NetSaverLoaderUtils.saveNetworkAndParameters(network, basePath);
NetSaverLoaderUtils.saveUpdators(network, basePath);

Then I did the following:

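import java.io.File;
import java.io.IOException;

import org.apache.commons.io.FilenameUtils;
import org.datavec.image.loader.NativeImageLoader;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.util.ModelSerializer;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.dataset.api.preprocessor.DataNormalization;
import org.nd4j.linalg.dataset.api.preprocessor.ImagePreProcessingScaler;

// log, height, width and channels are fields of the enclosing class (not shown here)
public void run() throws IOException {
    // Path of the ZIP file written by ModelSerializer.writeModel
    String modelPath = FilenameUtils.concat(System.getProperty("user.dir"), "MyMultiLayerNetwork.zip");
    MultiLayerNetwork network = ModelSerializer.restoreMultiLayerNetwork(modelPath);
    log.info("Number of layers within loaded network: " + network.getLayers().length);
    log.info("Going to test single image");
    NativeImageLoader loader = new NativeImageLoader(height, width, channels);

    // Get the image into an INDArray
    File imageFile = new File(System.getProperty("user.dir"), "neuralnetworktrainer/src/main/resources/animals/duck/duck_01.jpg");
    INDArray image = loader.asMatrix(imageFile);

    // Scale pixel values in place from 0-255 to 0-1
    DataNormalization scaler = new ImagePreProcessingScaler(0, 1);
    scaler.transform(image);

    // Pass the image through the neural net
    INDArray output = network.output(image);
    INDArray labels = network.getLabels(); // not used below

    log.info("## The FILE CHOSEN WAS " + imageFile.getAbsolutePath());
    log.info("## The Neural Nets Pediction ##");
    log.info("## list of probabilities per label ##");
    log.info("## List of Labels in Order## ");
    log.info(output.toString());

    //log.info(labelList.toString());
}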

and got the following result:

INFO  LoadAndTestNetwork - ## The FILE CHOSEN WAS /mnt/data/proj/open-source/opencv-examples-javafx/javafxopencvhelloworld/neuralnetworktrainer/src/main/resources/animals/duck/duck_01.jpg
INFO  LoadAndTestNetwork - ## The Neural Nets Pediction ##
INFO  LoadAndTestNetwork - ## list of probabilities per label ##
INFO  LoadAndTestNetwork - ## List of Labels in Order## 
INFO  LoadAndTestNetwork - [0.22, 0.04, 0.72, 0.02]

The numbers [0.22, 0.04, 0.72, 0.02] represent the probability for each of the 4 tested labels (bear, deer, duck and turtle). As suggested in the manual, this example network is not very accurate, but I finished my first example.
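
To turn such a probability vector into a single predicted label, one option is to take the index of the largest value and look it up in the label list. The sketch below is plain Java and does not touch the DL4J API; it assumes the output row has already been copied into a `double[]` and that the label order (bear, deer, duck, turtle) matches the order of the network outputs. The class and method names are hypothetical.

```java
class TopLabel {
    /** Returns the index of the largest value in scores (the first one wins on ties). */
    static int argMax(double[] scores) {
        if (scores.length == 0) {
            throw new IllegalArgumentException("scores must not be empty");
        }
        int best = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Label order as listed in the thread; assumed to match the network's output order.
        String[] labels = {"bear", "deer", "duck", "turtle"};
        double[] probabilities = {0.22, 0.04, 0.72, 0.02};

        int best = argMax(probabilities);
        System.out.println(labels[best] + " (p=" + probabilities[best] + ")"); // prints "duck (p=0.72)"
    }
}
```

For the duck image above this picks index 2, which agrees with the file that was fed in.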

Long live DL4J :-)

The only thing that worries me is the performance of image processing:
17:40:49,721 INFO LoadAndTestNetwork - Network loaded.
17:40:50,164 INFO LoadAndTestNetwork - Image processed: [0.22, 0.04, 0.72, 0.02]

This is about 0.45 second for a single image with 3 channels (RGB) of size 100x100, which is really bad for real-time video processing :-(. I also note that my GPU is not enabled, due to a bug reported yesterday and possibly fixed today.

I am closing this issue and will open a new one regarding the throughput.

tomthetrainer · 2017-01-23T15:46:53Z

You do know that you can perform OpenCV operations in DL4J? See
https://deeplearning4j.org/simple-image-load-transform#specifying-particulars-for-your-image-pipeline-transformation

archenroot · 2017-01-23T15:50:07Z

@tomthetrainer I know, but how will OpenCV help me with comparing an image against the trained network? I thought OpenCV is used mainly for image transformations when preparing datasets for network training; am I wrong on this? If so, how could OpenCV speed up image labeling with an existing network?

.... Anyway, I am going to read that article, thanks a lot! I will also enable the GPU and do the same dummy test on a single image.

tomthetrainer · 2017-01-23T15:59:05Z

OpenCV won't help with speedup; really, I just saw you mention OpenCV and thought you should know that you could use it inside DL4J. I thought perhaps you wanted to extract a feature using OpenCV and ship that to the network for inference.

archenroot · 2017-01-23T15:59:28Z

Well, I also don't like log4j version 1; it would be better to use log4j2 with asynchronous appenders everywhere.

Anyway, I switched from the native to the CUDA nd4j backend and ran a few executions to get something like an average execution time:
17:54:47,025 INFO LoadAndTestNetwork - Network loaded.
17:54:47,298 INFO LoadAndTestNetwork - Image processed: [0.22, 0.04, 0.72, 0.02]
So now I have about 0.27 s for a single image, which gives you fewer than 4 fps LOL, so this is not very usable for real-time processing, or it would have to run on only about 4 images per second, combined with movement-prediction algorithms, hm....

Well, my minimal requirement is 20-30 FPS. I will also test on multiple GPUs to see if the numbers are better.
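
The arithmetic behind these figures can be written down as a small sketch. It assumes frames are processed strictly one after another, so the frame rate is simply 1000 divided by the per-frame latency in milliseconds; the two latencies are the gaps between the "Network loaded" and "Image processed" timestamps quoted in this thread. The class and method names are hypothetical.

```java
import java.util.Locale;

class FrameBudget {
    /** Frames per second when each frame takes latencyMs and frames are processed one at a time. */
    static double framesPerSecond(double latencyMs) {
        if (latencyMs <= 0) {
            throw new IllegalArgumentException("latencyMs must be positive");
        }
        return 1000.0 / latencyMs;
    }

    /** Time available per frame, in milliseconds, for a target frame rate. */
    static double budgetMs(double targetFps) {
        if (targetFps <= 0) {
            throw new IllegalArgumentException("targetFps must be positive");
        }
        return 1000.0 / targetFps;
    }

    public static void main(String[] args) {
        double nativeBackendMs = 443; // 17:40:49,721 -> 17:40:50,164
        double cudaBackendMs = 273;   // 17:54:47,025 -> 17:54:47,298

        System.out.println(String.format(Locale.ROOT, "native: %.2f fps", framesPerSecond(nativeBackendMs)));
        System.out.println(String.format(Locale.ROOT, "cuda:   %.2f fps", framesPerSecond(cudaBackendMs)));
        System.out.println(String.format(Locale.ROOT, "budget at 20 fps: %.1f ms", budgetMs(20)));
        System.out.println(String.format(Locale.ROOT, "budget at 30 fps: %.1f ms", budgetMs(30)));
    }
}
```

Note that both measured gaps cover image loading and scaling as well as the network call, so they are an upper bound on the inference time itself.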

archenroot · 2017-01-23T16:02:16Z

@tomthetrainer Thanks for the hints, Tom. I am now doing lots of tests regarding real-time processing of streams, where the task is to track the players and the ball on the playfield, so I am not sure now whether this is feasible with DL4J in online mode.... I could of course process the stored video later for analysis, but one of the features should also be an online mode...

I will try some tuning and let's see what numbers I will be able to obtain; the maximum acceptable time is 50 ms for a single image...

Anyway it is really fun working with this stack 👍

archenroot · 2017-01-23T16:04:04Z

As an alternative, I might fall back to the OpenCV Classifier class.

archenroot · 2017-01-23T16:25:24Z

Ok, I extended the logging to cover each of those few commands, and the output is the following:

 INFO  LoadAndTestNetwork - Going to load network.
 INFO  Reflections - Reflections took 1613 ms to scan 223 urls, producing 2428 keys and 15413 values 
 INFO  Reflections - Reflections took 82 ms to scan 9 urls, producing 388 keys and 1483 values 
 INFO  LoadAndTestNetwork - Network loaded in 23491.022179ms.
 INFO  LoadAndTestNetwork - Image loader initiated in 1.508399ms.
 INFO  LoadAndTestNetwork - Image loaded in 117.421528ms.
 INFO  LoadAndTestNetwork - Scaler initiated in 0.522743ms.
 INFO  LoadAndTestNetwork - Image scaled in 1.633077ms.
 INFO  Nd4jBlas - Number of threads used for BLAS: 0
 INFO  LoadAndTestNetwork - Image processed by network in 101.961865ms.
 INFO  LoadAndTestNetwork - Image processed: [0.22, 0.04, 0.72, 0.02]
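
Per-step numbers like these come from a single run, and a first call may include one-time costs such as JIT compilation or backend initialisation (an assumption here, not something the log proves). A minimal sketch of a steadier measurement, in plain Java, is to run the step a few times without timing it and then average over several timed runs. `StepTimer` and the dummy workload are hypothetical stand-ins; in the real program the `Runnable` would wrap a call such as `network.output(image)`.

```java
import java.util.Locale;

class StepTimer {
    /**
     * Runs work warmupRuns times without timing it, then measuredRuns times with timing,
     * and returns the average duration of one timed run in milliseconds.
     */
    static double averageMs(Runnable work, int warmupRuns, int measuredRuns) {
        if (measuredRuns <= 0) {
            throw new IllegalArgumentException("measuredRuns must be positive");
        }
        for (int i = 0; i < warmupRuns; i++) {
            work.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            work.run();
        }
        long elapsedNanos = System.nanoTime() - start;
        return elapsedNanos / 1_000_000.0 / measuredRuns;
    }

    public static void main(String[] args) {
        // Dummy workload standing in for the step being measured.
        double[] sink = new double[1];
        Runnable dummyStep = () -> {
            double sum = 0;
            for (int i = 1; i <= 100_000; i++) {
                sum += Math.sqrt(i);
            }
            sink[0] = sum; // keep the result so the loop is not trivially removable
        };

        double ms = averageMs(dummyStep, 5, 20);
        System.out.println(String.format(Locale.ROOT, "Step took %.6fms on average.", ms));
    }
}
```

Comparing the first-call time with the warmed-up average shows how much of the roughly 100 ms is a one-time cost and how much recurs for every frame.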

Ok, so I identified the most time-consuming tasks:

1. Image loaded in 117.421528ms.
I will try to optimize the throughput here, for example by:

- Trying to use a different object for image processing (I will look into the API)
- Storing the image on a RAM disk -> /dev/ramdisk0
- Creating an object pool of initialized "Image" object instances, so I will only copy from RAM to RAM

2. Image processed by network in 101.961865ms
Well, here it is about the network performance, and at the moment I don't understand much of the internals of DL4J. In general, training a network with fewer layers might help; I really don't know and will be happy to hear any suggestions.

Still, I will run this on a multi-GPU system with, for example, 2 Titan X 12GB (Pascal/Maxwell) instead of the 960M in my laptop that I use at the moment.

Any suggestions on this performance are appreciated.
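
The idea of reusing preallocated objects so that each frame is only a RAM-to-RAM copy can be sketched in plain Java as follows. The sketch assumes the frame arrives as interleaved 8-bit pixels (height x width x channels, the layout OpenCV uses for an 8-bit colour Mat) and that the network expects channel-major floats scaled to 0-1, which is what the 0-255 to 0-1 scaling step above does. `FrameBuffer` is a hypothetical helper, not part of DL4J or DataVec, and it keeps the channel order exactly as it arrives.

```java
class FrameBuffer {
    private final int height;
    private final int width;
    private final int channels;
    private final float[] planar; // allocated once, overwritten for every frame

    FrameBuffer(int height, int width, int channels) {
        this.height = height;
        this.width = width;
        this.channels = channels;
        this.planar = new float[height * width * channels];
    }

    /**
     * Converts interleaved 8-bit pixels (HWC) into channel-major floats (CHW) scaled to 0-1.
     * Always returns the same internal array, so no memory is allocated per frame.
     */
    float[] fill(byte[] interleaved) {
        if (interleaved.length != planar.length) {
            throw new IllegalArgumentException("expected " + planar.length + " bytes, got " + interleaved.length);
        }
        int planeSize = height * width;
        for (int pixel = 0; pixel < planeSize; pixel++) {
            for (int c = 0; c < channels; c++) {
                int unsigned = interleaved[pixel * channels + c] & 0xFF; // Java bytes are signed
                planar[c * planeSize + pixel] = unsigned / 255.0f;
            }
        }
        return planar;
    }

    public static void main(String[] args) {
        FrameBuffer buffer = new FrameBuffer(100, 100, 3);
        byte[] frame = new byte[100 * 100 * 3]; // stand-in for pixel data copied out of a camera frame
        float[] input = buffer.fill(frame);
        System.out.println("values ready for the network: " + input.length); // prints 30000
    }
}
```

Because the returned array is shared between calls, a frame has to be consumed before the next `fill`, or one buffer is needed per in-flight frame.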

archenroot · 2017-01-23T16:53:30Z

Here is an interesting comparison of different object recognition methods.
https://handong1587.github.io/deep_learning/2015/10/09/object-detection.html

So I will try to investigate/implement the best-performing approaches using DL4J.

Happy to hear any hints here.

archenroot · 2017-01-23T20:53:47Z

Regarding the image loading time, I am going to look deeper tomorrow at:

@Override
public INDArray asMatrix(InputStream is) throws IOException {
    byte[] bytes = IOUtils.toByteArray(is);
    Mat image = imdecode(new Mat(bytes), CV_LOAD_IMAGE_ANYDEPTH | CV_LOAD_IMAGE_ANYCOLOR);
    if (image == null || image.empty()) {
        PIX pix = pixReadMem(bytes, bytes.length);
        if (pix == null) {
            throw new IOException("Could not decode image from input stream");
        }
        image = convert(pix);
        pixDestroy(pix);
    }
    return asMatrix(image);
}

Or, more precisely, at public INDArray asMatrix(Mat image) throws IOException

From:
https://github.com/deeplearning4j/DataVec/blob/master/datavec-data/datavec-data-image/src/main/java/org/datavec/image/loader/NativeImageLoader.java

At first look I have some ideas about how to possibly speed this process up, but I will test tomorrow first.

RobAltena · 2017-12-17T08:23:02Z

This issue has been inactive for a while. Can we close it?

archenroot changed the title ~~Question about applying trained model over image~~ Question about applying trained model over image and its performance for real time processing Jan 23, 2017

RobAltena closed this as completed Dec 19, 2017
