Difference between CPU and GPU generated models? #1346

mortengryning · 2017-01-23T14:41:13Z

Hi,

I'm trying to evaluate a custom RCNN model using the new managed C# library.

When I use an old model that I have trained in version 3 on the CPU, my code runs and I get the expected results.

However, when I run the same code on a model trained on the GPU, it runs but returns strange results. This happens with version 3, version 8 and version 9 of CNTK.

When I evaluate the network on the GPU using the Python scripts B3_EvaluateOutput... I get the same results as with a model trained on the CPU. I have also checked that the ROIs are the same.

Are there some differences between the models that would explain why I can only evaluate the model trained on the CPU?

My code is the following:

```csharp
        DeviceDescriptor device = DeviceDescriptor.GetCPUDevice();

        // Fail-safes
        if (!File.Exists(imagePath))
        {
            return new RCNNReturn(null, string.Format("Error: The test image file '{0}' does not exist.", imagePath));
        }

        Bitmap bmp = null;
        Image img = null;
        List<Box> matches = new List<Box>();

        // Let load failures propagate to the caller with their original stack trace
        modelFunc = Function.LoadModel(modelFileBase, device);

        try
        {
            // Get the inputs
            Variable featuresInputvar = modelFunc.Arguments[0];
            Variable roisInputVar = modelFunc.Arguments[1];
            Variable roisLabelInputVar = modelFunc.Arguments[2];

            Variable ceOutputVar = modelFunc.Outputs[0];
            Variable errsOutputVar = modelFunc.Outputs[1];
            Variable zOutputVar = modelFunc.Outputs[2];

            var inputDataMap = new Dictionary<Variable, Value>();
            var outputDataMap = new Dictionary<Variable, Value>();
            List<List<float>> outputBuffer;

            img = Image.FromFile(imagePath);
            bmp = new Bitmap(img);

            Bitmap padding = bmp.ResizeTo1000WithPadding();
            Bitmap resized = padding.Resize(1000, 1000, true);
            List<float> resizedCHW = resized.ParallelExtractCHW();

            List<float> roiCustom = roi.GetROISForImage(imagePath, roiPath, (int)(roisInputVar.Shape.Dimensions[0] * roisInputVar.Shape.Dimensions[1]));

            if (roiCustom == null || roiCustom.Count == 0)
            {
                throw new Exception("Could not generate ROIs for image");
            }

            // Feed 2 inputs into the network: features and ROIs
            var input1 = Value.CreateBatch(featuresInputvar.Shape, resizedCHW, device);
            var input2 = Value.CreateBatch(roisInputVar.Shape, roiCustom, device);

            inputDataMap.Add(featuresInputvar, input1);
            inputDataMap.Add(roisInputVar, input2);

            // Request 1 output from the network
            outputDataMap.Add(zOutputVar, null);

            // Start evaluation on the device
            modelFunc.Evaluate(inputDataMap, outputDataMap, device);

            // Get the evaluation result as dense output
            outputBuffer = new List<List<float>>();
            outputDataMap[zOutputVar].CopyVariableValueTo(zOutputVar, outputBuffer);

            if (outputBuffer.Count == 0)
            {
                throw new Exception("Output from neural network is empty");
            }

            List<float> floatOutput = outputBuffer[0];

            // The object classes used in the grocery example
            var labels = new[] { "__background__", "rectangle", "triangle", "circle" };
            int numLabels = labels.Length;
            int numRois = floatOutput.Count / numLabels;

            // Each ROI is stored as four values: x, y, width, height
            const int roiStride = 4;

            for (int i = 0; i < numRois; i++)
            {
                var outputForRoi = floatOutput.Skip(i * numLabels).Take(numLabels).ToList();

                // ROI coordinates advance by the coordinate stride, not by the label count
                int roiIndexStart = i * roiStride;

                // Retrieve the predicted label as the argmax over all predictions for the current ROI (background is 0)
                var best = outputForRoi.Select((value, index) => new { Value = value, Index = index })
                    .Aggregate((a, b) => (a.Value > b.Value) ? a : b);

                if (best.Index > 0)
                {
                    matches.Add(new Box()
                    {
                        X = roiCustom[roiIndexStart] * 1000,
                        Y = roiCustom[roiIndexStart + 1] * 1000,
                        Width = roiCustom[roiIndexStart + 2] * 1000,
                        Height = roiCustom[roiIndexStart + 3] * 1000,
                        Category = best.Index,
                        Val = best.Value
                    });
                }
            }

            // Perform non-maximum suppression
            matches = nms.Non_max_supression(matches, 0.01f);
        }
        catch (Exception ex)
        {
            return new RCNNReturn(null, "Exception occurred in RCNN evaluation: " + ex.Message);
        }
        finally
        {
            // Clean up
            if (bmp != null)
                bmp.Dispose();

            if (img != null)
                img.Dispose();
        }

        return new RCNNReturn(matches);
    }
```
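
The per-ROI decoding in the loop above uses two different strides: the flat score vector advances by the number of labels, while the flat ROI list advances by four coordinates (x, y, width, height). The two strides happen to be equal in this example because there are four labels and four coordinates per ROI. A minimal, CNTK-independent sketch of that indexing follows; the class `RoiDecoding` and its members are hypothetical names used only for illustration, and ties are resolved in favour of the lowest label index, which may differ from the `Aggregate` call above.

```csharp
using System;
using System.Collections.Generic;

public static class RoiDecoding
{
    // Assumption: each ROI is stored as four floats (x, y, width, height).
    public const int RoiStride = 4;

    // Returns, for each ROI, the index of its highest-scoring label.
    // 'scores' is the flat network output: numLabels consecutive values per ROI.
    public static int[] ArgMaxPerRoi(IReadOnlyList<float> scores, int numLabels)
    {
        if (numLabels <= 0 || scores.Count % numLabels != 0)
            throw new ArgumentException("scores length must be a multiple of numLabels");

        int numRois = scores.Count / numLabels;
        var result = new int[numRois];

        for (int i = 0; i < numRois; i++)
        {
            int best = 0;
            for (int j = 1; j < numLabels; j++)
            {
                if (scores[i * numLabels + j] > scores[i * numLabels + best])
                    best = j;
            }
            result[i] = best;
        }

        return result;
    }

    // Offset of the first coordinate of the given ROI in the flat ROI list.
    public static int RoiOffset(int roiIndex)
    {
        return roiIndex * RoiStride;
    }
}
```

For example, `RoiDecoding.ArgMaxPerRoi(new float[] { 0.1f, 0.7f, 0.1f, 0.1f, 0.9f, 0.0f, 0.05f, 0.05f }, 4)` returns `{ 1, 0 }`: the first ROI is labelled as class 1 and the second as background.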

zhouwangzw · 2017-01-23T16:33:43Z

@mortengryning I am not sure why you closed the issue. Is it resolved? Does your model use Batch Normalization or the Convolution engine? What is the difference in results between GPU and CPU?

Thanks,
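
One way to put a number on the difference between the two sets of results is to compare the output buffers element by element. The following is a minimal sketch, assuming both outputs have already been copied into float buffers of equal length (for example with `CopyVariableValueTo`, as in the code above). The class `OutputComparison` and the default tolerance of `1e-4` are hypothetical choices for illustration and are not part of CNTK.

```csharp
using System;
using System.Collections.Generic;

public static class OutputComparison
{
    // Largest absolute element-wise difference between two equally sized buffers.
    public static float MaxAbsDifference(IReadOnlyList<float> a, IReadOnlyList<float> b)
    {
        if (a.Count != b.Count)
            throw new ArgumentException("Buffers must have the same length");

        float maxDiff = 0f;
        for (int i = 0; i < a.Count; i++)
            maxDiff = Math.Max(maxDiff, Math.Abs(a[i] - b[i]));

        return maxDiff;
    }

    // True when no element differs by more than 'tolerance' (an assumed threshold).
    public static bool AreClose(IReadOnlyList<float> a, IReadOnlyList<float> b, float tolerance = 1e-4f)
    {
        return MaxAbsDifference(a, b) <= tolerance;
    }
}
```

A small maximum difference would point to ordinary floating-point variation between devices, while a large one would suggest that the inputs, preprocessing, or model files differ.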

patykov · 2017-01-23T19:44:16Z

I also want to know the difference between CPU- and GPU-generated models. I used the pre-trained ResNet model on a CPU-only machine and got terrible results, but it worked as expected on a GPU machine.
I want to know why this happened and whether there is a way to run this model on the CPU.

I've used the ResNet-34 model from: https://github.com/Microsoft/CNTK/tree/v1.7.2/Examples/Image/Miscellaneous/ImageNet/ResNet

Thanks!

mortengryning · 2017-01-23T23:15:01Z

I closed it because the error was my own fault :-). The models work fine on both CPU and GPU. So no issue for me at least.

wendy7707 · 2017-09-04T07:12:37Z

@mortengryning Hello, when I run the above code on the CPU with CNTK 2.1, I get an error that says "TensorOp (binary): The only permitted binary reduction operation is opSum." Would you please tell me how to solve it? Thanks!

mortengryning closed this as completed Jan 23, 2017
