This repository has been archived by the owner on May 31, 2024. It is now read-only.

[TinyYolo v2] Bug in maxpool? #73

Closed
owulveryck opened this issue Jun 4, 2019 · 9 comments
Labels
bug Something isn't working

Comments

@owulveryck
Owner

owulveryck commented Jun 4, 2019

This commit allows the Tiny YOLO v2 model to be compiled and executed with Gorgonia.

Sadly the execution does not give the expected result:

➜  model_zoo_executor git:(tiny-yolov2) ✗  export MODELDIR=~/Documents/tiny_yolov2
➜  model_zoo_executor git:(tiny-yolov2) ✗ go run main.go -model $MODELDIR/model.onnx -input $MODELDIR/test_data_set_0/input_0.pb -output $MODELDIR/test_data_set_0/output_0.pb

        Error Trace:    main.go:72
                                                proc.go:200
                                                asm_amd64.s:1337
        Error:          Max difference between -0.17929432 and 0.056231752 allowed is 0.005, but difference was -0.23552606999874115
        Messages:       the two tensors should be equal.
exit status 1

According to this blog post the architecture should be:

Layer         kernel  stride  output shape
---------------------------------------------
Input                          (416, 416, 3)
Convolution    3×3      1      (416, 416, 16)
MaxPooling     2×2      2      (208, 208, 16)
Convolution    3×3      1      (208, 208, 32)
MaxPooling     2×2      2      (104, 104, 32)
Convolution    3×3      1      (104, 104, 64)
MaxPooling     2×2      2      (52, 52, 64)
Convolution    3×3      1      (52, 52, 128)
MaxPooling     2×2      2      (26, 26, 128)
Convolution    3×3      1      (26, 26, 256)
MaxPooling     2×2      2      (13, 13, 256)
Convolution    3×3      1      (13, 13, 512)
MaxPooling     2×2      1      (13, 13, 512)
Convolution    3×3      1      (13, 13, 1024)
Convolution    3×3      1      (13, 13, 1024)
Convolution    1×1      1      (13, 13, 125)
---------------------------------------------

After adding some logging, the architecture of the decoded network is:

+Convolution             (3, 3)          [1 1]           (1, 16, 416, 416)
+MaxPooling              (2, 2)          [2 2]           (1, 16, 208, 208)
+Convolution             (3, 3)          [1 1]           (1, 32, 208, 208)
+MaxPooling              (2, 2)          [2 2]           (1, 32, 104, 104)
+Convolution             (3, 3)          [1 1]           (1, 64, 104, 104)
+MaxPooling              (2, 2)          [2 2]           (1, 64, 52, 52)
+Convolution             (3, 3)          [1 1]           (1, 128, 52, 52)
+MaxPooling              (2, 2)          [2 2]           (1, 128, 26, 26)
+Convolution             (3, 3)          [1 1]           (1, 256, 26, 26)
+MaxPooling              (2, 2)          [2 2]           (1, 256, 13, 13)
+Convolution             (3, 3)          [1 1]           (1, 512, 13, 13)
-MaxPooling              (2, 2)          [1 1]           (1, 512, 14, 14)
-Convolution             (3, 3)          [1 1]           (1, 1024, 14, 14)
-Convolution             (3, 3)          [1 1]           (1, 1024, 14, 14)
-Convolution             (1, 1)          [1 1]           (1, 125, 14, 14)

The last MaxPooling layer does not produce the expected output size: it yields 14×14 instead of 13×13, and every layer after it inherits the wrong shape.
The padding is computed from the auto_pad attribute and looks correct (padding is [1,1]).

This requires more investigation; it may be a bug in Gorgonia.

Note: the computation is slow, but "make it work, then make it fast".

cc @chewxy

@owulveryck
Owner Author

Maybe related: PR #274 in Gorgonia (by @lynic)

@chewxy

chewxy commented Jun 4, 2019

Does YOLO use overlapping pooling, i.e. max pooling with a stride other than the kernel size?
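
To make the question concrete, here is a minimal sketch of 1-D max pooling without padding. `maxPool1D` is a hypothetical helper written for illustration, not a Gorgonia function. With a stride equal to the kernel size the windows are disjoint; with a smaller stride they overlap.

```go
package main

import "fmt"

// maxPool1D applies max pooling without padding over a 1-D slice.
// It assumes kernel >= 1 and stride >= 1.
func maxPool1D(in []float64, kernel, stride int) []float64 {
	var out []float64
	for start := 0; start+kernel <= len(in); start += stride {
		m := in[start]
		for _, v := range in[start+1 : start+kernel] {
			if v > m {
				m = v
			}
		}
		out = append(out, m)
	}
	return out
}

func main() {
	x := []float64{1, 3, 2, 5, 4}
	fmt.Println(maxPool1D(x, 2, 2)) // stride == kernel: disjoint windows
	fmt.Println(maxPool1D(x, 2, 1)) // stride < kernel: overlapping windows
}
```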

@owulveryck
Owner Author

owulveryck commented Jun 5, 2019

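@chewxy the stride of the last MaxPooling operator is [1 1] (the third column of the listing above) with a (2, 2) kernel, so yes, that layer uses overlapping windows.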

@owulveryck
Owner Author

I think I got it; the ONNX spec states that a ceil_mode attribute can change how the output shape is computed:

ceil_mode : int (default is 0)
Whether to use ceil or floor (default) to compute the output shape.

The calcShape method in Gorgonia does not take this into account, so its shape calculation probably differs from what ONNX expects.

I will run more tests, but if this is the case, I will have to find a smart way to implement it without breaking Gorgonia's API 🧐
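
For reference, a minimal sketch of the usual pooling output-size formula along one spatial axis, with the floor/ceil switch that ceil_mode describes. `poolOutputSize` is a hypothetical helper, not Gorgonia's calcShape, and it assumes the padded input is at least as large as the kernel.

```go
package main

import "fmt"

// poolOutputSize returns the output length along one spatial axis.
// Assumption: in + padBegin + padEnd >= kernel and stride >= 1.
func poolOutputSize(in, kernel, stride, padBegin, padEnd int, ceilMode bool) int {
	span := in + padBegin + padEnd - kernel
	if ceilMode {
		// Integer ceiling of span/stride.
		return (span+stride-1)/stride + 1
	}
	// Integer division floors for non-negative operands.
	return span/stride + 1
}

func main() {
	// Last MaxPooling layer of the listing: input 13, kernel 2, stride 1, padding [1,1].
	fmt.Println(poolOutputSize(13, 2, 1, 1, 1, false))
	fmt.Println(poolOutputSize(13, 2, 1, 1, 1, true))
	// A case where the two modes differ: stride 2, no padding.
	fmt.Println(poolOutputSize(13, 2, 2, 0, 0, false))
	fmt.Println(poolOutputSize(13, 2, 2, 0, 0, true))
}
```

With a stride of 1 the division is exact, so floor and ceil agree and both give 14 for the failing layer. Under this formula, ceil_mode alone would therefore not turn the 14 into the expected 13.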

@owulveryck
Owner Author

I had not noticed that the SAME_UPPER implementation was breaking the gorgonnx unit tests:

--- FAIL: TestONNX (0.04s)
    --- FAIL: TestONNX/TestMaxpool2dSameUpper (0.00s)
        test_structure.go:78:
                Error Trace:    test_structure.go:135
                Error:          Max difference between 1.7640524 and 0.978738 allowed is 1e-06, but difference was 0.7853143811225891
                Messages:       the two tensors should be equal.
    --- FAIL: TestONNX/TestMaxpool2dPrecomputedSameUpper (0.00s)
        test_structure.go:78:
                Error Trace:    test_structure.go:135
                Error:          Max difference between 1 and 7 allowed is 1e-06, but difference was -6
                Messages:       the two tensors should be equal.
FAIL

Let's fix this before going further.

@owulveryck
Owner Author

Probably related to ONNX issue #1113

@owulveryck
Owner Author

The problem is linked to #74: the padding is asymmetric, and asymmetric padding is not implemented in Gorgonia.
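
A sketch of why the padding ends up asymmetric, following the SAME_UPPER / SAME_LOWER rule as described in the ONNX MaxPool documentation (dilation ignored). `samePadding` and `outputSize` are hypothetical helpers for illustration, not gorgonnx or Gorgonia functions.

```go
package main

import "fmt"

// samePadding returns the begin/end padding for one spatial axis under
// auto_pad SAME_UPPER (upper == true) or SAME_LOWER (upper == false).
// When the total padding is odd, the extra unit goes at the end for
// SAME_UPPER and at the beginning for SAME_LOWER.
func samePadding(in, kernel, stride int, upper bool) (begin, end int) {
	out := (in + stride - 1) / stride // ceil(in / stride)
	total := (out-1)*stride + kernel - in
	if total < 0 {
		total = 0
	}
	small, large := total/2, total-total/2
	if upper {
		return small, large
	}
	return large, small
}

// outputSize is the floor-mode output length for explicit begin/end padding.
func outputSize(in, kernel, stride, begin, end int) int {
	return (in+begin+end-kernel)/stride + 1
}

func main() {
	// Last MaxPooling layer: input 13, kernel 2, stride 1.
	begin, end := samePadding(13, 2, 1, true)
	fmt.Println(begin, end, outputSize(13, 2, 1, begin, end))
	// Symmetric padding of 1 on both sides, as in the failing run.
	fmt.Println(outputSize(13, 2, 1, 1, 1))
}
```

For this layer the total padding is a single unit, placed at the end only (0 before, 1 after), which keeps the output at 13. Padding both sides by 1 instead yields the 14 seen in the decoded network.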

@owulveryck
Owner Author

This should be fixed by PR #80 and by PR #295 in Gorgonia.

@owulveryck
Owner Author

Tiny YOLO v2 is working; let's make it fast now!
