Optimisation in NewUniqueNode may be buggy #384
Hmm, I thought I added the garbage cleanup on return to pool. Let me look into this further.
You're right, the cleanup happens here: line 16 in commit 2d4605d.
This is really weird.
See my debugging screenshot:
How is this possible? Note: the panic message is consistent with this: it reports 4 dimensions but a shape equal to `()`.
That message means that the …
the issue is likely in …
What do we do with this issue? I'd like to close it, but I do not want to lose your (very valuable) comment.
The comments are mostly opinion. Should not be taken to be gospel. We can write a better version on the docs site once the ideas have stabilized. In the meantime this has gone into my personal org files. So... there.
Besides the problem in reductionInferShape, WithShape seems to fail whenever the node is created using NewUniqueNode, since newNode will borrow a node whose node.shape is initialized to nil. This always makes nd == 0. See the flow:
* Make reductionInferShape conservative to fix #384

  reductionInferShape currently doesn't respect `along`: it aggressively squeezes dimensions. Not only does this affect normal tensor operations, it also sometimes breaks the backprop autoDiff algorithm when the network contains BroadcastAdd, resulting in a crash when calling Grad(). This change strictly respects the `along` parameter, e.g.:
  - (100, 1) along 0 reduces to shape (1) instead of ()
  - (1, 64, 1, 64) along 3 reduces to (1, 64, 1)
  - (64, 1, 3, 2) along (2, 3) reduces to (64, 1)

  Fixed unit tests.

* Remove inconsistent dimension for Sum op

  After changing reductionType to subtract len(along) from the reduction op, SumOp's dimension needs to be adjusted with it. Otherwise SymDiff will crash for Sum in calcBroadcastShap.
I think this is related to #373 and #375
I made a simple test, as described in #375:
After some investigation, the code is panicking because of a call to WithShape here: gorgonia/node.go, lines 210 to 212 in 2d4605d, which is called here: gorgonia/op.go, line 202 in 2d4605d.

The problem is in the function NewUniqueNode, which calls newNode, which in turn calls n.borrowNode for optimization: gorgonia/node.go, line 259 in 2d4605d. It then applies the WithShape function here: gorgonia/node.go, line 264 in 2d4605d.

Therefore, all the checks are performed on a node that is borrowed and full of rubbish!
In the example there is only one node to be borrowed, so the panic always produces the same result; in a larger program, however, the problem could appear sporadically.