Batch normalization gradients #6398

Open
botev opened this issue Sep 8, 2017 · 5 comments

Comments

@botev
Contributor

botev commented Sep 8, 2017

While trying to run adversarial attacks on a batch-normalized net, I somehow got this:

Traceback (most recent call last):
  File "scripts/adverserial_samples_targeted_sfgs.py", line 213, in <module>
    main(**vars(parser.parse_args()))
  File "scripts/adverserial_samples_targeted_sfgs.py", line 146, in main
    grad = T.grad(objectives.sum(), adv_samples)
  File "/home/abotev/work/python/Theano/theano/gradient.py", line 605, in grad
    grad_dict, wrt, cost_name)
  File "/home/abotev/work/python/Theano/theano/gradient.py", line 1371, in _populate_grad_dict
    rval = [access_grad_cache(elem) for elem in wrt]
  File "/home/abotev/work/python/Theano/theano/gradient.py", line 1371, in <listcomp>
    rval = [access_grad_cache(elem) for elem in wrt]
  File "/home/abotev/work/python/Theano/theano/gradient.py", line 1326, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "/home/abotev/work/python/Theano/theano/gradient.py", line 1021, in access_term_cache
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/home/abotev/work/python/Theano/theano/gradient.py", line 1021, in <listcomp>
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/home/abotev/work/python/Theano/theano/gradient.py", line 1326, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "/home/abotev/work/python/Theano/theano/gradient.py", line 1162, in access_term_cache
    new_output_grads)
  File "/home/abotev/work/python/Theano/theano/scan_module/scan_op.py", line 2126, in L_op
    dC_dinps_t = compute_all_gradients(known_grads)
  File "/home/abotev/work/python/Theano/theano/scan_module/scan_op.py", line 2048, in compute_all_gradients
    null_gradients='return')
  File "/home/abotev/work/python/Theano/theano/gradient.py", line 605, in grad
    grad_dict, wrt, cost_name)
  File "/home/abotev/work/python/Theano/theano/gradient.py", line 1371, in _populate_grad_dict
File "/home/abotev/work/python/Theano/theano/tensor/nnet/bn.py", line 598, in make_node
    dy = as_tensor_variable(dy)
  File "/home/abotev/work/python/Theano/theano/tensor/basic.py", line 158, in as_tensor_variable
    "Variable type field must be a TensorType.", x, x.type)
theano.tensor.var.AsTensorError: ('Variable type field must be a TensorType.', <DisconnectedType>, <theano.gradient.DisconnectedType object at 0x7f17d56766a0>)

Any ideas where to look, or any thoughts in general?

@nouiz
Member

nouiz commented Sep 8, 2017 via email

@botev
Contributor Author

botev commented Sep 8, 2017

What I'm taking the gradient with respect to is a shared variable containing a small number of images, so technically that should not be the issue. Also, it is strange that the error surfaces all the way down in one of the BNGrad Ops rather than somewhere earlier, if that were the problem, right? I think I'm using Theano's batch_normalization_test; is it possible that it does not have a gradient and only the train version does?
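
For reference, a minimal check of that last question might look like this (just a sketch with toy shapes; if the gradient graph builds, test-mode BN does define a gradient):

import numpy as np
import theano
import theano.tensor as T

x = T.fmatrix("x")
gamma, beta = T.fvector("gamma"), T.fvector("beta")
mean, var = T.fvector("mean"), T.fvector("var")
# test-mode batch norm with fixed statistics
out = T.nnet.bn.batch_normalization_test(x, gamma, beta, mean, var)
g = T.grad(out.sum(), x)  # builds only if test-mode BN has a gradient
f = theano.function([x, gamma, beta, mean, var], g)
f(np.ones((2, 3), "float32"), np.ones(3, "float32"), np.zeros(3, "float32"),
  np.zeros(3, "float32"), np.ones(3, "float32"))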

@nouiz
Member

nouiz commented Sep 8, 2017 via email

@botev
Contributor Author

botev commented Sep 9, 2017

Not really; I don't have a simple example at the moment. I can try to produce one, but in short, I have a wide ResNet which is evaluated inside a scan for MC dropout. I'll try setting the number of samples to 1 and unrolling it instead of using scan, to see if I get anything more informative.
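
Roughly, the structure is something like the following (a toy stand-in for the real network: a single batch-normalized layer and a made-up dropout rate, just to show where scan enters the picture):

import theano
import theano.tensor as T
from theano.tensor.shared_randomstreams import RandomStreams

srng = RandomStreams(seed=1234)
x = T.fmatrix("x")
gamma, beta = T.fvector("gamma"), T.fvector("beta")
mean, var = T.fvector("mean"), T.fvector("var")

def mc_step(inp, g_, b_, m_, v_):
    # one MC-dropout forward pass through a toy batch-normalized layer
    mask = srng.binomial(size=inp.shape, p=0.5, dtype=inp.dtype)
    return T.nnet.bn.batch_normalization_test(inp * mask, g_, b_, m_, v_)

outs, updates = theano.scan(mc_step, n_steps=10,
                            non_sequences=[x, gamma, beta, mean, var])
grad = T.grad(outs.mean(), x)  # gradient through the scan, as in the failing script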

@botev
Contributor Author

botev commented Sep 11, 2017

@nouiz Hmm, this looks to be the same issue as the OpFromGraph error in #6400. Code to reproduce it:

import theano.tensor as T

x = T.fmatrix("x")
gamma = T.fvector("g")
beta = T.fvector("beta")
mean = T.vector("mean")
var = T.vector("var")
# With running averages given, batch_normalization_train returns the normalized
# output, the batch mean, the inverse std, and the updated running mean/var.
bn, m, _, _, _ = T.nnet.bn.batch_normalization_train(x, gamma, beta, running_mean=mean, running_var=var)
s = T.grad(T.sum(bn), x)
k = T.Lop(s, x, x)  # raises the AsTensorError above
# Or this also raises it
s = T.grad(T.sum(m), x)

I think this might be a general issue with Theano when you take the gradient of an Op which has multiple outputs but only one of the outputs plays a role in the cost.
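
If that diagnosis is right, a crude workaround (just a guess, not a verified fix) would be to keep every output of the op connected to the cost with a zero coefficient, so none of their output gradients come back as DisconnectedType:

import theano.tensor as T

x = T.fmatrix("x")
gamma = T.fvector("g")
beta = T.fvector("beta")
mean = T.vector("mean")
var = T.vector("var")
bn, m, inv_std, new_mean, new_var = T.nnet.bn.batch_normalization_train(
    x, gamma, beta, running_mean=mean, running_var=var)
# Touch every output so all of them stay connected to the cost.
cost = T.sum(bn) + 0.0 * (m.sum() + inv_std.sum() + new_mean.sum() + new_var.sum())
s = T.grad(cost, x)
k = T.Lop(s, x, x)  # the second derivative that failed above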
