Add make_confidence_report_spsa.py #794

Closed
goodfeli wants to merge 40 commits

Conversation

goodfeli
Contributor

No description provided.

@goodfeli
Contributor Author

@juesato @nottombrown @carlini

I'm trying to clean up SPSA and integrate it into our newest multi-GPU, multi-attack, confidence-aware eval system. I made a few basic changes, such as having it use the same argument names as ProjectedGradientDescent, supporting clip_min and clip_max, etc.

Unfortunately, the core SPSA algorithm itself doesn't really seem to work. It results in NaNs.

Steps to reproduce:

  1. Check out this branch
  2. cd cleverhans_tutorials
  3. python mnist_tutorial_picklable.py
  4. cd ../scripts
  5. python make_confidence_report_spsa.py ../cleverhans_tutorials/clean_model.joblib

This results in a NaN. Any suggestions?

@goodfeli
Contributor Author

I think the NaN may have been because the VM I was using had a GPU hardware or driver problem. After rebooting it, I no longer get the NaN. I do get a different error message though:

File "/home/goodfellow/cleverhans/cleverhans/attacks.py", line 1991, in generate
clip_max=clip_max
File "/home/goodfellow/cleverhans/cleverhans/attacks_tf.py", line 1842, in pgd_attack
back_prop=False)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 3232, in while_loop
return_same_structure)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2952, in BuildLoop
pred, body, original_loop_vars, loop_vars, shape_invariants)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2887, in _BuildLoop
body_result = body(*packed_vars_for_body)
File "/home/goodfellow/cleverhans/cleverhans/attacks_tf.py", line 1816, in loop_body
wrapped_loss_fn, [perturbation], optim_state)
File "/home/goodfellow/cleverhans/cleverhans/attacks_tf.py", line 1583, in minimize
grads = self._compute_gradients(loss_fn, x, optim_state)
File "/home/goodfellow/cleverhans/cleverhans/attacks_tf.py", line 1693, in _compute_gradients
x[0] = tf.reshape(x[0], [1] + static_x_shape[1:])
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 6199, in reshape
"Reshape", tensor=tensor, shape=shape, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/util/deprecation.py", line 454, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 3155, in create_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1717, in init
self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): Input to reshape is a tensor with 1003520 values, but the requested shape has 784
[[Node: while_1/Reshape = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:1"](while_1/Identity_1, while_1/Reshape/shape)]]
[[Node: while_3/assert_greater_equal/Assert/AssertGuard/Assert/Switch_1/_349 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:3", send_device_incarnation=1, tensor_name="edge_1182_while_3/assert_greater_equal/Assert/AssertGuard/Assert/Switch_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

@juesato
Contributor

juesato commented Oct 16, 2018

Is this currently using a batch size of 1280 by any chance? (The reshape error reports 1003520 values for a requested shape of 784, and 1003520 = 1280 × 784, where 784 is a flattened 28 × 28 MNIST image.)

SPSA assumes a batch size of 1 because, internally, the implementation already uses batching: it evaluates the model on many noised versions of the current input to form the finite-difference gradient estimates. Since this already maxes out GPU utilization, I just didn't bother implementing the reshaping necessary to support attacking a batch of multiple inputs. This used to error out, but I changed it in #509 to assume the Tensor being passed in has batch size 1 and then reshape appropriately.
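
Roughly what the internals look like, as a simplified sketch rather than the actual cleverhans code (loss_fn, delta, and num_samples are illustrative names):

# Simplified sketch of why SPSA consumes the batch dimension internally.
# Not the actual cleverhans implementation; loss_fn, x (a single image of
# shape [1, H, W, C]), delta, and num_samples are illustrative names.
import tensorflow as tf

def spsa_gradient_estimate(loss_fn, x, delta=0.01, num_samples=128):
  # Draw random +/-1 perturbation directions; the "batch" dimension here
  # is num_samples noised copies of the single input x, not multiple inputs.
  dims = [num_samples] + x.shape.as_list()[1:]
  signs = tf.sign(tf.random_uniform(dims, minval=-1., maxval=1.))
  # Evaluate the loss at x + delta*signs and x - delta*signs in one batch.
  losses_pos = loss_fn(x + delta * signs)   # shape [num_samples]
  losses_neg = loss_fn(x - delta * signs)   # shape [num_samples]
  # Finite-difference estimate, averaged over the sampled directions.
  diffs = tf.reshape(losses_pos - losses_neg,
                     [num_samples] + [1] * (len(dims) - 1))
  grad_est = tf.reduce_mean(diffs / (2. * delta) * signs,
                            axis=0, keepdims=True)  # same shape as x
  return grad_est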

Harini pointed out to me today that SPSA only supporting batch size 1 isn't actually in the documentation. That's totally my bad; I'll send a PR tomorrow.

@goodfeli
Contributor Author

OK, thanks. Do you think it would be possible to support a batched interface by just wrapping what's currently there in a tf.while_loop that makes one adversarial example at a time?
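
Something along these lines, roughly (a sketch only, not tested; attack_single stands in for the existing batch-size-1 SPSA graph):

# Rough sketch of the kind of wrapper I mean; attack_single stands in for
# the existing single-example SPSA graph.
import tensorflow as tf

def batched_generate(x, y, attack_single):
  # tf.map_fn builds a while_loop over the batch dimension, running the
  # existing single-example attack on each (x_i, y_i) pair in turn.
  def one_example(inputs):
    x_i, y_i = inputs
    adv_i = attack_single(tf.expand_dims(x_i, 0), tf.expand_dims(y_i, 0))
    return tf.squeeze(adv_i, axis=0)
  return tf.map_fn(one_example, (x, y), dtype=x.dtype)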

@juesato
Contributor

juesato commented Oct 16, 2018

My original reasoning was (copying from #509):

"""
The TF implementation still only supports batch size 1. It's implemented this way just for simplicity: when I used it, I didn't need more than batch size 1. There wouldn't really be any gain in efficiency from batching, since SPSA is already internally batching the function evaluations of f(x + delta) for all the different delta values at each iteration.

The TF implementation could of course support a tf.while_loop on the outside to loop over the batch dimension, but that seems overly complex to me, since the computation is really handling each image separately, and it's easy to handle outside TF. I think it makes sense for the numpy interface to support batches, so I changed that.
"""

I'm happy to implement this though, if you feel strongly about it.
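
(For reference, the "handle it outside TF" approach is just a short numpy loop. This is a sketch; the generate_np keyword arguments here are illustrative rather than the exact signature.)

# Sketch of handling batching outside TF: loop over the batch in numpy and
# attack one example at a time. The keyword arguments are illustrative.
import numpy as np

def attack_batch(spsa, x_batch, y_batch, **attack_params):
  adv = []
  for i in range(len(x_batch)):
    # Each call sees a batch of exactly one input, which is all the
    # TF graph supports.
    adv_i = spsa.generate_np(x_batch[i:i + 1],
                             y=y_batch[i:i + 1],
                             **attack_params)
    adv.append(adv_i)
  return np.concatenate(adv, axis=0)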

@goodfeli changed the title from "WIP: Add make_confidence_report_spsa.py" to "Add make_confidence_report_spsa.py" on Oct 16, 2018
@goodfeli
Contributor Author

This is no longer WIP; it's done once the tests pass.

@goodfeli
Contributor Author

Looks like I'll need some more help with this.

The generate_np caching system is a big mess. While trying to get the tests to pass, I've cleaned up some problems with it, but I'm still left with two failures.

The SPSA attack strength test for the generate method is passing, but both tests for the generate_np method are failing:

FAIL: test_attack_strength_np (test_attacks.TestSPSA)

Traceback (most recent call last):
File "/home/goodfellow/cleverhans/tests_tf/test_attacks.py", line 304, in test_attack_strength_np
self.assertLess(np.mean(feed_labs == new_labs), 0.1)
AssertionError: 0.4 not less than 0.1

======================================================================
FAIL: test_attack_strength_np_batched (test_attacks.TestSPSA)

Traceback (most recent call last):
File "/home/goodfellow/cleverhans/tests_tf/test_attacks.py", line 318, in test_attack_strength_np_batched
self.assertLess(np.mean(feed_labs == new_labs), 0.1)
AssertionError: 0.5 not less than 0.1

That being said, the right arguments seem to be coming through to generate, it's not getting a false cache hit, etc. Any ideas what's wrong with it?

@goodfeli
Contributor Author

Never mind, I found why the tests were failing: I was detecting the format of the labels incorrectly. I was testing their shape and assuming that int labels would have only one dimension, but they were actually getting passed in with an extra axis of size 1. I changed it to assume that if the data is ints we need to call one_hot, and otherwise we don't.
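
The check now looks roughly like this (a simplified sketch rather than the exact diff; maybe_one_hot is an illustrative name):

# Simplified sketch of the fix: decide whether to one-hot encode based on
# dtype rather than on the number of dimensions, since int labels can
# arrive with a trailing axis of size 1.
import numpy as np

def maybe_one_hot(labels, nb_classes):
  if np.issubdtype(labels.dtype, np.integer):
    # Integer class indices: flatten any trailing singleton axis and
    # convert to one-hot.
    flat = labels.reshape(-1)
    one_hot = np.zeros((flat.shape[0], nb_classes), dtype=np.float32)
    one_hot[np.arange(flat.shape[0]), flat] = 1.
    return one_hot
  # Already one-hot (float) labels: pass through unchanged.
  return labels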

@juesato
Contributor

juesato commented Oct 17, 2018

Just looked at the change; that seems good to me. Thanks!

@goodfeli
Contributor Author

@juesato: do you mean I should go ahead and merge the branch? I wasn't sure whether you meant your comment to be a whole review or just a comment on one thing.

@juesato
Contributor

juesato commented Oct 18, 2018

I'm happy with the changes related to SPSA (in attacks.py, attacks_tf.py, test_attacks_tf.py, utils_tf.py). I haven't looked at the other additions. Do you want me to review them?

@goodfeli
Contributor Author

Thanks!

You're certainly welcome to review the rest of the PR if you're feeling charitable, but that seems like a lot to ask.

@carlini, can you find someone to review the remaining files? (attack_bundling.py, utils_tf.py, scripts/make_confidence_report_spsa.py, tests_tf/test_attacks.py, tests_tf/test_attacks_tf.py)

@goodfeli
Contributor Author

Closing this, and re-opening with a rebased version: #820
