BadOptimization when using gpuarray backend in DebugMode #5216

Closed
bbudescu opened this issue Nov 11, 2016 · 10 comments · Fixed by #5387

@bbudescu
Contributor

When using the 'cuda' GPU backend in DebugMode, function compilation fails here https://github.com/Theano/Theano/blob/c0548210e11603141d70250871de8191e0a37037/theano/gpuarray/dnn.py#L216, because the expected and obtained outputs are both PyCapsule objects containing NULL pointers (which, I assume, are supposed to be cudnnHandle_t's), and they compare as different. As a result, BadOptimization is raised.
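
To illustrate why two such capsules can compare as different, here is a minimal sketch that is independent of Theano. It assumes CPython and uses ctypes to call the C-API functions PyCapsule_New and PyCapsule_GetPointer; the names old_value, new_value and buf are made up for the example. As far as I can tell, PyCapsule defines no value-based equality, so `==` falls back to object identity even when both capsules wrap the same pointer. Note also that, to my understanding of CPython's capsule repr, the "NULL" it prints is the capsule's name rather than the wrapped pointer.

```python
import ctypes

# Bind the CPython C-API functions for creating and reading capsules.
_new = ctypes.pythonapi.PyCapsule_New
_new.restype = ctypes.py_object
_new.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_void_p]

_get = ctypes.pythonapi.PyCapsule_GetPointer
_get.restype = ctypes.c_void_p
_get.argtypes = [ctypes.py_object, ctypes.c_char_p]

# A small buffer stands in for a real handle (hypothetical stand-in, not a cuDNN handle).
buf = ctypes.create_string_buffer(8)
addr = ctypes.addressof(buf)

# Two unnamed capsules (name=None -> NULL) wrapping the SAME non-NULL pointer.
old_value = _new(addr, None, None)
new_value = _new(addr, None, None)

same_pointer = _get(old_value, None) == _get(new_value, None)
equal = old_value == new_value  # identity comparison, not pointer comparison

print(repr(old_value))
print("same pointer:", same_pointer, "| capsules equal:", equal)
```

Under these assumptions, any checker that compares the capsule objects directly with `==` would report a mismatch, even though the underlying pointers are identical.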

@nouiz
Member

nouiz commented Nov 14, 2016

Can you give the full error message?
Do you have code to reproduce that?

@abergeron, do you have time to look at this?


@Sentient07
Contributor

From what is reported (with reference to the line number of the error), it seems that @bbudescu is running into CUDNN_STATUS_INTERNAL_ERROR. Did you try what @nouiz suggested in issue #5206?

@bbudescu
Contributor Author

Hi, sorry for the delays (both the one until answering now, and the one until I'll be able to come up with a minimal scenario in which the error is raised). I'll get back on it as soon as I have the time. This will be another few days, I guess. The model I'm currently trying out is based on this: https://github.com/jocicmarko/ultrasound-nerve-segmentation.

As for #5206, as far as I understand, that one happens all the time, whereas mine only surfaces in DebugMode. If it's of any help, I'm currently experimenting on an Ubuntu 16.04 machine with CUDA 7.5 and cuDNN 5005.

@nouiz
Member

nouiz commented Nov 25, 2016

Do you still have the full error message? Without it, we can't do anything about this.

@bbudescu
Contributor Author

bbudescu commented Dec 6, 2016

Hi, as anticipated, I have to apologize for the delay in replying. Compiling the net in debug mode with verbose output takes a lot of time, and I hadn't gotten around to it until now. I didn't step through in the debugger again to the location I pointed out before, but I assume it's the same, since the stderr output looks the same.

Note that I have also saved the whole stdout / stderr output and Theano's verbose logging output, but the two files total about 14 MB, so I just copied the exception's message and a few frames above it. If you also need the verbose output or my Theano config flags, please let me know.

This is the error I get (ignore the "timestamp - WARNING:" and the "(std streams)"):
...
2016-12-06 19:38:26,925 - WARNING: File "/home/bbudescu/build/src/keras/keras/engine/training.py", line 1374, in fit_generator (std streams)
2016-12-06 19:38:26,926 - WARNING: self._make_train_function() (std streams)
2016-12-06 19:38:26,927 - WARNING: File "/home/bbudescu/build/src/keras/keras/engine/training.py", line 726, in _make_train_function (std streams)
2016-12-06 19:38:26,927 - WARNING: **self._function_kwargs) (std streams)
2016-12-06 19:38:26,928 - WARNING: File "/home/bbudescu/build/src/keras/keras/backend/theano_backend.py", line 821, in function (std streams)
2016-12-06 19:38:26,928 - WARNING: return Function(inputs, outputs, updates=updates, **kwargs) (std streams)
2016-12-06 19:38:26,929 - WARNING: File "/home/bbudescu/build/src/keras/keras/backend/theano_backend.py", line 807, in __init__ (std streams)
2016-12-06 19:38:26,929 - WARNING: **kwargs) (std streams)
2016-12-06 19:38:26,930 - WARNING: File "/home/bbudescu/build/src/theano/theano/compile/function.py", line 326, in function (std streams)
2016-12-06 19:38:26,930 - WARNING: output_keys=output_keys) (std streams)
2016-12-06 19:38:26,931 - WARNING: File "/home/bbudescu/build/src/theano/theano/compile/pfunc.py", line 486, in pfunc (std streams)
2016-12-06 19:38:26,931 - WARNING: output_keys=output_keys) (std streams)
2016-12-06 19:38:26,932 - WARNING: File "/home/bbudescu/build/src/theano/theano/compile/function_module.py", line 1784, in orig_function (std streams)
2016-12-06 19:38:26,933 - WARNING: defaults) (std streams)
2016-12-06 19:38:26,933 - WARNING: File "/home/bbudescu/build/src/theano/theano/compile/debugmode.py", line 2570, in create (std streams)
2016-12-06 19:38:26,934 - WARNING: storage_map=storage_map) (std streams)
2016-12-06 19:38:26,935 - WARNING: File "/home/bbudescu/build/src/theano/theano/gof/link.py", line 699, in make_thunk (std streams)
2016-12-06 19:38:26,936 - WARNING: storage_map=storage_map)[:3] (std streams)
2016-12-06 19:38:26,936 - WARNING: File "/home/bbudescu/build/src/theano/theano/compile/debugmode.py", line 1855, in make_all (std streams)
2016-12-06 19:38:26,937 - WARNING: no_recycling) (std streams)
2016-12-06 19:38:26,937 - WARNING: File "/home/bbudescu/build/src/theano/theano/gof/op.py", line 824, in make_c_thunk (std streams)
2016-12-06 19:38:26,938 - WARNING: no_recycling=e_no_recycling) (std streams)
2016-12-06 19:38:26,938 - WARNING: File "/home/bbudescu/build/src/theano/theano/gof/cc.py", line 563, in accept (std streams)
2016-12-06 19:38:26,939 - WARNING: self.fetch_variables() (std streams)
2016-12-06 19:38:26,939 - WARNING: File "/home/bbudescu/build/src/theano/theano/gof/cc.py", line 589, in fetch_variables (std streams)
2016-12-06 19:38:26,940 - WARNING: params = node.run_params() (std streams)
2016-12-06 19:38:26,940 - WARNING: File "/home/bbudescu/build/src/theano/theano/gof/graph.py", line 129, in run_params (std streams)
2016-12-06 19:38:26,941 - WARNING: return self.op.get_params(self) (std streams)
2016-12-06 19:38:26,941 - WARNING: File "/home/bbudescu/build/src/theano/theano/gpuarray/dnn.py", line 216, in get_params (std streams)
2016-12-06 19:38:26,942 - WARNING: res = handle_type.make_value(ptr) (std streams)
2016-12-06 19:38:26,942 - WARNING: File "/home/bbudescu/build/src/theano/theano/gof/type.py", line 698, in make_value (std streams)
2016-12-06 19:38:26,943 - WARNING: return self._get_func()(ptr) (std streams)
2016-12-06 19:38:26,943 - WARNING: File "/home/bbudescu/build/src/theano/theano/compile/function_module.py", line 873, in __call__ (std streams)
2016-12-06 19:38:26,944 - WARNING: self.fn() if output_subset is None else\ (std streams)
2016-12-06 19:38:26,944 - WARNING: File "/home/bbudescu/build/src/theano/theano/compile/debugmode.py", line 2305, in deco (std streams)
2016-12-06 19:38:26,945 - WARNING: return f() (std streams)
2016-12-06 19:38:26,946 - WARNING: File "/home/bbudescu/build/src/theano/theano/compile/debugmode.py", line 2230, in f (std streams)
2016-12-06 19:38:26,947 - WARNING: r_vals) (std streams)
2016-12-06 19:38:26,947 - WARNING: File "/home/bbudescu/build/src/theano/theano/compile/debugmode.py", line 1078, in _find_bad_optimizations0 (std streams)
2016-12-06 19:38:26,948 - WARNING: new_graph=new_graph_str) (std streams)
2016-12-06 19:38:26,948 - WARNING: theano.compile.debugmode (std streams)
2016-12-06 19:38:26,948 - WARNING: . (std streams)
2016-12-06 19:38:26,949 - WARNING: BadOptimization (std streams)
2016-12-06 19:38:26,950 - WARNING: : (std streams)
2016-12-06 19:38:26,950 - WARNING: BadOptimization Error
Variable: id 140324431886352 _make_cdata{rtype=<theano.gof.type.CDataType object at 0x7f9ffce8eb10>}.0
Op _make_cdata{rtype=<theano.gof.type.CDataType object at 0x7f9ffce8eb10>}(Cast{uint64}.0)
Value Type: <type 'PyCapsule'>
Old Value: <capsule object NULL at 0x7f9fd4d5b7e0>
New Value: <capsule object NULL at 0x7f9fd2fb8fc0>
Reason: GraphToGPU
Old Graph:
_make_cdata{rtype=<theano.gof.type.CDataType object at 0x7f9ffce8eb10>} [id A] ''
|Cast{uint64} [id B] ''
| [id C]

New Graph:
_make_cdata{rtype=<theano.gof.type.CDataType object at 0x7f9ffce8eb10>} [id D] ''
|Cast{uint64} [id E] ''
| [id C]

Hint: relax the tolerance by setting tensor.cmp_sloppy=1
or even tensor.cmp_sloppy=2 for less-strict comparison (std streams)

@bbudescu
Contributor Author

bbudescu commented Dec 8, 2016

Would any other output help for identifying the problem (e.g. optimizer verbose log / optimizer profiling etc.)?

@abergeron
Member

abergeron commented Jan 5, 2017

There are two different problems here:

  • DebugMode is applied to an internal CData function.
  • GraphToGPU does a replacement even when nothing changes in the graph.

This should prevent DebugMode from applying to that internal function: #5387
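
The second point can be pictured with a toy rewrite pass. This is only a sketch under stated assumptions, not Theano's GraphToGPU implementation: Node, rewrite, transfer and TO_GPU are hypothetical names invented for the illustration. The idea is that a pass which rebuilds the graph could hand back the original node object whenever neither the op nor any input changed, so no replacement would be recorded and a checker would have nothing to compare.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Node:
    """Toy graph node: an op name plus its input nodes (hypothetical model)."""
    op: str
    inputs: Tuple["Node", ...] = ()


def rewrite(node: Node, transfer: Callable[[str], str]) -> Node:
    """Rebuild the graph bottom-up, returning the SAME object when nothing changed."""
    new_inputs = tuple(rewrite(i, transfer) for i in node.inputs)
    new_op = transfer(node.op)
    unchanged = new_op == node.op and all(
        a is b for a, b in zip(new_inputs, node.inputs)
    )
    if unchanged:
        return node  # no replacement recorded for this node
    return Node(new_op, new_inputs)


# Hypothetical CPU-to-GPU op mapping; ops not listed are left alone.
TO_GPU = {"Conv": "GpuConv"}


def transfer(op: str) -> str:
    return TO_GPU.get(op, op)


# A graph shaped like the one in the error message: nothing to move to the GPU.
graph = Node("make_cdata", (Node("Cast{uint64}", (Node("input"),)),))
rewritten = rewrite(graph, transfer)

# A graph where the pass really does change something.
changed = rewrite(Node("Conv", (Node("input"),)), transfer)

print(rewritten is graph)  # the untouched graph is returned as-is
print(changed.op)
```

In this toy model, the make_cdata graph comes back as the very same object, so an old-value versus new-value comparison like the one in the error above would never be triggered for it.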

@nouiz
Member

nouiz commented Jan 9, 2017

I merged the PR, so updating Theano to the dev version should fix it. If you still see the problem, tell us.

@nouiz
Member

nouiz commented Jan 9, 2017

@abergeron, do you have a fix for the second part, i.e. GraphToGPU doing a replacement when nothing has changed?

@abergeron
Member

No, I didn't touch that.
