Workspace optimizations and refactoring #4900
COMPLETE BUT REQUIRES ADDITIONAL QA BEFORE MERGING
ND4J PR to go with this: deeplearning4j/nd4j#2823
This PR overhauls workspaces in DL4J, to improve performance and memory use.
In terms of design:
Apr 10, 2018
LGTM overall (of course, what else!). The only real criticism I have is that changing the API so that "almost everything" takes a workspace manager (WSM) seems a little excessive at first glance. I'm sure you had your reasons; maybe it doesn't really work any other way. My hunch says it should've been possible to set a WSM config at the layer and network level.
For instance, in the Keras preprocessor changes you had to change the signature of
Another idea worth exploring is to keep the old methods and wrap them like this:
This way you don't have to pass a "no workspaces" manager manually every time.
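A minimal sketch of the wrapping idea, under stated assumptions: `WorkspaceMgrSketch` and `PreProcessorSketch` are hypothetical stand-ins invented for illustration, not DL4J's actual classes or signatures. The old single-argument method is kept and simply delegates to the new one with a "no workspaces" default.

```java
import java.util.Arrays;

// Hypothetical stand-in for a workspace manager; NOT the DL4J class.
final class WorkspaceMgrSketch {
    private final String name;

    private WorkspaceMgrSketch(String name) {
        this.name = name;
    }

    // Assumed factory for a manager that uses no workspaces at all.
    static WorkspaceMgrSketch noWorkspaces() {
        return new WorkspaceMgrSketch("none");
    }

    // Assumed factory for a manager bound to some named configuration.
    static WorkspaceMgrSketch named(String name) {
        return new WorkspaceMgrSketch(name);
    }

    String name() {
        return name;
    }
}

// Hypothetical preprocessor with both the new and the old signature.
final class PreProcessorSketch {
    // Recorded only so the example can show which manager was used.
    String lastManagerUsed = null;

    // New-style method: the caller passes the manager explicitly.
    double[] preProcess(double[] input, WorkspaceMgrSketch mgr) {
        lastManagerUsed = mgr.name();
        // A real implementation would allocate the result via the manager;
        // here we just copy the input.
        return Arrays.copyOf(input, input.length);
    }

    // Old-style method kept as a thin wrapper with a default manager.
    double[] preProcess(double[] input) {
        return preProcess(input, WorkspaceMgrSketch.noWorkspaces());
    }
}
```

Calling `new PreProcessorSketch().preProcess(data)` would then behave like the explicit call with `WorkspaceMgrSketch.noWorkspaces()`, at the cost of one extra overload per method.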
It isn't the greatest for (developer) usability, sure, though it doesn't impact end users unless they write custom layers etc., as it's internal.
I initially considered a "Layer.setWorkspaceManager" type design, but eventually rejected it as likely to introduce bugs. We'd be continually swapping out workspace managers in the layers (note that different feed-forward methods have very different workspace configs), and if we forget to do it at some point, we can end up with crashes, really hard-to-track-down bugs, and performance issues.
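To illustrate the concern, here is a rough sketch contrasting the two designs. All names (`StatefulLayerSketch`, `ExplicitLayerSketch`, the string-valued "config") are hypothetical and heavily simplified; they are not DL4J code. The point is only that a setter-based design silently reuses whatever manager was set last, while an explicit parameter cannot be forgotten.

```java
// Hypothetical setter-based design: the layer remembers its manager.
final class StatefulLayerSketch {
    // A plain string stands in for a workspace configuration (assumption).
    private String workspaceConfig = "unset";

    void setWorkspaceManager(String config) {
        this.workspaceConfig = config;
    }

    // Uses whatever configuration happens to be set at call time.
    String activate() {
        return "activate[" + workspaceConfig + "]";
    }
}

// Hypothetical explicit design: every call states its configuration.
final class ExplicitLayerSketch {
    String activate(String config) {
        return "activate[" + config + "]";
    }
}

final class WorkspaceDesignDemo {
    // Simulates a training pass followed by an inference pass where the
    // caller forgot to swap the manager on the stateful layer.
    static String forgottenSwap() {
        StatefulLayerSketch layer = new StatefulLayerSketch();
        layer.setWorkspaceManager("training");
        layer.activate();            // training pass, as intended
        return layer.activate();     // inference pass, stale config reused
    }

    // The same two passes with the explicit design; nothing can go stale.
    static String explicitCall() {
        ExplicitLayerSketch layer = new ExplicitLayerSketch();
        layer.activate("training");
        return layer.activate("inference");
    }
}
```

In this sketch `forgottenSwap()` runs the second pass with the training configuration and gives no error, which is the kind of hidden failure described above; `explicitCall()` makes the configuration visible at each call site.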
Good question, but I also considered and rejected it. I did it in one place (time series utils) but I'm not adding this to the layers, again to avoid hidden performance/memory issues. Also we have too many Layer methods already, and I'm very hesitant to add even more :)
That was actually a bug (well, incomplete implementation) on my part, for those keras preprocessors. I've fixed it now. :)