
Conversation

@danielgordon10

… time on repeated calls.

@danielgordon10 danielgordon10 changed the title from "Using lazy initialization to reuse ordered dict/list creations to save…" to "Use lazy initialization to reuse ordered dict/list creations to save…" on Mar 25, 2016
@seanbell

Is it possible to make this either return an immutable object or make a copy each time?
This change will break any code that relies on the original behavior (of always returning a copy), in exchange for a very small gain.
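
For reference, one way to keep handing out a cached object while still preventing accidental mutation would be a read-only mapping view, e.g. types.MappingProxyType (a minimal sketch, assuming Python 3.3+; the class and attribute names are illustrative, and this is not what the PR ends up doing):

from collections import OrderedDict
from types import MappingProxyType  # Python 3.3+

class CachedNet(object):
    """Illustrative stand-in; not the real pycaffe Net."""

    def __init__(self, names, blobs):
        self._names = names
        self._raw_blobs = blobs

    @property
    def blobs(self):
        if not hasattr(self, '_blobs_dict'):
            # build the OrderedDict only once
            self._blobs_dict = OrderedDict(zip(self._names, self._raw_blobs))
        # hand out a read-only view so callers cannot mutate the cached dict
        return MappingProxyType(self._blobs_dict)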

@danielgordon10
Author

There would be no point in this change if it made a copy each time. I can look at making it immutable, but do you know of any code that relies on it always being a new copy? I didn't see any point in that behavior, which is why I made the change. In fact, it caused me a lot of trouble because I expected it to be simple attribute access, when in reality it makes a new copy on every call.

@ajtulloch
Contributor

Do you have any workloads where the time spent on these Python operations is nontrivial? In practice, you spend all your time in C++/CUDA for any real-world workload, right?

@danielgordon10
Author

Yes, I do. I use Python for training and testing as well, so my only real interface with the C++/CUDA code is through calling forward and backward. For extremely large networks like ResNet, or the ones I work on, calling net.blobs repeatedly to fill in values for a forward pass can take seconds per forward pass. Granted, I could store the net.blobs dict as a variable on my end, but that is a really non-obvious interface. You would expect most property accesses to be constant-time operations, which is why I was so confused when calling net.blobs caused massive slowdowns.

The point is that the interface isn't written as net.get_blobs_dict(); it is written as net.blobs. The former is a method that returns a dictionary, which could (or maybe should) be new each time; the latter looks like access to an internal variable of the underlying structure. There is no reason why net.blobs should return a new copy, other than bad naming conventions.
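
For concreteness, the pattern this PR applies is plain lazy initialization of the property result (a sketch; the cache attribute name is illustrative, and the exact code in the PR may differ slightly):

from collections import OrderedDict

def _Net_blobs(self):
    """An OrderedDict of network blobs indexed by name, built on first access."""
    if not hasattr(self, '_blobs_dict'):
        # the zip over the internal name/blob sequences is the expensive part; do it once
        self._blobs_dict = OrderedDict(zip(self._blob_names, self._blobs))
    return self._blobs_dict

# pycaffe attaches such functions to Net as properties, roughly: Net.blobs = property(_Net_blobs)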

@danielgordon10
Author

I'm not alone in thinking this way. Take a look at the Google Python style guide to see what I mean: https://google.github.io/styleguide/pyguide.html?showone=Access_Control#Access_Control

If an accessor function would be trivial you should use public variables instead of accessor functions to avoid the extra cost of function calls in Python. When more functionality is added you can use property to keep the syntax consistent.
On the other hand, if access is more complex, or the cost of accessing the variable is significant, you should use function calls (following the Naming guidelines) such as get_foo() and set_foo(). If the past behavior allowed access through a property, do not bind the new accessor functions to the property. Any code still attempting to access the variable by the old method should break visibly so they are made aware of the change in complexity.

So perhaps the best thing would be to change the name of these variables and break everyone's code to make sure they're all on the same page. But that doesn't sound super friendly.

And for the record, Caffe ostensibly follows the Google Python style guide, so this isn't just blowing smoke: http://caffe.berkeleyvision.org/development.html
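
A toy illustration of the quoted guideline, unrelated to Caffe itself: trivial state stays a public attribute, a cheap derived value can become a property without changing the obj.attr syntax, and anything with non-trivial cost should advertise itself as an explicit method call.

class Stats(object):
    """Toy class: trivial state is public, a cheap derived value is a property,
    and anything costly is an explicit get_*() method."""

    def __init__(self, values):
        self.values = list(values)          # trivial access: plain public attribute

    @property
    def mean(self):                         # cheap derived value keeps obj.mean syntax
        return sum(self.values) / float(len(self.values)) if self.values else 0.0

    def get_sorted_copy(self):              # non-trivial cost: explicit method call
        return sorted(self.values)

s = Stats([3, 1, 2])
print(s.values, s.mean, s.get_sorted_copy())  # [3, 1, 2] 2.0 [1, 2, 3]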

 def _Net_inputs(self):
-    return [list(self.blobs.keys())[i] for i in self._inputs]
+    if not hasattr(self, '_input_list'):
+        self._input_list = [list(self.blobs.keys())[i] for i in self._inputs]

@seanbell

While you're considering optimizations to this function, you might as well pull out list(self.blobs.keys()) as a separate line (and same for _Net_outputs).

@seanbell

Now that I think about it more, you're right, it doesn't matter whether or not this is a copy, since the only part that is being copied is the OrderedDict object (and not the underlying Blob objects).

I'm not aware of any code that relies on the copy behavior of the OrderedDict itself. Usually the only copy-related issue that comes up is the underlying Blob data (which is orthogonal to this PR).

@danielgordon10
Author

@seanbell What do you mean by pull out as a separate line? Do you mean put that in a method and do the same lazy instantiation, or just break it into two lines? I'm not sure I see the point of the second, and the first seems like overkill, since it really doesn't save that much computation time.

The point of the PR is exactly the one you make in your previous comment. I'm not sure anyone cares about it being a new copy each time, and it's only a shallow copy, so the blobs will still act the same. However, it does save lots of time on repeated calls to the self.blobs property, which is a nice simple interface that really shouldn't be taking a long time to run, and users shouldn't have to worry about saving the result themselves.

@seanbell

Regarding splitting list(self.blobs.keys()) into two lines, my thinking was that it is unnecessarily O(N^2) when it could be O(N). That sort of thing can get slow if you have many layers. For now I guess it's not a bottleneck anywhere.
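
Concretely (a sketch only; net stands in for a pycaffe Net with its blobs OrderedDict and _inputs index list):

def inputs_quadratic(net):
    # list(net.blobs.keys()) rebuilds the full key list for every index: O(N^2) overall
    return [list(net.blobs.keys())[i] for i in net._inputs]

def inputs_linear(net):
    # materialize the key list once, then index into it: O(N) overall
    keys = list(net.blobs.keys())
    return [keys[i] for i in net._inputs]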

@danielgordon10
Author

Oh, I see. That's a valid point, and I can make that change.

Do you know how to edit a PR without adding another commit? I got a "talking to" before for having multiple commits, and I was told to squash them, but when I try to squash, I can't push because I'm behind the previously pushed version. If I pull, I'll need to merge, which will add a commit.

@seanbell

You can make sure you're up to date with git fetch --all and then rebase your branch on top of origin/master with git rebase origin/master. You can then squash the commits with git rebase -i origin/master; in the text editor, indicate that you want to keep only the first commit by writing pick on the first line and f (fixup) on the rest (fixup keeps the content of a commit but throws away its message). There's a comment at the bottom of the editor explaining the different options.

Then once you're done, force push (git push -f) to your remote branch.

@danielgordon10 danielgordon10 force-pushed the pycaffe-multi-instantiation-fix branch from 5d1b704 to 7deb6fd Compare March 30, 2016 22:37
@danielgordon10 danielgordon10 force-pushed the pycaffe-multi-instantiation-fix branch from 7deb6fd to 7a81836 Compare March 30, 2016 22:38
@danielgordon10
Author

Done. Thanks!

@ajtulloch
Contributor

FWIW, you can DRY this up a bit by reusing something like cached_property from https://github.com/pallets/werkzeug/blob/master/werkzeug/utils.py:

# sentinel that distinguishes "not cached yet" from a legitimately cached None
_missing = object()

class cached_property(property):
    def __init__(self, func, name=None, doc=None):
        self.__name__ = name or func.__name__
        self.__module__ = func.__module__
        self.__doc__ = doc or func.__doc__
        self.func = func

    def __set__(self, obj, value):
        # explicit assignment overwrites the cached value on the instance
        obj.__dict__[self.__name__] = value

    def __get__(self, obj, type=None):
        if obj is None:
            return self
        value = obj.__dict__.get(self.__name__, _missing)
        if value is _missing:
            # first access: run the wrapped function once and cache the result
            value = self.func(obj)
            obj.__dict__[self.__name__] = value
        return value

and then just s/@property/@cached_property/g for the properties you care about.
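
For illustration, applying that helper would look something like this (a sketch only; ExampleNet is a hypothetical stand-in, since the real pycaffe properties are bound onto the boost-wrapped Net class rather than defined in a plain class body):

from collections import OrderedDict

class ExampleNet(object):
    """Hypothetical stand-in for a class with an expensive-to-build dict property."""

    def __init__(self, names, blobs):
        self._blob_names = names
        self._blobs = blobs

    @cached_property
    def blobs(self):
        # runs only on the first access; later accesses return the cached dict
        return OrderedDict(zip(self._blob_names, self._blobs))

net = ExampleNet(['data', 'conv1'], [object(), object()])
assert net.blobs is net.blobs  # the same cached OrderedDict every time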

@danielgordon10
Author

@ajtulloch Adding another dependency to Caffe for these few changes seems like a bad idea.

This PR feels ready for a merge. @seanbell what do you think?

@seanbell seanbell added the Python label Apr 1, 2016
@seanbell

seanbell commented Apr 1, 2016

LGTM

@longjon -- thoughts?

@longjon longjon merged commit 389db96 into BVLC:master Apr 1, 2016
@longjon
Contributor

longjon commented Apr 1, 2016

Yes, looks great, and I agree with the conclusions of this discussion. In particular:

  • This issue has been on my mind since I wrote the original version; I agree that the overhead can be unacceptable in some cases.
  • I agree that changing the copy behavior of the output is okay; modifying a dict property like these ones would be poor form indeed.
  • I agree with @ajtulloch that we could define a helper for this functionality, and with @danielgordon10 that we shouldn't add a dependency to do so. It ought to be possible with a shorter version of the code that @ajtulloch has posted, which we could add directly (see the sketch below), although I'm happy to leave this PR as-is.
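
For reference, such a shorter helper could be a plain non-data descriptor along these lines (a hypothetical sketch, assuming the target class exposes a normal instance __dict__; this is not the code that was merged):

class lazy_property(object):
    """Compute the wrapped function once per instance and cache the result."""

    def __init__(self, func):
        self.func = func
        self.__doc__ = func.__doc__

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        value = self.func(obj)
        # store the result under the same name: later lookups find the instance
        # attribute directly and never re-enter this descriptor (no __set__)
        obj.__dict__[self.func.__name__] = value
        return value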

Of course, the alternative to this is to figure out some way to actually create these objects at initialization time, which has been obstructed from the beginning by the awkward (but fun) boost-plus-python class definition. I think a workaround might be possible, but it's okay with me to keep doing it this way right now.

Code looks good, history looks good, no additional tests are needed, so merging, thanks!

@danielgordon10 danielgordon10 deleted the pycaffe-multi-instantiation-fix branch April 1, 2016 20:38
fxbit pushed a commit to Yodigram/caffe that referenced this pull request Sep 1, 2016
…tiation-fix

Use lazy initialization to reuse ordered dict/list creations to save…