
copy.copy() isn't happy with DataProxy under Python 3 #426

Closed
bitprophet opened this issue Feb 10, 2017 · 5 comments

Comments

@bitprophet (Member) commented Feb 10, 2017:

(Note: this is notes-to-self; it's predicated partly on some deeper changes to Config that I haven't merged to master just yet, though I plan to soon.)

Discovered this when implementing configuration support for Fabric 2 ProxyJump style gateways (i.e. connection objects, which inherit from invoke's Context, being stored in other connection objects' configuration - which then get copy'd when the config re-merges).

Not 100% sure why (I probably knew at some point), but under Python 3 only, hasattr(<newly created but not __init__'d Context>, 'anything') raises RecursionError, while Python 2 happily continues. And copy.copy at one point checks, on both interpreters, for __setstate__ (as it uses that part of the pickle protocol).
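A minimal sketch (illustrative names, not invoke's actual code) of why that recursion happens: copy.copy() creates the new object via __class__.__new__ without running __init__, so _config doesn't exist yet - and a __getattr__ that falls back to self._config then re-enters itself forever. Python 2's hasattr() swallowed any exception (not just AttributeError), which is presumably why only Python 3 surfaces it:

```python
class Proxy:
    # Stand-in for DataProxy: attribute misses fall back to _config.
    def __init__(self):
        self._config = {}

    def __getattr__(self, key):
        # Only called when normal lookup fails; on a bare instance,
        # self._config itself fails, re-invoking __getattr__ forever.
        return self._config[key]

bare = Proxy.__new__(Proxy)  # what copy.copy()/pickle do: skip __init__
try:
    hasattr(bare, "__setstate__")
except RecursionError:
    print("RecursionError")
```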

Unfortunately it's not as simple as defining a dummy __setstate__, as copy.py doesn't do any checking besides "do you exist? ok, I'm calling you" - so if we define it, we must define (AFAICT) the entire pickling suite, or at least __getstate__ too.

I quickly poked at my usual workaround of "call the object method with self", hoping to tell our usual kit "hey if you don't even have _config yet, treat that as a miss" but that doesn't fly with __getattribute__ for reasons.

Feels like the Right Thing is to, you know, actually implement an explicit "this is what it means to copy a Context/DataProxy type of object" set of methods.

I was hoping I could just pass through to .clone but...that's only on Config, not on DataProxy itself. So now I have to figure out if it's sensible to split up .clone (and/or restate it in terms of the pickling methods) between the two classes, or what.

@bitprophet (Member Author) commented:

Yea, seems like we could:

  • implement __copy__ directly
  • implement __reduce__ (or __reduce_ex__) plus __setstate__ (to be called with the object-state member of the __reduce__ tuple; doesn't seem like it actually uses __getstate__?)

Figure the former is gonna be simpler as it's basically clone() and arguably I should be reworking clone as __copy__ and having "cloning" be "call copy()" anyways. The problem of that work being mostly Config centric remains.
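A hedged sketch of the second bullet, using a stand-in class (not invoke's actual API): __reduce__ hands back a (callable, args, state) tuple, and __setstate__ consumes the state member, which routes both copy.copy() and real pickling through a path we control:

```python
import copy

class Proxy:
    def __init__(self, config=None):
        self._config = config if config is not None else {}

    def __getattr__(self, key):
        try:
            return self.__dict__["_config"][key]
        except KeyError:
            raise AttributeError(key)

    def __reduce__(self):
        # (callable, args, state): copy/pickle call callable(*args),
        # then hand state to __setstate__ on the result.
        return (self.__class__, (), {"_config": self._config})

    def __setstate__(self, state):
        # Write straight into __dict__ to sidestep any attribute magic.
        self.__dict__.update(state)

p = Proxy({"user": "deploy"})
q = copy.copy(p)  # goes through __reduce__ + __setstate__
```

Since __reduce__ is what pickle itself uses, the same pair should also make pickle.dumps/pickle.loads round-trips work, not just copying.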

@bitprophet (Member Author) commented:

Yea, a simple DataProxy.__copy__ appears to basically function:

```python
import copy

def __copy__(self):
    # Shallow copy: fresh instance of the same class, plus a copied _config.
    new = self.__class__()
    new._config = copy.copy(self._config)
    return new
```

However, I anticipate that things will get clunky in the corner cases - e.g. this totally skips around Config.clone, so for Context, copying it will not properly copy its inner Config re: all its source dicts, path attrs, etc.

This is also a confusing spot because Context._config is a Config, but an actual Config's _config is just a dict; having two objects nestled within one another which are both DataProxy subclasses is odd like that and probably a code smell...

May make more sense to require the 'concrete' subclasses of DataProxy be the ones implementing their own __copy__, partly for that very reason, and partly because it then dovetails w/ just moving Config.clone to be copy-driven.
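That split could look something like this (a sketch with made-up minimal classes, not the real invoke code): DataProxy stays agnostic about copying, while Config implements __copy__ in terms of its own clone(), so "cloning" and "copying" converge:

```python
import copy

class DataProxy:
    # Base class deliberately leaves copying to concrete subclasses.
    pass

class Config(DataProxy):
    def __init__(self, defaults=None):
        self._config = dict(defaults or {})

    def clone(self):
        # The real copy logic lives here: rebuild from source data.
        return self.__class__(defaults=self._config)

    def __copy__(self):
        # copy.copy(config) now simply means clone().
        return self.clone()

c1 = Config({"port": 22})
c2 = copy.copy(c1)  # independent _config dict, via clone()
```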

Arguably, though, it'd be better to leverage __reduce__ and __setstate__ after all, so that these classes can be actually-pickled...especially for the inevitable "some folks will still want to use multiprocessing for parallelism" problem I haven't tackled yet.

@bitprophet (Member Author) commented:

Yup, in fact even under Python 2 we get our friendly RecursionErrors just trying to pickle.loads(pickle.dumps(Context())). Might as well try to tackle this as low level as I can...fuggen rabbit holes...

@bitprophet (Member Author) commented:

Keep going in circles though; the core problem is that all attempts to unpickle or copy at some point must be able to ask our objects if they have a __setstate__, and this is true even if we define __reduce__, which otherwise would, by itself, be sufficient for correct pickling.

Wonder if I'm overlooking a really stupid simple solution like "if somebody is literally asking for our __setstate__, raise AttributeError"? Because... that might just-work (meaning I don't actually need to bother super hard with the rest of this for now either, if default pickling behavior - namely "copy over the object's __dict__" - suffices).

@bitprophet (Member Author) commented:

Yea...that works fine and the result seems to be healthily copied. Will consider expanding on this for replacing Config.clone later perhaps, but for now, rabbit hole averted.
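For reference, the "stupid simple" workaround sketched above (illustrative code, not invoke's exact implementation): have __getattr__ raise AttributeError when asked for __setstate__ (and for _config itself, which also kills the bare-instance recursion), so copy/pickle fall back to their default "copy over the object's __dict__" behavior:

```python
import copy

class Proxy:
    def __init__(self, config=None):
        self._config = config if config is not None else {}

    def __getattr__(self, key):
        # Pretend we don't have __setstate__ (or a half-built _config):
        # copy/pickle then fall back to plain __dict__ transfer, and the
        # recursion on not-yet-__init__'d instances goes away too.
        if key in ("__setstate__", "_config"):
            raise AttributeError(key)
        try:
            return self._config[key]
        except KeyError:
            raise AttributeError(key)

p = Proxy({"host": "example.com"})
q = copy.copy(p)  # no RecursionError; state arrives via __dict__.update
```

Unpickling performs the same __setstate__ lookup on a bare instance, which is presumably why this workaround fixes the pickle.loads(pickle.dumps(...)) round-trip as well.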
