Simplify loader functionality #92

wants to merge 4 commits into


None yet

2 participants

Ivoz commented Oct 22, 2013

I looked into issue #91

Firstly, updating releases to v0.2.4 might fix the broken python 3.2 test on travis for pull #88.

From there I looked at the loader code. It seems the loader's use of sys.path may be causing the user's issue. Although I haven't worked with invoke that much, I don't see a tempting use case for needing to load a that is on the sys.path somehow, that wasn't somewhere else already. Moreover, this loader modifies the sys.path.

If you know a good use case for finding tasks on sys.path could you share that with me? If so I will endeavour to reintegrate that.

Since the loader is a very simple class, with only three methods that really only call eacho ther sequentially, I thought this could be refactored out. You don't really need a class for this.

I added the code to the Collections class as a class method. I also implemented searching in parent directories, using _parent_dirs(). sys.path is no longer implicitly modified. The code thought that name could use being configurable, so I thought a rudimentary mechanism to allow for this could be as a class variable on Collection (default_module_name). Finally, the new class method load_collection() now closes the file returned by find_module() as recommended.

I haven't reimplemented tests yet as they need to be redesigned to accommodate for the api change.

I thought maybe you'd like to review the changes, and if you like the design please say so and I will add commits adding tests for this.


Ivoz added some commits Oct 22, 2013
@Ivoz Ivoz Ignore pyc files 1805022
@Ivoz Ivoz Update releases to version 0.2.4 e0053a3
@Ivoz Ivoz Move loader functionality to collection
Class was extremely simple, limited functionality, and created
side effects (modified sys.path).

New functionality implements loading collections from parent

First, I haven't compared them yet but the parent directory searching likely duplicates #87 (which precedes this ticket by a few weeks.)

Secondly and more importantly, task module loading is a complicated topic with varied use cases - any change is likely to ruin somebody's day somewhere. This was true for Fabric 1.x (Invoke's predecessor) and while I'm trying to find less-error-prone solutions in Invoke, clearly it's always a clusterfsck :(

I say that to explain why I'm likely to kick this PR around some before merging or rejecting it.

Third, re: loading from sys.path, IIRC it was put in place for a few reasons, though none of them are super straightforward:

  • If we loaded tasks via direct filesystem path, it's both less portable (for folks distributing shared modules) and more verbose (must specify abs or rel path, etc) than specifying a Python import name.
  • That's doubly true when one wants to load multiple tasks into one namespace.
    • Counterpoint: since the loader was written, the namespacing feature got much better & is likely the primary method of tying modules together into one entrypoint / loaded module.
  • In the implicit (vs explicit) scenario of loading a "local", fully FS driven solutions have just as many gotchas as sys.path ones (as far as I can tell) so the previous point(s) give the "Python-level" approach an upper hand.

tl;dr we need to jot down as many use cases as we can & find the solution best combining 'meets most/all use cases' & 'is least surprising'. So far in Invoke, I've had success breaking things down & exposing both lower level features & high level ones combining the former for maximum usability.

Loading isn't an issue for pure-library use cases, so that may mean we expose more loading-related CLI flags and/or document the default behavior more thoroughly.


Took another gander at this & #87; thoughts:

  • The reason I had a Loader class was, as in my earlier comment, partly to encourage the idea of user-selectable loading behaviors.
    • It also feels like good practice - loading a Collection from $somewhere is orthogonal to what it means to be a Collection, but if we make loading a Collection classmethod, the only way for users to override is to subclass Collection itself. Doesn't feel right to me.
  • Re: loading from sys.path, I do now agree it should not be the default any longer:
    • Non-user-controllable path collisions have already caused a number of headaches for people;
    • The counterpoint I made above is definitely true - the "maybe we load multiple names from sys.path at runtime" idea of stitching namespaces together has been wholly subsumed by the much more explicit namespace construction API built afterwards.
      • That also holds true for the "using distributed task modules" use case - that's something I've done a lot of since, and it works fine when you supply your own, single entry point & import arbitrary third party modules by hand.
      • Using nothing but a single distributed tasks module (i.e. install Invoke, install hypothetical third party package, and use Invoke only to load that other package's feels like a very minor use case to me, definitely not something to factor into the default behavior.
  • But to combine those two top level points, I think users should have the option of loading off sys.path if they find the benefits are worthwhile - and thus having an overrideable loader concept still makes sense.

Anyway - I'm going to (finally) take your PR into a branch, add my own changes on top re: these thoughts, and merge it. End result should be your recursive loader + retaining use of a Loader class + probably shoving the sys.path code into a subclass.


Bah, except I forgot the whole thing where one may want sys.path modified during the load process, for purposes of loading local-to-tasks-module code. Guess it depends whether imp.load_module implicitly performs that for you, or not - I just looked and it doesn't sound that way.

Still, we can perform the sys.path tweak while still telling imp.find_module to search only our parent directories, getting the best of both worlds. ❤️ imp

@Ivoz Ivoz changed the title from [WIP] Simplify loader functionality to Simplify loader functionality May 1, 2014
Ivoz commented May 1, 2014

So if simply line 331, here was changed to append sys.path after those roots, would that be the solution you're looking for?


@Ivoz I don't think so, the distinction here is subtle but important - there's a set of paths to be searched when we import the tasks module itself; and there's also the paths to search when anything in that tasks module calls import. The former is that 2nd arg to imp.find_module, the latter is sys.path. They don't appear to affect one another.

It's actually nice that they're separate because it makes the distinction clearer & lets us implement things exactly how we want.

I've now got the code set up in a manner I am happy with, and it still at its core uses your discovery algorithm :) will be merging it momentarily...

@bitprophet bitprophet added a commit that closed this pull request May 3, 2014
@bitprophet bitprophet Changelog, closes #92, closes #87 f75f3d3
@bitprophet bitprophet closed this in f75f3d3 May 3, 2014

All done. If interested you can see the actual diff here: pyinvoke:0bcf45a...pyinvoke:f75f3d3 (scroll down.)

I did not end up writing a second subclass that preserved the old behavior, but it'll now be very easy to do so (and to add a CLI flag for selecting which subclass to use, I guess) should anybody care at some point.

Ivoz commented May 4, 2014

Looks good 👍

@bitprophet bitprophet referenced this pull request May 4, 2014

Fix update_path #102

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment