Expanded configuration mechanisms #147

Closed
sophacles opened this Issue Jun 13, 2014 · 66 comments

Projects

None yet

3 participants

@sophacles

Maintainer edits

Overall TODO generated by huge arse discussion thread.

  • Add new concepts doc on configuration, including a clearly marked hierarchy definition, and a big WARNING about how multiple files in same location w/ diff extensions are not merged.
  • Make sure to clarify .yaml, not .yml.
  • Probably also needs an update to the CLI docs section re: flags and env vars (new subsection) punting on user-driven config updates that aren't env vars; env vars have their own section in config so no need to duplicate
  • Re: env vars, need to figure out best way for user-driven config settings to work too, eventually punting
  • Brief mention of concepts doc in the tutorial Actually don't see an elegant way to work it in - feels best just as a standalone regular concepts doc for now
  • Make sure API in docs gets updated to be consistent & 'modern' - e.g. obj['foo.bar'] should turn into obj['foo']['bar'] or ideally with the option of obj.foo.bar.
  • Update concepts/contexts to adapt, possibly paring it way down or removing entirely, tl;dr it's just about explaining how contexts perform shared state.
    • The 'nested collection configs get merged' explanation could still live here but probably lives better in the new config page, or the namespacing page if it exists.
  • Vendorize etcaetera
  • Use its Config object in place of Context's existing inner dicts, get tests passing again referencing it (it's dict-like).
    • Decide whether the current run/general/etc splits become sub-config objects or just nested namespaces within the primary config object
    • Leaning towards the latter so people can override them everywhere
  • Add new CLI flags for setting config options (primarily -f or whatnot for "specific config file")
  • Implement hierarchy re: multiple conf files + env vars + cli flags + in-module configs
  • Conf file loading of different extensions at the same 'location' is exclusive - no merging of eg /etc/invoke.yaml and /etc/invoke.json.
  • Debugging of conf option overrides must be easy
    • either via a no-op CLI flag showing what the tree would be for a given task module (or task invocation?), or
    • via regular debug output (ideally + a no-op option)
    • really, those go to the same place
  • If not covered in the previous point: it must be easy to troubleshoot stupid typos in config file paths.
    • e.g. if you accidentally created a /etc/invok.yaml, or fatfingered the arg to invoke -f <path> - it should be clear when running with -d which config files it attempted to load and which ones failed to load.
    • Main problem is that for the first 3-4 levels we have 3x file paths we end up searching for, so it will be somewhat verbose :( but no obvious solution offhand, better to add it to debug output (or some other level?) than try to be clever.
  • Update invocations to use this new setup instead of dotted names - they still technically work but are invalid for eg env var translation.
  • Context objects (& subclasses like Fabric2's Connection) need a clean API for this configuration stuff - either Context directly grows all the config related methods (perhaps becoming a etcaetera.Config subclass itself?) or it accepts some config object (again perhaps etc.Config) as an instance member.
    • The current driving need here is actually testing related, I need to inject a config value override or two in Connection tests.
  • Fill in any remaining skipped tests related to this feature set.

Original ticket description from sophacles

It would be nice to be able to pass context variables on the command line to invoke scripts. Something like:

> inv -X var1=val1 -Xvar2=val2 target --target1opt=val

And perhaps even allow targeting variables to specific namespaces, ala:

inv -X ns1.var1=val1 -Xvar2=val2 ns1.target  target2

for those cases where target2 should use the default for var1, but ns.target1 needs the overridden value.

There are of course concerns about how command line set options override things in a call to Collection.configure... take for instance this script:

@ctask
def task1(ctx):
      run("echo {val1}".format(**ctx.config))

@ctask
def task2(ctx):
    run("echo {var1} and {var2}".format(**ctx.config))

ns1 = Collection(task1)
ns1.configure({"var1":"ns1var1"})
ns = Collection(ns, task2)
ns.configure({"var1":"globalvar1","var2":"globalvar2"})

Which is the right output:

option 1:

# inv -X ns1.var1=clival1 -X var2=clivar2 ns1.task1 task2
clival1
globalvar1 and clivar2

(note - because generally it's considered cli options override anything in-program)

option 2:

# inv -X ns1.var1=clival1 -X var2=clivar2 ns1.task1 task2
globalvar1
globalvar1 and clivar2

(note because of the way overrides work in invoke already)

I think everyone would agree that:

# inv -X var1=clival1 ns1.task task.2
clival1
clival1 and globalvar2

is correct.

@bitprophet
Member

I haven't yet sat down and pondered how I want ye olde config override hierarchy to work here, but this is definitely part of it. Then there's also the possibility of env vars (#22) and config files, possibly multiple layers of those as well (system/global, per user, local to project, etc).

I'm already semi unhappy with how the existing config mechanism can be unintuitive and hard to reason about, so I'd prefer to really hash this out sometime with pros/cons instead of iterating. Would welcome more ideas if you have them, of course.

@sophacles

Generally speaking.. (leaving aside for a moment the namespace stuff in inv) in unix land the hierarchy goes something like this (lowest priority to highest priority):

  • default value
  • general environment variable (e.g. $EDITOR)
  • configuration file (first global, then local - e.g. ~/.gitconfig is overriden by $project/.git/config)
  • program specific environment variables - e.g. $GIT_EDITOR
  • command line options.

Of course, some programs swap the config files with one of the environment variables - because strong consistency is not a unixy concept.

So I think in the large, that would be a good idea to go with as a start.

Now, about configs files, and how they override each other, and so on - for an extreme case of what could happen I like to refer to vim's runtimepath. It does (lowest precedence to highest):

  • /usr/share/vim (app defaults)
  • /etc/vim (system defaults)
  • ~/.vim (local settings)

(it also then has "after/" stuff for which it does in the reverse order, with all sorts of other exceptions and other rules (in vim do - :he runtimepath for the gory details). I think that's not the goal here :))

So about invoke... The way collection namespacing and config overriding works now is pretty nice (once the multiple paths to a collection issue is figured out that is). It's basically the same as the ones mentioned above...

  • submodule
  • importing module
  • importing importing module... etc

The common thing about all of these is they basically order precedence on a concept of "locality" or "immediacy". Effectively the, the precedence is given to the option that was most likely to have been changed most recently. In the invoke case this is "i imported already written code, then called configure on a collection that had the imported stuff"... it's most likely I won't import a library that isn't written yet.

All that is a really long-winded way of suggesting this type of "how to set variables" precedence order:

  • module level settings
  • importing module settings
  • importing importing module settings... (etc)
  • /etc/invoke.conf
  • ~/.invoke set variables
  • ${DIRECTORY_CONTAINING_TASKFILE}/config.py (or whatever)[1]
  • environment vars
  • command line -f path/to/config.py
  • command line -X var=val options (with the semantics from above hashed out for namespace overrides)

[1]With the option of having ~/.invoke or /etc/invoke.conf specify a search path for relevant config files. Also note - then of course naming config files conf.py is not a good idea anymore.

Anyway that turned out longer than expected. hah!

@bitprophet
Member

Hah, no, that's perfect - overall it obviously fits what I or any other longtime Unix user would have come up with if we'd spent a few minutes thinking about it. Thanks!

The only parts I have any quarrel with:

  • The 'local to tasks file config.py' - I'm not sure what this gives users over setting the equivalent values within their tasks.py instead? If it were a YAML or JSON file, maybe, but even then I wonder if it is worth the conceptual/implementation overhead.
  • The config.py's vs invoke.conf's - shouldn't those all be the same type/format? (and curious if you've thoughts on what format - personally I strongly prefer YAML despite its shortcomings, but barring that I'd use JSON; ini and configobj are both terribad at nested dicts.)
@matleh
matleh commented Jun 16, 2014

+1 for this - to be able to override configuration settings and for the use of JSON and/or (JSON-subset style) YAML for configuration files.

@sophacles

Not sure why I put config.py instead of invoke.conf - I'm just going to blame "Friday afternoonism". I agree that they should all be the same format. (And no preference on what the actual format is).

As for the use case of task local invoke.conf vs having it built into the tasks.py: for me it comes down to separation of concerns - I just like having config details different than the code that executes them. It's not a strong technical argument, just a preference - in the end either way would be fine.

This was referenced Jun 19, 2014
@bitprophet bitprophet changed the title from Pass context variables on the command line. to Expanded configuration mechanisms Aug 15, 2014
@bitprophet
Member

Overriding this ticket to be a general config overhaul/expansion one. I need to start poking at this to support Fabric 2 development, where things like "storing the default SSH port and local username, then allow overriding in various ways" come into play.

I skimmed tickets on Fabric's tracker and most PRs or issues mentioning the topic, and specifically JSON/YAML support (which I think is absolutely required) just tie such file loads into Fab 1's single level env dict.

I googled for Python libs dealing with this to see if we can avoid reinventing the wheel (tho as usual I worry that we'd have the usual fork/collaborate/fuckit problem trying to hone the inevitable 10% of functionality that we need to change).

I'll edit this post to add more links, but so far I've found:

@sophacles

+1 for etcaetera but all of them look pretty good. (note: my vote is based only on random details in my first impression and is totally subjective. I have not arguments for or against any of them)

@bitprophet
Member

Cool, thanks @sophacles. Digging deeper into etcaetera now (tho partly just as it's first). For any of these, our specific needs seem to be:

  • Ability to handle arbitrarily nested config to match nested namespaces (unless we can make dotted names work well enough or whatnot)
  • At least one level of nesting to handle different areas of config (general, run() related, etc)
  • Strong control over hierarchy, as per earlier comments about flags vs env vs multiple config locations - we want it to work our way and not be foisted on us.
  • JSON support
  • YAML
  • environment variables
  • some way of manually injecting values from our CLI parser at a specific hierarchy level
  • (ideally/optionally) Python file support)
@bitprophet
Member

Notes from etcaetera:

  • Typing the name is a pain :)

  • It interprets a single nested config file (eg YAML) as nested subdicts, which is good.

  • You can add additional config objects (each with their own set of adapters) as sub-configs; these do not get "merged" but are instead available as sub-objects via attribute. E.g.:

    config['foobar']
    config['foobar']['bizbaz']
    config.subconfig['whee'] # This comes from some other source, eg another conf file
    # NOT VALID
    config['subconfig]['whee']
    
    • This may or may not be useful for us depending how much we directly expose Etcaetera objects to our users.

So far it fits all of my previous bullet point needs except possibly the "inject our own CLI flag values" - I need to see if we can stick those in the middle or if they can only be added via the 'defaults' adapter. Optimistic tho.

EDIT: looks like the only builtin way to do that is the Overrides adapter which always comes last in the hierarchy, but that may be appropriate for CLI flags anyway. Gotta doublecheck against the above notes. We could also probably make a custom adapter, have to look into that.

@bitprophet
Member

There are no explicit docs on custom adapters, but it's pretty basic, the AdapterSet accepts any number of adapter objects between the defaults (always forced first in the deque) and overrides (always last) and as long as one implements a few basic "here's my settings" methods, should be easy to subclass. And again that's if Overrides don't work for us.

@bitprophet
Member

Taking the above hierarchy list, here's how it could work with an etcaetera based Config object + adapters. Note: it might replace the existing namespace driven hierarchy, or it might augment it, I need to revisit the corner cases of that design to see if it works ok in what etc gives us.

  • module level settings - anything defined in Invoke itself in the default context/collection objects; could be an etc Defaults adapter obj
  • importing module settings - 'importing module' means a sub-collection, so that falls under the 'note' above. Could be we just take our existing "stuff from the namespace" and end up putting it all in the initial Defaults once we've internally collapsed it.
  • importing importing module settings... (etc) - ditto
  • /etc/invoke.conf - setting aside default format & multiple formats for now (tho see below) this would simply be the next adapter in the set, a File.
  • ~/.invoke set variables - again setting aside specifics, this would be another File adapter
  • ${DIRECTORY_CONTAINING_TASKFILE}/config.py (or whatever) - yet another File adapter
  • environment vars - an Env adapter
  • command line -f path/to/config.py - another File adapter
  • command line -X var=val options - this would be an Overrides adapter (or a custom adapter) generated out of our existing parser stuff. Presently, these values are (like the Collection namespace configs) self-merged into the core Context object, so this is another spot where we need to figure out how much we delegate to etcaetera.
@bitprophet
Member

Re: file formats, random thoughts:

  • Use of etc or the other libs lets us deal with Python module files, YAML and JSON roughly equally/transparently (etc's File adapter selects how to load data based on file extension, for example)
  • The easiest-implementation option is to say "we only support one type, we will seek out only config.<that type>, deal with it" and then there's no ambiguity about ordering.
  • That's not too friendly and etc makes it easy to load more than one, so we could instead add adapters for each possible extension - i.e. the /etc/invoke.conf "item" becomes 2 or 3 adapters, i.e. File('/etc/invoke.yml'), File('/etc/invoke.json'), File('/etc/invoke.py').
    • HOWEVER the problem is if some poor sod ends up with >1 of those on the FS - due to etc's implementation (really - any implementation) there has to be some sort of order and there's no clear one right intuitive answer.
    • Feel it's best to go for practicality over purity (hi Zen of Python) so off the cuff I'd just go with yaml -> json -> py (if we offer py).
  • At first I thought etcaetera's Python-module support was single-level only (module level constants) but no, it happily consumes dict values. So Python ends up being just as flexible as YAML/JSON in terms of nesting; we might as well support all 3.
@sophacles
  • Some thoughts on file format things -

If it's decided that invoke should handle all the formats, then I agree there should be a strict preference ordering. That is: /etc/invoke.yml always wins, if it isn't present, then /etc/invoke.json always wins, and so on. (note: the actual ordering choice is irrelevant to me, the one I presented was example only). When wearing my devops and/or sysadmin hats, I really really want this to be deterministic and simple.

This should be true any time invoke finds a file rather than being explicitly told "use this file".

Additionally, whichever format gets the "always wins" choice, should be the format distributed with the pypi and github packages for things like default values and primary example scripts. Secondary examples in the other formats can be provided in a contrib area, and maybe in one documented example each.

I can see the strong ordering being a bit confusing at first, and there being a bit of pushback and FAQ answering, but in the long run, I think it's far simpler than trying to deal with >1 "winning" config files and all the weird merging cases.

  • Some thoughts in general on the above discussion:
  • I want to retract my preference for ${DIRECTORY_CONTAINING_TASKFILE}/config.py in favor of an invoke approved method of Defaults(). See my next comment for my thinking...
  • I am beginning to wonder if we want an "include" directive for each of the supported formats. This way we could allow people like me, who have a large collection of tasks, modularized heavily, to not need a new full config file for different formats. For example, I use the exact same nginx and uwsgi deployment options/variables in development, testing, and deployment, so including my sophacles-nginx.conf would be nice in my scripts. (other devs have different development settings for thier environment, so they would not want this... for example the guy to my left uses the same nginx options, but different uwsgi options in development). Generally speaking, this would probably simplify a lot of usage scenarios more than some of the other options discussed.
@sophacles

Deeper on: ${DIRECTORY_CONTAINING_TASKFILE}/config.py being a potentially bad route...

  • The idea of an include directive in .conf files is better.
  • Having only a Defaults() object in any given tasks module also allows discoverability:
    • as sysadmin/devops person I know what I can do easily with inv --list, just like with per-task arguments
    • as a developer, It keeps everything clean, and since they are called defaults (semantics i know, but sometimes it matters...) I know that they may change, and should actually work with that in mind, just like with kwargs.
  • It makes overrides of defaults by containing modules more difficult. (Conversely, overrides are discoverable by --list on the overriding taskfile too. Again, sysadmin and devops me is very happy about this thought, not to mention the zen of python stamp of explicit approval)
  • Speaking of the zen - I'm pretty sure not having a separate .conf is more zeny (particularly with the include directive mentioned above - explicit, simple, obviously right, etc)
@bitprophet
Member

[...] I agree there should be a strict preference ordering [...]

To clarify, do you mean that if there's both a /etc/invoke.yaml and /etc/invoke.json that we load invoke.yaml and do not load invoke.json at all?

I had been imagining we'd load anything that exists but allow yaml to 'win' over json where they conflict - which is what etcaetera's API makes easiest to do.

If you did mean the 'winning' should be more exclusive, I'm torn, I could see users being confused with either implementation. Leaning towards the exclusive option though - it is simpler to reason about as you said.

[...] whichever format gets the "always wins" choice, should be the format distributed with the pypi and github packages for things like default values and primary example scripts [...]

Excellent point, I agree. I've always (lately anyway) been of the opinion one should 'bless' a single option in any situation when there's multiple equally-valid options. Makes it easier for folks who have no strong pre-existing preference, and for us.

stuff about local-to-project conf files and includes

Random response thoughts:

  • Includes, in general, are always awesome. We as sysadmins know this :)
  • Having includes in this situation requires custom logic (and probably hacks to etcaetera) because JSON doesn't support it by default, and I don't know if YAML does either (tho it's more likely to.)
    • This could be a small or large stumbling block, depending. Off the cuff I'm thinking a "special" top level key like _include that we look for & honor, and if users needed to use that key themselves, tough noogies. (This approach is inspired by Chef role files, for better or worse...)
  • That said, I do not quite grok your meaning re: using Defaults objects in tasks modules instead of local-to-project conf files. Can you give a concrete example of what you're thinking?
    • Generally, local-to-proj conf files strikes me as a useful thing to have; they aren't strictly necessary because they serve identical function to in-tasks-module configuration (which may be what you mean?) but a lot of the time it's nice to have those defaults expressed in a non-Python config format, and this would enable that option.
  • I also don't get how you feel this ties into inv --list since currently that only displays the tasks that exist, not their config values.
    • Unless you mean the kwargs/flags specifically, which is a valid point - but those are only one part of what can be configured at this level (i.e. all kwargs can be considered "config options" but not all config options are, or should be, exposed as kwargs.)
@sophacles

Oh man, so I made a bunch of notes a couple weeks back, which I thought I had fleshed out into a comment here. Since there is no evidence of such a comment, I'll do it here. Hopefully that will help clarify all of the above. I'll also try and address the things specifically mentioned above. (Warning - this will probably be long).

The way I'm using invoke is this:

  • There are several python modules in a python package with some subpackages. They are grouped by concept. So within the package I have a subpackage called build, and a module for python, a module for java, a module for C and so on. The tasks in each module correspond to the relevant tasks - the maven calls for each package in java, the cmake (etc) commands for the C code, and so on. Similarly i have subpackages for install, deploy, test and so on. (Not necessarily organized by language, but by some scheme consistent within the subpackage).
  • Each python module exports a single invoke collection/namespace.
  • Each subpackage exports a "composite" collection (e.g. the build collection is composed of the java, python, c, etc collections)
  • There is a utils directory in our tree that contains "top-level" invoke scripts. These are a bit ad-hoc, in that they define some common groups of tasks, based on actual deployment needs, or build needs or whatever. So I have task file called ci-tasks.py which imports the entire build sub package, and the entire test subpackage, and has a few targets, e.g. buildall that has all of the build tasks listed in pre=[...], and e.g. run_travis which has some setup stuff and has pre=[buildall, testall]. A different taskfile has all the build steps and some installation steps, and so on.
  • Each of my per-module collections heavily use context variables. Some of course only matter to the module itself (e.g. pip_binary) others are used nearly universally (e.g. source_tree_root).
  • the very commonly used context variables are actually defined in a commonly imported dict that keeps defaults consistent, but i consider this outside the scope of invoke - just mentioning it here for clarity.
  • all the of context variables are given a default value in a dict, and passed to some_collection.configure()
  • each of the taskfiles in utils above will also call configure with relevant "secondary defaults". (e.g. the source directory on test instances is /usr/local/src/our_stuff/, but the build modules define the source directory as os.path.expanduser('~/src/our_stuff') since the most common use case of individual build tasks is on developers' machines)
  • since we're still working out various deployment options, I have defined an invoke script in utils/ for each thing we try. That is: deploy_ubuntu_tasks.py, deploy_redhat_tasks.py, deploy_docker_tasks.py, and so on. This is because each has differences for some (~10%) of the context variables.

Ok - so thats some background on how we currently use invoke. There are a few major pain points:

  • If a common variable changes it can be a pain to track down where it is set, and where it is overridden.
  • similarly with many possible collections to invoke via invoke -c foo_tasks, it is hard to remember what each one has set as a "script default" for the various variables. (the variables though are super useful, using contexts has reduced my code a significant amount, so it's the right thing), so I don't know what to override where.
  • It's annoying to figure out each variable that exists without going through each python module, or getting a bunch of specific documentation listings.

With those pain points in mind, there are a few things I actually see getting worse by adding configuration files as an option:

  • Knowing or discovering what configuration directives are even available.
  • Knowing the value of a context variable after invoke pulls in defaults (as set by collection.configure()), the various config files, and environment variables.
  • Understanding the results of various overrides in a "multiple inclusion" setting for collections.

So in order to mitigate these I propose that invoke --list starts displaying all of the known context for the relevant collection. This includes every variable that is in a context based on the environment and command line options that are passed. Alternately another command called invoke --show-vars or similar do that job.

OK - that covers most of the things I think are implicit to my comments from yesterday (whew). To address your specific points (in a slightly different than asked order, because I feel they build on each other a bit):

  • When i mention "using Defaults objects in tasks modules instead of local-to-project conf files", I had in mind my library of tasks mentioned above. Rather than have config file for each of the modules (and associated tasks and collections) imported from various packages above, I meant just have a default configuration similar to collection.configure. One way to make this work relatively simply is to have a class of defaults that is defined to go with each collection. (or just use the dict as the object containing the defaults).
  • I feel this ties to --list because of the stuff expanded above, additionally, because without a target to show all the context (or other config variables available) we have a bunch of implicit knowledge required. Further, the origins of this ticket are to pass context variables on the command line, so those should have a method of discovery similar to the tasks which are passed on the command line.
    • I do also really like the idea of current --list behavior expanding to show each task's kwargs tho!
  • I agree with your assessments and comments on include directives.

About config file format and naming:

  • I did mean "if there's both a /etc/invoke.yaml and /etc/invoke.json that we load invoke.yaml and do not load invoke.json at al"
    • in a larger context, I have no problem having yaml in one place, and JSON in another and merging, that is explicitly:
# Files in /etc:
/etc/invoke.yml
/etc/invoke.json

# files in my home dir:
~/.invoke/invoke.json

# files in my task dir:
/path/to/tasks/invoke.yml

Then this happens within invoke:

load_file('/etc/invoke.yml')
load_file('~/.invoke/invoke.json')
load_file('/path/to/tasks/invoke.yml')

The excepion to the above is when there is an explicit command line arg stating which invoke file to use, at the path/to/tasks level. That is - if I pass e.g. inv -f /path/to/tasks/invoke.py it will be loaded rather than /path/to/tasks/invoke.yml. But the same files from /etc and ~/.invoke would be loaded.

I hope this giant wall of text makes my thoughts a bit clearer for you.

@bitprophet
Member

Oh boy oh boy reply time~

The tl;dr I hear for your setup is:

  • You've got the meat of your task definitions in a set of 'topic' module trees / namespaces.
  • Those tasks are prepared for invocation by inclusion in user-facing top level task files that collect the task namespaces in various cross-sections.
  • Both the inner namespaces and the top level task files set config values, with the latter overriding the former.

That strikes me as a reasonable enough 'big site deployment' of invoke.

It mirrors how I perform code reuse in Invoke to a degree - the invocations set of modules is imported into my various OSS projects' top level tasks files. See especially invocations.docs and its use in the tasks.py of fabric/invoke/paramiko, which was the impetus for the entire existing config override setup.


What I hear for your pain points is really one thing: it's difficult to introspect the overall state of the configuration. What are the defaults, and where; how are they overridden, and by whom; what is the final state?

I agree that exposing that info would be useful, IMHO as a separate flag (bundling into --list feels like it'd be too noisy for the common case) or alternately just as debug level logging output.

A sub-issue there is your mention of knowing "what settings are available" - which really means "what settings are being consumed anywhere in the code". Dunno if there's any good way to check on this that isn't static analysis? Unless we were to set up a declaration style mechanism where you must declare settings at a high level (which could be framed as "you must always have some default value for every setting"?)

Finally, re: task kwargs, I totally flaked and forgot that we already do that, inv --help <taskname> displays all the flags for the task named. :)


Re: config file loading, I agree with most of your points re: selecting files when there are conflicts, that's all what was in my head too.

Your points re: conf files vs in-python structures, I think you may have misinterpreted something, my intention here was to add config files as another option for configuration. In-Python dicts would not be removed in any way; for folks who find conf files (or env vars or etc) an easier alternative (or who want the 'extra levels' of merging) they'd simply gain the option whereas right now it's not possible w/o ugly manual hacks.

@sophacles

re: conf files vs in-python-structures - yeah, I was confusing myself - rereading this thread seems to have put me on the same page with what you're saying.

re: --help, --list and kwargs, I think the idea in #163 might actually cover what I was thinking.

re: introspection things and showing the configuration options and what's available in contexts.
and
re:

A sub-issue there is your mention of knowing "what settings are available" - which really means "what settings are being consumed anywhere in the code"

I didn't intend that - bad wording on my part. I really just meant introspection of what a context would be given the "current" tree. Assuming that there's an argument --show-config, and that there are several task files, as well as tasks imported from python packages here's a set of examples (some output from existing commands is not real, so much as just to get the idea across, not suggesting we change --list):

$ inv -c taskfile_A --list
foo.collection1.task1
foo.collection1.task2
bar.collection2.task3
bar.collection2.task4
$ inv -c taskfile_A --show-config
foo.ctx_var1 = "value" [set in collection: foo]
bar.ctx_var1 = "different" [set in collection: collection2]
ctx_var2 = "value" [set in collection: collection1]
ctx_var3 = "value" [set in config: /etc/invoke.yml]

$ inv -c taskfile_A --show-config foo.collection1
ctx_var1 = "value" [set in collection: foo]
ctx_var2 = "value" [set in collection: collection1]
ctx_var3 = "value" [set in config: /etc/invoke.yml]

$inv -c taskfile_A --show-config bar
ctx_var1 = "different" [set in collection: collection2]
ctx_var2 = "value" [set in collection: collection1]
ctx_var3 = "value" [set in config: /etc/invoke.yml]

$inv -c taskfile_A -f configfile.yml
ctx_var1 = "value" [set in config: configfile.yml]
ctx_var2 = "value" [set in collection: collection1]
ctx_var3 = "value" [set in config: /etc/invoke.yml]

$inv -c taskfile_B --list
collection1.task1
collection1.task2
collection3.task5

$inv -c taskfile_B --show-config
ctx_var2 = "value" [set in collection: collection1]
ctx_var3 = "value" [set in config: /etc/invoke.yml]
ctx_var4 = "value" [set in collection: collection3]
ctx_var5 = "value" [set in collection: ns]

This would require some merge tracking, but the way overrides work in the "etc" package would facilitate this.

If I use a variable ctx_var_hidden in some of the tasks, and don't define it anywhere, it will runtime error either way, and it won't be listed in the --show-config output.

@bitprophet
Member

OK, I get all that. We've definitely words'd enough and I think we're mostly on the same page now too, so here's what I think needs doing for this (so I can get down to brass tacks): EDIT: moved to description, derp.

@bitprophet bitprophet added the Feature label Aug 25, 2014
@bitprophet
Member

As per the just-closed #101 we need to make sure that defaults for things like run(echo=True/False) are considered here. In fact that need is what prompted me to return to this ticket, since I need Fabric 2 to have a way of setting defaults for things like SSH port number, user name, etc.

@bitprophet bitprophet added this to the 0.10 milestone Sep 10, 2014
@bitprophet
Member

Moved the checklist up to the description & fleshed it out.

@bitprophet
Member

@sophacles I did some DDD and posted a draft of the doc for this crap over here: https://github.com/pyinvoke/invoke/blob/dc6977b7eb53265efe4afa49daf0440b8d62fc4b/sites/docs/concepts/configuration.rst

Questions which fell out of writing down specifics (god I love DDD...) are below - don't feel pressured to respond to them but I need to figure them out myself so writing is helpful, and if you have the time...:)

  • What specifically do we call the local-to-project conf file?
    • Calling it invoke.* matches the other two 'layers' of config file, /etc/invoke.* and ~/.invoke.*
    • However it feels like it doesn't then match up with tasks.py - somebody new to an Invoke using project might see tasks.py and invoke.yaml and not realize they are closely connected.
    • So maybe tasks.(yaml|json|etc) instead?
      • Except, counterpoint, then that implies it is another task definition, which it is not.
    • Another alternative is what you (and maybe me too?) used by instinct in the above discussion, config.*, which is a little clearer.
      • Dunno if this clarity is enough to offset the inconsistency with the global and per-user filenames tho...
  • Do we standardize on .yaml or .yml?
    • I'm leaning towards .yaml if only because there's no 3-char version of .json, so using the 4-char version is nicely consistent.
  • How to handle nested config options in environment variable format?
    • Feels like a global prefix is wise to avoid collision with (or importing of!) random other env vars. I.e. INVOKE_*.
    • For some reason etcaetera's docs mention FOO.BAR but this is not valid in either zsh or bash
    • Single underscore per level is "easiest", e.g. INVOKE_RUN_ECHO=1 to set {'run': 'echo': True}
    • But that fucks up any identifiers that natively have underscores (e.g. {'some_namespace': {'option': 'value'}} -> introduces ambiguity when we see INVOKE_SOME_NAMESPACE_OPTION.
    • We could then use double-underscores for levels, eg INVOKE__RUN__ECHO, but this feels kinda ugly?
    • Another option is to make the env var parser "smart" like our CLI parser and say "I will tell you what valid config levels are (by looking at the other config levels' sources), I will use that to determine what an underscore means at any given point".
      • This means it's not possible to define 'new' config options via env var, but I don't think that's a huge deal, anybody making up their own config options should be defining defaults at the Collection level.
  • How should arbitrary/user-driven CLI flag configuration work exactly?
    • I may be missing something we went over above...
    • In Fabric 1 we semi recently added a single flag --set key=val,otherkey=otherval,... for updating Fab's single env config dict. We could go this route, e.g. if we weren't offering -e already one could conceivably do inv --set run.echo=1 mytask or some such.
    • Per-task flags could also tie into this but I'm not sure how because right now the intent is that those match up to task-function arguments. Having values enter the system in 2 places - func arg AND config object - feels really messy and unnecessary.
    • Also maybe related is #174 because what if somebody mismatches the Arguments exposed and their function's signature? Would we want them to go into the config object instead?
      • Do we, horrors, go the Paver route and say "actually lol your function only TAKES one config object parameter" (No, let's not do this.)
@sophacles

@bitprophet Good start on that document! Distilling this rambling conversation is quite the task :)

Re the points above:

  • local-to-project conf file:
    • What about tasks.conf.* for the default config file? It both removes ambiguity, and ties the names together. It also provides an intuitive way of handling the -c option, invoke -c foo could just look for foo.conf.* unless -f is also specified (at which point anything goes of course)
    • (this also ties into your next point...) is there a simple way of auto-determining the format used, rather than base it on file extension? (e.g. try yaml, then try json, ...) Alternately, can we have a "comment defining type as the first line" requirement? If yes to one of those, it could always just be called $COLLECTION_NAME.conf for simplicity and clarity.
  • .yaml v .yml
    • agree with the caveat: if this causes problems with editing tools for configs wanting the .yml, defaulting to that is nicer. This caveat also presents an argument against the $COLLECTION_NAME.conf idea above.
  • environment
  • +1 on INVOKE_ prefix.
  • perhaps we should provide a helper method, or object, or something, that allows pulling other environment variables also. For instance, docker provides me a lot of runtime variables that would be cleaner to put into a context than either pass around or have to call os.getenv()[1]
  • nested contexts and env vars
    • the ambiguity of a smart parser is a bit frightening from both a "i broke it" and a security stand point.
    • what about and optional syntax along the lines of $INVOKE_FOO="context.path.name=value"?
      • pro: no ambiguity
      • pro: allows simple to understand nesting rules
      • con: ugly
      • con: variable name not tied immediately to context position (workarounds: use array, use a single variable INVOKE_CONTEXT="context.path1.name1=value1:context.path2.name2=value1" with also allows the familiar path pattern in setting things... INVOKE_CONTEXT=$INVOKE_CONTEXT:foo=bar mirroring PATH (etc)
  • cli in light of env
    • I think keeping command line args and context settings explicitly separate is the right route, intermingling them hurts my brain as a concept.
    • I like the notion of 0 or more -X or --ctx options (or a single one, comma separated, depending on the resolution to #132) in the manner of --ctx context.path.name=value.
      • this also has a certain amount of consistency with $INVOKE_CONTEXT thought above.
    • note I may be misunderstanding or just missing some of the complexities here, this entire section is a bit confusing to me from the bullets.

[1] The reason for allowing an explicit placing of a non $INVOKE_ var in the config is because otherwise the options I can think of are:

  1. Modify the context in some explicit code at import/runtime. This code must be copied around a bit, because of overriding concerns
  2. Do a check in each task for the context value, then the equivalent environment varialbe, then decide which to use.
  3. Have some script copy regular env vars to invoke vars prior to running the inovke command, but that is a bit messy (although, it could be by invoking invoke, hah!) and seems wrong.
@bitprophet
Member

What about tasks.conf.* for the default config file?

Nah, that's nonstandard/a bit ugly; I'd rather deal with the minimal downsides of invoke.* instead (which is where I've been leaning anyway).

It also provides an intuitive way of handling the -c option, invoke -c foo could just look for foo.conf.*

True, tho I don't think it's enough of a plus to merit the funny looking filename. I'd rather expose eg Context.merge_file(path) that users run at "load time" in their tasks file. (You touched on this with Modify the context in some explicit code at import/runtime at the end.)

auto-determining the format used, rather than base it on file extension? [...] can we have a "comment defining type as the first line" requirement?

A nice set of ideas with precedent (eg editors) but I think it's too much magic, right now at least.

.yaml v .yml [...] problems with editing tools for configs wanting the .yml

This is an unsolvable problem IMO (most tools pick one or the other and it feels evenly split) unless we make the two YAML extensions separate 'options' (i.e. load order pref becomes yaml -> yml -> json -> py) but that feels clunky.

Extension nonsense is messy everywhere; it's a continuum from "we honor one config file only" to "we honor LITERALLY EVERYTHING! .yaml .yml .json .js .py .conf .cfg .ini, auto-detect, file header, file footer, etc". For a brand new tool w/ no existing conventions I'd rather stick to the side of that continuum with the lower implementation & conceptual overhead, and expect users to adapt. Plus, always easier to add than remove features.

+1 on INVOKE_ prefix. [...] the ambiguity of a smart parser is a bit frightening [...] allows pulling other environment variables also

The "detection / smart parser" questions are semi moot, I looked & etcaetera's env adapter is of the explicit nature (it only pulls in what it is told to). We could implement our own env adapter that is more permissive, but it's probably simplest to start with "you can only use env vars matching predefined config keys" - I think I see how to do that with etcaetera's API.

(E.g. it loads up everything 'below' env vars, then introspects that config object for available keys, using that info to run the env-var adapter.)

or have to call os.getenv()[1]

I don't have a huge problem with telling folks "if you need more, just manually look at os.environ / os.getenv".

optional syntax along the lines of $INVOKE_FOO="context.path.name=value"?

That looks kinda inelegant to me, sorry.

command line args and context settings explicitly separate is the right route

Yea that was a dumb question, really :)

I'm not a fan of dealing with shitty key=value string parsing again (it is/was hell in fabric 1) but I don't see another good way to handle this with the existing parser. We could punt till later and say "use env vars for now as your most-runtimey option"...

@sophacles

.yaml v .yml [...] problems with editing tools for configs wanting the .yml

This is an unsolvable problem IMO (most tools pick one or the other and it feels evenly split) unless we make the two YAML extensions separate 'options' (i.e. load order pref becomes yaml -> yml -> json -> py) but that feels clunky.
(more snipped - about tools being split between the two options)

Since there isn't a strong convention, I like .yaml. For the same reason I preferred .html over .htm back in the old days.

... I looked & etcaetera's env adapter is of the explicit nature (it only pulls in what it is told to). We could implement our own env adapter that is more permissive, but it's probably simplest to start with "you can only use env vars matching predefined config keys" - I think I see how to do that with etcaetera's API. ... it loads up everything 'below' env vars, then introspects that config object for available keys, using that info to run the env-var adapte

Sounds good to me. With overrides and nested contexts, and so on we're getting in to "gordian knot" levels of complexity. This is a straight-forward solution preventing the mess.

I don't have a huge problem with telling folks "if you need more, just manually look at os.environ / os.getenv".

Yeah, I'm OK with this, as long as there's a way to get the env vars into the context (as there is in the previous etcaetera stuff you mentioned)

optional syntax along the lines of $INVOKE_FOO="context.path.name=value"?

That looks kinda inelegant to me, sorry.

Me too (I mean my first con was "ugly" :) ). I find it helps though, to explore the solution space sometimes.

I'm not a fan of dealing with shitty key=value string parsing again (it is/was hell in fabric 1) but I don't see another good way to handle this with the existing parser. We could punt till later and say "use env vars for now as your most-runtimey option"...

Ok, that's reasonable. In fact, it might be a good final position. Just looking back over the discussion the original ticket was largely born out of the lack of a config file (with a -f option to choose the file) - the ability to have arbitrary conf file, plus environment variables to handle runtime things, solves the problem completely. bonus: it does so in a far more repeatable way than command line options - args are easy to "forget" to type, and in the case of higher levels of automation, are harder to understand than setting up an environment.

Thinking about my use cases - the only time I'd ever really use the command line options is during development and exploration, and it's not any more work to edit the config or set the environment.

@bitprophet
Member

Sounds like we're on the same page now, yay :) thanks as usual!

Re: env vars and not flags, my instinct is to have both, but I can't really articulate why, and I do still think punting is the way to go right now. Need less crap in the way of a release.

I'll update the stupid checklist with actionable BS from this part of the discussion and get back to work.

@bitprophet
Member

We never did reach a consensus on INVOKE_RUN_ECHO vs INVOKE__RUN__ECHO but given both options have a downside and the need to "pre-load" available keys in the Env adapter, I think single underscores is it for now - will see how it shakes out in implementation.

@bitprophet bitprophet referenced this issue Sep 22, 2014
Closed

Consider making contextualization the default #114

5 of 6 tasks complete
@bitprophet
Member

Finally in a decent spot re: the docs for this, time to start implementing. Sheesh.

@bitprophet
Member

Argh here's a wrinkle I somehow missed, etcaetera has a handful of Python 2.7'isms in it :( and the author does explicitly list 2.7 only in his setup.py.

I've already fixed all the things preventing import of etcaetera.config.Config, here: https://github.com/bitprophet/etcaetera/compare/python-2.6 and will submit a PR in case they're open to merging it.

Assuming they say it's not worth the loss of dict comprehensions (which I think would be misguided, given how tiny my changes needed to be, but totally their right) I'll probably just need to consider this another point in favor of vendorizing our dependencies, and vendorize my copy :(

Alternately I could try Configurati if it's 2.6 compat, but I like etcaetera so far and we vendorize deps anyway...

@sophacles

I think vendorizing isn't a terrible solution - It sort of depends on the timeline for 2.6 support... how long until 2.6 is considered "too old"? (Presumably it's when Redhat/Centos 6 usage are significantly small).

@bitprophet
Member

I mean, I was going to vendorize it either way (that's just how we're going at this point - partly so invoke itself is an easier install), this just means I may need to vendorize my fork instead of straight upstream.

I dunno when 2.6 is "too old" but when I was checking that out for my recent 2+3 work, 2.6 still had a nontrivial minority install base according to some internal download statistics on PyPI. It was something like 10-25% IIRC. There were other reasons as well (partly CentOS 6, yes, also I think the latest Debian too perhaps).

The tl;dr was that the benefits of 2.7 aren't enormous (again see the diff above - dict comprehensions are nice but only save a few characters over dict(<generator expr>), for example) and it's just as easy to handle 2.6 in terms of 2+3 support, so cutting out people on stable platforms felt like a bad tradeoff.

@bitprophet
Member

Oh huh as I went to check project metadata for 2.6 support in etcaetera (so my PR is comprehensive) I noticed their own CONTRIBUTING docs explicitly say 2.6 onwards!

But also I just remembered - it requires PyYAML which is a C extension and that starts to kill my ability to vendorize things :( Will have to look into that harder to see if there's a pure Python lib that works as well.

EDIT: Seems it actually just preferentially uses libyaml if it's found but will install pure Python otherwise, and I happen to have that installed. So theoretically I can vendorize the pure Python bits and it may "just work".

@bitprophet
Member

Yea boo, PyYAML actually has 2 copies of its source differentiated for Python 2 vs 3 and its setup.py picks the right one at install time. I guess we can still vendor it by including both & tweaking our (still forked, I guess) etcaetera so it imports the right one depending on six.PY2 or whatnot? Will have to see. Might be able to just copy PyYAML's setup.py bits into our own.

Alternately, could just modify the vendored PyYAML so it's 2+3 in one codebase, using six or something, but this depends heavily on how extensive the diff is (presumably nontrivial) and then we're supporting that ourselves.

(Yes, at this point I do start to question the feasibility of the vendorize-everything approach =/)

@bitprophet
Member

Ah yes and I also need to do ye olde import munging inside etcaetera too, when vendorizing. Though by using relative imports I think I can make that into another potential upstream PR.

@bitprophet
Member

Have made a lot of progress so far (albeit slowly) including more tweaks to etcaetera (still working with upstream on whether to merge those).

Current status is that the new tests for the cli and config test modules pass when run in isolation but many of the config ones fail when run as part of the greater suite, implying some sort of state bleed :( Only just saw this so haven't dug yet.

Once that's done it's back to filling out the rest of the skeleton tests.

@bitprophet
Member

Yup that was just some un-cleaned-up os.chdir in the rest of the test suite. It'd be a spot of work to try and not do that so I just made sure appropriate teardown() methods were set up for now.

@bitprophet
Member

More test related fun revolving around use of tests._utils._dispatch combined with sys.exit patching, resulting in situations where incorrect invocations (due to a wrong cwd, or whatever) were not dying correctly but triggering the new "if no tasks, just print help" behavior.

@bitprophet
Member

Runtime config file done. Only hierarchy member remaining is env vars.

Then I need to write a bunch of tests enforcing that the hierarchy functions as intended (so far we've just proven that each individual level works on its own). Ideally that will be almost all test writing and no implementation as that's what we're relying on etcaetera for.

@bitprophet
Member

Our hierarchy presents an ordering problem regarding env vars:

  • Most adapters get added in Config.__init__ (files, overrides etc)
  • It's possible to specify defaults in __init__, meaning you can technically do Config() -> load() -> done - though this is currently only useful for tests.
  • Real cli.py usage has to call set_defaults to load per-task collection driven defaults, so it's Config() -> set_defaults() -> load() -> done
  • The Env adapter would ideally be set up in __init__ like the rest, but it needs to know about the other config keys.
  • Meaning at the very least it has to be set up after set_defaults() (so all adapters besides itself are accounted for), and potentially even during load() (if we want to be "clean" and actually-load, versus trying to save extra load actions & poking at the unloaded adapters beforehand).
    • In fact if we need to do a "pre-load()" then we need to call the upstream load() at least twice between initialization and actual usage of the configuration, for each task run. Maybe not horrible but still kinda bluh.

Sounds like there's no great way around it for now so I'll suck it up and add eg load_env.

@bitprophet
Member

Feels gross when writing the docs. If this API is ever actually going to be used outside of Invoke-core (i.e. by users trying to tweak its behavior) having to instantiate, then call multiple set_* methods prior to a final .load() seems real clunky.

Going to experiment with a slimmer API - either an easier one-shot setup method, or just deferring Config.__init__ until deeper inside the Executor class (i.e. when we have more of the component dicts available to us.)

@bitprophet
Member
  • CLI creates a Context with just the initial Config object, hands to Executor
  • Executor stores context, then at exec time (if task is contextualized) clones it, updates it with defaults data (and now, for env, it should also update with the Env adapter), and calls load.
  • Nothing else uses the context or config object.

So it feels like (ignoring any potential test shortcut concerns) we could simplify as follows:

  • First, delay creation of the Context until the last minute, this way it no longer needs its own clone (which does nothing but call clone on its Config) - that setup was from when Context "was" the config object.
  • Then we can pass in the Config object to Executor instead (still initializing it with data from the parser)
    • We can probably also do away with set_overrides and just add as a constructor kwarg
  • Finally, perhaps roll one new API call for the final load step, which incorporates the defaults data from the collection, then sets up the Env junk, and performs a final real load.
    • Which in turn means we could do away with .set_defaults since its data would be handed to this new method.

An even deeper approach would be to delay Config creation until runtime and just pass in the constituent data (project & runtime paths, and overrides dict) to Executor. However this feels too muddy and only saves us one extra step.

@bitprophet
Member

Also think I will axe the kwargs-as-defaults thing, it's too much of a concession to test setup & with the full hierarchy almost implemented it really starts to feel like a violation of ZoP #2

Having the ability to set defaults at init time as the first arg is almost as good?

This was referenced Oct 14, 2014
@bitprophet
Member

Lollerskates, picking this back up and running into a hilarious issue where the Python module adapter - when tuned to accept lowercase keys - is thus loading literally all of the Python namespace in the module, such as __builtins__. Only discovered this because the new env stuff I am working on performs the necessary recursion to identify valid keys.

@bitprophet
Member

Past that now, and core env var support is in. Env var type-casting is next, then ensuring the hierarchy works, and that might do it.

@bitprophet
Member

Type-casting down.

@bitprophet
Member

Bunch of hierarchy tests already pass, first fail was unexpected, env vars are not overriding the per-project conf file (only thing that should override env vars is the runtime conf file). Wondering if the "double-load" setup isn't working the way I expect it to and env isn't getting applied - though it was working fine in the simpler tests.

@bitprophet
Member

Interesting - when I introspect the AdapterSet at the fail point, the NestedEnv adapter is ending up 2nd in the list, after Defaults, instead of at the end where I expected it. Need to dig and figure out why, perhaps some sort of "where's the next spot to insert an adapter" pointer is getting reset on that first load.

EDIT: Looks like a bug in etc.Config.insert, when it is a "true insert" (vs an insert that would be an append) it does this strange rotate (feels like reverse), append, rotate backwards, process. Probably only shows up in this kind of situation when one adds a new non-special adapter after adding Overrides (since if there is no overrides it's just an append).

EDIT 2: The rotate is https://docs.python.org/2.6/library/collections.html#collections.deque.rotate

@bitprophet
Member

OK, fixed boog, submitted oleiade/etcaetera#19

@bitprophet
Member

Now I am at the expected failure where env/runtime conflict:

  • Env must be added last so it can 'see' all the other values (via an initial load() call)
  • But by being added last, it is now higher than the runtime config file in the hierarchy
  • Feels like the 'best' way to fix it is to manually call .insert on the Config adapters object, instead of calling register on the Config itself.
    • This breaks encapsulation, which is less than ideal.
    • The only other way to affect the adapters deque without doing so is to override the entire adapters list with Config.adapters = which feels just as gross if not moreso.
@bitprophet
Member

OK, entire config test module is green, time to add a handful of CLI module integration style tests (and probably something to actual integration suite, heh) and this might be like...done. For real. EDIT: except I forgot the debuggery stuff in the checklist, and attempted real world use to eg override the -e flag didn't work. Whee. Those dovetail well though.

Realized that super low level debugging might mean even more invasive changes to etcaetera, hopefully not the case (and I might skip it if so, given how long this has taken).

@bitprophet
Member

Wow, I think I got all the way here only to figure out that etcaetera doesn't perform deep nested merging (i.e. anything deeper than top level is wholly overridden by higher levels). Double checking this but it's kind of a major WTF if so. Why is this feature cursed :(

@bitprophet
Member

Yup etc.Config just does .update in a for loop. Updating it to perform recursive merging & will see if upstream accepts - doing it inline for now, could make it controllable pretty easily if that makes it more palatable.

@bitprophet
Member

Have that generally working, but now running into other issues, I suspect because it's not performing tilde expansion within File. (Also, yes, printing debug values about what was loaded or not loaded looks like it might be a pain too.)

At this point, really starting to suspect it's time to cut losses and just remove etcaetera from the loop as I'm spending more time working around it / modifying it, than actually using it :( Will see though, if i can just get this tiny bit farther, leaving it in would work for now.

@bitprophet
Member

Have even more tiny baby branches ready to submit upstream, tho I am going to ask oleiade what he thinks of both my existing PRs and these new ones, and whether it's worth even submitting stuff or if I should just keep using my fork-as is (and probably wean myself back off of etcaetera if I find teh time).

Also realized while poking at the debuggery - I somehow misplaced the tests that proved "YAML prevents loading of JSON or Python, etc"!! Poking at that now :( EDIT: tests in, naturally implementation requires Yet More Etcaetera Munging because I can't really do it as I am now w/ all 3 explicit file paths. So, a File subclass (or something) it is.

@bitprophet
Member

While writing out the docstring for this, it does make me wonder if it's worth revisiting our logic here, now that we have an existing implementation that makes it much easier to have all 3 file suffixes in play. Rereading our arguments above, though, it does still feel like the option best for users as well - nobody is likely to have two types of this file in a single prefix/location and this does make things easier to reason about when troubleshooting. Will soldier on.

@bitprophet
Member

Welp also realized while debugging that I never implemented the INVOKE_ prefix the docs claim we look for, for env vars. Hooray! Easy enough to make an option which I can disable for testing though, adding it to every test string feels kinda gross.

@bitprophet
Member

Continuing to discover various new exciting things that were either forgotten, or unknowable until implementation time.

Latest is that if we want to allow configuration of the --no-dedupe config flag, it causes more config ordering issues because Executor wants to call load on the configuration right before execution (due to loading collection-driven defaults) but we also want to know whether config files stated to perform deduplication or not (and deduping occurs before execution, of course).

This is now the third or fourth instance of having to bend over backwards to meet etcaetera's view of config loading :( I think I am going to keep digging this hole deeper & add another 'pre-load load()' but at this point it's clear I'd have spent much less effort just writing something from scratch.

@bitprophet
Member

Now running into the 4th or 5th instance: dealing with how etc's AdapterSet is a deque and makes reasoning about the hierarchy and inserting/overwriting things a big PITA. Time to branch and try ripping it out.

@bitprophet
Member

A ways into that effort (been doing other things lately too). Realizing my original idea of a truly lazy config object is too much magic/code/etc and is really not necessary, so I'm tweaking things to be a little simpler.

@bitprophet
Member

Wow. The tests pass. And then they pass under Python 3. And so do the integration tests. And the docs build. Is this nightmare over?!

Need to revisit checklist up top. Eg debugging is still possibly hairy - by having debug output in every config-merge, and then having to run merging many times during a single CLI session (due to reloading for env vars, re-merging after cloning, etc) it becomes very verbose, and this isn't even with per-variable logging, which might be useful in some deep cases.

But really, I've spent so much God damn time on this I can't see myself waiting much longer to merge & move onwards. It's been literally months of spare time down the drain.

@sophacles

W00t. For the record... It's not down the drain. We've been building our scripts aroarouthe eventual aavailability of this feature and are super excited for it!

@bitprophet
Member

Yea it was a poor choice of words, but still, there's just no way this feature - however complex - should have taken literally months >_< at least I learned some hard lessons about using third party code. (Which is sad, usually I'm super aggro about people not reusing libraries.)

@bitprophet
Member

I'm currently back looking at my nascent Fabric 2 branch to make sure that this new API works for its needs, since that was the entire reason I started poking this :) assuming it looks passable I'll merge to master and put out a new Invoke release.

@bitprophet bitprophet added a commit that referenced this issue Jan 23, 2015
@bitprophet bitprophet Changelog re #147, closes #147 87eeaf7
@sophacles

W00t!!

@bitprophet
Member

I haven't actually finished the Fab 2 integration but I figured any changes would be minimal and was not worth having this mega huge branch floating outside master :) whee.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment