Internal: Improve how variables flow through the code #45

smyrman · 2017-07-11T10:30:07Z

This might be a v2.0.0 improvement as it could impose some breaking changes.

Right now, when evaluating template variables (and env) in the task package, we explicitly pass:

A pointer to the Executor
A lookup-name for the current Task (part of Call) that can be used to find the task in the Executor.
A set of override Variables (part of Call).

There is a lot of boilerplate on how to set the input parameters for this method, and challenging to keep consistent behavior, as proven by #42 and #40. It should also be noted that the Executor has become sort of a God object with a lot of methods put directly on that, as most methods need to access something within the executor struct. It would be nice to think of a way where variables flow more naturally through the program, and allows us to drop the boiler-plate. It would also be nice if we could move methods from the executor to e.g. the Task or Call types.

I don't have a concrete proposal yet, but here are some ideas to develop further:

If we made a copy of Task when we called it, we could have methods to replace variables directly on the task (and evaluate them) before invoking deps and cmds. Might be best done when support for the set property is dropped from the yaml.
We already use context.Context for our calls. context.Context wold also be a perfect candidate for passing variables to (sub-task), as it is always forked when modified, similar to how a shell is forked when you run a command or sub-shell.

When I think of breaking changes for task, I mostly consider that as breaking changes to the yaml, however task also exposes a public API in the task package that could potentially be used programatically. Something to keep in mind, is weather the public API for the task package could be reduced, so that it's also possible to retain compatibility there. Generally, keep as much as possible as private types / functions, and only expose what's explicitly needed for the main package to work, and what's the minimal requirement for programmatic access by end-users.

The text was updated successfully, but these errors were encountered:

smyrman · 2017-07-13T13:42:22Z

Starting on a proposal/discussion of 1 (copy Task) .

ReplaceVariables is currently called within the following functions:

Executor.runDeps(ctx context.Context, call Call) error
Executor.runCommand(ctx context.Context, call Call, i int) error
Executor.getTaskDir(call Call) (string, error)
Executor.ReplaceSliceVarialbes(call Call) ([]string, error)
Executor.getVariables(call Call) (error)

ReplaceSliceVariables is called within:

Executor.isUpToDateStatus(ctx context.Context, call Call) (bool, error)
Executor. isUpToDateTimestamp(ctx context.Context, call Call) (bool, error)

While it probably does not resolve the full problem, if we made a copy of task, all of the methods above would have at least one true point below:

not need to call ReplaceVariables
not need access to Executor
not need access to Call
not be needed at all
can be moved to Task

This is how a copy of Task might look like in code (function headers only):

type Call struct {
       Cmd  string `yaml:"sh"`
       Task string `yaml:"task"`
       Vars Vars   `yaml:"vars"`
       Env  Vars   `yaml:"env"`. // Maybe add this?
}

type Task struct {
        Deps []Call   // _could_ allow Cmd's as deps
        Cmds []Call 
        ...
        e *Executor  // Maybe add this? Should be null if the task has not been compiled.
}

// CompiledTask returns a copy of the task with the given name with all templates
// compiled for all fields, and Dir resolved to an (absolute) directory.
func (e *Executor) CompiledTask(name string, vars Vars) (*Task, error) {...}

// IsUpToDate returns true if the task is up-to-date, or false if it's not. If an
// error is returned, the first parameter will always be false.
func (t Task) IsUpToDate() (bool, error) {...}

// task private helper functions for the above public method.
func (t Task) isUpToDateTimestamp() (bool, error) {...}
func (t Task) isUpToDateStatus() (bool, error) {...}


// Run a compiled task instance. Returns an error if the task is not compiled.
func (t Task) Run() error {...}

// IsTask returns true if c represents a task call.
func (c Call) IsTask() bool {...} 

// CompiledTask is a shortcut for e.CompiledTask(c.Task, c.Vars).
func (c Call) CompiledTask(e *Executor) (*Task, error) {...}

// Cmd returns a prepared os/exec.Cmd that can be used for execution.
func (c Call) Cmd(env Vars) (exec.Cmd, error) {...}

smyrman · 2017-07-13T14:02:07Z

Since I suggest using Call for both Cmd and Deps, and potentially allow shell commands as deps, we might also want the following convenience function:

func (c Call) IsUpToDate(e *Executor) (bool, error) {
        if !c.IsTask {
                return false, nil
        }
        t, err := c.CompiledTask(e)
        if err != nil {
                return false, err
        }
        return t.IsUpToDate()
}

andreynering · 2017-07-13T15:54:05Z

I agree with these proposals.

We should do this in small steps, and not once, to prevent breakage. 💣💥

Maybe we should fix #40 before start this. I already could reproduce it, but don't have a fix yet.

smyrman · 2017-07-13T18:32:39Z

I think fixing #40 and add a new test for it first would be nice, if we can figur out the cause.

As for the refactor, if some of it can be done in small steps, that would be cool, but I would not be to concerned about doing the concrete proposal for 1 in one big bang as long as it can be properly tested and rewiewed - but it probably do need more tests, perheps also some unittests of particular functions.

…unrelated code this is the first big step of #45 now t.ReplaceVariable will return a copy of the task, but with variables replaced now we don't need to replace variables everywhere in the code, and will make other refactorings easier

smyrman · 2017-07-17T13:43:24Z

Comment to your commit, if you make the call-count, not to use pointers but just be a map[string]int, you don't need to worry about preallocating the integers, as things will just work:

https://play.golang.org/p/fhrnqKuPEw

I also don't see a that there is reason for making this interger an int32 rather than just an int. int is usually a bit faster as it will match the target architecture.

andreynering · 2017-07-17T13:51:10Z

I want to use sync/atomic package to make it thread-safe.

This explain why I used int32, it's because we don't have a atomic.AddInt func, just atomic.AddInt32.

Also it requires a pointer.

EDIT: but of course, we can change that to use a mutex instead, or sync.Map.

smyrman · 2017-07-17T14:10:38Z

I should have read the code better before I commented. Otherwise the changes looks much better.

smyrman · 2017-07-17T14:13:24Z

Happy you invite me for reviews if you want btw., and do some manual testing for this refactoring in particular.

andreynering · 2017-07-17T14:26:07Z

@smyrman Sure, testing is always welcome. 😄

I really liked the idea of cloning the task. Some other small refactorings will be easier now.

andreynering · 2017-07-17T14:28:30Z

We already use context.Context for our calls. context.Context wold also be a perfect candidate for passing variables to (sub-task), as it is always forked when modified, similar to how a shell is forked when you run a command or sub-shell.

I think that wouldn't work, because we'd need to convert context.Context is an map[string]string to pass to templates, and that's not easily achievable because we can't know what keys are available inside a context.

smyrman · 2017-07-17T21:06:23Z

I think that wouldn't work, because we'd need to convert context.Context is an map[string]string to pass to templates, and that's not easily achievable because we can't know what keys are available inside a context.

I meant 1 (copy of task) and 2 (use context) as two alternatives, so happy we go with 1. I think that's the best fit. However, I assure you it wold work :-)

If you are interested in using Context for storing variables (for some other project perhaps), the way I understand the idiotic pattern for best usage is something like this:

type ctxKey int // private type to avoid key collision with other packages using context...

// private keys to prevent miss-usage outside of package.
const (
    ctxKeyVars ctxKey = iota
    ctxKeyEnv

func ContextWithVars(ctx context.Context, vars Vars) context.Context
        return context.WithValue(ctxKeyVars, vars)
}

func VarsFromContext(ctx) Vars {
        i := ctx.Value(ctxKeyVars).
        if i == nil {
                // when no value was set for ctxKeyVars.
                return nil
        }
        // only ContextWithVars may possible set this value,
        // so no reason to check for a second ok parameter 
        // during type-cast.
        return i.(Vars)
}

That said, it's probably still better to pass explicit parameters whenever possible, or at least according to some of the popular golang bloggers.

andreynering · 2017-07-17T21:34:04Z

Hmm... I meant, it wouldn't work each value were set as a separated key in the context. I understood you meant this way, because them they would be immutable.

Anyway, I think copying is a best fit. 😉

smyrman added the type: enhancement A change to an existing feature or functionality. label Jul 11, 2017

smyrman changed the title ~~Internal: Improve how variables flow through the code.~~ Internal: Improve how variables flow through the code Jul 11, 2017

andreynering added type: refactoring and removed type: enhancement A change to an existing feature or functionality. labels Jul 11, 2017

smyrman mentioned this issue Jul 16, 2017

Multiple tasks with the same dependency should run dependency once! #53

Closed

andreynering closed this as completed Sep 6, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Internal: Improve how variables flow through the code #45

Internal: Improve how variables flow through the code #45

smyrman commented Jul 11, 2017 •

edited

smyrman commented Jul 13, 2017 •

edited

smyrman commented Jul 13, 2017 •

edited

andreynering commented Jul 13, 2017

smyrman commented Jul 13, 2017 •

edited

smyrman commented Jul 17, 2017

andreynering commented Jul 17, 2017 •

edited

smyrman commented Jul 17, 2017 •

edited

smyrman commented Jul 17, 2017

andreynering commented Jul 17, 2017

andreynering commented Jul 17, 2017

smyrman commented Jul 17, 2017 •

edited

andreynering commented Jul 17, 2017

Internal: Improve how variables flow through the code #45

Internal: Improve how variables flow through the code #45

Comments

smyrman commented Jul 11, 2017 • edited

smyrman commented Jul 13, 2017 • edited

smyrman commented Jul 13, 2017 • edited

andreynering commented Jul 13, 2017

smyrman commented Jul 13, 2017 • edited

smyrman commented Jul 17, 2017

andreynering commented Jul 17, 2017 • edited

smyrman commented Jul 17, 2017 • edited

smyrman commented Jul 17, 2017

andreynering commented Jul 17, 2017

andreynering commented Jul 17, 2017

smyrman commented Jul 17, 2017 • edited

andreynering commented Jul 17, 2017

smyrman commented Jul 11, 2017 •

edited

smyrman commented Jul 13, 2017 •

edited

smyrman commented Jul 13, 2017 •

edited

smyrman commented Jul 13, 2017 •

edited

andreynering commented Jul 17, 2017 •

edited

smyrman commented Jul 17, 2017 •

edited

smyrman commented Jul 17, 2017 •

edited