Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Internal: Improve how variables flow through the code #45

Closed
smyrman opened this issue Jul 11, 2017 · 12 comments
Closed

Internal: Improve how variables flow through the code #45

smyrman opened this issue Jul 11, 2017 · 12 comments

Comments

@smyrman
Copy link
Contributor

smyrman commented Jul 11, 2017

This might be a v2.0.0 improvement as it could impose some breaking changes.

Right now, when evaluating template variables (and env) in the task package, we explicitly pass:

  • A pointer to the Executor
  • A lookup-name for the current Task (part of Call) that can be used to find the task in the Executor.
  • A set of override Variables (part of Call).

There is a lot of boilerplate on how to set the input parameters for this method, and challenging to keep consistent behavior, as proven by #42 and #40. It should also be noted that the Executor has become sort of a God object with a lot of methods put directly on that, as most methods need to access something within the executor struct. It would be nice to think of a way where variables flow more naturally through the program, and allows us to drop the boiler-plate. It would also be nice if we could move methods from the executor to e.g. the Task or Call types.

I don't have a concrete proposal yet, but here are some ideas to develop further:

  1. If we made a copy of Task when we called it, we could have methods to replace variables directly on the task (and evaluate them) before invoking deps and cmds. Might be best done when support for the set property is dropped from the yaml.
  2. We already use context.Context for our calls. context.Context wold also be a perfect candidate for passing variables to (sub-task), as it is always forked when modified, similar to how a shell is forked when you run a command or sub-shell.

When I think of breaking changes for task, I mostly consider that as breaking changes to the yaml, however task also exposes a public API in the task package that could potentially be used programatically. Something to keep in mind, is weather the public API for the task package could be reduced, so that it's also possible to retain compatibility there. Generally, keep as much as possible as private types / functions, and only expose what's explicitly needed for the main package to work, and what's the minimal requirement for programmatic access by end-users.

@smyrman smyrman added the type: enhancement A change to an existing feature or functionality. label Jul 11, 2017
@smyrman smyrman changed the title Internal: Improve how variables flow through the code. Internal: Improve how variables flow through the code Jul 11, 2017
@andreynering andreynering added type: refactoring and removed type: enhancement A change to an existing feature or functionality. labels Jul 11, 2017
@smyrman
Copy link
Contributor Author

smyrman commented Jul 13, 2017

Starting on a proposal/discussion of 1 (copy Task) .

ReplaceVariables is currently called within the following functions:

  • Executor.runDeps(ctx context.Context, call Call) error
  • Executor.runCommand(ctx context.Context, call Call, i int) error
  • Executor.getTaskDir(call Call) (string, error)
  • Executor.ReplaceSliceVarialbes(call Call) ([]string, error)
  • Executor.getVariables(call Call) (error)

ReplaceSliceVariables is called within:

  • Executor.isUpToDateStatus(ctx context.Context, call Call) (bool, error)
  • Executor. isUpToDateTimestamp(ctx context.Context, call Call) (bool, error)

While it probably does not resolve the full problem, if we made a copy of task, all of the methods above would have at least one true point below:

  • not need to call ReplaceVariables
  • not need access to Executor
  • not need access to Call
  • not be needed at all
  • can be moved to Task

This is how a copy of Task might look like in code (function headers only):

type Call struct {
       Cmd  string `yaml:"sh"`
       Task string `yaml:"task"`
       Vars Vars   `yaml:"vars"`
       Env  Vars   `yaml:"env"`. // Maybe add this?
}

type Task struct {
        Deps []Call   // _could_ allow Cmd's as deps
        Cmds []Call 
        ...
        e *Executor  // Maybe add this? Should be null if the task has not been compiled.
}

// CompiledTask returns a copy of the task with the given name with all templates
// compiled for all fields, and Dir resolved to an (absolute) directory.
func (e *Executor) CompiledTask(name string, vars Vars) (*Task, error) {...}

// IsUpToDate returns true if the task is up-to-date, or false if it's not. If an
// error is returned, the first parameter will always be false.
func (t Task) IsUpToDate() (bool, error) {...}

// task private helper functions for the above public method.
func (t Task) isUpToDateTimestamp() (bool, error) {...}
func (t Task) isUpToDateStatus() (bool, error) {...}


// Run a compiled task instance. Returns an error if the task is not compiled.
func (t Task) Run() error {...}

// IsTask returns true if c represents a task call.
func (c Call) IsTask() bool {...} 

// CompiledTask is a shortcut for e.CompiledTask(c.Task, c.Vars).
func (c Call) CompiledTask(e *Executor) (*Task, error) {...}

// Cmd returns a prepared os/exec.Cmd that can be used for execution.
func (c Call) Cmd(env Vars) (exec.Cmd, error) {...}

@smyrman
Copy link
Contributor Author

smyrman commented Jul 13, 2017

Since I suggest using Call for both Cmd and Deps, and potentially allow shell commands as deps, we might also want the following convenience function:

func (c Call) IsUpToDate(e *Executor) (bool, error) {
        if !c.IsTask {
                return false, nil
        }
        t, err := c.CompiledTask(e)
        if err != nil {
                return false, err
        }
        return t.IsUpToDate()
}

@andreynering
Copy link
Member

I agree with these proposals.

We should do this in small steps, and not once, to prevent breakage. 💣💥

Maybe we should fix #40 before start this. I already could reproduce it, but don't have a fix yet.

@smyrman
Copy link
Contributor Author

smyrman commented Jul 13, 2017

I think fixing #40 and add a new test for it first would be nice, if we can figur out the cause.

As for the refactor, if some of it can be done in small steps, that would be cool, but I would not be to concerned about doing the concrete proposal for 1 in one big bang as long as it can be properly tested and rewiewed - but it probably do need more tests, perheps also some unittests of particular functions.

andreynering added a commit that referenced this issue Jul 16, 2017
…unrelated code

this is the first big step of #45

now t.ReplaceVariable will return a copy of the task, but with variables
replaced

now we don't need to replace variables everywhere in the code, and will
make other refactorings easier
@smyrman
Copy link
Contributor Author

smyrman commented Jul 17, 2017

Comment to your commit, if you make the call-count, not to use pointers but just be a map[string]int, you don't need to worry about preallocating the integers, as things will just work:

I also don't see a that there is reason for making this interger an int32 rather than just an int. int is usually a bit faster as it will match the target architecture.

@andreynering
Copy link
Member

andreynering commented Jul 17, 2017

I want to use sync/atomic package to make it thread-safe.

This explain why I used int32, it's because we don't have a atomic.AddInt func, just atomic.AddInt32.

Also it requires a pointer.

EDIT: but of course, we can change that to use a mutex instead, or sync.Map.

@smyrman
Copy link
Contributor Author

smyrman commented Jul 17, 2017

I should have read the code better before I commented. Otherwise the changes looks much better.

@smyrman
Copy link
Contributor Author

smyrman commented Jul 17, 2017

Happy you invite me for reviews if you want btw., and do some manual testing for this refactoring in particular.

@andreynering
Copy link
Member

@smyrman Sure, testing is always welcome. 😄

I really liked the idea of cloning the task. Some other small refactorings will be easier now.

@andreynering
Copy link
Member

We already use context.Context for our calls. context.Context wold also be a perfect candidate for passing variables to (sub-task), as it is always forked when modified, similar to how a shell is forked when you run a command or sub-shell.

I think that wouldn't work, because we'd need to convert context.Context is an map[string]string to pass to templates, and that's not easily achievable because we can't know what keys are available inside a context.

@smyrman
Copy link
Contributor Author

smyrman commented Jul 17, 2017

I think that wouldn't work, because we'd need to convert context.Context is an map[string]string to pass to templates, and that's not easily achievable because we can't know what keys are available inside a context.

I meant 1 (copy of task) and 2 (use context) as two alternatives, so happy we go with 1. I think that's the best fit. However, I assure you it wold work :-)

If you are interested in using Context for storing variables (for some other project perhaps), the way I understand the idiotic pattern for best usage is something like this:

type ctxKey int // private type to avoid key collision with other packages using context...

// private keys to prevent miss-usage outside of package.
const (
    ctxKeyVars ctxKey = iota
    ctxKeyEnv

func ContextWithVars(ctx context.Context, vars Vars) context.Context
        return context.WithValue(ctxKeyVars, vars)
}

func VarsFromContext(ctx) Vars {
        i := ctx.Value(ctxKeyVars).
        if i == nil {
                // when no value was set for ctxKeyVars.
                return nil
        }
        // only ContextWithVars may possible set this value,
        // so no reason to check for a second ok parameter 
        // during type-cast.
        return i.(Vars)
}

That said, it's probably still better to pass explicit parameters whenever possible, or at least according to some of the popular golang bloggers.

@andreynering
Copy link
Member

Hmm... I meant, it wouldn't work each value were set as a separated key in the context. I understood you meant this way, because them they would be immutable.

Anyway, I think copying is a best fit. 😉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants