Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ability to loop over values #1220

Merged
merged 6 commits into from Jul 25, 2023
Merged

Ability to loop over values #1220

merged 6 commits into from Jul 25, 2023

Conversation

pd93
Copy link
Member

@pd93 pd93 commented Jun 15, 2023

Fixes #82

This PR adds the ability to loop over values in Task. This is useful when you want to run a command multiple times with different values. It supports looping over static lists, variables, and task sources and can be used with commands and tasks.

I've implemented this slightly differently to the suggestions in #82 as I believe that implementing this at the cmds level gives us a lot more flexibility. It means we can run other commands before/after doing loops, but still allows us to loop over commands and tasks and pass around variables.

I've pasted the docs added in this PR below as they provide a good idea of how to use the new functionality:


Looping over values

Task allows you to loop over certain values and execute a command for each.
There are a number of ways to do this depending on the type of value you want to
loop over.

Looping over a static list

The simplest kind of loop is an explicit one. This is useful when you want to
loop over a set of values that are known ahead of time.

version: '3'

tasks:
  default:
    cmds:
      - for: ['foo.txt', 'bar.txt']
        cmd: cat {{ .ITEM }}

Looping over your task's sources

You are also able to loop over the sources of your task:

version: '3'

tasks:
  default:
    sources:
      - foo.txt
      - bar.txt
    cmds:
      - for: sources
        cmd: cat {{ .ITEM }}

This will also work if you use globbing syntax in your sources. For example, if
you specify a source for *.txt, the loop will iterate over all files that
match that glob.

Looping over variables

To loop over the contents of a variable, you simply need to specify the variable
you want to loop over. By default, variables will be split on any whitespace
characters.

version: '3'

tasks:
  default:
    vars:
      my_var: foo.txt bar.txt
    cmds:
      - for:
          var: my_var
        cmd: cat {{ .ITEM }}

If you need to split on a different character, you can do this by specifying the
split property:

version: '3'

tasks:
  default:
    vars:
      my_var: foo.txt,bar.txt
    cmds:
      - for:
          var: my_var
          split: ','
        cmd: cat {{ .ITEM }}

All of this also works with dynamic variables!

version: '3'

tasks:
  default:
    vars:
      my_var:
        sh: find -type f -name '*.txt'
    cmds:
      - for:
          var: my_var
        cmd: cat {{ .ITEM }}

Renaming variables

If you want to rename the iterator variable to make it clearer what the value
contains, you can do so by specifying the as property:

version: '3'

tasks:
  default:
    vars:
      my_var: foo.txt bar.txt
    cmds:
      - for:
          var: my_var
          as: FILE
        cmd: cat {{ .FILE }}

Looping over tasks

Because the for property is defined at the cmds level, you can also use it
alongside the task keyword to run tasks multiple times with different
variables.

version: '3'

tasks:
  default:
    cmds:
      - for: [foo, bar]
        task: my-task
        vars:
          FILE: '{{ .ITEM }}'

  my-task:
    cmds:
      - echo '{{ .FILE }}'

Or if you want to run different tasks depending on the value of the loop:

version: '3'

tasks:
  default:
    cmds:
      - for: [foo, bar]
        task: task-{{ .ITEM }}

  task-foo:
    cmds:
      - echo 'foo'

  task-bar:
    cmds:
      - echo 'bar'

Copy link
Member

@andreynering andreynering left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work as always! 👏 👏 👏

The implementation seems great!

I added a couple of comments for discussion.

taskfile/cmd.go Outdated
Task string
ForEach *ForEach
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should call it only for. It'd would align well with the Go syntax. I don't have a strong opinion, though. foreach is also fine.

If we keep foreach, I'd add a struct tag here because otherwise I believe it'd also be possible to declare it as for_each, leading to inconsistencies.

Suggested change
ForEach *ForEach
ForEach *ForEach `yaml:"foreach"`

Copy link
Member Author

@pd93 pd93 Jun 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to use for instead. It keeps things nice and concise. foreach was based off the comments in #82.

Edit - Updated

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andreynering Quick question on this. foreach: source kind of made sense because it reads well in English.

  • "Do thing for each source"

Now that we have for instead of foreach, I think it makes more sense to use the attribute name directly rather than the singular form of the noun. i.e. for: sources reads better to me than for: source.

This would also make it easier to be consistent if we ever loop over other fields. for: generates makes more sense than trying to work out what the singular form of generates is.

Copy link
Member Author

@pd93 pd93 Jun 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^^^ I have updated to for: sources as mentioned.

variables.go Outdated Show resolved Hide resolved
}
// Get the list from the task sources
if cmd.ForEach.From == "source" {
list, err = fingerprint.Globs(new.Dir, new.Sources)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that foreach: source is returning absolute paths like:

/Users/andrey/Developer/andrey/task/args/args.go

instead relative like

args/args.go

We need to discuss which would be more appropriate to the user: absolute or relative paths. Ideally, whichever option we choose, the user should be able to convert one into another. I'm inclined to prefer relative. We can use filepath.Rel to convert an abs path to relative.

What may become tricky here is that we should take into account the task directory and not the user directory when converting.

Maybe we should add a TASK_DIR special var similar to the other special variables we already have. Then, the user would be able to do something like echo {{join .TASK_DIR .ITEM}} to get the absolute path.

(This join function doesn't seem to be included into slim-sprig nor on the list of functions we expose ourselves to the user. We will need to add it. Given a rel function is also missing, we could probably add it as well.)

Any thoughts?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a super interesting point that I didn't consider. I completely agree that having functions to switch between relative (rel) and absolute (abs) URLs would be really useful (and also a join).

However, I'm not at all sure which type of path would be most useful to the user. Out of curiosity, is there any reason that you lean towards relative over absolute?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If feels "cleaner", I guess. Less verbose.

There's no right on wrong on this decision, though.

I think I'm fine if we keep as absolute as well, but with the proper function to convert + an example on the docs on how to do it.

Copy link
Member Author

@pd93 pd93 Jun 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a look at this today. I think adding abs and rel functions to the templates is going to be really hard. Some variation of join is easy since it always takes the same arguments. However, as you eluded to, abs and rel would need to be aware of the context (task directory).

There isn't really a simple way of doing this. You could make a function that accepts the current directory as a parameter, but that is not very user friendly IMO. If we wanted to make the function itself context aware (so that a user doesn't have to pass in the current directory), we'd need to create a set of templateFuncs for each call to Replace() to avoid race conditions. This is a drastic change from the current contextless functions defined in the init() function.

The other solution would be to add something like pathJoin to sprig (join already exists for slices) and the TASK_DIR variable you mentioned.

I prefer the second option, but my gut feeling is to leave these out of this PR and add them later if users request it. At least this will mean the main functionality is available.


As suggested, I have changed the paths created by sources to be relative as I have no strong opinion on which format they should use. I think most users would write them relatively if they were to manually list them out, so this makes sense to me for now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pd93

You could make a function that accepts the current directory as a parameter.

I think this route would be an appropriate middle ground. What do you think?

I just looked at the Go's documentation and realized that filepath.Rel actually already asks for the base path, so asking on our side would be consistent.

Regarding abs, it would actually just be a pathJoin(<working-dir>, <path>), so maybe we could just skip abs to something we would consider to do in the future.

The important here is to properly document this in the documentation so it's easy enough for users to find and use.


Last thought, maybe we should add path to all functions for clarity:

  • joinPath
  • relPath
  • absPath (future)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added the joinPath and relPath functions. I wasn't sure if these should be in slim-sprig or Task, so I've just added them to Task for now.

I did also look into adding a TASK_DIR special variable, but this is a bit awkward at the moment as t.Dir is not resolved until after the variable compilation happens and t.Dir also accepts variables which need to be resolved. There are ways to solve this, but I think doing lazy variables (in some form) will solve that issue anyway, so maybe this is another future thing.

I've also added some docs to the for section describing how you would go about converting the relative paths to absolute ones.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added the joinPath and relPath functions. I wasn't sure if these should be in slim-sprig or Task, so I've just added them to Task for now.

That's fine. We can always move them to slim-sprig later if we want to.

@titpetric
Copy link

titpetric commented Jun 30, 2023

I love the addition but would want for on the task level as well. [GitHub Matrix tests](ref: https://docs.github.com/en/actions/using-jobs/using-a-matrix-for-your-jobs) would be my preference there rather than a simple for, as we'd use it in similar ways, looping over multiple arrays. The variables in a matrix job allow to iterate over combinations of values and have all of them named rather than .ITEM or preffered {{.}}.

I would also use it on the command level, but for different purposes, so i welcome the addition. Sort of inside smaller more focused task targets which end up being just one command (e.g. "build" and "push"), so it also seems that being one level up would behave similarly.

My wish would be we could reference vars for reuse of the []; but it doesn't seem like much of a problem to address, yaml anchors exist which would basically enable that.

@pd93
Copy link
Member Author

pd93 commented Jun 30, 2023

I love the addition but would want for on the task level as well.

I don't think this is necessary. If your task is a single command then obviously, the current syntax is no problem. You just loop over your single command.

However (correct me if I'm wrong), I'm assuming you want this for cases when you have multiple commands that you want to loop over. For example:

build-and-push-all:
  for: [ 'foo', 'bar' ]
  cmds:
    - docker build -t example:{{.ITEM}} ./{{.ITEM}}
    - docker push ...

In these cases, I think the better way is to create a distinct task that will run the commands for a single iteration, and then call that from somewhere else:

build-and-push-all:
  cmds:
    - for: [ 'foo', 'bar' ]
      task: build-and-push
      vars:
        ITEM: {{.ITEM}}

build-and-push:
  cmds:
    - docker build -t example:{{.ITEM}} ./{{.ITEM}}
    - docker push ...

This is more verbose, but it has the advantage that the build-and-push task is reusable. This approach encourages a "functional" design and means that we don't have to implement for at multiple levels which could get really confusing (both for users and in the code).

Happy to hear other opinions or if you can provide an example where this approach doesn't work as well.

[GitHub Matrix tests](ref: https://docs.github.com/en/actions/using-jobs/using-a-matrix-for-your-jobs) would be my preference there rather than a simple for,

I think matrix is a separate feature. Definitely one worth exploring, but I don't think it needs to block for. My assumption is that these features would be mutually exclusive (i.e. you can't use both).

My wish would be we could reference vars for reuse of the []; but it doesn't seem like much of a problem to address, yaml anchors exist which would basically enable that.

I'm not quite sure I fully understand. Could you elaborate?

@titpetric
Copy link

titpetric commented Jul 9, 2023

build-and-push-all:
  cmds:
    - for: [ 'foo', 'bar' ]
      task: build-and-push
      vars:
        ITEM: {{.ITEM}}

build-and-push:
  cmds:
    - docker build -t example:{{.ITEM}} ./{{.ITEM}}
    - docker push ...

This example would be more like:

build-and-push-all:
  cmds:
      task: build-and-push
      vars:
        release:
          for:
            - release-4-lts
            - release-5-lts
            - release-5.1
            - master

build-and-push:
  vars:
    release: '{{ .release | default "master" }}'
  cmds:
    - docker build -t example:{{.release}} ./{{.release}}
    - docker push ...

So, the for is defined on the same level as the var in this case, giving some sort of definition for a single var, .release. A different structure, possibly more inline, would be:

build-and-push-all:
    vars:
      release:
        for:
          - release-4-lts
          - release-5-lts
          - release-5.1
          - master
      image: '{{ .image | default "..." }:{{ .release }}'
  cmds:
      - docker build ... -t {{.image}} -f docker/{{ .release }}/Dockerfile .
      - docker scan {{.image}}
      - <other things for each image>

But, as we're here, this could be expressed as:

matrix:
  release: [ release... ]
  ...

Just having that .ITEM, being able to control that variable name would make it easier to transition from for into matrix (separate feature) due to not requiring to rename .ITEM to a var. Composability is important, but also overriding the vars for local builds is common.


The ITEM concerns are already addressed with as (my bad, skipped that one in the defs), i think the only comment remaining is task-level for. I would be fine leaving it on command level to see how it plays out, sort of force me to make smaller task defs...

@titpetric
Copy link

About the yaml anchors, i forget the syntax, but essentially:

vars:
  releases: &releases
    - 5.0
    - 5.1
    - master

tasks:
  default:
     for: *releases

Consider it more pseudo code, I don't think I remember the syntax fully, nor if task yaml parser supports it. Sometimes it's a bit of a bother carrying values over vars and i think yaml anchors could be usable. Something about having default values for vars here would be also possible with anchors rather than {.var | default "<string>"} stuff i have been writing, maybe there's a better way to define those.

/offtopic

@aucampia
Copy link

I have done looping with recursion before (see this). In cases where I used it, I had both the task name and some additional parameters in each item, and that was obtained from a variable that was dynamically populated. I think it may be possible with this construct using ndjson and a newline as the delimiter, though I would just like to confirm that it will be.

So basically, I would want to be able to do something like this:

version: '3'

tasks:
  default:
    vars:
      my_var: |
        { "task": "foo",  "args":  { "x": 1 } }
        { "task": "foo",  "args":  { "x": 1 } }
    cmds:
      - for:
          var: my_var
          split: "\n"
        task: "{{ (.ITEM | mustFromJson).task }}"
        vars:
          ITEM_ARGS: '{{(.ITEM | mustFromJson).args | mustToJson}}'
   foo: {}

@kevinmichaelchen
Copy link

I installed this branch locally and took it for a spin, and I was happy with it.

Any chance we'll see it in an imminent release?

@andreynering
Copy link
Member

This is very close, just require small adjustments to abs/rel path handling.

I see this being available in the next release within a few weeks. 🙂

@pd93
Copy link
Member Author

pd93 commented Jul 17, 2023

@kevinmichaelchen @andreynering looking to get the outstanding comments fixed in the next couple of days. Watch this space 👍

@pd93
Copy link
Member Author

pd93 commented Jul 19, 2023

I think it may be possible with this construct using ndjson and a newline as the delimiter, though I would just like to confirm that it will be.

@aucampia For some reason (I haven't looked into it) mustFromJson didn't work when I tried, but if I checkout this branch and use the non "must" JSON functions instead, your task runs just fine 👍

@pd93 pd93 requested a review from andreynering July 19, 2023 21:46
| --------- | -------- | ---------------- | -------------------------------------------- |
| `var` | `string` | | The name of the variable to use as an input. |
| `split` | `string` | (any whitespace) | What string the variable should be split on. |
| `as` | `string` | `ITEM` | The name of the iterator variable. |

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice.

@titpetric
Copy link

I've missed the ability of as, and thanks to reading the diffs now i also found out about requires 😓 - i am happy to correct myself on some points, and can't wait for this to land in task :)

@andreynering andreynering merged commit ca72f3c into main Jul 25, 2023
12 checks passed
@andreynering andreynering deleted the foreach branch July 25, 2023 01:05
andreynering added a commit that referenced this pull request Jul 25, 2023
@titpetric
Copy link

@pd93 for.var doesnt seem to like literal values:

for:
  var: a b c d
  var "a b c d"

I would expect this to be possible:

  sync-not-ok:
    desc: "Update docs"
    cmds:
      - for:
          var: tyk tyk-analytics tyk-analytics-ui tyk-docs exp
          as: project
        cmd: echo {{.project}}

Or some sort of quoted literal value, but i suppose this ends up being a string anyway.

@titpetric
Copy link

A smaller caveat:

for: { var: projects, as: project }
task: git:clone
vars:
  project: '{{.project}}'

could the var passing here be explicit, dropping the last 2 lines?

@pd93
Copy link
Member Author

pd93 commented Aug 3, 2023

@titpetric If you want to loop over literal values then you either need to declare these values in vars separately and split it:

sync-not-ok:
  vars:
    MY_VAR: "tyk tyk-analytics tyk-analytics-ui tyk-docs exp"
  cmds:
    - for:
        var: MY_VAR
        as: PROJECT
      cmd: echo {{.project}}

(splitting on whitespace is the default), but you can specify a different character if you want with split: "," for example)

OR, you can simply define the list as an array on the for key:

sync-not-ok:
  cmds:
    - for: [tyk tyk-analytics tyk-analytics-ui tyk-docs exp]
      cmd: echo {{.ITEM}}

Note that the second choice doesn't currently support the ability to use as to rename the variable. This is something we can look to improve in the future.

@titpetric
Copy link

yeah, i was thinking for: parameters as parameter but it sort of goes beyond yaml schema into DDL

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Templating, command lists and file globs
5 participants