RFC: `across` step (#29)
Have you thought about allowing the `across` step to iterate over objects from an arbitrary prototype (i.e. not just constrained to `var_source` prototypes and their `list` message)? That could make the object streams returned by prototypes more useful outside of the special prototype interfaces that Concourse will interact with natively (resources, var_sources, notifications, etc.)

In this comment (concourse/concourse#5936 (comment)), you suggested that the `run` step could possibly support a `set_var` field to store the returned object as a local variable. Perhaps an alternative to `set_var` would be to use the `across` step in its place, e.g. something like the sketch below.
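Something like this, say (purely a sketch - the `gh` prototype, its `create-release` message, and iterating via `values` over a run step's result are all hypothetical):

```yaml
# instead of `run: create-release, type: gh, set_var: release`, wrap it in `across`:
across:
- var: release
  values:
    run: create-release     # hypothetical: iterate over the object(s) this run emits
    type: gh
do:
- task: announce
  file: ci/tasks/announce.yml
  vars: {release_url: ((.:release.url))}
```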
Kind of weird in this case since we're always iterating over a single value - but I guess for `set_var` to work in general, it would also have to set the var to a list of objects (even if only a single object is emitted).
This also makes me realize that the `across` step can be used for conditional portions of build plans, by having the list of values be either 0 or 1 elements long 🤔
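For illustration, a sketch of the 0-or-1-element trick (this assumes `values` could be resolved from a var at runtime, which is exactly the dynamic behaviour discussed below; the `((maybe_deploy))` var is made up):

```yaml
# ((maybe_deploy)) resolves to [] to skip the deploy entirely,
# or to a one-element list like ["prod"] to run it once
across:
- var: target
  values: ((maybe_deploy))
do:
- put: deployment
  params: {target: ((.:target))}
```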
It's also making me wonder about dynamic build plans in the context of the `across` step (i.e. letting the list of values to iterate over be determined at runtime) - what are your thoughts on that? I think it could greatly extend the use cases that `across` can support, but it could also make pipelines difficult to understand, since you could introduce arbitrarily complicated control flow.
I have, and your comments are pretty much the same thought process I went through. 😄 It seems like a really cool idea, but yeah, the difficulty is that it seems to require some sort of dynamic build plan construct.

I'm running into another area that would benefit from dynamic build plans: the way `file:` and `image_resource:` interact on a task step. I'm working on a refactor that pulls image fetching out into explicit `check` and `get` steps in the build plan, but the existence of `file:` means I can't know whether they're needed (Darwin and Windows have no images, for example), or what image to even `check` and `get`, until the build is running. In this case I can probably just preemptively configure steps which may just do no-ops, and they can probably get their config in a similar way to how the `Get` plan uses `VersionFrom` to get some config from a prior step, but it all feels kind of held together with duct tape. (There's also a bit of complexity in that we may need to `check` + `get` a custom type that the `image_resource:` uses, but I think I have a way to get around that.)

I still don't see a super obvious way to implement dynamic build plans, though. 🤔 But I haven't really tried, either, I guess.
The only thought I had was for there to be some sort of build event that says "inject this build plan under this plan ID" or "replace this plan ID with this build plan". So, the plan would initially look something like:

`{ "schema": "exec.v2", "plan": { "id": "across-id", "across": {} } }`

and there could be a build event like:

`{ "event": "dynamic-build-plan", "version": "1.0", "data": { "id": "across-id", "across": { "some": "data" } } }`

which the UI would interpret to construct a final build plan of:

`{ "schema": "exec.v2", "plan": { "id": "across-id", "across": { "some": "data" } } }`

I'm sure there are some technical challenges I haven't considered, though (/ it may not work at all!)
That seems close to what I'm proposing - whatever the step results in gets used as the value. I think it'd just look like this with the syntax I had before:
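Roughly this shape, with hypothetical names (the nested step's result becomes the list of values):

```yaml
across:
- var: version
  values:
    run: list-versions     # made-up prototype message whose result is the list
    type: releases
do:
- task: test
  file: ci/tasks/test.yml
  vars: {version: ((.:version))}
```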
...but this is just a different syntax. `values_from` is probably clearer. I think we'd still need to handle `values: ((.:foo))`, but that could probably be handled by the `across` step too, instead of a separate `values:` step.

In either case, I'm not sure how the `across` step would access the result in the case of a `do`. 🤔 Currently results are stored under step IDs in the `RunState`, but this is almost like a return value from an expression. With a single step it's obvious which step ID to use, but with a `do` it's not so easy. It's almost like we need `(exec.Step).Run` to be able to return a value on its own, and `do` would return the last value. But that changes the interface.

Maybe this whole storing-results-in-`RunState` pattern could/should change? @clarafu @chenbh Has your recent experience in this area led to a similar line of thinking?
Oh, also re: `get_vars` vs `list_vars` - yeah, depending on what the resulting value is, `list_vars` might be more accurate. I think either `get_vars` would actually fetch all the var values individually, using a `list` call under the hood for the scheduling/triggering, or we would have `list_vars` instead and allow the user to run a separate `get_var` step for each var value if that's desired. Don't know which; haven't put much thought into it yet.
What I was proposing is a bit more complicated than just a multi-step `values_from` (where the value comes from the last step), but I didn't have the proper framing. Now I realize the proposal was actually a combination of two ideas. I'll flesh out my original example a bit more:
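A rough sketch of the shape being described (none of this is real syntax; A, B, and C match the labels below):

```yaml
jobs:
- name: parent
  plan:
  - get: ci
  - across:
    - var: release
      values_from:           # A: the "anonymous job"
        do:
        - get: release       # B: a new version here triggers only A, not `parent`
          trigger: true
        - load_var: release  # C: implicitly emits the value it loads
          file: release/version
    do:
    - task: use-release
      file: ci/tasks/use-release.yml
      vars: {release: ((.:release))}
```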
Here, A is an "anonymous job". When the trigger on B gets a new version, the `parent` job doesn't trigger - only the anonymous job A. The `load_var` C implicitly emits the value it loads, and `parent` implicitly receives and triggers on changes to this value.

The implicit emitting and receiving is very non-obvious, though, and the anonymous job idea is a bit odd (either we make it fully self-contained and need to duplicate the `get: ci`, or it magically can borrow artifacts from the parent job). What if instead we made it explicit and gave the filtering its own job:
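Something along these lines, where `emit` and `receive` are entirely made-up steps:

```yaml
jobs:
- name: A
  plan:
  - get: ci
    trigger: true
  - load_var: data
    file: ci/data.json
  - emit: data               # hypothetical: publish the loaded value
- name: B
  plan:
  - receive: data            # hypothetical: triggers on changes to A's emitted
    trigger: true            # value and stores it as the local var ((.:data))
  - task: use-data
    file: ci/tasks/use-data.yml
    vars: {data: ((.:data))}
```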
A emits a new value and stores it as `data`. B triggers on changes to that emitted value and stores the result in a local variable.

This mechanism could even be used for `{get_var: ..., trigger: true}` - the builds we run periodically to fetch the var could `emit` the value to the DB, and the job would `receive` that changed value from the build.

EDIT:

This could even possibly be used for `{get: ..., trigger: true}` - the periodic `check` builds could `emit` the latest version, and the generated plan could effectively be:
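Perhaps something like this (again, `receive` is made up):

```yaml
- receive: my-resource          # hypothetical: the version emitted by the periodic check build
  trigger: true
- get: my-resource
  version: ((.:my-resource))    # fetch exactly the received version
```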
Wouldn't work with `version: every` or pinned/disabled versions, though, so it doesn't make much sense.
I thought of it like every type of step would have a single output (e.g. `check` is a version, `get` is an image spec, `load_var` and `get_var` are a var); in `do`'s case maybe we just need to figure out the contract for what it should return (is it time to introduce a `result` step/modifier?).

It would then be up to `across` to figure out what to do with this info (which might be wild, because that implies we can run `across` a list of versions from a `check`).
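e.g. something vaguely like this (hypothetical; there's no way to feed a `check`'s versions into `across` today):

```yaml
across:
- var: version
  values_from:
    check: concourse        # every version the check discovers
do:
- get: concourse
  version: ((.:version))
- task: run-upgrade-test
  file: ci/tasks/upgrade.yml
```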
I am not sure if I am adding anything useful to the conversation, but when I saw the 7.0.0 release notes I immediately saw a good use case for one of our pipelines, and I tried to build a working prototype using pipeline instances and the `across` step.

I am keen for a normal resource type to trigger the job, and then use `load_var` to load a variable that is a list and just use that as the argument to `values`.
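Roughly (resource and file names are made up, and `values: ((.:environments))` relies on the dynamic behaviour discussed above, so it doesn't work today):

```yaml
plan:
- get: config
  trigger: true
- load_var: environments
  file: config/environments.json     # e.g. ["dev", "staging", "prod"]
- across:
  - var: env
    values: ((.:environments))
  do:
  - task: deploy
    file: config/tasks/deploy.yml
    vars: {env: ((.:env))}
```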
@drunkirishcoder going to move this discussion to a review comment thread to avoid cluttering the top-level.

Ah, so that's what you meant. There are no plans for this currently, and it would be a bit challenging to get right - there's no precedent for partial reruns of a build. In general, we'd still need to run everything before the `across` step again, since those steps may produce artifacts that the `across` step relies on (e.g. `get`s or `task` outputs). Plus, if the step that's run in a build matrix emits build outputs (via `get`/`put` steps), we'd still need to emit the build outputs for the previously successful combinations (in order for build scheduling to work properly).

I think the ideal solution would be for the deploy steps to be idempotent - i.e. given the same inputs, if the deployment already succeeded the step would no-op, so re-running the whole matrix wouldn't be an issue. That's perhaps easier said than done, though.

Is there a reason why manual retries are preferable to using `attempts` for automated retries for your use case?
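For reference, automated retries with `attempts` on the step inside the matrix would look something like this (names are illustrative):

```yaml
across:
- var: env
  values: [dev, staging, prod]
do:
- task: deploy
  file: ci/tasks/deploy.yml
  vars: {env: ((.:env))}
  attempts: 3        # automatically retry each combination up to 3 times
```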
@aoldershaw ok, thank you for the detailed response. That makes a lot of sense. As to why manual is preferable to automated retries: I'm just thinking that sometimes a deployment would fail due to circumstances that take longer to troubleshoot and resolve, like someone forgetting to apply a firewall rule. But yeah, I understand why it would be difficult to do in Concourse.

It would be nice if there were a similar feature that represented each one as an independent pipeline-level task instead, which could be restarted. Just thinking out loud.
One thing you could do is configure a pipeline for each cell in the matrix using the `set_pipeline` step, rather than setting up all the environments within a single build. That way you could re-trigger the failed job in each pipeline.
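For example (pipeline and file names are illustrative):

```yaml
across:
- var: env
  values: [dev, staging, prod]
do:
- set_pipeline: deploy
  file: ci/pipelines/deploy.yml
  instance_vars: {env: ((.:env))}
```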
@JohannesRudolph Sorry, completely forgot to address your point! (Raising this as a review comment so the replies can be threaded.)

Would it be sufficient to allow a static list of combinations to skip to be provided? This would work similarly to excluding jobs in Travis CI, i.e. a subset of vars may be specified, and all variations with those vars will be skipped.
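For instance, with a made-up `skip` field:

```yaml
across:
- var: platform
  values: [linux, darwin, windows]
- var: arch
  values: [amd64, arm64]
skip:                               # hypothetical field
- {platform: windows, arch: arm64}  # skip this exact combination
- {platform: darwin}                # skip every darwin variation
do:
- task: test
  file: ci/tasks/test.yml
```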
Related thought: it would be kind of cool if we could `except` combinations based on a relationship between variables. e.g. if you want to test various upgrade paths, you only want to test combinations of `from` and `to` where `to > from`. I have no idea how this would work - it might just be a use case for var sources rather than `except`.
Something that didn't really come up in the proposal is how we handle steps that emit artifacts (e.g. `get`, `task.outputs`, prototype outputs). Currently, the implementation of the static `across` step runs each iteration in its own local artifact scope. This means that any artifacts go out of scope past the `across` step and cannot be used.

It'd be cool if there was a way to "fan-in" on artifacts emitted within an `across` step. For instance, suppose there was a `prototype` (or just a `task`) for compiling a Go binary that outputs an artifact called `binary`. You might want to compile it for multiple platforms/architectures, which could make use of the `across` step:
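Something like this, where the `go` prototype and its `build` message are made up for the example:

```yaml
across:
- var: platform
  values: [linux, darwin, windows]
- var: arch
  values: [amd64, arm64]
do:
- run: build               # hypothetical prototype message
  type: go
  params:
    package: ./cmd/app
    platform: ((.:platform))
    arch: ((.:arch))
  # each iteration produces an artifact named `binary`
```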
(side-note: it might make more sense from a performance PoV for a `go` prototype to support compiling multiple platforms/architectures in a single `run` step where possible, but bear with me for the example)

The trouble is - under the current implementation, there's no way to access `binary` outside of the `across` step. Even if we used a shared artifact scope for each iteration, it'd just be "last write wins" (so, with no parallelism, `binary` would be the result for `(windows, arm64)` only) - since they all share a name, the results would clobber each other. This could be avoided by encoding more information in the output name via an `output_mapping` (e.g. name the artifacts `binary-linux-amd64`, `binary-linux-arm64`, ...) - but that only works for simple values like strings/numbers.

So, what if we let artifacts in an `across` step share a name, but be uniquely identified by the set of vars that were used to produce that artifact (in the same way that instanced pipelines share a name and are uniquely identified by their `instance_vars`)? A possible syntax for referencing the binary created for `(linux, amd64)` could be `binary{platform: linux, arch: amd64}`.

Here, `binary` really refers to a matrix of outputs (one that mirrors the `across` matrix). What if we also provided a way to filter down that matrix to get at a subset of the outputs? e.g. `binary{platform: windows}` would give the (2) outputs built for windows, and `binary{arch: arm64}` would give the (3) outputs built for arm64.

So, suppose you wanted to upload a GitHub release containing all 6 binaries produced in the matrix. You could do something like:
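(sketching with the proposed `binary{...}` references; the release resource and file names are made up):

```yaml
- put: github-release
  params:
    globs:
    - "binary{platform: linux, arch: amd64}/app"
    - "binary{platform: linux, arch: arm64}/app"
    - "binary{platform: darwin, arch: amd64}/app"
    - "binary{platform: darwin, arch: arm64}/app"
    - "binary{platform: windows, arch: amd64}/app.exe"
    - "binary{platform: windows, arch: arm64}/app.exe"
```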
Suppose you wanted to bundle the different architectures separately - then you could do:
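(continuing the made-up syntax; filtered references, and mapping one to a single task input, are both speculative):

```yaml
- across:
  - var: arch
    values: [amd64, arm64]
  do:
  - task: bundle
    file: ci/tasks/bundle.yml
    # the filtered reference would pull in the (3) per-platform binaries for this arch
    input_mapping: {binaries: "binary{arch: ((.:arch))}"}
```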
This may warrant a separate proposal, but I figured I'd bring it up here since it's definitely a limitation of the `across` step as it stands.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One idea: maybe we could re-use the concept of `output_mapping`. Similar to how a task defined in a file can have its outputs mapped by an `output_mapping` configuration in the pipeline referencing that task, an `across` step could have its outputs made available via a similar mapping.

As an example:
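(sketching the idea; the `across_output_mapping` field and the `go` build step are hypothetical):

```yaml
across:
- var: platform
  values: [linux, darwin, windows]
- var: arch
  values: [amd64, arm64]
do:
- run: build
  type: go
across_output_mapping:
- from: binary
  vars: {platform: linux, arch: amd64}
  to: binary-linux-amd64
```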
This would be fairly verbose if you wanted to map every output from the matrix of steps:
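(continuing the hypothetical syntax above):

```yaml
across_output_mapping:
- {from: binary, vars: {platform: linux,   arch: amd64}, to: binary-linux-amd64}
- {from: binary, vars: {platform: linux,   arch: arm64}, to: binary-linux-arm64}
- {from: binary, vars: {platform: darwin,  arch: amd64}, to: binary-darwin-amd64}
- {from: binary, vars: {platform: darwin,  arch: arm64}, to: binary-darwin-arm64}
- {from: binary, vars: {platform: windows, arch: amd64}, to: binary-windows-amd64}
- {from: binary, vars: {platform: windows, arch: arm64}, to: binary-windows-arm64}
```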
Though it would possibly be simpler to implement, without relying on prototypes or other Concourse functionality.

When, or if, the limitation on using vars in `get:` and `put:` steps (e.g. `get: bundle-((.:arch))-((.:version))`) is lifted, the `across_output_mapping` could be updated to allow the use of vars as well, simplifying the configuration to something like...
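Perhaps (still hypothetical, with the vars resolved per iteration):

```yaml
across_output_mapping:
- from: binary
  to: binary-((.:platform))-((.:arch))
```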