
RFE: Using set_stats module to pass a variable value from nested WT to JT #4481

Closed
theansibler opened this issue Aug 14, 2019 · 11 comments

@theansibler

ISSUE TYPE
  • Feature Idea
SUMMARY

When creating a workflow template like the one below, we are able to pass a variable value from a JT to a WT:

Job Template(passes the value of var1) >>On Success>> Nested Workflow(can use the value of var1)

But currently we cannot pass the variable value from WT to JT:

Nested Workflow(passes the value of var1) >>On Success>> Job Template(can use the value of var1)

The purpose of this RFE is to be able to pass a variable value from a nested WT to a JT/WT.
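For context, the variable in the example above would typically be published with the set_stats module inside a playbook task run by the upstream Job Template. A minimal sketch (the variable name var1 follows the example; the value is made up):

```yaml
# Task in a playbook run by the upstream Job Template.
- name: Publish var1 as a workflow artifact
  ansible.builtin.set_stats:
    data:
      var1: "some value"
    per_host: false   # keep the stat global so it is picked up as an artifact
```

Downstream nodes in the same workflow can then reference `var1` as an ordinary variable; the RFE is about making this also work across the nested-workflow boundary.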

[image: workflow diagram]

@Siegfrost

This enhancement would also be highly beneficial for our usage, where we're trying to combine and nest WTs created by different teams.
Hopefully this RFE can be prioritized!

@JacobCallahan

Since #6806 is considered a dupe of this issue, I'll add on our requests from that one.

ISSUE TYPE
  • Feature Idea
SUMMARY

Ansible Tower API RFE

Background:
Our team uses nested workflows, each of which passes information down to the
next using set_stats. At the end of the execution flow, the information is
accumulated and formatted in our desired format. Our usage of the Ansible Tower
API is almost entirely for identifying and executing workflows. This is
expected to happen at a high level, with most workflows having nested workflows
and/or templates.

Problem:
With the current implementation of AT's API, set_stats can only flow down.
This means that, after the entire flow has finished executing, we need to
find the last job that ran and pull the stats from it.
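The workaround described above can be sketched as follows: walk the workflow's jobs, find the last one that finished, and read its artifacts. The job dicts below mirror a small subset of what the Tower/AWX API returns; the data is made up for illustration.

```python
def last_job_artifacts(jobs):
    """Return the artifacts of the job with the latest 'finished' timestamp."""
    finished = [j for j in jobs if j.get("finished")]
    if not finished:
        return {}
    # ISO-8601 timestamps of equal length compare correctly as strings.
    last = max(finished, key=lambda j: j["finished"])
    return last.get("artifacts", {})

jobs = [
    {"id": 1, "finished": "2020-04-22T16:10:00Z", "artifacts": {"var1": "old"}},
    {"id": 2, "finished": "2020-04-22T16:12:00Z", "artifacts": {"var1": "new"}},
    {"id": 3, "finished": None, "artifacts": {}},  # still running
]
print(last_job_artifacts(jobs))  # {'var1': 'new'}
```

This illustrates why the workaround is awkward: the caller has to poll and reason about job ordering instead of reading artifacts off the parent workflow job.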

Desired Solution:
Implement a way to flow information back up to parent jobs and workflows.
This way an API user can kick off a workflow or job execution and get all
generated information from it once all child processes have completed.

Possible implementation:

Because workflows in general have an arbitrary number of leaf nodes, we cannot anticipate which nodes will end up being leaves (for example, some paths may not be taken depending on what happens at run time), and it would be impossible to guess how the different sets of artifacts should be merged and with what precedence. As a first approach, we propose setting the artifacts produced by a workflow node's job on the workflow node itself.

E.g. if a child job of a workflow job node has the artifacts property, set that on the workflow node.

As a second step, if any workflow nodes in the workflow have the artifacts property set, save these on the workflow job itself in the form of a dictionary, where the key is some identifier for the workflow node and the value is the artifacts set on that workflow node.

One possible configuration option would be to set the behavior for merging multiple branches. Options could include:

  • Overwrite with latest (whatever is the last to run overwrites previous values)
  • Branch (split each branch into a new key on the returned dictionary)
  • Minimal branch (only branch if there is a conflict)
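The three options above could be sketched roughly as follows. This is only an illustration of the proposed semantics, not Tower code; the `branches` shape (an ordered mapping of node identifier to artifacts dict) is an assumption.

```python
def merge_overwrite(branches):
    """Overwrite with latest: the last branch to run wins on conflicts."""
    merged = {}
    for artifacts in branches.values():
        merged.update(artifacts)
    return merged

def merge_branch(branches):
    """Branch: keep every branch under its own node key."""
    return dict(branches)

def merge_minimal_branch(branches):
    """Minimal branch: merge flat, but namespace only the conflicting keys."""
    counts = {}
    for artifacts in branches.values():
        for key in artifacts:
            counts[key] = counts.get(key, 0) + 1
    merged = {}
    for node, artifacts in branches.items():
        for key, value in artifacts.items():
            if counts[key] > 1:
                merged.setdefault(node, {})[key] = value  # conflict: branch it
            else:
                merged[key] = value
    return merged

branches = {"node_42": {"var1": "a", "only_here": 1},
            "node_43": {"var1": "b"}}
print(merge_overwrite(branches))  # {'var1': 'b', 'only_here': 1}
```

Under "minimal branch", `only_here` stays a root-level key while the conflicting `var1` is split under `node_42` and `node_43`.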

Possible example:

{
    "id": 20,
    "type": "workflow_job",
    "url": "/api/v2/workflow_jobs/20/",
    "related": {
        "created_by": "/api/v2/users/1/",
        "modified_by": "/api/v2/users/1/",
        "unified_job_template": "/api/v2/workflow_job_templates/7/",
        "workflow_job_template": "/api/v2/workflow_job_templates/7/",
        "notifications": "/api/v2/workflow_jobs/20/notifications/",
        "workflow_nodes": "/api/v2/workflow_jobs/20/workflow_nodes/",
        "labels": "/api/v2/workflow_jobs/20/labels/",
        "activity_stream": "/api/v2/workflow_jobs/20/activity_stream/",
        "relaunch": "/api/v2/workflow_jobs/20/relaunch/",
        "cancel": "/api/v2/workflow_jobs/20/cancel/"
    },
    "summary_fields": {
        "workflow_job_template": {
            "id": 7,
            "name": "baz",
            "description": ""
        },
        "unified_job_template": {
            "id": 7,
            "name": "baz",
            "description": "",
            "unified_job_type": "workflow_job"
        },
        "created_by": {
            "id": 1,
            "username": "foobar",
            "first_name": "",
            "last_name": ""
        },
        "modified_by": {
            "id": 1,
            "username": "foobar",
            "first_name": "",
            "last_name": ""
        },
        "user_capabilities": {
            "delete": true,
            "start": true
        },
        "labels": {
            "count": 0,
            "results": []
        },
        "artifacts": {
            <SOME KEY THAT LETS ME KNOW WHAT NODE IT CAME FROM>: {
                "string": "abc",
                "integer": 123,
                "float": 1.0,
                "unicode": "竳䙭韽",
                "boolean": true,
                "none": null
            },
            <SOME KEY THAT LETS ME KNOW WHAT NODE IT CAME FROM>: {
                "string": "abc",
                "integer": 123,
                "float": 1.0,
                "unicode": "竳䙭韽",
                "boolean": true,
                "none": null
            }
        }
    },
    "created": "2020-04-22T16:11:49.120399Z",
    "modified": "2020-04-22T16:11:49.633559Z",
    "name": "grandma",
    "description": "",
    "unified_job_template": 7,
    "launch_type": "manual",
    "status": "running",
    "failed": false,
    "started": "2020-04-22T16:11:49.623267Z",
    "finished": null,
    "canceled_on": null,
    "elapsed": 11.223614,
    "job_args": "",
    "job_cwd": "",
    "job_env": {},
    "job_explanation": "",
    "result_traceback": "",
    "workflow_job_template": 7,
    "extra_vars": "{}",
    "allow_simultaneous": false,
    "job_template": null,
    "is_sliced_job": false,
    "inventory": null,
    "limit": null,
    "scm_branch": null,
    "webhook_service": "",
    "webhook_credential": null,
    "webhook_guid": ""
}

@kedark3

kedark3 commented Apr 23, 2020

Another potential solution(adding to what @JacobCallahan said)

Someone on the Tower team's Slack suggested to me that they could think about using checkboxes to select which workflow nodes we need to gather artifacts from. That may give us control over which artifacts are gathered.

I think the original comment by the person opening this issue is in line with what Jake and I are looking for. This will be something we will be using internally in Red Hat QE @wenottingham,
so it would be great if we can collaborate in any way to see if this feature would make sense to the larger Tower user base.

Having set_stats from a WF bubble up and be usable in subsequent JTs/WFs is a great feature and deserves attention. We are also enabling API users to integrate Tower with other things; for example, my team is trying to integrate Tower with Jenkins and also with PyTest (testing) in Sat QE.

@chrismeyersfsu
Member

chrismeyersfsu commented Apr 27, 2020

@theansibler your need is interesting. The workflow becomes a function call where the return is based on set_stats values. So there are 2 feature requests here.

  1. artifacts bubble up to WorkflowJob object.
  2. artifacts on WorkflowJob are passed to next UnifiedJob in the graph (similar to how it already works when there is an artifact on a Job in a Workflow)

Feature 2 - WorkflowJob act like functions

Design-decision-wise, this feature is straightforward.

Feature 1 - Bubble Up Artifacts

We need to decide the semantics of artifacts bubbling up. Specifically, which WorkflowNode should have artifacts to consider bubbling up?

1A Let the user pick

Very straightforward; it shifts the problem to the user to solve. Thanks @kedark3

1B Define a Bubble Up Semantic

The seemingly simplest bubble-up semantic would be that leaf-node artifacts bubble up. Note that the execution graph is different from the WorkflowJobTemplate graph. Below is a WorkflowJobTemplate.
[image: WorkflowJobTemplate graph]
Below is a WorkflowJob example run of the above WorkflowJobTemplate. In the run-time graph, N3 could be considered a leaf node. N4 did not run so there is absolutely no way for it to output artifacts.
[image: run-time graph of the WorkflowJob]

The above example highlights that it would be meaningless to consider N4 a leaf node, since it never ran and thus cannot contribute artifacts. Let's consider N3 a leaf node in this case, then. The customer would have to code the playbook in N3 with the expectation of N4 never running and set a meaningful artifact even though N3 itself failed. This is unreasonable. Therefore, I argue for considering only WorkflowJobTemplate leaf-node artifacts when merging, and not consulting the run-time graph when determining leaf nodes.
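The rule argued for above can be sketched simply: leaf nodes are determined from the WorkflowJobTemplate graph (nodes with no outgoing edges), not from which nodes actually ran. The node names follow the N1..N4 example; the edge structure here is assumed for illustration.

```python
def template_leaf_nodes(edges, nodes):
    """A node is a leaf if it is the parent of no edge of any type
    (success/failure/always) in the template graph."""
    has_child = {parent for parent, _child in edges}
    return [n for n in nodes if n not in has_child]

nodes = ["N1", "N2", "N3", "N4"]
edges = [("N1", "N2"), ("N2", "N3"), ("N3", "N4")]  # N3 -> N4 on success
print(template_leaf_nodes(edges, nodes))  # ['N4']
```

Under this rule only N4's artifacts are candidates for bubbling up, regardless of whether N4 ran in a given execution.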

Feature 1 and 2 - Variable combining semantic

More than one node producing artifacts requires that we "merge" artifacts when we bubble them up to the WorkflowJob. @JacobCallahan lists some ways to handle this. I advocate for consistency and handle merging artifacts like we handle merging extra_vars + artifacts when running a playbook. We merge the root-level keys and values. If there is a conflict the winner is non-deterministic.
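The consistency argument above amounts to a shallow, root-level dictionary merge, where a later value simply replaces an earlier one on conflict. A minimal sketch (the artifact contents are made up):

```python
a = {"var1": "from-branch-1", "shared": {"x": 1}}
b = {"var1": "from-branch-2", "other": True}

# Root-level keys only; nested dicts are replaced wholesale, not deep-merged.
merged = {**a, **b}
print(merged)  # {'var1': 'from-branch-2', 'shared': {'x': 1}, 'other': True}
```

Here the "winner" for `var1` depends purely on merge order, which is what makes the result non-deterministic when branches finish in an unpredictable order.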

Trust Model

This is all about expectations. If the user picks the node that will bubble up artifacts then the trust implication is clear. The artifacts on the WorkflowJob can be trusted to have come from the node. If we use a semantic bubble up + non-deterministic merging strategy then all leaf nodes must be trusted. If we use a semantic bubble up + per-node-identifier merging strategy then the trust is shifted to the usage, where the trusted node is explicitly referenced.

@kedark3

kedark3 commented Apr 27, 2020

@chrismeyersfsu so you think this is a feature that can be implemented in Tower? What would be the next steps for us (as the issue reporters) to know if it's going to be implemented?

Thanks for your comment.

@ryanpetrello
Contributor

ryanpetrello commented Apr 27, 2020

@kedark3,

If it's implemented downstream in Red Hat Ansible Tower, you'd see the work done in AWX first.

The way to track the work is to watch this ticket (if there's no pull request or commentary here, then nobody has started on it).

@bherrin3

bherrin3 commented Apr 27, 2020

@chrismeyersfsu

I would like to get a little more real world with the above scenario, if I may.

For your execution graph, please consider the following additions and consideration:

  • N0 is a start node for the workflow that supplies extra_vars to N1 and N2, as is generally the case for workflow graphs.
  • N1 and N2 are identical job_templates, to abstract and keep job_templates as DRY as possible, but functionally they produce two unique but identically named data structures in set_stats. For example, this happens when deploying multiple OS VMs in parallel on a provider that returns facts unique to each.
  • Consider N3 to be an "aggregate node" within that workflow, where the workflow designer already has to deal with taking two identically named data structures from two different scopes and merging them into the same scope within N3.

Dealing with convergent data structures in set_stats is already a problem with which a workflow designer contends.

Specifically, for the failures described above, the calling workflow that drives the execution ends up in a failed state. At that point, I am not sure how important aggregating leaf artifacts from a node is. In my case, the information is non-valuable and non-trusted. The failure would have to be investigated, if not handled within the workflow.

In the cases where failure paths are handled, I would still think it would be up to the workflow designer to specify what does and does not get passed. Auto-resolving issues like this seems problematic at best.

Where we are currently, the workflow designer has no way to pass any set_stats artifacts back to the initial calling workflow through the /api/v2/workflow_jobs/ API -- whether auto-converged artifacts or ones merged manually by the designer. That seems to be the more critical issue, and it needs a (any) solution first.

Thanks for hearing my input.

@ryanpetrello
Contributor

@wenottingham is this a near-term priority that we should track more closely as an enhancement in the next major release?

@kedark3

kedark3 commented Nov 25, 2020

@ryanpetrello pretty please 🥺

@WallsTalk

This would be an amazing enhancement for AT as well, even though I understand the problems it raises.

@AlanCoding
Member

https://github.com/ansible/awx/compare/devel...AlanCoding:wj_artifacts?expand=1

This is a completely careless implementation, but it passes a basic functionality integration test. I need to think about the variable precedence issues raised in this issue... but I'm not convinced any of them will matter in the real world.

@tvo318 changed the title Jul 19, 2022