-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pass combined artifacts from nested workflows into downstream nodes #12223
Conversation
test results look good, ready for review. |
having thought about it, I realize I do need to put in some recursion protection. |
awx/main/models/workflow.py
Outdated
@@ -57,6 +57,20 @@ | |||
logger = logging.getLogger('awx.main.models.workflow') | |||
|
|||
|
|||
def get_artifacts_from_unified_job(job): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of this standalone function, why not just have different implementations of .get_combined_artifacts()
on the different UnifiedJob
subtypes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took me a while to process, but I wound up going with exactly this suggestion. That allowed getting rid of this method altogether.
awx/main/models/workflow.py
Outdated
job_queryset = ( | ||
UnifiedJob.objects.filter(unified_job_node__workflow_job=self) | ||
.defer('job_args', 'job_cwd', 'start_args', 'result_traceback') | ||
.order_by('status', 'finished', 'id') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit iffy on ordering by status
. This will wind up being alphabetical, which feels like it might have unintended consequences.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
haha, I was kind of fishing for a comment like this. The status (s)uccessful comes after (f)ailed and (e)rror, thus, the successful jobs will have their artifacts take precedence.
In practice, I'm iffy on whether error or failed jobs should contribute artifacts in the first place. Still unsure on what the best thing to do here is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.filter(status='successful')
instead, then?
This brings to mind a thing that I was enthusiastic about when I noticed it early on in the Django upgrades, but that got buried under the weight of dealing with the difficult bits: Django 3 supports actual enumerated choices for choice fields. https://docs.djangoproject.com/en/3.2/ref/models/fields/#enumeration-types
It'd be nice to have these, and avoid having to hard-code choice strings. Not urgent, just kind of pleasant.
awx/main/models/workflow.py
Outdated
) | ||
if parents_set is None: | ||
parents_set = set() | ||
new_parents_set = parents_set | set([self.id]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to start getting away from Python 2-isms like set([...])
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it even possible to have a cycle in the jobs? The job templates, yes, I could see that potentially happening, but even if the same JT is used in multiple places inside a nested workflow won't it just spawn distinct new jobs (thus having new pks)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, as structured I don't think it would have any cycles. The map of workflow jobs would always be a tree, since a workflow job node can't (in practice) link to the same workflow job as another job.
I kept coming back to questioning if we should track and skip duplicate workflow job templates, which is a much stronger criteria. A workflow can use the same WFJT inside of it multiple times, or it can be nested deeper. This would cut down on queries, but I'm not sure if users might actually want to combine them because multiple runs of the same WFJT could give different artifacts.
awx/main/models/workflow.py
Outdated
@@ -682,6 +681,28 @@ def get_ancestor_workflows(self): | |||
wj = wj.get_workflow_job() | |||
return ancestors | |||
|
|||
def get_effective_artifacts(self, parents_set=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just make this signature (self, **kwargs)
as well, and get parents_set
out of the kwargs
dict? It will make things just a bit nicer if we wind up adding more things later on.
awx/main/models/workflow.py
Outdated
@@ -318,8 +318,7 @@ def get_job_kwargs(self): | |||
for parent_node in self.get_parent_nodes(): | |||
is_root_node = False | |||
aa_dict.update(parent_node.ancestor_artifacts) | |||
if parent_node.job and hasattr(parent_node.job, 'artifacts'): | |||
aa_dict.update(parent_node.job.artifacts) | |||
aa_dict.update(parent_node.job.get_effective_artifacts(parents_set=set([self.workflow_job_id]))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👀
I need to think through this. This is the part that I worry is going to cause a bunch of repeated queries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never mind, I think this isn't a problem.
I still dislike set([...])
, though.
SUMMARY
Connect #4481
The issue is a reasonable request, and I am trying not to overthink the problem.
As of putting up this PR, I am doing this without a schema change. This trades lower memory use for more processing cost. Because of that, I've tried to write the method
get_combined_artifacts
relatively efficiently, although you can come up with cases where it will slog.ISSUE TYPE
COMPONENT NAME
AWX VERSION
ADDITIONAL INFORMATION
I've done two test cases
Both test cases fail in
devel
, finding that the receiving job does not get the artifacts.