-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add JobJobEventsChildrenSummary endpoint #11928
Add JobJobEventsChildrenSummary endpoint #11928
Conversation
d6513b3
to
d5d6c14
Compare
e1b1928
to
000e9b7
Compare
awx/api/views/__init__.py
Outdated
@@ -3826,6 +3826,174 @@ def get_queryset(self): | |||
return job.get_event_queryset().select_related('host').order_by('start_line') | |||
|
|||
|
|||
class JobJobEventsChildrenSummary(APIView): | |||
''' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if people feel this doc string is too long, we can add this as an entry in docs/
awx/api/views/__init__.py
Outdated
def get(self, request, **kwargs): | ||
job = get_object_or_404(models.Job, pk=kwargs['pk']) | ||
if not job.event_processing_finished: | ||
return Response({}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this won't run when event processing isn't finished, then why do we need to bother with the rowNumber? It should be the same as counter minus 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AlanCoding does that mean there are never gaps in the counter values once processing is done? If that's the case, we can sort that out in the UI.
It also means I could probably look for ways to simplify the UI logic, as it does extra work to handle cases where there's a counter gap. Can there still be a gap while the job is running/processing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There can't reasonably be a gap when job.event_processing_finished
is True. Consider the logic from that.
awx/awx/main/models/unified_jobs.py
Line 1022 in 6c1adad
return self.emitted_events == event_qs.count() |
The way this works:
- Control process receives output from receptor, and it does minimal processing of this and submits the data to receptor
- This process counts the number of events that it sends, and saves that to
emitted_events
- This process counts the number of events that it sends, and saves that to
- Callback receiver pulls the data off the redis queue and then saves it to database
If event_processing_finished
never transitions to True
, then that's an API bug. Doesn't matter what type of job or what it does. This is accounting - the callback receiver should always save the same number of events that the control process submitted.
If the job is actually running then this endpoint returns {}
anyway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there are 3 numbers to consider here
- event_qs.count()
- emitted_events
- the
counter
field on the event itself (from ansible-runner)
the first 2 are pretty much guaranteed to match if event_processing_finished==true, as Alan outlined above.
What about counter
field on the event itself? if there are 100 events, are we guaranteed that we have events with counters 1 through 100? Looking at code I don't see why not. I'm wondering if the previous reports of missing counters was because of not checking whether events had been fully processed or not.
We could make the decision to rely on counter
, simplifying this endpoint a bit. We could just throw an error or warning if this code finds some missing counters.
awx/api/views/__init__.py
Outdated
def get(self, request, **kwargs): | ||
job = get_object_or_404(models.Job, pk=kwargs['pk']) | ||
if not job.event_processing_finished: | ||
return Response({}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right after this line, I think you should go ahead and write a conditional to give a 400 with too many events. You don't need to load anything, use job.emitted_events
, already in memory. If that number is greater than your limit, then bail.
How do you know what the limit is? Use the firehose script or whatever you want to get the time taken to load some events. Divide to get seconds per event, and then use that to find out how many events will take 10 seconds to load and make that the limit. This is simple enough that it would be best to do that now and avoid more costly iteration in the future.
- returns a special view to output the total number of children (and grandchildren) events for all parents events for a job value is the number of total children of that event - intended to be consumed by the UI, as an efficient way to get the number of children for a particular event - see api/templates/api/job_job_events_children_summary.md for more info
3abb4ca
to
cfd6df7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like these updates a lot!
SUMMARY
Adds a special endpoint for the purposes of optimizing the Job output page in the UI
Events are hierarchical
The UI needs to know how many children (and grandchildren, etc.) belong to a particular event. Currently the UI does a couple of API calls to get this information, per nested event branch. These extraneous networks calls were leading to issues
#11765 (comment)
This new endpoint is job specific, and returns a json object where each key is the event counter, and the value is the total number of children under that event. The idea is that the UI can get this data once, then use it as a lookup table any time it needs to know the number of children that belong to an event.
for the above event structure it would be
Note, events with no children are not included. Also, it only returns data if the job is in a completed state.
ISSUE TYPE
COMPONENT NAME
AWX VERSION