event timestamp on report result of a schedule is incorrect #16

Open
hmh opened this issue Nov 6, 2019 · 3 comments

@hmh
Contributor

hmh commented Nov 6, 2019

Context:
According to RFC 8193 section 4.6.2 and RFC 8194 section 4.3, ma-report-result-event-time (RFC 8193), i.e. the "event" leaf (RFC 8194), must be the time at which the event that triggered the SCHEDULE (and thereby caused the action to run) occurred, before any randomization/spread is applied.

This behavior is needed to correlate the reports of distinct actions back to the specific invocation of the specific schedule that ran them (i.e. to know that the results of those actions "belong together").

Description of the issue:
When a schedule sequentially triggers three actions that each take a few seconds to run, the "event" times in the report differ, although they ought to be equal. The second action reports an "event" time that is a few seconds later than that of the first.
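To make the correlation requirement concrete, here is a minimal sketch of how a consumer of the reports might group results by invocation. The `report_result` struct and both functions are hypothetical and only illustrate the idea; they are not lmapd code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical, simplified view of one reported result; not lmapd's struct. */
struct report_result {
    const char *schedule;  /* name of the schedule that ran the action */
    const char *action;    /* name of the action */
    time_t event;          /* when the schedule's triggering event happened */
    time_t start;          /* when this particular action started */
};

/* Two results belong to the same invocation when schedule and event match. */
int same_invocation(const struct report_result *a,
                    const struct report_result *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a->schedule, b->schedule) == 0 && a->event == b->event;
}

/* Count distinct invocations in an array of results (O(n^2), fine for a sketch). */
size_t count_invocations(const struct report_result *r, size_t n)
{
    size_t groups = 0;
    for (size_t i = 0; i < n; i++) {
        int seen = 0;
        /* Look for an earlier result from the same invocation. */
        for (size_t j = 0; j < i && !seen; j++)
            seen = same_invocation(&r[i], &r[j]);
        if (!seen)
            groups++;
    }
    return groups;
}
```

With three results that share one "event" time, `count_invocations()` yields a single group. If each result instead carries its own start time in "event", as described above, the same three results fall into three separate groups and can no longer be tied together.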

@hmh
Contributor Author

hmh commented Nov 6, 2019

Looking at the function render_result() in src/xml-io.c and src/json-io.c, we have:

if (res->event) render_leaf_datetime(node, ns, "event", &res->start);

This most likely should be:
if (res->event) render_leaf_datetime(node, ns, "event", &res->event);
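The effect of the change can be sketched in isolation. The `result` struct, `format_datetime()` and `render_event()` below are hypothetical stand-ins (they are not lmapd's `render_leaf_datetime()`, whose signature is not reproduced here); the only point is that the value being rendered must be read from the same field that the guard tests.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for the result record; field names follow the issue. */
struct result {
    time_t event;  /* time of the event that triggered the schedule */
    time_t start;  /* time the action itself started */
};

/* Format a timestamp as UTC "YYYY-MM-DDTHH:MM:SSZ".
 * Returns the number of characters written, or 0 on failure. */
size_t format_datetime(char *buf, size_t len, const time_t *t)
{
    assert(buf != NULL && t != NULL);
    struct tm *tm = gmtime(t);
    if (tm == NULL)
        return 0;
    return strftime(buf, len, "%Y-%m-%dT%H:%M:%SZ", tm);
}

/* Render the "event" value: guarded by res->event and reading res->event. */
size_t render_event(char *buf, size_t len, const struct result *res)
{
    if (!res->event)
        return 0;  /* no event time recorded, nothing to render */
    return format_datetime(buf, len, &res->event);  /* not &res->start */
}
```

Passing `&res->start` in the last line would reproduce the reported behavior: the field would be emitted whenever an event time exists, but with the action's own start time as its value.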

I will submit a patch as a PR.

@hmh
Contributor Author

hmh commented Nov 6, 2019

Pull request #17 fixes the worst of the issue by ensuring that at least the "event" time is rendered correctly in the report. This time comes from schedule->last_invocation via the CSV metadata.

I have not checked whether schedule->last_invocation has the desired semantics with respect to the random spread (i.e. that it does not have the random spread added to it).
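For reference, the semantics that would need to hold can be sketched as follows. This is an assumption-laden illustration, not a description of how lmapd computes schedule->last_invocation: the `invocation` struct, `make_invocation()` and the way the delay is derived from a caller-supplied random number are all hypothetical.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical record of one schedule invocation; not lmapd's data model. */
struct invocation {
    time_t event;  /* trigger time, before any random spread */
    time_t start;  /* trigger time plus the random delay */
};

/*
 * Build an invocation from a trigger time, a spread in seconds and a
 * caller-supplied random number. The delay falls in [0, spread) and is
 * added to start only; event keeps the undelayed trigger time.
 */
struct invocation make_invocation(time_t trigger, unsigned spread, unsigned rnd)
{
    struct invocation inv;
    unsigned delay = spread ? rnd % spread : 0u;

    inv.event = trigger;                  /* what the report should carry */
    inv.start = trigger + (time_t)delay;  /* when the actions actually begin */
    assert(inv.event <= inv.start);
    return inv;
}
```

Under these assumptions, every action run by one invocation would report the same `event` value regardless of the delay that was drawn.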

@hmh
Contributor Author

hmh commented Aug 24, 2020

@schoenw, this is a really annoying bug that makes it quite hard to properly correlate the reports from multiple actions of the same schedule later in a data processing pipeline.

This bug rendered lmapd unusable for us. We have been successfully running the solution we proposed in #17 in production for quite a while now.
