JSON passed to archiver plug-in missing result.test object #929

Closed
arlake228 opened this issue Sep 23, 2019 · 5 comments

@arlake228
Collaborator

@arlake228 arlake228 commented Sep 23, 2019

@tonin noticed, while setting up HTTP and DNS tests, that esmond archiving was failing with this error: No type set for raw record

Looking at the code, that happens when the Esmond archiver cannot find result.test.type in the JSON. With debugging turned on, the entire result.test object was indeed missing from the JSON passed to the archiver.

We had trouble recreating this on other hosts, and none of the code involved had any obvious changes. We ran into it very close to the 4.2.1 release, so the plan is to add a workaround in the esmond archiver that falls back to result.task.test.type when result.test.type is not set. We will leave this issue open in case we are able to reproduce the problem more reliably.
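The fallback described above can be sketched roughly as follows. This is only an illustration, not the actual esmond archiver code: the function name `find_test_type` is hypothetical, and the sketch assumes the JSON handed to the archiver is a dictionary with a top-level `result` key, as the paths in this issue suggest.

```python
def find_test_type(archive_input):
    """Return the test type from the archiver's input JSON, or None.

    Hypothetical helper: checks result.test.type first and falls back
    to result.task.test.type when the first location is missing.
    """
    result = archive_input.get("result") or {}

    # Candidate locations, in order of preference.
    candidates = (
        result.get("test"),
        (result.get("task") or {}).get("test"),
    )

    for test in candidates:
        if isinstance(test, dict) and test.get("type"):
            return test["type"]

    # Neither location had a usable type; the caller decides how to fail.
    return None
```

Returning `None` rather than raising keeps the decision about the "No type set for raw record" error with the caller.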

arlake228 added a commit that referenced this issue Sep 23, 2019
…n two places in JSON to account for hard to reproduce error where it is missing in one but not the other.
@mfeit-internet2

Member

@mfeit-internet2 mfeit-internet2 commented Sep 23, 2019

Discussion on this from Slack:

Andy Lake 10:45 AM
@mfeit debugging the archiver thing with antoine. it looks like the esmond archiver is looking for the test type in the JSON that gets passed in under result.test.type, but looking at the JSON, the test type is set under result.task.test.type. did something change in the format? also confused why we would not see this for all test types and anything running the esmond archiver

Mark Feit 10:45 AM
Ugggg, I hope not.
Let me check the history.

Mark Feit 10:57 AM
Nothing in the archiver daemon changed. Let me see if the runner changed something.

Andy Lake 11:22 AM
so fwiw i see the test defined under result and result.task on all my nodes (4.2.0, 4.2.1 and 4.3.0). if there's nothing obvious I could update the esmond archiver to look in both places for now. i don't think the other archivers dig into the results like that. sounds like it's maybe only happening some of the time for antoine

Mark Feit 11:23 AM
Yeah, I'm not finding much, either.

Mark Feit 11:33 AM
You should still be able to find the task info in .result.test, which has been there since day one. .result.run and .result.task were added as additional info in #683 for 4.2b1.
So there's duplication, but only in the interest of not breaking existing uses.
There are no conditionals in the daemon that could make it be formatted in any other way.
I pulled the spec for one of the tasks that Antoine listed as showing an error, and there's no transform in it, so no pilot error.

@tonin

Member

@tonin tonin commented Sep 24, 2019

Not sure if it helps, and not sure it's a constant behavior, but it looks like it's mostly the tests coming from my mesh that fail archiving, and not the ones I run from the CLI with the same archiver definition. Would there be any difference?

@mfeit-internet2

Member

@mfeit-internet2 mfeit-internet2 commented Sep 25, 2019

@tonin It would be interesting to retrieve the task spec from one in your mesh that fails and one from the CLI that doesn't and see where they differ.
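One way to do that comparison, once both task specs have been saved as JSON, is a small script like the one below. It is a generic sketch and not part of pScheduler: `flatten`, `diff_specs`, and the `ABSENT` marker are hypothetical names, and the script only assumes that the two specs are ordinary JSON documents already loaded into Python objects.

```python
ABSENT = "<absent>"  # Marker for a path that exists in only one spec


def flatten(value, prefix=""):
    """Map every leaf of a JSON value to a dotted path."""
    if isinstance(value, dict) and value:
        items = value.items()
    elif isinstance(value, list) and value:
        items = enumerate(value)
    else:
        # Scalars, and empty containers, are treated as leaves.
        return {prefix: value}

    flat = {}
    for key, child in items:
        path = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(child, path))
    return flat


def diff_specs(failing, working):
    """Return {path: (failing_value, working_value)} for differing paths."""
    a, b = flatten(failing), flatten(working)
    return {
        path: (a.get(path, ABSENT), b.get(path, ABSENT))
        for path in sorted(set(a) | set(b))
        if a.get(path, ABSENT) != b.get(path, ABSENT)
    }
```

Load each saved spec with `json.load` and pass the two objects to `diff_specs`; any path that appears in only one of them shows up with the `<absent>` marker on the other side.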

@tonin

Member

@tonin tonin commented Nov 11, 2019

The change from 3c2151b didn't solve this issue. Hosts running 4.2.2 still show the same behavior.

However, this might be related to #941 and I'm now running a patch from 7a667b8 to confirm.

@tonin tonin self-assigned this Nov 11, 2019
@tonin

Member

@tonin tonin commented Nov 12, 2019

I have been running the above-mentioned patch on a host for a bit more than 24h, and I don't see the raw record error anymore. It seems this is fixed!

@tonin tonin closed this Nov 12, 2019