Merge structured output #286
base: master
Add options "-B / --structured-output FORMAT" to print the output of strace in a more parsable format. Currently, we provide "json" for JSON format and "ocaml" for OCaml format. Given the extent of the modifications, we don't aim at covering all the features of strace, only the most commonly used ones, while keeping full textual backward compatibility. See the file README-format for more information.
There seems to have been a decision to convert flags to strings. This is fine, but the implementation went a little overboard: in some cases the flags objects are opened within one another, resulting in dubious output. After fixing the merge errors from the cherry-picked commits, we get this kind of output:
{
  "cmd": "mmap",
  "args": [null, "8192", {
    "flags": ["PROT_READ", "PROT_WRITE"] <--- This is normal, the prot field of mmap
  }, {
    "flags": [{
      "flags": ["MAP_PRIVATE", {
        "flags": ["MAP_ANONYMOUS"] <--- This is not great
      }]
    }]
  }, "-1", "0"],
  "return": "0x75bd0874c000"
}
Now it looks like the [...] Now I was able to fix this for [...] Any input is appreciated: would you rather have this, or let the syscall-specific implementation manage the flags? For instance in [...]
tprint_flags_begin(); // Open the flag object here --------------------------
// Print the type mask first, then remove it from the flag integer ----------
printxvals_ex(flags & MAP_TYPE, "MAP_???", XLAT_STYLE_ABBREV,
	      mmap_flags, NULL);
flags &= ~MAP_TYPE;
// Same for another specific mask, but print later --------------------------
const unsigned int mask = MAP_HUGE_MASK << MAP_HUGE_SHIFT;
const unsigned int hugetlb_value = flags & mask;
flags &= ~mask;
if (flags) {
	tprint_flags_or();
	// Print the flags, opening and closing a nested flag object /!\ ----
	printflags_ex(flags, NULL, XLAT_STYLE_ABBREV,
		      mmap_flags, NULL);
}
if (hugetlb_value) {
	tprint_flags_or();
	tprint_shift_begin();
	PRINT_VAL_U(hugetlb_value >> MAP_HUGE_SHIFT);
	tprint_shift();
	/*
	 * print_xlat_u is not used here because the whole thing
	 * is potentially inside a comment already.
	 */
	tprints_string("MAP_HUGE_SHIFT");
	tprint_shift_end();
}
// Close the flag object ----------------------------------------------------
tprint_flags_end();
[...]
I was able to fix this by removing the opening/closing of the flags object and then, in the case of structured output, disabling the removal of the flags ( |
Noting that |
At least in master branch I don't see
The idea is to move heavy lifting away from individual syscall dissectors when possible, but it's surely not possible in all cases as some syscalls are quite complex.
printflags_ex is a low-level method; it is not expected to open/close a nested flags object.
|
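The division of labour described above (the syscall-level code owns the flags object, the low-level printer only emits elements) could be sketched as follows. This is a standalone illustration, not strace code: `flags_out`, `flags_begin`, `flags_emit`, `flags_end`, `emit_flag_names`, `format_mmap_flags`, and the `DEMO_*` table values are all hypothetical names and assumptions, and the type mask is treated as plain bit flags for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical output buffer standing in for strace's real output path. */
struct flags_out {
	char buf[256];
	size_t len;
	int count;	/* elements emitted since flags_begin() */
};

static void
out_append(struct flags_out *o, const char *s)
{
	size_t n = strlen(s);

	assert(o->len + n < sizeof(o->buf));
	memcpy(o->buf + o->len, s, n + 1);
	o->len += n;
}

/* Only the syscall-level caller opens and closes the array. */
static void
flags_begin(struct flags_out *o)
{
	o->count = 0;
	out_append(o, "[");
}

static void
flags_end(struct flags_out *o)
{
	out_append(o, "]");
}

/* Append one quoted element, preceded by a separator when needed. */
static void
flags_emit(struct flags_out *o, const char *name)
{
	if (o->count++)
		out_append(o, ", ");
	out_append(o, "\"");
	out_append(o, name);
	out_append(o, "\"");
}

struct flag_name {
	unsigned int val;
	const char *str;
};

/* Illustrative values only (assumption); not taken from any header. */
#define DEMO_MAP_TYPE 0x0fu
static const struct flag_name demo_mmap_flags[] = {
	{ 0x01, "MAP_SHARED" },
	{ 0x02, "MAP_PRIVATE" },
	{ 0x20, "MAP_ANONYMOUS" },
};
#define DEMO_NFLAGS (sizeof(demo_mmap_flags) / sizeof(demo_mmap_flags[0]))

/* Low-level helper: emits elements only, never opens a nested array. */
static void
emit_flag_names(struct flags_out *o, unsigned int flags)
{
	for (size_t i = 0; i < DEMO_NFLAGS; i++) {
		unsigned int v = demo_mmap_flags[i].val;

		if ((flags & v) == v)
			flags_emit(o, demo_mmap_flags[i].str);
	}
}

/* Syscall-level code: one begin/end pair around both parts of the value. */
static void
format_mmap_flags(struct flags_out *o, unsigned int flags)
{
	flags_begin(o);
	emit_flag_names(o, flags & DEMO_MAP_TYPE);	/* type part first */
	emit_flag_names(o, flags & ~DEMO_MAP_TYPE);	/* remaining flags */
	flags_end(o);
}
```

Calling `format_mmap_flags` on a zero-initialised `struct flags_out` with the value `0x22` yields a single flat array containing both names, rather than one array nested inside another.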
I agree, |
Yeah, those were added in the structured output commit: https://github.com/lefessan/strace/blob/9c054003d463e3589ff30f22f6b228cb4ece3baf/src/xlat.c#L431 If you are telling me they are not intended to be there, this answers a lot of my questions about the implementation details. |
From my understanding, the tests make a system call, then use the results to generate the expected output, am I correct? I wanted to see if I could hijack the current framework to make it compatible with both 'raw' and structured output, but it seems pretty non-trivial, correct? |
Yes, most of the tests fall into this category.
Yes, you're spot on. In fact, I see it as the main obstacle: without tests there is no way to tell whether the structured output is correct, but the tests are not written in a way that can test any output format besides the traditional one. |
OK, so what should the tests include? Can I introduce a dependency on Python to have access to a JSON formatter? I am thinking of testing the contents of the structured output rather than comparing character by character, hence using a third-party tool to format the output so the results can be compared on a common basis. |
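To illustrate what comparing "on a common basis" can mean without any external tool, here is a minimal sketch that treats two JSON texts as equal when they differ only in whitespace outside of strings. It is an assumption-laden stand-in for a real formatter: `json_equal_ignoring_ws` is a hypothetical helper, it assumes both inputs are valid JSON, and it does not normalise key order or number formatting.

```c
#include <assert.h>
#include <stdbool.h>

/* Advance past whitespace; only called while outside a JSON string. */
static const char *
skip_ws(const char *p)
{
	while (*p == ' ' || *p == '\t' || *p == '\n' || *p == '\r')
		p++;
	return p;
}

/*
 * Return true if a and b are identical once whitespace outside of
 * strings is ignored.  Whitespace inside strings is significant.
 */
static bool
json_equal_ignoring_ws(const char *a, const char *b)
{
	bool in_string = false;

	assert(a && b);
	for (;;) {
		if (!in_string) {
			a = skip_ws(a);
			b = skip_ws(b);
		}
		if (*a != *b)
			return false;
		if (*a == '\0')
			return true;
		if (in_string && *a == '\\') {
			/* Escaped character: compare it as-is; it never ends the string. */
			a++;
			b++;
			if (*a != *b)
				return false;
			if (*a == '\0')
				return true;
		} else if (*a == '"') {
			in_string = !in_string;
		}
		a++;
		b++;
	}
}
```

With this helper, `{"a": [1, 2]}` and `{ "a":[1,2] }` compare equal, while `"a b"` and `"ab"` do not. A real test suite would more likely rely on an existing JSON tool, as discussed in this thread.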
A third-party tool for JSON tests is OK. I suppose there are quite a few tools to choose from, and I'd naturally prefer something that won't bring in a lot of dependencies. |
@parport0 additionally suggested that a converter from JSON output into traditional strace output would allow reusing most of the existing strace tests for testing structured output. |
Feels like a chicken-and-egg problem to me, but if anyone wants to give it a shot they are welcome. I am making little progress due to work, but I am still working on the PR and will contact the ML if there are obstacles I cannot get over. |
I have a test framework, now I need to make at least the first one pass. I am trying to print brackets before and after the main loop with JSON, to make the whole output valid as a JSON object. What would be the best way to make sure that every output file has a |
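One way to picture the bracket framing mentioned above is a small per-file writer that emits the opening bracket once, a separator before every record except the first, and the closing bracket at the end. This is only a sketch under assumptions: `json_stream` and its functions are hypothetical names, not part of strace, and it supposes one such writer per output `FILE *`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-output-file state. */
struct json_stream {
	FILE *fp;
	bool first;	/* true until the first record has been written */
};

/* Call once when the output file is opened. */
static void
json_stream_begin(struct json_stream *s, FILE *fp)
{
	assert(fp);
	s->fp = fp;
	s->first = true;
	fputs("[\n", fp);
}

/* Call once per syscall record; `record` is an already formatted JSON value. */
static void
json_stream_record(struct json_stream *s, const char *record)
{
	if (!s->first)
		fputs(",\n", s->fp);
	s->first = false;
	fputs(record, s->fp);
}

/* Call once before the output file is closed. */
static void
json_stream_end(struct json_stream *s)
{
	fputs("\n]\n", s->fp);
	fflush(s->fp);
}
```

Because the state lives next to the `FILE *`, each output file would be framed independently, and a file that receives no records still ends up as an empty array.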
Common log is |
8960b56 to 9f20b86
I am happy to report that the first ever JSON output test is passing. Writing JSON tests is unfortunately a tedious and error-prone process, and I am not 100% familiar with the code, so I took several shortcuts that will have to be reckoned with. However, I am pleased with the result and wish to push further. I do need to discuss the matter of introducing external libraries to [...] In Lustre-adjacent tools we use json-c, and it is available on a majority of distributions, even relics (from my POV) like CentOS 7. JSON support can be enabled at configure time if the library is found. Would that be acceptable? |
I'm not quite familiar with libraries that are designed to produce json output, but from a cursory look json-c looks fine. |
Not too long ago I chose cjson for one of my dayjob projects, primarily because it is just one well-embeddable MIT-licensed header file. I did not compare it with json-c. |
Damn I do like a header only library but |
If you're talking about cJSON, then it has cJSON.c as well. Also, its README claims that |
NULL as an address is JSON's null
Was there not a PR about disabling the string output size limit? I think it is very relevant to this PR: structured output should not have human-readable partial strings sprinkled into a machine-readable format. However, disabling the limit seems less than trivial, as it seems the best way is to increase |
Hex and octal are int in OCaml, quoted in JSON
Realistically, which tests are critical and which can be overlooked? I am alone, and 1000 tests to rewrite seems like a lot. |
To be honest, I didn't think you were going to manually rewrite all the tests. |
Well, due to the nature of the current tests, I do not see another way of implementing them other than a manual rewrite... |
For your viewing pleasure, the changes made by lefessan updated to fit with the new commits made on master. I aligned the whitespace to the master branch and made sure all the default tests pass. This is not ready for merging; any input is highly appreciated as to feature completion.
Namely, I noticed the JSON output puts everything in quotes, even integers, which I think is not ideal and which I will make toggleable.
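A toggle of this kind could look roughly like the sketch below. It is purely illustrative: `json_format_int`, `json_format_hex`, and the `quote_numbers` parameter are hypothetical and do not correspond to existing strace options. Hexadecimal values stay quoted in both modes because JSON has no hexadecimal number literal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format a decimal integer as a JSON value.  With quote_numbers set, the
 * value is emitted as a string (the behaviour described above); otherwise
 * it is emitted as a bare JSON number.
 */
static int
json_format_int(char *buf, size_t size, long long val, bool quote_numbers)
{
	assert(buf && size > 0);
	if (quote_numbers)
		return snprintf(buf, size, "\"%lld\"", val);
	return snprintf(buf, size, "%lld", val);
}

/* Hex values have no native JSON representation, so they are always quoted. */
static int
json_format_hex(char *buf, size_t size, unsigned long long val)
{
	assert(buf && size > 0);
	return snprintf(buf, size, "\"0x%llx\"", val);
}
```

With the toggle off, the `8192` length argument of the `mmap` example would become a JSON number, while the return address `0x75bd0874c000` would remain a string.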