Reporting review #90
Comments
I never removed the XML or HTML reports because they were so far down on the list, but I agree with you. I think there might be some value in implementing an HTML concrete class that uses the new reporting code and dumping the old implementations. RE: inverting --report, I think it's a good idea. I think there is an issue around here somewhere, which I never got around to, to make --report on by default. |
What about dumping a JSON file with certain metadata that external tools could use for their own reporting? Not saying rip out the existing reporting (well, maybe XML, b/c it's just ugh) but something in addition to what's there. |
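A minimal sketch of the JSON metadata idea, for illustration only. The input shape (a dict mapping plugin names to lists of commands and files) and the names `build_manifest` and `dump_manifest` are assumptions made here, not part of sos.

```python
import json


def build_manifest(plugins):
    """Collect per-plugin metadata into a JSON-serialisable dict.

    `plugins` maps a plugin name to a dict with optional 'commands'
    and 'files' lists. This shape is hypothetical, not the sos API.
    """
    return {
        "version": 1,
        "plugins": {
            name: {
                "commands": sorted(data.get("commands", [])),
                "files": sorted(data.get("files", [])),
            }
            for name, data in sorted(plugins.items())
        },
    }


def dump_manifest(plugins, path):
    """Write the manifest to `path` for external tools to consume."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(build_manifest(plugins), fh, indent=2, sort_keys=True)
```

An external tool would then only need `json.load()` to read the file, with no knowledge of sos internals.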
This is the idea with SOMA (Sos object model archive) - making the archive more discoverable and presenting the data in an abstracted fashion. Discussions about this have been going on since $forever with little actual movement. |
This is a pretty aggressive time slot for resolving this bug but I'll try to get it done by 3.1 |
@bmr-cymru could we set up an IRC meeting to discuss how we want to tackle SOMA and also the dbus interface? Thanks! |
For the html output generation should we use a template library like cheetah or jinja? Or are we thinking we should manually create the HTML and elements within a HTML report type class? |
Moving this to 3.3 as nothing is broken by it and we don't have time to get anything new in for 3.2. |
augtool dump-xml /files > /tmp/augtool_dump_xml_all_files.xml
Could augtool maybe help, with the lenses that have already been created? Just a thought. |
Not really (we've looked at Augeas several times; if we are to use it it'll be via the Python API): If we address this it needs to be in a manner that's readily consumable and doesn't just layer on more inconvenience. Anyone who wants augeas-formatted XML for an sosreport can easily get it right now by just pointing the tool at a report archive. |
@bmr-cymru @battlemidget is anybody working on this issue? |
Not yet |
I would like to work on this issue. I will share the points I have in mind.
If you have any suggestions, please share them. |
Nack; there is no need for this. All the data to be reported is in-memory. Writing it to disk and then reading it back and writing it again is pointless make-work.
Nack (unless I mis-understood): why do these need to be external scripts? The current project structure uses python modules to assemble various subsystems that interact via defined interfaces. The only time we use an
This is an admirable goal to work toward but I do not think it depends on either point (1) or (2). |
Point 1: a large amount of data cannot be stored efficiently in main memory, so it needs to be written to a temporary file and read back; the data only needs to be read when writing the report (.html, .txt and .xml). Point 2: separate reporting scripts are easier to manage and reduce the size of the sosreport.py script. In future a developer can easily change the look and structure of a report (for example, changing the HTML style or the plain text style), and this also reduces complexity. |
It is already there - look at the current reporting code. It iterates over the set of loaded plugins and interrogates them for the data to be stored in the report fields. If you are making a case that the repetitive formatting (for XML, HTML, text, etc.) is inefficient, that is a different argument and one that I don't see being solved by merely writing the JSON data out to disk.
So would abstracting this out into |
I think a good first step would be to move all the still-desired reporting functionality out of This would help to ensure the interfaces we have are sane and workable and de-clutters the main I think at this stage making any design decision on the basis of presumed performance improvements is a mistake - Knuth is right - "premature optimisation is the root of all evil". There are known parts of sos that have very suboptimal memory usage right now but the reporting code is certainly not one that I lose any sleep over (PackageManager is a different matter for e.g...). |
Oh, my mistake about the first point. |
Please correct me if I have this wrong: we add xmlreport.py and htmlreport.py as modules used from inside sosreport.py, with no need to call an extra process. |
Right - I think for now this is the best approach. It keeps to existing project conventions and it would be a big improvement in the code structure and maintainability. If at the end of all that work there are measurable performance concerns then we can look at optimisations like caching or writing data to the file system. |
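To illustrate the module-based approach agreed above, here is a hedged sketch in which one in-memory data set feeds several output formats without a round trip through disk. The class names and the `sections` mapping (section title to list of item strings) are hypothetical and do not mirror the existing sos classes.

```python
import html
from abc import ABC, abstractmethod


class ReportRenderer(ABC):
    """Hypothetical base class: turn in-memory report data into text."""

    @abstractmethod
    def render(self, sections):
        """Return the rendered report for a {title: [items]} mapping."""


class PlainTextRenderer(ReportRenderer):
    def render(self, sections):
        lines = []
        for title, items in sections.items():
            lines.append(title)
            lines.append("=" * len(title))
            lines.extend("  * " + item for item in items)
            lines.append("")
        return "\n".join(lines)


class HTMLRenderer(ReportRenderer):
    def render(self, sections):
        parts = ["<html><body>"]
        for title, items in sections.items():
            # Escape everything: collected names may contain markup.
            parts.append("<h2>%s</h2>" % html.escape(title))
            parts.append("<ul>")
            parts.extend("<li>%s</li>" % html.escape(i) for i in items)
            parts.append("</ul>")
        parts.append("</body></html>")
        return "\n".join(parts)
```

Each format lives in its own small class (or module), and the caller passes the same `sections` object to whichever renderers are enabled.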
ok 👍 .. |
@bmr-cymru, I wrote a small web application to list and browse the reports. |
Is there any update? |
Cycling around on this, I just dealt with a situation where a sosreport took over 4 hours to run, with the vast majority of that time (3+ hours) spent on generating the reports. I think the reason this happened was the sheer volume of files that the sosreport created due to it being run on a heavily utilized OCP node - there were just shy of 114k files in the archive. That is a lot, but is it really expected to take 3 hours at that volume, or is this indicative of a lower level issue? Also, what consumes the HTML and XML reports today? Would it be beneficial to dynamically set reporting to be on or off based on how large the sosreport is by the time we finish running the plugins? |
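A rough sketch of the size-based toggle asked about above: count the collected files once the plugins have finished and skip report generation past a limit. The function names and the default limit of 50000 files are arbitrary assumptions for illustration, not values taken from sos.

```python
import os


def count_files(root):
    """Return the number of regular directory entries under `root`."""
    return sum(len(files) for _dir, _subdirs, files in os.walk(root))


def should_generate_reports(root, max_files=50000):
    """Decide whether to build reports for the collected tree.

    `max_files` is a hypothetical tuning knob; a real limit would
    need to be chosen from measurements.
    """
    return count_files(root) <= max_files
```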
Do we know why there was such a volume? I.e. is this sane, either in terms of the node configuration, or what we are attempting to collect? |
It was a fairly heavily used OCP node. 150 running containers, another 130 stopped, and a total of 1100 images on it. All the docker plugin bits on that but more importantly the cgroups plugin grabbing /sys/fs/cgroup/* bits for the kubernetes pods which is where the bulk of this came from:
|
Sorry, that didn't actually answer your question. The volume would be sane for the size of the OpenShift environment it was on, but that is probably in the upper-end of such environments. So I imagine there are other end users running into similarly long run times and just "dealing with it" at the moment. |
My biggest problem with reports is it kinda feels like it should be post-processable. We should be able to take an archive, and comprehend it to produce that output, entirely independently of the collection host (it's just pretty printing, effectively). That way we could turn it off by default and let users do something like:
(or whatever) |
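The post-processing idea can be sketched as a small standalone script that only needs an already-extracted archive directory, not the collection host. The functions `index_archive` and `render_text`, and the choice to group files by top-level directory, are assumptions for illustration and not an existing sos tool.

```python
import os
from collections import defaultdict


def index_archive(root):
    """Group relative file paths under `root` by top-level directory."""
    index = defaultdict(list)
    for dirpath, _subdirs, files in os.walk(root):
        for name in files:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            # Files directly in the archive root are grouped under ".".
            top = rel.split(os.sep, 1)[0] if os.sep in rel else "."
            index[top].append(rel)
    return {top: sorted(paths) for top, paths in sorted(index.items())}


def render_text(index):
    """Pretty-print the index as a plain text listing."""
    lines = []
    for top, paths in index.items():
        lines.append("%s (%d files)" % (top, len(paths)))
        lines.extend("  " + p for p in paths)
    return "\n".join(lines)
```

Because the input is just a directory tree, the same script could run on an analyst's workstation long after collection.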
Since 2018, we've overhauled the actual reports generation mechanisms. A previous informal survey on the RH side also showed that while HTML reports are not ubiquitously used they are consumed to some degree. Given those two points, I wonder if this can be closed? Or is the post-processing suggestion above still desirable? |
+1 to close this. The HTML report generation was re-written in #1728, no issues since then. |
Reporting seems to be in a funny state at the moment. We have the old HTML and XML reporting code (the XML stuff seems to be dead right now, or at least, does not run when --report is given). The legacy HTML stuff works, just about, but is ugly and a maintenance headache.
The new Report class is pretty cool and gives /much/ cleaner looking code but only implements PlainTextReport as a concrete class.
I'm also wondering if we shouldn't just turn reporting on by default (and invert --report -> --no-report) since it seems to take up very little runtime.