Reporting review #90

bmr-cymru · 2012-12-13T16:10:17Z

Reporting seems to be in a funny state at the moment. We have the old HTML and XML reporting code (the XML stuff seems to be dead right now, or at least, does not run when --report is given). The legacy HTML stuff works, just about, but is ugly and a maintenance headache.

The new Report class is pretty cool and gives /much/ cleaner looking code but only implements PlainTextReport as a concrete class.

I'm also wondering if we shouldn't just turn reporting on by default (and invert --report -> --no-report) since it seems to take up very little runtime.

jhjaggars · 2013-04-19T03:18:29Z

I never removed the xml or html reports because they were so far down on the list, but I agree with you. I think there might be some value in implementing an HTML concrete class that uses the reporting stuff and dumping the old things.
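As a rough illustration of the "HTML concrete class" idea, here is a minimal sketch of a shared base class with a plain text and an HTML renderer. The class layout, the `sections` mapping and the `render()` method are assumptions made for this example; they are not the actual interface of the Report classes in sos.

```python
from abc import ABC, abstractmethod
from html import escape


class BaseReport(ABC):
    """Hypothetical base class: holds section name -> list of entries."""

    def __init__(self, sections):
        self.sections = sections

    @abstractmethod
    def render(self):
        """Return the whole report as a single string."""


class PlainTextReport(BaseReport):
    def render(self):
        lines = []
        for name, entries in self.sections.items():
            lines.append(name)
            lines.append("=" * len(name))
            lines.extend("  * " + entry for entry in entries)
            lines.append("")
        return "\n".join(lines)


class HTMLReport(BaseReport):
    def render(self):
        parts = ["<html><body>"]
        for name, entries in self.sections.items():
            # Escape everything that came from the collected data.
            parts.append("<h2>%s</h2>" % escape(name))
            parts.append("<ul>")
            parts.extend("<li>%s</li>" % escape(entry) for entry in entries)
            parts.append("</ul>")
        parts.append("</body></html>")
        return "\n".join(parts)
```

With this shape, adding another output format means adding one more subclass, and the legacy HTML code could be dropped once the new class covers what is still wanted.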

RE: inverting --report, I think it's a good idea. I think there is an issue around here somewhere, which I never got around to, for making --report on by default.
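For reference, inverting the option could look roughly like the sketch below. It uses the standard library `argparse` purely for illustration and makes no claim about how sos actually parses its options; the `build_parser` helper and the program name are hypothetical.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="sosreport-sketch")
    # Reporting is on unless the user explicitly turns it off.
    parser.add_argument("--no-report", dest="report", action="store_false",
                        help="disable report generation")
    # Keep accepting the old flag so existing command lines do not break.
    parser.add_argument("--report", dest="report", action="store_true",
                        help=argparse.SUPPRESS)
    parser.set_defaults(report=True)
    return parser
```

Keeping `--report` as a hidden no-op is a design choice in this sketch, not something decided in the thread.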

adam-stokes · 2013-04-25T03:00:11Z

What about dumping a json file with certain metadata that external tools could use for their own reporting? Not saying rip out the existing reporting (well maybe xml b/c it's just ugh) but something in addition to what's there.
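A minimal sketch of what such a JSON metadata dump might look like follows. The field names (`version`, `plugins`, `commands`, `files`) and the input shape are assumptions for illustration only; the thread does not define a schema.

```python
import json


def build_manifest(plugins):
    """Build a JSON-serialisable summary.

    `plugins` is assumed to map a plugin name to a dict with optional
    "commands" and "files" lists (a hypothetical shape for this sketch).
    """
    return {
        "version": 1,
        "plugins": [
            {
                "name": name,
                "commands": sorted(data.get("commands", [])),
                "files": sorted(data.get("files", [])),
            }
            for name, data in sorted(plugins.items())
        ],
    }


def dump_manifest(plugins):
    # sort_keys gives stable output that external tools can diff.
    return json.dumps(build_manifest(plugins), indent=2, sort_keys=True)
```

An external tool would then only need `json.load()` to discover which plugins ran and what they collected.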

bmr-cymru · 2013-04-25T09:36:27Z

This is the idea with SOMA (Sos object model archive) - making the archive more discoverable and presenting the data in an abstracted fashion. Discussions about this have been going on since $forever with little actual movement.

adam-stokes · 2013-08-01T19:33:02Z

This is a pretty aggressive time slot for resolving this bug but I'll try to get it done by 3.1.

adam-stokes · 2013-08-22T20:25:36Z

@bmr-cymru could we set up an IRC meeting to discuss how we want to tackle SOMA and also the dbus interface?

Thanks!

adam-stokes · 2013-12-09T14:53:55Z

For the HTML output generation should we use a template library like cheetah or jinja? Or are we thinking we should manually create the HTML and elements within an HTML report type class?
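To make the trade-off concrete, here is a small sketch of the template style using only the standard library `string.Template`. A library such as Jinja or Cheetah would fill the same role with more features (loops, inheritance, auto-escaping); the page layout and the `render_page` helper below are invented for this example.

```python
from html import escape
from string import Template

# The markup lives in templates, separate from the code that fills it in.
PAGE = Template(
    "<html><head><title>$title</title></head><body>\n"
    "<h1>$title</h1>\n<ul>\n$items\n</ul>\n</body></html>"
)
ITEM = Template('<li><a href="$href">$label</a></li>')


def render_page(title, links):
    """`links` is a list of (href, label) pairs."""
    items = "\n".join(
        ITEM.substitute(href=escape(href, quote=True), label=escape(label))
        for href, label in links
    )
    return PAGE.substitute(title=escape(title), items=items)
```

The manual alternative would build the same strings inside the report class itself, which avoids a dependency but mixes markup and logic.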

bmr-cymru · 2014-09-17T08:14:20Z

Moving this to 3.3 as nothing is broken by it and we don't have time to get anything new in for 3.2.

prayther · 2015-06-10T15:26:26Z

    augtool dump-xml /files > /tmp/augtool_dump_xml_all_files.xml

augtool could maybe help, with the lenses that have already been created?

just a thought.

bmr-cymru · 2015-06-10T15:38:48Z

Not really (we've looked at Augeas several times; if we are to use it it'll be via the Python API):
dumping yet-another-cryptic file in an awkward encoding (XML) into the reports does not help anyone.

If we address this it needs to be in a manner that's readily consumable and doesn't just layer on more inconvenience.

Anyone who wants augeas-formatted XML for an sosreport can easily get it right now by just pointing the tool at a report archive.

Amitgb14 · 2016-01-27T06:24:59Z

@bmr-cymru @battlemidget is anybody working on this issue?

adam-stokes · 2016-01-27T14:56:54Z

Not yet

Amitgb14 · 2016-01-28T08:00:13Z

I would like to work on this issue. Here is what I have in mind.

First, the report is generated in JSON format and written to a temporary file inside the /tmp directory.
Then create a reporting directory containing html_report.py, xml_report.py and plaintext_report.py scripts,
and finally generate the report in sos.html, sos.txt, sos.xml and sos.json formats.

Any suggestions, please share.

bmr-cymru · 2016-01-28T10:49:06Z

> First, the report is generated in JSON format and written to a temporary file inside the /tmp directory.

Nack; there is no need for this. All the data to be reported is in-memory. Writing it to disk and then reading it back and writing it again is pointless make-work.

> Then create a reporting directory containing html_report.py, xml_report.py and plaintext_report.py scripts

Nack (unless I misunderstood): why do these need to be external scripts? The current project structure uses python modules to assemble various subsystems that interact via defined interfaces. The only time we use an exec() style of interface is when interacting with truly external components (e.g. commands run by plugins or during policy loading and evaluation).

> and finally generate the report in sos.html, sos.txt, sos.xml and sos.json formats.

This is an admirable goal to work toward but I do not think it depends on either point (1) or (2).

Amitgb14 · 2016-01-28T11:23:52Z

Point 1: it is not efficient to hold a large amount of data in main memory, so it needs to be written to a temporary file and read back; the data only needs to be read when writing the reports (.html, .txt and .xml).

Point 2: separate reporting scripts are easier to manage and reduce the size of the sosreport.py script. In future a developer can easily change the look and structure of a report (for example, the HTML style or the plain text style), and it also reduces complexity.

bmr-cymru · 2016-01-28T11:52:05Z

> it is not efficient to hold a large amount of data in main memory

It is already there - look at the current reporting code. It iterates over the set of loaded plugins and interrogates them for the data to be stored in the report fields. If you are making a case that the repetitive formatting (for XML, HTML, text, etc.) is inefficient, that is a different argument and one that I don't see being solved by merely writing the JSON data out to disk.

> separate reporting scripts are easier to manage and reduce the size of the sosreport.py script

So would abstracting this out into sos/report.py (and if necessary xmlreport.py, jsonreport.py etc.). This would also drive UP the memory and IO costs that you seem concerned about - each script will start as a new process with a brand new address space. If we are lucky then shared data may reside in the pagecache but if that is then read in anew by those processes we are unlikely to benefit from sharing unless we use complex IO models like memory-mapping (not at all easy in Python).

bmr-cymru · 2016-01-28T11:57:35Z

I think a good first step would be to move all the still-desired reporting functionality out of sosreport.py and into the current report.py - deleting the legacy report code at the same time and re-implementing it using Jesse's classes where it makes sense.

This would help to ensure the interfaces we have are sane and workable and de-clutters the main sosreport.py (another very worthy goal).

I think at this stage making any design decision on the basis of presumed performance improvements is a mistake - Knuth is right - "premature optimisation is the root of all evil". There are known parts of sos that have very suboptimal memory usage right now but the reporting code is certainly not one that I lose any sleep over (PackageManager is a different matter for e.g...).

Amitgb14 · 2016-01-28T12:12:18Z

Ohh, my bad about the first point.

Amitgb14 · 2016-01-28T12:17:13Z

Please correct me if I have this wrong: we add xmlreport.py and htmlreport.py as modules used from inside sosreport.py, so there is no need to start an extra process.

bmr-cymru · 2016-01-28T12:30:22Z

> we add xmlreport.py and htmlreport.py as modules used from inside sosreport.py, so there is no need to start an extra process.

Right - I think for now this is the best approach. It keeps to existing project conventions and it would be a big improvement in the code structure and maintainability. If at the end of all that work there are measurable performance concerns then we can look at optimisations like caching or writing data to the file system.
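The in-process approach can be sketched as follows: the plugin data is collected once, kept in memory, and handed to each formatter in turn, with no temporary file and no extra process. The `Plugin` tuple, the function names and the output of each formatter are hypothetical stand-ins, not the real sos interfaces; in a real tree each formatter would presumably live in its own module.

```python
import json
from collections import namedtuple

# Hypothetical stand-in for a loaded plugin object.
Plugin = namedtuple("Plugin", "name commands files")


def collect(plugins):
    """Single pass over the loaded plugins; the result stays in memory."""
    return {
        p.name: {"commands": list(p.commands), "files": list(p.files)}
        for p in plugins
    }


def render_text(data):
    return "\n".join(
        "%s: %d commands, %d files"
        % (name, len(data[name]["commands"]), len(data[name]["files"]))
        for name in sorted(data)
    )


def render_json(data):
    return json.dumps(data, indent=2, sort_keys=True)


# One entry per output file; adding a format means adding one function.
FORMATTERS = {"sos.txt": render_text, "sos.json": render_json}


def build_reports(plugins):
    data = collect(plugins)
    return {filename: fmt(data) for filename, fmt in FORMATTERS.items()}
```

Usage would be along the lines of `build_reports(loaded_plugins)`, writing each returned string into the archive under its key.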

Amitgb14 · 2016-01-28T12:32:20Z

ok 👍 ..

Amitgb14 · 2016-04-04T05:31:52Z

@bmr-cymru, I wrote a small web application to list and browse the reports:
https://github.com/Amitgb14/sosweb

Amitgb14 · 2016-04-08T12:49:35Z

Is there any update?

TurboTurtle · 2018-07-10T13:55:15Z

Cycling around on this, just dealt with a situation where a sosreport took over 4 hours to run, with the vast majority of that time (3+ hours) spent on generating the reports. I think the reason this happened was the sheer volume of files that the sosreport created due to it being run on a heavily utilized OCP node - there were just shy of 114k files in the archive.

That is a lot, but is it really expected to take 3 hours at that volume, or is this indicative of a lower level issue? Also, what consumes the html and xml reports today? Would it be beneficial to dynamically set reporting to be on or off based on how large the sosreport is by the time we finish running the plugins?
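The dynamic on/off idea could be sketched like this: count the files collected so far and skip report generation above a cut-off. The helper names and the default threshold of 50000 files are made up for illustration; nothing in the thread settles on a number.

```python
import os


def count_files(root):
    """Count regular directory entries under `root` without reading them."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total


def should_generate_reports(root, max_files=50000):
    # max_files is an arbitrary example threshold, not a measured value.
    return count_files(root) <= max_files
```

This only sidesteps the slow path for very large archives; it does not explain why report generation took hours in the first place.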

bmr-cymru · 2018-07-11T08:42:37Z

> there were just shy of 114k files in the archive.

Do we know why there was such a volume? I.e. is this sane, either in terms of the node configuration, or what we are attempting to collect?

TurboTurtle · 2018-07-11T13:16:15Z

It was a fairly heavily used OCP node. 150 running containers, another 130 stopped, and a total of 1100 images on it. All the docker plugin bits add to that, but more importantly the cgroups plugin grabs the /sys/fs/cgroup/* bits for the kubernetes pods, which is where the bulk of this came from:

    $ find sys/fs/cgroup/ -type f | wc -l
    88516

TurboTurtle · 2018-07-23T21:03:15Z

Sorry, that didn't actually answer your question. The volume would be sane for the size of the OpenShift environment it was on, but that is probably at the upper end of such environments. So I imagine there are other end users running into similarly long run times and just "dealing with it" at the moment.

bmr-cymru · 2018-07-24T11:13:00Z

My biggest problem with reports is it kinda feels like it should be post-processable. We should be able to take an archive, and comprehend it to produce that output, entirely independently of the collection host (it's just pretty printing, effectively).

That way we could turn it off by default and let users do something like:

    $ sos report --html --from sosreport-blah-blah.tar.gz

(or whatever)
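A minimal sketch of that kind of post-processing, assuming only that the archive is a tarball readable by the standard library `tarfile` module. The function names and the trivial HTML index are hypothetical, and the `--html --from` command line above remains a suggestion, not an existing option.

```python
import tarfile
from html import escape


def list_archive_files(archive_path):
    """Return the sorted names of regular files in a (compressed) tarball."""
    # "r:*" lets tarfile detect gzip, bzip2 or xz compression itself.
    with tarfile.open(archive_path, "r:*") as tar:
        return sorted(m.name for m in tar.getmembers() if m.isfile())


def html_index(archive_path):
    """Build a bare-bones HTML listing from the archive alone."""
    rows = "\n".join(
        "<li>%s</li>" % escape(name)
        for name in list_archive_files(archive_path)
    )
    return "<ul>\n%s\n</ul>" % rows
```

Because this reads nothing but the archive, it could run on any host, long after collection has finished.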

TurboTurtle · 2020-07-23T19:04:32Z

Since 2018, we've overhauled the actual reports generation mechanisms. A previous informal survey on the RH side also showed that while HTML reports are not ubiquitously used they are consumed to some degree. Given those two points, I wonder if this can be closed?

Or is the post-processing suggestion above still desirable?

@bmr-cymru @pmoravec

pmoravec · 2020-07-23T20:22:25Z

+1 to close this. The HTML report generation was re-written in #1728, no issues since then.

ghost assigned adam-stokes Aug 1, 2013

ghost assigned bmr-cymru Oct 31, 2013

adam-stokes modified the milestones: 3.2, 3.3 Aug 19, 2014

bmr-cymru modified the milestones: 3.3, 3.2 Sep 17, 2014

Amitgb14 mentioned this issue Feb 9, 2016

[sosreport] Change UI in sos html_report #749

Closed

bmr-cymru mentioned this issue Mar 30, 2016

[sosreport] make JSON report #793

Closed

TurboTurtle modified the milestones: 3.3, 3.7 Jul 10, 2018

bmr-cymru removed this from the 3.7 milestone Mar 26, 2019

TurboTurtle closed this as completed Aug 10, 2020
