Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't crash or generate invalid XML for empty TOC #6714

Merged
merged 3 commits into from
Dec 31, 2018

Conversation

mosra
Copy link
Contributor

@mosra mosra commented Dec 29, 2018

The new XML TOC implementation from #736 is not handling TOC on pages without @sections correctly, accessing a null pointer. This adds a corresponding test and a fix. Interestingly enough, the docbook and XHTML generators are already handling this case properly.

I also added a test for TOC levels that could lead to bugs like those fixed in #6696 and #6697. The XML output is not affected by these but having an explicit test for it certainly won't hurt.

What concerns me and what I would like to have fixed in additon to merging this PR is that on a larger project (Magnum), doxygen didn't crash on this but rather exited with a success return code and generated empty XML for given page and truncated index.xml. That's similar to the issue with a fix for #6277 that I reported a year ago (also accessing a null pointer, fixed in 1a1fdbe). My question/request is:

  • Why doxygen executed by runtests.py crashes in this case and why doxygen executed on a larger project fails silently, making the error much more confusing and harder to spot? Is it because threads are involved?
  • Is it possible to propagate the aborted operation (from the thread?) all the way to the return code of the doxygen executable, so the crash is never silent?

Fails, in particular causes Doxygen to produce an empty (or truncated)
XML file but returning with a success error code.
Caused the test (079) to fail with a SIGSEGV, but larger projects exit
with a success return code and the generated XML is either truncated or
empty. Weird.
@albert-github
Copy link
Collaborator

@mosra you wrote:

Interestingly enough, the docbook and XHTML generators are already handling this case properly.

This is quite recently fixed when doing some tests regarding XHTML. The underlying problem is that the writing of the XML output follows a quite different way than (X)HTML, LaTeX, RTF, Docbook and ... do (Docbook since the 1.8.15 release). In this case the XML part is easily overlooked.
It would, in my opinion, be beneficial when XML also follows the same strategy as (X)HTML etc.

Doxygen hardly uses threads of its own (only for dot as far as I know) and it is bad that it does crash in one case and not in an other. I think that the not crashing is "luck" in the sense that a not fatal part of the memory is hit and used correctly and in the other case (luckily) the program crashes, maybe some stricter memory access checking (compiler setting) would help but probably at the costs of a, much, longer run time (I always compare it with the Fortran possibility to use do boundary checking on arrays where is is possible to access array elements outside and array without problems unless you compile with boundary checking. In the later case at runtime the out of bound check will seen and stop the program).

@mosra
Copy link
Contributor Author

mosra commented Dec 30, 2018

The underlying problem is that the writing of the XML output follows a quite different way than (X)HTML, LaTeX, RTF, Docbook and ... do

Looking at how much of the logic is repeated for every output format, I'm thinking that the best way would be to have XML as an intermediate format with all others created from it to avoid repeating the same code. That could be for example via XSLT for the XML-based outputs (HTML and Docbook) and via some templating logic (similar as Python's Jinja) for the others. But that's for a separate discussion...

Doxygen hardly uses threads of its own (only for dot as far as I know) [...] I think that the not crashing is "luck" in the sense that a not fatal part of the memory is hit and used correctly

It results in a file being written partially, so there has to be some sort of crash. It's accessing memory at address 0, no amount of "luck" should make it survive. Threads is one explanation, another could be that SIGABRT / SIGSEGV signals are intercepted (for whatever reason) and instead of crashing they get ignored. I know very little about the internals so I don't even know where to look -- and it happens on a project that takes several seconds to process. It's also maybe happening only on Linux, I have no idea how the behavior of this is on Windows. In case you'd want to try to reproduce:

git clone git://github.com/mosra/magnum
git clone git://github.com/mosra/magnum-extras # this one contains the empty page TOC
cd magnum
mkdir build
doxygen Doxyfile-mcss # ignore warnings, these are due to missing tagfiles from other projects

# if it doesn't crash, build/doc/index.xml should be truncated and build/doc/ui.xml empty

@doxygen doxygen merged commit cfe381e into doxygen:master Dec 31, 2018
@albert-github albert-github added bug fixed but not released Bug is fixed in github, but still needs to make its way to an official release XML XML Output labels Dec 31, 2018
@mosra mosra deleted the invalid-xml-for-empty-toc branch December 31, 2018 15:06
@doxygen doxygen removed the fixed but not released Bug is fixed in github, but still needs to make its way to an official release label Aug 8, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug XML XML Output
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants