AttributeError: 'ExhaleNode' object has no attribute 'soup' #36

Closed
ghost opened this issue Aug 1, 2018 · 14 comments

@ghost

ghost commented Aug 1, 2018

I'm encountering this error when processing a rather complex API with the latest exhale version (installed via pip) on Python 3. I'm quite sure this has something to do with the Doxygen XML input. Do you have a tip on where I can start looking for the problem in the data, or what to do before I start debugging the code myself?

Below the stack trace:

(!) Exception caught while parsing:Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/exhale/deploy.py", line 390, in explode
    textRoot.parse()
  File "/usr/local/lib/python3.6/dist-packages/exhale/graph.py", line 979, in parse
    self.discoverAllNodes()
  File "/usr/local/lib/python3.6/dist-packages/exhale/graph.py", line 1201, in discoverAllNodes
    cdef = f.soup.doxygen.compounddef
AttributeError: 'ExhaleNode' object has no attribute 'soup'
@svenevs
Owner

svenevs commented Aug 1, 2018

Hmm. That smells like a bug in my code... Is your project public?

@ghost
Author

ghost commented Aug 1, 2018

Thanks, unfortunately it's closed source. I know that is a bit of a hindrance here. If you can tell me where to look, I'll try to narrow the problem down. I thought it might be good to first find out which construct in the Doxygen XML causes this. How should I approach this? Should I try to look at the structure of the ExhaleNodes?

@svenevs
Owner

svenevs commented Aug 1, 2018

No problem. So I just released v0.2.0, but I seriously doubt that will change anything for you because the code that is breaking for you should be unchanged. Or worse, it may be completely broken...it's crashing my local builds but somehow RTD worked which is why I didn't catch #37 ugh. I'm so mad at myself x0 I'm gonna have to do a new release today 😢

The v0.2.0 bug occurs at a later stage than your crash, though: it only applies if you have unions, and it crashes when Sphinx reads the rst documents (meaning Exhale is already done "exploding"). What I suggest you do is clone exhale and install it as an "editable" install to make debugging simple.

$ pip uninstall exhale
$ cd /some/path
$ git clone https://github.com/svenevs/exhale.git
$ cd exhale
$ pip install -e .

You'll want to make sure to add "verboseBuild": True to exhale_args, and then go here

exhale/exhale/graph.py

Lines 1056 to 1063 in 0377b91

        # TODO: change formatting of namespace to provide a listing of all files using it
        for f in self.files:
            node_xml_contents = utils.nodeCompoundXMLContents(f)
            if node_xml_contents:
                try:
                    f.soup = BeautifulSoup(node_xml_contents, "lxml-xml")
                except:
                    utils.fancyError("Unable to parse file xml [{0}]:".format(f.name))

and add this

--- a/exhale/graph.py
+++ b/exhale/graph.py
@@ -1055,6 +1055,9 @@ class ExhaleRoot(object):
         #
         # TODO: change formatting of namespace to provide a listing of all files using it
         for f in self.files:
+            utils.verbose_log(
+                "Processing file [{0}]".format(f.refid), utils.AnsiColors.BOLD_RED
+            )
             node_xml_contents = utils.nodeCompoundXMLContents(f)
             if node_xml_contents:
                 try:

It's probably crashing on your first file? I'm printing f.refid because that's what will help you find the file in the Doxygen XML output (the files are usually just named {refid}.xml or something very similar depending on the type), so you can at least open it up and look around (put on some safety goggles...).
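
Once the log prints a refid, finding the matching XML file can be scripted. A minimal sketch, assuming the `{refid}.xml` naming described above; `find_refid_xml` is a hypothetical helper and not part of Exhale, and the XML directory is whatever your Doxygen `OUTPUT_DIRECTORY` produces:

```python
from pathlib import Path


def find_refid_xml(xml_dir, refid):
    """Return candidate Doxygen XML files for ``refid``, exact match first."""
    xml_dir = Path(xml_dir)
    exact = xml_dir / "{0}.xml".format(refid)
    if exact.is_file():
        return [exact]
    # Fall back to a loose match in case the file name only resembles the refid.
    return sorted(p for p in xml_dir.glob("*.xml") if refid in p.stem)
```

Calling it with the XML directory and the refid from the verbose log returns a list of paths to open, or an empty list if nothing matches.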

Your problem somewhat confounds me, because if the BeautifulSoup object were not possible to create, it should have already errored out on line 1063.

The line numbers have changed though, and it seems that you are crashing in the file loop well below that, so maybe the object is going out of scope somehow?

exhale/exhale/graph.py

Lines 1210 to 1213 in 0377b91

        # Go through every file and see if the refid associated with a node missing a
        # file definition location is present in the <programlisting>
        for f in self.files:
            cdef = f.soup.doxygen.compounddef

So maybe adding something like this will help you figure out which file is actually causing this problem? You should still have access to f.refid and maybe you'll need to look at f.__dict__ and troll around down there:

--- a/exhale/graph.py
+++ b/exhale/graph.py
@@ -1210,6 +1210,9 @@ class ExhaleRoot(object):
         # Go through every file and see if the refid associated with a node missing a
         # file definition location is present in the <programlisting>
         for f in self.files:
+            if not hasattr(f, "soup"):
+                import pdb
+                pdb.set_trace()
             cdef = f.soup.doxygen.compounddef
             # try and find things in the programlisting as a last resort
             programlisting = cdef.find("programlisting")

I'm very interested to know your findings here. How long does Exhale report Doxygen takes to execute? For large APIs maybe I am running out of memory?
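
If dropping into pdb is awkward (inside Docker, for example), the same `hasattr` check can collect every affected node in one pass instead of stopping at the first. A sketch, assuming only that each node carries a `refid` and may or may not have `soup`; `refids_without_soup` and the stand-in objects are hypothetical:

```python
from types import SimpleNamespace


def refids_without_soup(files):
    """Return the refid of every node that never received a ``soup`` attribute."""
    return [f.refid for f in files if not hasattr(f, "soup")]


# Stand-in objects for illustration only; real nodes come from Exhale's graph.
files = [
    SimpleNamespace(refid="good_8php", soup=object()),
    SimpleNamespace(refid="bad_8php"),
]
print(refids_without_soup(files))  # prints ['bad_8php']
```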

@svenevs svenevs added the bug label Aug 1, 2018
@ghost
Author

ghost commented Aug 1, 2018

Wow, so helpful, thank you very much!

I also thought a memory problem could be likely. Since I was running in Docker I increased the amount of RAM Docker can use and the affected file did not change. So a memory problem became unlikely.

Doxygen takes 2 min and 28 sec for creating the XML output.

The files for which the error occurs are files that are not in UTF-8. So I guess I should apologize for bothering you with that.

@ghost
Author

ghost commented Aug 1, 2018

Converting two files to UTF-8 fixed that problem for me. Now I'm seeing another one. :) Will investigate further.
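
For anyone hunting for such files, strict UTF-8 decoding is enough to flag them. A minimal sketch, assuming the sources are PHP files under one directory; `find_non_utf8_files` and the default suffix list are illustrative choices, not anything Exhale or Doxygen provides:

```python
from pathlib import Path


def find_non_utf8_files(root, suffixes=(".php",)):
    """Yield (path, error) for files under ``root`` that fail strict UTF-8 decoding."""
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        try:
            path.read_bytes().decode("utf-8")
        except UnicodeDecodeError as err:
            # The error records the byte offset of the first invalid sequence.
            yield path, err
```

Iterating over `find_non_utf8_files("path/to/src")` and printing each pair shows the offending file together with the offset of the first invalid byte. Pure-ASCII files always pass, since ASCII is a subset of UTF-8.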

@svenevs
Owner

svenevs commented Aug 1, 2018

> Doxygen takes 2 min and 28 sec for creating the XML output.

Dang, this is a massive project. I hope Exhale is able to work, but to be honest now that you've seen some of the code in graph.py...I'm not sure Exhale is ready for the big leagues. If success is had, I'll add a hotfix for #38 because the page-load of the root library document for a project this size will probably be awful.

> So a memory problem became unlikely.

It probably isn't the cause of this specific issue, but

  1. I am kind of wasteful about memory. If you do end up running out of memory when Exhale is generating the files, I can describe a potential band-aid.
  2. Breathe is currently using minidom, which has memory leaks. I'm working on this, but it will need more testing. See breathe/#315.
    • Since you likely aren't shooting for building on ReadTheDocs, this probably won't be a show stopper so much as a nuisance.

> The files for which the error occurs are files that are not in UTF-8 ... Converting two files to UTF-8 fixed that problem for me

I don't understand. When you say you converted them to UTF-8, you mean files from the Doxygen XML? How did you convert them? My unicode solution was to just "always use it everywhere", which may have been a mistake that your project is revealing.

Basically, I'm happy to help and continue the conversation, but I am worried about this being a waste of time for you. In comparison to Doxygen itself, Exhale is child's play. It may be better to just customize their HTML output...

@ghost
Author

ghost commented Aug 2, 2018

It's just a lot of files and it's in PHP. :) I think I can reduce the number of files further by specifying more exclusions in the Doxygen config. So far I haven't run into memory issues. In fact Exhale finished successfully, but now I'm running into a breathe issue. :)

> I don't understand. When you say you converted them to UTF-8, you mean files from the Doxygen XML?

I meant that I converted the source file to UTF-8, which then changed something with the XML and subsequently worked with Exhale.

> In comparison to Doxygen itself, Exhale is child's play. It may be better to just customize their HTML output...

I thought about that scenario, but since I want the documentation to be as uniform as possible I first want to see the output before I can decide that.
Thanks a lot for your help. If you don't think you need to check anything on your side for this non-UTF-8 scenario (Doxygen itself already requires the encoding to be UTF-8), then we can close this issue.
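
The source-file conversion described above could be scripted roughly as follows. This is a sketch under one big assumption: the original encoding is known, which this thread never states, so `source_encoding` has to be supplied by the caller. `convert_to_utf8` is a hypothetical helper:

```python
from pathlib import Path


def convert_to_utf8(path, source_encoding):
    """Rewrite ``path`` in place as UTF-8; return True if the file was changed."""
    path = Path(path)
    raw = path.read_bytes()
    try:
        raw.decode("utf-8")
        return False  # already valid UTF-8, leave the file untouched
    except UnicodeDecodeError:
        pass
    # Decode with the caller-supplied guess; a wrong guess can either raise
    # or silently produce mojibake, so review the result before committing.
    text = raw.decode(source_encoding)
    path.write_bytes(text.encode("utf-8"))
    return True
```

Working on bytes avoids any newline translation, so line endings stay exactly as they were in the source file.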

@svenevs
Owner

svenevs commented Aug 2, 2018

> So far I didn't run into memory issues. In fact Exhale finished successfully, but now I'm running into a breathe issue.

Yay? Hehe. Let's go ahead and leave this issue open until you are able to get a full build / we can confirm there are no additional errors from me. You can also ask me about breathe stuff, but I am far less familiar with the actual code there -- just familiar with how I use it and how that may cause grief ;)

> I meant that I converted the source file to UTF-8 ... actually already Doxygen requires the encoding to be UTF-8

Ok. Basically the thing I'm most curious about is if the problem with encodings came from

tempfile_kwargs = {"encoding": "utf-8"}

and/or this a little lower down

doxygen_input = bytes(doxygen_input, "utf-8")

Those together make it so that stdin, stdout, and stderr are all encoded as UTF-8 for Doxygen. But if you are saying Doxygen actually wants UTF-8, then I think I can sheepishly say that I can keep it. In other words, unless there is a confirmed problem with how I communicate with Doxygen (particularly stdin), I'd like to leave it as is so that things keep working for e.g. Chinese users, because I make OUTPUT_DIRECTORY and STRIP_FROM_PATH absolute paths.
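
To illustrate why the stdin encoding matters for non-ASCII absolute paths, here is a standalone sketch of the same idea: configuration text is encoded with `bytes(..., "utf-8")` and handed to a child process on stdin. A small Python child that echoes its input stands in for Doxygen so the example runs anywhere; `run_with_utf8_stdin` and the sample path are hypothetical and this is not Exhale's actual implementation:

```python
import subprocess
import sys


def run_with_utf8_stdin(cmd, text):
    """Feed ``text`` to ``cmd`` on stdin as UTF-8 bytes and return decoded stdout."""
    result = subprocess.run(
        cmd, input=bytes(text, "utf-8"), stdout=subprocess.PIPE, check=True
    )
    return result.stdout.decode("utf-8")


# A Python child that copies stdin to stdout byte-for-byte stands in for Doxygen.
echo = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())",
]
config = "OUTPUT_DIRECTORY = /home/用户/docs/doxyoutput\n"
print(run_with_utf8_stdin(echo, config) == config)
```

Because both sides agree on UTF-8, the non-ASCII path survives the round trip unchanged.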

@ghost
Author

ghost commented Aug 6, 2018

I have now gone with the pure Doxygen variant you suggested above, because breathe took ages to process all the generated class files. Exhale itself completed successfully, though. Regarding the UTF-8 thing: the input encoding can be configured in the Doxyfile, but it's safe to assume that files should be UTF-8. So I would keep the code as it is; maybe add a word of warning about file encodings to the README? I'm not sure.
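
The Doxyfile setting referred to here is presumably Doxygen's `INPUT_ENCODING` tag, which defaults to UTF-8. A sketch of passing it through Exhale's `exhaleDoxygenStdin`, assuming a source tree at `../src` that is consistently ISO-8859-1 (both values are made up for illustration, and nobody in this thread confirmed that this resolves the error):

```python
import textwrap

exhale_args = {
    # Other required Exhale arguments are omitted from this sketch.
    "exhaleExecutesDoxygen": True,
    "exhaleDoxygenStdin": textwrap.dedent('''
        INPUT          = ../src
        INPUT_ENCODING = ISO-8859-1
    '''),
}
```

The tag applies to the whole input, so it suits a tree that uses one non-UTF-8 encoding throughout. For a mostly UTF-8 project with a couple of stray files, converting those files is likely the safer route.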

@svenevs
Owner

svenevs commented Aug 26, 2018

Hehe. Uh yes, though Sphinx is also to blame here (or rather, reStructuredText in some senses). I will close this in favor of pending PHP stuff denoted in #39

@svenevs svenevs closed this as completed Aug 26, 2018
@digitalillusions

digitalillusions commented Jun 13, 2020

Hi, I thought I'd comment here because it's still the same issue: I came across this thread because I was getting exactly the same error. What fixed it for me was adding a log statement after the for statement here

exhale/exhale/graph.py

Lines 1342 to 1345 in 58c6c77

        # Go through every file and see if the refid associated with a node missing a
        # file definition location is present in the <programlisting>
        for f in self.files:
            cdef = f.soup.doxygen.compounddef

outputting the refid if it doesn't have the soup attribute

if not hasattr(f, "soup"):
  utils.verbose_log("Processing file [{0}] found no soup.".format(f.refid), utils.AnsiColors.BOLD_RED)

Using this I narrowed down which files had an incorrect encoding. To fix the encoding I simply opened the offending files in VS Code, clicked on the encoding in the status bar at the bottom, and chose to save with UTF-8 encoding.
I'm posting this as a reference for anyone else who ends up with this issue.

Anyway, thanks for the awesome project. The documentation with Sphinx and RTD theme looks much nicer than Doxygen!

@thclark

thclark commented Jul 26, 2020

Hi @svenevs - thanks for your activity here, I love this project and am using it all over!

This issue is still popping up here and there. I have a handle from reading the conversation on how to fix it, but it's a pain to do so.

Perhaps @digitalillusions' solution could actually be added to the source, with a more descriptive error? That way we'd get a more informative traceback when things go wrong. Something like:

if not hasattr(f, "soup"):
  utils.verbose_log("Processing file [{0}] found no soup... perhaps it wasn't correctly encoded in UTF8".format(f.refid), utils.AnsiColors.BOLD_RED)

If it helps, I've just pushed up a reproducible test case in an OSS repo at this commit

@9600

9600 commented Oct 22, 2021

@svenevs I've just run into this problem as well. The project is at:

https://github.com/myriadrf/LimeSuite/tree/sphinx/docs/sphinx

It is possible that this is simply an Exhale/Breathe misconfiguration. I would appreciate any suggestions you may have, as we would love to have our API documentation generated with this.

@9600

9600 commented Oct 23, 2021

UPDATE:

Fixed this by finding the non-UTF8 files and excluding them with:

    "exhaleDoxygenStdin": textwrap.dedent('''
        INPUT = ../../src
        EXCLUDE = ../../src/resources ../../src/ConnectionFTDI/FTD3XXLibrary
    ''')

Though it would be nice if the default error was a bit more helpful and perhaps pointed at the problematic files.
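
If the list of excluded paths grows, the Doxygen input text can be generated in conf.py instead of being written by hand. A sketch, with `build_doxygen_stdin` as a hypothetical helper; the paths are the ones from the snippet above, and `"verboseBuild": True` is included because it was recommended earlier in this thread:

```python
def build_doxygen_stdin(input_dir, excluded_paths):
    """Return Doxygen configuration text with one EXCLUDE entry per path."""
    lines = ["INPUT = {0}".format(input_dir)]
    if excluded_paths:
        # Doxygen accepts several space-separated entries for EXCLUDE.
        lines.append("EXCLUDE = " + " ".join(str(p) for p in excluded_paths))
    return "\n".join(lines) + "\n"


exhale_args = {
    # Other required Exhale arguments are omitted from this sketch.
    "verboseBuild": True,
    "exhaleExecutesDoxygen": True,
    "exhaleDoxygenStdin": build_doxygen_stdin(
        "../../src",
        ["../../src/resources", "../../src/ConnectionFTDI/FTD3XXLibrary"],
    ),
}
```

Paths containing spaces would need to be quoted for Doxygen, which this sketch does not handle.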
