-
-
Notifications
You must be signed in to change notification settings - Fork 405
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OBConversion reinitialization for WriteString() and WriteFile() #1922
Comments
adalke
added a commit
to adalke/openbabel
that referenced
this issue
Jan 16, 2019
… otherwise the formats won't initialize correctly. (Issue openbabel#1922) This commit includes a new test program which checks WriteString() and WriteFile() for most of the supported formats. Without this patch, several of them will fail, and the FPS format will segfault.
adalke
added a commit
to adalke/openbabel
that referenced
this issue
Jan 18, 2019
…iple records. My first patch set Index=1 in WriteFile() and WriteString() so that the output format plugin could perform any initialization. This worked, but there was still a problem with some formats when saving multiple records, including the CML format. The CML format uses <molecule> as the top-level element when there is only one input molecule, and <cml> as the top-level element when there are two or more molecules. The <cml> element contains a sequence of <molecule> elements. If Index==1 and IsLast()==true then it's a single-molecule file. If Index==1 and IsLast()==false then it's a multi-molecule file. However, OBConversion::Write() always called SetOneObjectOnly(), which set Last=true, so the CML format always thought it was writing a single-record output file. The fix was to remove SetOneObjectOnly() from the Write() and do more intialization in WriteFile() and WriteString(). Previously Write() did Index++ *after* calling the format's WriteMolecule(). This was incorrect. It needs to be done *before*, so the value of GetOutputIndex() matches the number of molecules which were sent as output. (This is a better emulation of how Convert() works.) This in turn means WriteFile() and WriteString() need to reset Index to 0. They then call Write(), which increments Index, so WriteMolecule() always sees Index=1 for the first molecule.
baoilleach
pushed a commit
to baoilleach/openbabel
that referenced
this issue
Apr 11, 2020
… otherwise the formats won't initialize correctly. (Issue openbabel#1922) This commit includes a new test program which checks WriteString() and WriteFile() for most of the supported formats. Without this patch, several of them will fail, and the FPS format will segfault.
baoilleach
pushed a commit
to baoilleach/openbabel
that referenced
this issue
Apr 11, 2020
…iple records. My first patch set Index=1 in WriteFile() and WriteString() so that the output format plugin could perform any initialization. This worked, but there was still a problem with some formats when saving multiple records, including the CML format. The CML format uses <molecule> as the top-level element when there is only one input molecule, and <cml> as the top-level element when there are two or more molecules. The <cml> element contains a sequence of <molecule> elements. If Index==1 and IsLast()==true then it's a single-molecule file. If Index==1 and IsLast()==false then it's a multi-molecule file. However, OBConversion::Write() always called SetOneObjectOnly(), which set Last=true, so the CML format always thought it was writing a single-record output file. The fix was to remove SetOneObjectOnly() from the Write() and do more intialization in WriteFile() and WriteString(). Previously Write() did Index++ *after* calling the format's WriteMolecule(). This was incorrect. It needs to be done *before*, so the value of GetOutputIndex() matches the number of molecules which were sent as output. (This is a better emulation of how Convert() works.) This in turn means WriteFile() and WriteString() need to reset Index to 0. They then call Write(), which increments Index, so WriteMolecule() always sees Index=1 for the first molecule.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Environment Information
Open Babel version: 2.4.90 (HEAD as of 14 January 2019)
Operating system and version: macOS 10.12.2
Expected Behavior
I expect to be able to write to FPS format using OBConversion without having a segfault. Here are examples of expected behavior:
Example 1 (expected)
Example 2 (expected)
Actual Behavior
The above code on the current Open Babel HEAD (and with 2.4.1) causes segmentation faults:
Steps to Reproduce
Example 1 (actual)
Example 2 (actual)
(Replace the WriteString() with the WriteFile() and there is still a segmentation fault.)
The problem seems to be that WriteString() and WriteFile() on a newly constructed OBConversion() leave the Index at the initial value of 0.
Some of the formats use GetOutputIndex() and, if it is 1, perform an initialization. For the FPS format, the initialization loads the correct fingerprint type. If it is uninitialized then it will deference an unassigned variable, causing the segfault.
I have patched WriteString() and WriteFile(). I will submit a pull request shortly to demonstrate one possible solution.
Related symptoms
issue #1783
Note that this is likely related to issue #1783 . With an un-patched system:
with my patch in place:
The proposed solution in that issue was to enable the 'h' option, which worked because the relevant code sets hexoutput using the following.
It may also be related to issue #852 .
CML output
As another example, the CML output differs between an unpatched system:
and a patched system:
Note the leading XML declaration in the patched version. This is because of the following test in format/xml/cmlformat.cpp :
The text was updated successfully, but these errors were encountered: