Commit
Bugfix for handling single fragment in Mol Extractor input data (#19)
* Bugfix for handling a single fragment in input data of RDKit Molecule Extractor node (Fixes #18). Single fragments are treated as expected, resulting in a single output row. If the sanitization option is on, it will be sanitized as multiple fragments would. Updated documentation and regression tests. Additionally, added JRebel startup parameter for test preparation launch file.
* Updated copyright information
1 parent 6c301e4, commit f445971
Showing 4 changed files with 872 additions and 866 deletions.
113 changes: 57 additions & 56 deletions
....knime.nodes/src/org/rdkit/knime/nodes/molextractor/RDKitMoleculeExtractorNodeFactory.xml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,56 +1,57 @@ | ||
<?xml version="1.0" encoding="utf-8"?> | ||
<!DOCTYPE knimeNode PUBLIC "-//UNIKN//DTD KNIME Node 2.0//EN" "http://www.knime.org/Node.dtd"> | ||
<knimeNode icon="./default.png" type="Manipulator"> | ||
<name>RDKit Molecule Extractor</name> | ||
|
||
<shortDescription> | ||
Splits up fragment molecules contained in a single RDKit molecule cell and extracts these molecules into separate cells. | ||
</shortDescription> | ||
|
||
<fullDescription> | ||
<intro> | ||
Splits up disconnected fragment molecules contained in a single RDKit molecule cell | ||
and extracts these molecules into separate cells. If the input molecule contains only one fragment | ||
or if the input cell is empty (missing), the input cell will be used as result with the | ||
appropriate reference column. It can either be used with an input table or based on | ||
flow variable input for the molecules and their format. Supported molecule formats are | ||
RDKit Mol cells (when connecting an input table), SMILES, MOL and SDF. <br></br> | ||
Please be aware that auto-conversion (e.g. for SMILES input) may fail when connecting an input table. <br></br> | ||
The Advanced Tab offers different options to treat conversion failures, empty input cells | ||
and zero-atom molecules (empty molecules). You may configure the node to fail, to generate empty cells | ||
with or without warning, or to skip the input with or without warning.<br></br> | ||
The node can be used for instance after a Quickform Molecule Input node, which brings up a sketcher in the | ||
KNIME Web Portal. When the user draws multiple molecules at once this node will split up the user's | ||
input into multiple molecules. | ||
</intro> | ||
<option name="Table Input - RDKit Mol column">The input column with RDKit Molecules, which may contain multiple disconnected fragments to be extracted.</option> | ||
<option name="Table Input - Reference column (e.g. an ID)">The column to be used as reference column. Its values are assigned | ||
to the cells with the extracted molecules. You may use the Row ID here or set it to None, in which case the reference column will not be added.</option> | ||
<option name="Variable/Data Input - Molecules">A textual representation of the molecules to be split. Usually, you will | ||
attach a flow variable to control this setting, e.g. coming from a Molecule Sketcher. This setting will only | ||
be used, if no table is connected.</option> | ||
<option name="Variable/Data Input - Format">The format of the molecules: MOL, SDF or SMILES are supported. | ||
This setting will only be used, if no table is connected.</option> | ||
<option name="Output - Column name for extracted molecules">The name to be used for the new column used to store extracted molecules.</option> | ||
<option name="Output - Column name for copied reference data">The name to be used for the new reference column used to store the reference values.</option> | ||
<option name="Advanced - Sanitize fragments">Flag to determine, if fragments shall be sanitized when being extracted. | ||
Selecting this option may lead to additional fragmentation errors. Default is false.</option> | ||
<option name="Advanced - How to react on conversion errors">Define here, how the node shall behave, if | ||
an input molecule to be processed could not be converted into an RDKit molecule. Usually, this is the case | ||
if the molecule is invalid. By default, the node | ||
will generate a missing cell and no warning. | ||
If conversion requires special treatment, you may use the RDKit From Molecule node | ||
to perform the conversion before executing this node. It offers different options for conversion.</option> | ||
<option name="Advanced - How to react on empty (missing) cells">Define here, how the node shall behave, if | ||
an empty (missing) cell is used as input. By default, the node | ||
will generate a missing cell and no warning.</option> | ||
<option name="Advanced - How to react on empty (zero atom) molecules">Define here, how the node shall behave, if | ||
an empty molecule is used as input. This means that the molecule has zero atoms. By default, the node | ||
will generate a missing cell and no warning.</option> | ||
</fullDescription> | ||
|
||
<ports> | ||
<inPort index="0" name="Input table with RDKit Molecules">Input table with RDKit Molecules, which may contain disconnected fragment molecules.</inPort> | ||
<outPort index="0" name="Result table with extracted RDKit Molecules">Output table with extracted fragment molecules.</outPort> | ||
</ports> | ||
</knimeNode> | ||
<?xml version="1.0" encoding="utf-8"?> | ||
<!DOCTYPE knimeNode PUBLIC "-//UNIKN//DTD KNIME Node 2.0//EN" "http://www.knime.org/Node.dtd"> | ||
<knimeNode icon="./default.png" type="Manipulator"> | ||
<name>RDKit Molecule Extractor</name> | ||
|
||
<shortDescription> | ||
Splits up fragment molecules contained in a single RDKit molecule cell and extracts these molecules into separate cells. | ||
</shortDescription> | ||
|
||
<fullDescription> | ||
<intro> | ||
Splits up disconnected fragment molecules contained in a single RDKit molecule cell | ||
and extracts these molecules into separate cells, also sanitizing these molecules if desired. | ||
If the input cell is empty (missing), the input cell will be used as result with the | ||
appropriate reference column. If the input molecule contains only one fragment it will | ||
result in a single row. The node can either be used with an input table or based on | ||
flow variable input for the molecules and their format. Supported molecule formats are | ||
RDKit Mol cells (when connecting an input table), SMILES, MOL and SDF. <br></br> | ||
Please be aware that auto-conversion (e.g. for SMILES input) may fail when connecting an input table. <br></br> | ||
The Advanced Tab offers different options to treat conversion failures, empty input cells | ||
and zero-atom molecules (empty molecules). You may configure the node to fail, to generate empty cells | ||
with or without warning, or to skip the input with or without warning.<br></br> | ||
The node can be used for instance after a Quickform Molecule Input node, which brings up a sketcher in the | ||
KNIME Web Portal. When the user draws multiple molecules at once this node will split up the user's | ||
input into multiple molecules. | ||
</intro> | ||
<option name="Table Input - RDKit Mol column">The input column with RDKit Molecules, which may contain multiple disconnected fragments to be extracted.</option> | ||
<option name="Table Input - Reference column (e.g. an ID)">The column to be used as reference column. Its values are assigned | ||
to the cells with the extracted molecules. You may use the Row ID here or set it to None, in which case the reference column will not be added.</option> | ||
<option name="Variable/Data Input - Molecules">A textual representation of the molecules to be split. Usually, you will | ||
attach a flow variable to control this setting, e.g. coming from a Molecule Sketcher. This setting will only | ||
be used, if no table is connected.</option> | ||
<option name="Variable/Data Input - Format">The format of the molecules: MOL, SDF or SMILES are supported. | ||
This setting will only be used, if no table is connected.</option> | ||
<option name="Output - Column name for extracted molecules">The name to be used for the new column used to store extracted molecules.</option> | ||
<option name="Output - Column name for copied reference data">The name to be used for the new reference column used to store the reference values.</option> | ||
<option name="Advanced - Sanitize fragments">Flag to determine, if fragments shall be sanitized when being extracted. | ||
Selecting this option may lead to additional fragmentation errors. Default is false.</option> | ||
<option name="Advanced - How to react on conversion errors">Define here, how the node shall behave, if | ||
an input molecule to be processed could not be converted into an RDKit molecule. Usually, this is the case | ||
if the molecule is invalid. By default, the node | ||
will generate a missing cell and no warning. | ||
If conversion requires special treatment, you may use the RDKit From Molecule node | ||
to perform the conversion before executing this node. It offers different options for conversion.</option> | ||
<option name="Advanced - How to react on empty (missing) cells">Define here, how the node shall behave, if | ||
an empty (missing) cell is used as input. By default, the node | ||
will generate a missing cell and no warning.</option> | ||
<option name="Advanced - How to react on empty (zero atom) molecules">Define here, how the node shall behave, if | ||
an empty molecule is used as input. This means that the molecule has zero atoms. By default, the node | ||
will generate a missing cell and no warning.</option> | ||
</fullDescription> | ||
|
||
<ports> | ||
<inPort index="0" name="Input table with RDKit Molecules">Input table with RDKit Molecules, which may contain disconnected fragment molecules.</inPort> | ||
<outPort index="0" name="Result table with extracted RDKit Molecules">Output table with extracted fragment molecules.</outPort> | ||
</ports> | ||
</knimeNode> |
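The row behavior described in the updated node documentation can be illustrated with a small sketch. This is not the node's Java implementation, which is not shown in this diff. The function name `extract_fragments` and the row layout are hypothetical, and the sketch assumes that fragments in a SMILES string are separated by `.` and that no ring-closure digits pair up across a dot. A real implementation would rely on RDKit itself (for example `Chem.GetMolFrags(mol, asMols=True, sanitizeFrags=...)` in the Python API) rather than on string splitting.

```python
from typing import List, Optional, Tuple

# One output row: (extracted fragment or None for a missing cell, reference value)
Row = Tuple[Optional[str], str]


def extract_fragments(smiles: Optional[str], reference: str) -> List[Row]:
    """Hypothetical helper: one output row per disconnected fragment.

    Assumption: '.' separates fragments in the SMILES string and no
    ring-closure digits connect atoms across a dot.
    """
    if smiles is None:
        # Missing input cell: mirrors the documented default of producing
        # a missing cell together with the reference value.
        return [(None, reference)]

    fragments = [part for part in smiles.split(".") if part]
    if not fragments:
        # Zero-atom molecule: mirrors the documented default (missing cell).
        return [(None, reference)]

    # A single fragment yields exactly one row, which is the case this
    # commit fixes; several fragments yield one row each.
    return [(fragment, reference) for fragment in fragments]


if __name__ == "__main__":
    for row in extract_fragments("CCO.[Na+].[Cl-]", "row-1"):
        print(row)
    print(extract_fragments("CCO", "row-2"))
```

The single-fragment input `"CCO"` produces one row, just as a multi-fragment input produces one row per fragment. The sanitization option is deliberately left out, because it requires an actual cheminformatics toolkit.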