1.5.6 Release Notes
This release is the first to use Maven to build the project, and many of the changes relate to reorganising directories and fixing build issues. New functionality includes an implementation of the popular extended connectivity fingerprints (ECFP), contributed by Alex Clark and Krishna Dole of Collaborative Drug Discovery. Additional changes include improved API utilities, descriptor fixes, and improved stereochemistry perception and depiction. As always, a huge thank you to all contributors and reviewers.
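For readers unfamiliar with circular fingerprints, the idea behind ECFP can be illustrated with a deliberately simplified sketch. It is not the contributed CDK implementation: the class `CircularSketch`, its method, and the hashing scheme are hypothetical, and bond orders, charges, and duplicate-substructure removal are all ignored. Each atom starts with an identifier derived from its atomic number and degree, and each iteration folds in the sorted identifiers of its neighbours.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical, simplified illustration of circular (Morgan-style) feature hashing.
// It is NOT the CDK fingerprinter and ignores bond orders and charges.
class CircularSketch {

    // atomicNumbers[i] is the element of atom i; neighbours[i] lists the atoms bonded to i.
    static Set<Integer> fingerprint(int[] atomicNumbers, int[][] neighbours, int radius) {
        int n = atomicNumbers.length;
        int[] ids = new int[n];
        Set<Integer> features = new TreeSet<>();

        // Layer 0: identifier from the atom's own properties (element and degree).
        for (int i = 0; i < n; i++) {
            ids[i] = Arrays.hashCode(new int[] {atomicNumbers[i], neighbours[i].length});
            features.add(ids[i]);
        }

        // Each further layer combines an atom's identifier with those of its neighbours.
        for (int layer = 1; layer <= radius; layer++) {
            int[] next = new int[n];
            for (int i = 0; i < n; i++) {
                int[] around = new int[neighbours[i].length];
                for (int j = 0; j < around.length; j++) {
                    around[j] = ids[neighbours[i][j]];
                }
                // Sorting makes the result independent of neighbour order.
                Arrays.sort(around);
                next[i] = 31 * Arrays.hashCode(new int[] {layer, ids[i]})
                        + Arrays.hashCode(around);
                features.add(next[i]);
            }
            ids = next;
        }
        return features;
    }
}
```

Because neighbour identifiers are sorted before hashing, the resulting feature set does not depend on the order in which the atoms are listed.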
Related blog posts
Download options have changed slightly, as external dependencies (libraries) are now resolved by Maven.
- The release has been deployed to the
- The bundled jar of the library and all required dependencies:
- Additional download options available at sourceforge: https://sourceforge.net/projects/cdk/files/cdk%20%28development%29/1.5.6/
This release brings a reorganisation of the project directory and build system. Some tests (ioformats) are disabled due to cyclic dependencies. Maven also picked up many tests that were previously not run, which accounts for the increase in failures.
19961 (-760) tests (ioformats and coverage not run), 23 (+7) failures, 0 (-19) coverage failures (disabled), 0 (-1) errors
112 John May, 7 Alex Clark, 5 Stephan Beisken, 4 Egon Willighagen, 3 Cyrus Harmon, 2 Krishna Dole, 1 Nina Jeliazkova
1 Cyrus Harmon, 71 Egon Willighagen, 17 John May
Full change log
- Bumping version for release.
- Added a missing space
- Site index page
- Allow creation of staged site (without testapidocs)
- Including Cyrus, Alex and Krishna as contributors.
- Maven site reports developers as those with commit access. Other developers are listed as contributors.
- Referenced in the doc/dict and now located there.
- Conversion table is related to atom types, no apparent usage.
- Git attribute files allowed the descriptor $Id$ to be tied to a commit. This was recently removed but may be useful again in future.
- git ignored no longer needed
- Moving package documentation for ChemNomParser.
- Element definitions are now an enum, this file has been moved to cdk-build-util.
- HETATM type map for PDB reading.
- Unused CML test frameworks, referenced in io module.
- File not used, but now in correct module location.
- Residual config files left over from ant build.
- Shade plugin requires maven v3.0+.
- Update descriptor algorithms ontology.
- Include simple javadoc tags.
- Consider implicit hydrogens in natural abundance utility.
- Consider implicit hydrogens in exact mass utility.
- Consider implicit hydrogen counts in getNaturalExactMass.
- Fixed bug in CircularFingerprint/biotypes (for FCFP)
- Exceptions being thrown in functional fingerprint generation.
- Unpaired electrons affect hydrogen count.
- Triplet = 2 unpaired electrons, not currently represented.
- Remove language assertion that caused test error. The test assertions were passing and the lang assert was causing an error randomly. This class will be moved to a deprecated module shortly.
- Similarity measure was modified (d1527c6) but value was not updated.
- Check null hydrogen count [bug:1335]
- Random test order causes failure now that the default option is to ignore aromaticity.
- Pattern implementation for SMARTS queries.
- Move test resources to be local to the test that requires them. These files in particular have been reused in pcore/io, so the originals have been left in place for the moment.
- We can remove the dependence on cdk-extra and valency check by not using the MoleculeFactory. Using cdk-data (test) isn't ideal but is better than extra.
- Dependencies that can be safely removed from cdk-group and cdk-hash.
- Move the hash resources files to the same package as the test that needs them.
- Move resource files required by cdk-hash to that module, eliminating the dependence on testdata.
- We can correctly define the descriptors exported by each module.
- Utility in the Matching API provides a simplified procedure for assigning a perfect matching.
- Maximum matching using Edmonds' blossom algorithm.
- DisjointSetForest generally useful - move to cdk-standard. Left in the same package for now.
- Storage and manipulation of an independent edge-set, a matching. A matching can be used to represent pi bond placement in conjugated/aromatic systems.
- The Vertex Adjacency Magnitude descriptor requires the number of heavy atom - heavy atom bonds as input. The current incorrect calculation retrieves the number of heavy atoms instead. Correct bond number estimation and test cases have been added.
- The rotatable bond count descriptor should not include C-N amides and terminal hydrogen- and hetero-atoms for the extended Lipinski's rule of five implementation (see Veber, D.F. et al., 2002. Journal of medicinal chemistry, 45(12), pp.2615–23). The parameters for the rotatable bond counts descriptor have been modified to that end and an option has been added to exclude simple C-N amides (ignoring tautomeric or charged constitutions).
- Container may be null (allowed); use the default length limit (|V|) from the graph instead.
- A filter to only return cycles without a chord.
- Constructor should still be private - debugging from last commit.
- Explain that the minimum length is chosen if both a constructor limit and a method limit are provided.
- Adapt the cycle finding to use the length.
- Pass a length limit to the cycle finding algorithm.
- Limit the size of initial cycles discovered.
- Limit the size of shortest paths discovered.
- Allow five valent nitrogen for non-charge separated representation, see pyridinone test case.
- Check for normal valence in aromatic model before anything else.
- Test that three valent nitrogen cations (radical) are not allowed to be aromatic [bug:1332]
- Ignore aromatic bonds by default. The option has been deprecated but should be removed in future. Global state, therefore impossible to test reliably.
- Correcting typos.
- Update README.md
- Repository config to use RELEASE and LATEST.
- Empty targets are also a problem for the substructure search. A simple check avoids the infinite loop.
- Ignore this old test; this convention is not used anymore, AFAIK
- Providing a custom set of invariants before labelling.
- Including JavaDoc for new methods. Would also be better to use Elements from a symbol, currently the method is case sensitive.
- add new MolecularFormulaManipulator.getElementCount static methods that take (formula, IIsotope) and (formula, String)
- remove debug log statements in AtomPlacer.java
- convert some AtomPlacer methods to static methods and fixup their invocations
- Descriptor $Id$ needs updating to new scheme.
- Use MDLV2000Reader for tests
- Minor tidying of comments
- CircularFingerprinter recyclable
- Change implementation vendor to CDK, since that seems to be the convention.
- Minor edits to comments; remove unused debugging code and import.
- Added missing files
- Descriptors & fingerprints
- Correcting repository id.
- Correct version - previous release was 1.5.5 so the current version should be 1.5.6-SNAPSHOT.
- Update README.md
- Section on the Maven artefacts and repository.
- Update README.md
- Using markdown in README.
- Configuring distribution management - these addresses are only needed for deployment to a remote repository. This repository may be different for each user and will also require a password. The locations can be defined in the ~/.m2/settings.xml file.
- Assembly of an uber-jar, dist-large.
- Correct unit test assertion.
- Inheritance in tests causes problems for dependency analysis. We need to add cdk-diff and mockito to these modules even though the modules only have transitive interactions.
- cdk-diff needed by the interfaces module.
- Using ‘mvn dependency:analyze’ on cdk-base modules to remove dependencies that were declared but not used and include dependencies that were used but not declared. In the second case these occur when the code interacts with a dependency of a dependency. Examples seen here are xml-apis (a dependency of xom) and hamcrest-core (a dependency of hamcrest-all).
- Centralising hamcrest version in parent pom.
- Centralising mockito version in parent pom.
- Centralising JUnit version in parent pom.
- Centralising Guava version in parent pom.
- Centralising Beam dependency version in the parent pom.
- Ignore config files from Eclipse, which it now autogenerates from Maven details
- Default to the CDK version, if no other specificationIdentifier is given
- Update README
- Update README
- Updating readme with maven instructions.
- Bumping copyright year.
- Added project names for each module (same as module name)
- Moving testdata resources to the correct location.
- Using latest version of Beam.
- Dependencies are now resolved using Maven. Version is now in root pom.xml instead of build.props.
- Ignore maven build directory (target/).
- Removing old class from JChemPaint.
- Module build files.
- @cdk.set files - note qsar set is now redundant and picked up from META-INF/services
- Redundant classes.
- Moving source files to separate source trees.
- Moved to cdk-build-util project
- files to remove
- Aromaticity information is lost on descriptor execution unless the 'aromaticity parameter' is set to true. Already perceived -- pre-perceived -- aromaticity information should not be lost automatically if this parameter is set to false. Perception of unset parameters only via the helper method in the AtomContainerManipulator rectifies this issue. Three test cases: 1) Benzene with pre-perceived aromaticity and no aromaticity perception in the descriptor. 2) Benzene with no pre-perceived aromaticity and no aromaticity perception in the descriptor. 3) Benzene with no pre-perceived aromaticity and perception in the descriptor.
- Arbitrary labels in SMILES bracket atoms. The label is set in the pseudo atom.
- Beam 0.5.
  - Resolve a bug that caused random failures when configuring double-bond stereochemistry (found from the new random order of IStereoElement sets).
  - Parse tetrahedral selenium and sulphur cations (okay according to the InChI technical manual).
  - Parse arbitrary labels in bracket atoms. They are never generated in output but are instead placed in the ‘label’ of a pseudo atom if one wants to handle them.
- Avoid erroneous aromaticity due to abnormal valence.
- Atoms are not added to IQueryAtomContainer causing an index out of bounds exception downstream. Adding the atoms fixes the issue.
- Removed redundant System.out statement.
- Simplified creation of a custom 'fall-back' cycle finder.
- Find all cycles up to (and including) a specified length.
- Point to correct usage in error message
- Remove deprecation warning - this will likely be the best way to set the parameters for SMARTS for a while.
- Ensure an exception is not thrown when a tetrahedral centre has an implicit hydrogen but the target does not. We want matching to be consistent whether the query has implicit or explicit hydrogens, and so we say this is not a substructure match (with stereochemistry).
- Implicit hydrogen count should always be 0 or greater.
- Correct filtering of unique atom and bond matches. The filter now uses a new predicate for each successive iteration.
- Also initialise descriptors loaded by name.
- Can’t run devel.xml but this should be working correctly.
- Open JavaDoc Checker now working with the doc tests in cdk-build-util. We can also remove the compilation taglets / docchecks.
- Custom taglets all work correctly including the IOOptions - the ‘cheminf.bibx’ is now duplicated in the ‘cdk-build-util’ project.
- Including the utils in the classpath allows the project to build correctly.
- Remove net.sf.* classes - now in ‘cdk-build-util’
- Check for null input before closing the resource.
- Don't stop the world if a single invalid molecule is found. The iterator is meant for parsing large data sets and should not stop early if an invalid structure is found. If the SMILES could not be parsed, the iterator now returns an empty container and sets the attempted input as a property.
- Empty molecules are rare but found - this should not halt the reader.
- input.ready() doesn't really do anything.
- Deprecating redundant method.
- Bumping version number - open for patching.
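
The change-log entry on not halting at a single invalid molecule describes a general pattern for iterating over large data sets. A minimal sketch of that pattern follows. It is plain Java rather than CDK code: `LenientIterator` and `Result` are hypothetical names, and a `Result` carrying the rejected input stands in for the empty container with the input stored as a property.

```java
import java.util.Iterator;
import java.util.function.Function;

// Hypothetical sketch: an iterator that records parse failures instead of stopping.
class LenientIterator<T> implements Iterator<LenientIterator.Result<T>> {

    // Holds either a parsed value or the raw input that could not be parsed.
    static final class Result<T> {
        final T value;         // null when parsing failed
        final String badInput; // the rejected input, or null on success

        Result(T value, String badInput) {
            this.value = value;
            this.badInput = badInput;
        }

        boolean isValid() {
            return badInput == null;
        }
    }

    private final Iterator<String> lines;
    private final Function<String, T> parser;

    LenientIterator(Iterator<String> lines, Function<String, T> parser) {
        this.lines = lines;
        this.parser = parser;
    }

    @Override
    public boolean hasNext() {
        return lines.hasNext();
    }

    @Override
    public Result<T> next() {
        String line = lines.next();
        try {
            return new Result<>(parser.apply(line), null);
        } catch (RuntimeException e) {
            // Keep going: hand back a placeholder that remembers the failed input.
            return new Result<>(null, line);
        }
    }
}
```

A caller can then process every record in one pass and decide afterwards what to do with the entries for which `isValid()` is false, instead of losing the rest of the file to one bad line.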