Protein structure descriptors and alignment based on 3D Zernike moments.
See it in action:
- https://www.rcsb.org : assembly and chain search integrated with other types of searches (text, sequence, etc.)
- http://shape.rcsb.org : standalone frontend application that performs assembly and chain search and calculates alignments on the fly, displaying them with NGL
This library implements 3D Zernike moment calculation and normalization as introduced in Canterakis 1996 and Canterakis 1999. Routines are provided for calculation of:
- Trivial rotational invariants (norms of the vectors Ωnl), commonly referred to as 3D Zernike Descriptors. Calculation of these descriptors is based on the 3D Zernike Moments library by Marcin Novotni (see Novotni and Klein 2003). The implementation here fixes a bug in that library that caused the invariants of the same order to be cumulative.
- Complete rotational invariants (Canterakis norms), not available in the Novotni library.
- Alignments, based on the complete rotational invariants.
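To make the first item concrete, here is a minimal, self-contained sketch of what "norm of the vector Ωnl" means. It does not use the BioZernike API: the class name `ZernikeNormSketch`, its methods, and the moment values are hypothetical and chosen for illustration only. The sketch assumes the moments for a fixed (n, l) are stored as parallel real and imaginary arrays indexed by m = -l..l, and it checks that a rotation about the z axis, which only changes the phase of each component, leaves the norm unchanged.

```java
// Illustrative sketch only; not part of the BioZernike library.
final class ZernikeNormSketch {

    /** Euclidean norm of one complex moment vector for fixed (n, l), m = -l..l. */
    static double norm(double[] re, double[] im) {
        if (re.length != im.length) {
            throw new IllegalArgumentException("real and imaginary parts differ in length");
        }
        double sum = 0.0;
        for (int i = 0; i < re.length; i++) {
            sum += re[i] * re[i] + im[i] * im[i]; // |z|^2 of each component
        }
        return Math.sqrt(sum);
    }

    /**
     * Multiplies the component with index m by exp(i * m * alpha), which is how
     * a rotation about the z axis acts on the vector (up to sign convention).
     * Returns {realParts, imaginaryParts}.
     */
    static double[][] rotateAboutZ(double[] re, double[] im, double alpha) {
        int l = (re.length - 1) / 2; // array index i corresponds to m = i - l
        double[] outRe = new double[re.length];
        double[] outIm = new double[im.length];
        for (int i = 0; i < re.length; i++) {
            int m = i - l;
            double c = Math.cos(m * alpha);
            double s = Math.sin(m * alpha);
            outRe[i] = re[i] * c - im[i] * s;
            outIm[i] = re[i] * s + im[i] * c;
        }
        return new double[][] {outRe, outIm};
    }

    public static void main(String[] args) {
        // Hypothetical moments for l = 1 (m = -1, 0, 1); not real protein data.
        double[] re = {1.0, 2.0, -1.0};
        double[] im = {0.5, 0.0, 0.5};
        double[][] rotated = rotateAboutZ(re, im, 0.7);
        System.out.println("norm before rotation: " + norm(re, im));
        System.out.println("norm after rotation:  " + norm(rotated[0], rotated[1]));
    }
}
```

The two printed values should agree up to floating-point rounding. A descriptor built this way keeps one number per (n, l) pair and discards the orientation information that the complete invariants and the alignments rely on.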
The test directory contains tests that demonstrate how to read PDB-deposited protein structures (with the help of BioJava) and perform Zernike moment invariant calculation and alignment.
See the publication describing this work: Real time structural search of the Protein Data Bank. Guzenko D, Burley SK, Duarte JM. PLoS Computational Biology 2020.
We publish JAR artifacts to Maven Central. In a Maven project, you can use this library by adding the following dependency to your pom.xml:
<dependencies>
  <dependency>
    <groupId>org.rcsb</groupId>
    <artifactId>biozernike</artifactId>
    <version>1.0.0-alpha11</version>
  </dependency>
</dependencies>
The zernike package is derived from the "3D Zernike Moments" library by Marcin Novotni and is distributed under the terms of LGPL v2.0. Note that the original link to the library is gone, but a clone of it is available on GitHub.
The volume package is derived from the "gmconvert" program by Takeshi Kawabata and is distributed under the terms of LGPL v3.0.
The complex package is derived from the code by John B. Matthews and is distributed under the terms of LGPL v3.0.
The BioZernike library as a whole is distributed under the terms of the MIT License.