hash coding API

commit 07fb90eb1f73939389f23aff9b20dc44f57a1ea9 1 parent 489aaaf
John May authored February 03, 2013
44  src/main/org/openscience/cdk/hash/AtomHashGenerator.java
@@ -0,0 +1,44 @@
+/* Copyright (c) 2013. John May <jwmay@users.sf.net>
+ *
+ * Contact: cdk-devel@lists.sourceforge.net
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public License
+ * as published by the Free Software Foundation; either version 2.1
+ * of the License, or (at your option) any later version.
+ * All we ask is that proper credit is given for our work, which includes
+ * - but is not limited to - adding the above copyright notice to the beginning
+ * of your source code files, and to any copyright notice that you may distribute
+ * with programs based on this work.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 U
+ */
+package org.openscience.cdk.hash;
+
+import org.openscience.cdk.interfaces.IAtomContainer;
+
+/**
+ * A hash function which generates 64-bit hash codes for the atoms of a
+ * molecule.
+ *
+ * @author John May
+ * @cdk.module interfaces
+ */
+public interface AtomHashGenerator {
+
+    /**
+     * Generate invariant 64-bit hash codes for the atoms of the molecule.
+     *
+     * @param container a molecule
+     * @return atomic hash codes
+     */
+    public long[] generate(IAtomContainer container);
+
+}
47  src/main/org/openscience/cdk/hash/EnsembleHashGenerator.java
@@ -0,0 +1,47 @@
+/* Copyright (c) 2013. John May <jwmay@users.sf.net>
+ *
+ * Contact: cdk-devel@lists.sourceforge.net
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public License
+ * as published by the Free Software Foundation; either version 2.1
+ * of the License, or (at your option) any later version.
+ * All we ask is that proper credit is given for our work, which includes
+ * - but is not limited to - adding the above copyright notice to the beginning
+ * of your source code files, and to any copyright notice that you may distribute
+ * with programs based on this work.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 U
+ */
+package org.openscience.cdk.hash;
+
+import org.openscience.cdk.interfaces.IAtomContainer;
+
+import java.util.Set;
+
+/**
+ * A hash function which generates a single 64-bit hash code for a set of
+ * molecules (ensemble).
+ *
+ * @author John May
+ * @cdk.module interfaces
+ */
+public interface EnsembleHashGenerator {
+
+    /**
+     * Generate invariant 64-bit hash code for an ensemble of molecules.
+     *
+     * @param ensemble an ensemble molecule
+     * @return hash code for the ensemble
+     */
+    public long generate(Set<IAtomContainer> ensemble);
+
+
+}
43  src/main/org/openscience/cdk/hash/MoleculeHashGenerator.java
@@ -0,0 +1,43 @@
+/* Copyright (c) 2013. John May <jwmay@users.sf.net>
+ *
+ * Contact: cdk-devel@lists.sourceforge.net
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public License
+ * as published by the Free Software Foundation; either version 2.1
+ * of the License, or (at your option) any later version.
+ * All we ask is that proper credit is given for our work, which includes
+ * - but is not limited to - adding the above copyright notice to the beginning
+ * of your source code files, and to any copyright notice that you may distribute
+ * with programs based on this work.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 U
+ */
+package org.openscience.cdk.hash;
+
+import org.openscience.cdk.interfaces.IAtomContainer;
+
+/**
+ * A hash function which generates a single 64-bit hash code for a molecule.
+ *
+ * @author John May
+ * @cdk.module interfaces
+ */
+public interface MoleculeHashGenerator {
+
+    /**
+     * Generate invariant 64-bit hash code for a molecule.
+     *
+     * @param container a molecule
+     * @return hash code for the molecule
+     */
+    public long generate(IAtomContainer container);
+
+}

0 notes on commit 07fb90e

Egon Willighagen

What is the idea behind putting these in interfaces? I prefer to keep only things there that are needed by several modules, and this interface seems specific to a hash module. Can you say something about how extensively this hash functionality will be used in the CDK?

John May

I think these should go in interfaces. I tend to put every interface at the bottom of the module hierarchy. This could be used for fingerprints, isomorphism, atom typing... anything where you need fast lookup or caching of invariants.
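
To make the "caching of invariants" use case concrete, here is a minimal sketch. It is not part of this commit: `HashGenerator<M>` is a stand-in for `MoleculeHashGenerator` with a generic type in place of `IAtomContainer`, and `InvariantCache` is a hypothetical helper invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Stand-in for MoleculeHashGenerator; M replaces IAtomContainer (assumption). */
interface HashGenerator<M> {
    long generate(M molecule);
}

/**
 * Hypothetical cache that computes an expensive property once per hash code.
 * Two molecules with the same hash share one entry, so a hash collision
 * would return the wrong value; a real cache would need to guard against that.
 */
final class InvariantCache<M, V> {
    private final HashGenerator<M> hasher;
    private final Function<M, V> compute;
    private final Map<Long, V> values = new HashMap<>();
    private int computations = 0;

    InvariantCache(HashGenerator<M> hasher, Function<M, V> compute) {
        this.hasher = hasher;
        this.compute = compute;
    }

    /** Returns the cached value for the molecule's hash, computing it on first use. */
    V get(M molecule) {
        long key = hasher.generate(molecule);
        if (!values.containsKey(key)) {
            values.put(key, compute.apply(molecule));
            computations++;
        }
        return values.get(key);
    }

    /** Number of times the expensive computation actually ran. */
    int computations() {
        return computations;
    }
}
```

The cache only sees the interface, so any hash implementation can be plugged in without the caching code depending on the module that provides it.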

Another example probably shows why better. Let's say I have a SMILES generator, which needs a canonicalize method.

SmilesGenerator generator = new SmilesGenerator(canonical); // canonical: any implementation of the Canonical interface

Now I have a module, label, which has a MorganNumbers() tool. I stick the interface in the same module as MorganNumbers, as that's where it's needed.

SmilesGenerator generator = new SmilesGenerator(new MorganNumbers());

Then I want one for InChINumbers... I now need to make InChI depend on my MorganNumbers module or move the interface.

SmilesGenerator generator = new SmilesGenerator(new InChINumbering());
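
A rough sketch of that arrangement follows. Everything in it is hypothetical: the `order` method on `Canonical`, the use of a `String[]` of element symbols as the molecule, and the toy orderings are assumptions made only so the example is self-contained. It is not the real CDK API.

```java
import java.util.Arrays;

/** Hypothetical shared interface; it would live in the lowest-level module. */
interface Canonical {
    /** Returns the atom indices in canonical output order. */
    int[] order(String[] atoms);
}

/** Depends only on Canonical, never on a concrete labelling module. */
final class SmilesGenerator {
    private final Canonical canonical;

    SmilesGenerator(Canonical canonical) {
        this.canonical = canonical;
    }

    /** Toy output: element symbols joined in canonical order (not real SMILES). */
    String create(String[] atoms) {
        StringBuilder sb = new StringBuilder();
        for (int index : canonical.order(atoms)) {
            sb.append(atoms[index]);
        }
        return sb.toString();
    }
}

/** Stand-in for the Morgan-number tool: here simply alphabetical by symbol. */
final class MorganNumbers implements Canonical {
    @Override
    public int[] order(String[] atoms) {
        return Orderings.sortedIndices(atoms, false);
    }
}

/** Stand-in for an InChI-based numbering: here reverse alphabetical. */
final class InChINumbering implements Canonical {
    @Override
    public int[] order(String[] atoms) {
        return Orderings.sortedIndices(atoms, true);
    }
}

/** Shared helper for the two toy orderings above. */
final class Orderings {
    private Orderings() {}

    static int[] sortedIndices(String[] atoms, boolean reversed) {
        Integer[] indices = new Integer[atoms.length];
        for (int i = 0; i < indices.length; i++) {
            indices[i] = i;
        }
        // Stable sort of the indices by element symbol.
        Arrays.sort(indices, (a, b) -> reversed
                ? atoms[b].compareTo(atoms[a])
                : atoms[a].compareTo(atoms[b]));
        int[] result = new int[indices.length];
        for (int i = 0; i < result.length; i++) {
            result[i] = indices[i];
        }
        return result;
    }
}
```

Neither `MorganNumbers` nor `InChINumbering` refers to the other here; both depend only on `Canonical`, which is the point of keeping the interface at the bottom of the hierarchy.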

With the interface here I can write a simple wrapping class in the InChI module which also provides a hash using the InChIKey, without having to depend on this hashcode module.
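
Such a wrapper might look roughly like the sketch below. The names are assumptions: `MoleculeHasher<M>` mirrors `MoleculeHashGenerator` with a generic type instead of `IAtomContainer`, `InChIKeyHashGenerator` is hypothetical, and the InChIKey lookup is passed in as a function because real InChI generation is outside the scope of this example. Folding the key into 64 bits with FNV-1a is my own choice, not something from the commit.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

/** Stand-in mirroring MoleculeHashGenerator; M replaces IAtomContainer (assumption). */
interface MoleculeHasher<M> {
    long generate(M container);
}

/** Hypothetical wrapper that would live in the InChI module. */
final class InChIKeyHashGenerator<M> implements MoleculeHasher<M> {
    private final Function<M, String> inchiKeyOf;

    /** @param inchiKeyOf supplies the InChIKey for a molecule (assumed to exist elsewhere) */
    InChIKeyHashGenerator(Function<M, String> inchiKeyOf) {
        this.inchiKeyOf = inchiKeyOf;
    }

    @Override
    public long generate(M container) {
        return fnv1a64(inchiKeyOf.apply(container));
    }

    /** 64-bit FNV-1a over the UTF-8 bytes of the key. */
    static long fnv1a64(String text) {
        long hash = 0xcbf29ce484222325L;   // FNV offset basis
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            hash ^= (b & 0xff);
            hash *= 0x100000001b3L;        // FNV prime
        }
        return hash;
    }
}
```

The wrapper needs only the interface and an InChIKey source, so the InChI module never has to import anything from the hash implementation module.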
