ShortestPathMoleculeFingerprinter

This is a hashed fingerprinter based on shortest paths. The idea is to avoid exhaustive path-based fingerprinters, whose runtime grows exponentially.
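
To make the idea concrete, here is a minimal sketch of a shortest-path hashed fingerprint. It is not the code in this repository and does not use the CDK API: the class `ShortestPathFingerprintSketch`, the 1024-bit size, the `-` separator, and the use of `String.hashCode()` are all assumptions made for illustration. For every pair of atoms it finds one shortest path with a breadth-first search, writes the path as a string of element symbols, and sets the bit that the string hashes to.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

public class ShortestPathFingerprintSketch {

    static final int SIZE = 1024; // assumed fingerprint length

    /** BFS from 'start': pred[v] is the previous atom on a shortest path, -1 if unreachable. */
    static int[] bfsPredecessors(int[][] adjacency, int start) {
        int[] pred = new int[adjacency.length];
        Arrays.fill(pred, -1);
        pred[start] = start; // marks the start atom as visited
        Deque<Integer> queue = new ArrayDeque<>();
        queue.add(start);
        while (!queue.isEmpty()) {
            int atom = queue.poll();
            for (int next : adjacency[atom]) {
                if (pred[next] == -1) {
                    pred[next] = atom;
                    queue.add(next);
                }
            }
        }
        return pred;
    }

    /** Hashes one shortest path per atom pair (and each single atom) into a bit set. */
    static BitSet fingerprint(String[] symbols, int[][] adjacency) {
        BitSet bits = new BitSet(SIZE);
        for (int start = 0; start < symbols.length; start++) {
            int[] pred = bfsPredecessors(adjacency, start);
            for (int end = start; end < symbols.length; end++) {
                if (pred[end] == -1) {
                    continue; // different connected component
                }
                // Walk back from 'end' to 'start' along the BFS tree.
                List<String> atoms = new ArrayList<>();
                int atom = end;
                atoms.add(symbols[atom]);
                while (atom != start) {
                    atom = pred[atom];
                    atoms.add(symbols[atom]);
                }
                // Use the lexicographically smaller direction so that
                // "O-C-C" and "C-C-O" set the same bit.
                String forward = String.join("-", atoms);
                Collections.reverse(atoms);
                String backward = String.join("-", atoms);
                String path = forward.compareTo(backward) <= 0 ? forward : backward;
                bits.set(Math.floorMod(path.hashCode(), SIZE));
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        // Toy chain C-C-O given as element symbols plus adjacency lists.
        String[] symbols = {"C", "C", "O"};
        int[][] adjacency = {{1}, {0, 2}, {1}};
        System.out.println(fingerprint(symbols, adjacency));
    }
}
```

Each breadth-first search is linear in atoms plus bonds, so the whole sketch stays polynomial in the size of the molecule. When two atoms are joined by several equally short paths, the sketch keeps whichever one the search finds first, so the result can depend on atom numbering; a real fingerprinter would need a canonical tie-break and would usually encode bond orders as well.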

Nina's examples of structures that cause the CDK hashed Fingerprinter to fail:

        "InChI=1S/C23H30/c24-15(25-24)4-2-1-3(2)6(1,2,4,15)5(2,4,15,36(15)32(15)45(4,5,15,36)43(4,5,15)28(4)15)7(1,2,3,4,6,15,27(2)41(1,2,7)35(1,2)7)12(1,2,3,6)10-11(12)13(1,3,10,12,30-31(13)46(3,12,13,30)42(1,3,12)13)19(3,10,11,12)9(6)8(4,5,6)14(4,5,6,9,15)17(8,9,19)16(9,10,11,19)18(8,9,14,17,19,47(8,9,14,17)48(8,9,14)37(8,14)38(8,14)48)20(3,9,10,11,12,13,16,17,19)21(10,11,13,19)22(10,11,19,20,26-39(10,21)22,33(21)34(21)22,40(21)44(11,21,22)49(10,11,21,22)40)23(9,10,11,16,17,18,19,20,21)29(16)52(16,17,18,23)50(16,17,18,23)51(16,17,18,23)52/h28H",
        "InChI=1S/C4Cl12/c5-1-2(5)3(1)4(1,2)7(3)8(3,4)12(1,2,3,4)10(1,2,3,4)6(1,2,5,14(1,2,3,4,5,10,12)16(1,2,3,4,7,8,10)12)13(1,2,3,4,5)9(1,2,3,4,5)11(1,2,3,4,7,13)15(1,2,3,4,7,8,9)13",
        "InChI=1S/C30H28N6O6S4/c37-27-23(54(27)55(23)27)25(27,60(23,27)61(23,27)37)11-12(25)24(11,43-44-25)14-19(9-6-2-3(6,48-2,51(2)6)8(2,6,9)17(6,9,19,52(8)9)28(9,14,19,24,63(8,9,17)19)34(14,17,19,24,57(14,19)28)36(11,12,14,24,28,66(14,24,28)34)40(11,12,24)33(11,12,23,25,27,59(23,25)27)41(11,12,25,36)40)21-15-4-1(47-4)5(4,15,50(1)4)16(4,15,21)29(15,21)22(21)26(29)10-7-18(10)13-20(18,42(13,49-13)58(13)64(13,20,42)65(13,20,42)58)31(7,10,18,38(7)18,53(18)20)46(18,20)45(10,26)39(7,10,26)32(7,10,22,26)35(21,22,26,29,56(22)32)30(15,16,21,22,26,29,62(5,15,16)29,67(15,16,26)29)68(21,22,29)35/h1-2,14,20,22,37H",
        "InChI=1S/C76H68/c77-10(5-15-17(5)27(15,107(15,17)81-17)21-22-25(15,21,27,99(21)27)26(21,22)16(5)18(5)28(16,22,26,100(22)26)108(16,18)82-18)19(77)20(10,78-10)29(19)30(19,20)41(29)31-32(41)46-40-48(32,46,86-106(46)48)58(29,30,31,32,41,102(32)48,110(29,30)41)57(29,30,31,32,41,109(29,30)41)47(31,101(31)57)39-45(31,47,105(47)85-47)53(39)65(39,45,125(39,53,89-53)123(39,45,47)53)66(40,46)54(40,46,90-126(40,54,66)124(40,46,48)54)70(65,66)61-35-36(61,62(35,61,70)69(53,61,65,66,70)72(61,62,65,66,70,129(61,62,69,93-61)133(61,65,69,72)121(53,65)69,130(61,62,70,94-62)134(62,66,70,72)122(54,66)70)76(39,40,45,46,53,54,65,66,69,70)73(31,32,39,40,41,45,46,47,48,57)58)42(35)49-33-37-43(33,49)51(33,49,87-117(33,51)113(33,43)51)59(35,36,42,49,103(49)51,111(35,36)42)50(42,49)34-38-44(34,50)52(34,50,88-118(34,52)114(34,44)52)60(35,36,42,49,50,59,104(50)52,112(35,36)42)75(33,34,42,43,44,49,50,51,52,59)74(33,34,37,38,43,44)63(37,43,115(37,43)83-37)64(38,44,74,116(38,44)84-38)67(37,63,74)55-2-1(4-6-8(11(4,6)79-97(6)11)13-14-9-7(4,12(4,9)80-98(7)12)24(9,13,14,96(9)14)23(6,8,13,14)95(8)13)3(2,55)56(2,55,67)68(38,55,63,64,67,74)71(55,56,63,64,67,74,127(55,56,67,91-55)131(55,63,67,71)119(37,63)67)128(55,56,68,92-56)132(56,64,68,71)120(38,64)68/h8-9,13-14,19-22,27-28H",
        "InChI=1S/C12H37O3/c16-5-1-3-6(1,5)9(1,3,5,16,21-39(9)22-9,10(1,3,5,6,17-5,29(1)30(1)10,41(1,5)18-5,26-48(1,3,6,9)10)43(6)37(6)44(6,10)43)13(3)4-2-7(33(5)34(5)7)8(2,4,13)11(2,4,7,13,23-40(11)24-11,35(7)36(7)11,14(3,4,6,9,13,27(3)6)15(3,4,8,9,11,13)28(4)8)12(2,4,7,8,19-7,31(2)32(2)12,42(2,7)20-7,25-47(2,4,8)12)45(8)38(8)46(8,12)45/h1-4H",
        "InChI=1S/C18H44/c19-45-10-6-7(10,33(6)34(6)7,35(6)36(6)7)11(6,10)14(10,45)3-5(14,31(3)32(3)5)8(3,10,14)15(3,5,14,22-5,44(3,5,8)14,17(6,7,8,10,11,14,45)18(6,7,8,10,11,14,23-8,24-8,38(10)17,48(6,7,10,11,17)49(6,7,10,11,17)18)39(11)50(8,10,11,17,18)52(8,10,11,14,17,18)42(11,17)25-11)4(14,41(14,15)45)9(3,5,15)1-2(9,27(1)28(1)2,29(1)30(1)2)12(1,3,5,9,15)13(1,2,4,9,15,20-4,21-4)16(1,2,3,4,5,9,12,14,15,37(9)13,46(1,2,9,12)13)43(12,26-12)51(4,9,12,13,15,16)47(4,9,12,13,16)40(12)13/h1-8,14,19H",
        "InChI=1S/C32.C18H13PS2.C10H15.C2Cl2.2CF3O3S.2Ru/c1-6-9(1)18(1,6)2-7-8(2)15(7)3-5-4-16-10-11(16)12(10)19-13-14(19)17(6,13,19)21(13,14,19)20(9,18)22(15)23(16)24(3,4,5,22)27(3,5,15,22,23)26(2,7,8,18,20,22)25(1,6,9,18,20,21,32(7,8,15,20,22,24,26)27)31(13,14,17,19,21,23)29(10,11,12,16,23,24)28(4,5,16,22,23,24,27)30(10,11,12,19,21,23,29)31;20-17-12-6-4-10-15(17)19(14-8-2-1-3-9-14)16-11-5-7-13-18(16)21;1-6-7(2)9(4)10(5)8(6)3;3-1-2(3)4-1;2*2-1(3,4)8(5,6)7;;/h;1-13H;1-5H3;;;;;",
        "InChI=1S/C8H16Cl16/c25-1-3-5(1)7(1,3,25,29(3)5)11(1)9(1,19(1,3,5,7,11)17(3,5,7)21(1,3,5,7,19,33(3,5)17,37(5,17)39(3,5,17)21)23(1,3,5,7,9,11,17,19,27(1)19)31(1,7)25)15(3,5,7,11)13(3,5,7,35(3,7)15)14-4-2-6(4,14)8(2,4,14,30(4)6)12(2)10(2,16(4,6,8,12,14)36(4,8)14)20(2,4,6,8,12)18(4,6,8)22(2,4,6,8,20,34(4,6)18,38(6,18)40(4,6,18)22)24(2,4,6,8,10,12,18,20,28(2)20)32(2,8)26(2)8"

I resolved this with a new shortest-path-based fingerprinter.
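
A rough way to see why this helps on densely bonded structures is to compare how many paths each approach has to consider. The sketch below is an illustration only: `PathCountSketch` and its methods are hypothetical names, and a complete graph stands in for an extreme case of a highly connected structure. It counts the simple paths an exhaustive depth-first enumeration would visit, next to the number of atom pairs, which bounds the work when only one shortest path per pair is hashed.

```java
public class PathCountSketch {

    /** Adjacency lists of a complete graph: every atom bonded to every other atom. */
    static int[][] completeGraph(int n) {
        int[][] adjacency = new int[n][n - 1];
        for (int i = 0; i < n; i++) {
            int k = 0;
            for (int j = 0; j < n; j++) {
                if (j != i) {
                    adjacency[i][k++] = j;
                }
            }
        }
        return adjacency;
    }

    /**
     * Counts simple paths with at most maxBonds bonds, starting from every atom.
     * Paths are counted per direction, as a naive depth-first enumeration visits them.
     */
    static long countSimplePaths(int[][] adjacency, int maxBonds) {
        long total = 0;
        boolean[] onPath = new boolean[adjacency.length];
        for (int start = 0; start < adjacency.length; start++) {
            total += extend(adjacency, start, maxBonds, onPath);
        }
        return total;
    }

    static long extend(int[][] adjacency, int atom, int bondsLeft, boolean[] onPath) {
        long count = 1; // the path that ends at 'atom'
        if (bondsLeft == 0) {
            return count;
        }
        onPath[atom] = true;
        for (int next : adjacency[atom]) {
            if (!onPath[next]) {
                count += extend(adjacency, next, bondsLeft - 1, onPath);
            }
        }
        onPath[atom] = false;
        return count;
    }

    /** Unordered atom pairs, including each atom paired with itself. */
    static long countAtomPairs(int atoms) {
        return (long) atoms * (atoms + 1) / 2;
    }

    public static void main(String[] args) {
        for (int n = 4; n <= 9; n++) {
            System.out.println(n + " atoms: "
                    + countSimplePaths(completeGraph(n), n - 1) + " simple paths, "
                    + countAtomPairs(n) + " atom pairs");
        }
    }
}
```

The first count grows factorially with the number of atoms in this extreme case, while the second grows only quadratically.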

Here is the output for the eight structures above, in the same order.

Counter 1 Atoms 52 Bonds 194 Elapsed time 1146 Bitset {17, 87, 213, 228, 249, 306, 453, 503, 623, 753, 797, 831, 841, 858, 1000}

Counter 2 Atoms 16 Bonds 72 Elapsed time 31 Bitset {475, 503, 523, 623, 658, 728, 753, 841, 858, 1000}

Counter 3 Atoms 68 Bonds 197 Elapsed time 951 Bitset {0, 4, 7, 11, 12, 17, 18, 19, 20, 21, 34, 38, 41, 51, 58, 59, 60, 68, 72, 79, 87, 98, 101, 105, 111, 112, 113, 120, 121, 124, 132, 133, 137, 141, 144, 149, 152, 157, 161, 163, 168, 173, 179, 180, 193, 198, 213, 216, 226, 227, 228, 229, 234, 242, 248, 249, 253, 255, 256, 258, 261, 266, 274, 280, 284, 291, 293, 298, 303, 304, 305, 306, 311, 327, 350, 356, 364, 367, 379, 387, 400, 405, 408, 410, 411, 423, 429, 433, 437, 439, 442, 443, 444, 453, 461, 463, 467, 469, 478, 481, 482, 493, 496, 503, 504, 505, 508, 512, 528, 534, 536, 547, 556, 557, 558, 561, 566, 569, 583, 587, 591, 593, 595, 608, 611, 618, 619, 623, 625, 630, 633, 637, 639, 640, 641, 645, 646, 661, 665, 667, 668, 669, 674, 676, 685, 689, 696, 697, 702, 719, 730, 735, 743, 747, 748, 753, 756, 760, 763, 767, 769, 771, 774, 775, 778, 784, 794, 797, 798, 809, 813, 814, 819, 820, 822, 831, 834, 835, 838, 841, 844, 846, 847, 849, 854, 855, 856, 857, 858, 867, 868, 869, 871, 875, 877, 885, 888, 890, 891, 894, 895, 896, 900, 905, 906, 909, 911, 915, 916, 919, 928, 932, 933, 934, 937, 942, 943, 947, 950, 958, 960, 961, 969, 971, 976, 977, 985, 989, 999, 1000, 1008, 1015}

Counter 4 Atoms 134 Bonds 402 Elapsed time 11416 Bitset {6, 7, 10, 12, 13, 14, 17, 22, 28, 42, 44, 47, 48, 49, 52, 58, 62, 67, 79, 81, 82, 86, 87, 93, 119, 120, 132, 133, 134, 137, 139, 141, 152, 157, 162, 167, 169, 170, 173, 179, 183, 185, 188, 189, 193, 195, 198, 200, 201, 203, 206, 207, 213, 214, 224, 228, 234, 240, 246, 249, 254, 257, 261, 272, 280, 285, 290, 293, 298, 300, 302, 306, 311, 319, 320, 321, 341, 352, 354, 364, 368, 394, 407, 419, 421, 424, 434, 443, 444, 445, 446, 447, 448, 449, 453, 456, 459, 467, 468, 470, 473, 477, 484, 492, 494, 501, 503, 506, 509, 512, 518, 522, 525, 528, 540, 542, 550, 552, 564, 566, 567, 570, 571, 575, 581, 586, 592, 595, 603, 605, 610, 611, 613, 614, 617, 618, 623, 624, 630, 634, 637, 638, 644, 645, 650, 652, 659, 663, 669, 671, 673, 674, 677, 679, 683, 685, 686, 693, 705, 713, 718, 723, 732, 734, 736, 738, 747, 748, 753, 755, 761, 763, 764, 765, 770, 771, 772, 775, 777, 780, 783, 784, 792, 797, 807, 808, 809, 812, 824, 826, 831, 833, 838, 841, 844, 846, 848, 850, 851, 855, 856, 858, 861, 869, 872, 881, 893, 894, 895, 900, 908, 913, 922, 923, 924, 927, 932, 933, 934, 947, 949, 950, 952, 956, 960, 963, 966, 969, 984, 987, 989, 996, 1000, 1014, 1018, 1022, 1023}

Counter 5 Atoms 48 Bonds 127 Elapsed time 186 Bitset {17, 38, 79, 139, 179, 183, 189, 191, 228, 248, 267, 306, 329, 366, 372, 405, 411, 425, 453, 459, 472, 503, 528, 588, 612, 616, 623, 753, 797, 839, 841, 858, 882, 895, 926, 930, 951, 954, 977}

Counter 6 Atoms 52 Bonds 177 Elapsed time 344 Bitset {17, 23, 87, 191, 213, 228, 249, 306, 372, 425, 453, 503, 537, 588, 623, 706, 753, 797, 804, 831, 841, 858, 895, 913, 1000}

Counter 7 Atoms 85 Bonds 157 Elapsed time 81 Bitset {2, 9, 27, 28, 41, 45, 53, 57, 79, 80, 87, 97, 102, 103, 105, 110, 123, 131, 135, 139, 144, 153, 179, 182, 183, 185, 203, 213, 217, 223, 225, 228, 231, 243, 246, 248, 253, 254, 276, 280, 283, 285, 288, 304, 313, 320, 329, 352, 366, 373, 375, 380, 386, 391, 392, 394, 396, 401, 409, 412, 424, 427, 431, 443, 444, 446, 453, 469, 475, 477, 479, 493, 503, 509, 512, 518, 522, 528, 530, 537, 539, 560, 571, 575, 579, 581, 594, 598, 601, 612, 615, 617, 623, 629, 644, 650, 656, 658, 662, 666, 676, 689, 691, 744, 746, 753, 777, 784, 805, 813, 841, 858, 863, 871, 878, 891, 895, 896, 904, 917, 947, 952, 967, 977, 987, 994, 1000, 1019, 1020, 1023}

Counter 8 Atoms 40 Bonds 145 Elapsed time 178 Bitset {17, 38, 43, 58, 65, 94, 100, 103, 128, 143, 228, 256, 264, 269, 288, 304, 306, 326, 349, 387, 389, 393, 397, 417, 446, 459, 469, 474, 475, 480, 490, 492, 493, 503, 517, 522, 523, 532, 567, 583, 594, 617, 623, 627, 644, 651, 653, 658, 668, 673, 677, 680, 693, 708, 724, 728, 746, 753, 787, 797, 839, 841, 852, 857, 858, 878, 879, 895, 927, 928, 936, 939, 954, 960, 1000, 1023}