Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Weighted path improvements #817

Merged
merged 10 commits into from Feb 6, 2022
Merged

Weighted path improvements #817

merged 10 commits into from Feb 6, 2022

Conversation

johnmay
Copy link
Member

@johnmay johnmay commented Feb 4, 2022

Additional test cases for WeightedPathDescriptor and performance optimisations. Working out all paths is still going to be slow in general but we can do much better than the currently implementation. The first two commits clean up tests.

  • 47a4f94: Current
  • b1d4d10: Remove temporary double[] array, we can sum as we go
  • a6e8fdc: Only compute the paths once and reuse the results
  • d45c170: Faster path gathering, rather than find all paths between "a" and "a1" we can instead just generate all paths from a
  • 7a039a7: We don't actually need to store the paths and can just weigh/process them as we find them
  • 39fab1b: By collecting the bond paths we can precompute the weight of each bond and do less Math.sqrt() ops.
  • bc05769: We do even better by computing the weights as we go along

Times below in milliseconds based on the example molecules provided in #672.
Having rewritten I think there may be a slight bug in the path summing where by unique paths vs non-unique paths are summed.

Since people are already using this descriptor and the original paper was for a canonical label rather I think this is OK to leave as it is.

Will wait to see if SonarCloud flags anything.

47a4f94 (ms) b1d4d10 (ms) a6e8fdc (ms) d45c170 (ms) 7a039a7 (ms) 39fab1b (ms) bc05769 (ms) Mol No
26293 20398 15681 892 291 251 80 1
30921 24008 18041 792 297 227 44 2
9237 7420 6286 255 94 75 20 3
7717 5858 4596 127 80 64 15 4
6375 4915 3925 324 66 52 12 5
4458 3383 2938 86 54 43 11 6
11461 8938 6985 415 98 77 18 7
7548 5713 4512 347 80 62 14 8
9217 6836 5676 161 103 81 17 9
11109 8464 6673 384 95 76 17 10
6644 5208 4404 375 69 56 13 11
2919 2225 1819 54 30 24 6 12
6498 4710 4057 136 75 59 14 13
5717 4380 3619 345 62 50 12 14
2725 2137 2139 53 33 26 7 15
1499 1172 1006 26 17 13 3 16
1 | C[C@H]1[C@@H]([C@H]([C@H]([C@@H](O1)OC[C@@H]2[C@H]([C@@H]([C@H]([C@@H](O2)OC(=O)[C@]34CCC(C[C@H]3C5=CCC6[C@]7(CC[C@@H](C(C7CC[C@]6([C@@]5(C[C@H]4O)C)C)(C)C)O[C@H]8[C@@H]([C@H]([C@@H]([C@H](O8)CO[C@H]9[C@@H]([C@H]([C@H](CO9)O)O)O[C@H]1[C@@H]([C@H]([C@@H](CO1)O)O)O)O)O)O)C)(C)C)O[C@H]1[C@@H]([C@@H]([C@H]([C@@H](O1)C)O[C@H]1[C@@H]([C@H]([C@@H](CO1)O)O[C@H]1[C@@H]([C@H]([C@@H](CO1)O)O)O)O)O)O)O)O)O)O)O
2 | C[C@H]1[C@@H]([C@H]([C@H]([C@@H](O1)O[C@@H]2[C@H]([C@@H]([C@H](O[C@H]2OC(=O)[C@]34CCC(C[C@H]3C5=CCC6[C@]7(CC[C@@H](C(C7CC[C@]6([C@@]5(C[C@H]4O)C)C)(C)C)O[C@H]8[C@@H]([C@H]([C@@H]([C@H](O8)CO[C@H]9[C@@H]([C@H]([C@H](CO9)O)O)O[C@H]1[C@@H]([C@H]([C@@H](CO1)O)O)O)O)O)O)C)(C)C)CO)O)O)O[C@H]1[C@@H]([C@H]([C@@H]([C@H](O1)CO)O)O)O)O)O[C@H]1[C@@H]([C@H]([C@@H](CO1)O)O[C@H]1[C@@H]([C@H]([C@@H](CO1)O)O)O)O
3 | C/C=C(/C)\C(=O)O[C@H]1[C@@H](C(CC2[C@@]13[C@@H](C[C@@]4([C@@]2(CCC5[C@]4(CCC6[C@@]5(CC[C@@H](C6(C)C)O[C@H]7[C@@H]([C@H]([C@@H]([C@H](O7)C(=O)O)O)O[C@H]8[C@@H]([C@H]([C@H]([C@H](O8)CO)O)O)O[C@H]9[C@@H]([C@@H]([C@H]([C@@H](O9)C)O)O)O[C@H]1[C@@H]([C@H]([C@@H](CO1)O)O)O)O[C@H]1[C@@H]([C@H]([C@H]([C@H](O1)CO)O)O)O)C)C)O[C@H]3O)C)O)(C)C)OC(=O)/C(=C\C)/C
4 | C[C@H]1[C@@H]([C@H]([C@H]([C@@H](O1)O[C@@H]2[C@H]([C@@H]([C@H](O[C@H]2O[C@H]3CC[C@]4(C(C3(C)C)CC[C@@]5(C4CC=C6[C@]5(C[C@H]([C@@]7([C@H]6CC(CC7)(C)C)C(=O)O[C@H]8[C@@H]([C@H]([C@H](CO8)O)O)O[C@H]9[C@@H]([C@H]([C@H]([C@@H](O9)C)O[C@H]1[C@@H]([C@H]([C@@H]([C@H](O1)CO)O)O)O)O)O)O)C)C)C)CO)O[C@H]1[C@@H]([C@H]([C@@H](CO1)O)O)O)O)O)O)O
5 | C[C@@H]1[C@H]([C@@H]([C@@H]([C@H](O1)O[C@H]2[C@@H](CO[C@H]([C@@H]2O)O[C@H]3[C@@H](O[C@H]([C@@H]([C@@H]3O[C@H]4[C@@H]([C@](CO4)(CO)O)O)O)O[C@@H]5[C@H]([C@H](CO[C@H]5OC(=O)[C@@]67CC[C@@]8(C(=CCC9[C@]8(C[C@H](C1[C@@]9(C[C@H]([C@H]([C@@]1(C)CO)O[C@H]1[C@@H]([C@H]([C@@H]([C@H](O1)CO)O)O)O)O)C)O)C)[C@@H]6CC(CC7)(C)C)C)O)O)C)O)O)O)O
6 | C[C@@H]1CC[C@@]2([C@H]([C@H]3[C@@H](O2)[C@H]([C@@H]4[C@@]3(CC[C@H]5[C@H]4CC[C@@H]6[C@@]5(C[C@H]([C@@H](C6)O[C@H]7[C@@H]([C@H]([C@H]([C@H](O7)CO)O[C@H]8[C@@H]([C@H]([C@@H]([C@H](O8)CO)O)O[C@H]9[C@@H]([C@H]([C@@H](CO9)O)O)O)O[C@H]2[C@@H]([C@H]([C@H]([C@H](O2)CO)O)O[C@H]2[C@@H]([C@H]([C@@H]([C@H](O2)CO)O)O)O)O)O)O)O)C)C)O)C)OC1
7 | C[C@@H]1[C@@H]([C@@H]([C@H]([C@@H](O1)OC(=O)[C@]23CCC(C[C@H]2C4=CCC5[C@]6(CC[C@@H]([C@@](C6CC[C@]5([C@@]4(C[C@H]3O)C)C)(C)C=O)O[C@H]7[C@@H]([C@H]([C@@H]([C@H](O7)C(=O)O)O)O[C@H]8[C@@H]([C@H]([C@@H](CO8)O)O)O)O[C@H]9[C@@H]([C@H]([C@H]([C@H](O9)CO)O)O)O)C)(C)C)OC1C(C(C(C(O1)CO)O)O)O)O)OC1C(C(C(C(O1)C)OC1C(C(C(CO1)O)O)O)O)O
8 | C[C@H]1[C@@H]([C@H]([C@H]([C@@H](O1)O[C@@H]2[C@H]([C@@H]([C@H](O[C@H]2O[C@H]3CC[C@]4(C(C3(C)C)CC[C@@]5(C4CC=C6[C@]5(CC[C@@]7([C@H]6CC(CC7)(C)C)C(=O)O[C@H]8[C@@H]([C@H]([C@H](CO8)O)O)O[C@H]9[C@@H]([C@H]([C@H]([C@@H](O9)C)O[C@H]1[C@@H]([C@H]([C@@H]([C@H](O1)CO)O)O)O)O)O)C)C)C)CO)O[C@H]1[C@@H]([C@H]([C@@H](CO1)O)O)O)O)O)O)O
9 | C[C@H]1[C@@H]([C@H]([C@H]([C@@H](O1)O[C@@H]2[C@H]([C@H](CO[C@H]2OC(=O)[C@]34CCC(C[C@H]3C5=CCC6[C@]7(C[C@@H]([C@@H]([C@@](C7[C@@H](C[C@]6([C@@]5(C[C@H]4O)C)C)O)(C)CO)O[C@H]8[C@@H]([C@H]([C@@H]([C@H](O8)CO)O[C@@H]9[C@@H]([C@](CO9)(CO)O)O)O)O)O)C)(C)C)O)O)O)O)O[C@H]1[C@@H]([C@H]([C@@H](CO1)O)O[C@H]1[C@@H]([C@](CO1)(CO)O)O)O
10 | C[C@@H]1[C@@H]([C@@H]([C@H]([C@@H](O1)OC(=O)[C@]23CCC(C[C@H]2C4=CCC5[C@]6(CC[C@@H]([C@@](C6CC[C@]5([C@@]4(C[C@H]3O)C)C)(C)C=O)O[C@H]7[C@@H]([C@H]([C@@H]([C@H](O7)C(=O)O)O)O[C@H]8[C@@H]([C@H]([C@@H](CO8)O)O)O)O[C@H]9[C@@H]([C@H]([C@H]([C@H](O9)CO)O)O)O)C)(C)C)O)OC1C(C(C(C(O1)C)O)O)O)OC1C(C(C(C(O1)C)OC1C(C(C(CO1)O)O)O)O)O
11 | C/C=C(/C)\C(=O)O[C@H]1CC(CC2[C@@]13[C@@H](C[C@@]4([C@@]2(CCC5[C@]4(CCC6[C@@]5(CC[C@@H](C6(C)C)O[C@H]7[C@@H]([C@H]([C@@H]([C@H](O7)C(=O)O)O)O[C@H]8[C@@H]([C@H]([C@H]([C@H](O8)CO)O)O)O[C@H]9[C@@H]([C@@H]([C@H]([C@@H](O9)C)O)O)O[C@H]1[C@@H]([C@](CO1)(CO)O)O)O[C@H]1[C@@H]([C@H]([C@H]([C@H](O1)CO)O)O)O)C)C)O[C@H]3O)C)O)(C)C
12 | C[C@]12CC[C@@H]([C@@](C1CC[C@@]3(C2CC=C4[C@]3(C[C@H]([C@@]5([C@H]4CC(CC5)(C)C)C(=O)O[C@H]6[C@@H]([C@H]([C@@H]([C@H](O6)CO[C@H]7[C@@H]([C@H]([C@@H]([C@H](O7)CO[C@H]8[C@@H]([C@H]([C@H]([C@H](O8)CO)O)O)O)O)O)O)O)O[C@H]9[C@@H]([C@H]([C@@H]([C@H](O9)CO)O)O)O)O)O)C)C)(C)C(=O)O)O[C@H]1[C@@H]([C@H]([C@@H](CO1)O)O)O
13 | C[C@H]1[C@H]2[C@H](CC3[C@@]2(CCC4[C@H]3CC=C5[C@@]4(CC[C@@H](C5)O[C@H]6[C@@H]([C@H]([C@@H]([C@H](O6)CO)O[C@H]7[C@@H]([C@@H]([C@H]([C@@H](O7)C)O)O)O)O)O[C@H]8[C@@H]([C@H]([C@@H]([C@H](O8)CO)O)O[C@H]9[C@@H]([C@H]([C@@H]([C@H](O9)CO)O)O)O)O)C)C)O[C@]11CC[C@@](O1)(C)CO[C@H]1[C@@H]([C@H]([C@@H]([C@H](O1)CO)O)O)O
14 | CC(=O)O[C@H]1CC(CC2[C@]13CO[C@@]24CCC5[C@]6(CC[C@@H]([C@@](C6CC[C@]5([C@@]4(C[C@H]3O)C)C)(C)CO)O[C@H]7[C@@H]([C@H]([C@H](CO7)O[C@H]8[C@@H]([C@H]([C@@H]([C@H](O8)CO)O)O)O[C@H]9[C@@H]([C@H]([C@@H](CO9)O[C@H]1[C@@H]([C@H]([C@@H]([C@H](O1)CO)O)O)O)O)O)O)O[C@H]1[C@@H]([C@H]([C@@H]([C@H](O1)CO)O)O)O)C)(C)C
15 | C[C@H]1[C@@H]([C@H]([C@H]([C@@H](O1)O[C@@H]2[C@H]([C@H]([C@H](O[C@H]2O[C@H]3[C@@H]([C@H](O[C@H]([C@@H]3O[C@H]4[C@@H]([C@H]([C@@H]([C@H](O4)CO)O[C@H]5[C@@H]([C@H]([C@@H](CO5)O)O)O)O)O)O[C@H]6CC[C@@]7(C8CC[C@@]91C2CC(CC[C@]2(CO9)[C@@H](C[C@]1([C@@]8(CCC7C6(C)C)C)C)O)(C)C)C)C(=O)O)O)CO)O)O)O)O)O
16 | C[C@@H]1[C@H]([C@@H]([C@H]([C@@H](O1)O[C@@H]2[C@H](O[C@H]([C@@H]([C@H]2O)O)OC[C@@H]3[C@H]([C@@H]([C@H]([C@@H](O3)OC(=O)[C@@]45CC[C@@]6(C(=CCC7[C@]6(CCC8[C@@]7(C[C@@H]([C@@H]([C@@]8(C)C(=O)O)O)OC(=O)C)C)C)[C@@H]4CC(CC5)(C)C)C)O[C@H]9[C@@H]([C@H]([C@@H]([C@H](O9)CO)O)O)O)O)O)CO)O)O)O



@johnmay
Copy link
Member Author

johnmay commented Feb 4, 2022

Aight, all good now. Pushed one extra one where cdk-dict couldn't see a file which was in cdk-extra.

@sonarcloud
Copy link

sonarcloud bot commented Feb 4, 2022

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

0.0% 0.0% Coverage
0.0% 0.0% Duplication

@egonw egonw merged commit af6a139 into master Feb 6, 2022
@egonw
Copy link
Member

egonw commented Feb 6, 2022

Going to trust you on the updated numbers.

@johnmay johnmay deleted the weighted-path-improvements branch February 9, 2022 15:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants