Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Confgen: add option to use symmetry when doing RMS pruning #3813

Merged
merged 4 commits into from
Feb 24, 2021

Conversation

greglandrum
Copy link
Member

This adds an option (and makes it the default) to use symmetry when doing RMS pruning (the equivalent of using getBestRMS()) in the conformation generation code.

In order to avoid "uh oh" moments where the symmetry pruning takes an incredibly long time, setting this option causes Hs to be ignored while doing the RMSD calculations (ignoring Hs is the default anyway).

@@ -1050,6 +1051,49 @@ void embedHelper_(int threadId, int numThreads, EmbedArgs *eargs,
delete positions[i];
}
}

std::vector<std::vector<unsigned int>> getMolSelfMatches(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be useful to make this a utility class, something like:

bool symmetry=true;
RMSDSet symm(mol, symmetry);
if (symm.rmsd(conf)) < 0.5) {
   symm.add(conf);
}

Perhaps not for this PR.

if (ssr < ssrThres) {
return false;
for (const auto &match : selfMatches) {
for (auto confi = mol.beginConformers(); confi != mol.endConformers();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should you loop through confs -> matches or matches -> confs?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it matters? Am I missing something?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The self matches are contiguous in memory and will most likely be faster to access in the inner loop.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, makes sense.
Given the call to the alignment code that's in the loop, I think this is unlikely to be a bottleneck here. I can change it if you think it'll make a difference

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes my premature optimization sensor happy.

@bp-kelley
Copy link
Contributor

I don't see any red flags here. This does seem suited for a help class for RMSD which could be generally useful.

@bp-kelley
Copy link
Contributor

@greglandrum Fix the conflicts, and I'll go ahead and merge. I'll test the speed and if there is any hint of a speedup all do a PR

@greglandrum greglandrum merged commit 9d4ac58 into rdkit:master Feb 24, 2021
@greglandrum greglandrum deleted the feat/use_best_rms_in_confgen branch February 24, 2021 07:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants