Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix boolean operator in selections to enable using them with within #467

Merged
merged 2 commits into from Sep 9, 2020

Conversation

Luthaf
Copy link
Contributor

@Luthaf Luthaf commented Aug 31, 2020

Fix #371 using GLModel.selectedAtoms to get the set of matching atom for each individual sub-selection, and atom.serial to uniquely identify atoms when merging the sub-selections. Let me know if there is a better way to implement this (maybe directly inside of selectedAtoms?)

I am not sure how I can add tests for this. The linked issue refers to files in tests/auto, and these seems to do visual comparison of the results. My issue is that these tests don't pass on my browser (Firefox 80 on macOS 10.14.6, Intel integrated GPU), so I don't know how to make sure the test I add will work. I manually checked that this works fine for my simple use case (adding the central atom to the within selection), but I'm happy to add more tests.


Regarding code style, I used let instead of var to get proper variable scoping, which should be universally supported nowadays (https://caniuse.com/#feat=let). Let me know if that's an issue, I can try to make it work with var instead.

Copy link
Contributor

@dkoes dkoes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please redo this without introducing a bunch of whitespace changes? In addition to breaking the formatting, I can't find where the actual code change is.

Reference test images are generated using Chrome on Ubuntu Linux, but even using the same software, images can sometimes differ with different graphics drivers, which is why the image comparisons can be sorted by degree of change so they can be visually assessed.

A test case would be really helpful (include an image generated with Chrome if possible) so I can see what the goal is.

Thanks!

@Luthaf
Copy link
Contributor Author

Luthaf commented Aug 31, 2020

Can you please redo this without introducing a bunch of whitespace changes? In addition to breaking the formatting, I can't find where the actual code change is.

Done. I had separated the automated whitespace changes in its own commit to allow commit-by-commit review, so removing them was easy =)

I'll try to generate images this week when I get to a Linux machine.

@@ -1791,26 +1791,50 @@ $3Dmol.GLModel = (function() {
}
}else{//"or" and "and"
if(key == "and"){
var and=sel[key];//array format
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think the goal here is to use selectedAtoms recursively which should handle within etc. statements unlike atomIsSelected. This seems like a good idea.

Using index instead of serial should be more robust (index is suppose to be the position in the atoms array, whereas serial can be set in the PDB file).

Since we are keying on index, the implementation would be more efficient if they were all put in a Set and then the intersection/union taken. This would also support refactoring or/and to eliminate some duplicate code (since the only difference is the set operation). Using set lookup is O(1) vs the current O(n) search.

It would also be a good idea to memoize the result of the selectedAtoms call for even better efficiency.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using index instead of serial should be more robust (index is suppose to be the position in the atoms array, whereas serial can be set in the PDB file).

Is index part of AtomSpec too? If so I can add it to #468.

I agree on the memoization & use of Set, I'll change this to use them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather leave it an internal property.

break;
} else { //"or" and "and"
// cache the evaluation of the selection for better performance
if (sel[key].__cached_results === undefined) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would also be a good idea to memoize the result of the selectedAtoms call for even better efficiency.

This is how I've implemented it, caching the full evaluation of the and/or part of the selection. My only concern here is that the selection could change without the code realizing with something like

const selection = {and: [{serial: 4}, {resn: "ALA"}]};
model.selectedAtoms(selection);

// the selection changes, but we don't know it!
selection.and[0].serial = 67;
model.selectedAtoms(selection);

I can see two ways to solve this: add the hash of the selection with the cached result (something like md5(JSON.stringify(sel)), and check that this did not change since last evaluation; or clean/remove __cached_results when starting a new evaluation of the selection.

The second solution require that there is a single entry point for selection evaluation, which could be GLModel.selectedAtoms. Currently, it looks like users could also use atomIsSelected as an entry point, so I could make this a private function.

As a side note, it looks like atomIsSelected do not accept the same selection elements as selectedAtoms (stuff like within/byres is not supported). Should this function be made private anyway?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than hashing, my preference would be to copy the initial selection object. At little more efficient computationally, a little less efficient memory-wise. We would need to write a helper function that recursively traverses the selection object copying fields as needed (no need to copy if a cache already exists). It could also precompute the cached selections. Then instead of atomIsSelected computing the selections, it will check for the existence of a cache, if it exists it uses it, otherwise it does what it does now. This keeps backwards compatibility but makes the common path more efficient.

What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where would the copy of the selection object take place? In selectedAtoms or in atomIsSelected? I guess selectedAtoms would make the most sense here.

@dkoes dkoes merged commit 6dffe55 into 3dmol:master Sep 9, 2020
@Luthaf
Copy link
Contributor Author

Luthaf commented Sep 9, 2020

Thanks for merging this! I'll do a follow-up PR to deal with the caching/copying of the selection.

Where do you think the copy of the selection object take place? In selectedAtoms or in atomIsSelected? I guess selectedAtoms would make the most sense here.

@Luthaf Luthaf deleted the boolean-ops-selections branch September 9, 2020 22:35
@dkoes
Copy link
Contributor

dkoes commented Sep 9, 2020

Yes, selectedAtoms would be the best place.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

better boolean selection operators
2 participants