FastNS yields wrong results on simple test case #2345
Seems to be similar to #2229. Unfortunately there seems to be a tedious bug when putting the atoms in the sub cells. |
Yeah, I'd seen your issue there before and it's not a stretch to think they're related. Not promising that your issue remains open with no progress so much later though... this seems like a pretty important and fundamental bug. |
@Linux-cpp-lisp fair point – but with no-one getting paid for full-time MDA support, even important issues often wait for someone desperate enough to dig into the code. I can ping @ayushsuhane @richardjgowers @zemanj and see if anyone has any ideas. In the worst case we might have to disable FastNS for now and fall back to PKD. |
@Linux-cpp-lisp can you please make input files (e.g. a GRO or PDB file with the atoms and box information) and code available that we can use for testing and potentially for including with the tests (i.e., must be under GPL v2)? |
@orbeckst of course, and their volunteer work is very (very) appreciated. I realize now that my comment could be read as a lament about support; I only meant that the time the issue had spent open was likely a sign that the underlying bug is non-trivial. The code above in the original post is a self-contained test case; consider this post from me as the author making it available to you under GPL v2. It's interesting to note that if I save the structure (
Which is still wrong, but different. If I load from a CIF file, instead, I get the right results:
I think this is because the cell vectors are rounded after loading in the PDB (someone in the other issue had mentioned rounding as a possible cause of the overall issue):
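As a rough illustration of how much precision a PDB round trip can cost: the CRYST1 record stores box lengths with three decimals and angles with two. The sketch below applies that rounding to a made-up box. The 4.09 Å lattice constant and the cell dimensions derived from it are assumptions for illustration, not values taken from this thread, and `pdb_round_box` is a hypothetical helper, not part of MDAnalysis.

```python
import math


def pdb_round_box(lengths, angles):
    """Round box lengths/angles the way a PDB CRYST1 record stores them.

    CRYST1 uses fixed-width fields: lengths as %9.3f, angles as %7.2f.
    """
    rounded_lengths = tuple(float("{:9.3f}".format(x)) for x in lengths)
    rounded_angles = tuple(float("{:7.2f}".format(x)) for x in angles)
    return rounded_lengths, rounded_angles


# Assumed example box (illustrative only): a 5x5 fcc(111) cell built from
# a lattice constant of 4.09 Angstrom, three layer spacings plus 20 Angstrom
# of vacuum along z, and a 60 degree in-plane angle.
a_lat = 4.09
in_plane = 5 * a_lat / math.sqrt(2)
height = 3 * a_lat / math.sqrt(3) + 20.0
lengths = (in_plane, in_plane, height)
angles = (90.0, 90.0, 60.0)

rounded_lengths, rounded_angles = pdb_round_box(lengths, angles)

# Absolute error introduced per box length by the three-decimal field.
errors = [abs(r - x) for r, x in zip(rounded_lengths, lengths)]
for original, rounded, err in zip(lengths, rounded_lengths, errors):
    print("{:.10f} -> {:.3f} (error {:.2e})".format(original, rounded, err))
```

Each length can shift by up to 0.0005 Å, which is small but may be enough to move an atom sitting near a grid-cell boundary into a different cell.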
|
Thank you! My comment was a bit over-sensitive and defensive – I am not always sure people understand how much work it is to even keep a medium-sized open source project such as MDA in a useable state, let alone improve it and fix bugs. I appreciate your clarifying comment! The more feedback we get, the better. The insight that the precision of the box might be important and that the FastNS algorithm might be very sensitive to something that it shouldn't be overly sensitive to is interesting. Thanks for taking the time to give detailed feedback. I would guess for every user like you there are between 10 and 100 who decide that it's too much effort and move on. This is, of course, a valid choice because everybody is busy. But that's why we value it even more when someone takes the time to submit well-researched bug reports. I just wish we could always and immediately make best use of this information. However, hopefully by triangulating bugs it will become easier to fix them eventually. By any chance, did you try the periodic KD-Tree (https://www.mdanalysis.org/docs/documentation_pages/lib/pkdtree.html) to see if it has similar precision issues? |
@orbeckst: Not at all, and I'm fairly familiar, at least at a small scale, with just how much effort open-source upkeep is. I haven't but I can give that a try when I have a chance, since I need to (at least temporarily) replace |
@Linux-cpp-lisp I just ran the current development branch against your example code with the ASE-generated fcc lattice and got even worse results. I put a couple of
Test code
import numpy as np
from MDAnalysis.lib.distances import distance_array, apply_PBC
from MDAnalysis.lib.mdamath import triclinic_box, triclinic_vectors
from MDAnalysis.lib.nsgrid import FastNS
from ase.build import fcc111
#search radius:
cutoff = 3.2
print("selected cutoff: {}".format(cutoff))
#generate FCC system with ASE:
slab = fcc111('Ag', size=(5, 5, 4), vacuum=10.0)
tcbox = triclinic_box(*slab.cell)
#ASE-generated box vectors:
print("\nASE box vectors:\n"
"[{:15.10f} {:15.10f} {:15.10f}]\n"
"[{:15.10f} {:15.10f} {:15.10f}]\n"
"[{:15.10f} {:15.10f} {:15.10f}]"
"".format(*(slab.cell[0]), *(slab.cell[1]), *(slab.cell[2])))
#MDA box from generated ASE box:
print("\nMDAnalysis box:\n"
"[{:15.10f} {:15.10f} {:15.10f} {:15.10f} {:15.10f} {:15.10f}]"
"".format(*tcbox))
#MDA box vectors:
tcbox_vecs = triclinic_vectors(tcbox)
print("\nMDAnalysis box vectors:\n"
"[{:15.10f} {:15.10f} {:15.10f}]\n"
"[{:15.10f} {:15.10f} {:15.10f}]\n"
"[{:15.10f} {:15.10f} {:15.10f}]"
"".format(*(tcbox_vecs[0]), *(tcbox_vecs[1]), *(tcbox_vecs[2])))
#bruteforce distance search using lib.distances.distance_array:
da = distance_array(slab.positions, slab.positions, box=tcbox)
bf_indices = []
bf_cn = []
for i in range(da.shape[0]):
    # count neighbors within the cutoff, excluding the atom itself (distance 0)
    bf_cn.append(len([ix for ix in range(da.shape[1])
                      if 0.0 < da[i, ix] < cutoff]))
#grid-based distance search using lib.nsgrid.FastNS:
fns = FastNS(cutoff=cutoff, coords=slab.positions, box=tcbox, pbc=True)
ns = fns.self_search()
ns_cn = [len(nsi) for nsi in ns.get_indices()]
print("\nBrute force coordination numbers:\n"
"CNs: {}\n"
"counts: {}".format(*(np.unique(bf_cn, return_counts=True))))
print("\nFastNS coordination numbers:\n"
"CNs: {}\n"
"counts: {}".format(*(np.unique(ns_cn, return_counts=True))))
Test output
Obviously, this is even worse than what you got. Furthermore, while playing around with the test code, I discovered an additional bug:
Test output for a (3,3,4) fcc lattice
Now that there are only two cells in the x- and y-direction,
Sources of Error
Quick(?)-and-dirty solution
Proper solution: complete re-implementation
Currently, there are multiple sources for precision issues in the
A new implementation should therefore incorporate the following changes:
I'm looking forward to everyone's comments/suggestions! |
@zemanj thanks for the detailed analysis. Given the issues that are coming to light with FastNS, should we disable it (or make it explicitly opt-in with a warning?) for the time being and fall back to the slower approaches? |
If nobody finds the time to properly fix this thing before the next release we have to disable it until we have a replacement. I wouldn't expect making broken code opt-in could be valuable for anyone but maybe it's technically easier than throwing it out altogether. |
@orbeckst I'm taking a look - from initial investigations, freud may be mishandling this as well, but I need to check more closely before I make a definitive conclusion. |
@richardjgowers thank you so much for fixing this issue!! Is there a way in code to check if the installed MDAnalysis contains your fix? As in, if I want to write:
if ns_grid_ok:
    # use FastNS
else:
    # use ASE neighbor_list
what would ns_grid_ok be? Also, just out of curiosity, is there any sense of when there will be a PyPI release that includes this fix? Maybe @orbeckst can speak to that? Once again, thanks all for your work on this issue and code. |
@IAlibay the fix will be on 1.1.0, right? Any ETA?
So checking MDAnalysis.__version__ should be enough.
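A minimal sketch of such a version check, assuming the fix does ship in 1.1.0 as discussed here. `version_tuple` and `fastns_has_fix` are hypothetical helper names, not MDAnalysis API, and the parsing only looks at the leading numeric part of the version string.

```python
import re


def version_tuple(version):
    """Return the leading numeric release part, e.g. '1.1.0-dev0' -> (1, 1, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError("unrecognized version string: {!r}".format(version))
    parts = tuple(int(p) for p in match.group().split("."))
    # Pad short versions such as '1.1' so that tuple comparison behaves.
    return parts + (0,) * (3 - len(parts))


def fastns_has_fix(version, fixed_in=(1, 1, 0)):
    """True if `version` is at least the release assumed to contain the fix.

    Caveat: a development build such as '1.1.0-dev0' also passes this test,
    although it may predate the fix.
    """
    return version_tuple(version) >= fixed_in


# Intended use (requires MDAnalysis to be installed):
#   import MDAnalysis
#   ns_grid_ok = fastns_has_fix(MDAnalysis.__version__)
print(fastns_has_fix("1.0.1"), fastns_has_fix("1.1.0"))
```

If the `packaging` library is available, `packaging.version.parse` would be a more robust way to compare versions, including pre-releases.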
|
Yes that's correct.
We have a tentative deadline of this Sunday for new code contributions for 1.1.0. We'll put a code freeze on then and aim to release shortly after (we will need a bit of time to a) check that Windows works again, b) create a final list of max tested dependency versions). |
Expected behavior
FastNS misses neighbors in a perfect 111 FCC slab (an ideal, oversimplified case).
Actual behavior
Instead of identifying 9 or 12 neighbors for every atom, FastNS identifies 9 or 12 for most atoms, with some strange exceptions:
Or see the structure, with atoms colored by coordination number:
Code to reproduce the behavior
(Requires ase.)
The Ag-Ag distance in this system is 2.89Å, so a 3.2Å cutoff is quite sufficient.
The results are identical for cutoffs of 3.0Å and 2.9Å. At a cutoff of 2.5Å, as expected, all atoms have a CN of 0.
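To spell out why these cutoffs should all give the same answer: in an fcc lattice with nearest-neighbor distance d, the first three neighbor shells lie at d, d·√2 and d·√3 and hold 12, 6 and 24 atoms. The sketch below counts the expected neighbors of a bulk atom for the cutoffs mentioned here. `bulk_cn` is a hypothetical helper, and it only knows about the first three shells, so it is meaningful only for cutoffs below 2d. Atoms in the (111) surface layers lack the three neighbors of the missing adjacent layer, which gives the CN of 9.

```python
import math

D_NN = 2.89  # Ag-Ag nearest-neighbor distance in Angstrom, as quoted above


def bulk_cn(cutoff, d_nn=D_NN):
    """Expected neighbor count of a bulk fcc atom for a given cutoff.

    Only the first three fcc shells are listed, so the result is only
    meaningful for cutoffs below 2 * d_nn.
    """
    shells = [
        (d_nn, 12),                 # first shell
        (d_nn * math.sqrt(2), 6),   # second shell (equals the lattice constant)
        (d_nn * math.sqrt(3), 24),  # third shell
    ]
    return sum(count for distance, count in shells if distance < cutoff)


for cutoff in (3.2, 3.0, 2.9, 2.5):
    print("cutoff {:.1f}: bulk CN = {}".format(cutoff, bulk_cn(cutoff)))
```

The second shell sits at roughly 4.09 Å, so any cutoff between 2.89 Å and that distance selects exactly the first shell.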
The results are correct, for example, using pymatgen's (very, very, very slow) CutOffDictNN:
Current version of MDAnalysis
python -V)? Python 3.6.8 :: Anaconda, Inc.