Bug filtering large system of waters #276
Comments
Also |
wow, it's really weird, no? |
It's getting more and more weird.
Have to find out if this is a regression in VMD and if it's also related to the bond guessing. Might also be two separate issues though |
Ok this is a bug in our PSFwriter which overflows fields. Will fix now |
Fixed the PSF writing bug. It was independent of the bond guessing bug, which will need more investigation now. |
This is just bound to be fun:
In [52]: mol = Molecule('./structure.pdb')
In [53]: np.sum(mol.atomselect('water and name O'))
Out[53]: 32229
In [54]: np.sum(mol.atomselect('water'))
Out[54]: 96688
In [55]: 32229*3
Out[55]: 96687
In [56]: mol = Molecule('./structure.prmtop')
In [57]: np.sum(mol.atomselect('water'))
Out[57]: 294849 |
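The mismatch in the session above (96688 selected water atoms against 32229 * 3 = 96687 expected) can be turned into a quick consistency check. This is only a sketch: `water_atom_mismatch` is a hypothetical helper, not part of HTMD, and it assumes a three-site water model and boolean masks like the ones `mol.atomselect` returns in the session.

```python
import numpy as np

def water_atom_mismatch(is_water, is_water_oxygen, atoms_per_water=3):
    """Return (selected water atoms) - (atoms expected from the oxygen count).

    Both arguments are boolean masks over all atoms. A non-zero result means
    the 'water' selection disagrees with the oxygen count.
    atoms_per_water=3 is an assumption (three-site water model).
    """
    expected = int(np.sum(is_water_oxygen)) * atoms_per_water
    return int(np.sum(is_water)) - expected

# Toy system: two waters (O, H, H) plus one extra atom wrongly flagged as water.
is_water = np.array([True, True, True, True, True, True, True])
is_water_oxygen = np.array([True, False, False, True, False, False, False])
print(water_atom_mismatch(is_water, is_water_oxygen))
```

A result of 0 means the two selections agree; the PDB numbers in the session would give 1.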
do you need help, @stefdoerr ? |
The issue is very probably in the C code of the atomselect, so it will be tricky. Feel free to look at it of course but I think it will be quite some work. |
Works fine in VMD
>Main< (test-htmd) 44 % mol new structure.pdb autobonds off
1
>Main< (test-htmd) 45 % set x [atomselect top "water"]
atomselect1
>Main< (test-htmd) 46 % $x num
294849 |
also, we need to create tests for |
Ok found the issue. It's not related to bonds at all since these are apparently not used for the "water" selection (you can do it with 0 coordinates everywhere). It's related to the segids. When we read The issue here is that VMD C code probably assumes that segids are 2 characters as per PDB format. In our case though with all those waters, we have enough TERs to reach triple-digit segids. The only way I can think of to deal with this is to check number of letters in segids and if they go over 2 drop them and throw a warning during atomselect that the segid was not used. |
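A minimal sketch of the workaround proposed above: check the segid lengths and, if any is too long, blank them all and warn before running the selection. `segids_for_selection` and its `max_len` parameter are hypothetical names for illustration, not the actual HTMD code, and the default of 2 only mirrors the two-character assumption stated in the comment.

```python
import logging
import numpy as np

logger = logging.getLogger(__name__)

def segids_for_selection(segids, max_len=2):
    """Return the segids to pass to the atom selection.

    If any segid is longer than max_len characters, every segid is replaced
    by an empty string and a warning is logged, so the selection runs
    without segid information instead of silently misbehaving.
    """
    segids = np.asarray(segids, dtype=object)
    longest = max((len(s) for s in segids), default=0)
    if longest > max_len:
        logger.warning(
            "Segids longer than %d characters found; segids are dropped "
            "for this atom selection.", max_len)
        return np.full(segids.shape, '', dtype=object)
    return segids

print(list(segids_for_selection(['P', 'W1', 'W1'])))    # short segids are kept
print(list(segids_for_selection(['P', 'W100', 'W100'])))  # long segids are blanked
```

Dropping all segids rather than truncating them avoids merging distinct segments under the same shortened name, at the cost of making segid-based selections unusable for that call.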
that sounds like a good idea. it's still weird to me how |
Ok so I was partially wrong about my 2 character assumption. It works up to around 10.000+ segids and then it breaks. I assume we don't work much with such huge systems and hence didn't notice till now. Works great if I just use alternating resids |
Segids or resids? |
Breaks between 20000 and 50000 segids. Go figure. I will stop here trying to find the exact limit. |
@j3mdamas segids |
that is one of the weirdest bugs ever... but how would the alternating resids thing work? |
0 until a change in segid happens then 1 until a change happens etc. |
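One way to read the description above, as a sketch: assign 0 to the first run of identical segids, 1 to the next run, 0 again to the one after, and so on. `alternating_ids` is a hypothetical helper, and treating "alternating" as a strict 0/1 toggle (rather than an ever-increasing counter) is an assumption.

```python
import numpy as np

def alternating_ids(segids):
    """Map each run of identical consecutive segids to 0, 1, 0, 1, ..."""
    segids = np.asarray(segids)
    if segids.size == 0:
        return np.zeros(0, dtype=int)
    # True wherever an atom's segid differs from the previous atom's segid.
    changed = np.concatenate(([False], segids[1:] != segids[:-1]))
    # The running count of changes gives the run index; its parity alternates.
    return np.cumsum(changed) % 2

print(alternating_ids(['A', 'A', 'B', 'C', 'C']).tolist())
```

Only two distinct values are ever produced, so the number of segments in the system no longer matters, while neighbouring segments still get different labels.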
I agree. |
Done, fixed. Pushing it now.
In [2]: mol = Molecule('./structure.pdb')
2017-03-15 17:00:43,136 - htmd.molecule.readers - WARNING - Reading PDB file with more than 99999 atoms. Bond information can be wrong.
In [3]: mol.filter('not water')
2017-03-15 17:00:48,173 - htmd.molecule.vmdparser - WARNING - More than 3 characters were used for segnames. Due to limitations in VMD atomselect segids will be dropped for the atomselection.
2017-03-15 17:00:49,565 - htmd.molecule.molecule - INFO - Removed 294849 atoms. 7414 atoms remaining in the molecule. |
Seems to be related to the bond calculation, since at each successive step it finds more molecules as water and keeps removing them. Found at loro:/tmp/test-htmd/structure.pdb. Works fine with prmtop, I guess because it contains the correct bonds.