issue with the tutorial Chignolin_Coarse-Grained_Tutorial #23

Closed
pojeda opened this issue Mar 11, 2022 · 16 comments

@pojeda

pojeda commented Mar 11, 2022

Hi,

I am trying the Chignolin tutorial, but I get an error at the "PDB to PSF conversion" step:

---------------------------------------------------------------------------
InvalidIndexError                         Traceback (most recent call last)
Input In [2], in <cell line: 6>()
      3 PDB_file = 'data/chignolin_cln025.pdb'
      4 PSF_file = 'data/chignolin_ca_top.psf'
----> 6 pdb2psf_CA(PDB_file, PSF_file, bonds = True, angles = False)

File torchmd_cg/utils/psfwriter.py:7, in pdb2psf_CA(pdb_name_in, psf_name_out, bonds, angles)
      6 def pdb2psf_CA(pdb_name_in, psf_name_out, bonds=True, angles=True):
----> 7     mol = Molecule(pdb_name_in)
      8     mol.filter("name CA")
     10     n = mol.numAtoms

...

File pandas/core/indexes/base.py:5637, in Index._check_indexing_error(self, key)
   5633 def _check_indexing_error(self, key):
   5634     if not is_scalar(key):
   5635         # if key is not a scalar, directly raise an error (the code below
   5636         # would convert to numpy arrays and raise later any way) - GH29926
-> 5637         raise InvalidIndexError(key)

InvalidIndexError: []

Do you know about this issue?

@MaciejMajew
Contributor

Hi! It seems like moleculekit is not able to read the molecule. You should check whether your PDB file is broken.

@MaciejMajew
Contributor

If you can share the file you're trying to convert, I can take a look.

@pojeda
Author

pojeda commented Mar 15, 2022

I attached the file

files.zip

@MaciejMajew
Contributor

I can't reproduce your error. You might be better off just reinstalling the package in a fresh environment. Sorry I cannot be of much help here. Maybe @stefdoerr will know better.

@stefdoerr
Collaborator

I can't reproduce it either. I would also suggest installing a clean env with the latest moleculekit etc.

@pojeda
Author

pojeda commented Mar 18, 2022

I rebuilt torchmd-cg in a new env, but I still get the same problem at the PDB to PSF conversion step. My only guess now is the Python version; I am using 3.9.6. Have you installed and tested torchmd-cg with this Python version?

InvalidIndexError                         Traceback (most recent call last)
Input In [4], in <cell line: 6>()
      3 PDB_file = 'data/chignolin_cln025.pdb'
      4 PSF_file = 'data/chignolin_ca_top.psf'
----> 6 pdb2psf_CA(PDB_file, PSF_file, bonds = True, angles = False)

File /env-torchmd/lib/python3.9/site-packages/torchmd_cg/utils/psfwriter.py:7, in pdb2psf_CA(pdb_name_in, psf_name_out, bonds, angles)
      6 def pdb2psf_CA(pdb_name_in, psf_name_out, bonds=True, angles=True):
----> 7     mol = Molecule(pdb_name_in)
      8     mol.filter("name CA")
     10     n = mol.numAtoms

File /env-torchmd/lib/python3.9/site-packages/moleculekit/molecule.py:299, in Molecule.__init__(self, filename, name, **kwargs)
    296 self.viewname = name
    298 if filename is not None:
--> 299     self.read(filename, **kwargs)

File /env-torchmd/lib/python3.9/site-packages/moleculekit/molecule.py:1147, in Molecule.read(self, filename, type, skip, frames, append, overwrite, keepaltloc, guess, guessNE, _logger, **kwargs)
   1145 for rr in readers:
   1146     try:
-> 1147         mol = rr(fname, frame=frame, topoloc=tmppdb, **kwargs)
   1148     except FormatError:
   1149         continue

File /env-torchmd/lib/python3.9/site-packages/moleculekit/readers.py:1100, in PDBread(filename, mode, frame, topoloc, validateElements, uniqueBonds)
   1098 if "element" in parsedtopo:
   1099     idx, newelem = pdbGuessElementByName(parsedtopo.element, parsedtopo.name)
-> 1100     parsedtopo.at[idx, "element"] = newelem
   1102 for field in topodtypes:
   1103     if (
   1104         field in parsedtopo
   1105         and topodtypes[field] == str
   1106         and parsedtopo[field].dtype == object
   1107     ):

File /env-torchmd/lib/python3.9/site-packages/pandas/core/indexing.py:2274, in _AtIndexer.__setitem__(self, key, value)
   2271     self.obj.loc[key] = value
   2272     return
-> 2274 return super().__setitem__(key, value)

File /env-torchmd/lib/python3.9/site-packages/pandas/core/indexing.py:2229, in _ScalarAccessIndexer.__setitem__(self, key, value)
   2226 if len(key) != self.ndim:
   2227     raise ValueError("Not enough indexers for scalar access (setting)!")
-> 2229 self.obj._set_value(*key, value=value, takeable=self._takeable)

File /env-torchmd/lib/python3.9/site-packages/pandas/core/frame.py:3869, in DataFrame._set_value(self, index, col, value, takeable)
   3867 else:
   3868     series = self._get_item_cache(col)
-> 3869     loc = self.index.get_loc(index)
   3871 # setitem_inplace will do validation that may raise TypeError
   3872 #  or ValueError
   3873 series._mgr.setitem_inplace(loc, value)

File /env-torchmd/lib/python3.9/site-packages/pandas/core/indexes/range.py:388, in RangeIndex.get_loc(self, key, method, tolerance)
    386         except ValueError as err:
    387             raise KeyError(key) from err
--> 388     self._check_indexing_error(key)
    389     raise KeyError(key)
    390 return super().get_loc(key, method=method, tolerance=tolerance)

File /env-torchmd/lib/python3.9/site-packages/pandas/core/indexes/base.py:5637, in Index._check_indexing_error(self, key)
   5633 def _check_indexing_error(self, key):
   5634     if not is_scalar(key):
   5635         # if key is not a scalar, directly raise an error (the code below
   5636         # would convert to numpy arrays and raise later any way) - GH29926
-> 5637         raise InvalidIndexError(key)

InvalidIndexError: []

@stefdoerr
Collaborator

Can you try just the following please and see if it works?

from moleculekit.molecule import Molecule
mol = Molecule('data/chignolin_cln025.pdb')

@pojeda
Author

pojeda commented Mar 18, 2022

The output of those lines is:

---------------------------------------------------------------------------
InvalidIndexError                         Traceback (most recent call last)
Input In [2], in <cell line: 2>()
      1 from moleculekit.molecule import Molecule
----> 2 mol = Molecule('data/chignolin_cln025.pdb')

File /env-torchmd/lib/python3.9/site-packages/moleculekit/molecule.py:299, in Molecule.__init__(self, filename, name, **kwargs)
    296 self.viewname = name
    298 if filename is not None:
--> 299     self.read(filename, **kwargs)

File /env-torchmd/lib/python3.9/site-packages/moleculekit/molecule.py:1147, in Molecule.read(self, filename, type, skip, frames, append, overwrite, keepaltloc, guess, guessNE, _logger, **kwargs)
   1145 for rr in readers:
   1146     try:
-> 1147         mol = rr(fname, frame=frame, topoloc=tmppdb, **kwargs)
   1148     except FormatError:
   1149         continue

File /env-torchmd/lib/python3.9/site-packages/moleculekit/readers.py:1100, in PDBread(filename, mode, frame, topoloc, validateElements, uniqueBonds)
   1098 if "element" in parsedtopo:
   1099     idx, newelem = pdbGuessElementByName(parsedtopo.element, parsedtopo.name)
-> 1100     parsedtopo.at[idx, "element"] = newelem
   1102 for field in topodtypes:
   1103     if (
   1104         field in parsedtopo
   1105         and topodtypes[field] == str
   1106         and parsedtopo[field].dtype == object
   1107     ):

File /env-torchmd/lib/python3.9/site-packages/pandas/core/indexing.py:2274, in _AtIndexer.__setitem__(self, key, value)
   2271     self.obj.loc[key] = value
   2272     return
-> 2274 return super().__setitem__(key, value)

File /env-torchmd/lib/python3.9/site-packages/pandas/core/indexing.py:2229, in _ScalarAccessIndexer.__setitem__(self, key, value)
   2226 if len(key) != self.ndim:
   2227     raise ValueError("Not enough indexers for scalar access (setting)!")
-> 2229 self.obj._set_value(*key, value=value, takeable=self._takeable)

File /env-torchmd/lib/python3.9/site-packages/pandas/core/frame.py:3869, in DataFrame._set_value(self, index, col, value, takeable)
   3867 else:
   3868     series = self._get_item_cache(col)
-> 3869     loc = self.index.get_loc(index)
   3871 # setitem_inplace will do validation that may raise TypeError
   3872 #  or ValueError
   3873 series._mgr.setitem_inplace(loc, value)

File /env-torchmd/lib/python3.9/site-packages/pandas/core/indexes/range.py:388, in RangeIndex.get_loc(self, key, method, tolerance)
    386         except ValueError as err:
    387             raise KeyError(key) from err
--> 388     self._check_indexing_error(key)
    389     raise KeyError(key)
    390 return super().get_loc(key, method=method, tolerance=tolerance)

File /env-torchmd/lib/python3.9/site-packages/pandas/core/indexes/base.py:5637, in Index._check_indexing_error(self, key)
   5633 def _check_indexing_error(self, key):
   5634     if not is_scalar(key):
   5635         # if key is not a scalar, directly raise an error (the code below
   5636         # would convert to numpy arrays and raise later any way) - GH29926
-> 5637         raise InvalidIndexError(key)

InvalidIndexError: []

@stefdoerr
Collaborator

Please paste here the results of these two commands

conda list moleculekit
conda list pandas

@pojeda
Author

pojeda commented Mar 18, 2022

I am not using conda, as the center where I work doesn't support it. I use pip only.

@stefdoerr
Collaborator

Yes, OK, that explains the issues. Dependency handling is very bad in pip, and many packages don't even exist or are on different versions.
Is it not possible for you to install Miniconda in your home directory?

If not, then show me also:

pip show pandas

But I think you will run into many more issues down the line if you try to make it work with pip.
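
In a pip-only environment, the same version information can also be read from inside Python, which confirms what the running interpreter actually sees. The following is a minimal sketch, assuming Python 3.8 or newer (for `importlib.metadata`); the helper name `installed_version` is hypothetical and not part of moleculekit or torchmd-cg.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed.

    `installed_version` is an illustrative helper, not an API of any library
    mentioned in this thread.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the two packages that appear in the traceback.
for name in ("moleculekit", "pandas"):
    print(name, installed_version(name) or "not installed")
```

Running this in the same kernel as the notebook is useful because it reports the packages of that exact environment, which may differ from what `pip show` prints in another shell.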

@pojeda
Author

pojeda commented Mar 18, 2022

Name: pandas
Version: 1.4.1
Summary: Powerful data structures for data analysis, time series, and statistics
Home-page: https://pandas.pydata.org
Author: The Pandas Development Team
Author-email: pandas-dev@python.org
License: BSD-3-Clause
Location: /env-torchmd/lib/python3.9/site-packages
Requires: python-dateutil, pytz, numpy
Required-by: seaborn, moleculekit

@stefdoerr
Collaborator

That looks correct. And pip show moleculekit?

@pojeda
Author

pojeda commented Mar 18, 2022

Are there any issues with the license being reported as "UNKNOWN"?

Name: moleculekit
Version: 0.9.14
Summary: A molecule reading/writing and manipulation package.
Home-page: https://github.com/acellera/moleculekit/
Author: Acellera
Author-email: info@acellera.com
License: UNKNOWN
Location: /env-torchmd/lib/python3.9/site-packages
Requires: networkx, numpy, tqdm, pandas, scipy
Required-by:

@stefdoerr
Collaborator

That's too old. Try pip install moleculekit==1.1.8

@pojeda
Author

pojeda commented Mar 18, 2022

Hi, that command causes some metadata issues. I used the following one instead and it worked:

pip install --upgrade --no-cache-dir --use-deprecated=legacy-resolver moleculekit==1.1.8

Thanks!
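
To guard against the same problem recurring, a notebook could check the moleculekit version before running the tutorial. This is only a sketch under stated assumptions: the 1.1.8 threshold is the version suggested in this thread, not a documented minimum, and the helpers `version_tuple` and `is_at_least` are hypothetical names that handle only plain dotted numeric versions.

```python
from importlib import metadata


def version_tuple(version):
    """Turn a dotted version such as '1.1.8' into (1, 1, 8).

    Parsing stops at the first non-numeric component, so suffixes like
    'rc1' or 'post1' are ignored. Illustrative helper only.
    """
    parts = []
    for part in version.split("."):
        if not part.isdigit():
            break
        parts.append(int(part))
    return tuple(parts)


def is_at_least(package, minimum):
    """True if `package` is installed at `minimum` or newer; False if older or missing."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return False
    return version_tuple(installed) >= version_tuple(minimum)


# 1.1.8 is the version suggested in this thread (an assumption, not an official minimum).
if not is_at_least("moleculekit", "1.1.8"):
    print("moleculekit is missing or older than 1.1.8; consider upgrading")
```

For stricter comparisons that follow the full Python versioning rules, the third-party `packaging` library is the usual choice; the tuple comparison here is kept deliberately simple to stay within the standard library.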
