BUG: A Molecule object in the cclib TaskDocument can't be printed #406

Closed
Andrew-S-Rosen opened this issue Jun 27, 2023 · 5 comments · Fixed by #411

Comments

@Andrew-S-Rosen
Member

Minimal example:

from atomate2.common.schemas.cclib import TaskDocument

task = TaskDocument.from_logfile(".", "orca.out")
print(task)

orca.out.txt

Traceback:

ValueError                                Traceback (most recent call last)
File ~/software/miniconda/envs/quacc/lib/python3.10/site-packages/IPython/core/formatters.py:708, in PlainTextFormatter.__call__(self, obj)
    701 stream = StringIO()
    702 printer = pretty.RepresentationPrinter(stream, self.verbose,
    703     self.max_width, self.newline,
    704     max_seq_length=self.max_seq_length,
    705     singleton_pprinters=self.singleton_printers,
    706     type_pprinters=self.type_printers,
    707     deferred_pprinters=self.deferred_printers)
--> 708 printer.pretty(obj)
    709 printer.flush()
    710 return stream.getvalue()

File ~/software/miniconda/envs/quacc/lib/python3.10/site-packages/IPython/lib/pretty.py:410, in RepresentationPrinter.pretty(self, obj)
    407                         return meth(obj, self, cycle)
    408                 if cls is not object \
    409                         and callable(cls.__dict__.get('__repr__')):
--> 410                     return _repr_pprint(obj, self, cycle)
    412     return _default_pprint(obj, self, cycle)
    413 finally:

File ~/software/miniconda/envs/quacc/lib/python3.10/site-packages/IPython/lib/pretty.py:778, in _repr_pprint(obj, p, cycle)
    776 """A pprint that just redirects to the normal repr function."""
    777 # Find newlines and replace them with p.break_()
--> 778 output = repr(obj)
    779 lines = output.splitlines()
    780 with p.group():

File ~/software/miniconda/envs/quacc/lib/python3.10/site-packages/pydantic/utils.py:411, in pydantic.utils.Representation.__repr__()

File ~/software/miniconda/envs/quacc/lib/python3.10/site-packages/pydantic/utils.py:390, in genexpr()

File ~/software/miniconda/envs/quacc/lib/python3.10/site-packages/pydantic/utils.py:390, in genexpr()

File ~/software/miniconda/envs/quacc/lib/python3.10/site-packages/pymatgen/core/structure.py:3113, in IMolecule.__repr__(self)
   3111 outs = ["Molecule Summary"]
   3112 for s in self:
-> 3113     outs.append(s.__repr__())
   3114 return "\n".join(outs)

File ~/software/miniconda/envs/quacc/lib/python3.10/site-packages/pymatgen/core/sites.py:211, in Site.__repr__(self)
    210 def __repr__(self):
--> 211     return f"Site: {self.species_string} ({self.coords[0]:.4f}, {self.coords[1]:.4f}, {self.coords[2]:.4f})"

ValueError: Unknown format code 'f' for object of type 'numpy.str_'

The issue is in task["attributes"]["molecule_unoriented"].sites I believe. Something is getting stored as a np.str_ instead of str. Annoying.
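The failure at the bottom of the traceback can be reproduced without cclib or pymatgen. The sketch below assumes a coordinate arrived as a `np.str_`; the variable `coord` and its value are made up for illustration and are not taken from orca.out.

```python
import numpy as np

# Hypothetical stand-in for a site coordinate that was parsed as a string.
coord = np.str_("0.1234")

try:
    # Same format spec that Site.__repr__ applies to each coordinate.
    f"{coord:.4f}"
except ValueError as exc:
    message = str(exc)
    print(message)

# Converting to float first makes the same format spec valid.
formatted = f"{float(coord):.4f}"
print(formatted)
```

If this is what is happening, the `.4f` format spec is rejected because the value is a string subclass rather than a number.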

Tagging @janosh in case this has something to do with Pymatgen directly, but I don't think it does. I'll revisit this in a few days.

@JaGeo
Member

JaGeo commented Jun 27, 2023

I had something similar with the Bandstructure Object. materialsproject/pymatgen#3096

@Andrew-S-Rosen
Member Author

Interesting... thanks for sharing!

@janosh
Member

janosh commented Jun 27, 2023

I'm guessing this is a deserialization bug in cclib.io.ccread which is used to parse the logfile

cclib_obj = ccread(logfile, logging.ERROR)

and then asked for the atomcoords

coords = cclib_obj.atomcoords

Coordinates clearly shouldn't be parsed as strings but floats. I don't think going from np.str_ to str would help here, so I think materialsproject/pymatgen#3096 is unrelated.

We could accommodate string types in pymatgen by casting coords to floats here, but it would definitely be cleaner not to read coordinates as strings in the first place.
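A minimal sketch of the casting idea, assuming the coordinates arrive as a nested sequence whose entries may be strings. The helper `coerce_coords` is hypothetical and is not part of pymatgen, cclib, or atomate2.

```python
from typing import Iterable, List, Sequence


def coerce_coords(coords: Iterable[Sequence[object]]) -> List[List[float]]:
    """Return coords as nested lists of Python floats.

    Hypothetical helper: float() raises ValueError on entries that are not
    numeric, so bad input fails here rather than later in __repr__.
    """
    return [[float(x) for x in row] for row in coords]


# Made-up coordinates mixing strings and floats, for illustration only.
raw = [["0.0", "0.0", "0.1173"], [0.0, 0.7572, "-0.4692"]]
clean = coerce_coords(raw)
print(clean)
```

Casting at the point where the coordinates are read keeps the string values from ever reaching the Molecule, which is the cleaner of the two options mentioned above.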

@Andrew-S-Rosen
Member Author

Andrew-S-Rosen commented Jun 28, 2023

I'm closing this as not reproducible.

I did

conda create --name atomate2 python=3.10
pip install pymatgen atomate2 cclib

and ran the above code block. It works fine. However, in another environment I have, it does not work. Both have the same versions of Python, pymatgen, cclib, atomate2, and numpy, so I'm at a loss. I'll just pray that it doesn't come up again, but if it does, I'll re-open the issue.
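One way to compare two environments is to print the installed versions from inside each one and diff the output. This is a sketch using the standard library's `importlib.metadata`; the helper name `installed_versions` is an assumption for illustration.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions(["pymatgen", "atomate2", "cclib", "numpy"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

A version string alone may not say how a package was installed, so matching output here does not guarantee that the two environments contain identical code.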

@Andrew-S-Rosen
Member Author

I've reopened it because I can now reproduce the issue! lol. It only happens with the main branch of Atomate2, probably because that's where my most recent change to the TaskDocument landed. Okay, now for me to dig into it. Sorry about the noise, friends.

conda create --name atomate2 python=3.10
pip install pymatgen cclib
pip install git+https://github.com/materialsproject/atomate2.git
