Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unintuitive behavior when saving to .cif related to space groups #1230

Closed
RoberTnf opened this issue Aug 6, 2018 · 5 comments
Closed

Unintuitive behavior when saving to .cif related to space groups #1230

RoberTnf opened this issue Aug 6, 2018 · 5 comments

Comments

@RoberTnf
Copy link
Contributor

@RoberTnf RoberTnf commented Aug 6, 2018

Summary

  • Space group information lost when saving to cif

Example code

import pymatgen as mg
from pymatgen import Lattice, Structure, Molecule

# Set up a cubic structure and write to a CIF.
lattice = mg.Lattice.cubic(4.2)
s1 = mg.Structure(lattice, ["Cs", "Cl"],[[0, 0, 0], [0.5, 0.5, 0.5]])
s1.to(filename="s1.cif")

In the cif file:

_symmetry_space_group_name_H-M   'P 1'

It should be

_symmetry_space_group_name_H-M   Pm-3m

This works as intended

import pymatgen as mg
from pymatgen import Lattice, Structure, Molecule
from pymatgen.io.cif import CifWriter

# Set up a cubic structure and write to a CIF.
lattice = mg.Lattice.cubic(4.2)
s1 = mg.Structure(lattice, ["Cs", "Cl"],[[0, 0, 0], [0.5, 0.5, 0.5]])
writer = CifWriter(s1, symprec=0.01)
writer.write_file("s1.cif")

Suggested solution

Maybe symprec should be accessible when saving a structure using the method to? I'm submitting a PR for it.

@mkhorton

This comment has been minimized.

Copy link
Contributor

@mkhorton mkhorton commented Aug 6, 2018

My personal opinion on this is that Structure.to is a convenience method and if you want more customization you should use CifWriter directly.

Your proposed solution of adding a symprec kwarg to Structure.to is fairly harmless, but it’s also very specific to just CIF output. This could also be confusing if people expect the symprec arg to have an effect when they use Structure.to to output a different file format. It might also be awkward because you could imagine adding kwargs for many other options for the other supported output formats too, which would lead to Structure.to becoming over-complicated.

Also, I’d argue that this is not unexpected behavior because pymatgen’s Structure is implicitly P1. Symmetry information isn’t stored; it’s computed on demand. Outputting a symmetrized CIF is actually lossy depending on your symmetry precision. Perhaps SymmetrizedStructure should output a symmetrized CIF by default though.

@RoberTnf

This comment has been minimized.

Copy link
Contributor Author

@RoberTnf RoberTnf commented Aug 6, 2018

I see, makes sense. Then what's the purpose of _symmetry_space_group_name_H-M in this case? Wouldn't it be better to not write that line, to avoid people trusting it (like I was).

EDIT: same for _symmetry_Int_Tables_number

@mkhorton

This comment has been minimized.

Copy link
Contributor

@mkhorton mkhorton commented Aug 6, 2018

Yeah, I don't know the answer to that for sure. I can definitely see why this would be confusing; you're not the first person to flag this.

I would say that the purpose of these tags is to tell the software parsing the CIF file what symmetry operations it needs to apply to the list of atomic positions to construct the crystal. This is important to many tools, e.g. visualization tools, etc.

In this sense, supplying P1 is technically correct, it says: "we've supplied all the atomic positions, the only symmetry operation you need to apply is the identity." I'm not sure if this is strictly required by the standard or if P1 is assumed by default if symmetry information is not supplied, but in any case I'd expect many third-party tools that parse CIF to break if we didn't include this line.

Perhaps a remedy to this would be to add a human-readable comment to the file, explaining it's been outputted in a P1 setting but also providing the calculated symmetry within the given symmetry tolerance.

@RoberTnf

This comment has been minimized.

Copy link
Contributor Author

@RoberTnf RoberTnf commented Aug 6, 2018

I see. Maybe even just expanding the docstring, indicating that there are more advanced options would suffice. Now I realize it does says so in the online documentation, so I'm probably the one to blame here!

@mkhorton

This comment has been minimized.

Copy link
Contributor

@mkhorton mkhorton commented Aug 6, 2018

Yes, if you wanted to change it I would propose a change here:

self._cf = CifFile(d)

The default comment is # generated using pymatgen. But we could add an additional line iff symprec is None, something like # crystal has been written in a P1 setting but pymatgen detects crystal symmetry as {...} within default symmetry tolerances\n# use pymatgen.io.cif.CifWriter to output a symmetrized version of this file in a {...} setting or something along those lines. And pass that comment to the lineCifFile(comment=...).

@mkhorton mkhorton closed this Aug 7, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.