Unintuitive behavior when saving to .cif related to space groups #1230
In the cif file:
It should be
This works as intended
    import pymatgen as mg
    from pymatgen import Lattice, Structure, Molecule
    from pymatgen.io.cif import CifWriter

    # Set up a cubic structure and write to a CIF.
    lattice = mg.Lattice.cubic(4.2)
    s1 = mg.Structure(lattice, ["Cs", "Cl"], [[0, 0, 0], [0.5, 0.5, 0.5]])
    writer = CifWriter(s1, symprec=0.01)
    writer.write_file("s1.cif")
My personal opinion on this is that Structure.to is a convenience method and if you want more customization you should use CifWriter directly.
Your proposed solution of adding a symprec kwarg to Structure.to is fairly harmless, but it’s also very specific to just CIF output. This could also be confusing if people expect the symprec arg to have an effect when they use Structure.to to output a different file format. It might also be awkward because you could imagine adding kwargs for many other options for the other supported output formats too, which would lead to Structure.to becoming over-complicated.
Also, I’d argue that this is not unexpected behavior because pymatgen’s Structure is implicitly P1. Symmetry information isn’t stored; it’s computed on demand. Outputting a symmetrized CIF is actually lossy depending on your symmetry precision. Perhaps SymmetrizedStructure should output a symmetrized CIF by default though.
Yeah, I don't know the answer to that for sure. I can definitely see why this would be confusing; you're not the first person to flag this.
I would say that the purpose of these tags is to tell the software parsing the CIF file which symmetry operations it needs to apply to the list of atomic positions to construct the crystal. This matters to many tools, such as visualization tools.
In this sense, supplying P1 is technically correct. It says: "we've supplied all the atomic positions; the only symmetry operation you need to apply is the identity." I'm not sure if this is strictly required by the standard or if P1 is assumed by default when symmetry information is not supplied, but in any case I'd expect many third-party tools that parse CIF to break if we didn't include this line.
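To make the point about symmetry operations concrete, here is a minimal sketch of what a CIF reader conceptually does with them. It is not pymatgen's parser. The helper name `apply_symmetry_ops` is hypothetical, and the operation strings are assumed to be simple expressions in x, y, z such as `-x, -y, -z`. Duplicates are dropped by exact comparison, which a real reader would replace with a tolerance.

```python
def apply_symmetry_ops(ops, frac_coords):
    """Expand fractional coordinates by a list of symmetry operation strings.

    Hypothetical helper for illustration only. Each op looks like "x, y, z"
    or "-x, -y, -z". Results are wrapped back into the unit cell with % 1.
    """
    sites = []
    for op in ops:
        exprs = [e.strip() for e in op.split(",")]
        for x, y, z in frac_coords:
            env = {"x": x, "y": y, "z": z}
            # eval is restricted to the three variables; fine for a sketch,
            # not something to use on untrusted files.
            new_site = tuple(eval(e, {"__builtins__": {}}, env) % 1 for e in exprs)
            if new_site not in sites:
                sites.append(new_site)
    return sites


# P1: the identity is the only operation, so the listed sites are the crystal.
cscl_sites = [(0, 0, 0), (0.5, 0.5, 0.5)]
p1_sites = apply_symmetry_ops(["x, y, z"], cscl_sites)

# With an inversion centre, one listed site generates a second one.
expanded = apply_symmetry_ops(["x, y, z", "-x, -y, -z"], [(0.25, 0.25, 0.25)])
```

With only the identity listed, the output equals the input, which is what a P1 CIF relies on.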
Perhaps a remedy would be to add a human-readable comment to the file, explaining that it has been written in a P1 setting but also giving the symmetry calculated within the given symmetry tolerance.
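A rough sketch of what that comment could look like, working on the CIF text as a plain string. The function name `add_symmetry_comment` and the wording of the comment are made up for illustration; nothing like this exists in pymatgen as far as this thread shows. The space group symbol and number are assumed to have been computed separately.

```python
def add_symmetry_comment(cif_text, symbol, number, symprec):
    """Prepend CIF comment lines (starting with '#') describing the symmetry.

    Hypothetical helper: the caller supplies the detected space group and
    the tolerance that was used to find it.
    """
    note = [
        "# Structure written in a P1 setting (all atomic positions listed explicitly).",
        f"# Detected space group: {symbol} (No. {number}) at symprec={symprec}.",
    ]
    return "\n".join(note) + "\n" + cif_text


# Example with a placeholder CIF body.
annotated = add_symmetry_comment("data_CsCl\n", "Pm-3m", 221, 0.01)
```

Since the lines start with `#`, parsers should treat them as comments and the P1 data block itself is unchanged.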
Yes, if you wanted to change it I would propose a change here:
The default comment is