Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

StructureData constructor should accept ASE object with zero cell data #5248

Closed
danielhollas opened this issue Dec 6, 2021 · 7 comments · Fixed by #5341
Closed

StructureData constructor should accept ASE object with zero cell data #5248

danielhollas opened this issue Dec 6, 2021 · 7 comments · Fixed by #5341
Assignees
Labels
type/feature request status undecided

Comments

@danielhollas
Copy link
Collaborator

Is your feature request related to a problem? Please describe

StructureData(ase=ase.Atoms(...)) fails if the cell parameters are not provided to the ase.Atoms() constructor.

import ase
ase_struct = ase.Atoms(symbols="H", positions=[(0.0, 0.0, 0.0)])
ase.struct.cell
>  Cell([0.0, 0.0, 0.0])
StructureData(ase=ase_struct)
> 
/opt/conda/lib/python3.8/site-packages/aiida/orm/nodes/data/structure.py in __init__(self, cell, pbc, ase, pymatgen, pymatgen_structure, pymatgen_molecule, **kwargs)
    745 
    746             if ase is not None:
--> 747                 self.set_ase(ase)
    748 
    749             if pymatgen is not None:

/opt/conda/lib/python3.8/site-packages/aiida/orm/nodes/data/structure.py in set_ase(self, aseatoms)
    804         if is_ase_atoms(aseatoms):
    805             # Read the ase structure
--> 806             self.cell = aseatoms.cell
    807             self.pbc = aseatoms.pbc
    808             self.clear_kinds()  # This also calls clear_sites

/opt/conda/lib/python3.8/site-packages/aiida/orm/nodes/data/structure.py in cell(self, value)
   1585     def cell(self, value):
   1586         """Set the cell."""
-> 1587         self.set_cell(value)
   1588 
   1589     def set_cell(self, value):

/opt/conda/lib/python3.8/site-packages/aiida/orm/nodes/data/structure.py in set_cell(self, value)
   1594             raise ModificationNotAllowed('The StructureData object cannot be modified, it has already been stored')
   1595 
-> 1596         the_cell = _get_valid_cell(value)
   1597         self.set_attribute('cell', the_cell)
   1598 

/opt/conda/lib/python3.8/site-packages/aiida/orm/nodes/data/structure.py in _get_valid_cell(inputcell)
     52 
     53     if abs(calc_cell_volume(the_cell)) < _VOLUME_THRESHOLD:
---> 54         raise ValueError('The cell volume is zero. Invalid cell.')
     55 
     56     return the_cell

ValueError: The cell volume is zero. Invalid cell.

Describe the solution you'd like

StructureData should accept the ASE default cell, perhaps converting it to aiida default.

Describe alternatives you've considered

Keeping this behavior might be useful for consistency?

Additional context

Quality of life for quantum chemists working in the gas phase. :-)

@danielhollas danielhollas added the type/feature request status undecided label Dec 6, 2021
@danielhollas danielhollas changed the title StructureData constructor should accept ASE object without cell data StructureData constructor should accept ASE object with zero cell data Dec 6, 2021
@ltalirz
Copy link
Member

ltalirz commented Feb 1, 2022

Just ran into this as well and I strongly second this feature request.

When creating an empty StructureData, the cell (if unspecified) currently defaults to this

if cell is None:
cell = [[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]]

I don't know why this was considered a better default value than None (for starters, it is not distinguishable from an actual cell [[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]]).

Unless there are good reasons for this default value that escape me, I would vote to replace it by None.
Besides making no sense when using StructureData to store a non-periodic structure, I would argue that using a finite cell as a default is dangerous, since in combination with the default

if pbc is None:
pbc = [True, True, True]

it may give the impression that a periodic calculation can be performed on a structure for which no cell was specified, leading to garbage results.

Thoughts @giovannipizzi @sphuber ?

Edit: I will prepare a PR for this.

@ltalirz ltalirz self-assigned this Feb 1, 2022
@ltalirz
Copy link
Member

ltalirz commented Feb 1, 2022

P.S. ASE uses a default of

In [2]: from ase.build import molecule

In [3]: m=molecule('H2O')

In [4]: m.cell
Out[4]: Cell([0.0, 0.0, 0.0])

In [5]: m.pbc
Out[5]: array([False, False, False])

I don't have a strong opinion on [0,0,0] (which is clearly invalid) vs None.
Maybe [0,0,0] will be "more" backwards compatible.

@sphuber
Copy link
Contributor

sphuber commented Feb 1, 2022

I didn't write the original StructureData so cannot comment on the reasoning for the choice of defaults. I think that this is another manifestation of trying to use a single class to do multiple things, i.e., represent solids and molecules at the same time. This leads to inconsistencies and suprises in the interface and behavior. Same thing happens with the KpointsData.

So ideally we would improve this, but since the StructureData is used so much, the main problem here is backwards compatibility. I am not sure who might be relying on this behavior and so what will break when the defaults for the cell and the pbc are changed.

What are you suggesting as a change though? Would you suggest adopting the ASE default for the cell? I am not sure if [0, 0, 0] is better. Probably they have a good reason for it, but I don't see it yet.

@ltalirz
Copy link
Member

ltalirz commented Feb 1, 2022

Here's a possible PR that uses None as the default cell #5341

@giovannipizzi
Copy link
Member

Thanks. I designed the original class; I agree that in the end it would have probably better to define multiple classes for different dimensionalities, as pymatgen does, but I followed the ASE approach to have only one and define pbc (also, having multiple classes has pros and cons of course, maybe alleviated if there is a common parent class).

The issue is that the fix in #5341 only works for molecules; what about 2D and 1D materials? If we change this we need to change the default cell also for those cases.
Note that in 2D the matrix is not a 2x2 but a 2x3 (you have 2 vectors in 3D). It becomes however also non-obvious which of the three coordinates is periodic, however.
Probably then the best is to use, instead of None, a 3x3 zero matrix; so that in 2D you have one rows of zeros, and in 1D two rows of zeros.

A few things to do then:

  • fix the get_volume accordingly. For 1D and 2D we would need a get_length and get_surface, though?
  • for fix: allow StructureData without specified cell #5341, can you check what happens for the KpointsData class when these cells are passed in set_cell? Just to avoid that we get unclear exceptions. For 0D anyway it should say you cannot get k-points because there is no periodicity (or we might decide that the correct thing to do is to return a Gamma-only k-mesh). For 1D and 2D, I'm not sure if this is currently taken care of? Actually this change (with rows of zeros) can be used to catch these cases and raise an appropriate exception that explains why we would need a specific 2D or 1D KpointsData class, or they need to convert the system to 3D and add the third vector

Also as a general comment: we should not enforce that the cell is not None but it should be only the default - there are numerical methods where you want to say the system is 0D, but still define a box (similarly for 1D, 2D). Probably it's what you did in #5341, but just mentioning for clarify and for the future, if people read this issue.

@atztogo
Copy link
Contributor

atztogo commented Feb 1, 2022

I assume here the structure means that made of atoms. There are a variety of symmetries of structures. Periodicity is one of them. Crystal is made of combination of periodicity and rotation + fractional translation. For example, plane symmetry and layer symmetry (for so-called 2D materials?) are different. We might not consider many types structures that may be categorized by symmetry. We may want to describe nanotubes in the best symmetrical way. I feel it is good to discuss which types of structural symmetries the StructureData would like to cover (if this is a serious issue). So I think @sphuber's comment above is only the starting point we can choose, i.e., we should design it assuming we don't know many things about structures.

@ltalirz
Copy link
Member

ltalirz commented Feb 2, 2022

Hi Togo, thanks for your comment!
There is certainly space to extend the StructureData in many ways, one of them being the incorporation of symmetry representations.

In order not to derail this discussion (which is about a much smaller fix), I would suggest opening a new issue & thread for this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/feature request status undecided
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants