In [78]:
%aiida
from aiida import orm
#StructureData = DataFactory('atomistic.structure') # or: from aiida_atomistic.data.structure.structure import StructureData
StructureData = orm.StructureData

# Implementation of the new StructureData 

## New data type and classes

In the following we define the new StructureData class, here named `StructureData_prototype` for simplicity, as it should be implemented in the `aiida-atomistic` package. 
We then define the `PropertyCollector` class, which manages all the properties and takes care of data validation and storage instructions. An instance of this class is created during the initialization of the StructureData instance and is stored under the `properties` attribute. 

In the final implementation, the original StructureData will be renamed `LegacyStructureData` and `StructureData_prototype` will become `StructureData`.

## Rules for the properties:

- properties are stored under the `properties` attribute of the `StructureData` instance. This attribute is immutable. `STATUS: done`;
- the `properties` attribute is organized as a dictionary: each key is the name of a property and each value is a dictionary with the information about that property, which is saved when the StructureData node is stored in the AiiDA database. `STATUS: storage part to be done`;
- information about a property can be provided in only one format (we may implement, in each property, methods to translate from common formats to the accepted one). `STATUS: todo (possibility of having property class methods)`;
- no direct change of the properties is allowed. `STATUS: done`;
- backward compatibility: `structure.pbc` and `structure.kinds` are still there. `STATUS: done`;
- it is always possible to access the lists of defined and of supported properties. `STATUS: done`;
- support for custom properties. `STATUS: todo`;
- constructors: adapt methods like `from_ase` to also support the properties stored in a given `Atoms` object. `STATUS: todo`.
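
The immutability and listing rules above can be illustrated with a minimal, AiiDA-free sketch. All names in it (`PbcSketch`, `StructureSketch`, `SUPPORTED`, `attempt`) are hypothetical and exist only for this illustration; the prototype further below uses pydantic models and a `PropertyCollector` instead, and this sketch performs no type validation.

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import Tuple


@dataclass(frozen=True)
class PbcSketch:
    """Hypothetical property class: frozen, so its fields cannot be reassigned."""
    value: Tuple[bool, bool, bool] = (True, True, True)


# Hypothetical registry of supported properties (name -> class).
SUPPORTED = {"pbc": PbcSketch}


class StructureSketch:
    """Hypothetical stand-in for the structure node (not the AiiDA class)."""

    def __init__(self, properties=None):
        built = {}
        for name, attrs in (properties or {}).items():
            if name not in SUPPORTED:
                raise NotImplementedError(f"Property '{name}' is not supported.")
            built[name] = SUPPORTED[name](**attrs)
        # Read-only view of the dictionary: item assignment raises TypeError.
        self._properties = MappingProxyType(built)

    @property
    def properties(self):
        # No setter is defined, so `structure.properties = ...` raises AttributeError.
        return self._properties

    def get_defined_properties(self):
        return sorted(self._properties)

    @staticmethod
    def get_supported_properties():
        return sorted(SUPPORTED)


def attempt(action):
    """Run `action` and return the name of the exception it raises, if any."""
    try:
        action()
    except Exception as exc:
        return type(exc).__name__
    return None


structure = StructureSketch(properties={"pbc": {"value": (True, False, True)}})
print(structure.get_defined_properties())
print(attempt(lambda: setattr(structure, "properties", {})))
print(attempt(lambda: setattr(structure.properties["pbc"], "value", (False,) * 3)))
```

Both assignment attempts fail, so the only way to change a property in this sketch is to build a new `StructureSketch`, which is the behaviour the rules ask for.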

## Backward-compatibility:

The new StructureData class will be a subclass of the old StructureData. This is the best way to ensure 
backward compatibility while avoiding code duplication. It also emphasizes that the new AiiDA data type is indeed an extension of the original StructureData.
In the future, the entire orm.StructureData module will be moved into the `aiida-atomistic` package and the `pbc` and `kinds` attributes will be removed.
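
As a rough illustration of the subclassing argument, the sketch below uses two hypothetical plain-Python classes (`LegacyStructureSketch` and `NewStructureSketch`, neither of which is an AiiDA class). It only shows that code written against the legacy interface keeps working when it receives an instance of the subclass.

```python
class LegacyStructureSketch:
    """Hypothetical stand-in for the original StructureData."""

    def __init__(self, cell=None, pbc=(True, True, True)):
        self.cell = cell
        self.pbc = tuple(pbc)


class NewStructureSketch(LegacyStructureSketch):
    """Hypothetical stand-in for the new class: same interface plus `properties`."""

    def __init__(self, cell=None, pbc=(True, True, True), properties=None):
        super().__init__(cell=cell, pbc=pbc)
        self._properties = dict(properties or {})

    @property
    def properties(self):
        # Return a copy so that the private dictionary is not handed out directly.
        return dict(self._properties)


def describe_pbc(structure: LegacyStructureSketch) -> str:
    """Example of code written against the legacy interface only."""
    return "".join("P" if periodic else "F" for periodic in structure.pbc)


new = NewStructureSketch(
    cell=[[3.5, 0.0, 0.0], [0.0, 3.5, 0.0], [0.0, 0.0, 3.5]],
    properties={"pbc": {"value": [True, False, True]}},
)
print(isinstance(new, LegacyStructureSketch))
print(describe_pbc(new))
```

Note that the legacy `pbc` attribute and the `pbc` property are independent in this sketch, as they are in the prototype below.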

In [156]:
import typing
from typing import Any,Dict,List
from pydantic import BaseModel, ConfigDict, Field
from abc import ABCMeta

###################### Decorator forbidding calls from outside the class. In the end, we do not need/use it. 
import functools
import inspect

def allow_no_calls_decorator(func):
    """
    This decorator checks whether the method is called 
    from within the class, e.g. during the instance creation,
    or directly by the user. The latter is not allowed.
    In this way, we essentially protect our code against
    misuse of the API.    
    """
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        frame = inspect.currentframe()

        try:
            # Local variables of the caller: inside a method of the same instance, `self` is args[0].
            caller_locals = frame.f_back.f_locals

            if caller_locals.get('self', None) is args[0]:
                print("Called from this class!")
                return func(*args, **kwargs)  # propagate the return value of the method
            else:
                raise NotImplementedError("This method cannot be called directly!")
        finally:
            # Drop the frame reference to avoid reference cycles.
            del frame
    return wrapper

##########################################

################################################## Start: Base Class for a property:
class BaseProperty(BaseModel):
    # Pydantic2 syntax: #model_config = ConfigDict(frozen=True, extra='forbid')
    parent: StructureData = Field(
        #init_var=True   # Does not show when dumping the model (but I think it works only in pydantic 2)
        )
    
    """
    parent: Data = Field(  #Data is ok if we do not redefine the aiida-core Data class.
        #init_var=True   # Does not show when dumping the model (but I think it works only in pydantic 2)
        )
    """
    class Config:
        frozen = True                  # No changes allowed: immutability
        extra = 'forbid'               # No extra arguments or attributes allowed.
        arbitrary_types_allowed = True # You can remove if also StructureData inherits from BaseModel.


################################################## End: Base Class for a property.


################################################## Start: PBC property:

class Pbc(BaseProperty):
    """
    The pbc property. 
    It is different from the pbc attribute directly accessible from the StructureData object.
    """
    value: List[bool] = Field(min_items=3, max_items=3, default=[True,True,True])
    
    @classmethod
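    def from_string(cls, dimensionality:str = "3D"):
        """Translate a dimensionality string ("3D" or "0D") into the accepted pbc format."""
        if dimensionality=="3D":
            pbc = [True]*3
        elif dimensionality=="0D":
            pbc = [False]*3
        else:
            raise ValueError(f"Unsupported dimensionality '{dimensionality}'. Supported values are '3D' and '0D'.")
        
        return {'value': pbc}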

################################################## End: PBC property.

################################################## Start: Template Custom property:

class CustomProperty(BaseProperty):
    """
    Template for custom properties.
    To set it in the StructureData, you need to provide the name of the property in a given format.
    For example:
        structure = StructureData_prototype(
            ase=atoms,
            properties = {
            "pbc": {"value": [True,True,True]},
            "custom_collinear_magnetization": {"value": [1,1,1]},
            },
        ) 
    """
    value: Any = Field()

################################################## End: Template Custom property.


class PropertyInfo:
    # Might store the parameters passed to property
    def __init__(self,):
        self.value = None

def Property():
    # If we define parameters, store them in the PropertyInfo
    return PropertyInfo()




In [157]:
################################################## Start: Mixin classes:

class PropertyMixinMetaclass(ABCMeta):
    
    """
    This metaclass attaches the setter and getter methods of each property, 
    as defined in the HasPropertyMixin class by the _set_property and
    _template_property methods, respectively. 
    If we use only a constructor for the creation of the StructureData
    together with the properties, we do not need it.
    
    We do not allow setting any property after the creation of the instance: 
    ===> we do not need the `fset` attribute (it is still set below, for now).
    """

    def __new__(mcs, name, bases, namespace, **kwargs):  # noqa C901
        cls = super().__new__(mcs, name, bases, namespace, **kwargs)

        for attr, type_hint in typing.get_type_hints(cls).items():
            if isinstance(getattr(cls, attr), PropertyInfo):
                assert issubclass(type_hint, BaseProperty)
                cls._valid_properties.add(attr)
                func_get = lambda self, type_hint=type_hint, attr=attr: self._template_property(type_hint=type_hint, attr=attr)
                
                # We do not allow to set any property after the creation of the instance: 
                #===> WE DO NOT NEED TO SET THE `fset` ATTRIBUTE. 
                # Here below, we leave it there for now, in case it is needed in the future.
                # I define also a setter, TOTEST if we stay immutable wrt other ways to change the property:
                func_set = lambda self, pname=None, pvalue=None: self._set_property(pname, pvalue)
                setattr(cls, attr, property(fget=func_get,fset=func_set))

        
        return cls            

class HasPropertyMixin(metaclass=PropertyMixinMetaclass):
    _valid_properties = set()

    def _template_property(self, type_hint, attr):
        try:
            return type_hint(
                parent=self._parent,
                **self.get_property_attribute(attr)
            )
        except KeyError:  # the property is not defined: fall back to the default values
            return type_hint(
                parent=self._parent,
            )

    # This function is never used:
    @allow_no_calls_decorator
    def _set_property(self, pname=None, pvalue=None):

        try:
            #internal check, using the Pydantic initialization.
            type_hint = typing.get_type_hints(self)[pname]
            prop = type_hint(
                parent=self._parent,
                **pvalue
            )
            
            self._database_wise_setter(pname, pvalue)
            return
        except KeyError: 
            return None
    
    def _database_wise_setter(self, pname, pvalue):
        property_attributes = self._parent.base.attributes.get("_property_attributes").copy()
        property_attributes[pname] = pvalue
        self._parent.base.attributes.set("_property_attributes",property_attributes)
        return
        
    def get_valid_properties(self):
        # Get the implemented properties
        return self._valid_properties.copy()

    def get_defined_properties(self):
        # Get the properties that you already set
        property_attributes = self._parent.base.attributes.get("_property_attributes")
        return list(set(self.get_valid_properties()).intersection(
            property_attributes.keys()
        ))
        
################################################## End: Mixin classes.

## Definition of the `PropertyCollector` and `StructureData_prototype` classes.

In [158]:
class PropertyCollector(HasPropertyMixin):
    """
    This class is used in the StructureData to manage the properties. 
    In principle, it cannot be modified after creation, i.e. it is immutable. This respects 
    the `immutability principle` of a StructureData node and requires the creation of a new 
    StructureData instance whenever we need to add/update/delete a property.
    
    The init method requires the parent StructureData node, which will hold the
    PropertyCollector instance as an attribute, and a dictionary with the properties. We then loop over the 
    keys of the dictionary to initialise all the properties, respecting the rules imposed by each class. 
    
    In principle, all modules and information concerning properties are hidden here, in order to
    keep the StructureData module as clean as possible.
    
    #### Need for a crystal structure:
    The idea is that we can no longer initialise the StructureData without any information, or at least we
    cannot initialise any property without crystal structure information. This is related to consistency
    checks. 
    So we also need a check in the StructureData that blocks the initialisation, or provides an empty
    PropertyCollector attribute (StructureData.properties), in case no crystal structure is defined.
    
    #### Property format:
    The properties are stored exactly as they are provided in the construction of the class instance: in 
    this way, we do not have ambiguities when the properties are used or loaded from the database/repository.
    To facilitate this, we may provide some `translation methods` from and to the format allowed in the property.
    """
    
    # Supported properties below:
    pbc: Pbc = Property()
    custom: CustomProperty = Property()
    
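    def __init__(self, parent, properties: typing.Optional[Dict[str, Dict[str, Any]]] = None):
        
        # Avoid a mutable default argument: a `{}` default would be shared by all the calls.
        if properties is None:
            properties = {}
        if not isinstance(properties, dict):
            raise ValueError(f"The `properties` input is not of the right type. Expected '{type(dict())}', received '{type(properties)}'.")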
        
        self._parent = parent # Parent StructureData object
        
        # properties: dictionary containing the properties. Each key is the name of a property and each value
        # is a dictionary with the inputs of the corresponding property class.
        super().__init__()
        
        self.inspect_property(properties)
        
        self._property_attributes = properties
        # Store the properties in the StructureData node.
        #self._parent.base.attributes.set('_property_attributes',{})
        self._parent.base.attributes.set('_property_attributes',self._property_attributes)
    
    
    def get_property_attribute(self, key):
        # In AiiDA this could be self.base.attrs['properties'][key] or similar
        return self._property_attributes[key]    
    
    # This function is never used:
    @allow_no_calls_decorator
    def set_property(self, pname=None, pvalue=None):
        
        if pname not in self._valid_properties:
            raise NotImplementedError(f"Property '{pname}' is not yet supported. Use the 'get_valid_properties' method to see the available properties.")
        
        print('Setting {} to {}'.format(pname, str(pvalue)))
        # `self` is already bound to the method: passing it again would raise a TypeError.
        return self._set_property(pname=pname, pvalue=pvalue)
    
    def inspect_property(self,properties):
        """
        Method used to understand if we are defining supported/unsupported properties. 
        Here there should be also the detection of custom properties, which 
        have a defined prefix.
        """
        for pname,pvalue in properties.items():
            if pname not in self.get_valid_properties():
                raise NotImplementedError(f"Property '{pname}' is not yet supported.\nSupported properties are: {self.get_valid_properties()}")
            # custom properties:
            #elif pname in self.get_valid_properties():
            #    raise NotImplementedError(f"Property '{pname}' is not yet supported.\nSupported properties are: {self.get_valid_properties()}")
            elif pvalue is None:
                raise ValueError(f"Property '{pname}' has no value provided.")
            # Check the type before the length, so that the emptiness check below is reachable.
            elif not isinstance(pvalue, dict):
                raise ValueError(f"The '{pname}' value is not of the right type. Expected '{type(dict())}', received '{type(pvalue)}'.")
            elif len(pvalue)==0:
                raise ValueError(f"Property '{pname}' is empty.")
    
    
class StructureData_prototype(StructureData):
    
    """
    Extension of the StructureData class. 
    The main new feature is the possibility to store the properties associated with a given system.
    For example, it is possible to store the magnetization or the Hubbard U and V under the `properties` attribute.
    This attribute is created when the StructureData instance is generated. 
    """
    
    def __init__(
        self, 
        cell=None,
        pbc=None,
        ase=None,
        pymatgen=None,
        pymatgen_structure=None,
        pymatgen_molecule=None,
        properties: typing.Optional[Dict[str, Dict[str, Any]]] = None,
        **kwargs,) -> None:
        """
        The `_property_attribute` has to be stored on the instance (`self`), not on the class (`cls`)
        as in the first version of the prototype. Otherwise the information lives in the class, not in the instance.
        """
        # Avoid a mutable default argument: a `{}` default would be shared by all the calls.
        if properties is None:
            properties = {}
        if not isinstance(properties, dict):
            raise ValueError(f"The `properties` input is not of the right type. Expected '{type(dict())}', received '{type(properties)}'.")
        
        super().__init__(
            cell,
            pbc,
            ase,
            pymatgen,
            pymatgen_structure,
            pymatgen_molecule,
            **kwargs,
        )
        
        # Private property attribute
        self._property_attribute = PropertyCollector(parent=self, properties=properties)
    
    # Expose the properties attribute as read-only.
    # The only drawback is that the private `_property_attribute` can still be modified. 
    @property
    def properties(self):
        return self._property_attribute

    @properties.setter
    def properties(self,value):
        raise AttributeError("Cannot change the value of a read-only attribute")
    
        

## Next steps: 
These notes can serve, more or less, as the commit message.

- DONE: block any change of `properties` in StructureData. HOW: with a `properties` setter that raises an error, and a private attribute.
- DONE: forbid the fake re-assignment of `properties.pbc` (the value does not really change, but users may think that they changed the property successfully). For example, `set_property` and `_set_property` do not change anything, but they also do not raise errors when called directly, which is misleading. HOW: by adding the decorator to the two methods mentioned. 
- HOWEVER: if we do not set `fset`, the `set_property` method can never be called, so in the end it is not the frame check that does the work. ===> In principle we do not need such methods.
- Unsupported properties were not rejected by the class. Why? It seemed that there was no check anymore; we had to understand why.
  - DONE: the problem was that we only loop over the attributes of `cls`, which of course contains nothing but the supported properties. HOW: I added a check in the init of the `PropertyCollector` (`def inspect_property(self, properties)`).
  - DONE: however, the valid properties are registered multiple times, as if the `__new__` method were called several times in a loop. HOW: using a `set()` instead of a `list()`. This does not really solve the problem, but it works fine.
  - Custom properties: check whether this can be done easily with template methods, e.g. a general `CustomProperty` class. Alternatively, we could allow users to define their own class, but this is dangerous.
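
One plausible cause of the repeated registration noted above (an assumption, not verified against the prototype) is that the registry lives on the base class, so every run of the metaclass `__new__`, for example each time a notebook cell re-defines a subclass, adds the same names again. The sketch below reproduces this with hypothetical names (`RegistryMeta`, `Base`, `define_collector`) and shows why a `set()` hides the symptom while a `list()` exposes it.

```python
import typing


class RegistryMeta(type):
    """Hypothetical metaclass registering the annotated attributes of each new class."""

    def __new__(mcs, name, bases, namespace):
        cls = super().__new__(mcs, name, bases, namespace)
        for attr in typing.get_type_hints(cls):
            # Both registries are defined on `Base`, hence shared by all subclasses.
            cls.registry_list.append(attr)
            cls.registry_set.add(attr)
        return cls


class Base(metaclass=RegistryMeta):
    registry_list = []
    registry_set = set()


def define_collector():
    """Mimic re-running the notebook cell that defines the collector class."""
    class Collector(Base):
        pbc: bool = True
    return Collector


define_collector()
define_collector()

print(Base.registry_list)
print(Base.registry_set)
```

After two definitions the list holds the name twice, while the set holds it once. A per-class registry, created inside `__new__` for each new class, would avoid the shared state altogether.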


### Initialization and immutability

#### Just the crystal structure:

In [159]:
from ase import Atoms

unit_cell = [[3.5, 0.0, 0.0], [0.0, 3.5, 0.0], [0.0, 0.0, 3.5]]

atoms = Atoms('LiLi', [[0.0, 0.0, 0.0],[1.5, 1.5, 1.5]], cell = [1,1,1])
atoms.set_cell(unit_cell, scale_atoms=False)
atoms.set_pbc([True,True,True])

structure = StructureData(ase=atoms)

print(structure.cell)
print(structure.sites)
print(structure.kinds)   # <== backward-compatibility
print(structure.pbc)     # <== backward-compatibility

[[3.5, 0.0, 0.0], [0.0, 3.5, 0.0], [0.0, 0.0, 3.5]]
[<Site: kind name 'Li' @ 0.0,0.0,0.0>, <Site: kind name 'Li' @ 1.5,1.5,1.5>]
[<Kind: name 'Li', symbol 'Li'>]
(True, True, True)


#### Adding also properties:

In [140]:
structure = StructureData_prototype(
        ase=atoms,
        properties={
                'pbc': {'value':[True,False,True]},
                },
        )

print(structure.cell)
print(structure.sites)
print(structure.pbc)

[[3.5, 0.0, 0.0], [0.0, 3.5, 0.0], [0.0, 0.0, 3.5]]
[<Site: kind name 'Li' @ 0.0,0.0,0.0>, <Site: kind name 'Li' @ 1.5,1.5,1.5>]
(True, True, True)


In [141]:
structure.properties.pbc

Pbc(parent=<StructureData_prototype: uuid: faf8f216-f8f2-4bab-8d0a-2653da2956cf (unstored)>, value=[True, False, True])

In [142]:
structure.properties.pbc.value=5

TypeError: "Pbc" is immutable and does not support item assignment

In [143]:
structure.properties =  0
print(structure.properties)

AttributeError: Cannot change the value of a read-only attribute

### Get the lists of supported and defined properties

In [144]:
structure.properties.get_defined_properties()

['pbc']

In [145]:
structure.properties.get_valid_properties()

{'pbc'}

#### Trying to set some unsupported property:

In [146]:
structure = StructureData_prototype(
        ase=atoms,
        properties={
                'pbc': {'value':[True,False,True]},
                'unsupported_property':{} 
        })

NotImplementedError: Property 'unsupported_property' is not yet supported.
Supported properties are: {'pbc'}

#### Custom properties definition:

What should the format be? 

- something that can be stored in the database;
- some template? Maybe we should provide a class named `CustomProperty`, which already has predefined inputs...
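
A minimal sketch of one possible answer, under two assumptions that are not part of the prototype: custom properties are recognised by a `custom_` prefix (as in the `custom_collinear_magnetization` example of the `CustomProperty` docstring), and "can be stored in the database" is approximated by JSON serializability. The names `CUSTOM_PREFIX` and `validate_custom_property` are hypothetical.

```python
import json
from typing import Any, Dict

CUSTOM_PREFIX = "custom_"  # assumed naming convention for custom properties


def validate_custom_property(name: str, content: Dict[str, Any]) -> Dict[str, Any]:
    """Return `content` unchanged if it looks like a storable custom property."""
    if not name.startswith(CUSTOM_PREFIX):
        raise ValueError(f"Custom property names must start with '{CUSTOM_PREFIX}': got '{name}'.")
    if not isinstance(content, dict) or "value" not in content:
        raise ValueError(f"Property '{name}' must be a dictionary with a 'value' key.")
    try:
        # Approximation of "storable": the content must be JSON-serializable.
        json.dumps(content)
    except (TypeError, ValueError) as exc:
        raise ValueError(f"Property '{name}' is not JSON-serializable.") from exc
    return content


print(validate_custom_property("custom_collinear_magnetization", {"value": [1, 1, 1]}))
```

A check of this kind could be called from `inspect_property` for names that are not among the supported properties, before falling back to the `NotImplementedError`.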

#### Type checking for the `properties` input and its content

In [147]:
structure = StructureData_prototype(
        ase=atoms,
        properties=[('pbc',{'value':[True,False,True]})])

ValueError: The `properties` input is not of the right type. Expected '<class 'dict'>', received '<class 'list'>'.

In [148]:
structure = StructureData_prototype(
        ase=atoms,
        properties={
                'pbc': [{'value':[True,False,True]}],
        })

ValueError: The 'pbc' value is not of the right type. Expected '<class 'dict'>', received '<class 'list'>'.

### Using the classmethod `from_string` to obtain the right pbc format

To reach it through the `properties` attribute, we have to call it from an instance of the `StructureData`. 
This is because the `properties` attribute is properly created only when we create an instance of the StructureData.

In [151]:
print(StructureData_prototype().properties.pbc.from_string("3D"))
print(StructureData_prototype().properties.pbc.from_string("0D"))

{'value': [True, True, True]}
{'value': [False, False, False]}
