Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Enum types to aiida.orm.utils.serialize #5218

Merged

Conversation

sphuber
Copy link
Contributor

@sphuber sphuber commented Nov 4, 2021

Fix #5217

Enum's are often used data types that currently cannot be used in
process inputs, even if the ports are non_db, because even the
non_db inputs are serialized as part of the entire Process instance
that is serialized when a checkpoint is created.

The custom YAML serializer and deserializer that are defined in the
aiida.orm.utils.serialize module that are used for serializing process
instances and their inputs, are updated with a dumper and a loader for
Enum instances. The full class identifier is generated from the
default object loader provided by plumpy and concatenated by the |
sign and the enum's values, it is represented as a YAML scalar.

The loader uses the same object loader to load the enum class from the
identifier string and reconstruct the enum instance from the serialized
value.

The original tests were written in tests/common/test_serialize.py and
are moved to tests/orm/utils/test_serialize.py to conform with the
standard to mirror the package hierarchy. The tests are converted to
pytest style and a new test is added by performing a round trip of an
enum instance (de)serialization.

Enum's are often used data types that currently cannot be used in
process inputs, even if the ports are `non_db`, because even the
`non_db` inputs are serialized as part of the entire `Process` instance
that is serialized when a checkpoint is created.

The custom YAML serializer and deserializer that are defined in the
`aiida.orm.utils.serialize` module that are used for serializing process
instances and their inputs, are updated with a dumper and a loader for
`Enum` instances. The full class identifier is generated from the
default object loader provided by `plumpy` and concatenated by the `|`
sign and the enum's values, it is represented as a YAML scalar.

The loader uses the same object loader to load the enum class from the
identifier string and reconstruct the enum instance from the serialized
value.

The original tests were written in `tests/common/test_serialize.py` and
are moved to `tests/orm/utils/test_serialize.py` to conform with the
standard to mirror the package hierarchy. The tests are converted to
`pytest` style and a new test is added by performing a round trip of an
enum instance (de)serialization.
@sphuber
Copy link
Contributor Author

sphuber commented Nov 4, 2021

@giovannipizzi @bosonie this should provide what we need as discussed. @chrisjsewell @ramirezfranciscof feel free to review if you'd like.

@codecov
Copy link

codecov bot commented Nov 4, 2021

Codecov Report

Merging #5218 (2779052) into develop (7cea5be) will decrease coverage by 0.02%.
The diff coverage is 50.00%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #5218      +/-   ##
===========================================
- Coverage    81.22%   81.21%   -0.01%     
===========================================
  Files          532      532              
  Lines        37344    37357      +13     
===========================================
+ Hits         30328    30334       +6     
- Misses        7016     7023       +7     
Flag Coverage Δ
django 76.06% <50.00%> (-0.01%) ⬇️
sqlalchemy 75.16% <50.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
aiida/orm/utils/serialize.py 91.87% <50.00%> (-8.13%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7cea5be...2779052. Read the comment docs.

Copy link
Member

@giovannipizzi giovannipizzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok with me thanks! Just to double check two things:

  • this means that one needs to be able to import the Enum python declaration, right? (I think it's OK
  • This does not yet allow to store this in the DB as one of the inputs when non_db=False? If so, would it be possible to store the string value in the DB in this case, or there are issues with this?

@sphuber
Copy link
Contributor Author

sphuber commented Nov 5, 2021

this means that one needs to be able to import the Enum python declaration, right?

Yes, otherwise the deserialization will fail. I think there is no way around this and is an acceptable limitation.

This does not yet allow to store this in the DB as one of the inputs when non_db=False? If so, would it be possible to store the string value in the DB in this case, or there are issues with this?

It does indeed not, but in general we do not allow to store arbitrary inputs in the database. All inputs have to be nodes so they can be properly linked up. The only exception are the metadata.options inputs of the CalcJob class which are stored as attributes.

@sphuber sphuber merged commit 8acdc24 into aiidateam:develop Nov 5, 2021
@sphuber sphuber deleted the feature/5217/serialize-support-enum branch November 5, 2021 11:46
@giovannipizzi
Copy link
Member

Wouldn't it be useful to also have a way (as an alternative) to store an input as a string (Str), but with automatic validation coming from the values of an Enum? This would be useful to store the inputs of the common workflows, for instance (e.g. the ElectronicType)

@sphuber
Copy link
Contributor Author

sphuber commented Nov 8, 2021

Might be useful indeed. Would be multiple ways to go about it. Maybe most straightforward would be to implement the Enum data plugin which automatically (de)serializes into the value and identifier. Opened #5224 and will look at implementing this a bit later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for Enum instances in the aiida.orm.utils.serialize YAML serializers
3 participants