Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix potential inefficiency in aiida.tools.data.cif converters #3098

Merged
merged 1 commit into from Jun 30, 2019

Conversation

sphuber
Copy link
Contributor

@sphuber sphuber commented Jun 28, 2019

Fixes #3097

Recently, the BackendNode interface and implementation was changed
significantly with respect to attributes and extras. As a result, the
values on unstored nodes are not cleaned until the node is stored.
That means that unstored nodes can contain attributes and extras with
uncleaned values. For example an unstored Dict node can contained
stored Float nodes. This change becomes problematic in the
implementation of _get_aiida_structure_pymatgen_inline and
_get_aiida_structure_ase_inline of aiida.tools.data.cif. The
parameters keyword argument can be an unstored Dict node, if the
functions are run with store_provenance=False. Since it is unstored,
it can contain stored nodes for the values within them. For example,
the site_tolerance can be a stored Float. Since they are overloaded
native float objects, they can be passed to pymatgens CifParser
without it complaining. However, now whenever that class references the
value, since it is a stored node, the ModelWrapper will cause the
model instance to be refreshed from the database, which becomes
prohibitively expensive. Simply calling clean_value on the
parameters beforehand, will cause these nodes to be dereferenced into
their base values, which solves the issue.

Recently, the `BackendNode` interface and implementation was changed
significantly with respect to attributes and extras. As a result, the
values on unstored nodes are not cleaned *until* the node is stored.
That means that unstored nodes can contain attributes and extras with
uncleaned values. For example an unstored `Dict` node can contained
stored `Float` nodes. This change becomes problematic in the
implementation of `_get_aiida_structure_pymatgen_inline` and
`_get_aiida_structure_ase_inline` of `aiida.tools.data.cif`. The
`parameters` keyword argument can be an unstored `Dict` node, if the
functions are run with `store_provenance=False`. Since it is unstored,
it can contain *stored* nodes for the values within them. For example,
the `site_tolerance` can be a stored `Float`. Since they are overloaded
native float objects, they can be passed to pymatgens `CifParser`
without it complaining. However, now whenever that class references the
value, since it is a stored node, the `ModelWrapper` will cause the
model instance to be refreshed from the database, which becomes
prohibitively expensive. Simply calling `clean_value` on the
`parameters` beforehand, will cause these nodes to be dereferenced into
their base values, which solves the issue.
@coveralls
Copy link

coveralls commented Jun 28, 2019

Coverage Status

Coverage decreased (-0.008%) to 73.973% when pulling 2557a8c on sphuber:fix_3097_tools_cif_converters into e232f94 on aiidateam:develop.

Copy link
Member

@giovannipizzi giovannipizzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clear explanation!

@sphuber sphuber merged commit 1578227 into aiidateam:develop Jun 30, 2019
@sphuber sphuber deleted the fix_3097_tools_cif_converters branch June 30, 2019 20:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Potential inefficiency in aiida.tools.data.cif CifData converters
3 participants