Commit 67cc8fd

Improve lazy loading caveats
- Improve documentation of lazy loading (caveats section)
- Add additional checks and options when working with lazy-loaded IDSs
1 parent f5fc791 commit 67cc8fd

6 files changed

+117
-10
lines changed

docs/source/lazy_loading.rst

Lines changed: 35 additions & 4 deletions
@@ -35,12 +35,45 @@ Caveats of lazy loaded IDSs
 
 Lazy loading of data may speed up your programs, but also comes with some limitations.
 
-1. IMASPy **assumes** that the underlying data entry is not modified.
+1. Some functionality is not implemented or works differently for lazy-loaded IDSs:
+
+   - Iterating over non-empty nodes works differently, see API documentation:
+     :py:meth:`imaspy.ids_structure.IDSStructure.iter_nonempty_`.
+   - :py:meth:`~imaspy.ids_structure.IDSStructure.has_value` is not implemented for
+     lazy-loaded structure elements.
+   - :py:meth:`~imaspy.ids_toplevel.IDSToplevel.validate` will only validate loaded
+     data. Additional data might be loaded from the backend to validate coordinate
+     sizes.
+   - :py:meth:`imaspy.util.print_tree` will only print data that is loaded when
+     :py:param:`~imaspy.util.print_tree.hide_empty_nodes` is ``True``.
+   - :py:meth:`imaspy.util.visit_children`:
+
+     - When :py:param:`~imaspy.util.visit_children.visit_empty` is ``False``
+       (default), this method uses
+       :py:meth:`~imaspy.ids_structure.IDSStructure.iter_nonempty_`. This raises an
+       error for lazy-loaded IDSs, unless you set
+       :py:param:`~imaspy.util.visit_children.accept_lazy` to ``True``.
+     - When :py:param:`~imaspy.util.visit_children.visit_empty` is ``True``, this
+       will iteratively load `all` data from the backend. This is effectively a
+       full, but less efficient, ``get()``\ /\ ``get_slice()``. It will be faster
+       if you don't use lazy loading in this case.
+
+   - IDS conversion through :py:meth:`imaspy.convert_ids
+     <imaspy.ids_convert.convert_ids>` is not implemented for lazy loaded IDSs. Note
+     that :ref:`Automatic conversion between DD versions` also applies when lazy
+     loading.
+   - Lazy loaded IDSs are read-only, setting or changing values, resizing arrays of
+     structures, etc. is not allowed.
+   - You cannot :py:meth:`~imaspy.db_entry.DBEntry.put`,
+     :py:meth:`~imaspy.db_entry.DBEntry.put_slice` or
+     :py:meth:`~imaspy.ids_toplevel.IDSToplevel.serialize` lazy-loaded IDSs.
+
+2. IMASPy **assumes** that the underlying data entry is not modified.
 
    When you (or another user) overwrite or add data to the same data entry, you may end
    up with a mix of old and new data in the lazy loaded IDS.
 
-2. After you close the data entry, no new elements can be loaded.
+3. After you close the data entry, no new elements can be loaded.
 
    >>> core_profiles = data_entry.get("core_profiles", lazy=True)
    >>> data_entry.close()
@@ -50,8 +83,6 @@ Lazy loading of data may speed up your programs, but also comes with some limita
    RuntimeError: Cannot lazy load the requested data: the data entry is no longer
    available for reading. Hint: did you close() the DBEntry?
 
-3. IDSs that are lazy loaded are read-only, and you cannot :code:`put()` or
-   :code:`put_slice()` them.
 4. Lazy loading has more overhead for reading data from the lowlevel: it is therefore
    more efficient to do a full :code:`get()` or :code:`get_slice()` when you intend to
    use most of the data stored in an IDS.
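
The caveat about closing the data entry can be illustrated with a small, self-contained sketch. It does not use IMASPy: `FakeEntry` and `LazyIDS` are hypothetical stand-ins invented for this example, and the real loading mechanism may differ. The sketch only shows the general pattern the caveat describes: values already fetched stay available, while anything not yet fetched needs an open entry.

```python
class FakeEntry:
    """Hypothetical stand-in for a data entry (not imaspy.DBEntry)."""

    def __init__(self, data):
        self._data = dict(data)
        self._open = True

    def read(self, name):
        if not self._open:
            raise RuntimeError("the data entry is no longer available for reading")
        return self._data[name]

    def close(self):
        self._open = False


class LazyIDS:
    """Hypothetical lazy container: fetches a value on first attribute access."""

    def __init__(self, entry, names):
        self._entry = entry
        self._names = set(names)

    def __getattr__(self, name):
        # Only called when `name` is not found in the instance __dict__.
        if name.startswith("_") or name not in self._names:
            raise AttributeError(name)
        value = self._entry.read(name)
        self.__dict__[name] = value  # cache: later reads skip the backend
        return value


entry = FakeEntry({"time": [0.0, 1.0], "comment": "demo"})
ids = LazyIDS(entry, ["time", "comment"])

loaded_before_close = ids.time  # fetched while the entry is open
entry.close()
still_cached = ids.time  # served from the cache, no backend access

try:
    ids.comment  # never loaded, so this needs the (closed) entry
    error = None
except RuntimeError as exc:
    error = str(exc)
```

The same caching behaviour is why a lazy-loaded object can end up holding a mix of old and new data if the underlying entry is modified between two first-time accesses.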

imaspy/_util.py

Lines changed: 1 addition & 1 deletion
@@ -110,7 +110,7 @@ def _make_tree(structure, hide_empty_nodes=True, *, tree=None):
 
     iterator = structure
     if hide_empty_nodes and isinstance(structure, IDSStructure):
-        iterator = structure.iter_nonempty_()
+        iterator = structure.iter_nonempty_(accept_lazy=True)
     for child in iterator:
         if isinstance(child, IDSPrimitive):
             if not child.has_value:

imaspy/ids_convert.py

Lines changed: 5 additions & 0 deletions
@@ -341,6 +341,11 @@ def convert_ids(
         factory: Existing IDSFactory to use for as target version.
         target: Use this IDSToplevel as target toplevel instead of creating one.
     """
+    if toplevel._lazy:
+        raise NotImplementedError(
+            "IDS conversion is not implemented for lazy-loaded IDSs"
+        )
+
     ids_name = toplevel.metadata.name
     if target is None:
         if factory is None:

imaspy/ids_structure.py

Lines changed: 65 additions & 2 deletions
@@ -160,16 +160,78 @@ def _dd_parent(self) -> IDSBase:
     @property
     def has_value(self) -> bool:
         """True if any of the children has a non-default value"""
+        if self._lazy:
+            raise NotImplementedError(
+                "`has_value` is not implemented for lazy-loaded structures."
+            )
         for _ in self.iter_nonempty_():
             return True
         return False
 
-    def iter_nonempty_(self) -> Generator[IDSBase, None, None]:
+    def iter_nonempty_(self, *, accept_lazy=False) -> Generator[IDSBase, None, None]:
         """Iterate over all child nodes with non-default value.
 
         Note:
             The name ends with an underscore so it won't clash with IDS child names.
+
+        Caution:
+            This method works differently for lazy-loaded IDSs.
+
+            By default, an error is raised when calling this method for lazy loaded
+            IDSs. You may set the keyword argument ``accept_lazy`` to ``True`` to
+            iterate over all `loaded` child nodes with a non-default value. See below
+            example for lazy-loaded IDSs.
+
+        Examples:
+
+            .. code-block:: python
+                :caption: ``iter_nonempty_`` for fully loaded IDSs
+
+                >>> import imaspy.training
+                >>> entry = imaspy.training.get_training_db_entry()
+                >>> cp = entry.get("core_profiles")
+                >>> list(cp.iter_nonempty_())
+                [
+                    <IDSStructure (IDS:core_profiles, ids_properties)>,
+                    <IDSStructArray (IDS:core_profiles, profiles_1d with 3 items)>,
+                    <IDSNumericArray (IDS:core_profiles, time, FLT_1D)>
+                    numpy.ndarray([ 3.98722186, 432.93759781, 792. ])
+                ]
+
+            .. code-block:: python
+                :caption: ``iter_nonempty_`` for lazy-loaded IDSs
+
+                >>> import imaspy.training
+                >>> entry = imaspy.training.get_training_db_entry()
+                >>> cp = entry.get("core_profiles", lazy=True)
+                >>> list(cp.iter_nonempty_())
+                RuntimeError: Iterating over non-empty nodes of a lazy loaded IDS will
+                skip nodes that are not loaded. Set accept_lazy=True to continue.
+                See the documentation for more information: [...]
+                >>> list(cp.iter_nonempty_(accept_lazy=True))
+                []
+                >>> # An empty list because nothing is loaded. Load `time`:
+                >>> cp.time
+                <IDSNumericArray (IDS:core_profiles, time, FLT_1D)>
+                numpy.ndarray([ 3.98722186, 432.93759781, 792. ])
+                >>> list(cp.iter_nonempty_(accept_lazy=True))
+                [<IDSNumericArray (IDS:core_profiles, time, FLT_1D)>
+                numpy.ndarray([ 3.98722186, 432.93759781, 792. ])]
+
+        Keyword Args:
+            accept_lazy: Accept that this method will only yield loaded nodes of a
+                lazy-loaded IDS. Non-empty nodes that have not been loaded from the
+                backend are not iterated over. See detailed explanation above.
         """
+        if self._lazy and not accept_lazy:
+            raise RuntimeError(
+                "Iterating over non-empty nodes of a lazy loaded IDS will skip nodes "
+                "that are not loaded. Set accept_lazy=True to continue. "
+                "See the documentation for more information: "
+                "https://sharepoint.iter.org/departments/POP/CM/IMDesign/"
+                "Code%20Documentation/IMASPy-doc/generated/imaspy.ids_structure."
+                "IDSStructure.html#imaspy.ids_structure.IDSStructure.iter_nonempty_"
+            )
         for child in self._children:
             if child in self.__dict__:
                 child_node = getattr(self, child)
@@ -219,7 +281,8 @@ def _validate(self) -> None:
         # Common validation logic
         super()._validate()
         # IDSStructure specific: validate child nodes
-        for child in self.iter_nonempty_():
+        # accept_lazy=True: users are warned in IDSToplevel.validate()
+        for child in self.iter_nonempty_(accept_lazy=True):
             child._validate()
 
     def _xxhash(self) -> bytes:
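
The opt-in guard added to `iter_nonempty_` can be reproduced in isolation. The sketch below is a minimal illustration, not IMASPy code: `Node` and its `load` helper are hypothetical names, and only the `_lazy`, `_children` and `accept_lazy` names are taken from the diff above. It assumes, as in this sketch, that the method is a generator function.

```python
class Node:
    """Hypothetical stand-in for a structure node (not imaspy.IDSStructure)."""

    def __init__(self, children, lazy=False):
        self._children = list(children)  # all child names the schema defines
        self._lazy = lazy

    def load(self, name, value):
        # Hypothetical helper: mimic a child being loaded on first access
        self.__dict__[name] = value

    def iter_nonempty_(self, *, accept_lazy=False):
        # Refuse to silently skip unloaded nodes unless the caller opts in
        if self._lazy and not accept_lazy:
            raise RuntimeError("Set accept_lazy=True to iterate over loaded nodes only.")
        for child in self._children:
            if child in self.__dict__:  # only children that have been loaded
                yield child, self.__dict__[child]


lazy_node = Node(["time", "profiles_1d"], lazy=True)
lazy_node.load("time", [1.0, 2.0])

try:
    list(lazy_node.iter_nonempty_())
    default_raised = False
except RuntimeError:
    default_raised = True

loaded_only = list(lazy_node.iter_nonempty_(accept_lazy=True))

# Generator functions run no code until first iterated, so creating the
# generator does not raise; the guard fires on the first next().
pending = lazy_node.iter_nonempty_()
try:
    next(pending)
    raised_on_next = False
except RuntimeError:
    raised_on_next = True
```

Making `accept_lazy` keyword-only, as the diff does, forces callers to spell out the opt-in at the call site, which keeps the incomplete iteration visible to anyone reading the calling code.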

imaspy/ids_toplevel.py

Lines changed: 2 additions & 1 deletion
@@ -259,7 +259,8 @@ def validate(self):
 
     def _validate(self):
         # Override to skip the self.metadata.type.is_dynamic check in IDSBase._validate
-        for child in self.iter_nonempty_():
+        # accept_lazy=True: users are warned in IDSToplevel.validate()
+        for child in self.iter_nonempty_(accept_lazy=True):
             child._validate()
 
     @needs_imas

imaspy/util.py

Lines changed: 9 additions & 2 deletions
@@ -16,7 +16,7 @@
 logger = logging.getLogger(__name__)
 
 
-def visit_children(func, node, *, leaf_only=True, visit_empty=False):
+def visit_children(func, node, *, leaf_only=True, visit_empty=False, accept_lazy=False):
     """Apply a function to node and its children
 
     IMASPy objects generally live in a tree structure. Similar to Pythons
@@ -36,6 +36,9 @@ def visit_children(func, node, *, leaf_only=True, visit_empty=False):
             * ``False``: All nodes, including internal nodes
 
         visit_empty: When set to True, also apply the function to empty nodes.
+        accept_lazy: See documentation of :py:param:`iter_nonempty_()
+            <imaspy.ids_structure.IDSStructure.iter_nonempty_.accept_lazy>`. Only
+            relevant when :param:`visit_empty` is False.
 
     Example:
         .. code-block:: python
@@ -53,7 +56,7 @@ def visit_children(func, node, *, leaf_only=True, visit_empty=False):
     iterator = node
     if not visit_empty and isinstance(node, IDSStructure):
         # Only iterate over non-empty nodes
-        iterator = node.iter_nonempty_()
+        iterator = node.iter_nonempty_(accept_lazy=accept_lazy)
 
     for child in iterator:
         visit_children(func, child, leaf_only=leaf_only, visit_empty=visit_empty)
@@ -71,6 +74,10 @@ def resample(node, old_time, new_time, homogeneousTime=None, inplace=False, **kw
 def print_tree(structure, hide_empty_nodes=True):
     """Print the full tree of an IDS or IDS structure.
 
+    Caution:
+        With :py:param:`hide_empty_nodes` set to ``True``, lazy-loaded IDSs will only
+        show loaded nodes.
+
     Args:
         structure: IDS structure to print
         hide_empty_nodes: Show or hide nodes without value.
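
One detail in the `visit_children` hunk may deserve a second look: the unchanged recursive call passes only `leaf_only` and `visit_empty`, so as shown in this diff `accept_lazy` would fall back to `False` for nested structures. Whether that is intended or handled elsewhere is not visible here. The following sketch shows a recursive visitor that forwards the flag at every level. `Struct` and `visit` are hypothetical stand-ins for illustration, not the IMASPy implementation.

```python
class Struct:
    """Hypothetical stand-in for a structure node (not imaspy.IDSStructure)."""

    def __init__(self, loaded_children, lazy=False):
        self._loaded = list(loaded_children)
        self._lazy = lazy

    def iter_nonempty_(self, *, accept_lazy=False):
        if self._lazy and not accept_lazy:
            raise RuntimeError("lazy-loaded: set accept_lazy=True to continue")
        yield from self._loaded


def visit(func, node, *, leaf_only=True, accept_lazy=False):
    """Apply func to node and its loaded, non-empty descendants."""
    if not isinstance(node, Struct):
        func(node)  # leaf value
        return
    if not leaf_only:
        func(node)
    for child in node.iter_nonempty_(accept_lazy=accept_lazy):
        # Forward accept_lazy so nested lazy structures do not raise
        visit(func, child, leaf_only=leaf_only, accept_lazy=accept_lazy)


inner = Struct([1.5, 2.5], lazy=True)
root = Struct([inner, "label"], lazy=True)

seen = []
visit(seen.append, root, accept_lazy=True)

try:
    visit(lambda leaf: None, root)  # default accept_lazy=False
    default_raised = False
except RuntimeError:
    default_raised = True
```

If the flag were dropped in the recursive call, the traversal in this sketch would succeed for the root and then raise as soon as it reached `inner`.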
