added support of slicing for IDSStructArray #20

prasad-sawantdesai · 2025-02-27T13:03:49Z

Added support of slicing for IDSStructArray

import imas

entry = imas.DBEntry("imas:mdsplus?user=public;pulse=116000;run=4;database=ITER_MD;version=3", "r")  
walids = entry.get("wall", autoconvert=False) # lazy=True
units = walids.description_2d[0].vessel.unit
print(units[8:])
print(units[0:100])
print(units[4:])
print(units[5])

maarten-ic · 2025-02-28T09:02:44Z

imas/ids_struct_array.py

+                if stop is None:
+                    stop = sys.maxsize
+
+                for i in range(start or 0, stop, step or 1):


Hi Prasad, Python slices are weird and have lots of edge cases with negative indices... the current logic doesn't quite capture that.

For example:

>>> import imas >>> entry = imas.DBEntry("imas:hdf5?path=./test", "w") 09:44:08 INFO Parsing data dictionary version 4.0.0 @dd_zip.py:166 >>> cp = entry.factory.core_profiles() >>> cp.profiles_1d.resize(10) >>> cp.ids_properties.homogeneous_time = 1 >>> cp.time = [*range(10)] 09:45:05 INFO Assigning incorrect type 'int64' to <IDSNumericArray (IDS:core_profiles, time, empty FLT_1D)>, attempting automatic conversion. @ids_primitive.py:483 >>> entry.put(cp) >>> cp.profiles_1d[:-5] [<IDSStructure (IDS:core_profiles, profiles_1d[0])>, <IDSStructure (IDS:core_profiles, profiles_1d[1])>, <IDSStructure (IDS:core_profiles, profiles_1d[2])>, <IDSStructure (IDS:core_profiles, profiles_1d[3])>, <IDSStructure (IDS:core_profiles, profiles_1d[4])>] >>> entry.get("core_profiles", lazy=True).profiles_1d[:-5] Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/home/maarten/projects/iter-python/imas-python/imas/ids_struct_array.py", line 137, in __getitem__ return self.value[item] ~~~~~~~~~~^^^^^^ TypeError: 'NoneType' object is not subscriptable

Edge cases with negative indices:

Negative start, e.g. [-5:5], all items from the fifth from the end to the fifth from the beginning

Negative stop, e.g. [2:-2], all items from the third to the second-to-last

Negative increment, e.g. [::-1] all items in reverse order

What is the the use case for this feature? With a list comprehension users could capture (most) scenarios as well:

p1ds = [cp.profiles_1d[i] for i in range(2, 5)] # in reverse for p1d in reversed(cp.profiles_1d): ...

Thank you for reviewing @maarten-ic

Actually, this issue arises when using slicing for wall IDs, as shown below. There is a workaround—you can convert it into a list—but then you lose the IDSStructArray metadata. It would be convenient for users to work it as a list, what do you think @olivhoenen but of course we need to consider edge cases.
units = list(walids.description_2d[0].vessel.unit)

import imaspy as imas entry = imas.DBEntry("imas:mdsplus?user=public;pulse=116000;run=4;database=ITER_MD;version=3", "r") walids = entry.get("wall", autoconvert=False) units = walids.description_2d[0].vessel.unit # units = list(walids.description_2d[0].vessel.unit) workaround print(units[8:]) # 09:01:55 INFO Parsing data dictionary version 4.0.0 @dd_zip.py:166 # 09:01:59 INFO Parsing data dictionary version 3.40.0 @dd_zip.py:166 # Traceback (most recent call last): # File "/home/ITER/sawantp1/git/idstoolsimaspy/idstools/testbug.py", line 6, in <module> # print(units[8:]) # ~~~~~^^^^ # File "/work/imas/opt/EasyBuild/software/IMASPy/1.2.0-intel-2023b/lib/python3.11/site- # packages/imaspy/ids_struct_array.py", line 126, in __getitem__ # list_idx = int(item) ^^^^^^^^^ TypeError: int() argument must be a string, a bytes-like object or a real number, not 'slice'

Hi Prasad,

you can convert it into a list—but then you lose the IDSStructArray metadata

I don't understand the drawback: with the slicing you have implemented, unit[8:] is also a list, right?

when using slicing for wall IDs, as shown below

Yes, sure, but what's the actual use case for doing this? 🙂 I don't understand why a user would want to print all units except the first 8 (as you do in your code listing).

Note that I found out that slice objects have a method that give you the start/stop/step values for a sequence of given length: https://docs.python.org/3/reference/datamodel.html#slice.indices

This came up in the context of @prasad-sawantdesai work on IDStools, where different scripts are receiving IMAS URIs that may contain fragments (#ids:occ/ids_path) where the ids_path has slicing (as described in https://imas-data-dictionary.readthedocs.io/en/latest/_downloads/9e7ffd162c0237b61062528affd5fe2a/IDS-path-syntax.md).

What would be the easiest/recommended way to go from an ids_path string that contains slicing indices to an actual list of objects?

ps: in the case of the list of unit from the wall IDS, the machine description contains many different types of limiter and vessel units, not all of them make sense to overlay when plotting for example an equilibrium psi map

Hi Olivier,

where the ids_path has slicing

This PR doesn't really solve that case either. I could still not do wall.description_2d[0].vessel.unit[8:].element[:].name: the slice at unit[8:] will return a list that has no attribute element.

What would be the easiest/recommended way to go from an ids_path string that contains slicing indices to an actual list of objects?

This is actually a surprisingly tricky problem! And it mostly has to do with the possibility that IDSs don't have homogeneous array sizes deep down in the tree.

ibex has an implementation of this and, since it converts everything to JSON (which supports ragged arrays), it can avoid some of the difficulty. Looks like it is mostly implemented in _expand_single_path_element in this file. Though quickly scanning it, it looks like it doesn't handle the negative indices either (which is probably fine for the ibex use case).

I see two options for implementing your use case, with option 1 the easiest to implement and option 2 more generally useful 🙂

Add a method to IDSPath, for example IDSPath.goto_slices that returns a list of elements that match the slice syntax. It would be an extended form of IDSPath.goto (which returns a single element and raises a NotImplementedError when a slice is used in the path),

Implement slices on arrays of structures, which return a new object (e.g. IDSSlice). The IDSSlice can keep track of all objects that matched the slice expression, and allow further slicing of child elements, getting the matching elements, etc. That could look as follows:

>>> ids_slice = wall.description_2d[0].vessel.unit[8:] >>> ids_slice <IDSSlice (IDS:wall, description_2d[0].vessel.unit[8:], 4 matches)> >>> ids_slice.element <IDSSlice (IDS:wall, description_2d[0].vessel.unit[8:].element, 4 matches)> >>> ids_slice.element[:] <IDSSlice (IDS:wall, description_2d[0].vessel.unit[8:].element[:], 6 matches)> >>> ids_slice.element[:].name <IDSSlice (IDS:wall, description_2d[0].vessel.unit[8:].element[:].name, 6 matches)> >>> list(ids_slice.element[:].name) [<IDSString0D (IDS:wall, description_2d[0].vessel.unit[8].element[0].name)>, <IDSString0D (IDS:wall, description_2d[0].vessel.unit[8].element[1].name)>, <IDSString0D (IDS:wall, description_2d[0].vessel.unit[9].element[0].name)>, ... ]

In either case you'll need to decide how to handle edge cases. For example, what does it mean when you do unit[:2].element[2] when unit[0] has 4 elements and unit[1] has only 2 elements. Is this an IndexError? Or does this return a match of [unit[0].element[2]] and just ignore that there is no unit[1].element[2]?

There's probably more dragons hiding when implementing something like this -- there's a reason that it's not currently implemented in IDSPath 😉

I'm not sure why inhomogeneity is a problem there: if a user ask for a slice that does not exist we have an out-of-bound error, whether it is homogeneous or not (admittedly the risk increase in inhomogeneous case, but that should not be a concern for the implementation). IMO index-errors shall not be swallowed if they happen in nested elements, the error shall then just be propagated.

What do you think @prasad-sawantdesai, would something like 1 or 2 be useful in this case? 2 looks more appealing to me (but maybe that's because of the code snippet 🙄 ).

I will try to implement second option :)

added support of slicing for IDSStructArray

39d8a73

prasad-sawantdesai requested review from olivhoenen and maarten-ic February 27, 2025 13:03

maarten-ic reviewed Feb 28, 2025

View reviewed changes

olivhoenen requested a review from Simon-McIntosh March 7, 2025 10:40

removed pull_request event

6704375

maarten-ic mentioned this pull request Mar 17, 2025

Removed pull_request event to prevent workflow from running twice #26

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

added support of slicing for IDSStructArray #20

added support of slicing for IDSStructArray #20

Uh oh!

prasad-sawantdesai commented Feb 27, 2025

Uh oh!

maarten-ic Feb 28, 2025

Uh oh!

prasad-sawantdesai Mar 3, 2025

Uh oh!

maarten-ic Mar 4, 2025

Uh oh!

olivhoenen Mar 7, 2025

Uh oh!

maarten-ic Mar 7, 2025

Uh oh!

olivhoenen Mar 13, 2025

Uh oh!

prasad-sawantdesai Mar 14, 2025

Uh oh!

Uh oh!

added support of slicing for IDSStructArray #20

Are you sure you want to change the base?

added support of slicing for IDSStructArray #20

Uh oh!

Conversation

prasad-sawantdesai commented Feb 27, 2025

Uh oh!

maarten-ic Feb 28, 2025

Choose a reason for hiding this comment

Uh oh!

prasad-sawantdesai Mar 3, 2025

Choose a reason for hiding this comment

Uh oh!

maarten-ic Mar 4, 2025

Choose a reason for hiding this comment

Uh oh!

olivhoenen Mar 7, 2025

Choose a reason for hiding this comment

Uh oh!

maarten-ic Mar 7, 2025

Choose a reason for hiding this comment

Uh oh!

olivhoenen Mar 13, 2025

Choose a reason for hiding this comment

Uh oh!

prasad-sawantdesai Mar 14, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!