ODS leaf type consolidation and arrayification #74

Karel-van-de-Plassche · 2019-07-15T16:55:25Z

For a non-standard use case (parsing a nested structure that is not known ahead of time) I needed to convert all leaf values from strings to ints/floats where possible, and to consolidate arrays of ODSs into a single array of floats/ints. I'm happy to open a pull request for these functions, but I'm not sure whether you want to support this functionality and, if so, where: as a method of the ODS structure, or as a function in omas_utils?

orso82 · 2019-07-15T22:14:20Z

@Karel-van-de-Plassche can you provide an example to better understand your use case?

Karel-van-de-Plassche · 2019-07-16T06:29:01Z

Come to think of it, maybe it's better to handle this at parse time instead of after the fact. However, first 'dumbly' parsing the file, looking at it in OMFIT, and then converting what was needed was a convenient workflow for me.

The use case was this: I had a namelist with unknown types/fields, e.g.

BoundCondPanel.current.fixedTime                            :⋅
BoundCondPanel.current.option                               : Constant Value
BoundCondPanel.current.source                               : Ex-File
BoundCondPanel.current.tpoly.select[0]                      : false
BoundCondPanel.current.tpoly.select[10]                     : false
BoundCondPanel.current.tpoly.select[11]                     : false

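I then parsed it with every value kept as a string and used OMAS' 'deep' access to fill an ODS, then converted everything to the right type using

import numpy as np
from omas import ODS
from IPython import embed  # only used by the debugging stub in consolidate_arraylike_leafs

def is_int(string):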
    try:
        int(string)
    except ValueError:
        return False
    return True

def is_float(string):
    try:
        float(string)
    except ValueError:
        return False
    return True

def convert_leafs(self):
    for item in self.keys():
        val = self.getraw(item)
        if isinstance(val, ODS):
            val.convert_leafs()
        else:
            if val == 'false':
                self.setraw(item, False)
            elif val == 'true':
                self.setraw(item, True)
            elif is_int(val):
                self.setraw(item, int(val))
            elif is_float(val):
                self.setraw(item, float(val))

Then I had 1D/2D arrays of ODSs, which cannot be plotted nicely in OMFIT, so I converted everything to numpy arrays using

def consolidate_arraylike_leafs(self):
    for item in self.keys():
        val = self.getraw(item)
        if isinstance(val, ODS):
            if len(val.paths()[0]) != 1 or not is_int(val.paths()[0][0]):
                # Is not an array, recurse
                val.consolidate_arraylike_leafs()
            else:
                # Is an array, try to convert
                child = val.getraw(0)
                if isinstance(child, str):
                    # Represent all strings in arrays as a single dtype
                    dtype = 'U128'
                elif isinstance(child, np.ndarray):
                    # Copy the dtype of the first underlying array
                    dtype = child.dtype
                else:
                    # This is probably a singular value, just copy the dtype
                    dtype = type(child)

                if isinstance(child, np.ndarray):
                    # It is already a numpy array, so no conversion is needed; just stack if possible
                    subarrs = list(val.values())
                    for ii in range(len(subarrs)):
                        if subarrs[ii].dtype != dtype:
                            if dtype == np.dtype(float) and subarrs[ii].dtype == np.dtype(int):
                                # The dtype should be float, but for this subarray is int
                                subarrs[ii] = subarrs[ii].astype(float)
                            elif dtype == np.dtype(float) and np.issubdtype(np.dtype(str), subarrs[ii].dtype):
                                # The dtype should be float, but for this subarray is str
                                subarrs[ii] = np.full_like(subarrs[ii], np.nan, dtype=float)
                            elif dtype == np.dtype(int) and np.issubdtype(np.dtype(str), subarrs[ii].dtype):
                                # The dtype should be int, but for this subarray is str
                                print('Converting unicode str to int')
                                print('Define this conversion!')
                                embed()
                            else:
                                # No sane way to define this conversion
                                print('Cannot convert, giving up on {!s}!'.format(val))
                    try:
                        arr = np.vstack(subarrs)
                    except Exception:
                        print('Cannot stack, giving up on {!s}!'.format(val))
                    else:
                        self.setraw(item, arr)
                    continue # No need to convert stuff, continue the loop

                # Deal with missing values: NaN for floats, -99999 for ints (UGLY!)
                if isinstance(child, float):
                    vals = [dtype(v) if is_float(v) else np.nan for v in val.values()]
                elif isinstance(child, int):
                    vals = [dtype(v) if is_int(v) else -99999 for v in val.values()]
                else:
                    vals = val.values()

                try:
                    arr = np.fromiter(vals, dtype, count=len(val))
                except Exception:
                    print('UNEXPECTED CONVERSION ERROR in {!s}'.format(val))
                    arr = "CONVERSION ERROR"

                self.setraw(item, arr)
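
The consolidation idea can also be shown in isolation. The following is only a sketch under simplifying assumptions, not the code above: it works on plain nested dicts rather than ODSs, treats an empty string as a missing value (NaN), and recurses bottom-up so that a 2D array is built in a single pass. The names `is_index_key`, `stack_values` and `consolidate` are hypothetical.

```python
import numpy as np

def is_index_key(key):
    """True for keys that look like array indices (0, '0', '12', ...)."""
    return isinstance(key, int) or (isinstance(key, str) and key.isdigit())

def stack_values(values):
    """Return one numpy array for a homogeneous list of values, or None if unsure."""
    if all(isinstance(v, np.ndarray) for v in values):
        try:
            return np.vstack(values)
        except ValueError:  # ragged sub-arrays cannot be stacked
            return None
    if any(isinstance(v, (dict, np.ndarray)) for v in values):
        return None
    if all(isinstance(v, bool) for v in values):
        return np.array(values, dtype=bool)
    if all(isinstance(v, int) and not isinstance(v, bool) for v in values):
        return np.array(values, dtype=int)
    if all(isinstance(v, str) for v in values):
        return np.array(values)
    if all((isinstance(v, (int, float)) and not isinstance(v, bool)) or v == ''
           for v in values):
        # Mixed numbers and missing entries: promote to float, use NaN for gaps
        return np.array([np.nan if v == '' else v for v in values], dtype=float)
    return None

def consolidate(tree):
    """Replace index-keyed dicts by numpy arrays, children first."""
    for key, val in list(tree.items()):
        if not isinstance(val, dict):
            continue
        consolidate(val)
        if val and all(is_index_key(k) for k in val):
            ordered = [val[k] for k in sorted(val, key=int)]
            arr = stack_values(ordered)
            if arr is not None:
                tree[key] = arr
    return tree

demo = consolidate({
    'select': {'0': False, '1': True, '10': False},
    'value': {'0': {'0': 1, '1': 2.5}, '1': {'0': '', '1': 4.0}},
    'name': 'x',
})
print(demo['value'].shape, demo['select'].dtype)
```

Using NaN for missing floats avoids a sentinel such as -99999; integer arrays have no NaN, which is why this sketch promotes any array with gaps to float.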

However, it might be advisable not to have something like this in OMAS, as it might be better to 'force' people to do these conversions at parse time.

smithsp · 2019-07-16T17:48:53Z

@Karel-van-de-Plassche Across which dimension was your array or list of ODSs? An existing dimension of the underlying IMAS structure, or something like shot or scale length? Perhaps you would appreciate a concatenate functionality, which adds a dimension to the collection of ODSs. OMFIT also has an OMFITcollection, which helps to collect leaves as indicated. However, I tried to put together a sample for OMFITcollection and it failed, so I will open a bug report about that.

orso82 · 2019-07-16T20:44:53Z

@Karel-van-de-Plassche I just pushed a commit that allows passing input_data_process_functions to the omas_environment. See https://github.com/gafusion/omas/blob/master/omas/examples/ods_process_input_data.py
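
To illustrate the general idea of converting data at assignment time rather than afterwards, here is a small sketch that does not use OMAS at all. `ProcessingDict`, `lowercase_bool` and `robust_eval` are hypothetical names for this example; the linked example file is the authoritative reference for how `input_data_process_functions` is actually used.

```python
import ast

def lowercase_bool(value):
    """Map the namelist-style strings 'true'/'false' to Python booleans."""
    if value == 'true':
        return True
    if value == 'false':
        return False
    return value

def robust_eval(value):
    """Turn strings such as '1' or '1.5' into Python objects; leave the rest alone."""
    if not isinstance(value, str):
        return value
    try:
        return ast.literal_eval(value)
    except (ValueError, SyntaxError):
        return value

class ProcessingDict(dict):
    """Hypothetical container that runs every assigned value through a chain of functions."""

    def __init__(self, process_functions=()):
        super().__init__()
        self.process_functions = list(process_functions)

    def __setitem__(self, key, value):
        for func in self.process_functions:
            value = func(value)
        super().__setitem__(key, value)

data = ProcessingDict([lowercase_bool, robust_eval])
data['order'] = '3'
data['scale'] = '1.5e3'
data['select'] = 'false'
data['option'] = 'Constant Value'
print(dict(data))
```

With this approach the tree never contains unconverted strings, so no second conversion pass is needed.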

orso82 · 2019-07-16T20:56:29Z

@Karel-van-de-Plassche your arrayification use case also prompted me to add this example:
https://github.com/gafusion/omas/blob/master/omas/examples/across_ODSs.py
It is not exactly what you want, but may still be of interest
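
For readers without the repository at hand, the concept of collecting one quantity across several structures can be sketched with plain dicts and numpy. This is not the OMAS implementation; `get_path`, `across`, the `shots` list and the `profiles.te` path are all made up for illustration.

```python
import numpy as np

def get_path(tree, path):
    """Follow a dot-separated path through nested dicts."""
    node = tree
    for key in path.split('.'):
        node = node[key]
    return node

def across(trees, path):
    """Stack the leaf found at `path` in every tree along a new leading axis."""
    return np.stack([np.asarray(get_path(tree, path)) for tree in trees])

# Three hypothetical structures (e.g. one per shot) with the same layout
shots = [
    {'profiles': {'te': [1.0, 2.0, 3.0]}},
    {'profiles': {'te': [1.5, 2.5, 3.5]}},
    {'profiles': {'te': [2.0, 3.0, 4.0]}},
]
te = across(shots, 'profiles.te')
print(te.shape)
```

`np.stack` raises a `ValueError` when the leaves do not all have the same shape, which is usually the desired behaviour for this kind of collection.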

Karel-van-de-Plassche · 2019-07-17T10:06:40Z

Thanks for your additions and suggestions! What I was trying to achieve is to change something that was naively parsed, which resulted in something like this:

which is converted by running the consolidation script twice to (where value is now a 2D numpy array):

Find attached the OMFIT tree (sorry, just started learning, is this the way to share?)
JETTO.zip

orso82 · 2019-07-18T23:57:15Z

@Karel-van-de-Plassche it looks like you are working on a new JETTO module?
That would be nice 👍

Was this the JETTO input or output files that you were trying to parse?
Depending on what you are trying to achieve you may want to consider creating some dedicated OMFIT classes to handle the JETTO specific files. You may inherit from the ODS class, but this may or may not be the best approach...

Karel-van-de-Plassche · 2019-07-19T07:17:15Z

For now I was mostly looking around at what was available. I am looking at what kind of Python scripts for JETTO are lying around, and was giving my own parser a small upgrade. I like the recursive tree and path-based access from ODS, but I'm looking for something a bit more standalone than OMFIT right now. I put it in OMFIT as I like the tree display, and it never hurts to keep future extensibility in mind.

This was a JAMS input file (which then generates the JETTO input files via a GUI).

orso82 · 2019-07-19T15:43:14Z

@Karel-van-de-Plassche sounds good. I'll point out that the OMFIT classes can be imported without running OMFIT itself, and also that the framework does not need to be run from a GUI.

At any rate, any python parser/writer is welcome, and if it ever happens that someone else wants to write a JETTO module we'll make sure to put them in contact with you :)

smithsp mentioned this issue Jul 16, 2019

ods_sample fails for 0.44.1 #75

Closed

orso82 added a commit that referenced this issue Jul 16, 2019

Added input_data_process_functions as omas_environment

36f1a52

mention #74

orso82 added a commit that referenced this issue Jul 16, 2019

Added example on how to use ODSs for traversing multiple ODSs

b5e6ba5

mention #74

orso82 closed this as completed Nov 18, 2019

Karel-van-de-Plassche mentioned this issue Nov 26, 2019

Conversion between tabular/tensor-like data and nested tree data #78

Closed
