ODS leaf type consolidation and arrayification #74

Closed
Karel-van-de-Plassche opened this issue Jul 15, 2019 · 9 comments

Comments

@Karel-van-de-Plassche
Contributor

For a non-standard use case (parsing a nested structure whose layout is not known ahead of time) I needed to convert all leaf values from strings to ints/floats where possible, and to consolidate arrays of ODSs into a single array of floats/ints. I'm happy to open a pull request for these functions, but I'm not sure whether you want to support this functionality and, if so, where: as a method of the ODS structure, or as a function in omas_utils?

@orso82
Member

orso82 commented Jul 15, 2019

@Karel-van-de-Plassche can you provide an example to better understand your use case?

@Karel-van-de-Plassche
Contributor Author

Come to think of it, maybe it's better to handle this at parse time instead of after the fact. However, first 'dumbly' parsing the file, looking at it in OMFIT, and then converting what was needed was a convenient workflow for me.

The use case was this: I had a namelist with unknown types/fields, e.g.

BoundCondPanel.current.fixedTime                            :⋅
BoundCondPanel.current.option                               : Constant Value
BoundCondPanel.current.source                               : Ex-File
BoundCondPanel.current.tpoly.select[0]                      : false
BoundCondPanel.current.tpoly.select[10]                     : false
BoundCondPanel.current.tpoly.select[11]                     : false
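
As a rough illustration of the "parse everything as a string" step, here is a minimal sketch that splits lines of this shape into a path and a value and fills a plain nested dict. It deliberately does not use OMAS; `split_path` and `parse_lines` are hypothetical names made up for this example, and the assumption is that the first `:` on a line separates path from value.

```python
import re

# Hypothetical helper: matches either a name segment or a bracketed integer index.
_TOKEN = re.compile(r"([^.\[\]]+)|\[(\d+)\]")

def split_path(path):
    """Split 'a.b[3].c' into ['a', 'b', 3, 'c'] (indices become ints)."""
    keys = []
    for name, index in _TOKEN.findall(path):
        keys.append(int(index) if index else name)
    return keys

def parse_lines(lines):
    """Parse 'path : value' lines into nested dicts, keeping every value a string."""
    tree = {}
    for line in lines:
        if ':' not in line:
            continue  # not a 'path : value' line
        path, _, value = line.partition(':')
        keys = split_path(path.strip())
        if not keys:
            continue  # nothing usable before the colon
        node = tree
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        node[keys[-1]] = value.strip()
    return tree

example = [
    "BoundCondPanel.current.option                               : Constant Value",
    "BoundCondPanel.current.tpoly.select[0]                      : false",
    "BoundCondPanel.current.tpoly.select[10]                     : false",
]
tree = parse_lines(example)
```

With this sketch, `tree['BoundCondPanel']['current']['tpoly']['select']` is a dict keyed by the integers 0 and 10, which is the kind of integer-keyed structure the later consolidation step looks for.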

I then parsed it with every value kept as a string, used OMAS' 'deep' access to fill an ODS, and converted everything to the right type using

import numpy as np
from omas import ODS

def is_int(string):
    try:
        int(string)
    except ValueError:
        return False
    return True

def is_float(string):
    try:
        float(string)
    except ValueError:
        return False
    return True

def convert_leafs(self):
    for item in self.keys():
        val = self.getraw(item)
        if isinstance(val, ODS):
            val.convert_leafs()
        else:
            if val == 'false':
                self.setraw(item, False)
            elif val == 'true':
                self.setraw(item, True)
            elif is_int(val):
                self.setraw(item, int(val))
            elif is_float(val):
                self.setraw(item, float(val))
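
For readers who want to try the leaf conversion without an ODS, the same idea can be sketched on plain nested dicts. This is an illustration under that assumption, not the method above; `convert_scalar` and `convert_leaves` are hypothetical names, and only the exact strings 'true' and 'false' are treated as booleans, as in the original.

```python
def convert_scalar(text):
    """Return a bool, int or float if the string parses as one, else the input unchanged."""
    if not isinstance(text, str):
        return text  # already converted, or not a string leaf
    if text == 'true':
        return True
    if text == 'false':
        return False
    # Try int before float so that '3' stays an integer.
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text

def convert_leaves(tree):
    """Recursively convert the string leaves of a nested dict, in place."""
    for key, value in list(tree.items()):
        if isinstance(value, dict):
            convert_leaves(value)
        else:
            tree[key] = convert_scalar(value)
    return tree

converted = convert_leaves({'a': {'n': '3', 'x': '2.5', 'flag': 'true', 'src': 'Ex-File'}})
```

Note that `float` also accepts strings such as 'nan' and 'inf', so those become floats too; whether that is desirable depends on the input file.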

Then I had 1D/2D arrays of ODSs, which cannot be plotted nicely in OMFIT, so I converted everything to numpy arrays using

def consolidate_arraylike_leafs(self):
    for item in self.keys():
        val = self.getraw(item)
        if isinstance(val, ODS):
            if len(val.paths()[0]) != 1 or not is_int(val.paths()[0][0]):
                # Is not an array, recurse
                val.consolidate_arraylike_leafs()
            else:
                # Is an array, try to convert
                child = val.getraw(0)
                if isinstance(child, str):
                    # Represent all strings in arrays as a single dtype
                    dtype = 'U128'
                elif isinstance(child, np.ndarray):
                    # Copy the dtype of the first underlying array
                    dtype = child.dtype
                else:
                    # This is probably a singular value, just copy the dtype
                    dtype = type(child)

                if isinstance(child, np.ndarray):
                    # If it is already a numpy array we don't need to convert, just stack if possible
                    subarrs = val.values()
                    for ii in range(len(subarrs)):
                        if subarrs[ii].dtype != dtype:
                            if dtype == np.dtype(float) and subarrs[ii].dtype == np.dtype(int):
                                # The dtype should be float, but for this subarray is int
                                subarrs[ii] = subarrs[ii].astype(float)
                            elif dtype == np.dtype(float) and np.issubdtype(np.dtype(str), subarrs[ii].dtype):
                                # The dtype should be float, but for this subarray is str
                                subarrs[ii] = np.full_like(subarrs[ii], np.nan, dtype=float)
                            elif dtype == np.dtype(int) and np.issubdtype(np.dtype(str), subarrs[ii].dtype):
                                # The dtype should be int, but for this subarray is str
                                print('Converting unicode str to int')
                                print('Define this conversion!')
                                # Debugging hook: drop into an interactive IPython shell
                                from IPython import embed
                                embed()
                            else:
                                # No sane way to define this conversion
                                print('Cannot convert, giving up on {!s}!'.format(val))
                    try:
                        arr = np.vstack(subarrs)
                    except Exception:
                        print('Cannot stack, giving up on {!s}!'.format(val))
                    else:
                        self.setraw(item, arr)
                    continue # No need to convert stuff, continue the loop

                # Deal with missing values: NaN for floats, -99999 for ints (UGLY!)
                if isinstance(child, float):
                    vals = [dtype(v) if is_float(v) else np.nan for v in val.values()]
                elif isinstance(child, int):
                    vals = [dtype(v) if is_int(v) else -99999 for v in val.values()]
                else:
                    vals = val.values()

                try:
                    arr = np.fromiter(vals, dtype, count=len(val))
                except Exception:
                    print('UNEXPECTED CONVERSION ERROR in {!s}'.format(val))
                    arr = "CONVERSION ERROR"

                self.setraw(item, arr)
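
The core of the consolidation idea can also be sketched on plain nested dicts: any dict whose keys are exactly 0..n-1 is treated as an array and replaced by a NumPy array. This is a simplified illustration, not a drop-in replacement for the function above; `consolidate` is a hypothetical name, and missing-value handling (NaN or a sentinel) is left out on purpose.

```python
import numpy as np

def consolidate(tree):
    """Replace dicts whose keys are exactly 0..n-1 by NumPy arrays, in place."""
    for key, value in list(tree.items()):
        if not isinstance(value, dict):
            continue
        # Handle children first, so nested index levels can stack into 2D.
        consolidate(value)
        n = len(value)
        if n == 0 or set(value) != set(range(n)):
            continue  # keys are not a gap-free integer range: not array-like
        items = [value[i] for i in range(n)]
        if any(isinstance(item, dict) for item in items):
            continue  # elements still hold structure, leave them alone
        try:
            arr = np.array(items)
        except ValueError:
            continue  # ragged input on recent NumPy versions
        if arr.dtype == object:
            continue  # ragged or mixed input on older NumPy versions
        tree[key] = arr
    return tree

result = consolidate({
    'a': {0: 1, 1: 2.5},
    'b': {'x': 'y'},
    'm': {0: {0: 1, 1: 2}, 1: {0: 3, 1: 4}},
})
```

Because children are processed before their parent, `result['m']` comes out as a 2D array in a single pass in this sketch. One caveat: `np.array` silently promotes mixed numbers and strings to a string dtype, so real input may need an explicit dtype check like the one in the function above.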

However, it might be advisable not to have something like this in OMAS, as it might be better to 'force' people to do these conversions at parse time.

@smithsp
Member

smithsp commented Jul 16, 2019

@Karel-van-de-Plassche Across which dimension was your array or list of ODSs? An existing dimension of the underlying IMAS structure, or something like shot or scale length? Perhaps you would appreciate a concatenate functionality, which adds a dimension to the collection of ODSs. OMFIT also has an OMFITcollection, which helps to collect leaves as indicated. However, when I tried to put together a sample for OMFITcollection it failed, so I will open a bug report about that.

@orso82
Member

orso82 commented Jul 16, 2019

@Karel-van-de-Plassche I just pushed a commit to allow passing an input_data_process_functions within the omas_environment. See https://github.com/gafusion/omas/blob/master/omas/examples/ods_process_input_data.py
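
The general idea of running values through processing functions at assignment time, rather than converting after the fact, can be sketched independently of OMAS. The code below is not the OMAS API and does not reproduce the linked example; `ProcessingDict` and `robust_eval` are hypothetical names, and the sketch assumes that `ast.literal_eval` is an acceptable way to recognise numbers.

```python
import ast

def robust_eval(value):
    """Hypothetical processing function: literal-evaluate strings, pass anything else through."""
    if not isinstance(value, str):
        return value
    try:
        return ast.literal_eval(value)
    except (ValueError, SyntaxError):
        return value  # not a Python literal, keep the original string

class ProcessingDict(dict):
    """Hypothetical dict that runs every assigned value through a list of functions."""

    def __init__(self, process_functions=()):
        super().__init__()
        self.process_functions = list(process_functions)

    def __setitem__(self, key, value):
        for func in self.process_functions:
            value = func(value)
        super().__setitem__(key, value)

data = ProcessingDict([robust_eval])
data['n'] = '3'
data['x'] = '2.5'
data['s'] = 'Ex-File'
```

Here `data['n']` is stored as an int and `data['x']` as a float, while 'Ex-File' stays a string because it is not a valid Python literal. Lowercase 'true' and 'false' are also left as strings by `ast.literal_eval`, so they would need a separate processing function.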

@orso82
Member

orso82 commented Jul 16, 2019

@Karel-van-de-Plassche your arrayification use case also prompted me to add this example
https://github.com/gafusion/omas/blob/master/omas/examples/across_ODSs.py
It is not exactly what you want, but may still be of interest

@Karel-van-de-Plassche
Contributor Author

Thanks for your additions and suggestions! What I was trying to achieve is to change something that was naively parsed, which resulted in something like this:

image

which is converted by running the consolidation script twice to (where value is now a 2D numpy array):

image

Find attached the OMFIT tree (sorry, just started learning, is this the way to share?)
JETTO.zip

@orso82
Member

orso82 commented Jul 18, 2019

@Karel-van-de-Plassche it looks like you are working on a new JETTO module?
That would be nice 👍

Was this the JETTO input or output files that you were trying to parse?
Depending on what you are trying to achieve you may want to consider creating some dedicated OMFIT classes to handle the JETTO specific files. You may inherit from the ODS class, but this may or may not be the best approach...

@Karel-van-de-Plassche
Contributor Author

Karel-van-de-Plassche commented Jul 19, 2019

For now I was mostly looking around at what is available. I am looking at what kind of Python scripts for JETTO are lying around, and was giving my own parser a small upgrade. I like the recursive tree and path-based access from ODS, but I'm looking for something a bit more standalone than OMFIT right now. I put it in OMFIT as I like the tree display, and it never hurts to keep future extensibility in mind.

This was a JAMS input file (which then generates the JETTO input files via a GUI).

@orso82
Member

orso82 commented Jul 19, 2019

@Karel-van-de-Plassche sounds good. I'll point out that the OMFIT classes can be imported without running OMFIT itself, and also that the framework does not need to be run from a GUI.

At any rate, any Python parser/writer is welcome, and if it ever happens that someone else wants to write a JETTO module we'll make sure to put them in contact with you :)
