### a ytBaseModel pydantic class experiment

This notebook subclasses pydantic's `BaseModel` to create an abstract `ytBaseModel` class that includes the machinery for executing the corresponding yt methods.

The `ytBaseModel` class:
* uses `inspect.getfullargspec` within `ytBaseModel._run()` to retrieve the expected argument order of the yt method, then calls the yt method using the values stored in the `ytBaseModel` attributes.
* checks whether any of the args being passed to the yt call are themselves `ytBaseModel` instances, in which case `ytBaseModel._run()` gets called for that argument.
* uses a protected dictionary attribute, `_arg_mapping`, to map any argument names we have changed between yt's internal calls and the pydantic class: `_arg_mapping['yt_name'] -> 'schema_name'`.
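
As a rough illustration of the three bullets above, here is a stdlib-only sketch of how `getfullargspec` exposes the positional argument order and defaults, how a name mapping translates between the two naming schemes, and how nested runnable values get resolved first. `fake_slice_plot`, `Nested`, `build_args` and `schema_values` are hypothetical stand-ins for illustration; none of them come from yt or from the class below.

```python
from inspect import getfullargspec


def fake_slice_plot(ds, normal=None, fields=None, axis=None):
    # hypothetical stand-in for a yt callable; not the real yt signature
    return (ds, normal, fields, axis)


class Nested:
    # hypothetical stand-in for a nested ytBaseModel with its own _run()
    def __init__(self, result):
        self.result = result

    def _run(self):
        return self.result


def build_args(func, values, arg_mapping):
    """Order `values` (keyed by schema name) to match func's positional args."""
    spec = getfullargspec(func)
    defaults = spec.defaults or ()
    # index of the first argument that carries a default value
    first_default = len(spec.args) - len(defaults)

    ordered = []
    for i, yt_name in enumerate(spec.args):
        schema_name = arg_mapping.get(yt_name, yt_name)
        if schema_name in values:
            value = values[schema_name]
        elif i >= first_default:
            # not exposed in the schema: fall back to the function's own default
            value = defaults[i - first_default]
        else:
            raise AttributeError(f"no value for required argument '{yt_name}'")
        if isinstance(value, Nested):
            # resolve nested runnable values before passing them on
            value = value._run()
        ordered.append(value)
    return ordered


schema_values = {
    "ds": Nested("loaded_dataset"),
    "normal": "x",
    "field": ("gas", "density"),
}
args = build_args(fake_slice_plot, schema_values, {"fields": "field"})
print(args)
```

Computing `first_default` from an empty tuple when there are no defaults avoids the special-case branch, since the index then equals the number of arguments and is never reached.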

So here's the base class:

In [18]:
from pydantic import BaseModel
from inspect import getfullargspec


class ytBaseModel(BaseModel):
    _arg_mapping: dict = {}  # mapping from internal yt name to schema name

    def _run(self):
        # this method actually executes the yt code 
        
        # first make sure yt is imported and then get our function handle. This assumes
        # that our class name exists in yt's top level api.
        import yt
        func = getattr(yt, type(self).__name__)
        print(f"pulled func {func}")

        # now we get the arguments for the function:
        # func_spec.args, which lists the named arguments and keyword arguments.
        # ignoring vargs and kw-only args for now...
        # see https://docs.python.org/3/library/inspect.html#inspect.getfullargspec
        func_spec = getfullargspec(func)

        # the list that we'll use to eventually call our function
        the_args = []

        # the argument position number at which we have default values (a little hacky, should
        # be a better way to do this, and not sure how to scale it to include *args and **kwargs)
        n_args = len(func_spec.args)  # number of arguments
        if func_spec.defaults is None:
            # no default args, make sure we never get there...
            named_kw_start_at = n_args + 1
        else:
            # the position at which named keyword args start
            named_kw_start_at = n_args - len(func_spec.defaults)
        print(f"keywords start at {named_kw_start_at}")

        # loop over the call signature arguments and pull out values from our pydantic class .
        # this is recursive! will call _run() if a given argument value is also a ytBaseModel.
        for arg_i, arg in enumerate(func_spec.args):
            # check if we've remapped the yt internal argument name for the schema
            if arg in self._arg_mapping:
                arg = self._arg_mapping[arg]

            # get the value for this argument. If it's not there, attempt to set default values 
            # for arguments needed for yt but not exposed in our pydantic class
            try:
                arg_value = getattr(self, arg)
            except AttributeError:
                if arg_i >= named_kw_start_at:
                    # we are in the named keyword arguments, grab the default
                    # the func_spec.defaults tuple 0 index is the first named
                    # argument, so need to offset the arg_i counter
                    default_index = arg_i - named_kw_start_at
                    arg_value = func_spec.defaults[default_index]
                else:
                    raise AttributeError(f"no value found for required argument '{arg}'")

            # check if this argument is itself a ytBaseModel for which we need to run
            # this should make this a fully recursive function?
            # if hasattr(arg_value,'_run'):
            if isinstance(arg_value, ytBaseModel):
                print(f"{arg_value} is a ytBaseModel, calling {arg_value}._run() now...")
                arg_value = arg_value._run()

            the_args.append(arg_value)
        print(the_args)
        return func(*the_args)

Now we'll create two new classes for `load` and `SlicePlot`:

In [19]:
class load(ytBaseModel):
    filename: str
    _arg_mapping: dict = {"fn": "filename"}

class SlicePlot(ytBaseModel):
    ds: load = None
    normal: str = 'x'
    field: tuple = ('all', 'Density')
    _arg_mapping: dict = {"fields": "field"}

now let's instantiate some classes:

In [56]:
ds = load(filename="IsolatedGalaxy/galaxy0030/galaxy0030")
slc = SlicePlot(ds=ds, normal='x', field=("PartType0", "Density"))

so these objects are normal pydantic classes:

In [57]:
ds.schema()

{'title': 'load',
 'type': 'object',
 'properties': {'filename': {'title': 'Filename', 'type': 'string'}},
 'required': ['filename']}

In [58]:
slc.schema()

{'title': 'SlicePlot',
 'type': 'object',
 'properties': {'ds': {'$ref': '#/definitions/load'},
  'normal': {'title': 'Normal', 'default': 'x', 'type': 'string'},
  'field': {'title': 'Field',
   'default': ('all', 'Density'),
   'type': 'array',
   'items': {}}},
 'definitions': {'load': {'title': 'load',
   'type': 'object',
   'properties': {'filename': {'title': 'Filename', 'type': 'string'}},
   'required': ['filename']}}}

but now we can use `._run()` to execute!

In [59]:
slc._run()

pulled func <function SlicePlot at 0x119707488>
keywords start at 1
filename='IsolatedGalaxy/galaxy0030/galaxy0030' is a ytBaseModel, calling filename='IsolatedGalaxy/galaxy0030/galaxy0030'._run() now...
pulled func <function load at 0x118f006a8>
keywords start at 1
[]


yt : [ERROR    ] 2021-04-16 13:49:43,668 None of the arguments provided to load() is a valid file
yt : [ERROR    ] 2021-04-16 13:49:43,669 Please check that you have used a correct path


YTOutputNotIdentified: Supplied () {}, but could not load!
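
The empty `[]` printed just before the error is the argument list handed to `load`. A likely explanation, assuming the installed yt defines `load` with only `*args, **kwargs` (the output above is consistent with that, but it isn't verified here): `getfullargspec(...).args` lists only named positional parameters, so the loop has nothing to iterate over and `filename` is never picked up. The sketch below reproduces the behaviour with a hypothetical stand-in function, `fake_load`.

```python
from inspect import getfullargspec


def fake_load(*args, **kwargs):
    # hypothetical stand-in: accepts anything, but names no positional parameters
    return args, kwargs


spec = getfullargspec(fake_load)
print(spec.args)      # named positional parameters: none here
print(spec.varargs)   # name of the *args parameter
print(spec.defaults)  # None, so the counter below becomes 0 + 1

# same bookkeeping as in ytBaseModel._run()
n_args = len(spec.args)
if spec.defaults is None:
    named_kw_start_at = n_args + 1
else:
    named_kw_start_at = n_args - len(spec.defaults)
print(f"keywords start at {named_kw_start_at}")
```

Handling callables like this would need the `varargs` and `varkw` entries of the spec as well, which the `_run()` method currently ignores.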

In [7]:
from pydantic import BaseModel
from inspect import getfullargspec
from typing import Optional


class ytBaseModel(BaseModel):
    _arg_mapping: dict = {}  # mapping from internal yt name to schema name
    _yt_operation: Optional[str]
        
    def _run(self):
        # this method actually executes the yt code 
        
        # first make sure yt is imported and then get our function handle. This assumes
        # that our class name exists in yt's top level api.
        import yt
        # fall back to the class name when no explicit _yt_operation is set
        funcname = getattr(self, "_yt_operation", type(self).__name__)
        print(funcname)
        func = getattr(yt, funcname)
        print(f"pulled func {func}")

        # now we get the arguments for the function:
        # func_spec.args, which lists the named arguments and keyword arguments.
        # ignoring vargs and kw-only args for now...
        # see https://docs.python.org/3/library/inspect.html#inspect.getfullargspec
        func_spec = getfullargspec(func)

        # the list that we'll use to eventually call our function
        the_args = []

        # the argument position number at which we have default values (a little hacky, should
        # be a better way to do this, and not sure how to scale it to include *args and **kwargs)
        n_args = len(func_spec.args)  # number of arguments
        if func_spec.defaults is None:
            # no default args, make sure we never get there...
            named_kw_start_at = n_args + 1
        else:
            # the position at which named keyword args start
            named_kw_start_at = n_args - len(func_spec.defaults)
        print(f"keywords start at {named_kw_start_at}")

        # loop over the call signature arguments and pull out values from our pydantic class .
        # this is recursive! will call _run() if a given argument value is also a ytBaseModel.
        for arg_i, arg in enumerate(func_spec.args):
            # check if we've remapped the yt internal argument name for the schema
            if arg in self._arg_mapping:
                arg = self._arg_mapping[arg]

            # get the value for this argument. If it's not there, attempt to set default values 
            # for arguments needed for yt but not exposed in our pydantic class
            print(arg)
            try:
                arg_value = getattr(self, arg)
            except AttributeError:
                if arg_i >= named_kw_start_at:
                    # we are in the named keyword arguments, grab the default
                    # the func_spec.defaults tuple 0 index is the first named
                    # argument, so need to offset the arg_i counter
                    default_index = arg_i - named_kw_start_at
                    arg_value = func_spec.defaults[default_index]
                else:
                    raise AttributeError(f"no value found for required argument '{arg}'")

            # check if this argument is itself a ytBaseModel for which we need to run
            # this should make this a fully recursive function?
            # if hasattr(arg_value,'_run'):
            if isinstance(arg_value, (ytBaseModel, ytParameter)):
                print(f"{arg_value} is a {type(arg_value)}, calling {arg_value}._run() now...")
                arg_value = arg_value._run()

            the_args.append(arg_value)
        print(the_args)
        return func(*the_args) 
    
class ytParameter(BaseModel):    
    _skip_these = ['comments']
    
    def _run(self):
        p = [getattr(self,key) for key in self.schema()['properties'].keys() if key not in self._skip_these]
        if len(p) != 1:
            raise ValueError("whoops. ytParameter instances must have exactly one value")
        return p[0]


In [60]:
class Dataset(ytBaseModel):
    """ 
    The dataset model to load and that will be drawn from for other classes. Filename is the only required field. 
    """
    filename: str
    name: str = "Data for Science"
    comments: Optional[str] 
    grammar: str = "registration"
    _yt_operation: str = "load"
    _arg_mapping: dict = {'fn' : 'filename'}

class ytModel(ytBaseModel):
    '''
    An example for a yt analysis schema using Pydantic
    '''
    Load: Dataset

    class Config:
        title = 'yt example'
        underscore_attrs_are_private = True
        
    def _run(self):
        # for the top level model, we override this. Nested objects will still be recursive!
        att = getattr(self, "Load")
        return att._run()

In [65]:
validated_json = {'Load': {"filename": "IsolatedGalaxy/galaxy0030/galaxy0030"}}

In [68]:
yt_mod = ytModel(Load = validated_json["Load"])

In [69]:
yt_mod

ytModel(Load=Dataset(filename='IsolatedGalaxy/galaxy0030/galaxy0030', name='Data for Science', comments=None, grammar='registration'))

In [70]:
ds = yt_mod._run()

AttributeError: module 'yt' has no attribute 'Dataset'
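
This error means the lookup fell back to the class name `Dataset` instead of using `_yt_operation = "load"`. One possible cause, judging only from the cell counters: the `In [18]` definition of `ytBaseModel` (which always uses `type(self).__name__`) was executed after the `In [7]` one, so it may have been the active base class when `Dataset` was defined. The stdlib-only sketch below shows the intended fallback with hypothetical plain classes (no pydantic, no yt); `resolve_yt_name` is an illustrative helper, not part of the notebook.

```python
class Base:
    def resolve_yt_name(self):
        # use _yt_operation when the class defines it, else the class name
        return getattr(self, "_yt_operation", type(self).__name__)


class Dataset(Base):
    # explicit override: this class should map to yt's `load`
    _yt_operation = "load"


class SlicePlot(Base):
    # no override: the class name itself is used
    pass


print(Dataset().resolve_yt_name())
print(SlicePlot().resolve_yt_name())
```

With this fallback in place, `Dataset` resolves to `load` while classes named after their yt counterpart need no extra attribute.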