xarray.core.variable.as_variable() part of the public API? #1303

Closed
benbovy opened this issue Mar 9, 2017 · 5 comments · Fixed by #1422

Comments

@benbovy
Member

benbovy commented Mar 9, 2017

Is it safe to use xarray.core.variable.as_variable() externally? I guess that currently it is not.

I have a specific use case where this would be very useful.

I'm working on a package that heavily uses and extends xarray for landscape evolution modeling, and inside a custom class for model parameters I want to be able to create xarray.Variable objects on the fly from any provided object, e.g., a scalar value, an array-like, a (dims, data[, attrs]) tuple, another xarray.Variable, an xarray.DataArray... exactly what xarray.core.variable.as_variable() does.

Although I know that Variable objects are not needed in most use cases, in this specific case a clean solution would be the following:

import xarray as xr

class Parameter(object):

    def to_variable(self, obj):
        var = xr.as_variable(obj)
        # ... some validation logic on, e.g., data type, value bounds, dimensions...
        # ... add default attributes to the created variable (e.g., units, description...)
        return var

I don't think it is a viable option to copy as_variable() and all its dependent code in my package as it seems to have quite a lot of logic implemented.

A workaround using only public API would be something like:

class Parameter(object):

    def to_variable(self, obj):
        return xr.Dataset(data_vars={'v': obj}).variables['v']

but it feels a bit hacky.
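For reference, here is a minimal sketch of the conversion discussed above, assuming an xarray version that exposes as_variable at the top level as xr.as_variable (it was not public when this issue was opened). The sample values and variable names are illustrative only.

```python
import numpy as np
import xarray as xr

# A scalar becomes a 0-dimensional Variable.
scalar_var = xr.as_variable(0.5)

# A (dims, data[, attrs]) tuple is unpacked into a Variable.
tuple_var = xr.as_variable(("time", [7e-5, 8e-5, 1e-4], {"units": "m**2/y"}))

# A DataArray yields its underlying Variable.
from_dataarray = xr.as_variable(xr.DataArray(np.arange(3), dims="x"))

# An existing Variable is accepted as well.
from_variable = xr.as_variable(tuple_var)

# The public-API workaround from above, for comparison.
workaround = xr.Dataset(data_vars={"v": ("time", [7e-5, 8e-5, 1e-4])}).variables["v"]

print(scalar_var.dims, tuple_var.dims, from_dataarray.dims, workaround.dims)
```

Both routes give a Variable with the same dimensions for the tuple input. The direct call simply avoids building a throwaway Dataset.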

@rabernat
Contributor

rabernat commented Mar 9, 2017 via email

@benbovy
Member Author

benbovy commented Mar 9, 2017

Hmm, I'm not sure a custom datastore would be the most appropriate for my problem, which is not actually about loading data of a given format into a Dataset (or writing it out from one).

Ultimately, what I'm trying to do is implement an API for creating numerical model components, heavily inspired by the Django ORM (i.e., how models and fields are defined), that uses the xarray data structures as an interface.

Let me show a more detailed (though still very incomplete) example of what it would look like:

class Parameter(object):
    def __init__(self, default=None, allowed_values=None, exclude_dims=None,
                 include_dims=None, bounds=(None, None), attrs=None):
        # ...

    def to_variable(self, obj):
        var = xr.as_variable(obj)
        # ... some validation logic
        return var


class Model(object):
    def __init__(self, **params):
        # ...

    def to_dataset(self):
        # ...

    def run(self, ds):
        # ... run simulation for this model


class StreamPower(Model):
    k = Parameter(default=7e-5, bounds=(0., None), attrs={'units': 'm**2/y'})
    m = Parameter(default=0.4)
    n = Parameter(default=1)

    class Meta:
        short_name = 'spow'

>>> model = StreamPower(k=('time', [7e-5, 8e-5, 10e-5]), m=0.5)
>>> ds = model.to_dataset()
>>> ds
<xarray.Dataset>
Dimensions:  (time: 3)
Dimensions without coordinates: time
Data variables:
    spow__k  (time) float64 7e-05 8e-05 0.0001
    spow__m  float64 0.5
    spow__n  int64 1
>>> ds.spow__k
<xarray.DataArray 'spow__k' (time: 3)>
array([  7.000000e-05,   8.000000e-05,   1.000000e-04])
Dimensions without coordinates: time
Attributes:
    units: m**2/y
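
The declarative pattern above could be wired up roughly as follows. This is only a sketch in plain Python, not the package's actual implementation: it handles scalar values only, returns a flat dict instead of a Dataset, and the helper names declared_parameters, validate and to_dict are hypothetical.

```python
class Parameter:
    """Hypothetical field-like object holding a default and optional bounds."""

    def __init__(self, default=None, bounds=(None, None), attrs=None):
        self.default = default
        self.bounds = bounds
        self.attrs = dict(attrs or {})
        self.name = None

    def __set_name__(self, owner, name):
        # Python calls this when the owning class body is evaluated (3.6+).
        self.name = name

    def validate(self, value):
        # Scalar-only bounds check; array inputs are out of scope here.
        lower, upper = self.bounds
        if lower is not None and value < lower:
            raise ValueError(f"{self.name}={value!r} is below {lower!r}")
        if upper is not None and value > upper:
            raise ValueError(f"{self.name}={value!r} is above {upper!r}")
        return value


class Model:
    def __init__(self, **params):
        declared = self.declared_parameters()
        unknown = set(params) - set(declared)
        if unknown:
            raise TypeError(f"unknown parameters: {sorted(unknown)}")
        # Fall back to each parameter's default when no value is given.
        self.values = {
            name: param.validate(params.get(name, param.default))
            for name, param in declared.items()
        }

    @classmethod
    def declared_parameters(cls):
        # Collect Parameter attributes from base classes first, then subclasses.
        found = {}
        for klass in reversed(cls.__mro__):
            for name, attr in vars(klass).items():
                if isinstance(attr, Parameter):
                    found[name] = attr
        return found

    def to_dict(self):
        # Prefix each name with Meta.short_name, as in the output shown above.
        prefix = self.Meta.short_name
        return {f"{prefix}__{name}": value for name, value in self.values.items()}


class StreamPower(Model):
    k = Parameter(default=7e-5, bounds=(0.0, None), attrs={"units": "m**2/y"})
    m = Parameter(default=0.4)
    n = Parameter(default=1)

    class Meta:
        short_name = "spow"


model = StreamPower(k=8e-5, m=0.5)
flat = model.to_dict()
print(flat)
```

Replacing the dict with a Dataset would mean converting each value to a Variable in Parameter, which is where as_variable() comes in.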

@rabernat
Contributor

rabernat commented Mar 9, 2017

Just wanted to link to a somewhat related discussion happening in brian-rose/climlab#50.

@benbovy
Member Author

benbovy commented Mar 9, 2017

Thanks for the link @rabernat!

climlab looks very nice and actually very close to what I'm trying to do with landscape evolution modeling (just prototyping at the moment, no repository on GH yet).

What I want to do is something that is a bit more deeply integrated with xarray than what you suggest for climlab. The basic idea is that a parameter of a model (i.e., a climlab.Process) can be one of the following: a scalar; an array-like with space and/or time dimensions and/or even its own dimension (useful for exploring a parameter space); or a function (callable) of other model parameters/inputs/state variables. In the last case the parameter is also an output, which inherits the dimensions of the other variables used for its computation. I think that a lot of the logic implemented in xarray can be reused for handling this.
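
As a rough illustration of the "callable parameter" idea, the sketch below resolves parameters that are either plain values or functions of other parameters. It is plain Python under stated assumptions: resolve_parameters and the ratio parameter are hypothetical names, dependencies are matched by argument name, and dimension handling (which xarray's broadcasting would take care of) is not shown.

```python
import inspect


def resolve_parameters(params):
    """Evaluate callable parameters once the values they depend on are known."""
    resolved = {k: v for k, v in params.items() if not callable(v)}
    pending = {k: v for k, v in params.items() if callable(v)}
    while pending:
        progressed = False
        for name, func in list(pending.items()):
            # Argument names identify which other parameters are needed.
            needed = list(inspect.signature(func).parameters)
            if all(arg in resolved for arg in needed):
                resolved[name] = func(*(resolved[arg] for arg in needed))
                del pending[name]
                progressed = True
        if not progressed:
            # Nothing could be evaluated: missing or circular dependencies.
            raise ValueError(f"cannot resolve: {sorted(pending)}")
    return resolved


params = {"k": 7e-5, "m": 0.5, "n": 1, "ratio": lambda m, n: m / n}
resolved = resolve_parameters(params)
print(resolved["ratio"])
```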

@shoyer
Member

shoyer commented Mar 9, 2017

Indeed, this isn't public API currently. But I would not be opposed to making it public, assuming it's documented in a sufficiently clear way (which I think it is).
