ENH: finite difference method `diff` #495

andreas-h · 2015-07-25T16:33:20Z

Adds a `diff` method to both DataArray and Dataset. Closes #490.

Still to be done:

tests

Possible enhancements:

allow numeric axis instead of dim for DataArray objects. The signature would then be DataArray.diff(dim=None, n=1, axis=None) and I would need to check that exactly one of dim and axis is None. I find it a bit ugly; DataArray.diff(n=1, dim=None, axis=None) would be nicer in my view. But then the signatures for DataArray.diff and Dataset.diff would be different, which is also not nice.
allow specifying the new coordinate array explicitly via a coord kwarg instead of just taking the coordinate values of the upper bounds.

What do you think?
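For readers following along, the intended semantics can be sketched in plain Python. This is an illustration only, not the code from this PR: `diff_1d` is a hypothetical helper, and it assumes a single labelled dimension where each result keeps the label of the upper bound of its interval.

```python
def diff_1d(values, labels, n=1):
    """Return the n-th order forward difference of `values` with matching labels.

    Hypothetical helper for illustration; not the implementation in this PR.
    """
    values, labels = list(values), list(labels)
    for _ in range(n):
        # each pass shortens the data by one element ...
        values = [b - a for a, b in zip(values[:-1], values[1:])]
        # ... and keeps the label of the upper bound of every interval
        labels = labels[1:]
    return values, labels


temps = [10, 12, 15, 19]
days = ['mon', 'tue', 'wed', 'thu']
print(diff_1d(temps, days))       # ([2, 3, 4], ['tue', 'wed', 'thu'])
print(diff_1d(temps, days, n=2))  # ([1, 1], ['wed', 'thu'])
```

With `n=0` the inputs come back unchanged, which matches the early return for order zero in the diff under review.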

andreas-h · 2015-07-25T17:15:10Z

Actually, the current implementation leads to 0.0 for arrays inside a Dataset which don't have dim as a dimension. I personally would find it more intuitive if those arrays would not be touched at all.

Do you agree, or do you prefer this 0.0 for unaffected arrays?

shoyer · 2015-07-27T02:55:50Z

This looks very nice -- thanks for putting together the PR!

allow numeric axis instead of dim for DataArray objects.

I wouldn't bother with this. It's not so elegant, and dim already takes care of all the desired functionality.

allow specifying the new coordinate array explicitly via a coord kwarg instead of just taking the coordinate values of the upper bounds.

I wouldn't allow full control here, but maybe a keyword argument for choosing whether to take the "lower", "upper" or "mean" labels would be appropriate.

Actually, the current implementation leads to 0.0 for arrays inside a Dataset which don't have dim as a dimension. I personally would find it more intuitive if those arrays would not be touched at all.

I agree. It would be better to skip those variables entirely, like the current behavior for aggregation functions like mean.
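A rough sketch of the agreed behaviour, using a toy stand-in for a Dataset: a dict that maps each name to a (dimension name, values) pair. Both that layout and the name `diff_dataset` are assumptions made for the example, and only 1-D variables are handled.

```python
def diff_dataset(variables, dim):
    """Difference every 1-D variable that lies along `dim`; pass the rest through.

    `variables` maps a name to a (dimension name, list of values) pair.
    This toy layout is an assumption made for the example.
    """
    result = {}
    for name, (var_dim, data) in variables.items():
        if var_dim == dim:
            diffs = [b - a for a, b in zip(data[:-1], data[1:])]
            result[name] = (var_dim, diffs)
        else:
            # the variable does not depend on `dim`: leave it untouched
            result[name] = (var_dim, data)
    return result


ds = {'temperature': ('time', [10, 12, 15]), 'elevation': ('site', [250, 900])}
out = diff_dataset(ds, 'time')
print(out['temperature'])  # ('time', [2, 3])
print(out['elevation'])    # ('site', [250, 900])
```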

shoyer · 2015-07-27T02:58:23Z

xray/core/dataset.py

+            return self
+        if n < 0:
+            raise ValueError('order `n` must be non-negative but got {}'
+                             ''.format((n)))


nit: one extra pair of parentheses around n

andreas-h · 2015-07-27T15:27:48Z

I wouldn't allow full control here, but maybe a keyword argument for
choosing whether to take the "lower", "upper" or "mean" labels would be
appropriate.

I don't see how 'mean' could work, as the coord might be of dtype str.
But I'm implementing 'upper' and 'lower'. If the user wants something
different, she can always just swap the coord manually.
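The two options could look roughly like this in plain Python. `diff_with_label` is a hypothetical name; the keyword is called `label` here only because a later diff in this thread uses that name.

```python
def diff_with_label(values, labels, label='upper'):
    """First-order difference that keeps either the upper or the lower labels.

    Hypothetical helper for illustration; not the implementation in this PR.
    """
    if label == 'upper':
        new_labels = labels[1:]
    elif label == 'lower':
        new_labels = labels[:-1]
    else:
        raise ValueError(
            "label must be 'upper' or 'lower' but got {0!r}".format(label))
    diffs = [b - a for a, b in zip(values[:-1], values[1:])]
    return diffs, new_labels


print(diff_with_label([1, 4, 9], ['a', 'b', 'c']))           # ([3, 5], ['b', 'c'])
print(diff_with_label([1, 4, 9], ['a', 'b', 'c'], 'lower'))  # ([3, 5], ['a', 'b'])
```

Because the labels are only sliced and never combined, this works for string coordinates as well as numeric ones.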

andreas-h · 2015-07-27T15:47:17Z

I'm not really happy with my implementation; the coordinate/variable handling in lines 1870-1884 is far from elegant. But is there a nicer way to do this?

shoyer · 2015-07-27T18:42:39Z

xray/core/dataset.py

+            if dim in var.dims or not var.dims:
+                if name not in self.coords:
+                    variables[name] = end[name] - start[name]
+                    variables[dim] = variables[name].coords[dim]


I would try:

    kwargs_new = kwargs_end if coord_new == 'upper' else kwargs_start
    for name, var in iteritems(self.variables):
        if dim in var.dims:
            if name in self.data_vars:
                variables[name] = var.isel(**kwargs_end) - var.isel(**kwargs_start)
            else:
                # don't do arithmetic on coordinates
                variables[name] = var.isel(**kwargs_new)
        else:
            # these variables should be unchanged
            variables[name] = var
    # this private constructor preserves existing coordinates
    # it's also much faster, because it doesn't need to do validation
    difference = self._replace_vars_and_dims(variables)

    # this private constructor preserves existing coordinates
    # it's also much faster, because it doesn't need to do validation
    difference = self._replace_vars_and_dims(variables)

_replace_vars_and_dims gave me an error because the size of the
dimension on which I do the diff is changing. Was I using it wrongly?

Let me test your PR and debug this. My guess is that some of the variables were ending up with inconsistent dimension sizes (the Dataset constructor resolves that by doing an outer join of the index labels).
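To illustrate the kind of inconsistency guessed at here (this is not a diagnosis of the actual error), the sketch below checks a toy mapping of name to (dimension name, values). The helper `dimension_sizes` and the data are made up for the example.

```python
def dimension_sizes(variables):
    """Return {dimension: size}, raising if two variables disagree on a size."""
    sizes = {}
    for name, (dim, data) in variables.items():
        if sizes.setdefault(dim, len(data)) != len(data):
            raise ValueError(
                'dimension {0!r} has size {1} but variable {2!r} has length {3}'
                .format(dim, sizes[dim], name, len(data)))
    return sizes


# 'time' was differenced for the data variable but not for its coordinate
mixed = {'temperature': ('time', [2, 3]), 'time': ('time', [0, 1, 2])}
try:
    dimension_sizes(mixed)
except ValueError as err:
    print(err)  # dimension 'time' has size 2 but variable 'time' has length 3
```

A constructor that validates its input would have to catch or reconcile this mismatch; one that skips validation relies on the caller to pass variables that already agree.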

shoyer · 2015-07-27T18:45:06Z

I don't see how 'mean' could work, as the coord might be of dtype str.
But I'm implementing 'upper' and 'lower'. If the user wants something
different, she can always just swap the coord manually.

Indeed, this would fail for string dtypes. I can see some possible utility in centered coordinates, especially for second order differences. We can certainly leave this for later, though.
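A small sketch of why centered labels only make sense for numeric coordinates. `midpoint_labels` is a hypothetical helper, not something proposed in this PR.

```python
def midpoint_labels(labels):
    """Return the midpoint between each pair of neighbouring labels."""
    return [(lo + hi) / 2 for lo, hi in zip(labels[:-1], labels[1:])]


print(midpoint_labels([0.0, 1.0, 3.0]))  # [0.5, 2.0]

try:
    midpoint_labels(['a', 'b', 'c'])
except TypeError as err:
    # 'a' + 'b' concatenates to 'ab', which cannot be divided by 2
    print('string labels cannot be averaged:', err)
```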

shoyer · 2015-07-27T20:02:03Z

Just made a PR against your branch with my suggestion: https://github.com/andreas-h/xray/pull/1

It looks like it's working now.

shoyer · 2015-07-28T05:17:16Z

This could also use a bit of documentation: at least a mention in "What's New" and in the API docs.

shoyer · 2015-07-28T05:17:56Z

xray/core/dataset.py

+        if n == 0:
+            return self
+        if n < 0:
+            raise ValueError('order `n` must be non-negative but got {}'


please add a test that catches each of these exceptions
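Such a test might look roughly like the following. `check_order` is a hypothetical stand-in for the validation at the top of `diff`, so the sketch shows only the shape of the test, not the project's actual test suite.

```python
import unittest


def check_order(n):
    """Stand-in for the argument validation in `diff` (hypothetical)."""
    if n < 0:
        raise ValueError(
            'order `n` must be non-negative but got {0}'.format(n))
    return n


class TestCheckOrder(unittest.TestCase):
    def test_negative_order_raises(self):
        with self.assertRaises(ValueError):
            check_order(-1)

    def test_zero_order_is_accepted(self):
        self.assertEqual(check_order(0), 0)


# run the two tests without exiting the interpreter
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestCheckOrder)
result = unittest.TextTestRunner().run(suite)
```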

shoyer · 2015-08-19T20:00:27Z

@andreas-h we're going to release v0.6 in the next day or two. I'd love to include this enhancement in it if you have the time to finish it up :).

andreas-h · 2015-08-20T13:11:49Z

That's strange, I cannot reproduce the test failure on my system ... Any ideas?

shoyer · 2015-08-20T15:41:07Z

See http://stackoverflow.com/questions/19668395/str-format-for-python-2-6-gives-error-where-2-7-does-not
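Assuming the failure came from the `'{}'` in the error message shown earlier in this thread: auto-numbered fields need Python 2.7 or 3.1+, whereas explicit indices work on every version that has `str.format`.

```python
n = -1

# On Python 2.6, '{}'.format(n) raises
# "ValueError: zero length field name in format".
# An explicit field index avoids the problem on all versions.
message = 'order `n` must be non-negative but got {0}'.format(n)
print(message)  # order `n` must be non-negative but got -1
```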

shoyer · 2015-08-20T15:45:01Z

Could you try doing git rebase -i master to pick out your commits? Otherwise GitHub shows everything since the merge (look at the commit tab on this PR).

shoyer · 2015-08-20T17:13:30Z

xray/core/dataset.py

+        # prepare new coordinate
+        if label == 'upper':
+            kwargs_new = kwargs_end
+        elif label == 'lower':


looks like we still need a unit test for label='lower'


shoyer · 2015-08-21T18:13:00Z

OK, merging.

Thank you!


ENH: finite difference method diff

a112dce

shoyer added a commit that referenced this pull request Aug 21, 2015

Merge pull request #495 from andreas-h/diff

f5575b2


shoyer merged commit f5575b2 into pydata:master Aug 21, 2015
