Skip to content

Conversation

@cpelley
Copy link

@cpelley cpelley commented Oct 30, 2017

  • API that closely aligns with the existing stratify API.
def interpolate(z_target, z_src, fz_src, axis=-1):
    ....
  • Extrapolation doesn't make sense for a conservative approach so intentionally does not use the same API (separate function).
  • A technical review has already been undertaken as part of this going into ANTS and so it is expected that the review be primarily focused on the changes necessary to migrate this code to python-stratify.
  • Doesn't currently support case where the target levels is 1D and the source levels is ND which stratify.interpolate supports I believe (the implementation can easily support it but I chose not to at the first pass as we don't need it right now so couldn't justify the extra time writing tests).
  • Extends the top-level namespace of stratify:
stratify.interpolate_conservative
  • Cythonised functions where it matters (stratify._conservative.pyx) and a python module for where it doesn't (stratify._bounded_vinterp). Cython code can be further optimised in future (no bounds checking etc.) but intentionally not done here.

@cpelley cpelley self-assigned this Oct 30, 2017
@cpelley cpelley requested a review from pelson October 30, 2017 17:27
@cpelley
Copy link
Author

cpelley commented Oct 31, 2017

ping @hdyson @jkettleb

@hdyson has agreed to perform the first-pass review for this.

Cheers

@cpelley cpelley force-pushed the ENH_CONSERVATIVE_APPROACH branch from 2aabdec to 47684e3 Compare October 31, 2017 09:41
EXTRAPOLATE_NAN, EXTRAPOLATE_NEAREST,
EXTRAPOLATE_LINEAR, PyFuncExtrapolator,
PyFuncInterpolator)
from ._bounded_vinterp import interpolate as interpolate_conservative

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F401 '._bounded_vinterp.interpolate as interpolate_conservative' imported but unused

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ignore this

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tell flake8 to ignore it with a # noqa type comment in the code. Ref: http://flake8.pycqa.org/en/3.2.1/user/ignoring-errors.html#in-line-ignoring-errors

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @pelson

@SciTools SciTools deleted a comment from stickler-ci Oct 31, 2017
@SciTools SciTools deleted a comment from stickler-ci Oct 31, 2017
@SciTools SciTools deleted a comment from stickler-ci Oct 31, 2017
@cpelley cpelley force-pushed the ENH_CONSERVATIVE_APPROACH branch from 47684e3 to 08239e2 Compare October 31, 2017 09:51
@cpelley
Copy link
Author

cpelley commented Oct 31, 2017

Sorry iterating through issues exposed with travis:

  • Failure due to mock usage and missing from requirements.txt (removed my usage of mock).
  • Numpy include dirs (I didn't need this building locally). Have added this now to setup.py
  • Absolute import for python3 support.
  • Installing a python3 environment locally to remedy remaining issues around python3 support.

@cpelley
Copy link
Author

cpelley commented Oct 31, 2017

Tests now pass for python3 (I'll run these locally in a py3 environment in future).

Ready for review, really this time.

Cheers

# |------------| : Target
# |----------| : Delta (src_max - src_min)
# |----| : Overlap (between src & tgt)
# weight = overlap / delta
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like this diagram - it provides the key concept and several definitions very succinctly.

if z_src.ndim != z_target.ndim:
msg = ('Expecting source and target levels dimensionality to be '
'identical. {} != {}.')
raise ValueError(msg.format(z_src.ndim, z_target.ndim))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this (and following assertions) be more transparent as:

('Expecting dimensionality of source levels {} and target levels {} to be '
               'identical.')

It's just moving the value of the "thing" to where the definition of the "thing" is.

cdef np.float64_t delta, weight
cdef np.ndarray[np.float64_t, ndim = 2] weights
cdef np.ndarray[np.float64_t, ndim = 1] src_cell, tgt_cell

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be cython convention that I'm not familiar with, but I'd expect these to be "ndim=2" without spaces? Similarly for lines 35 and 36.

PyFuncInterpolator)
from ._bounded_vinterp import (interpolate as # noqa: F401
interpolate_conservative)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be simplified by renaming _bounded_vinterp.interpolate as _bounded_vinterp.interpolate_conservative?



def interpolate(z_target, z_src, fz_src, axis=-1):
"""
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've suggested renaming this in another comment. If you opt to retain the existing name, I'd suggest adding that the interpolation is conservative to this docstring?

@hdyson
Copy link

hdyson commented Nov 6, 2017

Thanks @cpelley, this looks good to me. There's a few suggestions above - but I want to be clear that they are just suggestions for some potential fine tuning around the edges.

@cpelley
Copy link
Author

cpelley commented Nov 6, 2017

Thanks @hdyson . Made all changes except the exception messages. I felt that it was clearer especially in some cases and so opted for consistency.

Example:

""The provided data is not of compatible shape with the provided source bounds. ('-', 3, 4) != (2, 4)"

Here '-' is used to denote the dimensions which are not relevant to that statement. Not so clear if pulled into the sentence itself.

Copy link
Member

@pelson pelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good. I need to carry on reading from _conservative.pyx.

invert_transpose = [data_transpose.index(ind) for ind in
list(range(result.ndim))]
result = result.transpose(invert_transpose)
return result
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments:

  • Looks like this could be a class with behaviour (conservative_interpolation).
  • Very similar code for the non-bounded case. Would be interesting to pull this all out into common functions.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.

I saw these two points as being connected. I think a refactor to support caching would entail pulling both existing and conservative capability into classes. I didn't want to pre-define a structure before considering this refactor in that wider context of caching (it was easier to use a proxy to the existing API for now). With regards to common functions, as far as I can tell. the existing stratify usage is rather hardcoded to a points based approach right now.
In the refactor to support caching, I hope to have a similar mindset to what I have done here, in that we pull out the code which needn't be cython code into a python module (inc. enduser API), allowing greater flexibility and functionality to be more readily shared between points-based and bounds-based approaches. Are you happy with this mindset?

Cheers

@@ -0,0 +1,55 @@
import numpy as np
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No docstrings 😱

Copy link
Author

@cpelley cpelley Nov 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Phil, just added (see latest commit)
I have taken a numpy doc style approach in the python module but followed the same style forming as the existing cython module for the new conservative cython module. Let me know what you think.

Cheers

Copy link
Member

@pelson pelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy from a maintainability, plumbing and testing perspective.

I'd be interested in seeing the 0 extrapolation behaviour changed though. Thoughts?

source_bounds,
self.data, axis=1)
target_data = np.ones((4, 7))
target_data[:, 0] = 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why 0 and not some other extrapolation scheme?

Copy link
Author

@cpelley cpelley Dec 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels to me like 'nan' (as you say) is more appropriate here. If no source points are contributing to a target cell then there is no data so it should be 'nan'.

I can't think why I chose 0 in the first place and since @hdyson didn't pick up on this either, @hdyson is there a reason that you can remember why 0 seemed to make sense at the time but doesn't now? :)

Cheers

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's going back a while. From memory (so pinch of salt and all that), I seem to recall we discussed this on the phone, but I don't remember the reasons for and against 0.

That said, this behaviour is consistent with the approach we implemented for the CMIP6 aerosols (and was in turn copied from an earlier IDL routine), so it could simply be for consistency with legacy applications. Hope that helps?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @hdyson I'll change to 'NaN' then, we can always post-process the returned array if it proves to be necessary to the UM from such ancillaries.



if __name__ == '__main__':
unittest.main()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't see any non-equally spaced coordinate tests.

@pelson
Copy link
Member

pelson commented Dec 11, 2017

Also, could you add a check for data values being NaN too, please?

@cpelley
Copy link
Author

cpelley commented Dec 12, 2017

Also, could you add a check for data values being NaN too, please?

Yep, will do. Cheers

@cpelley
Copy link
Author

cpelley commented Jan 17, 2018

Hey @pelson, thanks for your patience.
Three new commits, addressing your concerns:

  • TEST: Added non-equally spaced coordinate test.
  • ENH: Return nan values where no source cells contribute to a target cell (I was returning 0 before).
  • ENH: Return nan values in tgt cells where a src nan cells contribute (this was neither tested before nor supported).

Copy link
Member

@pelson pelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

Thanks @cpelley!

source_bounds, data,
axis=1)
target_data = np.array(
[1/1.5, 1 + ((1/3.)/1), 0.25, 0.25, 0.25, 0.25, 1.])[None]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wowzers. This messes with my mind 😄 .

Looks good though.

@pelson pelson merged commit 44135af into SciTools:master Jan 26, 2018
@pelson
Copy link
Member

pelson commented Jan 26, 2018

Thanks for the great review @hdyson. Really appreciated you helping to shape this PR into this really great addition. 👍

@cpelley cpelley deleted the ENH_CONSERVATIVE_APPROACH branch January 26, 2018 13:13
@cpelley
Copy link
Author

cpelley commented Jan 26, 2018

Thanks both, appreciated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants