Smarter weighted aggregation #5082

Closed
schlunma opened this issue Nov 23, 2022 · 3 comments · Fixed by #5084

Comments

@schlunma
Contributor

✨ Feature Request

It would be nice to use cubes or cell_measures as weights in weighted aggregation, e.g.,

```python
cube.collapsed(['latitude', 'longitude'], iris.analysis.SUM, weights=area_cube)
```

or

```python
cube.collapsed(['latitude', 'longitude'], iris.analysis.SUM, weights='cell area')
```

Either form would handle unit conversions automatically (e.g., multiplying the units by m2 for area-weighted sums).

Motivation

Currently, weights for aggregation must be plain arrays, which carry no units. As a result, the units of operations like area-weighted sums are not handled correctly, and users need extra code to fix them.
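
To illustrate the kind of extra bookkeeping meant here, the following is a minimal sketch that uses only NumPy. Plain strings stand in for real unit objects, and all variable names and values are made up for illustration; this is not how Iris itself represents units.

```python
import numpy as np

# Hypothetical field (e.g., a flux) and matching cell areas.
data = np.array([[1.0, 2.0], [3.0, 4.0]])
data_units = "W m-2"
area = np.array([[10.0, 20.0], [30.0, 40.0]])
area_units = "m2"

# The weighted sum itself is straightforward with array weights ...
weighted_sum = float(np.sum(data * area))

# ... but the array carries no units, so the units of the result have to be
# patched by hand after the aggregation.
result_units = f"{data_units} {area_units}"

print(weighted_sum, result_units)
```

With weights that know their own units, the last step could happen inside the aggregation instead of in user code.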

Additional context

If this is of interest, I can open a pull request. I already have a working prototype in this branch. The main feature is a Weights class that inherits from np.ndarray, so changes to the code base are minimal.

`Weights` class
```python
import numpy as np
from cf_units import Unit

import iris.util


class Weights(np.ndarray):
    """Class for handling weights for weighted aggregation.

    Since it inherits from :numpy:`ndarray`, all common methods and properties
    of this are available.

    Details on subclassing :numpy:`ndarray` are given here:
    https://numpy.org/doc/stable/user/basics.subclassing.html
    """

    def __new__(cls, weights, cube):
        """Create class instance.

        Args:
        * weights (Weights, numpy.ndarray, Cube, string):
            If given as :class:`iris.analysis.Weights`, simply use this. If
            given as a :class:`numpy.ndarray`, use this directly (assume units
            of `1`). If given as a :class:`iris.cube.Cube`, use its data and
            units. If given as a :obj:`str`, assume this is the name of a cell
            measure from ``cube`` and use its data and units.
        * cube (Cube):
            Input cube for aggregation. If weights is given as :obj:`str`, try
            to extract a cell measure with the corresponding name from this
            cube. Otherwise, this argument is ignored.
        """
        # Weights is Weights
        if isinstance(weights, cls):
            obj = weights

        # Weights is a cube
        # Note: to avoid circular imports of Cube we use duck typing using the
        # "hasattr" syntax here
        elif hasattr(weights, "add_aux_coord"):
            obj = np.asarray(weights.data).view(cls)
            obj.units = weights.units

        # Weights is a string
        elif isinstance(weights, str):
            cell_measure = cube.cell_measure(weights)
            if cell_measure.shape != cube.shape:
                arr = iris.util.broadcast_to_shape(
                    cell_measure.data,  # fails for dask arrays
                    cube.shape,
                    cube.cell_measure_dims(cell_measure),
                )
            else:
                arr = cell_measure.data
            obj = np.asarray(arr).view(cls)
            obj.units = cell_measure.units

        # Remaining types (e.g., np.ndarray): try to convert to ndarray.
        else:
            obj = np.asarray(weights).view(cls)
            obj.units = Unit("1")

        return obj

    def __array_finalize__(self, obj):
        """See https://numpy.org/doc/stable/user/basics.subclassing.html."""
        if obj is None:
            return
        self.units = getattr(obj, "units", Unit("1"))

    @classmethod
    def update_kwargs(cls, kwargs, cube):
        """Update ``weights`` keyword argument in-place.

        Args:
        * kwargs (dict):
            Keyword arguments that will be updated in-place if a ``weights``
            keyword is present.
        * cube (Cube):
            Input cube for aggregation. If weights is given as :obj:`str`, try
            to extract a cell measure with the corresponding name from this
            cube. Otherwise, this argument is ignored.
        """
        if "weights" not in kwargs:
            return
        kwargs["weights"] = cls(kwargs["weights"], cube)
```

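
The subclassing pattern used by the prototype can be tried out without Iris. The sketch below is a stripped-down stand-in, not the prototype itself: `UnitArray` is a hypothetical name, and a plain string replaces `cf_units.Unit`. It shows how `__array_finalize__` carries the `units` attribute over to arrays derived from an instance.

```python
import numpy as np


class UnitArray(np.ndarray):
    """Hypothetical ndarray subclass that carries a ``units`` attribute."""

    def __new__(cls, values, units="1"):
        # View casting creates the instance; units are attached afterwards.
        obj = np.asarray(values).view(cls)
        obj.units = units
        return obj

    def __array_finalize__(self, obj):
        # Called on explicit construction (obj is None), view casting,
        # and new-from-template (e.g., slicing).
        if obj is None:
            return
        self.units = getattr(obj, "units", "1")


area = UnitArray([[1.0, 2.0], [3.0, 4.0]], units="m2")

# Slicing creates a new UnitArray from ``area``, so the units are copied.
row = area[0]

# Converting back to a base-class array drops the extra attribute.
plain = np.asarray(area)

print(type(row).__name__, row.units, hasattr(plain, "units"))
```

Because the instance is still an `ndarray`, code that expects array weights should keep working, which is presumably why the prototype needs few changes elsewhere.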
@rcomer
Member

rcomer commented Nov 23, 2022

Related: #3169, #3173.

@schlunma
Contributor Author

> Related: #3169, #3173.

Ahh, sorry, didn't check for open pull requests, only issues!

@schlunma
Contributor Author

Since #3173 is closed, I will open a pull request soon (I need to add some tests first).
