ENH Add Array API compatibility to `mean_absolute_error` #27736

EdAbati · 2023-11-06T22:32:58Z

Reference Issues/PRs

What does this implement/fix? Explain your changes.

It makes the mean_absolute_error implementation compatible and tested with the Array API.

Not sure if it's the best approach, but I have converted the np.average implementation so that it is compatible with the Array API. Is there a better way? (I will add the tests to make codecov happy, if you agree to have the _average function)
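
For illustration, an `np.average`-like weighted mean can be written with only functions that the Array API standard defines (`asarray`, `reshape`, `broadcast_to`, `sum`, `mean`, `any`). The sketch below is not the `_average` helper from this PR: the name `weighted_average_sketch`, its signature, and the use of NumPy as the stand-in namespace `xp` are assumptions made for the example.

```python
import numpy as np


def weighted_average_sketch(a, weights=None, axis=None, xp=np):
    """Hypothetical Array-API-style weighted average (illustrative only)."""
    a = xp.asarray(a)
    if weights is None:
        return xp.mean(a, axis=axis)

    weights = xp.asarray(weights, dtype=a.dtype)
    if weights.shape != a.shape:
        if axis is None:
            raise TypeError("axis must be given when weights and a differ in shape")
        if weights.ndim != 1 or weights.shape[0] != a.shape[axis]:
            raise ValueError("weights must be 1-D with length a.shape[axis]")
        # Align the 1-D weights with `axis` by reshaping, then broadcast to a.shape.
        shape = [1] * a.ndim
        shape[axis] = weights.shape[0]
        weights = xp.broadcast_to(xp.reshape(weights, tuple(shape)), a.shape)

    scale = xp.sum(weights, axis=axis)
    if bool(xp.any(scale == 0.0)):
        raise ZeroDivisionError("weights sum to zero and cannot be normalized")
    return xp.sum(a * weights, axis=axis) / scale


# Per-column weighted mean of absolute errors, with one weight per sample (row).
abs_errors = np.asarray([[0.0, 1.0], [2.0, 3.0]])
result = weighted_average_sketch(abs_errors, weights=[1.0, 3.0], axis=0)
```

Aligning the weights with `reshape` and `broadcast_to` is one possible design; it avoids depending on axis-swapping helpers that a strict namespace may not provide.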

cc @betatim @ogrisel

github-actions · 2023-11-06T22:34:17Z

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 0fa23d3. Link to the linter CI: here

ogrisel

Thanks for the PR. Could you please expand the tests to make sure that we cover the raised exceptions?

More details below.

sklearn/metrics/_regression.py

sklearn/metrics/tests/test_common.py

sklearn/utils/_array_api.py

sklearn/utils/tests/test_array_api.py
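
As a rough illustration of the kind of exceptions such tests need to cover, the snippet below records which exception type `np.average` (the function the new helper mirrors) raises for three invalid weight configurations. The helper `raised_type` is hypothetical and exists only for this example; a real test would more likely use `pytest.raises` with a `match` pattern against the helper added in this PR.

```python
import numpy as np


def raised_type(func, *args, **kwargs):
    """Return the type of the exception raised by the call, or None if none."""
    try:
        func(*args, **kwargs)
    except Exception as exc:
        return type(exc)
    return None


a = np.asarray([[1.0, 2.0], [3.0, 4.0]])

# 1-D weights without an axis: the shapes differ, so the alignment is ambiguous.
no_axis = raised_type(np.average, a, weights=[1.0, 1.0])
# Weights whose length does not match the size of the chosen axis.
bad_length = raised_type(np.average, a, axis=0, weights=[1.0, 1.0, 1.0])
# Weights that sum to zero cannot be normalized.
zero_sum = raised_type(np.average, a, axis=0, weights=[0.0, 0.0])
```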

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

ogrisel

Thanks! One more suggestion for a more informative message. The test needs to be updated accordingly.

BTW, I think that numpy error message would benefit from a similar treatment upstream.

sklearn/utils/_array_api.py

EdAbati · 2023-11-15T22:27:09Z

> BTW, I think that numpy error message would benefit from a similar treatment upstream.

I will open an issue on numpy to see if they are interested in updating the messages :)

Also BTW I added a set of tests for multioutput='raw_values' to check when the output is an array and not a float: 48bd567
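
To make the distinction concrete, here is a NumPy-only sketch of the two return types. `mae_sketch` is a hypothetical stand-in written for this example, not scikit-learn's `mean_absolute_error`; it only mimics the `multioutput` options discussed here.

```python
import numpy as np


def mae_sketch(y_true, y_pred, multioutput="uniform_average"):
    """Hypothetical mean absolute error for 2-D targets (illustrative only)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # One error value per output column.
    per_output = np.mean(np.abs(y_pred - y_true), axis=0)
    if multioutput == "raw_values":
        return per_output  # array with one entry per output
    if multioutput == "uniform_average":
        return float(np.mean(per_output))  # plain Python float
    raise ValueError(f"unsupported multioutput: {multioutput!r}")


y_true = [[0.5, 1], [-1, 1], [7, -6]]
y_pred = [[0, 2], [-1, 2], [8, -5]]

raw = mae_sketch(y_true, y_pred, multioutput="raw_values")  # array([0.5, 1. ])
avg = mae_sketch(y_true, y_pred)  # 0.75
```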

ogrisel · 2023-11-22T19:26:39Z

I ran the Array API tests with pytorch on an MPS device on my local laptop and with pytorch on a machine with a cuda device: all the mean absolute error Array API compliance tests pass.

However, I observed some failures with cupy:

FAILED sklearn/metrics/tests/test_common.py::test_array_api_compliance[mean_absolute_error-check_array_api_regression_metric-cupy.array_api-None-None] - AttributeError: module 'cupy.array_api' has no attribute 'swapaxes'
FAILED sklearn/metrics/tests/test_common.py::test_array_api_compliance[mean_absolute_error-check_array_api_multioutput_regression_metric-cupy-None-None] - TypeError: Implicit conversion to a NumPy array is not allowed. Please use `.get()` to construct a NumPy array explicitly.
FAILED sklearn/metrics/tests/test_common.py::test_array_api_compliance[mean_absolute_error-check_array_api_multioutput_regression_metric-cupy.array_api-None-None] - ValueError: Unsupported device 'cpu'

Here are the details of the tracebacks:

=============================================================== FAILURES ===============================================================
______________ test_array_api_compliance[mean_absolute_error-check_array_api_regression_metric-cupy.array_api-None-None] _______________

metric = <function mean_absolute_error at 0x7ff1b90f0b80>, array_namespace = 'cupy.array_api', device = None, dtype = None
check_func = <function check_array_api_regression_metric at 0x7ff1714e4280>

    @pytest.mark.parametrize(
        "array_namespace, device, dtype", yield_namespace_device_dtype_combinations()
    )
    @pytest.mark.parametrize("metric, check_func", yield_metric_checker_combinations())
    def test_array_api_compliance(metric, array_namespace, device, dtype, check_func):
>       check_func(metric, array_namespace, device, dtype)

sklearn/metrics/tests/test_common.py:1873: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
sklearn/metrics/tests/test_common.py:1810: in check_array_api_regression_metric
    check_array_api_metric(
sklearn/metrics/tests/test_common.py:1747: in check_array_api_metric
    metric_xp = metric(y_true_xp, y_pred_xp, sample_weight=sample_weight)
sklearn/utils/_param_validation.py:214: in wrapper
    return func(*args, **kwargs)
sklearn/metrics/_regression.py:214: in mean_absolute_error
    output_errors = _average(xp.abs(y_pred - y_true), weights=sample_weight, axis=0)
sklearn/utils/_array_api.py:651: in _average
    weights = xp.swapaxes(weights, -1, axis)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <sklearn.utils._array_api._ArrayAPIWrapper object at 0x7ff1b5d40730>, name = 'swapaxes'

    def __getattr__(self, name):
>       return getattr(self._namespace, name)
E       AttributeError: module 'cupy.array_api' has no attribute 'swapaxes'

sklearn/utils/_array_api.py:203: AttributeError
_____________ test_array_api_compliance[mean_absolute_error-check_array_api_multioutput_regression_metric-cupy-None-None] ______________

metric = <function mean_absolute_error at 0x7ff1b90f0b80>, array_namespace = 'cupy', device = None, dtype = None
check_func = <function check_array_api_multioutput_regression_metric at 0x7ff1714e4310>

    @pytest.mark.parametrize(
        "array_namespace, device, dtype", yield_namespace_device_dtype_combinations()
    )
    @pytest.mark.parametrize("metric, check_func", yield_metric_checker_combinations())
    def test_array_api_compliance(metric, array_namespace, device, dtype, check_func):
>       check_func(metric, array_namespace, device, dtype)

sklearn/metrics/tests/test_common.py:1873: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
sklearn/metrics/tests/test_common.py:1831: in check_array_api_multioutput_regression_metric
    check_array_api_metric(
sklearn/metrics/tests/test_common.py:1754: in check_array_api_metric
    assert_allclose(
sklearn/utils/_testing.py:284: in assert_allclose
    actual, desired = np.asanyarray(actual), np.asanyarray(desired)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   TypeError: Implicit conversion to a NumPy array is not allowed. Please use `.get()` to construct a NumPy array explicitly.

cupy/_core/core.pyx:1475: TypeError
________ test_array_api_compliance[mean_absolute_error-check_array_api_multioutput_regression_metric-cupy.array_api-None-None] _________

metric = <function mean_absolute_error at 0x7ff1b90f0b80>, array_namespace = 'cupy.array_api', device = None, dtype = None
check_func = <function check_array_api_multioutput_regression_metric at 0x7ff1714e4310>

    @pytest.mark.parametrize(
        "array_namespace, device, dtype", yield_namespace_device_dtype_combinations()
    )
    @pytest.mark.parametrize("metric, check_func", yield_metric_checker_combinations())
    def test_array_api_compliance(metric, array_namespace, device, dtype, check_func):
>       check_func(metric, array_namespace, device, dtype)

sklearn/metrics/tests/test_common.py:1873: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
sklearn/metrics/tests/test_common.py:1831: in check_array_api_multioutput_regression_metric
    check_array_api_metric(
sklearn/metrics/tests/test_common.py:1752: in check_array_api_metric
    metric_xp = xp.asarray(metric_xp, device="cpu")
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

obj = Array([0.5, 1. ], dtype=float32)

    def asarray(
        obj: Union[
            Array,
            bool,
            int,
            float,
            NestedSequence[bool | int | float],
            SupportsBufferProtocol,
        ],
        /,
        *,
        dtype: Optional[Dtype] = None,
        device: Optional[Device] = None,
        copy: Optional[bool] = None,
    ) -> Array:
        """
        Array API compatible wrapper for :py:func:`np.asarray <numpy.asarray>`.
    
        See its docstring for more information.
        """
        # _array_object imports in this file are inside the functions to avoid
        # circular imports
        from ._array_object import Array
    
        _check_valid_dtype(dtype)
        if device is not None and not isinstance(device, _Device):
>           raise ValueError(f"Unsupported device {device!r}")
E           ValueError: Unsupported device 'cpu'

/data/parietal/store3/work/ogrisel/mambaforge/envs/dev/lib/python3.10/site-packages/cupy/array_api/_creation_functions.py:59: ValueError
======================================================= short test summary info ========================================================

ogrisel · 2023-11-22T19:29:23Z

For the swapaxes problem, we should report it upstream (if not already reported) and mark the corresponding test case as xfail in the scikit-learn test suite with a link to the upstream issue. For information, I am running version 12.2.0 of cupy, which is apparently the latest available on conda-forge.
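
One way to decide when such an xfail applies is to probe the namespace for the functions a code path needs. The sketch below is only an assumption about how that could look: `missing_functions` is a hypothetical helper, and a `SimpleNamespace` stands in for `cupy.array_api`, which requires a CUDA device to import.

```python
from types import SimpleNamespace


def missing_functions(xp, required):
    """Return the names in `required` that the namespace `xp` does not provide."""
    return [name for name in required if not hasattr(xp, name)]


# Stand-in for a strict namespace that lacks `swapaxes`.
fake_strict_xp = SimpleNamespace(abs=abs, sum=sum)

missing = missing_functions(fake_strict_xp, ["abs", "sum", "swapaxes"])
```

In a test suite, a non-empty result could be turned into `pytest.mark.xfail(reason=...)` with the upstream issue link in the reason string.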

ogrisel

The following might fix the other 2 failures with cupy.

You can probably use google colab or kaggle code if you need a machine with a cuda device to launch the tests with cupy.

sklearn/metrics/tests/test_common.py
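
The suggested change itself is not preserved in this thread; a later commit is titled "using _convert_to_numpy in tests". The general idea of an explicit conversion before comparing with NumPy can be sketched as follows. Both `FakeDeviceArray` and `to_numpy_sketch` are hypothetical names for this example; the fake class only imitates the cupy behaviour shown in the traceback above (no implicit conversion, explicit `.get()`).

```python
import numpy as np


class FakeDeviceArray:
    """Stand-in for a device array that refuses implicit NumPy conversion."""

    def __init__(self, data):
        self._data = np.asarray(data)

    def __array__(self, dtype=None, copy=None):
        raise TypeError("Implicit conversion to a NumPy array is not allowed.")

    def get(self):
        # Explicit device-to-host transfer, as cupy arrays expose.
        return self._data


def to_numpy_sketch(array):
    """Convert explicitly when the array offers `.get()`, else use np.asarray."""
    get = getattr(array, "get", None)
    if callable(get):
        return get()
    return np.asarray(array)


on_device = FakeDeviceArray([0.5, 1.0])
on_host = to_numpy_sketch(on_device)
already_numpy = to_numpy_sketch(np.asarray([0.5, 1.0]))
```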

EdAbati · 2023-11-27T22:24:21Z

Thank you very much for checking cupy :)

I've added the suggestions, but I still have to test if xfail works as expected. (probably I'll do it tomorrow evening)

EdAbati · 2024-03-13T07:52:09Z

Hi @ogrisel, @glemaitre and team, thanks for merging the r2 score PR, it makes my life easier here 😄

I updated this PR with the latest changes, taking inspiration from r2_score. Please let me know if I missed something.

(I tested with torch with cpu and mps, but haven't yet tested with cupy)

fcharras · 2024-03-15T15:56:43Z

Looks like the tests in the red pipeline actually passed, but the publishing step failed, not related to the PR.

EdAbati · 2024-03-18T22:06:02Z

I tested with cupy and it is all green with array_api_compat<1.5 ✅

FYI it seems there is an issue with array_api_compat==1.5 (but now fixed in main)

xp.asarray doesn't convert the array when xp == 'array_api_compat.cupy' and the array is a numpy.ndarray:

>>> import numpy as np
>>> import array_api_compat.cupy as xp
>>> a = np.asarray([1,2,3])
>>> l = [1,2,3]
>>> type(xp.asarray(a))
<class 'numpy.ndarray'>
>>> type(xp.asarray(l))
<class 'cupy.ndarray'>

Testing with array_api_compat==1.5:

FAILED sklearn/metrics/tests/test_common.py::test_array_api_compliance[mean_absolute_error-check_array_api_multioutput_regression_metric-cupy-None-None] - AttributeError: 'numpy.ndarray' object has no attribute 'get'

I see the same error when running the tests on the new _average function.

ogrisel · 2024-03-25T17:20:42Z

> FYI it seems there is an issue with array_api_compat==1.5 (but now fixed in main)

Since we don't have a cupy CI yet, we can just assume an array-api-compat version with the fix will be out before it bothers our CI :)

But if that's the case, maybe we can spend some effort to skip cupy tests when array_api_compat's version is one of the known versions with the bug.
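
A version gate of that kind could look roughly like the sketch below. Everything here is an assumption made for illustration: the helper names are hypothetical, only the 1.5 release is treated as affected (as reported above), and the parser deliberately ignores pre-release suffixes.

```python
def release_tuple(version):
    """Parse the leading numeric release of a version string: '1.5.0' -> (1, 5)."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break  # stop at the first non-numeric segment such as 'dev0'
        parts.append(int(piece))
    # Drop trailing zeros so that '1.5' and '1.5.0' compare equal.
    while parts and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


# Assumption: only the 1.5 release is affected, per the report in this thread.
KNOWN_BAD_RELEASES = {(1, 5)}


def has_known_asarray_bug(version):
    """Return True when `version` is one of the releases listed as affected."""
    return release_tuple(version) in KNOWN_BAD_RELEASES
```

In a test module, the installed version could be read with `importlib.metadata.version("array-api-compat")` and the result used to call `pytest.skip` for the cupy cases.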

doc/whats_new/v1.5.rst

ogrisel

LGTM otherwise. Thanks for the PR.

EdAbati · 2024-04-29T20:33:41Z

Hey @glemaitre sorry for the ping, I noticed you are the other reviewer :) I was wondering if there is anything I should change in this PR

doc/whats_new/v1.5.rst

ogrisel · 2024-05-07T09:53:38Z

Maybe @OmarManzoor would be interested in reviewing this one as well :)

OmarManzoor

Thanks for the PR @EdAbati. Could you kindly resolve the conflict?

sklearn/metrics/tests/test_common.py

OmarManzoor

Just a few comments regarding the change log, otherwise looks good.

doc/whats_new/v1.6.rst

doc/whats_new/v1.5.rst

OmarManzoor

LGTM. Thanks @EdAbati!

EdAbati · 2024-05-15T10:23:42Z

Thank you both 🙂

EdAbati added 2 commits November 2, 2023 19:27

converted mae to array api

4481be6

fixes for MPS device

8b3dc8d

github-actions bot added module:metrics module:utils labels Nov 6, 2023

Merge remote-tracking branch 'origin/main' into mae-array-api

451352c

EdAbati added 2 commits November 6, 2023 23:39

updated docs

0f6ea74

returning float when scalar

746d8ec

EdAbati marked this pull request as ready for review November 12, 2023 15:30

ogrisel reviewed Nov 13, 2023

ogrisel added the Array API label Nov 13, 2023

EdAbati and others added 11 commits November 14, 2023 19:56

fixed comment

4b2604e

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

added float32 comment

bc9e6f8

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

improved error message

cb3613d

added test with axis=0

0b58f4b

added error tests

002cacf

fix to dtype=float32

6f452a6

test multioutput

48bd567

Update sklearn/metrics/_regression.py

855aad0

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

Merge branch 'main' into mae-array-api

80450bf

Merge branch 'main' into mae-array-api

c4d7ef9

fix linting

d3378d5

ogrisel reviewed Nov 15, 2023

sklearn/utils/_array_api.py

added new error message

6447185

ogrisel reviewed Nov 22, 2023

sklearn/metrics/tests/test_common.py

using _convert_to_numpy in tests

b390da3

removed the cast to float

fcf5ee4

EdAbati added 2 commits March 13, 2024 18:08

readded cast to float

58b799e

match r2_score changes

0e7ed2a

Trigger CI

4403874

ogrisel reviewed Mar 25, 2024

doc/whats_new/v1.5.rst

ogrisel approved these changes Mar 25, 2024

add missing :user:

3c311b0

ogrisel reviewed May 7, 2024

doc/whats_new/v1.5.rst

EdAbati added 2 commits May 8, 2024 12:15

Merge remote-tracking branch 'upstream/main' into mae-array-api

197fe3e

moved to 1.6 whatsnew

5cc9c25

OmarManzoor reviewed May 14, 2024

sklearn/metrics/tests/test_common.py

EdAbati added 2 commits May 14, 2024 18:20

removed reduntant conversion

18d3ec3

Merge remote-tracking branch 'upstream/main' into mae-array-api

6ab0d16

OmarManzoor reviewed May 15, 2024

doc/whats_new/v1.6.rst

doc/whats_new/v1.5.rst

ogrisel reviewed May 15, 2024

doc/whats_new/v1.5.rst

Revert unrelated change to 1.5.

e0dea5b

ogrisel reviewed May 15, 2024

doc/whats_new/v1.5.rst

Revert more unrelated changelog.

0fa23d3

OmarManzoor approved these changes May 15, 2024

OmarManzoor merged commit 9f44f1f into scikit-learn:main May 15, 2024
30 checks passed

EdAbati deleted the mae-array-api branch May 15, 2024 10:46

charlesjhill mentioned this pull request May 15, 2024

Make standard scaler compatible to Array API #27113

Draft

jeremiedbb mentioned this pull request May 20, 2024

Release 1.5.0 #29054

Merged

14 tasks
