ENH Add Array API compatibility to mean_absolute_error #27736

OmarManzoor merged 37 commits into scikit-learn:main

Conversation
ogrisel
left a comment
Thanks for the PR. Could you please expand the tests to make sure that we cover the raised exceptions?
More details below.
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
ogrisel
left a comment
Thanks! One more suggestion for a more informative message. The test needs to be updated accordingly.
BTW, I think that numpy error message would benefit from a similar treatment upstream.
I will make an issue on the numpy tracker. Also, BTW, I added a set of tests for this.
I launched the Array API tests with pytorch with an MPS device on my local laptop, and with pytorch on a machine with a cuda device, and all the mean_absolute_error Array API compliance tests pass. However, I observed some failures with cupy. Here are the details of the tracebacks:

=============================================================== FAILURES ===============================================================
______________ test_array_api_compliance[mean_absolute_error-check_array_api_regression_metric-cupy.array_api-None-None] _______________
metric = <function mean_absolute_error at 0x7ff1b90f0b80>, array_namespace = 'cupy.array_api', device = None, dtype = None
check_func = <function check_array_api_regression_metric at 0x7ff1714e4280>
@pytest.mark.parametrize(
"array_namespace, device, dtype", yield_namespace_device_dtype_combinations()
)
@pytest.mark.parametrize("metric, check_func", yield_metric_checker_combinations())
def test_array_api_compliance(metric, array_namespace, device, dtype, check_func):
> check_func(metric, array_namespace, device, dtype)
sklearn/metrics/tests/test_common.py:1873:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
sklearn/metrics/tests/test_common.py:1810: in check_array_api_regression_metric
check_array_api_metric(
sklearn/metrics/tests/test_common.py:1747: in check_array_api_metric
metric_xp = metric(y_true_xp, y_pred_xp, sample_weight=sample_weight)
sklearn/utils/_param_validation.py:214: in wrapper
return func(*args, **kwargs)
sklearn/metrics/_regression.py:214: in mean_absolute_error
output_errors = _average(xp.abs(y_pred - y_true), weights=sample_weight, axis=0)
sklearn/utils/_array_api.py:651: in _average
weights = xp.swapaxes(weights, -1, axis)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <sklearn.utils._array_api._ArrayAPIWrapper object at 0x7ff1b5d40730>, name = 'swapaxes'
def __getattr__(self, name):
> return getattr(self._namespace, name)
E AttributeError: module 'cupy.array_api' has no attribute 'swapaxes'
sklearn/utils/_array_api.py:203: AttributeError
_____________ test_array_api_compliance[mean_absolute_error-check_array_api_multioutput_regression_metric-cupy-None-None] ______________
metric = <function mean_absolute_error at 0x7ff1b90f0b80>, array_namespace = 'cupy', device = None, dtype = None
check_func = <function check_array_api_multioutput_regression_metric at 0x7ff1714e4310>
@pytest.mark.parametrize(
"array_namespace, device, dtype", yield_namespace_device_dtype_combinations()
)
@pytest.mark.parametrize("metric, check_func", yield_metric_checker_combinations())
def test_array_api_compliance(metric, array_namespace, device, dtype, check_func):
> check_func(metric, array_namespace, device, dtype)
sklearn/metrics/tests/test_common.py:1873:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
sklearn/metrics/tests/test_common.py:1831: in check_array_api_multioutput_regression_metric
check_array_api_metric(
sklearn/metrics/tests/test_common.py:1754: in check_array_api_metric
assert_allclose(
sklearn/utils/_testing.py:284: in assert_allclose
actual, desired = np.asanyarray(actual), np.asanyarray(desired)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> ???
E TypeError: Implicit conversion to a NumPy array is not allowed. Please use `.get()` to construct a NumPy array explicitly.
cupy/_core/core.pyx:1475: TypeError
________ test_array_api_compliance[mean_absolute_error-check_array_api_multioutput_regression_metric-cupy.array_api-None-None] _________
metric = <function mean_absolute_error at 0x7ff1b90f0b80>, array_namespace = 'cupy.array_api', device = None, dtype = None
check_func = <function check_array_api_multioutput_regression_metric at 0x7ff1714e4310>
@pytest.mark.parametrize(
"array_namespace, device, dtype", yield_namespace_device_dtype_combinations()
)
@pytest.mark.parametrize("metric, check_func", yield_metric_checker_combinations())
def test_array_api_compliance(metric, array_namespace, device, dtype, check_func):
> check_func(metric, array_namespace, device, dtype)
sklearn/metrics/tests/test_common.py:1873:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
sklearn/metrics/tests/test_common.py:1831: in check_array_api_multioutput_regression_metric
check_array_api_metric(
sklearn/metrics/tests/test_common.py:1752: in check_array_api_metric
metric_xp = xp.asarray(metric_xp, device="cpu")
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
obj = Array([0.5, 1. ], dtype=float32)
def asarray(
obj: Union[
Array,
bool,
int,
float,
NestedSequence[bool | int | float],
SupportsBufferProtocol,
],
/,
*,
dtype: Optional[Dtype] = None,
device: Optional[Device] = None,
copy: Optional[bool] = None,
) -> Array:
"""
Array API compatible wrapper for :py:func:`np.asarray <numpy.asarray>`.
See its docstring for more information.
"""
# _array_object imports in this file are inside the functions to avoid
# circular imports
from ._array_object import Array
_check_valid_dtype(dtype)
if device is not None and not isinstance(device, _Device):
> raise ValueError(f"Unsupported device {device!r}")
E ValueError: Unsupported device 'cpu'
/data/parietal/store3/work/ogrisel/mambaforge/envs/dev/lib/python3.10/site-packages/cupy/array_api/_creation_functions.py:59: ValueError
======================================================= short test summary info ========================================================
For the swapaxes problem, we should report the problem upstream (if not already reported) and mark the corresponding test case as xfail in the scikit-learn test suite with a link to the upstream issue. For information, I am running version 12.2.0 of cupy, which is apparently the latest available on conda-forge.
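One way to implement the xfail suggested above is a small helper that inspects the namespace at test time. This is a sketch only: the helper name and the reason text are illustrative, not the actual scikit-learn test code.

```python
def swapaxes_xfail_reason(xp):
    """Return an xfail reason if the namespace lacks ``swapaxes``, else None.

    Sketch of the approach discussed above: cupy.array_api (as of cupy
    12.2.0) does not expose ``swapaxes``, so the corresponding test case
    can be marked xfail with a link to the upstream issue.
    """
    if not hasattr(xp, "swapaxes"):
        return (
            "swapaxes is missing from this array namespace; "
            "see the upstream cupy issue"
        )
    return None
```

In the test body this would translate to calling `pytest.xfail(reason)` whenever `swapaxes_xfail_reason(xp)` returns a non-None reason.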
ogrisel
left a comment
The following might fix the other 2 failures with cupy.
You can probably use google colab or kaggle code if you need a machine with a cuda device to launch the tests with cupy.
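The fix for the implicit-conversion failure (converting device arrays to NumPy before calling assert_allclose) could look roughly like the helper below. The name `to_numpy` is hypothetical; scikit-learn keeps a similar private helper in `sklearn.utils._array_api`.

```python
import numpy as np


def to_numpy(array):
    """Copy an array from any namespace to a host-side NumPy array (sketch).

    cupy arrays refuse implicit conversion to NumPy; ``.get()`` performs
    an explicit device-to-host copy. cupy.array_api wraps the raw cupy
    array in a ``._array`` attribute (an assumption based on the
    array_api wrapper design), so we unwrap before calling ``.get()``.
    """
    if hasattr(array, "get"):
        return array.get()
    wrapped = getattr(array, "_array", None)
    if wrapped is not None and hasattr(wrapped, "get"):
        return wrapped.get()
    return np.asarray(array)
```

In the test, the comparison then becomes `assert_allclose(to_numpy(metric_xp), metric_np)` instead of passing the device array directly.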
Thank you very much for checking! I've added the suggestions, but I still have to test them.
Looks like the tests in the red pipeline actually passed; the publishing step failed, which is not related to this PR.
I tested with cupy. FYI, it seems there is an issue with array-api-compat: I see the same error when running the tests with the new version.
Since we don't have a cupy CI yet, we can just assume an array-api-compat version with the fix will be out before it bothers our CI :) But if that's not the case, maybe we can spend some effort to skip the cupy tests when the installed array_api_compat version is one of the known versions with the bug.
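A version-gated skip along the lines suggested above could be sketched like this. The set of broken versions here is a placeholder, not a confirmed list: it would need to be filled in once the array-api-compat fix is actually released.

```python
def should_skip_cupy_tests():
    """Return True when the cupy tests should be skipped (sketch).

    Skips when array-api-compat is missing entirely, or when the
    installed version is one of the known-broken ones.
    """
    try:
        import array_api_compat
    except ImportError:
        return True  # no array-api-compat at all: nothing to test
    known_bad_versions = {"1.4"}  # placeholder, not a confirmed list
    return array_api_compat.__version__ in known_bad_versions
```

This could then back a `pytest.mark.skipif` on the cupy-parametrized test cases.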
ogrisel
left a comment
LGTM otherwise. Thanks for the PR.
Hey @glemaitre, sorry for the ping; I noticed you are the other reviewer :) I was wondering if there is anything I should change in this PR.
Maybe @OmarManzoor would be interested in reviewing this one as well :)
OmarManzoor
left a comment
Thanks for the PR @EdAbati. Could you kindly resolve the conflict?
OmarManzoor
left a comment
Just a few comments regarding the change log, otherwise looks good.
OmarManzoor
left a comment
LGTM. Thanks @EdAbati!
Thank you both 🙂
Reference Issues/PRs
Towards #26024
What does this implement/fix? Explain your changes.
It makes the mean_absolute_error implementation compatible with, and tested against, the Array API.

Not sure if it's the best approach, but I have converted the np.average implementation so that it is compatible with the Array API. Is there a better way? (I will add the tests to make codecov happy, if you agree to have the _average function.)

cc @betatim @ogrisel