Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TST Skip test when unstable openblas configuration #21702

Merged
merged 10 commits into from
Nov 18, 2021

Conversation

jjerphan
Copy link
Member

@jjerphan jjerphan commented Nov 18, 2021

Reference Issues/PRs

Temporarily fixes #21361.

What does this implement/fix? Explain your changes.

This is a temporary fix for some tests on main.

This fixes some CI jobs on main.

This is a first fix toward preventing contributors spending their days exploring logs
of issues they don't understand the problems being reported because they aren't
related to their PRs.

As informally proposed by @ogrisel, a long term fix would be to pin the maximum
version dependencies and have a process to update each dependency version
when possible.

Any other comments?

This fix can be made more specific on some values of the parametrisation.

sklearn/utils/__init__.py Outdated Show resolved Hide resolved
sklearn/utils/__init__.py Outdated Show resolved Hide resolved
@jjerphan
Copy link
Member Author

Force-pushing to include @adrinjalali in 544d564 as a co-author.

@@ -452,6 +457,10 @@ def set_random_state(estimator, random_state=0):
not joblib.parallel.mp, reason="joblib is in serial mode"
)

fails_if_unstable_openblas = pytest.mark.xfail(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if xfail is a good idea for code that can segfault. Maybe skip is a better idea.

Copy link
Member

@ogrisel ogrisel Nov 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was marked resolved without being addressed or commented on: I am pretty sure this is a bad idea to execute code that can segfault, especially when pytest-xdist is not used. XFAIL will run the code, SKIP will not.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My mistake multitasking and forgetting to take your suggestion in the patch of this commit.

@jjerphan jjerphan marked this pull request as ready for review November 18, 2021 10:59
Copy link
Member

@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A green CI is a good one.

.circleci/config.yml Outdated Show resolved Hide resolved
sklearn/calibration.py Outdated Show resolved Hide resolved
Co-authored-By: Adrin Jalali <adrin.jalali@gmail.com>
@jjerphan jjerphan changed the title TST XFAIL test when unstable openblas configuration TST Skip test when unstable openblas configuration Nov 18, 2021
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
@jjerphan
Copy link
Member Author

I force-pushed to:

  • remove duplicated commits (hence a correct diff now)
  • integrate the forgotten suggestion from @ogrisel

Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The prescott kernel will cause a segfault on any OpenBLAS version when called on non-memory aligned data, whatever the version of OpenBLAS. What changes in old version of OpenBLAS is that the prescott kernel is detected instead of Haswell on SkylakeX machines indirectly causing the bug.

But the bug could still happen if the test is actually run on a prescott machine with a recent version of OpenBLAS.

Instead of introducing a utility function in sklearn.utils with a generic name such as _in_unstable_openblas_configuration, I would rather change the function sklearn.utils.estimators_checks.check_classifiers_train

to call create_memmap_backed_data with aligned=has_prescott_openblas once #21654 is merged, has_prescott_openblas is defined as:

    # OpenBLAS is known to segfault with unaligned data on the prescott architecture
    has_prescott_openblas = any(
        True
        for info in threadpool_info()
        if info["internal_api"] == "openblas"
        and info.get("architecture").lower() == "prescott"
    )

Also for the sake of keeping the git history easier to understand, would it be possible to separate the fix for the calibration plots labels separate from the fix for openblas segfaults? Those are unrelated problems, right? Since we squash merge PRs, I would rather have those 2 fixes in 2 separate commits in the git history, hence 2 separate PRs.

sklearn/linear_model/tests/test_omp.py Outdated Show resolved Hide resolved
@ogrisel
Copy link
Member

ogrisel commented Nov 18, 2021

remove duplicated commits (hence a correct diff now)

I wrote the previous review against before this fix, please ignore that part of the review.

Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming CI is green, +1 for merge with the last comment below:

sklearn/utils/estimator_checks.py Outdated Show resolved Hide resolved
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
@ogrisel
Copy link
Member

ogrisel commented Nov 18, 2021

Green! @adrinjalali still +1 for merge with the new way to fix this?

Copy link
Member

@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A much cleaner solution indeed

@adrinjalali adrinjalali merged commit 3f019bd into scikit-learn:main Nov 18, 2021
@ogrisel
Copy link
Member

ogrisel commented Nov 18, 2021

Actually I just found out that it does not work :)

$ OPENBLAS_CORETYPE=Prescott pytest -v sklearn/tests/test_common.py -k "check_classifiers_train and svc"
[...]
    def create_memmap_backed_data(data, mmap_mode="r", return_folder=False, aligned=False):
        """
        Parameters
        ----------
        data
        mmap_mode : str, default='r'
        return_folder :  bool, default=False
        aligned : bool, default=False
            If True, if input is a single numpy array and if the input array is aligned,
            the memory mapped array will also be aligned. This is a workaround for
            https://github.com/joblib/joblib/issues/563.
        """
        temp_folder = tempfile.mkdtemp(prefix="sklearn_testing_")
        atexit.register(functools.partial(_delete_folder, temp_folder, warn=True))
        if aligned:
            if isinstance(data, np.ndarray) and data.flags.aligned:
                # https://numpy.org/doc/stable/reference/generated/numpy.memmap.html
                filename = op.join(temp_folder, "data.dat")
                fp = np.memmap(filename, dtype=data.dtype, mode="w+", shape=data.shape)
                fp[:] = data[:]  # write data to memmap array
                fp.flush()
                memmap_backed_data = np.memmap(
                    filename, dtype=data.dtype, mode=mmap_mode, shape=data.shape
                )
            else:
>               raise ValueError("If aligned=True, input must be a single numpy array.")
E               ValueError: If aligned=True, input must be a single numpy array.

@ogrisel
Copy link
Member

ogrisel commented Nov 18, 2021

I will open a PR with a quick fix for the above.

glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Nov 22, 2021
…rn#21702)

* TST XFAIL test when unstable openblas configuration

* Adapt versions parsing

Co-authored-By: Adrin Jalali <adrin.jalali@gmail.com>

* DOC Prefer ifskip and reference scikit-learn#21361

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

* Fix Julien's scheduler incorrect dispatch

* Simplify

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

* Do not remove blank line

* fixup! Do not remove blank line

* Retrigger CI

* Prudently assume Prescott might be the architecture if it is unknown

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Nov 29, 2021
…rn#21702)

* TST XFAIL test when unstable openblas configuration

* Adapt versions parsing

Co-authored-By: Adrin Jalali <adrin.jalali@gmail.com>

* DOC Prefer ifskip and reference scikit-learn#21361

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

* Fix Julien's scheduler incorrect dispatch

* Simplify

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

* Do not remove blank line

* fixup! Do not remove blank line

* Retrigger CI

* Prudently assume Prescott might be the architecture if it is unknown

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
samronsin pushed a commit to samronsin/scikit-learn that referenced this pull request Nov 30, 2021
…rn#21702)

* TST XFAIL test when unstable openblas configuration

* Adapt versions parsing

Co-authored-By: Adrin Jalali <adrin.jalali@gmail.com>

* DOC Prefer ifskip and reference scikit-learn#21361

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

* Fix Julien's scheduler incorrect dispatch

* Simplify

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

* Do not remove blank line

* fixup! Do not remove blank line

* Retrigger CI

* Prudently assume Prescott might be the architecture if it is unknown

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Dec 24, 2021
…rn#21702)

* TST XFAIL test when unstable openblas configuration

* Adapt versions parsing

Co-authored-By: Adrin Jalali <adrin.jalali@gmail.com>

* DOC Prefer ifskip and reference scikit-learn#21361

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

* Fix Julien's scheduler incorrect dispatch

* Simplify

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

* Do not remove blank line

* fixup! Do not remove blank line

* Retrigger CI

* Prudently assume Prescott might be the architecture if it is unknown

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
glemaitre pushed a commit that referenced this pull request Dec 25, 2021
* TST XFAIL test when unstable openblas configuration

* Adapt versions parsing

Co-authored-By: Adrin Jalali <adrin.jalali@gmail.com>

* DOC Prefer ifskip and reference #21361

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

* Fix Julien's scheduler incorrect dispatch

* Simplify

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

* Do not remove blank line

* fixup! Do not remove blank line

* Retrigger CI

* Prudently assume Prescott might be the architecture if it is unknown

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
@jjerphan jjerphan deleted the xfail-prescott branch October 21, 2022 14:05
mathijs02 pushed a commit to mathijs02/scikit-learn that referenced this pull request Dec 27, 2022
…rn#21702)

* TST XFAIL test when unstable openblas configuration

* Adapt versions parsing

Co-authored-By: Adrin Jalali <adrin.jalali@gmail.com>

* DOC Prefer ifskip and reference scikit-learn#21361

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

* Fix Julien's scheduler incorrect dispatch

* Simplify

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

* Do not remove blank line

* fixup! Do not remove blank line

* Retrigger CI

* Prudently assume Prescott might be the architecture if it is unknown

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants