-
-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pick up new blis, mkl & openblas builds #74
Conversation
Hi! This is the friendly automated conda-forge-linting service. I just wanted to let you know that I linted all conda-recipes in your PR ( |
…nda-forge-pinning 2021.07.08.05.56.39
@isuruf, if you don't have any objections (and you don't merge it yourself), I'm planning to merge this in ~24h |
I'm considering to wait for openblas 0.3.16 with this (which should hopefully happen this weekend), to avoid duplicating traffic for such a key package. Thoughts? |
@martin-frbg, thanks for the release of openblas 0.3.16! Just checking if you would have an idea what might be happening here on OSX+x86-64:
Lots of functions seem to complain about illegal parameters (one example, more in log):
|
No idea - looks like an almost total corruption of the argument lists of the affected functions, almost like a C/Fortran interface mismatch. |
Thanks for taking a look! I also don't understand what's happening, though on a long shot (if it's not the C/Fortran interface), I've seen some weird failures with some blas libraries on Skylake agents in Azure recently (or at least, the difference between failing & passing runs was the presence/absence of |
@h-vetinari, @martin-frbg, we have access to a osx-x86_64 machine if you'd like access. Send me your SSH key and I'll add you. |
Sadly the problem seems to be related to my OpenBLAS PR 3250 (adding a shortcut path in interface/gemv.c that does not allocate a buffer when trans=0, incx=1,incy=1) although I do not see where in the dgemv_n kernel any access attempt could |
I'd like to try if the reversion helps here, but this is already one package too far (meta-package for BLAS variants, rather than the openblas build). It sounds like we should be running the LAPACK test suite also when building on https://github.com/conda-forge/openblas-feedstock... Looking at the build recipe, it looks like this all comes from the upstream Makefile - is there a flag to enable the LAPACK tests in utest? |
No flag in utest, you would need to run |
if [[ "${blas_impl}" == "openblas" && "$(uname)" == MINGW* ]]; then | ||
# I'm not sure why these fail only on Windows. Skip for now. | ||
# Still present as of openblas 0.3.16 | ||
SKIP_TESTS="${SKIP_TESTS}|LAPACK-xeigtsts_sgd_in|LAPACK-xeigtsts_glm_in|LAPACK-xeigtsts_gsv_in|LAPACK-xeigtsts_lse_in|LAPACK-xeigtstd_dgd_in" | ||
SKIP_TESTS="${SKIP_TESTS}|LAPACK-xeigtstd_glm_in|LAPACK-xeigtstd_gsv_in|LAPACK-xeigtstd_lse_in|LAPACK-xlintstc_ctest_in|LAPACK-xlintstrfc_ctest_rfp_in" | ||
SKIP_TESTS="${SKIP_TESTS}|LAPACK-xeigtstc_sep_in|LAPACK-xeigtstc_se2_in|LAPACK-xeigtstc_ced_in|LAPACK-xeigtstc_cgd_in|LAPACK-xeigtstc_csb_in" | ||
SKIP_TESTS="${SKIP_TESTS}|LAPACK-xeigtstc_csg_in|LAPACK-xeigtstc_glm_in|LAPACK-xeigtstc_gsv_in|LAPACK-xeigtstc_lse_in|LAPACK-xeigtstz_zgd_in" | ||
SKIP_TESTS="${SKIP_TESTS}|LAPACK-xeigtstz_glm_in|LAPACK-xeigtstz_gsv_in|LAPACK-xeigtstz_lse_in|BLAS-xblat1c" | ||
fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@martin-frbg
Not sure you were aware of this, but there's currently a bunch of tests that fail on openblas & windows (in the peculiar way that conda-forge builds this blas-metapackage).
Hoping to get the lapack tests in conda-forge/openblas-feedstock#122 running for windows as well, then we should see how much of that is "native" (and can hopefully be fixed eventually).
Nothing for you to do right now, just FYI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, not aware of this. This is specifically with mingw gcc/gfortran and has been happening for a number of previous releases as well ? And do the failures maniffest as "failed to pass the threshold" or do the test codes crash ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had let everything run through on a debug commit that I rebased out, but it seems Azure deleted the logs already. I'll let it run again.
This is indeed with mingw + gcc (in contrast to the openblas recipe), which might already explain the differences. @isuruf knows this better than I do though, that part of the recipe here hasn't been touched in quite a while.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error log is here. From what I can see, mostly segfaults, illegal arguments and the like.
Could be something related to libomp / libgomp again?
It's worth noting that this runs the vanilla netlib test suite (for 3.9.0) but linked against openblas. Not sure if there are any relevant divergences there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Errors of the form *** Illegal value of parameter number 1 not detected by
can be ignored, because the way the lapack tests are set up, they need to override the implementation of xerbla_
in the shared library. In Linux and OSX, the local copy in the test override the xerbla_
implementation in the shared library, but in windows it doesn't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@isuruf, the thing I don't understand is that - if it was only about xerbla - why is the set of failures not exactly the same between the 4 combinations of {win, osx} + {mkl, openblas}?
@isuruf |
That was an issue in llvm-openmp 12. We need to apply llvm/llvm-project@f1b9ce2 |
Thanks for chasing this down! I wanted to tackle this, but saw you already did: conda-forge/openmp-feedstock#51 |
Also saw conda-forge/conda-forge-pinning-feedstock#1727 (and that you have a rerender open in #76)... Should this be rerendered too before merging? |
I removed of the automerge-label now, but purely to make it possible to react at all to the rerender question above (since the CI could would have finished in the next ~15min) - I'm fine with merging either way (but would tend to rerender so that we get the right llvm-openmp runtime constraint) |
@conda-forge-admin, rerender |
Hi! This is the friendly automated conda-forge-webservice. |
Picking up conda-forge/blis-feedstock#24, conda-forge/intel_repack-feedstock#25 & conda-forge/openblas-feedstock#123