Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update OpenBLAS to v0.3.5 #30583

Merged
merged 5 commits into from Jan 6, 2019

Conversation

@ararslan
Copy link
Member

commented Jan 4, 2019

Note that we're currently using v0.3.3 and this PR goes directly to the newest version, v0.3.5. See the release notes for both 0.3.4 and 0.3.5 for a more complete summary of the changes: https://github.com/xianyi/OpenBLAS/releases.

@staticfloat, we'll need some kind of BinaryBuilder magic for this as well, right? How do?

@vchuravy

This comment has been minimized.

Copy link
Member

commented Jan 4, 2019

@staticfloat

This comment has been minimized.

Copy link
Member

commented Jan 4, 2019

I've started a build of v0.3.5; we'll see what's broken this time around. ;)

@ViralBShah

This comment has been minimized.

Copy link
Member

commented Jan 4, 2019

We should increase the number of MAX THREADS to 40 or perhaps even 64 (on x86-64).

https://github.com/JuliaPackaging/Yggdrasil/blob/2b1576aeb8cdac4e783bd35f58e77fad97c09225/OpenBLAS/build_tarballs.jl#L46

@vtjnash

This comment has been minimized.

Copy link
Member

commented Jan 4, 2019

We set that option low because it seemed to improve performance on small matrices and limits virtual memory usage (iirc, it reserves 256MB * MAX_THREADS). We can reconsider (esp. if OpenBLAS is improved now to initialize this lazily), but that should be a separate PR.

edit: cf 716151d

@ViralBShah

This comment has been minimized.

Copy link
Member

commented Jan 4, 2019

Yes, but the flip side now is that people complain they have bigger machines and Julia does not use all the cores.

@ViralBShah

This comment has been minimized.

Copy link
Member

commented Jan 4, 2019

The small matrix performance is driven by the gemm threshold (which is 50, so multi-threading is not used for smaller matrices). The main driver was memory when we did that change in 2013. I say that core counts and memories have grown enough that doubling the max threads now should be ok (ideally only on linux x86-64).

@ViralBShah

This comment has been minimized.

Copy link
Member

commented Jan 4, 2019

From the openblas 0.3.4 release notes:

OpenBLAS will now provide enough buffer space for at least 50 threads by default.

I am sure we override this with our setting.

@staticfloat

This comment has been minimized.

Copy link
Member

commented Jan 4, 2019

BB tarballs are good to go.

@ViralBShah let's open a separate issue, do some performance measurements, then decide the new value to default to.

@ViralBShah

This comment has been minimized.

Copy link
Member

commented Jan 4, 2019

Yes, agree that the max threads should not hold up the version update.

@ararslan

This comment has been minimized.

Copy link
Member Author

commented Jan 4, 2019

Thanks for taking care of the BinaryBuilder stuff, Elliot!

@ararslan

This comment has been minimized.

Copy link
Member Author

commented Jan 4, 2019

Travis macOS is trying to build GCC from source which filled up the max log length so Travis killed the build.

@ararslan

This comment has been minimized.

Copy link
Member Author

commented Jan 5, 2019

The macOS failure should be fixed by #30599.

@ararslan ararslan force-pushed the aa/openblas-0.3.5 branch from 1ae984a to 5d46f2a Jan 5, 2019

ararslan and others added 5 commits Jan 4, 2019

@ararslan ararslan force-pushed the aa/openblas-0.3.5 branch from 5d46f2a to 05f874b Jan 5, 2019

@ararslan ararslan merged commit ffb6f1e into master Jan 6, 2019

3 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
julia freebsd ci Build done
Details

@ararslan ararslan deleted the aa/openblas-0.3.5 branch Jan 6, 2019

@vtjnash

This comment has been minimized.

Copy link
Member

commented Jan 8, 2019

I'm seeing a substantial number of BLAS tests failing now on Intel Xeon Silver 4114 (none are present if I revert this PR): https://gist.github.com/vtjnash/c4f09f3019b335a690862134807da41c

julia> using LinearAlgebra

julia> LinearAlgebra.versioninfo()
BLAS: libopenblas (OpenBLAS 0.3.5  USE64BITINT DYNAMIC_ARCH NO_AFFINITY SkylakeX MAX_THREADS=16)
LAPACK: libopenblas64_

julia> versioninfo()
Julia Version 1.2.0-DEV.131
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)
@staticfloat

This comment has been minimized.

Copy link
Member

commented Jan 8, 2019

Yep, looks like an OpenBLAS bug. Let's open an issue on OpenBLAS. I'll take a look to see if there's anything suspicious they've merged against SkylakeX recently and try reverting those patches.

@ViralBShah

This comment has been minimized.

Copy link
Member

commented Jan 8, 2019

Why not just move to an older version of openblas and adopt a new version when they fix things upstream? Would be lesser work overall.

@andreasnoack

This comment has been minimized.

Copy link
Member

commented Jan 8, 2019

There might be some important bug fixes in the latest release #30460 (comment)

@staticfloat

This comment has been minimized.

Copy link
Member

commented Jan 8, 2019

Sure, we could revert back to 0.3.3 (0.3.4 won't build on all our platforms) but there are some nice things to have in 0.3.5, lots of bugs fixed, so it's better IMO to disable a few optimized kernels instead.

@ViralBShah

This comment has been minimized.

Copy link
Member

commented Jan 9, 2019

I see pinv, bunchkaufman, qr, and triangular failing on skylake. I've been trying to find a reproducible test case for xianyi/OpenBLAS#1955

The openblas failure is strange. On skylake, the tests that fail in triangular, if I run them on the command line, they are fine. Perhaps some buffer overruns somewhere in openblas?

staticfloat added a commit that referenced this pull request Jan 9, 2019
staticfloat added a commit that referenced this pull request Jan 10, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.