Update OpenBLAS build options #5091

Merged
merged 1 commit into from Jul 17, 2019

6 changes: 3 additions & 3 deletions OpenBLAS.spec
@@ -10,12 +10,12 @@ Source: https://github.com/xianyi/OpenBLAS/archive/v%{realversion}.tar.gz

# PRESCOTT is a generic x86-64 target https://github.com/xianyi/OpenBLAS/issues/685
%ifarch x86_64
-make FC=gfortran BINARY=64 TARGET=PENRYN NUM_THREADS=256 DYNAMIC_ARCH=1
+make FC=gfortran BINARY=64 TARGET=CORE2 NUM_THREADS=256 DYNAMIC_ARCH=0
Contributor
What does NUM_THREADS actually correspond to at run time?
Was the speedup seen in the DeepAK8 tests real, or simply the result of parallelization on a quiet node?

Contributor Author
@slava77 What affects the number of threads used at runtime is the OPENBLAS_NUM_THREADS environment variable, which is set to 1 in https://github.com/cms-sw/cmsdist/blob/IB/CMSSW_11_0_X/gcc700/OpenBLAS-toolfile.spec#L19. NUM_THREADS is the maximum allowed number of threads and is used for resource allocation, etc. I explicitly checked the CPU usage when running the tests, and it did not go beyond 100% :)
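
To illustrate the distinction between the build-time NUM_THREADS and the runtime OPENBLAS_NUM_THREADS, here is a rough sketch in shell. The function `effective_threads` is a hypothetical helper, not part of OpenBLAS. It only models the assumed behaviour (an explicit request is honoured but capped by the CPU count and by the build-time maximum), and the CPU count of 64 is an arbitrary example value.

```shell
#!/bin/sh
# Simplified model (an assumption, NOT OpenBLAS source code) of how the
# runtime thread count relates to the build-time maximum.
#   $1 = build-time NUM_THREADS (256 in this spec)
#   $2 = value of OPENBLAS_NUM_THREADS ("" or 0 means not set)
#   $3 = number of CPUs visible to the process
effective_threads() {
    build_max=$1
    requested=${2:-0}
    ncpu=$3

    # Without an explicit request, start from the build-time maximum.
    if [ "$requested" -gt 0 ]; then
        n=$requested
    else
        n=$build_max
    fi

    # Never exceed the available CPUs or the build-time maximum.
    if [ "$n" -gt "$ncpu" ]; then n=$ncpu; fi
    if [ "$n" -gt "$build_max" ]; then n=$build_max; fi

    echo "$n"
}

effective_threads 256 1 64    # OPENBLAS_NUM_THREADS=1, as in the toolfile
effective_threads 256 "" 64   # variable not set
```

Under this model, the first call gives a single thread regardless of NUM_THREADS=256, which is consistent with the CPU usage staying at or below 100% in the tests.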

Contributor Author
Also, gslblas is known not to be a very high-performance BLAS implementation, so the speed-up we saw after switching to OpenBLAS should not be surprising at all.

Contributor
Thanks for the clarifications.

%else
%ifarch aarch64
-make FC=gfortran BINARY=64 TARGET=ARMV8 NUM_THREADS=256 DYNAMIC_ARCH=1
+make FC=gfortran BINARY=64 TARGET=ARMV8 NUM_THREADS=256 DYNAMIC_ARCH=0
%else
-make FC=gfortran BINARY=64 NUM_THREADS=256 DYNAMIC_ARCH=1
+make FC=gfortran BINARY=64 NUM_THREADS=256 DYNAMIC_ARCH=0
%endif # aarch64
%endif # x86_64
