Fixes and improvements to xeon phi implementation. #2

Merged
fgvanzee merged 3 commits into flame:master on Feb 14, 2014

tlrmchlsmth
Member

Includes the improvements to the micro-kernel that match the performance of MKL.

fgvanzee added a commit that referenced this pull request Feb 14, 2014
Fixes and improvements to xeon phi implementation.
fgvanzee merged commit b29e1c2 into flame:master Feb 14, 2014
songmaotian mentioned this pull request Apr 22, 2016
loveshack mentioned this pull request Mar 5, 2018
loveshack pushed a commit to loveshack/blis that referenced this pull request Sep 24, 2019
This needs a proper fix, but when building with -O3 (at least with gcc 8.3),
we get this:

Program received signal SIGILL, Illegal instruction.
0x000000001004c660 in bli_cntx_init_power9_ref (cntx=0x103e06b0)
    at ref_kernels/bli_cntx_ref.c:456
456             for ( i = 0; i < BLIS_NUM_LEVEL3_OPS; ++i ) vfuncs[ i ] = NULL;
(gdb) bt
#0  0x000000001004c660 in bli_cntx_init_power9_ref (cntx=0x103e06b0)
    at ref_kernels/bli_cntx_ref.c:456
#1  0x000000001004c0a8 in bli_cntx_init_power9 (cntx=<optimized out>)
    at config/power9/bli_cntx_init_power9.c:42
#2  0x000000001003c85c in bli_gks_register_cntx (id=BLIS_ARCH_POWER9,
    nat_fp=0x1004c090 <bli_cntx_init_power9>,
    ref_fp=0x1004c0d0 <bli_cntx_init_power9_ref>, ind_fp=<optimized out>)
    at frame/base/bli_gks.c:373
#3  0x000000001003c97c in bli_gks_init () at frame/base/bli_gks.c:155
#4  0x000000001003cfe8 in bli_init_apis () at frame/base/bli_init.c:78
#5  0x00007ffff7e045a8 in __pthread_once_slow () from /lib64/libpthread.so.0
#6  0x00000000100492e8 in bli_pthread_once (once=<optimized out>,
    init=<optimized out>) at frame/thread/bli_pthread.c:314
#7  0x000000001003d138 in bli_init_once () at frame/base/bli_init.c:104
#8  bli_init_auto () at frame/base/bli_init.c:54
#9  0x0000000010011300 in cdotc_ (n=<optimized out>, x=<optimized out>,
    incx=<optimized out>, y=<optimized out>, incy=<optimized out>)
    at frame/compat/bla_dot.c:89
#10 0x0000000010002a48 in check2_ (sfac=0x103d14dc <sfac>)
    at blastest/src/cblat1.c:529
#11 0x0000000010001ef4 in main () at blastest/src/cblat1.c:112
xrq-phys referenced this pull request in xrq-phys/blis Sep 21, 2021
Merge for AOCL 3.0 release.
niyas-sait pushed a commit to niyas-sait/blis that referenced this pull request Feb 25, 2022
* Update blis

* Update gitignore

* Update appveyor

* Update blis submodule

* Fix path to clang on Windows build

* Try to fix build error

* Suppress python build temporarily

* Try to add new windows header

* Update appveyor

* Update msvc.jsonl

* Fix appveyor

* Fix location of header

* Compile bli_pthread_wrap

* Recompile blis with appveyor

* Fix appveyor

* More appveyor tweaks

* Change submodule branch

* Update flame-blis

* Update submodule remote

* Update build for pthread_wrap

* Avoid building pthreads on windows

* Update appveyor build

* Try to fix appveyor

* Try to fix compile error

* Fix include path in Windows build

* Update blis.h for linux-x86_64

* Update make script for linux-x86_64

* Fix setup.py

* Rename msvc build files

* Update script to generate make jsonl

* Try to avoid requiring conda

* Rename file

* Unhack setup

* Fix setup.py

* Try to fix include

* Try to include compiler libraries

* Fix setup

* Fix setup

* Fix setup.py

* Fix setup.py

* Build with conda

* Fix include in setup.py

* Fix include dirs

* Try to debug file not found error

* Try to fix file not found problem

* Debug windows setup problem

* Try to debug file not found error

* Debug windows

* Debug failure

* Debug failure

* Debug failure

* Try -150

* Try -175

* Try -50

* Try -300

* Add fenv.h

* Try again without conda

* Try to fix compiler complaint

* Try -200

* Try -0

* Add call to vcvarsall

* Fix path to LLVM in setup

* Call Python3 explicitly

* Fix path to python

* Try to simplify appveyor script

* Debug compile error

* Debug compile error

* Simplify appveyor script

* Simplify appveyor to succeed at dummy task

* Try -200

* Try -250

* Try -200

* Try -150

* Try -125

* Try -250

* Try -225

* Try -150

* Try -175

* Try -185

* Try -200

* Try -215

* Try -220

* Try -240

* Try -230

* Try -222

* Try -221

* Work around windows command line length limit

* Fix dir creation

* Fix dir creation

* Unhack dummy.pyx

* Install and test library after building

* Fix pip install in appveyor

* Fix appveyor

* Try to fix appveyor

* Fix appveyor

* Fix appveyor

* Try to fix travis

* Fix appveyor

* Add missing dependency to setup.py

* Try to test more Pythons in appveyor

* Remove copy-source-files.sh

* Remove stray files

* Update gitignore

* Update gitignore

* Fix appveyor test

* Try to fix isnan error on python2

* Try to fix isnan problem on Python2

* Try to fix appveyor test

* Remove incorrect classifiers

* Try building with c11

* Try to add LLVM lib directories to library_dirs

* Link to libm for clang

* Try to fix test execution

* Try to fix libm

* Fix link error on Windows

* Remove libm from windows

* Fix setup.py

* Try linking against msvcrt

* Try to link ucrt

* Try to add missing library dirs for python2.7

* Fix setup

* Don't try to build python2.7 for Windows

* Unhack setup.py
Aaron-Hutchinson referenced this pull request in sifive/sifive-blis Apr 4, 2023
This PR adds the x280 sub-configuration to BLIS.
leekillough pushed a commit to leekillough/blis that referenced this pull request Apr 12, 2023
Restore changes from sifive-blis-private#28