.cm
bin
Makefile
README.carefully
README.nas.org
cg.c
cg.o
cgswpf.c
cgswpf.o
ck_compile.sh
ck_compile.sh1
ck_postprocess_time.py
compile_aarch64.sh
compile_x86.sh
npb-C.h
npbparams.h

README.carefully

Note: the routine conj_grad supplies three implementations of
the sparse matrix-vector multiply.  The default implementation
is not loop unrolled.  The two alternate implementations are
unrolled to a depth of 2 and to a depth of 8, respectively.
Please experiment with these to find the fastest for your
particular architecture.  When reporting timing results, any of
the three may be used without penalty.
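
For illustration only, the difference between the non-unrolled
multiply and an unrolled-by-2 multiply can be sketched as follows.
This is not the code from conj_grad in cg.c: the function names
spmv_plain and spmv_unroll2, the 0-based compressed-row layout
(rowstr, colidx, a), and the vector names p and q are all
assumptions made for this sketch.  The depth-8 variant would follow
the same pattern with a longer peel step and eight products per
iteration.

```c
/* Sketch only: names and 0-based CSR layout are assumptions,
 * not taken from cg.c.
 *
 * rowstr[j] .. rowstr[j+1]-1 index the nonzeros of row j,
 * colidx[k] is the column of nonzero k, a[k] is its value.
 */

/* Non-unrolled sparse matrix-vector product: q = A * p. */
void spmv_plain(int n, const int *rowstr, const int *colidx,
                const double *a, const double *p, double *q)
{
    for (int j = 0; j < n; j++) {
        double sum = 0.0;
        for (int k = rowstr[j]; k < rowstr[j + 1]; k++)
            sum += a[k] * p[colidx[k]];
        q[j] = sum;
    }
}

/* Same product with the inner loop unrolled to a depth of 2. */
void spmv_unroll2(int n, const int *rowstr, const int *colidx,
                  const double *a, const double *p, double *q)
{
    for (int j = 0; j < n; j++) {
        int start = rowstr[j];
        int end = rowstr[j + 1];
        double sum1 = 0.0;
        double sum2 = 0.0;

        /* Peel one entry when the row length is odd, so the
         * unrolled loop always handles complete pairs. */
        if ((end - start) % 2 != 0) {
            sum1 = a[start] * p[colidx[start]];
            start++;
        }

        /* Two independent partial sums per iteration. */
        for (int k = start; k < end; k += 2) {
            sum1 += a[k] * p[colidx[k]];
            sum2 += a[k + 1] * p[colidx[k + 1]];
        }
        q[j] = sum1 + sum2;
    }
}
```

Because the unrolled version accumulates into two partial sums, its
floating-point result may differ from the non-unrolled one in the
last bits; whether it is faster depends on the architecture, as the
performance examples below show.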

Performance examples:
The non-unrolled version of the multiply is actually slightly
(perhaps 5%) faster than the unrolled-by-2 version on the
sp2-66MHz-WN on 16 nodes.  On the Cray T3D, the reverse is true,
i.e., the unrolled-by-2 version is some 10% faster.
The unrolled-by-8 version is significantly faster on the
Cray T3D: the overall speed of the code is 1.5 times faster.