FLAPWxHZ

The Hari-Zimmermann complex generalized hyperbolic SVD and EVD.

A part of the supplementary material for the paper arXiv:1907.08560 [math.NA].

Building

Prerequisites

A recent 64-bit Linux (e.g., CentOS 7.6) or macOS (e.g., Mojave) is needed.

Have the Intel MKL (Math Kernel Library) installed.

Then, clone and build JACSD in a directory parallel to this one.

Make options

Run make as follows:

cd src
make [CPU=x64|x200|gnu] [NDEBUG=0|1|2|3|4|5] [all|clean|help]

where CPU selects the compiler toolchain: set it to x64 to build with the Intel C++ and Fortran compilers for Xeons, or to x200 to build with them for Xeon Phi KNLs. If CPU is not set, the GNU C (Clang on macOS) and GNU Fortran compilers will be used instead.

GNU Fortran 9 is not supported! Currently, only GNU Fortran 8 is fully supported.

Here, NDEBUG should be set to the desired optimization level (3 is a sensible choice). If unset, the predefined debug-mode build options will be used.

For example, make CPU=x200 NDEBUG=3 clean all will trigger a full, release-mode rebuild for the KNLs.

Execution

Command line

In the examples below, TPC stands for threads-per-core. If hyperthreading is not desired, TPC should be set to 1.

FN is the input and output file name prefix (without an extension).

Phase 0

/path/to/phase0.exe input.bin FN

Phase 0 is a data conversion phase from a custom data format to a set of plain binary files.

Phase 1

OMP_NUM_THREADS=T OMP_PLACES=CORES OMP_PROC_BIND=SPREAD,CLOSE /path/to/phase1.exe FN L a G TPC

L, a, and G are the problem-specific parameters.

Phase 2

OMP_NUM_THREADS=T OMP_PLACES=CORES OMP_PROC_BIND=SPREAD,CLOSE /path/to/phase2.exe FN M N TPC

Phase 3

OMP_NUM_THREADS=T OMP_PLACES=CORES OMP_PROC_BIND=SPREAD,CLOSE /path/to/phase3.exe FN M N TPC JSTRAT1 NSWP1 JSTRAT2 NSWP2

JSTRAT1 is the inner, and JSTRAT2 the outer Jacobi strategy.

JSTRAT1 can be 2 for cycwor or 4 for mmstep (recommended).

JSTRAT2 can be 3 for cycwor (recommended if a particular number of threads is supported) or 5 for mmstep.

NSWP1 (1 for block-oriented) and NSWP2 (30 should suffice in most cases) are the maximal numbers of the inner and of the outer sweeps allowed, respectively.
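
As an illustration of how the Phase 3 arguments fit together, here is a minimal sketch of a Python helper that assembles the environment and the argument list shown above. The function name `phase3_command`, its default values (taken from the recommendations in the text), and the example values of M, N, and the thread count are assumptions made for this sketch; they are not part of the repository.

```python
import os
import shlex

# Strategy codes as listed in the text above.
INNER_STRATEGIES = {2: "cycwor", 4: "mmstep"}
OUTER_STRATEGIES = {3: "cycwor", 5: "mmstep"}


def phase3_command(exe, fn, m, n, tpc, threads,
                   jstrat1=4, nswp1=1, jstrat2=3, nswp2=30):
    """Return (env, argv) for a Phase 3 run; nothing is executed here.

    Hypothetical helper: the defaults follow the recommendations in the
    text, but whether they suit a given problem is not checked.
    """
    if jstrat1 not in INNER_STRATEGIES:
        raise ValueError(f"JSTRAT1 must be one of {sorted(INNER_STRATEGIES)}")
    if jstrat2 not in OUTER_STRATEGIES:
        raise ValueError(f"JSTRAT2 must be one of {sorted(OUTER_STRATEGIES)}")
    # OpenMP settings copied from the command line shown above.
    env = dict(os.environ,
               OMP_NUM_THREADS=str(threads),
               OMP_PLACES="CORES",
               OMP_PROC_BIND="SPREAD,CLOSE")
    # Argument order: FN M N TPC JSTRAT1 NSWP1 JSTRAT2 NSWP2
    numeric = (m, n, tpc, jstrat1, nswp1, jstrat2, nswp2)
    argv = [exe, fn] + [str(v) for v in numeric]
    return env, argv


if __name__ == "__main__":
    # M = N = 64 and 8 threads are made-up example values.
    env, argv = phase3_command("/path/to/phase3.exe", "FN", 64, 64, 1, threads=8)
    print(shlex.join(argv))
    # To launch for real: subprocess.run(argv, env=env, check=True)
```

Returning the pair instead of running the executable keeps the sketch easy to inspect before anything is launched.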

Phase 4

OMP_NUM_THREADS=T OMP_PLACES=CORES OMP_PROC_BIND=SPREAD,CLOSE /path/to/phase4.exe FN N TPC

Data format

All data is stored in the Fortran array order.
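
For readers used to row-major languages, the following minimal sketch shows what the Fortran (column-major) order implies for the position of an element in a file. The helper names `fortran_offset` and `unflatten_column_major` are hypothetical and serve only as an illustration.

```python
def fortran_offset(i, j, rows):
    """0-based linear offset of element (i, j), with 0-based indices,
    in a column-major array that has `rows` rows."""
    return i + j * rows


def unflatten_column_major(flat, rows, cols):
    """Rebuild a list of rows from values stored column by column."""
    if len(flat) != rows * cols:
        raise ValueError("length of flat data does not match rows * cols")
    return [[flat[fortran_offset(i, j, rows)] for j in range(cols)]
            for i in range(rows)]


if __name__ == "__main__":
    # The 2 x 3 matrix [[1, 2, 3], [4, 5, 6]] is stored as 1, 4, 2, 5, 3, 6.
    print(unflatten_column_major([1, 4, 2, 5, 3, 6], rows=2, cols=3))
```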

The testing dataset is available for download (please conserve bandwidth by downloading only what is of interest to you).

An example of the data format of the test cases:

file name   data type    rows     columns
FN.X        COMPLEX(8)   2*L*a    G
FN.T        COMPLEX(8)   2*L      2*L
FN.U        REAL(8)      L*a      1
FN.YY       COMPLEX(8)   2*L*a    G
FN.WW       COMPLEX(8)   2*L*a    G
FN.JJ       INTEGER(8)   2*L*a    1
FN.Y        COMPLEX(8)   G        G
FN.W        COMPLEX(8)   G        G
FN.J        INTEGER(8)   G        1
FN.P        INTEGER(8)   G        1
FN.O        INTEGER(8)   G        1
FN.YU       COMPLEX(8)   G        G
FN.WV       COMPLEX(8)   G        G
FN.Z        COMPLEX(8)   G        G
FN.EY       REAL(8)      G        1
FN.EW       REAL(8)      G        1
FN.E        REAL(8)      G        1
FN.SY       REAL(8)      G        1
FN.SW       REAL(8)      G        1
FN.SS       REAL(8)      G        1
FN.ZZ       COMPLEX(8)   G        G
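
The table can be turned into a quick sanity check of file sizes. The sketch below rests on assumptions that are not stated in this README: that COMPLEX(8) occupies 16 bytes and REAL(8) and INTEGER(8) occupy 8 bytes each (as with the usual GNU and Intel Fortran kinds), and that the files are headerless, little-endian binary dumps. The names `expected_sizes` and `read_real8` are hypothetical, and only a few of the files are covered.

```python
import struct

# Assumed element sizes in bytes (typical for GNU and Intel Fortran).
ELEMENT_BYTES = {"COMPLEX(8)": 16, "REAL(8)": 8, "INTEGER(8)": 8}


def expected_sizes(L, a, G):
    """Expected sizes in bytes for a subset of the files in the table,
    keyed by the file-name extension."""
    shapes = {
        "X": ("COMPLEX(8)", 2 * L * a, G),
        "T": ("COMPLEX(8)", 2 * L, 2 * L),
        "U": ("REAL(8)", L * a, 1),
        "JJ": ("INTEGER(8)", 2 * L * a, 1),
        "Y": ("COMPLEX(8)", G, G),
        "E": ("REAL(8)", G, 1),
    }
    return {ext: ELEMENT_BYTES[kind] * rows * cols
            for ext, (kind, rows, cols) in shapes.items()}


def read_real8(path, count):
    """Read `count` REAL(8) values, assuming a headerless file of
    little-endian IEEE doubles (an assumption, see the lead-in)."""
    with open(path, "rb") as f:
        data = f.read(8 * count)
    if len(data) != 8 * count:
        raise ValueError("file is shorter than expected")
    return list(struct.unpack(f"<{count}d", data))


if __name__ == "__main__":
    # L = 2, a = 3, G = 4 are made-up values for illustration.
    for ext, size in expected_sizes(2, 3, 4).items():
        print(f"FN.{ext}: {size} bytes")
```

Comparing these numbers with the actual sizes on disk is a cheap way to catch a mismatch in L, a, or G before starting a long run.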

Phase 0

Output: FN.X, FN.T, FN.U.

Phase 1

Input: FN.X, FN.T, FN.U.

Output: FN.YY, FN.WW, FN.JJ.

Phase 2

Input: FN.YY, FN.WW, FN.JJ.

Output: FN.Y, FN.W, FN.J, FN.P, FN.O.

Phase 3

Input: FN.Y, FN.W, FN.J.

Output: FN.YU, FN.WV, FN.Z; FN.EY, FN.EW, FN.E; FN.SY, FN.SW, FN.SS.

Phase 4

Input: FN.Z.

Output: FN.ZZ.
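
The inputs and outputs of the phases listed above form a simple chain, which the following minimal sketch encodes and checks. The `PHASES` table copies the file extensions from the lists above; the function `missing_inputs` is a hypothetical helper, and the custom-format input file of Phase 0 is treated as external and left out.

```python
# (phase name, input extensions, output extensions), as listed above.
PHASES = [
    ("phase0", [], ["X", "T", "U"]),
    ("phase1", ["X", "T", "U"], ["YY", "WW", "JJ"]),
    ("phase2", ["YY", "WW", "JJ"], ["Y", "W", "J", "P", "O"]),
    ("phase3", ["Y", "W", "J"],
     ["YU", "WV", "Z", "EY", "EW", "E", "SY", "SW", "SS"]),
    ("phase4", ["Z"], ["ZZ"]),
]


def missing_inputs(phases):
    """Return {phase: [extensions]} for inputs that no earlier phase
    produces; an empty dict means the chain is consistent."""
    available = set()
    problems = {}
    for name, inputs, outputs in phases:
        missing = [ext for ext in inputs if ext not in available]
        if missing:
            problems[name] = missing
        available.update(outputs)
    return problems


if __name__ == "__main__":
    print(missing_inputs(PHASES))  # an empty dict if every input is produced earlier
```

The same table could drive a check that the required FN.* files exist on disk before a phase is started.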

This work has been supported in part by the Croatian Science Foundation under the project IP-2014-09-3670 (MFBDA).
