Join GitHub today
GitHub is home to over 28 million developers working together to host and review code, manage projects, and build software together.Sign up
An AVX512 extention to the OpenQCD lattice simulation code
Fetching latest commit…
Cannot retrieve the latest commit at this time.
|Failed to load latest commit information.|
******************************************************************************** openQCD Simulation Program ******************************************************************************** LATTICE THEORY Currently the common features of the supported lattice theories are the following: * 4-dimensional hypercubic N0xN1xN2xN3 lattice with even sizes N0,N1,N2,N3. Open, Schrödinger functional (SF), open-SF or periodic boundary conditions in the time direction and periodic boundary conditions in the space directions. * SU(3) gauge group, plaquette plus planar double-plaquette gauge action (Wilson, Symanzik, Iwasaki,...). * O(a)-improved Wilson quarks in the fundamental representation of the gauge group. Among the supported quark multiplets are the classical ones (pure gauge, two-flavour theory, 2+1 and 2+1+1 flavour QCD), but doublets with a twisted mass and theories with many doublets, for example, are also supported. The O(a)-improvement includes the boundary counterterms required for the improvement of the correlation functions near the boundaries of the lattice in the time direction if open, SF or open-SF boundary conditions are chosen. For the quark fields phase-periodic boundary conditions in the space directions are implemented too. SIMULATION ALGORITHM The simulation program is based on the HMC algorithm. For the heavier quarks, a version of the RHMC algorithm is used. Several advanced techniques are implemented that can be configured at run time: * Nested hierarchical integrators for the molecular-dynamics equations, based on any combination of the leapfrog, 2nd order Omelyan-Mryglod-Folk (OMF) and 4th order OMF elementary integrators, are supported. * Twisted-mass Hasenbusch frequency splitting, with any number of factors and twisted masses. Optionally with even-odd preconditioning. * Twisted-mass determinant reweighting. * Deflation acceleration and chronological solver along the molecular-dynamics trajectories. * A choice of solvers (CGNE, MSCG, SAP+GCR, deflated SAP+GCR) for the Dirac equation, separately configurable for each force component and pseudo-fermion action. All of these depend on a number of parameters, whose values are passed to the simulation program together with those of the action parameters (coupling constants, quark masses, etc.) through a structured input parameter file. PROGRAM FEATURES All programs parallelize in 0,1,2,3 or 4 dimensions, depending on what is specified at compilation time. They are highly optimized for machines with current Intel or AMD processors, but will run correctly on any system that complies with the ISO C89 (formerly ANSI C) and the MPI 1.2 standards. For the purpose of testing and code development, the programs can also be run on a desktop or laptop computer. All what is needed for this is a compliant C compiler and a local MPI installation such as Open MPI. DOCUMENTATION The simulation program has a modular form, with strict prototyping and a minimal use of external variables. Each program file contains a small number of externally accessible functions whose functionality is described at the top of the file. The data layout is explained in various README files and detailed instructions are given on how to run the main programs. A set of further documentation files are included in the doc directory, where the normalization conventions, the chosen algorithms and other important program elements are described. COMPILATION The compilation of the programs requires an ISO C89 compliant compiler and a compatible MPI installation that complies with the MPI standard 1.2 (or later). In the main and devel directories, a GNU-style Makefile is included which compiles and links the programs (type "make" to compile everything; "make clean" removes the files generated by "make"). The compiler options can be set by editing the CFLAGS line in the Makefiles. The Makefiles assume that the following environment variables are set: GCC GNU C compiler command [Example: /usr/bin/gcc]. MPI_HOME MPI home directory [Example: /usr/lib64/mpi/gcc/openmpi]. The mpicc command used is the one in $MPI_HOME/bin and the MPI libraries are expected in $MPI_HOME/lib. MPI_INCLUDE Directory where the mpi.h file is to be found. All programs are then compiled and linked using the mpicc command. The compiler command to be used for the compilation of the modules and the link step can be changed by editing the CC and CLINKER lines in the Makefiles. Independently of what is specified there, the GCC compiler is used to resolve the dependencies on the include files. SSE/AVX ACCELERATION Current Intel and AMD processors are able to perform arithmetic operations on short vectors of floating-point numbers in just one or two machine cycles, using SSE and/or AVX instructions. Many programs in the module directories include inline-assembly SSE and AVX code. Inline assembly is a GCC extension of the C language that may not be supported by other compilers. On 64bit systems the code can be activated by setting the compiler flags -Dx64 or -DAVX, respectively. In addition, SSE prefetch instructions will be used if one of the following options is specified: -DP4 Assume that prefetch instructions fetch 128 bytes at a time (Pentium 4 and related Xeons). -DPM Assume that prefetch instructions fetch 64 bytes at a time (Athlon, Opteron, Pentium M, Core, Core 2 and related Xeons). -DP3 Assume that prefetch instructions fetch 32 bytes at a time (Pentium III). These options have an effect only if -Dx64 or -DAVX is set. The option -DAVX implies -Dx64. If none of these options is set, the programs do not make use of any C language extensions and are fully portable. The latest x86 processors furthermore support fused multiply-add (FMA3) instructions. OpenQCD makes use of these if the option -DFMA3 is set in addition to -DAVX (setting -DFMA3 alone has no effect). On recent x86-64 machines the recommended compiler flags are thus -std=c89 -O -mno-avx -DAVX -DFMA3 -DPM For older machines that do not support the AVX instruction set, the recommended flags are -std=c89 -O -mno-avx -Dx64 -DPM Aggressive optimization levels such as -O2 and -O3 tend to have little effect on the execution speed of the programs, but the risk of generating wrong code is higher. AVX instructions and the option -mno-avx may not be known to old versions of the GCC compiler, in which case one may be limited to SSE accelerations with option string -std=c89 -O -Dx64 -DPM (or no acceleration at all). If compilers other than GCC are used together with the option -Dx64 or -DAVX, it is strongly recommended to verify the correctness of the compilation using the check programs in the devel directory. DEBUGGING FLAGS For troubleshooting and parameter tuning, it may helpful to switch on some debugging flags at compilation time. The simulation program then prints a detailed report to the log file on the progress made in specified subprogram. The available flags are: -DCGNE_DBG CGNE solver. -DFGCR_DBG GCR solver. -FGCR4VD_DBG GCR solver for the little Dirac equation. -DMSCG_DBG MSCG solver. -DDFL_MODES_DBG Deflation subspace generation. -DMDINT_DBG Integration of the molecular-dynamics equations. -DRWRAT_DBG Computation of the rational function reweighting factor. RUNNING A SIMULATION The simulation programs reside in the directory "main". For each program, there is a README file in this directory which describes the program functionality and its parameters. Running a simulation for the first time requires its parameters to be chosen, which tends to be a non-trivial task. The syntax of the input parameter files and the meaning of the various parameters is described in some detail in main/README.infiles and doc/parms.pdf. Examples of valid parameter files are contained in the directory main/examples. EXPORTED FIELD FORMAT The field configurations generated in the course of a simulation are written to disk in a machine-independent format (see modules/misc/archive.c). Independently of the machine endianness, the fields are written in little endian format. A byte-reordering is therefore not required when machines with different endianness are used for the simulation and the physics analysis. AUTHORS The initial release of the openQCD package was written by Martin Lüscher and Stefan Schaefer. Support for Schrödinger functional boundary conditions was added by John Bulava. Phase-periodic boundary conditions for the quark fields were introduced by Isabel Campos. Several modules were taken over from the DD-HMC program tree, which includes contributions from Luigi Del Debbio, Leonardo Giusti, Björn Leder and Filippo Palombi. ACKNOWLEDGEMENTS In the course of the development of the openQCD code, many people suggested corrections and improvements or tested preliminary versions of the programs. The authors are particularly grateful to Isabel Campos, Dalibor Djukanovic, Georg Engel, Leonardo Giusti, Björn Leder, Carlos Pena and Hubert Simma for their communications and help. LICENSE The software may be used under the terms of the GNU General Public Licence (GPL). BUG REPORTS If a bug is discovered, please send a report to <email@example.com>. ALTERNATIVE PACKAGES AND COMPLEMENTARY PROGRAMS There is a publicly available BG/Q version of openQCD that takes advantage of the machine-specific features of IBM BlueGene/Q computers. The version is available at <http://hpc.desy.de/simlab/codes/>. The openQCD programs currently do not support reweighting in the quark masses, but a module providing this functionality can be downloaded from <http://www-ai.math.uni-wuppertal.de/~leder/mrw/>. Full-fledged QCD simulation programs tend to have many adjustable parameters. In the case of openQCD, most parameters are passed to the programs through a human-readable structured file. Liam Keegan's sleek graphical editor for these parameter files offers some guidance and complains when inconsistent parameter values are entered (see <http://lkeegan.github.io/openQCD-input-file-editor>).