Support for F functions #299

Madu86 · 2023-09-26T13:30:33Z

In this PR, I have enabled support for F functions in the ERI and ERI gradient calculations. Specifically, the following tasks were performed:

1. Updated the error trap for F functions. The trap activates only if the user compiles the code without F function support.
2. Debugged the F subroutines of the legacy CPU ERI code.
3. Implemented gradient calculation of ERIs containing F functions in the CPU code.
4. Ported the implementations from items 2 and 3 to GPU.
5. Implemented parallel versions of items 2-4.
6. Enabled compilation of the new source files added in items 3-5 in the CMake build system.
7. Optimized the performance of the F kernels in the CUDA and CUDAMPI versions.
8. Updated the default basis set collection with cc-pVTZ, def2-TZVP, and 6-311G(2df,2pd). Also generated the atomic densities required for the SAD guess with these basis sets.
9. Updated the test suite with energy, gradient, and geometry optimization tests for cases containing F functions.
10. Updated CI to compile the code with F function support and run the tests.
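
The error trap described in the task list above can be pictured roughly as follows. This is a minimal Python sketch, not the Fortran code in QUICK: `check_f_support`, `FFunctionSupportError`, and the `enable_f` argument are hypothetical names, with `enable_f` standing in for the compile-time `ENABLEF` setting, and it assumes each shell is described by its angular momentum quantum number (f = 3).

```python
F_ANGULAR_MOMENTUM = 3  # s=0, p=1, d=2, f=3


class FFunctionSupportError(RuntimeError):
    """Hypothetical error for f functions used in a build without F support."""


def check_f_support(shell_angular_momenta, enable_f):
    """Return True if the basis contains f shells.

    Raises only when f shells are present AND the build lacks F support
    (enable_f is an assumed stand-in for the ENABLEF build option).
    """
    has_f = any(l == F_ANGULAR_MOMENTUM for l in shell_angular_momenta)
    if has_f and not enable_f:
        raise FFunctionSupportError(
            "Basis set contains f functions, but the code was compiled "
            "without F function support."
        )
    return has_f
```

A basis without f shells passes regardless of how the code was built, which is the behavior the trap is meant to have.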

The accuracy of the implementations (energy and gradients) was tested against a reference software package, and the results were in excellent agreement. Please check the PR and test exhaustively.

The code can be configured with the CMake build system for the Volta architecture, with F function support and the GNU compiler toolchain, as follows (assuming the build directory is located inside the QUICK home directory).

cmake .. -DMPI=TRUE -DCUDA=TRUE -DCMAKE_INSTALL_PREFIX=$(pwd)/../install -DCOMPILER=GNU -DQUICK_USER_ARCH=volta -DENABLEF=TRUE

Here is an accuracy and performance comparison for a PSB3 gradient calculation at the B3LYP/cc-pVTZ level of theory. CUDA tests were run on NVIDIA A100 cards.

	CPU serial	CPU parallel (2 procs)	GPU serial	GPU parallel (2 procs)
Energy (a.u.)	-249.921840664	-249.921840664	-249.921840510	-249.921840510

Gradients (a.u., Hartree/Bohr)
1X	-0.010743	-0.010743	-0.010743	-0.010743
1Y	-0.003115	-0.003115	-0.003115	-0.003115
1Z	-0.000697	-0.000697	-0.000697	-0.000697
2X	0.002315	0.002315	0.002315	0.002315
2Y	0.000594	0.000594	0.000594	0.000594
2Z	0.010834	0.010834	0.010834	0.010834
3X	0.001040	0.001040	0.001039	0.001039
3Y	0.000329	0.000329	0.000329	0.000329
3Z	-0.009902	-0.009902	-0.009902	-0.009902
4X	-0.001622	-0.001622	-0.001622	-0.001622
4Y	-0.000155	-0.000155	-0.000156	-0.000156
4Z	-0.014091	-0.014091	-0.014090	-0.014090
5X	0.016453	0.016453	0.016452	0.016452
5Y	0.004450	0.004450	0.004450	0.004450
5Z	0.013671	0.013671	0.013671	0.013671
6X	-0.007907	-0.007907	-0.007906	-0.007906
6Y	-0.002272	-0.002272	-0.002263	-0.002263
6Z	-0.000343	-0.000343	-0.000357	-0.000357
7X	-0.001899	-0.001899	-0.001899	-0.001899
7Y	-0.000576	-0.000576	-0.000576	-0.000576
7Z	0.004551	0.004551	0.004551	0.004551
8X	0.003601	0.003601	0.003601	0.003601
8Y	0.000919	0.000919	0.000919	0.000919
8Z	0.004957	0.004957	0.004957	0.004957
9X	0.003502	0.003502	0.003502	0.003502
9Y	0.001099	0.001099	0.001099	0.001099
9Z	-0.005542	-0.005542	-0.005542	-0.005542
10X	-0.002006	-0.002006	-0.002006	-0.002006
10Y	-0.000531	-0.000531	-0.000531	-0.000531
10Z	-0.004566	-0.004566	-0.004566	-0.004566
11X	-0.003991	-0.003991	-0.003991	-0.003991
11Y	-0.001023	-0.001023	-0.001022	-0.001022
11Z	-0.006537	-0.006537	-0.006536	-0.006536
12X	-0.003578	-0.003578	-0.003578	-0.003578
12Y	-0.001116	-0.001116	-0.001125	-0.001125
12Z	0.007073	0.007073	0.007086	0.007086
13X	0.003088	0.003088	0.003088	0.003088
13Y	0.000925	0.000925	0.000925	0.000925
13Z	-0.004499	-0.004499	-0.004499	-0.004499
14X	0.001747	0.001747	0.001747	0.001747
14Y	0.000471	0.000471	0.000471	0.000471
14Z	0.005089	0.005089	0.005089	0.005089

Runtime (s)	2997.88	1560.69	131.18	78.40

The input and output files of these runs can be found inside the example.tar.gz attached below.
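
The speedups and CPU/GPU deviations implied by the table can be worked out with a few lines of Python. This is only a convenience sketch using numbers copied from the table; the variable names are illustrative, and only the four gradient components with the largest CPU/GPU differences are included.

```python
# Values copied from the comparison table above.
runtime_s = {
    "CPU serial": 2997.88,
    "CPU parallel (2 procs)": 1560.69,
    "GPU serial": 131.18,
    "GPU parallel (2 procs)": 78.40,
}
energy_cpu = -249.921840664
energy_gpu = -249.921840510

# (CPU, GPU) values for the four gradient components that differ most.
gradients = {
    "6Y": (-0.002272, -0.002263),
    "6Z": (-0.000343, -0.000357),
    "12Y": (-0.001116, -0.001125),
    "12Z": (0.007073, 0.007086),
}

# Speedup of every configuration relative to the serial CPU run.
speedup = {name: runtime_s["CPU serial"] / t for name, t in runtime_s.items()}

energy_diff = abs(energy_cpu - energy_gpu)
grad_dev = {label: abs(cpu - gpu) for label, (cpu, gpu) in gradients.items()}
worst = max(grad_dev, key=grad_dev.get)

for name, s in speedup.items():
    print(f"{name:24s} {s:6.2f}x")
print(f"|dE| CPU vs GPU: {energy_diff:.2e}")
print(f"largest gradient deviation: {worst} = {grad_dev[worst]:.1e}")
```

With these numbers the serial GPU run is roughly 23x faster than the serial CPU run and the two-GPU run roughly 38x faster. The CPU and GPU energies differ by about 1.5e-7, and the largest gradient deviation is about 1.4e-5, at component 6Z.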

Finally, it is worth noting the limitations of the current CUDA/CUDAMPI F implementations.

1. The CUDA one-electron integral (OEI) code does not support F functions. If a calculation contains F functions, the CPU OEI implementation is used instead (e.g. https://github.com/Madu86/QUICK/blob/ffunc/src/modules/quick_oei_module.f90#L113-L130).
2. The CUDA F implementation requires a large amount of device scratch memory (https://github.com/Madu86/QUICK/blob/ffunc/src/cuda/gpu.cu#L244-L282). Calculations will therefore fail on cards with little device memory (e.g. cards with < 16 GB of memory).
3. The ERI sorting algorithm used for the F kernels is not optimal (https://github.com/Madu86/QUICK/blob/ffunc/src/cuda/gpu.cu#L606-L610). An optimized version has been implemented in the Gen3 ERI effort, which is currently a work in progress.
4. QUICK/Amber QM/MM has not been tested with F functions.
5. Open-shell calculations are not supported.
6. The HIP and HIPMPI versions have not been tested.
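
The first two limitations above amount to a dispatch rule and a memory pre-check. The Python sketch below illustrates the idea only; it is not the logic in `quick_oei_module.f90` or `gpu.cu`. The function names are hypothetical, and the 16 GB threshold is taken from the example figure in the text rather than from a measured requirement.

```python
def select_oei_backend(gpu_build, has_f_functions):
    """Return which one-electron integral (OEI) path to use.

    Hypothetical sketch: the GPU OEI path is chosen only for GPU builds
    running a calculation without f functions; otherwise fall back to CPU.
    """
    if gpu_build and not has_f_functions:
        return "gpu"
    return "cpu"


def enough_device_memory(total_gb, required_gb=16.0):
    """Illustrative pre-check; 16 GB mirrors the example figure in the text."""
    return total_gb >= required_gb


# A CUDA build with f functions in the basis falls back to the CPU OEI code.
print(select_oei_backend(gpu_build=True, has_f_functions=True))
```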

example.tar.gz

agoetz

Nice work. I have looked it over and tested the code.

CPU only (with and without F functions)

GNU 10.2, OpenMPI 4.0.5
GNU 10.2, OpenMPI 4.0.5, MKL 2024.0
GNU 11.4, OpenMPI 4.0.5

CPU and GPU (A100, Expanse)

GNU 10.2.0, OpenMPI 4.1.3, CUDA 11.7, MKL 2020.4

All tests pass.

Open shell (CPU only MPI tested, not serial)

Energy looks good on CPU and GPU
Gradient looks good on CPU

The following still needs to be done (I will open separate PRs):

Add trap for G functions
Add trap for open shell + F functions + gradient + CUDA
Add test with 4 centers with f functions (e.g. def2-TZVP basis)

Madu86 added 30 commits November 30, 2022 08:43

Added cc-pVTZ basis set. Note that there are basis functions having more than 10 primitive functions. If the molecule has certain atoms (eg. V, Mn, etc.), calculation will fail with a seg fault even before the SCF begins.

94c548b

disabled computing OEIs on GPU in order to test high angular momentum ERIs

8287ef5

Merge remote-tracking branch 'upstream/master' into ffunc

6228846

adding new vrr kernels for computing high angular momentum integrals

4f4ba75

updated spd hrr functions to support integrals with f functions whose total angular momentum is less than or equal to 8

958c824

setting maxprim size for f kernels

bc962e8

changing default sort method for f kernels

d9100f9

disabling useless legacy kernels

ab72439

ERIs with f functions are working

a830cdd

Setting scf=N shouldnt effect the sad guess procedure. Fixed this issue.

44ec9d4

Added cc-pVTZ basis set. Note that there are basis functions having more than 10 primitive functions. If the molecule has certain atoms (eg. V, Mn, etc.), calculation will fail with a seg fault even before the SCF begins.

291ad42

disabled computing OEIs on GPU in order to test high angular momentum ERIs

9ecca23

adding new vrr kernels for computing high angular momentum integrals

dcc6ccb

updated spd hrr functions to support integrals with f functions whose total angular momentum is less than or equal to 8

038b01d

setting maxprim size for f kernels

4815bf0

changing default sort method for f kernels

d62ddbe

disabling useless legacy kernels

072cd44

ERIs with f functions are working

67c4637

Setting scf=N shouldnt effect the sad guess procedure. Fixed this issue.

3188ccf

added new value for store array size

49d9c84

updated scratch array sizes

bdc5ef9

Merge branch 'ffunc' of github.com:Madu86/QUICK into ffunc

46028eb

updated driver for F gradient

ac3b86f

added VRR kernels required to compute F gradients

edbe844

fixed an array size issue

05ca3e9

disabled cuda oei gradients

0202ff2

updated to call spdf3 grad kernel only if f functions are enables

90333cb

updated to perform HRR using contracted integrals rather than primitive integrals in d and f grad kernels

306832b

added some high angular momentum device kernels required for computing f gradients

cc1aa1a

splitted ksks class into 3

e9eef57

Madu86 added 23 commits September 22, 2023 13:23

reverted

bfe93b0

added flag based method to call oei subroutines in the presence of f functions

250e80d

added iclass as an include directory

f80f031

enabled running ci with f functions

76fba00

removed obsolete function call

97e6e02

updated to allocate grad scratch only for grad or opt calculations

9d05080

fixed a bug by initializing a variable properly

d841422

added atomic densities for atoms in CC-PVTZ basis set

cef2751

added geometry optimization test with f functions into full test list.

cdd5bb3

added new basis set

bab3c34

updated with available elements statement

6da7154

updated 6-311G2DF2PD.BAS

a63a005

added def2-tzvp basis set

6eeee47

added new gradient test

075fd52

added new energy test with def2-tzvp basis set

e76d081

store2 is being used by oei code. Updated to allocate and deallocate this by default.

9599bb1

added new preprocessor flag

4073281

Updated to throw and error if the user attempts to use f functions without compiling the code with support.

48cfe82

fixed a merge conflict

021c2dd

added function to check for F function error

dab0c9f

added f tests into short test lists

d8e54e3

updated to skip f tests if the code wasnt compiled with support for f functions

a451e09

updated sort method for computing ERIs with f functions

a643568

Madu86 added the enhancement label Sep 26, 2023

Madu86 requested review from agoetz and akhilshajan September 26, 2023 13:30

Madu86 self-assigned this Sep 26, 2023

agoetz approved these changes Feb 25, 2024

agoetz merged commit 642885c into merzlab:ffunc-gen2 Feb 25, 2024
4 checks passed

agoetz mentioned this pull request Feb 25, 2024

Enable support for F functions #312

Merged
