Adding a new calculator #248

baumeier · 2023-04-25T07:37:40Z

Hi devs,

I am looking at adding a new calculator in order to use pysisypus with our own code for ground and excited states (VOTCA, github.com/votca). I have looked at some of the existing calculators trying to reverse engineer what the required functions are to implement, but I guess there is a more efficient way to do that? Do you maybe have some overview or even bare template of what needs to be filled and how?
In particular for the excited state optimizations, I think you explicitly need the info about the MOs expressed GTOs? Then what ordering of functions do you expect in the GTO basis?

Any help is appreciated.

eljost · 2023-04-26T08:43:57Z

Dear Prof. Baumeier,
thanks for your interest in pysisyphus. Below I compiled some notes on how to proceed:

Creating a new calculator

Add new class

Create a new class in pysisyphus.calculators subpackage
1. Subclass pysisyphus.calculators.Calculator or pysisyphus.calculators.OverlapCalculator
2. If the wrapped program is executed by calling a binary add the appropriate conf_key and add an entry to ~/.pysisyphusrc. The conf_key must coincide with the entry
  in the ~/.pysisyphusrc.
Add an import to pysisyphus.calculators.__init__
If the calculator should be available via YAML input add a key to CALC_DICT in pysisyphus.run

Add required methods to class

General idea: implement get_energy/get_forces/(get_hessian) in the Calculator. These function are always called with Cartesian coordinates. In these functions you should construct
the required inputs to your program and call it via self.run(). Depending on the type of calculation, a parser function is called with the path to the calculation outputs. The parser functions
are registered in the self.parser_funcs dict and should extract the required information from the produced files.*

Implement the required methods in your calculator.
1. The most important method is probably get_forces, followed by get_energy and get_hessian.
2. All of these three methods take three arguments (see pysisyphus.calculators.Calculator.Calculator for the function signatures)
  1. atoms, tuple of atom-strings of length N
  2. coords, 1d numpy.array of 3N Cartesian coordinates in Bohr
  3. **prepare_kwargs, possibly empty dict that can contain point charges; not strictly required.
3. All of these methods return a dict that always contains the "energy" key and the energy in Hartree. The return value of get_forces must also contain a "forces" key
  that points to a 1d numpy.array of length 3N, containing Cartesian forces (not gradient!) in atomic units Hartree / Bohr. Similarily, results from get_hessian must contain
  the hessian key, pointing to an 2d numpy.array of shape (3N, 3N) containing the Cartesian Hessian in atomic units.
4. If get_hessian is not provided but a Hessian calculation is requested pysisyphus falls back to a numerical Hessian in newer versions. So get_hessian is not strictly required.
It is then up to you to create the appropriate inputs/input files to carry out the desired calculation.
Actual calculations should be initiated in get_energy/get_forces/get_hessian by calling self.run. See pysisyphus.calculators.Calculator.Calculator for possible arguments, the method is quite flexible. See pysisyphus.calculators.ORCA.ORCA for a simple example where Calculator.run is just called with an input-string or pysisyphus.calculators.Turbomole
for a more involved example where a modified $ENVIRONMENT is used. An important argument to Calculator.run is calc, which indicates the type of calculation carried out.
The provided argument to calc (energy/force/hessian/etc.) must coincide with a key in self.parse_funcs, that points to the respective parser. The parser functions are called with
a single argument: the path (type pathlib.Path) to the temporary directory, where the calculation was carried out.
If you want to keep certain files you should add a pattern to self.to_keep. After a calculation, files matching patterns in self.to_keep are saved in out_dir.
By providing appropriate entries in self._set_plans certain attributes of the calculator are updated with the most recent filenames from self.to_keep, after a calculation. This
way one can easily carry MO coefficients from one calculation to another, to speed up SCF convergence. The _set_plans feature was just recently introduced and is not yet well documented.
If may be a good idea to checkout the ORCA calculator, as this is currently the most complete one in pysisyphus.

Excited-state handling

Excited-state aware calculators must subclass pysisyphus.calculators.OverlapCalculator.OverlapCalculator.
The calculator must implement a prepare_overlap_data method that returns four or seven quantities
1. alpha MO-Coefficients, 2d numpy.array of shape (nocc+nvirt, nocc+nvirt) with nocc and nvirt denoting the number of
  occupied and virtual alpha MOs
2. Xalpha, excitation transition density matrix of alpha electrons, as found in LR-TDDFT of shape (nstates, nocc, nvirt)
3. Yalpha, deexcitation transition density matrix of alpha electrons, as found in LR-TDDFT of shape (nstates, nocc, nvirt), can also be full
  of zeros, e.g., when a TDA calculation was carried out
4. all_energies, 1d numpy.array containing the absolute energies of all calculated states in Bohr
When seven quantities are returned then the related quantities for beta electrons should follow alpha MOs, Xalpha and Yalpha:
1. alpha MOs
2. Xalpha
3. Yalpha
4. beta MOs
5. Xbeta
6. Ybeta
7. all_energies
It is probably a good idea to implement a method like store_and_track in the ORCA calculator, that handles storing of the above quantities and root tracking.

Unit tests

Add a test directory in the tests subdirectory and include a test_file. Pysisyphus uses pytest.
1. Tests that depend on a certain binary or another python package must be decorated with the @using("[your_conf_key]") from the pysisyphus.testing module (see tests/test_orca/test_orca.py). Decorated tests with unavailable dependencies are skipped, otherwise the test would be executed an fail.
2. Add tests to verify the functionality of your calculator, e.g., by energy and gradient calculations on small molecules. Built-in geometries are easily loaded via geom_loader("lib:[file_name]") . See pysisyphus/geom_library for all geometries.
3. Scalar values in tests should be checked with pytest.approx, array values with numpy.testing.assert_allclose

in particular for the excited state optimizations, I think you explicitly need the info about the MOs expressed GTOs? Then what ordering of functions do you expect in the GTO basis?

Any help is appreciated.

The way everything is implemented right now the only requirement is that the order of the MO-matrix and the order of the transition density matrices are consistent.

I'll try to provide a kind of template tomorrow.

Maybe it would be a good idea to wait 2 to 3 weeks before starting with a new calculator, because some parts of pysisyphus are heavily in flux right now, especially the OverlapCalculator. Previously pysisyphus only supported tracking for closed-shell systems but on the wigner branch I now added support for open-shell too. I plan to make a new release soon. Maybe this would then be a good starting point for you/your group to develop a VOTCA calculator.

All the best
Johannes

baumeier · 2023-06-16T09:00:17Z

Hello Johannes,
thank you for the detailed explanation and sorry for reacting so late.
In the meantime, we have from our side made python bindings to our C++ code and ship that in the latest development version of VOTCA as an ASE calculator. Would you think using this is an easier starting point for the integration with pysisyphus? I think it already provides natively the energy and gradient functions and we could maybe just expand it by returning also the additional excited state information (MOs, Xalpha, Yalpha) in a convenient way.

Cheers,
Bjoern

eljost · 2023-07-19T10:44:38Z

Dear Bjoern,

I just had a quick look at your documentation and VOTCA seems to comprise several modules, e.e.g, CSG and XTP, isn't it? Will the pysisyphus calculator will about the XTP module?

Can you please provide a link to the ASE calculator? A quick search on ASE's gitlab repo for "VOTCA" returned no results. Are you talking about this snippet https://github.com/votca/votca/blob/4645704e18767051905d821447f6f211c55475f6/xtp-tutorials/pyxtp/run_energy.py#L4 ?

EDIT
Ok, I seem to have found it.

baumeier · 2023-07-20T14:26:51Z

Yes, the one you found is what I was referring to. It is not part of the ASE package itself (hopefully in the future), but for now it lives in the VOTCA-XTP environment.

eljost · 2023-07-20T20:29:23Z

Do you still need some support or some direction on how to proceed? If so, let me know.

Personally, I like decoupled calculators like my Psi4 calculator. This way VOTCA could come from conda and is called via a shell-script, as runpsi4.sh.
But depending on what you want to, this approach could be limiting. For me, decoupled calculators are easier to support as I don't have to combine pysisyphus and Psi4 and VOTCA and other calculators in one python environment.

But going the decoupled route may also prevent you from using the stuff you alread built for pyxtp...

One quick idea I had was to reuse the xml files found in $VOTCASHARE, and update them as necessary with options provided as nested dicts. This way you would have full control over most of the options. But maybe it would be tedious to create the inputs.

Would you like to use pysisyphus via YAML inputs or more via the Python API? A possible YAML input with the approach outlined above could look like:

geom:
 [...]
calc:
 type: votca
 # charge and spin should not be defined as child of
 # options.dftpackage
 charge: 0
 mult: 1
 # Maybe just pick the appropriate XML file according to the string in 'execute'.
 # But as I just noticed ... you include other xml files via the link attribute...
 # Another approach would be to add an XML file as explicit argument to the VOTCA
 # calculator, which could then be further customized on a per-calculation basis
 # by the arguments in 'options' below...
 execute: dftgwbse
 options:
  dftpackage:
   basisset: def2-svp
   auxbasisset: aux-def2-svp
  [etc]
opt:
 thresh: gau

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Adding a new calculator #248

Adding a new calculator #248

baumeier commented Apr 25, 2023

eljost commented Apr 26, 2023

baumeier commented Jun 16, 2023

eljost commented Jul 19, 2023 •

edited

Loading

baumeier commented Jul 20, 2023

eljost commented Jul 20, 2023

Adding a new calculator #248

Adding a new calculator #248

Comments

baumeier commented Apr 25, 2023

eljost commented Apr 26, 2023

Creating a new calculator

Add new class

Add required methods to class

Excited-state handling

Unit tests

baumeier commented Jun 16, 2023

eljost commented Jul 19, 2023 • edited Loading

baumeier commented Jul 20, 2023

eljost commented Jul 20, 2023

eljost commented Jul 19, 2023 •

edited

Loading