Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding a new calculator #248

Open
baumeier opened this issue Apr 25, 2023 · 5 comments
Open

Adding a new calculator #248

baumeier opened this issue Apr 25, 2023 · 5 comments

Comments

@baumeier
Copy link

Hi devs,

I am looking at adding a new calculator in order to use pysisypus with our own code for ground and excited states (VOTCA, github.com/votca). I have looked at some of the existing calculators trying to reverse engineer what the required functions are to implement, but I guess there is a more efficient way to do that? Do you maybe have some overview or even bare template of what needs to be filled and how?
In particular for the excited state optimizations, I think you explicitly need the info about the MOs expressed GTOs? Then what ordering of functions do you expect in the GTO basis?

Any help is appreciated.

@eljost
Copy link
Owner

eljost commented Apr 26, 2023

Dear Prof. Baumeier,
thanks for your interest in pysisyphus. Below I compiled some notes on how to proceed:

Creating a new calculator

Add new class

  1. Create a new class in pysisyphus.calculators subpackage
    1. Subclass pysisyphus.calculators.Calculator or pysisyphus.calculators.OverlapCalculator
    2. If the wrapped program is executed by calling a binary add the appropriate conf_key and add an entry to ~/.pysisyphusrc. The conf_key must coincide with the entry
      in the ~/.pysisyphusrc.
  2. Add an import to pysisyphus.calculators.__init__
  3. If the calculator should be available via YAML input add a key to CALC_DICT in pysisyphus.run

Add required methods to class

General idea: implement get_energy/get_forces/(get_hessian) in the Calculator. These function are always called with Cartesian coordinates. In these functions you should construct
the required inputs to your program and call it via self.run(). Depending on the type of calculation, a parser function is called with the path to the calculation outputs. The parser functions
are registered in the self.parser_funcs dict and should extract the required information from the produced files.
*

  1. Implement the required methods in your calculator.
    1. The most important method is probably get_forces, followed by get_energy and get_hessian.
    2. All of these three methods take three arguments (see pysisyphus.calculators.Calculator.Calculator for the function signatures)
      1. atoms, tuple of atom-strings of length N
      2. coords, 1d numpy.array of 3N Cartesian coordinates in Bohr
      3. **prepare_kwargs, possibly empty dict that can contain point charges; not strictly required.
    3. All of these methods return a dict that always contains the "energy" key and the energy in Hartree. The return value of get_forces must also contain a "forces" key
      that points to a 1d numpy.array of length 3N, containing Cartesian forces (not gradient!) in atomic units Hartree / Bohr. Similarily, results from get_hessian must contain
      the hessian key, pointing to an 2d numpy.array of shape (3N, 3N) containing the Cartesian Hessian in atomic units.
    4. If get_hessian is not provided but a Hessian calculation is requested pysisyphus falls back to a numerical Hessian in newer versions. So get_hessian is not strictly required.
  2. It is then up to you to create the appropriate inputs/input files to carry out the desired calculation.
  3. Actual calculations should be initiated in get_energy/get_forces/get_hessian by calling self.run. See pysisyphus.calculators.Calculator.Calculator for possible arguments, the method is quite flexible. See pysisyphus.calculators.ORCA.ORCA for a simple example where Calculator.run is just called with an input-string or pysisyphus.calculators.Turbomole
    for a more involved example where a modified $ENVIRONMENT is used. An important argument to Calculator.run is calc, which indicates the type of calculation carried out.
    The provided argument to calc (energy/force/hessian/etc.) must coincide with a key in self.parse_funcs, that points to the respective parser. The parser functions are called with
    a single argument: the path (type pathlib.Path) to the temporary directory, where the calculation was carried out.
  4. If you want to keep certain files you should add a pattern to self.to_keep. After a calculation, files matching patterns in self.to_keep are saved in out_dir.
  5. By providing appropriate entries in self._set_plans certain attributes of the calculator are updated with the most recent filenames from self.to_keep, after a calculation. This
    way one can easily carry MO coefficients from one calculation to another, to speed up SCF convergence. The _set_plans feature was just recently introduced and is not yet well documented.
  6. If may be a good idea to checkout the ORCA calculator, as this is currently the most complete one in pysisyphus.

Excited-state handling

  1. Excited-state aware calculators must subclass pysisyphus.calculators.OverlapCalculator.OverlapCalculator.
  2. The calculator must implement a prepare_overlap_data method that returns four or seven quantities
    1. alpha MO-Coefficients, 2d numpy.array of shape (nocc+nvirt, nocc+nvirt) with nocc and nvirt denoting the number of
      occupied and virtual alpha MOs
    2. Xalpha, excitation transition density matrix of alpha electrons, as found in LR-TDDFT of shape (nstates, nocc, nvirt)
    3. Yalpha, deexcitation transition density matrix of alpha electrons, as found in LR-TDDFT of shape (nstates, nocc, nvirt), can also be full
      of zeros, e.g., when a TDA calculation was carried out
    4. all_energies, 1d numpy.array containing the absolute energies of all calculated states in Bohr
  3. When seven quantities are returned then the related quantities for beta electrons should follow alpha MOs, Xalpha and Yalpha:
    1. alpha MOs
    2. Xalpha
    3. Yalpha
    4. beta MOs
    5. Xbeta
    6. Ybeta
    7. all_energies
  4. It is probably a good idea to implement a method like store_and_track in the ORCA calculator, that handles storing of the above quantities and root tracking.

Unit tests

  1. Add a test directory in the tests subdirectory and include a test_file. Pysisyphus uses pytest.
    1. Tests that depend on a certain binary or another python package must be decorated with the @using("[your_conf_key]") from the pysisyphus.testing module (see tests/test_orca/test_orca.py). Decorated tests with unavailable dependencies are skipped, otherwise the test would be executed an fail.
    2. Add tests to verify the functionality of your calculator, e.g., by energy and gradient calculations on small molecules. Built-in geometries are easily loaded via geom_loader("lib:[file_name]") . See pysisyphus/geom_library for all geometries.
    3. Scalar values in tests should be checked with pytest.approx, array values with numpy.testing.assert_allclose

in particular for the excited state optimizations, I think you explicitly need the info about the MOs expressed GTOs? Then what ordering of functions do you expect in the GTO basis?

Any help is appreciated.

The way everything is implemented right now the only requirement is that the order of the MO-matrix and the order of the transition density matrices are consistent.

I'll try to provide a kind of template tomorrow.

Maybe it would be a good idea to wait 2 to 3 weeks before starting with a new calculator, because some parts of pysisyphus are heavily in flux right now, especially the OverlapCalculator. Previously pysisyphus only supported tracking for closed-shell systems but on the wigner branch I now added support for open-shell too. I plan to make a new release soon. Maybe this would then be a good starting point for you/your group to develop a VOTCA calculator.

All the best
Johannes

@baumeier
Copy link
Author

Hello Johannes,
thank you for the detailed explanation and sorry for reacting so late.
In the meantime, we have from our side made python bindings to our C++ code and ship that in the latest development version of VOTCA as an ASE calculator. Would you think using this is an easier starting point for the integration with pysisyphus? I think it already provides natively the energy and gradient functions and we could maybe just expand it by returning also the additional excited state information (MOs, Xalpha, Yalpha) in a convenient way.

Cheers,
Bjoern

@eljost
Copy link
Owner

eljost commented Jul 19, 2023

Dear Bjoern,

I just had a quick look at your documentation and VOTCA seems to comprise several modules, e.e.g, CSG and XTP, isn't it? Will the pysisyphus calculator will about the XTP module?

Can you please provide a link to the ASE calculator? A quick search on ASE's gitlab repo for "VOTCA" returned no results. Are you talking about this snippet https://github.com/votca/votca/blob/4645704e18767051905d821447f6f211c55475f6/xtp-tutorials/pyxtp/run_energy.py#L4 ?

EDIT
Ok, I seem to have found it.

@baumeier
Copy link
Author

Yes, the one you found is what I was referring to. It is not part of the ASE package itself (hopefully in the future), but for now it lives in the VOTCA-XTP environment.

@eljost
Copy link
Owner

eljost commented Jul 20, 2023

Do you still need some support or some direction on how to proceed? If so, let me know.

Personally, I like decoupled calculators like my Psi4 calculator. This way VOTCA could come from conda and is called via a shell-script, as runpsi4.sh.
But depending on what you want to, this approach could be limiting. For me, decoupled calculators are easier to support as I don't have to combine pysisyphus and Psi4 and VOTCA and other calculators in one python environment.

But going the decoupled route may also prevent you from using the stuff you alread built for pyxtp...

One quick idea I had was to reuse the xml files found in $VOTCASHARE, and update them as necessary with options provided as nested dicts. This way you would have full control over most of the options. But maybe it would be tedious to create the inputs.

Would you like to use pysisyphus via YAML inputs or more via the Python API? A possible YAML input with the approach outlined above could look like:

geom:
 [...]
calc:
 type: votca
 # charge and spin should not be defined as child of
 # options.dftpackage
 charge: 0
 mult: 1
 # Maybe just pick the appropriate XML file according to the string in 'execute'.
 # But as I just noticed ... you include other xml files via the link attribute...
 # Another approach would be to add an XML file as explicit argument to the VOTCA
 # calculator, which could then be further customized on a per-calculation basis
 # by the arguments in 'options' below...
 execute: dftgwbse
 options:
  dftpackage:
   basisset: def2-svp
   auxbasisset: aux-def2-svp
  [etc]
opt:
 thresh: gau

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants