Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GeomeTRIC Interface #1813

Merged
merged 16 commits into from
Apr 22, 2020
Merged

GeomeTRIC Interface #1813

merged 16 commits into from
Apr 22, 2020

Conversation

zachglick
Copy link
Contributor

@zachglick zachglick commented Feb 17, 2020

Description

Allows for the use of the GeomeTRIC optimizer within a Psi4 input. The desired optimization engine, either geometric or optking (default), can now specified with an engine argument to the optimize() function. In addition, a dictionary of GeomeTRIC-specific keywords and options (like constraints) may be passed to the optimizer. The test_h2o_constrained pytest demonstrates how this is done.

e = optimize(..., engine=`geometric`, optimizer_keywords={...})

Output is consistent with Psi4's default geometry optimization:

Example result

>>> grep "~" output.dat


  ==> GeomeTRIC Optimizer <==                                                                   ~
  Psi4 convergence criteria QCHEM  not recognized by GeomeTRIC, switching to GAU_TIGHT          ~
  Measures of convergence in internal coordinates in au.                                        ~
  Criteria marked as inactive (o), active & met (*), and active & unmet ( ).                    ~
  --------------------------------------------------------------------------------------------- ~
   Step     Total Energy     Delta E     MAX Force     RMS Force      MAX Disp      RMS Disp    ~
  --------------------------------------------------------------------------------------------- ~
    Convergence Criteria    1.00e-06      1.50e-05      1.00e-05      6.00e-05      4.00e-05    ~
  --------------------------------------------------------------------------------------------- ~
      0  -7.64427364e+01    --------      5.01e-02      4.03e-02      --------      --------    ~
      1  -7.64446505e+01   -1.91e-03      2.68e-03      1.95e-03      3.06e-02      2.16e-02    ~
      2  -7.64446681e+01   -1.77e-05      5.27e-04      4.17e-04      4.22e-03      3.98e-03    ~
      3  -7.64446684e+01   -3.06e-07 *    2.27e-05      2.03e-05      4.11e-04      2.93e-04    ~
      4  -7.64446684e+01    6.91e-10 *    3.28e-06 *    2.74e-06 *    1.78e-05 *    1.49e-05 *  ~
  Optimization converged!                                                                       ~

Todos

  • Working GeomeTRIC interface
  • Improved printing and error handling
  • Pytest(s)
  • Constrained optimizations
  • Process GeomeTRIC keywords in Psi4 input
  • Composite energy calls (CBS, etc.)
  • Documentation

Checklist

Status

  • Ready for review
  • Ready for merge

psi4/driver/driver.py Outdated Show resolved Hide resolved
psi4/driver/driver.py Outdated Show resolved Hide resolved
psi4/driver/driver.py Outdated Show resolved Hide resolved
psi4/driver/driver.py Outdated Show resolved Hide resolved
@dgasmith
Copy link
Member

You can pull geometric apart like so: https://github.com/leeping/geomeTRIC/blob/master/geometric/tests/test_batch_opt.py

This is a bit complex compared to what you wish to do, but it may give a better product.

Here is everything unwound:

import qcengine as qcng
import qcelemental as qcel
import geometric
import pkg_resources
import logging.config
import sys

mol_uc2 = qcel.models.Molecule.from_data(
    """
O 0 0 0
H 0 0 1
H 0 1 0
"""
)

input_data = {
    "keywords": {
        "convergence_set": "GAU_LOOSE",
        "coordsys": "tric",
        "maxiter": 25,
        "enforce": 0.1,
        "constraints": {
            "set": [
                {"type": "distance", "indices": [0, 1], "value": 1},
            ]
        },
#        "program": "psi4",
        "program": "mopac",
    },
    "input_specification": {
        "driver": "gradient",
        "model": {"method": "pm6-d3"},
#        "model": {"method": "b3lyp-d3", "basis": "sto-3g"},
    },
    "initial_molecule": mol_uc2.dict(),
}



# Set a temporary logger to capture output
log_stream = geometric.nifty.RawStreamHandler(stream=sys.stdout)
#log_stream = geometric.nifty.RawStreamHandler(stream=StringIO())
logger = geometric.nifty.logger
logger.addHandler(log_stream)


# Parse JSON
input_opts = geometric.run_json.parse_input_json_dict(input_data)
M, engine = geometric.optimize.get_molecule_engine(**input_opts)

# Handle constraints
constraints_dict = input_opts.get('constraints', {})
constraints_string = geometric.run_json.make_constraints_string(constraints_dict)
Cons, CVals = None, None
if constraints_string:
    if 'scan' in constraints_dict:
        raise ValueError("No scan!")
    Cons, CVals = geometric.optimize.ParseConstraints(M, constraints_string)


# Set up the internal coordinate system
coordsys = input_opts.get('coordsys', 'tric')
CoordSysDict = {
    'cart': (geometric.internal.CartesianCoordinates, False, False),
    'prim': (geometric.internal.PrimitiveInternalCoordinates, True, False),
    'dlc': (geometric.internal.DelocalizedInternalCoordinates, True, False),
    'hdlc': (geometric.internal.DelocalizedInternalCoordinates, False, True),
    'tric': (geometric.internal.DelocalizedInternalCoordinates, False, False)
}

# Build internal coordinates
CoordClass, connect, addcart = CoordSysDict[coordsys.lower()]
IC = CoordClass(
    M,
    build=True,
    connect=connect,
    addcart=addcart,
    constraints=Cons,
    cvals=CVals[0] if CVals is not None else None)



# Get initial coordinates in bohr
coords = M.xyzs[0].flatten() * geometric.nifty.ang2bohr

# Setup an optimizer object
params = geometric.optimize.OptParams(**input_opts)
optimizer = geometric.optimize.Optimizer(coords, M, IC, engine, None, params)

# Print
IC.printConstraints(coords, thre=-1)

def compute(coords, opt):
    mol_dict = mol_uc2.dict()
    mol_dict['geometry'] = coords

    inpmodel = {
        "molecule": mol_dict,
        "driver": "gradient",
        "model": {"method": "pm6"}
    }
    ret = qcng.compute(inpmodel, "mopac")
    opt.E = ret.properties.return_energy
    opt.gradx = ret.return_result
    return ret


optimizer.calcEnergyForce()
optimizer.prepareFirstStep()
logger.info("[AU]: e=%.5f bl=%.5f,%.5f g=%.4f" % (
        optimizer.E, optimizer.X[0],optimizer.X[3], optimizer.gradx[0]))

while True:
    if optimizer.state in [geometric.optimize.OPT_STATE.CONVERGED, geometric.optimize.OPT_STATE.FAILED]:
        logger.info("Optmization convereged!")
        break

    optimizer.step()
    optimizer.calcEnergyForce()
    optimizer.evaluateStep()
    logger.info("[AU]: e=%.5f bl=%.5f,%.5f g=%.4f" % (
            optimizer.E, optimizer.X[0],optimizer.X[3], optimizer.gradx[0]))

You may want to build another Engine.

@zachglick
Copy link
Contributor Author

You can pull geometric apart like so: https://github.com/leeping/geomeTRIC/blob/master/geometric/tests/test_batch_opt.py

This is a bit complex compared to what you wish to do, but it may give a better product.

Here is everything unwound:

import qcengine as qcng
import qcelemental as qcel
import geometric
import pkg_resources
import logging.config
import sys

mol_uc2 = qcel.models.Molecule.from_data(
    """
O 0 0 0
H 0 0 1
H 0 1 0
"""
)

input_data = {
    "keywords": {
        "convergence_set": "GAU_LOOSE",
        "coordsys": "tric",
        "maxiter": 25,
        "enforce": 0.1,
        "constraints": {
            "set": [
                {"type": "distance", "indices": [0, 1], "value": 1},
            ]
        },
#        "program": "psi4",
        "program": "mopac",
    },
    "input_specification": {
        "driver": "gradient",
        "model": {"method": "pm6-d3"},
#        "model": {"method": "b3lyp-d3", "basis": "sto-3g"},
    },
    "initial_molecule": mol_uc2.dict(),
}



# Set a temporary logger to capture output
log_stream = geometric.nifty.RawStreamHandler(stream=sys.stdout)
#log_stream = geometric.nifty.RawStreamHandler(stream=StringIO())
logger = geometric.nifty.logger
logger.addHandler(log_stream)


# Parse JSON
input_opts = geometric.run_json.parse_input_json_dict(input_data)
M, engine = geometric.optimize.get_molecule_engine(**input_opts)

# Handle constraints
constraints_dict = input_opts.get('constraints', {})
constraints_string = geometric.run_json.make_constraints_string(constraints_dict)
Cons, CVals = None, None
if constraints_string:
    if 'scan' in constraints_dict:
        raise ValueError("No scan!")
    Cons, CVals = geometric.optimize.ParseConstraints(M, constraints_string)


# Set up the internal coordinate system
coordsys = input_opts.get('coordsys', 'tric')
CoordSysDict = {
    'cart': (geometric.internal.CartesianCoordinates, False, False),
    'prim': (geometric.internal.PrimitiveInternalCoordinates, True, False),
    'dlc': (geometric.internal.DelocalizedInternalCoordinates, True, False),
    'hdlc': (geometric.internal.DelocalizedInternalCoordinates, False, True),
    'tric': (geometric.internal.DelocalizedInternalCoordinates, False, False)
}

# Build internal coordinates
CoordClass, connect, addcart = CoordSysDict[coordsys.lower()]
IC = CoordClass(
    M,
    build=True,
    connect=connect,
    addcart=addcart,
    constraints=Cons,
    cvals=CVals[0] if CVals is not None else None)



# Get initial coordinates in bohr
coords = M.xyzs[0].flatten() * geometric.nifty.ang2bohr

# Setup an optimizer object
params = geometric.optimize.OptParams(**input_opts)
optimizer = geometric.optimize.Optimizer(coords, M, IC, engine, None, params)

# Print
IC.printConstraints(coords, thre=-1)

def compute(coords, opt):
    mol_dict = mol_uc2.dict()
    mol_dict['geometry'] = coords

    inpmodel = {
        "molecule": mol_dict,
        "driver": "gradient",
        "model": {"method": "pm6"}
    }
    ret = qcng.compute(inpmodel, "mopac")
    opt.E = ret.properties.return_energy
    opt.gradx = ret.return_result
    return ret


optimizer.calcEnergyForce()
optimizer.prepareFirstStep()
logger.info("[AU]: e=%.5f bl=%.5f,%.5f g=%.4f" % (
        optimizer.E, optimizer.X[0],optimizer.X[3], optimizer.gradx[0]))

while True:
    if optimizer.state in [geometric.optimize.OPT_STATE.CONVERGED, geometric.optimize.OPT_STATE.FAILED]:
        logger.info("Optmization convereged!")
        break

    optimizer.step()
    optimizer.calcEnergyForce()
    optimizer.evaluateStep()
    logger.info("[AU]: e=%.5f bl=%.5f,%.5f g=%.4f" % (
            optimizer.E, optimizer.X[0],optimizer.X[3], optimizer.gradx[0]))

You may want to build another Engine.

This is super helpful, thanks!

@lgtm-com
Copy link

lgtm-com bot commented Feb 19, 2020

This pull request introduces 2 alerts when merging 1bf69ab into 3121918 - view on LGTM.com

new alerts:

  • 2 for Unused local variable

tests/pytests/test_geometric.py Outdated Show resolved Hide resolved
psi4/driver/driver.py Outdated Show resolved Hide resolved
psi4/driver/driver.py Outdated Show resolved Hide resolved
@loriab loriab added this to the Psi4 1.4 milestone Feb 29, 2020
Copy link
Contributor

@PeterKraus PeterKraus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great already and I'm looking forward to trying it out. I have a couple of minor comments, though. Also, can GeomeTRIC do TS searches? If yes, perhaps a test would be good.

psi4/driver/driver.py Outdated Show resolved Hide resolved
psi4/driver/driver.py Outdated Show resolved Hide resolved
psi4/driver/driver.py Outdated Show resolved Hide resolved
psi4/driver/driver.py Outdated Show resolved Hide resolved
psi4/driver/driver.py Outdated Show resolved Hide resolved
psi4/driver/driver.py Outdated Show resolved Hide resolved
@zachglick
Copy link
Contributor Author

This looks great already and I'm looking forward to trying it out. I have a couple of minor comments, though. Also, can GeomeTRIC do TS searches? If yes, perhaps a test would be good.

Thanks Peter, your comments are very helpful! I don't think transition state searches are implemented yet, although I see that there is a PR in the GeomeTRIC repo:
leeping/geomeTRIC#107

@zachglick zachglick changed the title GeomeTRIC optimizer in native Psi4 input GeomeTRIC Interface Apr 6, 2020
Copy link
Member

@loriab loriab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great!

doc/sphinxman/source/optking.rst Outdated Show resolved Hide resolved
psi4/driver/driver.py Outdated Show resolved Hide resolved
psi4/driver/driver.py Show resolved Hide resolved
tests/pytests/test_geometric.py Outdated Show resolved Hide resolved
pytest.param({'name': 'hf', 'options': {'scf_type': 'pk'}, 'ref_ene' : -76.02082389228, 'ref_nuc': 9.26528625744628}, id='rhf(pk)'),
pytest.param({'name': 'mp2', 'options': {'mp2_type': 'df'}, 'ref_ene' : -76.22711819393223, 'ref_nuc': 9.09137805747361}, id='mp2(df)'),
pytest.param({'name': 'mp2', 'options': {'mp2_type': 'conv'}, 'ref_ene' : -76.2271678506303, 'ref_nuc': 9.091178486990861}, id='mp2(conv)'),
pytest.param({'name': 'b3lyp', 'options': {'scf_type': 'df'}, 'ref_ene' : -76.41632755714534, 'ref_nuc': 9.04535641436914}, id='b3lyp'),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How (step & opt convergence conditions) did you get the ref E and nre? It can be difficult to make opt tests checks that are robust to the optimizer taking an extra step (either b/c of platform or noise or algorithm changes). It can be good to crank up opt crit to verytight and all the e/d/r_convergences to 10 to gather the ref values. Then, at least when the test fails, runtime opt convergence and compare_values atol are rationally adjustable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got the reference energies exactly as you described: e_convergence : 10, d_convergence : 8, g_convergence : GAU_TIGHT. I wouldn't have any problem with relaxing the comparison criteria, since I imagine any future error that would cause a test to fail will do so in a major way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds workable. It can be good to get the ref nre from a verytight opt convergence, then run at plain tight opt convergence so that an extra opt step moves closer to ref value rather than away from it. But existing will be fine for now.


# run gradient at optimized geometry to get a wfn
if return_wfn:
g, wfn = gradient(name, return_wfn=True, **kwargs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got nervous about different P::e.globals or other settings depending on whether return_wfn = T/F. Even if it's fine now, may change btwn psithon/psiapi, if qcng is involved whether extras['psiapi'] = T/F, and with ddd or not. I don't particularly want to waste cycles by always running the extra gradient calc, but may be worth it to avert routing bugs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you saying we should run that gradient calculation regardless of the value of return_wfn? I think that would be a pretty reasonable decision.

I wouldn't be too worried about the extra cycle. In my limited experience, GeomeTRIC usually requires many more optimization cycles than Optking, so if a user has made the decision to use GeomeTRIC, a single additional cycle would not be that much of a problem.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point on number of iterations. In that case, yes, go for final gradient() call under all circumstances.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even better, I did away with the final gradient call. I can stick the wfn to the GeomeTRIC Engine object during the optimization, so there's no need to make any additional calls after the optimization loop.

pytest.param({'name': 'hf', 'options': {'scf_type': 'pk'}, 'ref_ene' : -76.02082389228, 'ref_nuc': 9.26528625744628}, id='rhf(pk)'),
pytest.param({'name': 'mp2', 'options': {'mp2_type': 'df'}, 'ref_ene' : -76.22711819393223, 'ref_nuc': 9.09137805747361}, id='mp2(df)'),
pytest.param({'name': 'mp2', 'options': {'mp2_type': 'conv'}, 'ref_ene' : -76.2271678506303, 'ref_nuc': 9.091178486990861}, id='mp2(conv)'),
pytest.param({'name': 'b3lyp', 'options': {'scf_type': 'df'}, 'ref_ene' : -76.41632755714534, 'ref_nuc': 9.04535641436914}, id='b3lyp'),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds workable. It can be good to get the ref nre from a verytight opt convergence, then run at plain tight opt convergence so that an extra opt step moves closer to ref value rather than away from it. But existing will be fine for now.

Copy link
Contributor

@PeterKraus PeterKraus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking the suggestions aboard. LGTM.

@zachglick zachglick requested a review from dgasmith April 10, 2020 19:23
Copy link
Member

@dgasmith dgasmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI @leeping
Thanks, this looks great and will help a lot of requests for constrained op.

:py:func:`~psi4.optimize`. The optimization will respect the keywords |optking__g_convergence|
and |optking__geom_maxiter|. Any other GeomeTRIC-specific options (including constraints)
may be specified with the ``geometric_opts`` argument to :py:func:`~psi4.optimize`.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need a few constrained optimization examples here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

from qcelemental.util import which_import

if not which_import('geometric', return_bool=True):
raise ModuleNotFoundError('Python module geometric not found. Solve by installing it: `conda install -c conda-forge geometric` or `pip install geometric`')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent! This kind of error message really helps

psi4/driver/driver.py Outdated Show resolved Hide resolved
@leeping
Copy link
Contributor

leeping commented Apr 17, 2020

Hi there, thanks a lot for including me on this. :)

Transition state optimization is implemented, but we have not tested it extensively against other codes. It does work quite well for in-house applications containing 50+ atoms.

I'm very interested to see how you run these optimizations directly in Psi4. It should be a lot more efficient than calling Psi4 repeatedly on the command line.

Also happy to provide examples of constrained optimization. Let me know if you need any.

@susilehtola
Copy link
Member

Is there a performance benefit over running the program in the command line? Nuclear forces / hessians i/o is inconsequential compared to the quantum chemistry part. Any savings would come from reusing checkpoint information for the Fock / density matrices... right?

@leeping
Copy link
Contributor

leeping commented Apr 17, 2020

I'm just referring to the overhead associated with setup & teardown of the calculation. It could slow things down if the calculations are fast (which might be the case with semiempirical methods or minimal basis sets).

@dgasmith
Copy link
Member

It can a bit, Psi's startup time is ~0.4 seconds or so with all of Python loading in. In general QC will dwarf this time so it isn't much of an issue. Your right though with XTB and DFTB it gets more interesting.

@loriab loriab merged commit 0ec2032 into psi4:master Apr 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants