molecule extensions for zmat and efp #52

loriab · 2018-10-18T20:54:12Z

This is a continuation from a split on #44.

Without undermining the agreed-upon Cartesian exchange format for Mols, there are other input formats and other non-QM molecule domains out there. In particular, these can interact with the main Cartesian QM domain.

In Psi4 we've rewritten stuff so that all molecule parsing, basis set attaching, and molecule exchange is in (a close relative of) QCSchema up to the point at which it hits our internal C++ class. That class supports ZMat internal storage, so rather than drop that widespread functionality, we need a way of transporting the ZMat info to the constructor, hence the very generic geom_unsettled and variables fields. Psi4 has no intention of using the ZMat extension as an output format. That is, (in: Cart, ZMat) --> (out: Cart) is and remains the plan. This is possible b/c all progs accept Cart as input.

ZMatrix -- required fields are geom_unsettled and symbols

        "psi4:geom_unsettled": {
            "description": "(nat, ) all-string Cartesian and/or zmat anchor and value contents.",
            "type": "array",
            "items": {
                "type": "string"
            }
        },
        "psi4:variables": {
            "description": "(nvar, 2) pairs of variables (str) and values (float). May be incomplete.",
            "type": "array",
            "items": {
                "type": "array",
                "items": {
                    "type": "string"
                }
            }
        }

import numpy as np

zmat_schema_example = {
    'symbols': np.array(['H', 'O', 'O', 'H']),
    'geom_unsettled': [[], ['1', '0.95'], ['2', '1.40', '1', 'A'], ['3', '0.95', '2', 'A', '1', '120.0']],
    'variables': [['A', 105.0]],
}

mixed_zmat_cartesian_example = {
    'geom_unsettled': [..., ['-2.509000000000', '-0.794637665924', '0.000000000000'], ['1', 'CC', '3', '30', '2', 'A2']],
    ...
}
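
To make the role of the variables field concrete, here is a minimal sketch of substituting known variables into geom_unsettled rows. It is not Psi4's parser: the helper names (`row_kind`, `substitute_variables`) are hypothetical, and the rule that a 3-entry row is Cartesian while 0/2/4/6-entry rows are ZMat is an assumption read off the examples above. Since variables may be incomplete, unknown names are left untouched.

```python
def _is_number(text):
    """Return True if the string parses as a float."""
    try:
        float(text)
    except ValueError:
        return False
    return True


def row_kind(row):
    """Classify a geom_unsettled row by length (assumption drawn from the examples)."""
    n = len(row)
    if n == 3:
        return "cartesian"
    if n in (0, 2, 4, 6):
        return "zmat"
    raise ValueError(f"unexpected geom_unsettled row length {n}: {row!r}")


def substitute_variables(geom_unsettled, variables):
    """Replace variable names with their values where known; leave the rest as-is."""
    lookup = {name: float(value) for name, value in variables}
    resolved = []
    for row in geom_unsettled:
        kind = row_kind(row)
        # Cartesian rows hold three values; ZMat rows alternate anchor, value.
        value_slots = range(3) if kind == "cartesian" else range(1, len(row), 2)
        new_row = list(row)
        for i in value_slots:
            entry = row[i]
            if not _is_number(entry) and entry in lookup:
                new_row[i] = str(lookup[entry])
        resolved.append(new_row)
    return resolved


if __name__ == "__main__":
    geom = [[], ['1', '0.95'], ['2', '1.40', '1', 'A'],
            ['3', '0.95', '2', 'A', '1', '120.0']]
    for row in substitute_variables(geom, [['A', 105.0]]):
        print(row)
```

Everything stays a string after substitution, in keeping with the all-string description of geom_unsettled; conversion to floats would happen only once every variable is settled.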

Programming-wise, effective fragment potentials (EFP) are a useful test case because they both complicate and clarify dictionary-like system exchange between programs. In EFP, the full Cartesian geometry is only available through calls to an EFP library that has the fragment files. Parsing therefore supplies only EFP fragment file names and orientation hints. Most importantly, the EFP and QM domains are coupled (because the orientation needs to be frozen) and are best parsed simultaneously. The output format is pure-xyzabc hints.

EFP -- required fields are fragment_files and geom_hints. I usually require hint_types, too, but that can be read off from the length of the arrays in geom_hints. The open question is whether this should be generic enough for other hint types, perhaps ones with overlapping lengths.

        "psi4:fragment_files": {
            "description": "(nfr, ) lowercased names of efp fragment files (no path info).",
            "type": "array",
            "items": {
                "type": "string"
            }
        },
        "psi4:hint_types": {
            "description": "(nfr, ) type of fragment orientation hint.",
            "type": "array",
            "items": {
                "type": "string",
                "enum": ["xyzabc", "points"]
            }
        },
        "psi4:geom_hints": {
            "description": "(nfr, ) inner lists have length 6 (xyzabc; to orient the center) or 9 (points; to orient the first three atoms) of the EFP fragment.",
            "type": "array",
            "items": {
                "type": "array",
                "items": {
                    "type": "number"
                }
            }
        },

mixed_qm_efp_schema = {
    'qm': {
        'geom': np.array([0., 0., 0.118720, -0.753299, 0.0, -0.474880, 0.753299, 0.0, -0.474880]),
        'symbols': np.array(['O', 'H', 'H']),
        'fix_com': True,
        'fix_orientation': True,
        'fix_symmetry': 'c1',
    },
    'efp': {
        'fragment_files': ['h2o', 'ammonia'],
        'geom_hints': [[-2.12417561, 1.22597097, -0.95332054, -2.902133, -4.5481863, -1.953647],
                       [0.98792, 1.87681, 2.85174, 1.68798, 1.18856, 3.09517, 1.45873, 2.55904, 2.27226]],
        'hint_types': ['xyzabc', 'points'],
        'fix_com': True,
        'fix_orientation': True,
        'fix_symmetry': 'c1',
    }
}
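
As a rough illustration of reading hint_types off the lengths in geom_hints, the sketch below infers the type per fragment and cross-checks it against any declared hint_types. The function names (`infer_hint_types`, `check_efp_block`) are hypothetical and not part of Psi4 or QCSchema; the 6/9 length mapping comes from the geom_hints description above.

```python
# Length of each inner geom_hints list -> hint type, per the schema description.
HINT_LENGTHS = {6: "xyzabc", 9: "points"}


def infer_hint_types(geom_hints):
    """Infer one hint type per fragment from the length of its hint list."""
    types = []
    for i, hint in enumerate(geom_hints):
        if len(hint) not in HINT_LENGTHS:
            raise ValueError(
                f"fragment {i}: hint of length {len(hint)} matches no known type")
        types.append(HINT_LENGTHS[len(hint)])
    return types


def check_efp_block(efp):
    """Check fragment_files/geom_hints/hint_types agree; return the hint types."""
    files = efp["fragment_files"]
    hints = efp["geom_hints"]
    if len(files) != len(hints):
        raise ValueError("fragment_files and geom_hints differ in length")
    inferred = infer_hint_types(hints)
    declared = efp.get("hint_types")
    if declared is not None and list(declared) != inferred:
        raise ValueError(f"declared {list(declared)} but lengths imply {inferred}")
    return inferred


if __name__ == "__main__":
    efp = {
        'fragment_files': ['h2o', 'ammonia'],
        'geom_hints': [[-2.12417561, 1.22597097, -0.95332054,
                        -2.902133, -4.5481863, -1.953647],
                       [0.98792, 1.87681, 2.85174, 1.68798, 1.18856,
                        3.09517, 1.45873, 2.55904, 2.27226]],
    }
    print(check_efp_block(efp))
```

This length-based inference only works while no two hint types share a length, which is exactly the genericity question raised above; an explicit hint_types field avoids that ambiguity.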

We have from_string parsing and validation tech for all three domains (QM Cart, ZMat, EFP) that have been working soundly with these extensions for many months and that others are free to use.
