molecule extensions for zmat and efp #52

Open
loriab opened this issue Oct 18, 2018 · 0 comments

loriab commented Oct 18, 2018

This is a continuation from a split on #44

Without undermining the agreed-upon Cartesian exchange format for Mols, there are other input formats and other non-QM molecule domains out there. In particular, these can interact with the main Cartesian QM domain.

In Psi4 we've rewritten stuff so that all molecule parsing, basis set attaching, and molecule exchange is in (a close relative of) QCSchema up to the point at which it hits our internal C++ class. That class supports ZMat internal storage, so rather than drop that widespread functionality, we need a way of transporting the ZMat info to the constructor, hence the very generic geom_unsettled and variables fields. Psi4 has no intention of using the ZMat extension as an output format. That is, (in: Cart, ZMat) --> (out: Cart) is and remains the plan. This is possible b/c all progs accept Cart as input.

ZMatrix -- required fields are geom_unsettled and symbols

        "psi4:geom_unsettled": {
            "description": "(nat, )  all-string Cartesian and/or zmat anchor and value contents.",
            "type": "array",
            "items": {
                "type": "string"
            }
        },
        "psi4:variables": {
            "description": "(nvar, 2) pairs of variables (str) and values (float). May be incomplete.",
            "type": "array",
            "items": {
                "type": "array",
                "items": {
                    "type": "string"
                }
            }
        }
import numpy as np

zmat_schema_example = {
    'symbols': np.array(['H', 'O', 'O', 'H']),
    'geom_unsettled': [[], ['1', '0.95'], ['2', '1.40', '1', 'A'], ['3', '0.95', '2', 'A', '1', '120.0']],
    'variables': [['A', 105.0]],
}
mixed_zmat_cartesian_example = {
    'geom_unsettled': [..., ['-2.509000000000', '-0.794637665924', '0.000000000000'], ['1', 'CC', '3', '30', '2', 'A2']],
    ...
}
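
To illustrate how the `variables` field relates to `geom_unsettled`, here is a minimal sketch that substitutes known variable values into the string entries. The helper name `resolve_zmat_variables` is hypothetical and is not part of Psi4 or QCSchema. It also assumes that a row of length 3 is Cartesian (all three entries are values) and that any other row alternates anchor and value entries, which matches the examples above but is not stated as a rule in this issue. Unresolved names are left as strings, since `variables` may be incomplete.

```python
def resolve_zmat_variables(geom_unsettled, variables):
    """Return a copy of geom_unsettled with value entries converted to floats.

    Hypothetical helper for illustration only.
    Assumption: rows of length 3 are Cartesian; all other rows alternate
    anchor (even positions) and value (odd positions) entries.
    """
    lookup = {name: float(value) for name, value in variables}
    resolved = []
    for row in geom_unsettled:
        is_cartesian = len(row) == 3
        new_row = []
        for position, token in enumerate(row):
            is_value = is_cartesian or position % 2 == 1
            if not is_value:
                # Anchor entries are passed through untouched.
                new_row.append(token)
                continue
            try:
                new_row.append(float(token))
            except ValueError:
                # Not a number: substitute the variable if it is known,
                # otherwise keep the name so it can be resolved later.
                new_row.append(lookup.get(token, token))
        resolved.append(new_row)
    return resolved


geom_unsettled = [[], ['1', '0.95'], ['2', '1.40', '1', 'A'],
                  ['3', '0.95', '2', 'A', '1', '120.0']]
resolved = resolve_zmat_variables(geom_unsettled, [['A', 105.0]])
```

Leaving unknown names in place, rather than raising, keeps the sketch compatible with the "may be incomplete" note in the `variables` description.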

Programming-wise, effective fragment potentials (EFP) are very useful for both complicating and clarifying dictionary-like system exchange between programs. In EFP, the full Cartesian geometry is only available through calls to an EFP library with fragment files; parsing itself supplies only the EFP fragment files and orientation hints. Most importantly, the EFP and QM domains are connected (because the orientation needs to be frozen) and are best parsed simultaneously. The output format is pure-xyzabc hints.

EFP -- required fields are fragment_files and geom_hints. I usually require hint_types, too, but that can be read off from the length of the arrays in geom_hints. It's just a question of whether this should be generic enough for other, perhaps overlapping-length, hint types.

        "psi4:fragment_files": {
            "description": "(nfr, ) lowercased names of efp fragment files (no path info).",
            "type": "array",
            "items": {
                "type": "string"
            }
        },
        "psi4:hint_types": {
            "description": "(nfr, ) type of fragment orientation hint.",
            "type": "string",
            "enum": ["xyzabc", "points"]
        },
        "psi4:geom_hints": {
            "description": "(nfr, ) inner lists have length 6 (xyzabc; to orient the center) or 9 (points; to orient the first three atoms) of the EFP fragment.",
            "type": "array",
            "items": {
                "type": "array",
                "items": {
                    "type": "number"
                }
            }
        },
mixed_qm_efp_schema = {
    'qm': {
        'geom': np.array([0., 0., 0.118720, -0.753299, 0.0, -0.474880, 0.753299, 0.0, -0.474880]),
        'symbols': np.array(['O', 'H', 'H']),
        'fix_com': True,
        'fix_orientation': True,
        'fix_symmetry': 'c1',
    },
    'efp': {
        'fragment_files': ['h2o', 'ammonia'],
        'geom_hints': [[-2.12417561, 1.22597097, -0.95332054, -2.902133, -4.5481863, -1.953647],
                       [0.98792, 1.87681, 2.85174, 1.68798, 1.18856, 3.09517, 1.45873, 2.55904, 2.27226]],
        'hint_types': ['xyzabc', 'points'],
        'fix_com': True,
        'fix_orientation': True,
        'fix_symmetry': 'c1',
    }
}
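
As a sketch of reading `hint_types` off the lengths of the `geom_hints` arrays, the following maps length 6 to `xyzabc` and length 9 to `points`, as in the field description above. The function name `infer_hint_types` is hypothetical and not taken from Psi4. This only works while no two hint types share a length, which is exactly the open question about overlapping-length hint types.

```python
def infer_hint_types(fragment_files, geom_hints):
    """Infer one hint type per EFP fragment from the length of its hint.

    Hypothetical helper for illustration only. Assumes the two hint types
    described above: 6 numbers for 'xyzabc' and 9 numbers for 'points'.
    """
    length_to_type = {6: 'xyzabc', 9: 'points'}
    if len(fragment_files) != len(geom_hints):
        raise ValueError(
            f"expected one hint per fragment, got {len(fragment_files)} "
            f"fragments and {len(geom_hints)} hints")
    hint_types = []
    for name, hint in zip(fragment_files, geom_hints):
        if len(hint) not in length_to_type:
            raise ValueError(
                f"fragment '{name}': hint of length {len(hint)} "
                "matches no known hint type")
        hint_types.append(length_to_type[len(hint)])
    return hint_types


fragment_files = ['h2o', 'ammonia']
geom_hints = [
    [-2.12417561, 1.22597097, -0.95332054, -2.902133, -4.5481863, -1.953647],
    [0.98792, 1.87681, 2.85174, 1.68798, 1.18856, 3.09517, 1.45873, 2.55904, 2.27226],
]
hint_types = infer_hint_types(fragment_files, geom_hints)
```

If another hint type with 6 or 9 numbers were ever added, this inference would become ambiguous and an explicit `hint_types` field would be needed.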

We have from_string parsing and validation tech for all three domains (QM Cart, ZMat, EFP) that have been working soundly with these extensions for many months and that others are free to use.
