Solidify API & Variable Names #41

Closed
FarnazH opened this issue Jan 25, 2019 · 8 comments · Fixed by #91
Labels
API breaking Should be done first to stabilize API

Comments

@FarnazH
Member

FarnazH commented Jan 25, 2019

We should rename the vague and overly long variable names. See issue #1.

@tovrstra
Member

tovrstra commented Jan 28, 2019

I'll make a list of additional name/API changes w.r.t #1:

Name changes

  • numbers -> atnums (atomic numbers)
  • pseudo_numbers -> atcorenums (effective core charges, zero for ghost atoms)
  • coordinates -> atcoords (atomic Cartesian coordinates)
  • polar -> polarizability_tensor, shape (3, 3) (dipolar polarizability matrix)
  • cell or rvecs -> cellvecs (matrix whose rows are cell vectors, shape (nvec, 3))
  • ms2=1+abs(nalpha-nbeta) will be replaced by spinpol=abs(nalpha-nbeta) (spin polarization)
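
To make the difference in the last item concrete, here is a minimal sketch. The function names are hypothetical helpers for illustration only, not part of the IOData API.

```python
def spinpol(nalpha: float, nbeta: float) -> float:
    """Spin polarization: absolute difference of alpha and beta electron counts."""
    return abs(nalpha - nbeta)


def ms2_legacy(nalpha: float, nbeta: float) -> float:
    """The old quantity from the list above: 1 + abs(nalpha - nbeta)."""
    return 1 + abs(nalpha - nbeta)


# Hypothetical example: 5 alpha and 4 beta electrons (one unpaired electron).
print(spinpol(5, 4))     # 1
print(ms2_legacy(5, 4))  # 2
```

The two quantities differ only by a constant offset of one, so converting existing data would be trivial.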

Behavior changes

  • Include ghost atoms in the at* attributes. Elements corresponding to ghost atoms in atcorenums must be zero.

Consolidation of ever-growing number of attributes

  • mulliken_charges, npa_charges, esp_charges -> atcharges is a dictionary where the key is a string describing the type of atomic charge and the value is a corresponding array of atomic charges. Keys should be specific and somewhat self-explaining, e.g. 'mulliken', 'natural', 'lowdin', 'hly', 'msk', etc.
  • one_mo, kin, na, olp -> one_ints is a dictionary where the key is any of 'core', 'kin', 'na', 'olp', followed by a suffix '_ao' or '_mo'. With this change the class attribute IOData.two_index_names can also be removed.
  • two_mo, er -> two_ints is a dictionary where the key is any of 'er', 'two', followed by a suffix '_ao' or '_mo'
  • dm_* -> one_rdms is a dictionary with various types of 1-particle reduced density matrices. Standard keys are 'scf', 'post_scf', 'scf_spin', 'post_scf_spin'. These are always stored in the AO basis.
  • extra, an attribute in which everything is stored for which IOData does not define standard names. For now gaussian_command,
  • dipole_moment, quadrupole_moments -> moments dictionary with (angmom, kind) keys or something similar (to be decided).
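
The key conventions proposed in this list could be checked along these lines. This is only a sketch: the constants and the check_keys helper are hypothetical, and the numerical values are placeholders rather than real data.

```python
import itertools

# Allowed keys, built from the names and suffixes proposed in the list above.
ONE_INTS_KEYS = {
    f"{name}_{basis}"
    for name, basis in itertools.product(("core", "kin", "na", "olp"), ("ao", "mo"))
}
TWO_INTS_KEYS = {
    f"{name}_{basis}"
    for name, basis in itertools.product(("er", "two"), ("ao", "mo"))
}


def check_keys(mapping, allowed, label):
    """Raise KeyError if the mapping contains keys outside the allowed set."""
    unknown = sorted(set(mapping) - allowed)
    if unknown:
        raise KeyError(f"Unknown {label} keys: {unknown}")


# Placeholder 1x1 matrices, just to show the dictionary layout.
one_ints = {"kin_ao": [[0.5]], "olp_ao": [[1.0]]}
check_keys(one_ints, ONE_INTS_KEYS, "one_ints")

# atcharges: one array of atomic charges per charge scheme (placeholder values).
atcharges = {"mulliken": [-0.6, 0.3, 0.3], "hly": [-0.8, 0.4, 0.4]}
print(sorted(atcharges))  # ['hly', 'mulliken']
```

With dictionaries like these, adding a new charge scheme only needs a new key, not a new attribute.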

Things to be removed because they are not needed at this stage or in the near future

New attributes for IOData for generating QC & MM input files, keeping QM/MM in mind

  • run_type: 'energy', 'opt', 'freq', ...
  • obasis_name, e.g. '6-31g', or 'sto-3g'
  • lot, the level of theory.
  • extcharges, array with values of external charges, with shape (nextcharge, 4). First three columns for Cartesian X, Y and Z coordinates, last column for the actual charge.
  • charge, the net charge (should not be a new attribute, but something that is coupled to nelec).
  • bonds, an integer array with shape (nbond, 3). Each row represents one bond. One row consists of three integers: first atom index (starting from zero), second atom index & an optional bond type (0: not known, 1: single, 2: double, 3: triple, 4: conjugated)
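
One way to couple charge to nelec, as suggested in the list above, is a property. This is a sketch under the assumption that the net charge equals the sum of the effective core charges minus the number of electrons; the Molecule class is a hypothetical stand-in, not the actual IOData class.

```python
class Molecule:
    """Hypothetical stand-in for IOData, only to illustrate the coupling."""

    def __init__(self, atcorenums, nelec):
        self.atcorenums = list(atcorenums)
        self.nelec = nelec

    @property
    def charge(self):
        # Assumption: net charge = total core charge - number of electrons.
        return sum(self.atcorenums) - self.nelec

    @charge.setter
    def charge(self, value):
        # Setting the charge updates nelec, so the two can never disagree.
        self.nelec = sum(self.atcorenums) - value


water = Molecule(atcorenums=[8.0, 1.0, 1.0], nelec=10.0)
print(water.charge)  # 0.0
water.charge = 1
print(water.nelec)   # 9.0
```

Because charge is derived rather than stored, there is no second value that could drift out of sync with nelec.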

New attributes for IOData for reading useful properties out of QM or MM calculations, and other popular file formats

  • atforces, forces in Cartesian coordinates, shape (natom, 3)
  • athessian, hessian in Cartesian coordinates, shape (3*natom, 3*natom)
  • atmasses, vector with atomic masses
  • atfrozen, boolean flags, True for atoms whose positions were not optimized, or for which no Hessian matrix elements were computed. (Does not affect the size of the Hessian.)
  • g_rot, the rotational symmetry number of the molecule, a.k.a. the degeneracy of the rotational partition function.
  • two_rdms is a dictionary of two-particle reduced density matrices. Standard keys are 'post_scf' and 'post_scf_spin', plus a suffix '_mo' or '_ao' to specify the basis in which these matrices are stored.
  • atffparams is a dictionary with arrays of atomic force field parameters (typically non-bonded). Keys include 'charges', 'vdw_radii', 'sigmas', 'epsilons', 'alphas' (atomic polarizabilities), 'c6s', 'c8s', 'c10s', 'buck_as', 'buck_bs', 'lj_as', 'core_charges', 'valence_charges', 'valence_widths', ... Not all of them have to be present, depending on the use case.
  • basisdef is a basis set definition. This can be a dictionary whose keys are element symbols, atomic numbers (stored as str, to distinguish them from atom indices) or atom indices (int, referring to a specific atom in a molecule). With this attribute we can easily load and dump basis set definitions from EMSL and derive molecular basis sets from them (see #42, "Intuitive representation of molecular basis set").
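
A lookup over the three kinds of basisdef keys could look like the sketch below. The precedence (atom index first, then atomic number as str, then element symbol) is an assumption made for this example, the lookup_basis helper is hypothetical, and basis set names stand in for full basis set definitions.

```python
def lookup_basis(basisdef, index, atnum, symbol):
    """Return the first match: atom index (int), atomic number (str), then symbol."""
    for key in (index, str(atnum), symbol):
        if key in basisdef:
            return basisdef[key]
    raise KeyError(f"No basis defined for atom {index} ({symbol})")


# Placeholder definitions: values are just basis set names here.
basisdef = {"H": "sto-3g", "8": "6-31g", 2: "6-31g*"}

# Water, with atoms ordered O, H, H.
print(lookup_basis(basisdef, 0, 8, "O"))  # 6-31g   (matched by atomic number "8")
print(lookup_basis(basisdef, 1, 1, "H"))  # sto-3g  (matched by symbol)
print(lookup_basis(basisdef, 2, 1, "H"))  # 6-31g*  (matched by atom index 2)
```

Keeping atomic numbers as str is what makes the key "8" (oxygen) distinguishable from the key 8 (the ninth atom).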

Other API changes

@tovrstra
Member

I've edited the plan to include an extra attribute, in which everything must be stored for which we do not have standard names. This makes it possible to add more attributes to the IOData class in the future without breaking any code that uses IOData. All allowed attributes should be put in __slots__, either manually or with the attr package. See https://docs.python.org/3/reference/datamodel.html#slots
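
A minimal sketch of the manual __slots__ variant is given below. The Data class is a hypothetical, heavily reduced stand-in for IOData, extra is assumed to be a dictionary, and the gaussian_command value is a placeholder.

```python
class Data:
    """Hypothetical, reduced stand-in for IOData with a fixed set of attributes."""

    __slots__ = ("atnums", "atcoords", "extra")

    def __init__(self, atnums=None, atcoords=None, extra=None):
        self.atnums = atnums
        self.atcoords = atcoords
        # Everything without a standard name goes in here.
        self.extra = {} if extra is None else extra


data = Data(atnums=[1, 1])
data.extra["gaussian_command"] = "#p hf/sto-3g"  # placeholder value

# Names outside __slots__ are rejected, so typos and outdated names fail early.
try:
    data.numbers = [1, 1]
except AttributeError as error:
    print("rejected:", error)
```

Since user code can only reach non-standard data through extra, promoting a name to a standard attribute later does not clash with anything users set themselves.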

@tovrstra
Member

I've merged extcoords and extcharges into one array. One is never used without the other, so putting them into one array would save us from checking the consistency of the sizes of these arrays.
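
With a single array, the two pieces can still be recovered by slicing. A short NumPy sketch with made-up values (the local names extcoords and extvalues are only for illustration):

```python
import numpy as np

# Two hypothetical external point charges: columns are x, y, z, charge.
extcharges = np.array([
    [0.0, 0.0, 1.5, -0.8],
    [0.0, 0.0, -1.5, 0.4],
])

# One shape check replaces the consistency check between two separate arrays.
assert extcharges.ndim == 2 and extcharges.shape[1] == 4

extcoords = extcharges[:, :3]  # Cartesian coordinates, shape (nextcharge, 3)
extvalues = extcharges[:, 3]   # charges, shape (nextcharge,)
print(extcoords.shape)  # (2, 3)
print(extvalues.shape)  # (2,)
```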

I've also removed (some time ago) the atghost attribute. Ghost atoms can be identified with atcorenums == 0.0.
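
For example, with NumPy arrays (made-up values: a water molecule plus one helium ghost atom):

```python
import numpy as np

atnums = np.array([8, 1, 1, 2])
# Effective core charges; the ghost atom keeps its atomic number but has zero charge.
atcorenums = np.array([8.0, 1.0, 1.0, 0.0])

ghost_mask = atcorenums == 0.0
print(ghost_mask.tolist())                  # [False, False, False, True]
print(np.flatnonzero(ghost_mask).tolist())  # [3]
print(atnums[~ghost_mask].tolist())         # [8, 1, 1]
```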

@tovrstra tovrstra added the API breaking Should be done first to stabilize API label Apr 19, 2019
@tovrstra
Member

tovrstra commented May 8, 2019

@FarnazH I'll take care of this one soon.

@tovrstra
Member

tovrstra commented May 9, 2019

Small change of plan after a brief chat with @PaulWAyers: spinpol=abs(nalpha-nbeta) will replace spinmult=1+abs(nalpha-nbeta) because spin multiplicity is in several cases not a very accurate name.

@tovrstra
Member

Small change of plan. one_ints and two_ints instead of one_body_integrals and two_body_integrals, similar to one_rdms and two_rdms.

@tovrstra
Member

Small changes: added lot and moments.

@tovrstra
Member

I've renamed runtype to run_type in the second comment.


2 participants