Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

py-side Hessian analysis #834

Merged
merged 68 commits into from
Apr 13, 2018
Merged

py-side Hessian analysis #834

merged 68 commits into from
Apr 13, 2018

Conversation

loriab
Copy link
Member

@loriab loriab commented Oct 31, 2017

Description

Includes Joe's #772, since the original point of this was to do properly the hacks needed to get the normco into the Wfn. This includes py-side vibrational analysis, thermo, and small-system reordering. Also fixes Hessian symmetrization (py-side; haven't moved it back c-side).

This isn't a final product. I only re-hooked-back-up one conventional freq test case. There's plenty of organizational, naming, storage (want both freq and normco in Wfn? and if so, only vibs, not RT? and how ID imag freq if we're float and don't want -?) choices yet. What this does have is a substantial py test case with 6 Hessian readouts from Cfour and 6 matching p4 findif-by-grad vibrational analyses to help debug some analytic Hessian errors at wwdc.

And watch out – any change to findif.h recompiles most of psi.

In the end, this added a py-side vib analysis, identified two Hessian bugs, upgraded the two Molecule classes, added Molecule serialization everywhere except initial string input parsing, added alignment and BFS tools, and lots more – see below.

Todos

Notable points that this PR has either accomplished or will accomplish.

  • Developer Interest
    • This PR includes Joe's Changes to psi4 for MDT interface. #772
      • adds an optional history object to the optimize output. This object contains energies, gradients, and coordinates for the molecule at each step of the optimization. Hooked up to MDT.
      • adds normalmode displacement export to the api and optional saving of normal modes to the wavefunction.
      • Adds handling of several one-electron properties to the API and saves them to the wavefunction.
    • qcdb.Molecule
      • Fix some bugs, including C3/D3 symmetry detection w/o verbose and printing for Q-Chem when dummy (or ghost?, don't remember which) present
      • Simplify rotational_constants, add rotational_symmetry_number, add axis_representation (probably should double-check this with a spectroscopy book)
      • Optimize instantiation (really, update_geometry) of large systems by a couple orders of magnitude (now ~1 min for 6000 atoms, w/symmetry)
      • Hook up Trent Parker's linear-scaling BFS algorithm to qcdb.Molecule or through just np arrays. Has "seed" argument so you can forcibly split intramolecularly or close-bound intermol.
      • Selected steps (for performance) are now in numpy, as a result, geometry(np_out=True) and xyz(np_out=True) are avail to skip the cast-back-to-list step. Also, to_arrays returns geom, mass, elez, etc. as numpy arrays, rather than needing to iterate over natom.
    • psi4.core.Molecule
      • add rotational_symmetry_number
      • export rotor_type
      • faster BFS algorithm hooked up to psi4.Mol, too. Depending on Mol type, returns numpy split arrays, single large Mol with atoms rearranged and fragmentation embedded, and/or list of indiv Mols, one for each fragment. Radically pare down auto_fragments/old BFS code.
      • units no longer exported as a property. has getter/setter, and the getter returns strings, not Molecule::GeometryUnits objects.
    • Both psi4.core.Molecule && qcdb.Molecule
      • Enhanced add_atom to take label (in add'n to symbol) and mass number, so is a full fledged CoordEntry entry point
      • Add mass_number storage and accessor. In Psi, this is just a pass-through – nothing done with it. Stores isotope mass number if mass corresponds to valid nuclide, -1 otherwise or unknown.
      • Fragmentation member data public access. Rename fragments --> get_fragments; similar for fragment_types, fragment_charges, fragment_multiplicities. Only fisapt code was using this. For setters, added set_fragmentation_pattern that sets them all at once.
      • Added input_units_to_au getter/setter; checks physical reasonableness.
      • Sets input_units_to_au whenever set_units called, rather than as extra step.
      • Finally add a com_fixed fn to mirror orientation_fixed. In qcdb.Mol, also a fix_com.
      • create_molecule_from_string was doing all kinds of contortions with fr_types and efp_chg/mult to the extend that fragment* arrays were not the same length. Function to be retired shortly, so papered over.
      • Psi has long had the problem that set_multiplicity/set_molecular_charge values might not stick upon reinterpret_coordentries b/c recomputed from fragments (which the user can't change) and by high-spin-sum. Adds logic to retain set_mult value if all fragments real (still no guarantee of physical reasonableness wrt frags) b/c otherwise I couldn't do isapt. See validate_and_fill_chgmult for general solution at the boundaries of Mol class, but no good internally to psi4.Mol b/c in python.
      • to_arrays, to_dict, BFS, B787, scramble fns shared btwn psi4 & qcdb Mol classes. When drop py27, can attach qcdb fns directly to psi4 and drop the @static raw_ intermed fns.
    • Molecule serialization
      • New from_arrays constructor that takes minimal (geom & Z) to full geom spec as arrays (homogeneous natom-length, not heterogeneous per-atom) and thoroughly validates them and fills in defaults, emitting a standardized (to become MolSSI JSON) molrec dict. Plain fn produces dict. psi4.core.Molecule.from_arrays and qcdb.Molecule.from_arrays go a step further and return a Mol.
        • Function validate_and_fill_chgmult addresses current problems (1) create_mol_from_string's problem of not allowing overall chgmult to be set, defaulting all frag to 0 1, and defaulting overall to fragA, (2) physical reasonableness (chg/mult/#elec compatible) not being tested for fragments at all and for overall not until the SCF code, (3) problems like chg mult misallocation #114, and (4) overall and frag chgmult getting out of whack upon set_chg/mult, extract_subsets, not having frag chgmult editable, scf singlet/doublet defaulting. Taking given tot/frag chg/mult info and reasoning out the missing info would have been massive logic for chg/mult/#elec simultaneously, so this just codes up the rules and the defaults and throw itertools.product at it. Slows down a bit by 10 frag, but there's room for performance optimization. Ideally we'd call this at every reinterpret_coordentry, but that's not feasible for psi4 Mol. A good newdev starter project would be (for impossible reconciliations) to keep track of the least offending combo, and print out a summary of what rules it's violating.
        • Function reconcile_nucleus addresses problems (1) can't specify atoms by atomic number, (2) can't specify masses by mass number, (3) faced with A, Z, E, real/ghost, and label @C_special@12.1, all as inputs, need to make sure (a) extracting as much user info as possible, (b) not allowing contradictory info, and (c) filling in everything else from periodictable. Like chgmult, fn handles this by defining rules and candidate values, and letting itertools do the rest. Expand atom label spec so can specify isotope and atomic number e.g., Gh(27@58.933) or 2H_deut.
        • Validates atoms-too-close all at once, rather than per-atom, which can run into units, coordsys, and? old CoordEntry trouble.
      • New from_dict fn takes a fully validated and defaulted molrec dict and constructs a Mol. psi4 fn is in export_mints.cc, and qcdb fn is in molecule.py.
      • New to_dict fn serializes Mol into molrec dict. This fn in molecule.py is shared by psi4 & qcdb Mol classes. For the moment (not mandatory), this dict passes again through from_arrays and output is compared to make sure (1) resulting dict is pure and (2) Mol hasn't been tampered with in an unphysical way.
      • from_arrays used internally to qcdb.Molecule for decomposing and reconstituting Mol into np arrays for fragmentation and alignment (potentially reordering)
      • BasisSet molecule handover (happens a lot to construct py basis for c++ mol and hand it back; also happens once per atom for SAD) that used to use create_psi4_string_from_molecule & create_molecule_from_string now uses to_dict & from_dict. So even though all this serialization tech isn't hooked up to molecule {...}, it's still getting exercised a lot.
      • ISAPT calcs may well need to use from_arrays rather than molecule {...} for a bit (until new molecule string parser in) if splitting bonds. See note in tests/isapt1. @bwb314, your SAPT consumer base may need informing about this change when it hits master.
    • Add a py-side vibrational analysis called from driver.frequency() after the internal hessian() call
      • Still sets Wavefunction.frequencies member data with vib-only real frequencies in cm^-1 so tests happy
      • Rest of results is appended to Wfn py-side. See Q below
      • Adds a symmetrize_hessian function
      • Translates print_molden_vibs
      • Easy switching between mass-weighted normalized (q), un-mass-weighted (w), and un-mass-weighted normalized (x) normal coordinates
      • projects translations and rotations from Hessian
      • separate print_vibs(vibinfo) function in which q/w/x normco, number of vibs per row, number of decimal places for freq, number of decimal places for normco, and normco (nat x 3) or (3*nat x 1) is all customizable by option
      • py-side thermo analysis that returns a dict of all the results
      • compare_vibinfos fn so one can compare every aspect of a vib calculation, including normco. has capability to forgive some fields e.g., irrep label on mol with degen modes
      • checks normco for partial findif frequency. @psi-rking fixed bug that made normco wrong for not-totally-symmetric partial findif-by-G frequency. connection?
      • Adds gradient calc to the beginning of every Hessian and stores in wfn. Adds check of whether sys is at stationary point (Grms > 1.e-2) and if so, and doing findif, include rotational dof in SALCs. Adds keyword FD_PROJECT for user to force inclusion/disclusion.
      • If not at stationary point (same crit as above), during vibrational analysis don't project out rotations b/c vib/rot can't properly be separated. this leaves 3 extra normco than user might be expecting.
      • For reference, if my memory serves be correctly, Gaussian prints low freqs of unproj and final normco of proj, Q-Chem prints the final normco of proj, Cfour prints final normco of unproj and prints the freq of proj.
    • Add nifty fn stolen from internet to
      • decompose numpy matrices into little numpy matrices (like the i, jth atom of a Hessian ...) and put it back together again.
      • generate a random, evenly distributed 3d rotation matrix
    • Adds an alignment tool (B787) that returns a displacement vec, rotation mat, and atom map to get best Kabsch overlap (@dsirianni code) between two molecules. Uses Hungarian algorithm+ for atom shuffling. In that case, required hungarian and networkx add'l packages. Handles shuffled atoms up to at least benzene trimer and more if haven't too many symmetry-equivalent atoms. Really good for perfect matches, fine but untuned for imperfect matches. Also provides functions that use that return set to properly manipulate list, vec, mol, grad, hess (hess untested). Upon request, also tests mirror-image alignment.
    • Remove everything beyond constructing the Cartesian Hessian from findif, including VIBRATION class
    • Remove the c-side thermo module
    • Adds extensive vib test on HOOH-TS, CO2, ethene, H2CO, methane that checks freqs and normco of each mol vs. Cfour output for following conditions
      • Cfour Hessian (tests harmonic analysis machinery)
      • Psi freq by grad
      • Psi freq by energy
      • Psi freq by analytic (only methane and ammonia pass) All pass now, thanks to @andysim
  • fix the embarrassing bug (I'll do you a coding favor if you spot it, but I'm not admitting it otherwise)
  • better handle non-stationary. probably need an option to force-rot-space-inclusion but by default, frequency() does a gradient to determine if rotational projection is safe, then another flag to query external field and company to see if even translational projection is appropriate.
  • more testing of isotopic, partial, non-stationary, etc., particularly wrt normco not just freq
  • Add compare_dicts and compare_molrecs comparison functions. Former uses deepdiff module and does recursive comparison (sensitive to types). Can exempt fields with forgive arg. Latter makes use of former and also allows geoms to change if warranted by fix_com/orientation (checks via aligner).
  • convert qcdb test suite from GnuMake (yes, that old) to pytest. switch compare_* fns to raise TestComparisonError, not sys.exit(1).
  • have to temporarily set 0 2 in some test cases, b/c the reconciler wants physically reasonable values, but the input molecule isn't currently passing through the code that would set those defaults.
  • updated v2rdm_casscf pinning to one Eugene already fixed up to work w/psi
  • User-Facing for Release Notes

Questions

  • Right now, results of harmonic analysis is a dict of namedtuples with numpy arrays (contents and interface below). This is pretty convenient to access and work with. But it is numpy arrays attached to Wfn and thus mixing array classes on Wfn and unable to access c-side. Is this ok? Better ideas?
      Returns
      -------
      dict, text
          Returns dictionary of VibrationAspect objects (fields: lbl unit data comment)
          Also returns text suitable for printing
    
      +---------------+--------------------------------------------+-----------+------------------------------------------------------+
      | key           | description (lbl & comment)                | units     | data (real/imaginary modes)                          |
      +===============+============================================+===========+======================================================+
      | omega         | frequency                                  | cm^-1     | np.array(ndof) complex (real/imag)                   |
      | q             | normal mode, normalized mass-weighted      | a0 u^1/2  | np.array(ndof, ndof) float                           |
      | w             | normal mode, un-mass-weighted              | a0        | np.array(ndof, ndof) float                           |
      | x             | normal mode, normalized un-mass-weighted   | a0        | np.array(ndof, ndof) float                           |
      | degeneracy    | degree of degeneracy                       |           | np.array(ndof) int                                   |
      | TRV           | translation/rotation/vibration             |           | np.array(ndof) str 'TR' or 'V' or '-' for partial    |
      | gamma         | irreducible representation                 |           | np.array(ndof) str irrep or None if unclassifiable   |
      | mu            | reduced mass                               | u         | np.array(ndof) float (+/+)                           |
      | k             | force constant                             | mDyne/A   | np.array(ndof) float (+/-)                           |
      | DQ0           | RMS deviation v=0                          | a0 u^1/2  | np.array(ndof) float (+/0)                           |
      | Qtp0          | Turning point v=0                          | a0 u^1/2  | np.array(ndof) float (+/0)                           |
      | Xtp0          | Turning point v=0                          | a0        | np.array(ndof) float (+/0)                           |
      | theta_vib     | char temp                                  | K         | np.array(ndof) float (+/0)                           |
    
      Examples
      --------
      # displacement of first atom in highest energy mode
      >>> vibinfo['x'].data[:, -1].reshape(nat, 3)[0]
    
      # remove translations & rotations
      >>> vibonly = filter_nonvib(vibinfo)
    
  • Should we run a gradient before every freq to judge whether to project rotations? Right now freq on non-stationary structures are going to be way different than from findif because findif stuck with the (in PR analysis labeling) "pre-proj" vibs
  • How to handle natural geometry shift the occurs upon isotopic substitution in geometry but not in reused Hessian? shift/rotate the Hessian? compute in center-of-charge instead?

Good stuff still to do

  • vibs docs
  • warning to isapt folks that frag spec just got tighter wrt mult
  • break up vibanalysis, dft-cation, dft-water-dimer tests. to py?

closes #940

Status

  • Ready for review
  • Ready for merge content-correctness-wise
  • Ready for merge psi-planning-wise

@loriab
Copy link
Member Author

loriab commented Oct 31, 2017

Just found a checklist of stuff that needs to be working by end of this project, so may as well transcribe it somewhere it'll be remembered:

  • xtpl/delta Hess (for Constance)
  • kill off freq in wfn.h
  • scf H w/EFP
  • vib anal working for findif
  • vib anal working for analytic (right now a 10-liner micro analysis by Andy)
  • example of isotope reanalaysis
  • coord sys for normco
  • partial Hess
  • input orientation affect normco orientation

@loriab
Copy link
Member Author

loriab commented Nov 13, 2017

With the newest commits, this still has one embarrassing bug, but all test cases pass, and it's ready for review. This is as far as I intend to dismantle findif.

@loriab
Copy link
Member Author

loriab commented Nov 14, 2017

Ok, I've added an orienting Description to the main PR box.

Requesting review by @psi-rking, particularly the vib.py file and findif changes.

@psi-rking
Copy link
Contributor

I haven't been using 'asarray'. Will start.

Does the rotational_symmetry_number() code account for the change in mass for isotopic substitution somewhere?
It looks to me that it only does if:
full_point_group_with_n()
checks mass not just Z's and geometry. Does it?

FYI, the 'nmwhess' read to me as "new mass-weighted hessian". I realized subsequently it meant 'non'. I've skimmed the changes overall, but haven't scrutinized vib.py itself.

@loriab
Copy link
Member Author

loriab commented Nov 15, 2017

@psi-rking

Yup, asarray() is great, a product of DGAS' numpy interface. And so much handier for printing psi4.core.Matrix to screen, rather than just mat.print_out() to file.

I haven't traced it fully recently, but I expect here is where mass changes get caught. In practical terms, the freq-isotope[12] test cases should cover the rotational_symmetry_number with some care. You're right that mass-symmetry-breaking in Molecule wasn't working at one time, though. I should probably spice up mints5 to check this thoroughly. Or were you concerned particularly with non-Abelian PG involving n? Or something else I've missed?

Hmm, I keep reading it as "nwchem hessian". Maybe time for a name change to just "hessian" with the weightedness in the docstring.

@psi-rking
Copy link
Contributor

Some checks in libmints/molecule.cc do use 'is_equivalent_to', e.g. 'has_inversion'. A possible problem I see is in molecule.cc (assuming this code hasn't changed):

void Molecule::set_full_point_group(double zero_tol) contains
int Cn_z = matrix_3d_rotation_Cn(geom, z_axis, false, zero_tol);
with declaration
int matrix_3d_rotation_Cn(Matrix &coord, Vector3 axis, bool reflect, double TOL, int max_Cn_to_check)
which calls
present = coord.equal_but_for_row_order(rotated_mat, TOL);

but the 'equal_but_for_row_order' is a matrix operation
bool Matrix::equal_but_for_row_order(const Matrix *rhs, double TOL)
which doesn't mind masses.

If there is a problem, it would appear for higher n.

@loriab
Copy link
Member Author

loriab commented Dec 6, 2017

Ok, @psi-rking, I've expanded the qcdb.Molecule mints5 test case to change the mass of various atoms, then check that the symmetry changes accordingly. Haven't done every PG, but did a variety. Assuming the psi4.core.Molecule behaves analogously, does that alleviate your earlier concern?

Any other problems you notice?

@loriab
Copy link
Member Author

loriab commented Jan 31, 2018

Ok, @psi4/maintainers and @psi-rking, this mammoth is ready to review. I've updated the RN at the top, and I recommend reading them over before meeting the code. Nitpick away (esp. at numpy stuff that I was still learning at the start), and feel free to question the broader (esp. vibanal) logic flow.

# At stationary point?
G0 = gradient(lowername, molecule=molecule, **kwargs)
nonzero_gradient = G0.rms() > 1.e-2 # pulled out of a hat
translations_projection_sound = not core.get_option('SCF', 'EXTERN') and \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize it's pulled out of a hat, but it seems a little loose from my (very limited) intuition. I'll do some testing and post back here soon...

nonzero_gradient = G0.rms() > 1.e-2 # pulled out of a hat
translations_projection_sound = not core.get_option('SCF', 'EXTERN') and \
not core.get_option('SCF', 'PERTURB_H')
rotations_projection_sound = translations_projection_sound and not nonzero_gradient
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like this rotations_projection_sound mechanism

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EFP should also disqualify projection, I think

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, EFP – good point

wfn.frequencies().print_out()
core.thermo(wfn, wfn.frequencies())
# Project final frequencies?
translations_projection_sound = not core.get_option('SCF', 'EXTERN') and \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a repeat of line 1313. In the spirit of the DRY principle, it may be a good idea to centralize this logic; being able to add new cases to only one location would be great

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, its awkward b/c (1) the former is in the inner fn hessian and there's no easy place to return the value to the outer frequency fn and (2) they're actually controlling two different projections – in hessian, whether to include rot in SALCs if findif, and in freq, whether to project rot in the vib analysis, whether analytic or findif.

As soon as c-side findif is dead, I agree a unified location (and hat value) is desirable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding your second point, one solution might be to have two helper functions like energy_is_rotationally_invariant and energy_is_translationally_invariant, which could be called as part of the determination of the different *_projection_sound determinations, with appropriate additions to the resulting bool for the findif / hessian projection. Currently both have the same dependence on external potentials, but something like simple periodic boundary conditions would make the energy translationally, but not rotationally, invariant. My fear is that we could introduce something like PBC and forget to add it to the list of conditions in one place or another, potentially creating a mess. I'm sure you've already thought this through, but just wanted to voice my concerns so we keep this stuff in mind through the migration to py-side analysis.

# Project final frequencies?
translations_projection_sound = not core.get_option('SCF', 'EXTERN') and \
not core.get_option('SCF', 'PERTURB_H')
rotations_projection_sound = translations_projection_sound and wfn.gradient().rms() < 1.e-2 # aforementioned hat
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, the 1.0e-2 would be better if it were a variable, so that both locations are updated consistently in the event of us changing tolerance.

@@ -1709,6 +1765,73 @@ def frequency(name, **kwargs):
return core.get_variable('CURRENT ENERGY')


def vibanal_wfn(wfn, hess=None, irrep=None, molecule=None, project_trans=True, project_rot=True):
# TODO should go back to private
import numpy as np
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't be better to import this for the whole file? After all your hard work, I don't think we need math anymore, also.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About to push a cleanup of driver and partial qcdb imports. Numpy now always top-level.

# TODO should go back to private
import numpy as np

if hess is None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great idea! Being able to analyze anything passed in is really cool.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The vibanal_wfn was a minimum distance hookup btwn the driver and the python. There's another vibanal_str in tests/python/vib_analysis that gets at the analysis from another angle. I expect these functions to evolve a lot as we see what's really needed. The isotopes test case shows Hessian reuse (finally!).


if dashlvleff.lower() == 'd2p4':
returnstring = '%12.6f %12.6f %12.6f %12.6f %12.6f %6d\n' % \
(dashcoeff['s6'],
returnstring = dashformatter.format(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so I agree with your assertion that the .format is cleaner than the old % syntax now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think im grumpy and old:

>>> fmt = "%12.6f %12.6f"
>>> fmt % (10, 5)
'   10.000000     5.000000'

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, but mixed args & kwargs spec is so much cleaner with format. I still cling to printf in cpp, but we'll just have to disagree over % vs format in py. I love it. :-)

text += fmt.format('[u A^2]', a='I_A', b='I_B', c='I_C', *rc_moi['u A^2'])
text += fmt.format('[GHz]', a='A', b='B', c='C', *rc_moi['GHz'])
text += fmt.format('[MHz]', a='A', b='B', c='C', *rc_moi['MHz'])
text += fmt.format('[cm^-1]', a='A', b='B', c='C', *rc_moi['cm^-1'])
print(text)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this missing a redirect to the outfile, or deliberately going to stdout?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty much the whole of qcdb outside those pieces that would clutter up a psi4 calc has a serious outfile/stdout problem for which I don't have a clean solution. Yes, it's just heading to stdout.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can always overwrite stdout to psi4 output (I think folks would be unhappy). Or provide a output handle to the functions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd really like a logging type approach where an output resource could be global. But an output handle is probably more realistic. I've got a lot of goodie, text = function_of_worth() going now, which complicates function returns. I may be searching the wrong things, but I haven't seen a good sol'n out there for py modules.


"""
import numpy as np
from matplotlib import pyplot
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing this isn't called by everyday users, but it may still be worth a try / except ImportError for the not entirely ubiquitous matplotlib.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully they won't have do_plot=True w/o considering if they have anything to plot with, but yes, good point, there's several feature that require add'l packages that I've try/excepted so good form to do mpl, too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tripped myself up a time or two by assuming that mpl is installed on a given machine, before discovering it isn't. I'm not aware of any distros that ship it by default so I think a message describing how to pull it in through conda may aid novice users who thought they'd just get a plot automagically. Other stuff that's more standard is fine to leave unguarded IMHO

return passed


def hessian_symmetrize(hess, mol):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yay! Thanks for figuring out how to do this properly, so we can have a ceremonial burning of my horribly failed attempt.

text.append(' pre-proj all modes:' + str(_format_omega(pre_frequency_cm_1, 4)))

# project & solve
mwhess_proj = np.einsum('ij,jk,kl->il', P.T, mwhess, P)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very elegant machinery, and I like the details being provided to the user the way they are

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, careful here without np.einsum optimize machinery this could scale like N^4. Might be better write np.dot(P.T, mwhess).dot℗. This counteracts Andy, but that could actually become a bottleneck!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Back to .dot


imagfreqidx = np.where(vibonly['omega'].data.imag > vibonly['omega'].data.real)[0]
if len(imagfreqidx):
print("Warning: thermodynamics relations excluded imaginary frequencies: {}".format(omega_str[imagfreqidx]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there's exactly 1 imaginary frequency, perhaps we could think about automagically reporting the Wigner tunneling correction.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CDSherrill already has plans of putting someone on the Grimme low-freq rotor correction. Wigner's another good one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am assuming this (Wigner tunnelling or hindered rotor treatments) more or less fell through the cracks? Or do you have anyone working on it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It did make it on to the wish list, so not wholly lost. No one working on it. @rainli323 wanted a project and it might be suitably compact unless you had your eye on it.

except ImportError:
msg = "\n\tPlot not generated; matplotlib is not installed on this machine.\n\n"
print(msg)
core.print_out(msg)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good way to flatten this.

rotations_projection_sound = (translations_projection_sound and
stationary_point)

return translations_projection_sound, rotations_projection_sound
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I like this too. Single place to find this kind of information.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! This is exactly what I was thinking when nagging about a centralized routine before.

Copy link
Member

@dgasmith dgasmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think ill take this in sections. But this is awesome.

@@ -143,6 +143,8 @@ install:
- conda info -a
- conda create -q -n p4env python=$PYTHON_VER numpy cmake ci-psi4-lt psi4-lt-mp gau2grid dftd3 gcp snsmp2 scipy -c psi4/label/dev -c psi4
- source activate p4env
- conda install hungarian networkx -c psi4
- conda install deepdiff -c conda-forge
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we adding this as Psi4 requirements?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. hungarian and networkx are very specialized, and they're only needed if you want to align two molecules and can't guarantee that the atoms map (first atom of molA plays same structural role as first atom of molB, etc.) So those are definitely in-function imports so that run-of-the-mill calcs aren't burdened.

deepdiff you actually will hit by default at present b/c there's a temporary? internal check that the Mol.to_dict are sound. It's easy enough to get, as I added it to conda-forge, but what do you think? Require or disable internal checks and some test cases if not present?

In fact, right now you have to compile hungarian yourself for most platforms. Author would like to shove it into numpy as his is in C and reputedly far faster than the np implementation. But no one's actively working on that. Btw, @dgasmith, do you know how a module that compiles against numpy/arrayobject.h chooses its numpy? My py27 conda pkg has been supplied with a np112 numpy but ends up with np112 in its conda build string but np110 in its setup.py build string. And complains at runtime about being compiled against a diff version. There's NPY_1_7_API_VERSION warnings at compile-time if that's a clue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anything we can pip install I wouldn't worry too much about; however, if we start requiring libraries from conda-forge we enforce conda on everyone. That is a step I am not too comfortable with at the moment. As a note, you can get networkx/deepdiff off pip.

If he wants to get a C hungarian into NumPy, I can help him with that.

Yea, no idea on setup/distutils compilation, I try to stay far away.

zero_ghost_fragments=zero_ghost_fragments,
nonphysical=nonphysical,
mtol=mtol,
verbose=verbose)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, can these take in list as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, all inputs are array-like. You can see it in action in isapt1.

The output molrec dictionary is of fixed type, which is ndarray for atom stuff (geom, mass) and list for fragment stuff (fragment_separators, fragment_charges). I didn't quite understand your slack comment about which is preferable for the to_json operation (not in the scope of this PR).

# replaced by `cls.fn = qcdb.Molecule.fn` and in qcdb/molecule.py
# itself, the raw, staticmethod fns can use their official names again
# and not need the append line at bottom of file. what I do for you,
# py2 ...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be worth considering dropping Py2 before too much longer. EOL for most major projects is end of this year, not the 2020 death clock.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guess we can twitter/GH/forum survey to see if there's major objection to dropping py27. The static/raw workaround wasn't so ugly as I had feared (I thought I'd have to have multiple fns with the same long args list). So while I've no affection for 27, I haven't yet reached the "Shoo!" stage.

>>> psi4.set_memory("1e5 gb") # string w/ exponent
>>> psi4.set_memory("5e5") # string w/o units
>>> psi4.set_memory(2000) # mem too small
>>> psi4.set_memory(-5e5) # negative (and too small)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great cleanup!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of these days I'll have to actually build the docs in real time and see if my docstring cleanups are actually taking affect.

for jat in range(nat):
alhess[iat, jat] = (self.rotation.T).dot(blocked_hess[iat, jat].dot(self.rotation))

alhess = alhess[:, self.atommap]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be able to do alhess[self.atommap[:, None], self.atommap].

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why yes, not that I knew that when I wrote it. Even cleaner, I think I can do alhess[np.ix_(self.atommap, self.atommap)]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works as well :)

basisdict = qcdb.BasisSet.pyconstruct(
mol.create_psi4_string_from_molecule(), key, resolved_target, fitrole, other, return_atomlist=return_atomlist)
basisdict = qcdb.BasisSet.pyconstruct(mol.to_dict(),
key, resolved_target, fitrole, other, return_atomlist=return_atomlist)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I think we can overload the __dict__ attribute so that dict(mol) will work as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would that work on psi4.core.Molecule, too, though?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, since everything is a dict all of the special Python functions like dict, +=, etc just look for a object that has the right attributes. Kinda handy.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still makes me nervous b/c for both Mol classes, the to_dict could be not a full picture to recreate (say, Zmat or xyz with vars). Result of to_dict is a fine, upstanding representation of the Mol and can be reversed with from_dict, but it's not necessarily the same repr of the Mol instance.

@loriab
Copy link
Member Author

loriab commented Apr 11, 2018

Ok, once again rebased and comments addressed and RTG.

@dgasmith dgasmith mentioned this pull request Apr 13, 2018
4 tasks
Copy link
Member

@dgasmith dgasmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, LGTM. Lets do this!

Copy link
Member

@jturney jturney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! I resolved the file conflict. Waiting on tests to pass before merging.

@jturney jturney merged commit c4f476b into psi4:master Apr 13, 2018
@loriab loriab mentioned this pull request Apr 29, 2018
24 tasks
@rainli323
Copy link
Contributor

rainli323 commented Apr 22, 2019 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Normal mode vectors are not accessible via Python API
8 participants