Update Param Tethering #43
Merged
ericchansen merged 1 commit into ericchansen:param_teth on Jun 15, 2017
This addresses the problem highlighted in PO's email below. I simply moved the parameter tethering section of collect_data to right after the collection from a reference file. I'm sure we have had this discussion in the past, but instead of having two giant lists (or arrays) of data, it would be nice to have blocks of data (structures, or whatever) where each object consists of the reference and calculated values, the label, and the weight.
From Per-Ola's email:
"""
Hi,
We’re having GitHub problems here, can’t upload changes (firewall, don’t know if it will be resolved). We needed to implement tethering, which fails with the current code. The problem is that the reference data ( -r ) gets written first, but the parameter tethering data ( -mp ) gets written last, so you get a mismatch in compare. I just moved 18 lines in “calculate.py”, the MacroModel parameter tethering, earlier into the file (just after the general data section). Now it works.
Could one of you make this fix “officially”? Xin will need this soon.
We did find that in at least one case, tethering seems to be the only way to keep the angles in check. The same problem I’ve discussed before, if you look into a system where angles strain against each other (just do H-C-H in methane as an example), increasing all angle reference values will give an equally good structure fit, but will make a highly strained and unstable force field. I think we should do this generally, everywhere, tether the angle reference values to the average of the observed.
Cheers,
Per-Ola
"""
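The mismatch described in the email arises when two flat lists are compared by position but were written in different orders. Here is a minimal sketch of that failure and of the "blocks of data" idea suggested above. The `Datum` record, the helper names, the labels, the values, and the weighted sum-of-squares score are all assumptions for illustration; none of them are taken from calculate.py or compare.py.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Entry = Tuple[str, float]  # (label, value); hypothetical layout


@dataclass
class Datum:
    """Hypothetical record holding everything needed for one data point."""
    label: str
    weight: float
    ref: Optional[float] = None
    calc: Optional[float] = None


def misaligned_labels(ref: List[Entry], calc: List[Entry]) -> List[Tuple[str, str]]:
    """Return label pairs that a position-based comparison would wrongly match."""
    return [(r, c) for (r, _), (c, _) in zip(ref, calc) if r != c]


def pair_by_label(ref: List[Entry], calc: List[Entry],
                  weights: Dict[str, float]) -> List[Datum]:
    """Build one Datum per label so the write order no longer matters."""
    data: Dict[str, Datum] = {}
    for label, value in ref:
        data[label] = Datum(label, weights.get(label, 1.0), ref=value)
    for label, value in calc:
        datum = data.setdefault(label, Datum(label, weights.get(label, 1.0)))
        datum.calc = value
    return [data[label] for label in sorted(data)]


def score(data: List[Datum]) -> float:
    """Assumed objective: sum of squared weighted differences."""
    return sum(
        (d.weight * (d.ref - d.calc)) ** 2
        for d in data
        if d.ref is not None and d.calc is not None
    )


# Made-up values: both lists hold the same entries, but the tethering
# entry sits at a different position in each list.
ref = [("param_a1", 109.5), ("bond_1", 1.090)]
calc = [("bond_1", 1.095), ("param_a1", 112.0)]
weights = {"bond_1": 100.0, "param_a1": 1.0}

print(misaligned_labels(ref, calc))  # non-empty: positions do not line up
print(score(pair_by_label(ref, calc, weights)))
```

Moving the tethering block earlier, as this pull request does, fixes the ordering directly. Pairing by label would make the comparison independent of ordering altogether.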
ericchansen pushed a commit that referenced this pull request on Jan 14, 2019
* Substructure Name Selection
  Since the argparse argument was already present but unused, I added logic to gather data based on the user's preference, with the default being 'OPT'. This is useful if you want to gather data from a particular substructure that you are not parameterizing.
* New compare-Merge (#23)
* Update loop and compare
  Instead of dealing with full arrays of reference and calculated data, this makes a dictionary of data types for the reference data and for the calculated data. The code also had two separate functions for printing data and comparing data, which are now consolidated into one function with the option of printing data if necessary. I think there needs to be some additional support for reference data and parameter tethering data.
* Modify modules to work with new compare.py
  Compare.py was updated to return the total score and not the last calculated score of an individual point. gradient.py and simplex.py were updated to use the new function in compare.py.
* Fixed Label Bug.
  Under some circumstances, two data points that should be the same interaction were being removed in trim_data(). This was because the labels were different: the structure index was different. This is probably due to compressing the number of MacroModel calculations for a single structure, resulting in a single output file with all of the structures within that file. Using a regex, I was able to match labels based only on the filename and the atoms involved in the dihedral. This should fix the problem. Gradient.py has some problems that I still need to address when handling the new data dictionaries. For now I am just using do_newton(), since it does not need to go through the data like the others do.
* Small fixes
* Torsion Label REGEX modified.
  I had to use '\S' to capture all non-whitespace characters that may be included in the filename. Additionally, if a file has more than 9 structures the regex wouldn't match, so instead of '\d' I used '\d+'.
* Updated Logic for new_compare
  Several things had to be modified to use the dictionaries that were set up in previous commits. The par_diff file is now written in the same order as the compare files. The other modifications just involved reading data from the dictionaries rather than from an array of data.
* MacroModel MMO data gets sorted
  This shouldn't really be too big of a problem, but I think the function I added, along with what is present in the Tinker class, should be a function within the structure class. Structures with different filenames but the same structure do not print interactions to the *.mmo file fully sorted. It seems that they are only sorted based on the index of one atom (bonds: first atom; angles: central atom; torsions: second atom). I added logic that sorts the bonds, angles, and torsions of a structure into a completely sorted list that is the same every time.
* Bug when using gradient
  Gradient was recalculating the score of the original force field for all of the differentiation steps and thus reported 1st and 2nd derivatives of 0.0. The appropriate logic had to be added to convert the newly calculated data into the datatype dictionary.
* Logging and Sort Torsions
  Logging was added to report all the data that was removed. Torsions, I guess, are sorted in the *.mmo file based on the lowest terminal atom. I gave them logic similar to that for angles, sorting on the internal atoms first.
* Sorting Torsions
  Read previous commit.
* Bug Fixes for previous commit
* Additional Bugs in Gradient
  The parameter differentiation file was not being written correctly with respect to the differentiation force fields. Just added logic to write the trimmed and sorted data correctly.
* Label to Attr for param tethering in reference file.
* Added support to write com file to different directory.
  User can now specify the directory that the Gaussian *.com file should be written in.
* Changed the link1 setting for FZTS
  Instead of having a link1 with an optimization without frozen coordinates, a link1 job running a frequency calculation is done. This way, if the previous frozen optimization does not converge, a frequency calculation will still be done so the user can visualize the vibrations.
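Two of the fixes in the commit message above can be sketched as follows: matching torsion labels regardless of structure index, and sorting torsions on their internal atoms first. This is only an illustration under stated assumptions. The label layout `t_<filename>-<structure index>_<a>-<b>-<c>-<d>` is hypothetical, and the function names are invented here; the actual labels and helpers in the repository may differ.

```python
import re

# Hypothetical label layout, assumed for illustration only:
#   t_<filename>-<structure index>_<a>-<b>-<c>-<d>
# \S+ accepts any non-whitespace characters in the filename, and \d+
# accepts structure indices with more than one digit.
TORSION_LABEL = re.compile(r"^t_(\S+)-(\d+)_(\d+(?:-\d+){3})$")


def torsion_key(label):
    """Return (filename, atoms), ignoring the structure index, or None."""
    match = TORSION_LABEL.match(label)
    if match is None:
        return None
    filename, _structure_index, atoms = match.groups()
    return filename, atoms


def canonical_torsion(atoms):
    """Orient a torsion so a-b-c-d and d-c-b-a map to the same tuple."""
    a, b, c, d = atoms
    forward_key = (b, c, a, d)   # internal atoms first
    reverse_key = (c, b, d, a)   # same key for the reversed torsion
    return (a, b, c, d) if forward_key <= reverse_key else (d, c, b, a)


def sort_torsions(torsions):
    """Sort on the internal atoms first, then on the terminal atoms."""
    oriented = [canonical_torsion(t) for t in torsions]
    return sorted(oriented, key=lambda t: (t[1], t[2], t[0], t[3]))


# Two labels that differ only in structure index share one key.
print(torsion_key("t_my-file.mae-3_1-2-3-4"))
print(torsion_key("t_my-file.mae-12_1-2-3-4"))
print(sort_torsions([(4, 3, 2, 1), (7, 2, 1, 5)]))
```

With a single `\d` in place of `\d+`, the second label (structure index 12) would not match under this assumed layout, which is the failure the regex commit describes.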