-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cosym: choose cluster containing the most identity reindexing ops #1514
Conversation
abdde55
to
c4275c8
Compare
Codecov Report
@@ Coverage Diff @@
## master #1514 +/- ##
==========================================
- Coverage 65.62% 65.61% -0.01%
==========================================
Files 614 614
Lines 68965 68958 -7
Branches 9529 9523 -6
==========================================
- Hits 45257 45250 -7
- Misses 21866 21868 +2
+ Partials 1842 1840 -2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Popped a couple of comments inline, mostly asking for small improvements to the documentation of the code as there are some sgtbx
type rabbit holes in this... I assume that the code does what the commit messages say and overall the spirit of the change set makes a lot of sense. I would wonder if this is a case for not squash merging however as the structure of the PR does have bug fix and tidying commits separately. Maybe squash down to 2 commits? So bugfix+news, cleaning+test changes
@@ -175,10 +175,12 @@ def _map_space_group_to_input_cell(intensities, space_group): | |||
sg_best = sg_primitive.change_basis(cb_op_best_primitive.inverse()) | |||
# best_subgroup above is the bravais type, so create thin copy here with the | |||
# user-input space group instead | |||
best_subsym = best_subsym.customized_copy( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this stuff could do with some kind of annotation somewhere... I am certain you have gained some understanding as a side-effect of these changes which should probably be documented inline?i.e. the why of est_subsym.change_basis(cb_op_inp_best.inverse())
etc.
I ask this as someone who may have to debug this one day :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just to obtain the subsym
in the input setting rather than the "best" setting -> cb_op_inp_best.inverse()
@@ -370,28 +372,6 @@ def _analyse_symmetry(self): | |||
) | |||
self.params.cluster.n_clusters = len(cosets.partitions) | |||
|
|||
def _space_group_for_dataset(self, dataset_id, sym_ops): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're not wrong that retaining this adds confusion, as I was trying to work out how things still work with all this stuff removed...
@@ -177,9 +178,17 @@ def run(self): | |||
self.cosym_analysis.run() | |||
|
|||
reindexing_ops = {} | |||
sym_op_counts = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is behaviour well defined in the unlikely event of a tie?
@@ -177,9 +178,17 @@ def run(self): | |||
self.cosym_analysis.run() | |||
|
|||
reindexing_ops = {} | |||
sym_op_counts = { | |||
cluster_id: collections.Counter( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, what is this syntax?
Learning opportunity for YT
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, dictionary creation from iterator...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly re-arranging the code slightly (e.g. a dummy variable to move everything onto a single line) would make it look less magic?
* Deprecate target.get_sym_ops(), use target.sym_ops instead * Remove CosymAnalysis.space_groups - this has been superceded by the pointless-style symmetry analysis, and retaining this likely just causes added complexity or confusion. * Tidy up CosymAnalysis._reindexing_ops_for_dataset method * Call cosym run() directly instead of via procrunner * Set n_clusters param after clustering - n_clusters may have been Auto, in which case we now know how many clusters were found by clustering * Cleaner to use 'if i_cluster in reindexing_ops: break'; add some docstrings to the method. * Minimal test for CosymAnalysis._reindexing_ops_for_dataset()
…exing ops This can be important if the lattice symmetry is only very approximately related by pseudosymmetry to the true symmetry. If all potential reindexing ops are genuine indexing ambiguities, then it doesn't matter which one is chosen, however if not, then choosing the wrong one will distort the true unit cell. In such cases it is likely that the input datasets were already indexed consistently, therefore default to choosing the cluster that contains the most identity reindexing ops.
0ed77e8
to
6159374
Compare
This can be important if the lattice symmetry is only very approximately related by pseudosymmetry to the true symmetry. If all potential reindexing ops are genuine indexing ambiguities, then it doesn't matter which one is chosen, however if not, then choosing the wrong one will distort the true unit cell. In such cases it is likely that the input datasets were already indexed consistently, therefore default to choosing the cluster that contains the most identity reindexing ops.