Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attempted fix for cosym reindex bug #2486 #2548

Merged
merged 8 commits into from
Dec 14, 2023

Conversation

jbeilstenedmands
Copy link
Contributor

@jbeilstenedmands jbeilstenedmands commented Nov 6, 2023

I've been giving this some thought, and I believe there is no issue with the fundamental cosym symmetry analysis, rather we are just not ensuring that we are outputting in the standard setting.
In some of the example cases highlighted, where the lattice symmetry is higher than the best cell (within tolerance of lattice_symmetry_max_delta), the cosym procedure can reindex into a valid but non-conventional cell. So we have to then recalculate what is the input-to-best cell transformation after the cosym procedure, in order to apply it correctly and recover the standard setting. I don't believe it is possible to reliably do this when choosing the seeds for the cosym reindexing.

This change correctly processes the two examples - the 5AUI example and Dan's failing-input-single, and all the cosym command line tests pass. Note I have temporarily commented out the recent change that stops cosym being run on single datasets, just for testing purposes.
I plan to add some full tests to this PR to demonstrate the fix and stop regressions. I still need to think about a few of the FIXME's in the change_of_basis_ops_to_best_cell function, but wanted to get this out so people can test and comment.

This has highlighted to me that we probably also want to reduce lattice_symmetry_max_delta (defaut 5). I think we can accurately index to a much higher tolerance, as we only want to catch datasets here which really could be misindexed. I note the default in dials.symmetry is 2, and I recently changed the default down from 2 to 0.5 in xia2.ssx for example because it was catching too many cases that didn’t need accidental-ambiguity assessment.
We could definitely do with better documentation on this confusing/complicated program too.

Fixes #2486

Copy link

codecov bot commented Nov 6, 2023

Codecov Report

Merging #2548 (04d3f32) into main (c12da11) will increase coverage by 0.00%.
Report is 5 commits behind head on main.
The diff coverage is 91.11%.

❗ Current head 04d3f32 differs from pull request most recent head 08f223a. Consider uploading reports for the commit 08f223a to get more accurate results

@@           Coverage Diff           @@
##             main    #2548   +/-   ##
=======================================
  Coverage   78.80%   78.80%           
=======================================
  Files         609      609           
  Lines       74659    74688   +29     
  Branches    10623    10633   +10     
=======================================
+ Hits        58832    58858   +26     
- Misses      13649    13651    +2     
- Partials     2178     2179    +1     

@jbeilstenedmands jbeilstenedmands changed the title Attempted fix for #2486 Attempted fix for cosym reindex bug #2486 Nov 6, 2023
@jbeilstenedmands jbeilstenedmands marked this pull request as ready for review November 7, 2023 11:03
@jbeilstenedmands
Copy link
Contributor Author

I would appreciate @biochem-fan and @dwpaley testing this change on any interesting/tricky datasets that you have. In particular any multi-crystal monoclinic datasets with some distribution in the integrated unit cell values, maybe also increase the lattice_symmetry_max_delta for extra fun.
To confirm the fix on your single-dataset examples, you will need to comment out L470-L473 in command_line/cosym.py.

@biochem-fan
Copy link
Member

Wonderful! I tested this on another C-centered monoclinic dataset (unpublished) that caused problems before.
It seems working fine. Tomorrow I will test more. If I don't report further issues, please consider I approved your fix.

@jbeilstenedmands
Copy link
Contributor Author

@biochem-fan Super, thanks! I have just tested on @dagewa's data from #2530, which crashes due to cosym working in MX space groups (P 1 2/m 1 != P 1 21/m 1), but apart from that seems to work, so working on adding a fix to this PR for non-MX space groups now....

@dwpaley
Copy link
Contributor

dwpaley commented Nov 30, 2023

This looks great.

One comment about the procedure to determine reindex_changes_cell. I believe in the typical use case, this function _apply_reindexing_operators is working on a list of experiments that were indexed in P1, i.e. no symmetry constraints applied. The median of a bunch of non-symmetrized cells is also a non-symmetrized cell of course. That means any non-identity reindexing operator will change the cell. For clarity is it better to just proceed as if reindex_changes_cell is true?

@biochem-fan
Copy link
Member

I believe in the typical use case, this function _apply_reindexing_operators is working on a list of experiments that were indexed in P1, i.e. no symmetry constraints applied.

Not necessarily. I often use cosym to sort out indexing ambiguities in small wedge datasets.

@jbeilstenedmands
Copy link
Contributor Author

@dwpaley you make an astute observation.
At this point in _apply_reindexing_operators, the input_cell can still be symmetrized depending on the input, which can be seen by printing out the input_cell values for some of the test examples in tests/command_line/test_cosym.p::test_synthetic.
"P422", (79, 79, 37, 90, 90, 90): input_cell = 37.0, 79.0, 79.0, 90.0, 90.0, 90.0
"P321", (59.39, 59.39, 28.35, 90, 90, 120): input_cell = 8.35, 59.39, 59.39, 60.0, 90.0, 90.0
"C2", (56.194, 53.224, 32.156, 90.000, 92.277, 90.000): input_cell : 32.156, 38.699, 38.699, 86.89, 88.35, 88.35

The original cells have been mapped to a reduced cell, which is symmetrized. The tests only use one cell value, but if there were a distribution in cell parameters but with symmetry enforced, the first two cases would still have a symmetrized median cell.
For high symmetry cells like P422 and P321, reindex_changes_cell is always False. For P321 and other rhombohedral symmetries, "cb_op_inp_best" is not "a,b,c", so we enter the 'elif' branch on L305.

If the input were indexed in P1 but the true cell is high symmetry.... I need to think a bit more about exactly what is going to happen....

@jbeilstenedmands jbeilstenedmands merged commit 8ce9de9 into main Dec 14, 2023
18 checks passed
@jbeilstenedmands jbeilstenedmands deleted the fix_reindex_to_best_cosym branch December 14, 2023 16:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

dials.cosym finds the right operator but puts it wrongly
4 participants