Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make some tools more robust to few reflections #1263

Merged
merged 7 commits into from
May 22, 2020

Conversation

benjaminhwilliams
Copy link
Member

@benjaminhwilliams benjaminhwilliams commented May 14, 2020

Some of the DIALS tools furthest downstream, such as dials.scale, dials.symmetry and dials.merge, rely on receiving a sufficient number of unique reflections to perform their calculations. Existing checks that the input is valid are not explicit and are insufficient to catch all failure modes. Changes here attempt to rationalise some behaviour in cases of very few input reflections, as from a very narrow wedge of a rotation data set.

Details of changes:

  • If dials.algorithms.scaling.ScalingAlgorithm.remove_bad_data results in no remaining data, raise an exception.
  • Fix the intended behaviour of remove_bad_data to correctly handle experiment UID strings.
  • In dials.algorithms.scaling.Ih_table.IhTableBlock.setup_binner, be less prescriptive in the choice of binner parameters in the case of few reflections.
  • If modules.dials.algorithms.merging.merge.merge_and_truncate does not receive enough reflections to calculate Wilson statistics for log output, catch the resulting exception and log an error-level message instead.
  • In the existing check for a similar failure to calculate and log the merging statistics, increase the logging level of the resuting message to error, for consistency.
  • In dials.algorithms.symmetry.symmetry_base._normalise, if an exception is encountered when trying to perform the normalisation, rather than raise a cryptic exception, simply log a warning and default to un-normalised intensities.
  • Correct a typo in modules.dials.command_line.merge.phil_scope.

@benjaminhwilliams benjaminhwilliams self-assigned this May 14, 2020
Copy link
Contributor

@rjgildea rjgildea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I note that this may be one case that would perhaps benefit from splitting up the changes into multiple commits (and not necessarily squash/merging), rather than grouping together several logically unrelated changes in one commit. This would make e.g. reverting the individual changes easier if it later turns out that one of them caused some issues.

algorithms/scaling/Ih_table.py Outdated Show resolved Hide resolved
algorithms/merging/merge.py Outdated Show resolved Hide resolved
algorithms/scaling/algorithm.py Show resolved Hide resolved
algorithms/symmetry/__init__.py Outdated Show resolved Hide resolved
algorithms/merging/merge.py Outdated Show resolved Hide resolved
algorithms/merging/merge.py Outdated Show resolved Hide resolved
algorithms/merging/merge.py Outdated Show resolved Hide resolved
algorithms/symmetry/__init__.py Outdated Show resolved Hide resolved
algorithms/symmetry/__init__.py Outdated Show resolved Hide resolved
@benjaminhwilliams
Copy link
Member Author

@rjgildea, regarding keeping some degree of commit history rather than a full squash-merge, I agree. Once this PR gets approval, I'll force-push a picked/squashed version with different changes kept separate and ask for approval again.

@benjaminhwilliams
Copy link
Member Author

I'm afraid I've given up on the idea of unpicking every place where problems occur and building unit tests for each. Instead I've just built some simple integration tests. Let me know if you'd like me to revisit the unit tests, especially since it might avoid things like the previously undetected bug in dials.algorithms.scaling.algorithm.ScalingAlgorithm.remove_bad_data.

@benjaminhwilliams benjaminhwilliams marked this pull request as ready for review May 19, 2020 10:56
algorithms/scaling/test_scale.py Show resolved Hide resolved
algorithms/scaling/algorithm.py Outdated Show resolved Hide resolved
algorithms/merging/merge.py Outdated Show resolved Hide resolved
algorithms/scaling/scaling_library.py Outdated Show resolved Hide resolved
algorithms/scaling/test_scale.py Outdated Show resolved Hide resolved
@graeme-winter
Copy link
Contributor

Ran through a standard data set with this change set, made no difference to a xia2.small_molecule run - literally the text output is identical modulo directory.

The reflections coming out are identical 🙂

Grey-Area dials :( [few-reflections] $ diff /tmp/few/DataFiles/AUTOMATIC_DEFAULT_scaled_unmerged.sca /tmp/nat/DataFiles/AUTOMATIC_DEFAULT_scaled_unmerged.sca

gives nothing => from that perspective safe

@benjaminhwilliams
Copy link
Member Author

Great, thanks. Would you be happy for me to merge?

@benjaminhwilliams
Copy link
Member Author

In the meantime, I'll work on restructuring the commits to be a little more minimal, then we can break with convention and merge without squashing, as recommended by @rjgildea, since there is value in being able to revert logically distinct components of this PR.

benjaminhwilliams added a commit that referenced this pull request May 21, 2020
Test the behaviour of dials.symmetry, dials.scale, dials.merge and
dials.report for very small numbers of reflections.
Copy link
Member

@Anthchirp Anthchirp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this ready to go?

@benjaminhwilliams
Copy link
Member Author

@rjgildea indicated offline that he wanted to take a look before I merged...

Test the behaviour of dials.symmetry, dials.scale, dials.merge and
dials.report for very small numbers of reflections.
Simply return an null value, which is correctly interpreted downstream
so that Xtriage plots aren't generated.
Do not raise an exception if Wilson stats cannot be calculated for
printing.  Instead, log an error message and continue.

Also correct a typo in the help text of a PHIL parameter.
Otherwise, if step == 0, ma.setup_binner_d_star_sq_step fails with an
AssertionError.
Previously, if the anomalous merging statistics could not be calculated,
the non-anomalous merging statistics would not be calculated either,
even if they could be.  Make the merging statistics calculation more
robust so that not calculating anomalous merging statistics does not
break downstream reporting.
Otherwise, non-integer experiment IDs cause an exception.

Also raise a ValueError if all experiments have been rejected, since we
cannot possibly continue processing no data.
If reflection intensity normalisation cannot be performed (for example
if there are too few reflections), then fall back on using un-normalised
intensities and log a warning to that effect.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants