-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Output scaling overall merging statistics in the xia2 style. #1312
Conversation
Well, I obviously approve of the suggestion! 🙂 |
One potential issue is that before applying a resolution cutoff, the high resolution bin will often just be noise, which may be confusing or unsightly. Something we could do to get around this is to use the resolutionizer code to suggest a resolution limit, and then report the high resolution bin using that limit? Thoughts @graeme-winter ? |
Valid concern, would suggest using the resolutionizer code to determine an outer bin then show
unless %USER% has set limit, in which case use that. By a happy coincidence I think this is exactly what xia2.small_molecule does 🤔 |
Example output when resolution limit within measured range:
|
Just run this
So worked on 1st cut 🙂 |
Change set looks sensible. I wonder if we should pull the equivalent code out of xia2 & just use this? Will go look at the change sets in more detail now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have made a few comments, most of them pretty minor but I think it would be good to look at them before merging this. The actual output change I am fine with, but wondering if some housekeeping while you're looking at this stuff would be a good idea?
algorithms/scaling/observers.py
Outdated
max_current_res = merging_stats.bins[-1].d_min | ||
cut_merging_statistics_result = None | ||
cut_anom_merging_statistics_result = None | ||
if r_cc - max_current_res > 0.005: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
0.005
because?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additional: I dislike magic numbers, and also the significance of these depends very much on the value under comparison
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We report resolution stats to 2dp, so this is meant to be "if the same to within two decimal places". xia2's magic number here is 0.004 🤷♂️ 🙃
for f, k in zip(row_format, row_data) | ||
) | ||
except TypeError: | ||
formatted = "(error)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 not sure I like this one...
util/resolutionizer.py
Outdated
else: | ||
cc_f = fit(s_s[i:], cc_s[i:], 6) | ||
|
||
logger.debug("rch: fits") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rch:
?
@@ -19,7 +19,7 @@ | |||
from dials.util.export_mtz import MADMergedMTZWriter, MergedMTZWriter | |||
from dials.report.analysis import ( | |||
make_merging_statistics_summary, | |||
make_xia2_style_statistics_summary, | |||
table_1_summary, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Fixed most issues now and marked as resolved. Anything left are things carried over from existing code and I think are fine to stay as is. |
util/resolutionizer.py
Outdated
|
||
if cc_half_method == "sigma_tau": | ||
cc_s = flex.double( | ||
[b.cc_one_half_sigma_tau for b in merging_statistics.bins] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[b.cc_one_half_sigma_tau for b in merging_statistics.bins] | |
b.cc_one_half_sigma_tau for b in merging_statistics.bins |
flex constructors work with generators, no need to explicitly construct a list and then throw it away. Applies to this line and a couple more times down the file.
@@ -182,28 +181,184 @@ def _batch_bins_and_data(batches, values, function_to_apply): | |||
return batch_bins, data | |||
|
|||
|
|||
def make_merging_statistics_summary(dataset_statistics): | |||
"""Format merging statistics information into an output string.""" | |||
formats = collections.OrderedDict( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CPython 3.6+ (and Python 3.7+) dictionaries are ordered by default. No more need to use OrderedDict
.
i.e. Overall, high, low.
184da70
to
994d80c
Compare
Codecov Report
@@ Coverage Diff @@
## master #1312 +/- ##
==========================================
- Coverage 64.26% 64.20% -0.06%
==========================================
Files 616 616
Lines 69688 69782 +94
Branches 9505 9529 +24
==========================================
+ Hits 44786 44805 +19
- Misses 23160 23212 +52
- Partials 1742 1765 +23
Continue to review full report at Codecov.
|
i.e. The end of scaling output looks something like this. Bonus is that a few additional anomalous quality indicators are now output. Only thing not output that is in xia2 output is the Wilson B factor, as this should be calculated after merging and truncating.