Output scaling overall merging statistics in the xia2 style. #1312

jbeilstenedmands · 2020-06-25T19:53:52Z

The end of the scaling output now looks something like this. A bonus is that a few additional anomalous quality indicators are now output. The only item in the xia2 output that is not reported here is the Wilson B factor, as this should be calculated after merging and truncating.

            ----------Merging statistics by resolution bin----------           

 d_max  d_min   #obs  #uniq   mult.  %comp       <I>  <I/sI>    r_mrg   r_meas    r_pim   cc1/2   cc_ano
 69.31   3.29  22774   3392    6.71  98.63     532.9    53.1    0.038    0.041    0.016   0.999*   0.353*
  3.29   2.61  22323   3319    6.73  97.53     193.6    36.0    0.050    0.054    0.021   0.998*   0.393*
  2.61   2.28  22557   3259    6.92  96.79      99.3    26.6    0.065    0.071    0.027   0.997*   0.324*
  2.28   2.07  22203   3244    6.84  95.98      67.6    20.3    0.079    0.085    0.032   0.996*   0.281*
  2.07   1.93  21523   3169    6.79  95.25      44.0    15.1    0.099    0.108    0.041   0.992*   0.190*
  1.93   1.81  21673   3188    6.80  94.40      25.1    10.4    0.136    0.147    0.056   0.991*   0.186*
  1.81   1.72  21276   3154    6.75  93.98      15.6     7.3    0.180    0.195    0.074   0.986*   0.162*
  1.72   1.65  20912   3106    6.73  93.13      10.4     5.3    0.231    0.251    0.096   0.978*   0.132*
  1.65   1.58  21666   3123    6.94  92.64       8.4     4.4    0.275    0.297    0.112   0.973*   0.053*
  1.58   1.53  20697   3083    6.71  92.03       6.3     3.5    0.326    0.354    0.135   0.955*   0.106*
  1.53   1.48  20174   3067    6.58  91.61       5.0     2.8    0.382    0.415    0.161   0.939*   0.031
  1.48   1.44  20819   3031    6.87  90.80       3.7     2.2    0.473    0.512    0.194   0.915*   0.034
  1.44   1.40  18344   2987    6.14  90.46       3.2     1.8    0.523    0.571    0.227   0.878*   0.001
  1.40   1.37  13881   2480    5.60  72.83       2.7     1.5    0.584    0.645    0.267   0.812*  -0.001
  1.37   1.33   9578   1749    5.48  52.71       2.4     1.3    0.638    0.704    0.292   0.748*  -0.019
  1.33   1.31   6726   1287    5.23  38.73       2.3     1.2    0.654    0.726    0.306   0.735*   0.007
  1.31   1.28   4850   1001    4.85  29.69       2.0     1.0    0.746    0.833    0.362   0.602*  -0.020
  1.28   1.26   2923    700    4.18  21.03       1.8     0.8    0.854    0.974    0.453   0.403*   0.032
  1.26   1.23   1214    407    2.98  12.37       1.5     0.6    0.897    1.070    0.569   0.282*  -0.156
  1.23   1.21    305    191    1.60   5.64       1.6     0.5    0.741    0.981    0.636   0.490*  -0.408
 69.19   1.21 316418  48937    6.47  72.90      69.3    12.8    0.065    0.070    0.027   0.999*   0.329*


               ----------Summary of merging statistics----------               

                                             Overall    Low     High
High resolution limit                           1.21    3.29    1.21
Low resolution limit                           69.19   69.31    1.23
Completeness                                   72.9    98.6     5.6
Multiplicity                                    6.5     6.7     1.6
I/sigma                                        12.8    53.1     0.5
Rmerge(I)                                     0.065   0.038   0.741
Rmerge(I+/-)                                  0.055   0.031   0.695
Rmeas(I)                                      0.070   0.041   0.981
Rmeas(I+/-)                                   0.065   0.037   0.983
Rpim(I)                                       0.027   0.016   0.636
Rpim(I+/-)                                    0.035   0.019   0.695
CC half                                       0.999   0.999   0.490
Anomalous completeness                         71.6    99.1     1.4
Anomalous multiplicity                          3.3     3.5     1.3
Anomalous correlation                         0.329   0.353  -0.408
Anomalous slope                               0.807
dF/F                                          0.070
dI/s(dI)                                      0.959
Total observations                           316418   22774     305
Total unique                                  48937    3392     191

Writing html report to dials.scale.html
Saving the scaled experiments to scaled.expt
Saving the scaled reflections to scaled.refl
See dials.github.io/dials_scale_user_guide.html for more info on scaling options

graeme-winter · 2020-06-25T19:56:40Z

Well, I obviously approve of the suggestion! 🙂

jbeilstenedmands · 2020-06-26T06:05:00Z

One potential issue is that, before a resolution cutoff is applied, the high-resolution bin will often just be noise, which may be confusing or unsightly. One way around this would be to use the resolutionizer code to suggest a resolution limit and then report the high-resolution bin using that limit. Thoughts, @graeme-winter?

graeme-winter · 2020-06-26T06:14:42Z

Valid concern. I would suggest using the resolutionizer code to determine an outer bin, then showing

overall
overall to chosen limit
inner
outer to chosen limit

unless %USER% has set a limit, in which case use that. By a happy coincidence I think this is exactly what xia2.small_molecule does 🤔
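
The column selection described above could be sketched roughly as follows. This is only an illustration: `summary_columns` and its arguments are hypothetical names, not part of the dials or xia2 API, and the suggested limit is assumed to have been produced elsewhere by the resolutionizer code.

```python
from typing import List, Optional, Tuple


def summary_columns(
    measured_d_min: float,
    suggested_d_min: Optional[float] = None,
    user_d_min: Optional[float] = None,
) -> List[Tuple[str, Optional[float]]]:
    """Return (label, d_min cutoff) pairs for the summary table.

    Hypothetical helper: a user-set limit takes precedence over the
    suggested one; a cutoff of None means "use all measured data".
    """
    chosen = user_d_min if user_d_min is not None else suggested_d_min
    # No limit available, or the limit lies beyond the data: nothing to cut.
    if chosen is None or chosen <= measured_d_min:
        return [("overall", None), ("inner", None), ("outer", None)]
    return [
        ("overall", None),
        ("overall to chosen limit", chosen),
        ("inner", None),
        ("outer to chosen limit", chosen),
    ]
```

With a measured limit of 1.79 and a suggested limit of 2.18, this yields four columns; if the user also supplies 2.0, that value replaces 2.18 in the two "chosen limit" columns.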

jbeilstenedmands · 2020-06-26T09:23:34Z

Example output when resolution limit within measured range:

            ----------Merging statistics by resolution bin----------           

 d_max  d_min   #obs  #uniq   mult.  %comp       <I>  <I/sI>    r_mrg   r_meas    r_pim   cc1/2   cc_ano
 72.29   4.85  35119   2123   16.54  90.96      99.1    35.1    0.125    0.129    0.027   0.998*   0.013
  4.85   3.85  35008   2204   15.88  94.03      83.1    30.5    0.147    0.151    0.033   0.996*   0.027
  3.85   3.36  34497   2206   15.64  95.50      41.3    23.8    0.192    0.197    0.044   0.995*  -0.010
  3.36   3.05  34699   2266   15.31  96.26      20.3    16.0    0.314    0.323    0.071   0.989*   0.029
  3.05   2.83  33927   2202   15.41  95.45      13.9    12.3    0.405    0.417    0.094   0.975*   0.002
  2.83   2.67  34209   2268   15.08  97.38      10.7     9.3    0.526    0.541    0.123   0.963*   0.013
  2.67   2.53  35104   2291   15.32  97.61       8.2     7.3    0.627    0.645    0.145   0.942*  -0.018
  2.53   2.42  32881   2239   14.69  96.22       6.4     5.5    0.762    0.787    0.186   0.384*  -0.008
  2.42   2.33  27903   2214   12.60  94.94       5.1     3.9    0.939    0.974    0.252   0.870*   0.007
  2.33   2.25  20854   2161    9.65  92.47       4.2     2.7    0.966    1.014    0.299   0.599*  -0.021
  2.25   2.18  16347   2062    7.93  88.54       3.6     2.0    1.308    1.394    0.464   0.333*   0.014
  2.18   2.12  12857   2020    6.36  85.99       3.0     1.4    1.097    1.189    0.445   0.006   0.004
  2.12   2.06  10391   1968    5.28  84.07       2.8     1.0    1.156    1.298    0.563   0.007  -0.008
  2.06   2.01   8006   1865    4.29  81.09       2.6     0.7    1.330    1.512    0.695  -0.001   0.016
  2.01   1.97   6184   1847    3.35  78.76       2.7     0.6    1.721    2.070    1.108   0.004   0.040
  1.97   1.92   4482   1673    2.68  71.56       0.4     0.5    1.188    1.414    0.747   0.030   0.005
  1.92   1.89   3037   1454    2.09  62.70       3.3     0.4    1.534    1.968    1.209  -0.004   0.250
  1.89   1.85   1692   1059    1.60  45.61       9.1     0.3    1.663    2.162    1.364   0.047  -1.000
  1.85   1.82    655    528    1.24  22.65       2.8     0.2    1.860    2.583    1.786   0.019   0.000
  1.82   1.79    194    173    1.12   7.49       3.4     0.1   -2.792   -3.948   -2.792   0.072   0.000
 72.23   1.79 388046  36823   10.54  78.99      18.6     9.1    0.296    0.311    0.086   0.583*   0.003


Resolution limit suggested from cc1/2 fit (limit cc1/2=0.3): 2.18

               ----------Summary of merging statistics----------               

                                            Suggested   Low    High  Overall
High resolution limit                           2.18    5.92    2.18    1.79
Low resolution limit                           72.23   72.27    2.22   72.23
Completeness                                   94.5    90.1    87.3    79.0
Multiplicity                                   14.1    16.5     7.5    10.5
I/sigma                                        13.4    41.1     1.9     9.1
Rmerge(I)                                     0.260   0.107   1.452   0.296
Rmerge(I+/-)                                  0.260   0.106   1.450   0.293
Rmeas(I)                                      0.268   0.109   1.555   0.311
Rmeas(I+/-)                                   0.274   0.111   1.620   0.316
Rpim(I)                                       0.063   0.023   0.537   0.086
Rpim(I+/-)                                    0.084   0.032   0.699   0.108
CC half                                       0.899   0.998   0.311   0.583
Anomalous completeness                         62.8    74.4    48.9    44.4
Anomalous multiplicity                          8.4     9.0     4.8     6.7
Anomalous correlation                         0.000   0.008  -0.003   0.003
Anomalous slope                               0.791                   0.688
dF/F                                          0.100                   0.235
dI/s(dI)                                      0.683                   0.492
Total observations                           340548   19123    8309  388046
Total unique                                  24236    1161    1103   36823

Writing html report to dials.scale.html
Saving the scaled experiments to scaled.expt
Saving the scaled reflections to scaled.refl
See dials.github.io/dials_scale_user_guide.html for more info on scaling options

graeme-winter · 2020-06-26T13:44:55Z

Just ran this

                                             Overall    Low     High
High resolution limit                           1.08    2.94    1.08
Low resolution limit                          102.23  102.69    1.10
Completeness                                   88.6   100.0    16.7
Multiplicity                                   20.2    25.0     1.1
I/sigma                                        21.9    63.6     1.7
Rmerge(I)                                     0.088   0.052   0.240
Rmerge(I+/-)                                  0.086   0.052   0.259
Rmeas(I)                                      0.090   0.053   0.333
Rmeas(I+/-)                                   0.090   0.053   0.363
Rpim(I)                                       0.018   0.011   0.229
Rpim(I+/-)                                    0.025   0.014   0.254
CC half                                       0.999   0.999   0.880
Anomalous completeness                         84.7   100.0     1.5
Anomalous multiplicity                         10.8    14.2     1.0
Anomalous correlation                        -0.037  -0.067   0.000
Anomalous slope                               0.956
dF/F                                          0.050
dI/s(dI)                                      0.799
Total observations                          1828128  138425     958
Total unique                                  90451    5533     842

So worked on 1st cut 🙂

graeme-winter · 2020-06-26T14:16:00Z

Change set looks sensible.

I wonder if we should pull the equivalent code out of xia2 & just use this?

Will go look at the change sets in more detail now.

graeme-winter

I have made a few comments, most of them pretty minor, but I think it would be good to look at them before merging this. I am fine with the actual output change, but I wonder whether some housekeeping while you're looking at this code would be a good idea.

algorithms/merging/merge.py

algorithms/scaling/observers.py

graeme-winter · 2020-06-26T14:25:38Z

algorithms/scaling/observers.py

+            max_current_res = merging_stats.bins[-1].d_min
+            cut_merging_statistics_result = None
+            cut_anom_merging_statistics_result = None
+            if r_cc - max_current_res > 0.005:


0.005 because?

Additional: I dislike magic numbers, and also the significance of these depends very much on the value under comparison

We report resolution stats to 2dp, so this is meant to be "if the same to within two decimal places". xia2's magic number here is 0.004 🤷‍♂️ 🙃
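
One way to make the intent behind the threshold explicit is to name it, or to compare the values as they would be printed. A minimal sketch with hypothetical names (`RESOLUTION_TOLERANCE`, `limits_differ`, `limits_differ_as_printed`); it is not the code in observers.py:

```python
# Half of the last reported digit when resolutions are printed to 2 d.p.
RESOLUTION_TOLERANCE = 0.005


def limits_differ(suggested_d_min: float, current_d_min: float) -> bool:
    """True if the suggested limit is at lower resolution than the current
    limit by more than the reporting precision (one-sided check)."""
    return suggested_d_min - current_d_min > RESOLUTION_TOLERANCE


def limits_differ_as_printed(suggested_d_min: float, current_d_min: float) -> bool:
    """Alternative: compare the values exactly as they would be reported."""
    return f"{suggested_d_min:.2f}" != f"{current_d_min:.2f}"
```

Note that the two functions are not equivalent: the first is one-sided, like the condition in the diff, while the second is symmetric and depends on how the values round.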

graeme-winter · 2020-06-26T14:34:13Z

report/analysis.py

+                    for f, k in zip(row_format, row_data)
+                )
+            except TypeError:
+                formatted = "(error)"


🤔 not sure I like this one...
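
One possible alternative to catching `TypeError` around the whole row is to handle missing values cell by cell. The sketch below is an assumption about what such an alternative might look like: `format_row`, the cell width and the blank-cell convention are all hypothetical and not taken from report/analysis.py.

```python
from typing import Optional, Sequence


def format_row(row_format: Sequence[str], row_data: Sequence[Optional[float]]) -> str:
    """Format one table row, leaving a blank cell for missing values.

    Each None is handled explicitly, so a single missing value does not
    turn the entire row into "(error)".
    """
    cells = []
    for fmt, value in zip(row_format, row_data):
        cells.append("" if value is None else fmt.format(value))
    # Right-justify every cell to a fixed (illustrative) width of 7.
    return "   ".join(cell.rjust(7) for cell in cells)
```

This keeps rows such as "Anomalous slope", which only have a value in some columns, readable without masking unrelated formatting errors.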

report/test_analysis.py

graeme-winter · 2020-06-26T14:40:15Z

util/resolutionizer.py

+    else:
+        cc_f = fit(s_s[i:], cc_s[i:], 6)
+
+    logger.debug("rch: fits")


graeme-winter · 2020-06-29T12:07:38Z

algorithms/merging/merge.py

@@ -19,7 +19,7 @@
 from dials.util.export_mtz import MADMergedMTZWriter, MergedMTZWriter
 from dials.report.analysis import (
    make_merging_statistics_summary,
-    make_xia2_style_statistics_summary,
+    table_1_summary,


jbeilstenedmands · 2020-06-29T12:38:38Z

Fixed most issues now and marked them as resolved. Anything left is carried over from existing code, and I think it is fine to stay as is.

Anthchirp · 2020-07-06T08:11:33Z

util/resolutionizer.py

+
+    if cc_half_method == "sigma_tau":
+        cc_s = flex.double(
+            [b.cc_one_half_sigma_tau for b in merging_statistics.bins]


Suggested change

-            [b.cc_one_half_sigma_tau for b in merging_statistics.bins]
+            b.cc_one_half_sigma_tau for b in merging_statistics.bins

flex constructors work with generators, no need to explicitly construct a list and then throw it away. Applies to this line and a couple more times down the file.
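
To illustrate the difference without depending on cctbx, the sketch below uses the standard library's `array.array` as a stand-in for `flex.double`; the `Bin` class is a hypothetical placeholder for the merging statistics bins. That the flex constructors accept a generator is taken from the comment above rather than verified here.

```python
from array import array
from dataclasses import dataclass


@dataclass
class Bin:
    """Hypothetical stand-in for one resolution bin."""

    cc_one_half_sigma_tau: float


bins = [Bin(0.998), Bin(0.942), Bin(0.333)]

# Builds a temporary list that is discarded once the array is filled.
from_list = array("d", [b.cc_one_half_sigma_tau for b in bins])

# Passes a generator: values are consumed one at a time, no temporary list.
from_generator = array("d", (b.cc_one_half_sigma_tau for b in bins))
```

Both calls produce the same array; the generator form simply avoids the intermediate list.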

Anthchirp · 2020-07-06T08:13:20Z

report/analysis.py

@@ -182,28 +181,184 @@ def _batch_bins_and_data(batches, values, function_to_apply):
    return batch_bins, data


-def make_merging_statistics_summary(dataset_statistics):
-    """Format merging statistics information into an output string."""
+formats = collections.OrderedDict(


In CPython 3.6+ (and, as a language guarantee, Python 3.7+) dictionaries preserve insertion order, so there is no longer any need to use OrderedDict.
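
A small sketch of the same idea with a plain dict. The keys are row labels from the summary table earlier in this thread; the format strings are assumptions for illustration, not the contents of the real `formats` mapping.

```python
# Plain dict: insertion order is preserved (guaranteed from Python 3.7).
formats = {
    "High resolution limit": "{:.2f}",
    "Low resolution limit": "{:.2f}",
    "Completeness": "{:.1f}",
    "Multiplicity": "{:.1f}",
}

# Iteration follows the order in which the keys were written above.
row_labels = list(formats)
```

The table rows therefore come out in the order they are declared, with no need for `collections.OrderedDict`.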

codecov · 2020-07-16T09:04:56Z

Codecov Report

Merging #1312 into master will decrease coverage by 0.05%.
The diff coverage is 88.23%.

@@            Coverage Diff             @@
##           master    #1312      +/-   ##
==========================================
- Coverage   64.26%   64.20%   -0.06%     
==========================================
  Files         616      616              
  Lines       69688    69782      +94     
  Branches     9505     9529      +24     
==========================================
+ Hits        44786    44805      +19     
- Misses      23160    23212      +52     
- Partials     1742     1765      +23

Impacted Files	Coverage Δ
algorithms/scaling/algorithm.py	`84.05% <ø> (-0.06%)`	⬇️
algorithms/scaling/observers.py	`92.98% <77.77%> (-1.48%)`	⬇️
report/analysis.py	`95.20% <88.73%> (-4.80%)`	⬇️
algorithms/merging/merge.py	`83.06% <100.00%> (+0.13%)`	⬆️
command_line/scale.py	`90.81% <100.00%> (ø)`
report/test_analysis.py	`100.00% <100.00%> (ø)`
command_line/report.py	`74.36% <0.00%> (-5.84%)`	⬇️
report/plots.py	`89.25% <0.00%> (-1.50%)`	⬇️
algorithms/scaling/test_scale.py	`98.77% <0.00%> (-1.23%)`	⬇️
algorithms/integration/report.py	`86.59% <0.00%> (-0.69%)`	⬇️
... and 3 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d064f02...994d80c. Read the comment docs.

jbeilstenedmands changed the title ~~Output scaling overall merging statistics in the xia2-style.~~ Output scaling overall merging statistics in the xia2 style. Jun 25, 2020

jbeilstenedmands marked this pull request as ready for review June 26, 2020 13:00

jbeilstenedmands requested a review from graeme-winter June 26, 2020 13:01

graeme-winter requested changes Jun 26, 2020

graeme-winter reviewed Jun 29, 2020

jbeilstenedmands requested a review from graeme-winter June 29, 2020 12:37

github-actions bot added the PR: merge conflicts label Jul 3, 2020

Anthchirp reviewed Jul 6, 2020

jbeilstenedmands added 8 commits July 16, 2020 08:55

Output scaling overall merging statistics in the xia2-style.

9589012

i.e. Overall, high, low.

If suggested resolution limit, show merging stats summary for suggest…

e74f12f

…ed limit also.

Add test and newsfragment

aaa0b8f

Tidy observer functions, string outputting

0a7c325

Rearrange code to be usable by xia2

56fbf5f

Fix output for centric space groups

09b7c31

Update newsfragment

57ce80f

Update to use new resolutionizer function.

994d80c

jbeilstenedmands force-pushed the xia2_style_merging_stats branch from 184da70 to 994d80c Compare July 16, 2020 08:02

github-actions bot removed the PR: merge conflicts label Jul 16, 2020

jbeilstenedmands merged commit 01800bf into master Jul 16, 2020

jbeilstenedmands deleted the xia2_style_merging_stats branch July 16, 2020 12:09

rjgildea mentioned this pull request Dec 14, 2020

Layout of merging stats table #1524

Closed

huwjenkins mentioned this pull request May 21, 2022

Layout of dials.scale merging stats table does not respect user resolution cutoff #2117

Closed
