In profile-fitted integration, don't integrate reflections with a high fraction of invalid pixels. #1640

Merged: 8 commits into main from filter_masked_reflections, Mar 31, 2021

@jbeilstenedmands jbeilstenedmands commented Mar 30, 2021

For profile-fitted integration, only integrate reflections that have a high fraction of valid foreground pixels (threshold currently set to 0.75, previously 0.6). This is controlled by the parameter valid_foreground_threshold (range 0 to 1). For #1638.

This is intended so that reflections whose centroids lie in the panel gaps are not integrated by profile fitting. In my opinion the default should be at least 0.5, so that the peak centre is approximately observed. A higher threshold such as 0.75 may be more appropriate; I plan to investigate how the data quality depends on the threshold value.
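
As a rough illustration of the filter described above, here is a minimal pure-Python sketch. It is not the DIALS implementation: the bit-flag values `VALID` and `FOREGROUND`, the helper names, and the use of `>=` for the comparison are all assumptions made for this example. Only the parameter name `valid_foreground_threshold` and its 0 to 1 range come from this PR.

```python
# Hypothetical bit flags for a per-pixel shoebox mask (values are assumptions).
VALID = 1 << 0
FOREGROUND = 1 << 2


def valid_foreground_fraction(mask):
    """Return the fraction of foreground pixels that are also flagged valid."""
    foreground = [m for m in mask if m & FOREGROUND]
    if not foreground:
        return 0.0
    n_valid = sum(1 for m in foreground if m & VALID)
    return n_valid / len(foreground)


def select_for_profile_fitting(masks, valid_foreground_threshold=0.75):
    """Return indices of reflections that meet the valid-foreground threshold."""
    return [
        i
        for i, mask in enumerate(masks)
        if valid_foreground_fraction(mask) >= valid_foreground_threshold
    ]


# One reflection fully on the panel, one with half its foreground in a gap.
fully_recorded = [VALID | FOREGROUND] * 8 + [VALID] * 4
half_in_gap = [VALID | FOREGROUND] * 4 + [FOREGROUND] * 4 + [VALID] * 4

selected = select_for_profile_fitting([fully_recorded, half_in_gap])
print(selected)  # only the fully recorded reflection passes at 0.75
```

With a threshold of 0.5 the second reflection would also be kept, which is why the choice of default matters for reflections straddling a panel gap.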


codecov bot commented Mar 30, 2021

Codecov Report

Merging #1640 (e8e3374) into main (7cfb6ec) will increase coverage by 0.00%.
The diff coverage is 93.18%.

❗ Current head e8e3374 differs from pull request most recent head 555b99e. Consider uploading reports for the commit 555b99e to get more accurate results

@@           Coverage Diff           @@
##             main    #1640   +/-   ##
=======================================
  Coverage   66.63%   66.63%           
=======================================
  Files         616      616           
  Lines       68949    68951    +2     
  Branches     9601     9595    -6     
=======================================
+ Hits        45943    45945    +2     
  Misses      21070    21070           
  Partials     1936     1936           

@graeme-winter graeme-winter self-requested a review March 31, 2021 09:41

@graeme-winter graeme-winter left a comment


Stage 0 review, looking at the code changes: they seem sane. I propose defining the default value in one place, i.e. the master phil, rather than scattering 0.6 everywhere. There also seemed to be an unrelated change in the mix.

I will run some actual integration tests now (no pun intended) to assess the impact of these changes. I will approve once that change is made.

I think this is a large fanfare bug fix which warrants a 3.4.2 @ndevenish

working_directory=tmpdir,
)
assert not result.returncode and not result.stderr
table = flex.reflection_table.from_file(tmpdir / "integrated.refl")

A stronger test would be to check that there are no reflection centroids in the tile join regions.
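
One way such a check could look is sketched below in plain Python. The gap coordinates, the centroid values, and the helper names are hypothetical and chosen only for illustration. A real test would take centroid positions from the integrated reflection table and gap positions from the detector model.

```python
def in_gap(coord, gaps):
    """True if a 1-D pixel coordinate falls inside any half-open (start, stop) gap."""
    return any(start <= coord < stop for start, stop in gaps)


def centroids_in_gaps(centroids, x_gaps, y_gaps):
    """Return the (x, y) centroids that lie in a vertical or horizontal gap."""
    return [
        (x, y)
        for x, y in centroids
        if in_gap(x, x_gaps) or in_gap(y, y_gaps)
    ]


# Hypothetical tile-join regions, in pixels (assumed values, not a real detector).
x_gaps = [(487, 494)]
y_gaps = [(195, 212)]

# Hypothetical centroids: the second lies in an x gap, the third in a y gap.
centroids = [(100.5, 50.2), (490.0, 300.0), (600.0, 200.0)]
offenders = centroids_in_gaps(centroids, x_gaps, y_gaps)
print(offenders)

# In a test, the integrated (profile-fitted) centroids would be expected
# to produce an empty list here.
```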

@jbeilstenedmands

Tested on the x4-wide dataset. Compared to the current release (dials-3.4), this PR gives a significant improvement in merging R-factors and I/sigma, with about 10K fewer reflections. Compared to dials-3.3, the R-factors are slightly worse, but there are about 6K more reflections.

This PR, with threshold value of 0.75:

                                            Overall     Low    High
High resolution limit                           1.09    2.95    1.09
Low resolution limit                           29.94   29.95    1.11
Completeness                                   96.6   100.0    71.1
Multiplicity                                    5.5     5.6     2.5
I/sigma                                        10.1    74.7     0.1
Rmerge(I)                                     0.091   0.034   2.577
Rmerge(I+/-)                                  0.075   0.020   2.637
Rmeas(I)                                      0.099   0.037   3.149
Rmeas(I+/-)                                   0.090   0.024   3.530
Rpim(I)                                       0.040   0.015   1.766
Rpim(I+/-)                                    0.048   0.013   2.329
CC half                                       0.999   0.999   0.297
Wilson B factor                              11.430
Anomalous completeness                         93.4    98.1    53.1
Anomalous multiplicity                          3.0     3.5     1.5
Anomalous correlation                         0.495   0.790   0.068
Anomalous slope                               0.381
dF/F                                          0.093
dI/s(dI)                                      0.627
Total observations                            82773    5011    1330
Total unique                                  15109     893     533

dials-3.4 (after #1297):

                                            Overall     Low    High
High resolution limit                           1.09    2.95    1.09
Low resolution limit                           29.94   29.95    1.11
Completeness                                   97.1   100.0    72.7
Multiplicity                                    6.1     5.7     2.8
I/sigma                                         9.3    66.4     0.1
Rmerge(I)                                     0.108   0.036   2.861
Rmerge(I+/-)                                  0.091   0.022   2.589
Rmeas(I)                                      0.118   0.040   3.445
Rmeas(I+/-)                                   0.107   0.027   3.409
Rpim(I)                                       0.046   0.016   1.864
Rpim(I+/-)                                    0.056   0.014   2.196
CC half                                       0.998   0.999   0.233
Wilson B factor                              12.480
Anomalous completeness                         95.3    98.4    63.1
Anomalous multiplicity                          3.3     3.5     1.6
Anomalous correlation                         0.479   0.776   0.017
Anomalous slope                               0.430
dF/F                                          0.128
dI/s(dI)                                      0.736
Total observations                            92365    5124    1538
Total unique                                  15187     896     545

dials-3.3.5 (before #1297), processed to same resolution:

                                            Overall     Low    High
High resolution limit                           1.09    2.96    1.09
Low resolution limit                           29.94   29.95    1.11
Completeness                                   96.0   100.0    67.8
Multiplicity                                    5.1     5.2     2.3
I/sigma                                        11.1    83.2     0.1
Rmerge(I)                                     0.086   0.029   3.887
Rmerge(I+/-)                                  0.072   0.019   4.490
Rmeas(I)                                      0.095   0.032   4.820
Rmeas(I+/-)                                   0.087   0.022   6.103
Rpim(I)                                       0.040   0.013   2.791
Rpim(I+/-)                                    0.048   0.012   4.109
CC half                                       0.999   0.999   0.169
Wilson B factor                              11.460
Anomalous completeness                         91.4    92.6    46.5
Anomalous multiplicity                          2.8     3.3     1.4
Anomalous correlation                         0.434   0.743   0.040
Anomalous slope                               0.435
dF/F                                          0.091
dI/s(dI)                                      0.640
Total observations                            76654    4660    1176
Total unique                                  14995     892     512
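
The "10K fewer" and "6K more" figures quoted above can be checked against the overall "Total observations" rows of the three tables. The short snippet below only redoes that arithmetic; the dictionary keys are labels chosen for this example.

```python
# Overall "Total observations" values copied from the three tables above.
total_observations = {
    "this PR (0.75)": 82773,
    "dials-3.4": 92365,
    "dials-3.3.5": 76654,
}

pr = total_observations["this PR (0.75)"]
fewer_than_3_4 = total_observations["dials-3.4"] - pr
more_than_3_3_5 = pr - total_observations["dials-3.3.5"]

print(f"Fewer observations than dials-3.4:  {fewer_than_3_4}")   # 9592, i.e. ~10K
print(f"More observations than dials-3.3.5: {more_than_3_3_5}")  # 6119, i.e. ~6K
```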

@jbeilstenedmands

Updated the default threshold to 0.75 to be more conservative; this is also the value used for the equivalent idea in XDS.

@jbeilstenedmands jbeilstenedmands marked this pull request as ready for review March 31, 2021 10:37
@graeme-winter

Functional review: in hindsight this problem is not as egregious as first thought, but the fix is very worthwhile and justifies a little fanfare.

Example data set / comparable images / before and after:

Screenshot 2021-03-31 at 12 47 59

With patches in this PR:

Screenshot 2021-03-31 at 12 46 43

As may be expected, there are modest but worthwhile improvements in the quality of the merged data as recorded by the merging stats, and no evidence of a negative impact. Thank you, green light from me.

@graeme-winter graeme-winter self-requested a review March 31, 2021 11:51

@graeme-winter graeme-winter left a comment


As above, the change set looks good to go, thank you.

Agree with setting the equivalent of the XDS MINPK parameter to 0.75.

@ndevenish ndevenish merged commit fe9d3bc into main Mar 31, 2021
@ndevenish ndevenish deleted the filter_masked_reflections branch March 31, 2021 12:06
jbeilstenedmands added a commit that referenced this pull request Mar 31, 2021
…h fraction of invalid pixels. (#1640)

For profile-fitted integration, only integrate reflections that have a
high fraction of valid foreground pixels. Controlled by parameter
valid_foreground_threshold (range 0 -> 1).

* Add test to display broken profile-fitting behaviour.
* Screen out reflections for integration by profile-fitting where less
  than a given fraction of the foreground pixels are valid.
* Fix bug in integration terminal-report output
* Make threshold value 0.75 to be more conservative.
* Update further default values
* Add debug log statement for number of reflections filtered out
DiamondLightSource-build-server added a commit that referenced this pull request Mar 31, 2021
Features
--------

- ``dials.cosym``: Significantly faster via improved computation of functional, gradients and curvatures (#1639)
- ``dials.integrate``: Added parameter ``valid_foreground_threshold=``, to require a minimum fraction of valid pixels before profile fitting is attempted (#1640)

Bugfixes
--------

- ``dials.cosym``: Cache cases where Rij is undefined, rather than recalculating each time. This can have significant performance benefits when handling large numbers of sparse data sets. (#1634)
- ``dials.cosym``: Fix factor of 2 error when calculating target weights (#1635)
- ``dials.cosym``: Fix broken ``engine=scipy`` option (#1636)
- ``dials.integrate``: Reject reflections with a high number of invalid pixels, which were being integrated since 3.4.0. This restores better merging statistics, and prevents many reflections being incorrectly profiled as zero-intensity. (#1640)
DiamondLightSource-build-server added a commit that referenced this pull request Apr 1, 2021
Features
--------

- ``dials.cosym``: Significantly faster via improved computation of functional, gradients and curvatures (#1639)
- ``dials.integrate``: Added parameter ``valid_foreground_threshold=``, to require a minimum fraction of valid pixels before profile fitting is attempted (#1640)

Bugfixes
--------

- ``dials.cosym``: Cache cases where Rij is undefined, rather than recalculating each time. This can have significant performance benefits when handling large numbers of sparse data sets. (#1634)
- ``dials.cosym``: Fix factor of 2 error when calculating target weights (#1635)
- ``dials.cosym``: Fix broken ``engine=scipy`` option (#1636)
- ``dials.integrate``: Reject reflections with a high number of invalid pixels, which were being integrated since 3.4.0. This restores better merging statistics, and prevents many reflections being incorrectly profiled as zero-intensity. (#1640)
- Fix rare crash in symmetry calculations when no resolution limit could be calculated (#1641)