
dials.integrate background.algorithm=simple for integrating detectors #706

Closed
dagewa opened this issue Feb 13, 2019 · 16 comments

dagewa commented Feb 13, 2019

The default background.algorithm=glm is the best choice for photon-counting detectors; however, my experience suggests that background.algorithm=simple is more appropriate for integrating detectors. The reason seems to be the inverse of the logic that makes the GLM approach better for counts: pixel values on integrating detectors are better described by a normal distribution than by a Poisson distribution. Assuming a Poisson distribution tends to overestimate the background level, so integrating beyond the resolution limit produces intensities with a negative bias. This affects not just CCDs but other types of integrating detector too, such as certain detectors used in electron diffraction and, most likely, detectors used at XFELs, like the CS-PAD.
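As a toy illustration of that bias (this is not the actual DIALS GLM code): if the true background is normal with a small mean, then forcing non-negativity, as any Poisson-based model implicitly does, shifts the estimate upwards.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy background from an integrating detector: roughly normal around a small
# mean, so negative pixel values are legitimate measurements.
pixels = rng.normal(loc=2.0, scale=3.0, size=100_000)

# Under a normal model the background estimate is just the sample mean.
normal_estimate = pixels.mean()

# A Poisson model cannot represent negative values; crudely emulating that
# restriction by clipping at zero biases the estimate upwards.
poisson_like_estimate = np.clip(pixels, 0.0, None).mean()

print(normal_estimate)        # close to the true mean of 2.0
print(poisson_like_estimate)  # noticeably higher
```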

At the moment the background algorithm is a user choice, but very few users know that the default may not be the most appropriate one. Are there ways we could make this choice automatically? It seems like detector_helper_sensors could be put to work here (though it is not a one-to-one mapping; some chips can operate in either counting or integrating mode).

phyy-nx commented Feb 13, 2019 via email

dagewa commented Feb 14, 2019

Good point - an inappropriate default can cut both ways. Auto should solve most issues for detectors we know about. For detector types that could be either counting or integrating, there is some chance the appropriate response model would be recorded in metadata (he says, while fighting natural pessimism).

dagewa commented Feb 28, 2019

@jmp1985 wondering if you have an opinion here?

jmp1985 commented Feb 28, 2019

@dagewa I agree that having an auto which knows about the detector type would be useful. In general, if the pixels can assume negative values then the simple method may be the best solution; it is hard to argue that such values could be correctly modelled by a Poisson distribution (in particular, thinking of those data you had with average background < 0!)

dagewa commented Mar 5, 2019

It sounds like, to do this right, we need something in addition to the sensor type property: at minimum an integrating-vs-counting choice that is set by the format class and then used by background.algorithm=auto within dials.integrate.

Even better would be a measure of detector response statistics. Back in Waterman & Evans (2010) we summarised this by a 'cascade factor' γ and 'pixel factor' ψ, and showed how the expressions for integrated intensity errors could be extended to take these into account. For a (true) photon counting detector γ=1 and ψ=0, so that's easy. Unfortunately for integrating detectors the actual values of these parameters are not well-known and can be difficult to measure.

It might be reasonable to add these parameters to the detector model anyway, and use γ≠1 simply as a flag to determine the behaviour of background.algorithm=auto without yet extending the integrated intensity error estimates. At the moment, the choice of background algorithm can be critical and can be the difference between successful integration or not for data from some types of detector.

I'd welcome other opinions on this.

phyy-nx commented Mar 6, 2019 via email

dagewa commented Mar 7, 2019

For a Rayonix CCD module we measured a cascade factor γ=1.41, which is not too far from the γ=1.54 obtained by simulation from manufacturer specs. The pixel factor used in the simulation was ψ=4.38. We didn't compare this with measurement, but the true value is probably close, as this factor depends mainly on read noise, which manufacturers publish. ψ has a lower limit of 1/12 set by digitisation error.
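That 1/12 figure is just the variance of the uniform rounding error introduced by digitisation, which is easy to verify numerically:

```python
import numpy as np

rng = np.random.default_rng(0)

# Digitisation rounds the true signal to the nearest ADU, so the rounding
# error is uniform on [-0.5, 0.5) ADU, whose variance is 1/12.
signal = rng.uniform(0.0, 1000.0, size=1_000_000)
error = np.round(signal) - signal

print(error.var())  # approximately 1/12 ≈ 0.0833
```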

I don't know what the defaults should be for other detectors though. It would be quite a bit of work to figure them all out. CCDs are probably fairly similar, but we can't be sure without some experiments (which no one is likely to do).

Mainly the effect would be to mitigate the underestimation of integrated intensity errors rather than affect the intensities themselves, although profile fitting results would be affected slightly as the errors are used during the fit. It is already well known that the errors on integrated intensities are wrong, which is usually 'dealt with' by inflating them during scaling (see SdFac, SdAdd and SdB in Aimless). By providing less bad initial estimates the main effect would probably be that the inflation factor becomes smaller. So, I think the effect might be fairly minor, but at least the model would be more realistic (unless the factors are given crazy values). Perhaps it would even be possible to optimise the parameters for particular detector types according to how much the scaling program has to inflate errors afterwards, avoiding a lot of difficult experiments.

I suspect the value of investing much time in this is limited though, at least for CCDs (CS PAD etc. may be a different matter). At the moment I would be more interested in using this merely as a flag to indicate when a detector was integrating, not counting, and the simple background algorithm should be switched on.

phyy-nx commented Mar 7, 2019 via email

graeme-winter commented Mar 7, 2019 via email

@dagewa dagewa added this to the DIALS 2.0 milestone May 2, 2019
dagewa commented May 2, 2019

I agree we should keep things simple at first, starting with an integrating vs counting enum, allowing future expansion. This should be set by the format class, depending on the detector type and mode of operation. This alone will allow background.algorithm=auto to be implemented.

The current default behaviour is inappropriate for integrating detectors such as CCDs and the CS-PAD, so I see this as an important enhancement, and I would suggest a useful feature to have in place for DIALS 2.0.

dagewa commented May 2, 2019

@jmp1985 I am happy to start the work here as I raised the issue, but it'd be good to have a conversation first about how to go about it.

jmp1985 commented May 2, 2019

Hi @dagewa. I will pick this up when I am back in Diamond next week.

jmp1985 commented May 7, 2019

Hi @dagewa. I think it would be very useful in the long term to have the more thorough description you suggest.

Another simple way of selecting with auto would be to query the trusted range of the detector. Counting detectors should not have negative pixel values in the trusted range, so we could use glm in that case and simple otherwise. Or are there cases where the pixel values are non-negative but we still cannot assume they are Poisson distributed?
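A minimal sketch of that heuristic (the function name is illustrative, not a DIALS API; it assumes the trusted range is available as a (min, max) pair, as on dxtbx panels):

```python
def choose_background_algorithm(trusted_range):
    """Pick a background algorithm from a detector's (min, max) trusted range.

    Heuristic from this thread: a counting detector should never report
    negative values, so a trusted range extending below zero suggests an
    integrating detector, for which the 'simple' algorithm is preferred.
    """
    lower, _ = trusted_range
    return "glm" if lower >= 0 else "simple"

print(choose_background_algorithm((0, 65535)))     # -> glm
print(choose_background_algorithm((-100, 65535)))  # -> simple
```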

dagewa commented May 7, 2019

@jmp1985 Interesting point. For CCDs and certain types of detector used in electron microscopes it is true that negative values are 'allowed' measurements, assuming that any applied bias or pedestal level has been subtracted from the data first. In those cases it is certainly true that the trusted_range should start from a negative value, if it is set correctly.

It would be interesting to query the registry and look at what trusted_range is currently set to for each Format. I wonder if we are setting this appropriately in each case.

This wouldn't work in every case though. The mixed-mode detector @phyy-nx mentions would be an example where the assumption breaks down.

dagewa commented May 7, 2019

After a quick look I think it is clear we don't handle trusted_range properly, or at least we can't trust it to identify integrating detectors. There are image plates and CCDs where trusted_range is set to start from 0 (see FormatRAXISII.py, FormatDIP2030b.py and FormatSMVRigakuSaturn.py), and some weird logic is used for SMV ADSC, where raw_data has the image pedestal value subtracted from it but underload = pedestal - 1.

@dagewa dagewa added this to To do in DIALS 2.0 via automation Jul 2, 2019
@dagewa dagewa modified the milestones: DIALS 2.0, DIALS 2.1 Jul 5, 2019
@dagewa dagewa removed this from To do in DIALS 2.0 Jul 5, 2019
dagewa commented Dec 5, 2019

It seems the proposals here got lost by becoming too complicated.

How about I start by creating background.algorithm=auto as the default option, setting it to glm if gain == 1 and simple otherwise? I assert that background.algorithm=glm is not what you want when gain != 1.
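A sketch of that gain-based resolution (a hypothetical helper for illustration, not the code that was eventually committed):

```python
def resolve_background_algorithm(requested, gain):
    """Resolve the background algorithm as proposed above.

    'requested' is the user's choice ('auto', 'glm' or 'simple'); an explicit
    choice always wins, and 'auto' picks glm only for gain == 1.
    """
    if requested != "auto":
        return requested
    return "glm" if gain == 1.0 else "simple"

print(resolve_background_algorithm("auto", 1.0))    # -> glm
print(resolve_background_algorithm("auto", 0.8))    # -> simple
print(resolve_background_algorithm("simple", 1.0))  # -> simple
```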

dagewa added a commit that referenced this issue Dec 6, 2019
Choose between glm and simple methods depending on whether the
detector appears to be counting or not (#706)
dagewa added a commit that referenced this issue Dec 16, 2019
@dagewa dagewa closed this as completed in e8eafc9 Dec 18, 2019