Determine scaling error model using anomalous groups. #1332
Codecov Report
@@ Coverage Diff @@
## master #1332 +/- ##
==========================================
+ Coverage 64.19% 64.21% +0.02%
==========================================
Files 617 617
Lines 69784 69832 +48
Branches 9557 9566 +9
==========================================
+ Hits 44797 44846 +49
+ Misses 23218 23217 -1
Partials 1769 1769
@jbeilstenedmands this looks very likely to be exceedingly relevant. It is No. 2 in my queue to review this morning :-) thank you
Thanks. This is not going to solve the anomalous issue, but I think it is a step on the right path.
Yes, I think this moves the right way. Your explanation makes sense, even to me :-)
LIC strong #1: easy, big anomalous signal. Processing following the main sequence, scaling with DIALS on master:
The same, on the branch:
Enough of an improvement that I'm keen to have this in master; looks good. The tests all pass, so 👍 Will look at the diffs now.
Looks sensible, and makes a measurable improvement, thank you.
I do think we should have some program documentation which explains how this works though - it's important for the user to be able to understand this without reading the code.
While investigating anomalous signal in DIALS processing (#1215), I realised that anomalous pairs should be separated for error model determination in scaling.
Consider a low-resolution reflection with a large anomalous difference, measured many times. The I+ observations should be normally distributed around the mean I+, and the I- observations around the mean I-. If the separation is large, it is clearly wrong to assume that they are all normally distributed around Imean, which is the current assumption underlying the error model minimisation. Doing so should inflate the errors (larger error model parameters) and hence reduce metrics such as dI/s(dI). I therefore believe that anomalous data should always be separated for error modelling, regardless of whether anomalous data are separated for scaling model determination.
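The argument above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the DIALS error model code: the function names, the intensities (1100 and 900), the sigma of 20 and the number of observations are all made up for illustration. It computes normalised deviations (I - <I>)/sigma for one reflection, first grouping I+ and I- together around Imean, then treating the two groups separately.

```python
import random
import statistics


def normalised_deviations(intensities, sigma):
    """Return (I - <I>) / sigma for one group of repeated measurements."""
    group_mean = statistics.fmean(intensities)
    return [(i - group_mean) / sigma for i in intensities]


def simulate(n_obs=200, i_plus=1100.0, i_minus=900.0, sigma=20.0, seed=0):
    """Hypothetical example: compare the spread of normalised deviations
    when Friedel mates are merged versus separated."""
    rng = random.Random(seed)
    plus = [rng.gauss(i_plus, sigma) for _ in range(n_obs)]
    minus = [rng.gauss(i_minus, sigma) for _ in range(n_obs)]

    # Merged: I+ and I- treated as one group distributed around Imean.
    merged = normalised_deviations(plus + minus, sigma)

    # Separated: each anomalous group is compared with its own mean.
    separated = normalised_deviations(plus, sigma) + normalised_deviations(minus, sigma)

    return statistics.pstdev(merged), statistics.pstdev(separated)


merged_sd, separated_sd = simulate()
print(f"spread around Imean:        {merged_sd:.2f}")
print(f"spread around I+/I- means:  {separated_sd:.2f}")
```

With correct sigmas, the separated deviations have a spread close to 1, whereas the merged deviations are much broader because the anomalous difference is counted as if it were measurement error. An error model fitted to the merged deviations would have to enlarge the sigmas to compensate, which is the inflation described above.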
I have tested this change on a selection of test datasets. In summary, the change in refined error model parameters leads to an overall increase in I/sigma, dI/s and anomalous slope, with some datasets affected more than others. The effect on anomalous correlation seems variable.
Beta-lactamase
x4-wide
Insulin (dataset 5, issue #1215)
Thaumatin (dataset 10, issue #1215). Less affected.
Weak thermolysin. Less affected.