Separate independent blocks of experiments in refinement #2336
Conversation
Still to do:

- Report on which ids are in each block (table in log file)
- Rejoin results after refining each block
- Report on overall RMSDs by experiment after rejoining
- Tests

Also, return None for refiner and history if refinement has been done in disjoint blocks. In such a case, refuse to write out certain types of debugging/analysis files.
compared with not, so put this behaviour under user control.
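The "rejoin results" and "report overall RMSDs" items above can be sketched as follows. This is an illustrative stand-in only: DIALS stores reflections in flex tables, whereas here plain dicts represent reflections, and `rejoin` and `rmsd_by_experiment` are hypothetical helpers, not dials.refine code.

```python
import math

def rejoin(block_results):
    """Concatenate per-block reflection lists back into one list."""
    rejoined = []
    for block in block_results:
        rejoined.extend(block)
    return rejoined

def rmsd_by_experiment(reflections):
    """RMSD of positional residuals, grouped by experiment id."""
    sums, counts = {}, {}
    for refl in reflections:
        eid = refl["id"]
        dx, dy = refl["dx"], refl["dy"]
        sums[eid] = sums.get(eid, 0.0) + dx * dx + dy * dy
        counts[eid] = counts.get(eid, 0) + 1
    return {eid: math.sqrt(sums[eid] / counts[eid]) for eid in sums}
```

After each block is refined independently, its reflections are concatenated back together and the per-experiment RMSDs are computed on the rejoined set, which is what the log-file table would report.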
Codecov Report

```diff
@@           Coverage Diff            @@
##            main    #2336      +/-  ##
========================================
+ Coverage   78.62%   82.99%   +4.36%
========================================
  Files         603      593      -10
  Lines       73644    68676    -4968
  Branches    10005     9246     -759
========================================
- Hits        57903    56998     -905
+ Misses      13612     9539    -4073
- Partials     2129     2139      +10
```
Very interesting. I believe Brewster 2018, Figure 6 is exactly this use case. The blue curve is normal indexing with a slightly wrong model, the green is refining a detector model per image, and the red is joint refinement with a single detector model. Both red and green go to the right location, but green has a wide distribution due to the independence of each sample. This was with what would have been the old default of

I deposited the expt files needed to make those plots but apparently I didn't deposit the refl files, so repeating that exactly isn't easy. I'll see what I can do to reproduce it on this branch.
To confirm that differences in refinement are not due to a bug introduced by this change, we can compare output using the basic Gauss–Newton engine, with no outlier rejection, and restricting the number of steps to make sure both blocks take the same number of steps (in fact, in this case, they do anyway):
In both cases the final RMSD table is
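For context, the Gauss–Newton engine referenced above repeatedly solves the normal equations for a least-squares problem. A generic sketch, not the DIALS refinery implementation, using the classic Rosenbrock residuals as a worked example:

```python
import numpy as np

# Generic Gauss-Newton least-squares iteration (illustrative only).
# Updates parameters x by solving the normal equations
# J^T J dx = -J^T r at each step, where r(x) are the residuals.

def gauss_newton(residuals, jacobian, x0, max_steps=20, tol=1e-10):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_steps):
        r = residuals(x)
        J = jacobian(x)
        dx = np.linalg.solve(J.T @ J, -J.T @ r)
        x = x + dx
        if np.linalg.norm(dx) < tol:
            break
    return x

# Worked example: Rosenbrock as a least-squares problem,
# r(x) = [10*(x2 - x1^2), 1 - x1], with minimum at (1, 1).
def residuals(x):
    return np.array([10.0 * (x[1] - x[0] ** 2), 1.0 - x[0]])

def jacobian(x):
    return np.array([[-20.0 * x[0], 10.0], [-1.0, 0.0]])

fit = gauss_newton(residuals, jacobian, [0.0, 0.0])
```

From the origin this example reaches the minimum in two full Gauss–Newton steps, which illustrates why fixing the step count makes runs directly comparable.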
Ok, here's what I did. I re-indexed run 29 of the thermolysin data in Brewster 2018, but changing Z by 1 mm. I then grabbed 1000 images and did hierarchy level 0 dials.refine on the main branch, with 1 or N detector models. The result looks like Figure 6 from the paper: blue is before refinement, orange is refinement with N detector models, and green is refinement with 1 detector model. Orange is wider than green, as expected.

Next I checked out this branch and refined with 1 detector model using the default separate_independent_sets=True, and with N detector models with separate_independent_sets=False. I verified I got identical unit cell distributions as on main. Note, the latter case took 4 minutes and 5 seconds.

Finally I refined with N detector models using separate_independent_sets=True. Here, however, I hit a problem. The program ran for a while and then my interactive node at LCLS became unresponsive and I was kicked off. I moved the data to the 64-core dials.lbl.gov and tried again. The program ran for 22 minutes before it crashed strangely with just the message Killed. I was watching

Not sure where to go from here. I've been curious about this since we wrote that paper, since refining independently should give the same result as refining together if the models are independent, but it doesn't, likely for the reasons you gave above @dagewa. So I hope there is a solution to this memory problem. Regardless, I've put the files in a Google Drive and shared them with you if it helps.
Thanks @phyy-nx, that's interesting. I'm not sure why the memory usage is so high this way, but I'll try to profile it.
I can see the memory problem. It's because of
Each copy of
I can work around that, but I've realised there is another problem. In your

I guess it would be better to do the outlier rejection step prior to splitting, so it is really only the refinement that is done in independent sets. However, this is going to be a somewhat more significant change. At the moment, outlier rejection is done as part of the construction of a
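The suggested ordering, outlier rejection once on the full data set and only then splitting into independent sets, can be sketched as below. The median/MAD rule and both helper functions are illustrative assumptions; DIALS provides several outlier rejection algorithms of its own.

```python
import statistics

def reject_outliers(reflections, key="residual", k=6.0):
    """Keep reflections within k median absolute deviations of the median.

    A deliberately simple stand-in for real outlier rejection, applied
    to the full reflection set BEFORE any splitting into blocks.
    """
    values = [r[key] for r in reflections]
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values) or 1e-12
    return [r for r in reflections if abs(r[key] - med) <= k * mad]

def split_by_group(reflections, group_of_experiment):
    """Split surviving reflections by independent experiment group."""
    groups = {}
    for r in reflections:
        g = group_of_experiment[r["id"]]
        groups.setdefault(g, []).append(r)
    return groups
```

Rejecting first means every block sees the outlier decision made against the full data set, so only the refinement itself differs between the joint and split runs.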
Hi @phyy-nx, with the change in 1e4763d this job:
completes in 1m25s on my laptop and uses very little memory. So, that's as it should be. But it's still not the job you want to be able to run. I had to set

Separating the outlier rejection from the rest of the construction of the
Ok, good to hear!
1. _build_reflection_manager_and_predictor
2. _build_refiner

This allows interrupting building the refiner after outlier rejection is done. Add a method, reflections_after_outlier_rejection, which does this.
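The two-phase split described in this commit message can be sketched as below. The method names follow the commit message, but the class body is a hypothetical stand-in: the real factory in dials.algorithms.refinement is far more involved, and here outlier rejection is stubbed as a simple filter.

```python
class RefinerFactorySketch:
    """Illustrative two-phase builder, not the real DIALS factory."""

    def __init__(self, reflections):
        self._input = reflections
        self._managed = None

    def _build_reflection_manager_and_predictor(self):
        # Phase 1: prediction and outlier rejection (stubbed as a filter)
        self._managed = [r for r in self._input if not r.get("outlier")]
        return self._managed

    def _build_refiner(self):
        # Phase 2: construct the refiner from the managed reflections
        if self._managed is None:
            self._build_reflection_manager_and_predictor()
        return {"n_reflections": len(self._managed)}

    def reflections_after_outlier_rejection(self):
        # Interrupt after phase 1 and hand back surviving reflections,
        # without paying for phase 2
        if self._managed is None:
            self._build_reflection_manager_and_predictor()
        return self._managed
```

The point of the split is that a caller can stop after phase 1, take the outlier-rejected reflections, divide them into independent blocks, and only then build a (cheaper) refiner per block.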
With those last changes,
and the same on main. The resulting tables of
Hi @graeme-winter, I don't want to merge this without a review / testing from you, as this came from your use case in #2235
Noted, gimme a mo
OK, ran a quick test against a 10-experiment lysozyme data set ->
Meanwhile,
Seems to have substantially reduced the wall clock time
Modest but probably useful reduction in memory footprint
Output:
I consider these results to be equivalent.
Change set delivers on promise of reducing resource requirements without meaningfully affecting the quality of output -> win
Would ask: in the loop-over-data-set part of refinement, please can we print the experiment ID being refined, as a progress report for the impatient user?
Otherwise looks excellent, thank you
Thank you. Looks like a useful set of suggestions / comments, which I will work through soon.
…separate refinement jobs on real data
I've addressed all comments now, so just waiting for the green ticks.
For #2235.
This changeset identifies subsets of the experiment list that are independent (i.e. they don't share any models with other subsets) and performs refinement separately for each of these groups. This is put under the control of the user of `dials.refine` via the `separate_independent_sets` parameter (`True` by default).

It turned out to be much easier to do this at a high level, before any `Refiner`s are made, rather than attempting a precise analysis of the structure of the refinement problem. For that reason, this is not attempted at all if any constraints or restraints are present, as these may link the models in a way that is not clear from the experiment list alone.

Note that refinement results are not the same if refinement is done in separate groups compared to all at once. There could be many reasons for this:
Currently draft, to assess the impact of the change. @phyy-nx, I'm interested to hear if this will affect any of your use cases. Note, if anything is shared between all the models (such as a beam or detector) then this will have no effect. Groups of experiments must be strictly independent from other groups for this to trigger. Also, as this occurs at a high level (inside the `run_dials_refine` function of the command line program) it does not affect other usages of `Refiner`s, such as indexing.

I have not yet added any new tests or attempted to perform the refinement of separate groups in parallel. That could be interesting, but could be expensive in terms of memory usage.
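Finding subsets of the experiment list that share no models amounts to computing connected components over shared model references. Below is a minimal union-find sketch; `find_independent_groups` is a hypothetical helper, not the actual dials.refine code, and each experiment is represented simply as a dict of model objects:

```python
def find_independent_groups(experiments):
    """Partition experiment indices into groups that share no models.

    Experiments are dicts mapping model names (beam, detector, ...) to
    model objects; two experiments sharing any model object (by
    identity) end up in the same group.
    """
    parent = list(range(len(experiments)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    # Map each model (by identity) to the first experiment that used it
    seen = {}
    for i, expt in enumerate(experiments):
        for model in expt.values():
            if model is None:
                continue
            key = id(model)
            if key in seen:
                union(i, seen[key])
            else:
                seen[key] = i

    groups = {}
    for i in range(len(experiments)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())
```

This also shows why a shared beam or detector collapses everything into one group: a single model object common to all experiments links every component together, and the split then has no effect.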