Record history in experiment, consolidate on experiment list write #816
|
Results of @phyy-nx's timing test on my laptop. For this branch: on main: so, actually a bit quicker on this branch, but maybe that's a fluke. Anyway, the branch is not slow. I think what would be interesting is to actually run a I want to do similar tests for other multi-experiment use cases, such as |
|
Thanks for running those tests.
Are the history lines created automatically? I would assume the individual programs would need to add them. As for how many lines in the final file, So, question for the cctbx.xfel group, how does |
Yes, it's automatic and will work for other packages that use dxtbx, not just DIALS. The only thing that is not is adding the flag to indicate an integrated or scaled file. This flag is to make it easier to extract the interesting lines for MTZ export. |
|
Regarding your other question, I was thinking about this too and I reckon just select the final (latest timestamp) |
|
Final time stamp makes sense. This looks awesome. @yangha7 is gonna test this on the cctbx.xfel side. Thanks! |
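To make the "latest timestamp" idea concrete, here is a minimal sketch of picking the final flagged line out of a history. The `timestamp|program|version|flags` layout, the example lines and the `latest_with_flag` helper are all assumptions made for illustration; they are not taken from the dxtbx code.

```python
from datetime import datetime

# Hypothetical line layout: "timestamp|program|version|flags".
# The real dxtbx format may differ; this is only for illustration.
HISTORY = [
    "2024-05-01T10:00:00Z|dials.import|v0.0|",
    "2024-05-01T10:05:00Z|dials.integrate|v0.0|integrated",
    "2024-05-01T10:20:00Z|dials.integrate|v0.0|integrated",
    "2024-05-01T10:30:00Z|dials.scale|v0.0|scaled",
]


def latest_with_flag(lines, flag):
    """Return the most recent history line carrying `flag`, or None."""

    def parse(line):
        timestamp, _program, _version, flags = line.split("|")
        when = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
        return when, flags.split(",")

    # Keep only lines whose flag field contains the requested flag.
    matches = [line for line in lines if flag in parse(line)[1]]
    # Among those, the latest timestamp wins.
    return max(matches, key=lambda line: parse(line)[0], default=None)


print(latest_with_flag(HISTORY, "integrated"))
```

With two integration runs in the history, only the later one would be picked up for the MTZ export.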
This is better as a method of the ExperimentList class, providing an easier interface to history management. When you want to append a new history line to each experiment, just call el.consolidate_histories(), which gives you the now-unique History object attached to each experiment, and append the line to that object.
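A toy model of that workflow, in plain Python, might look like the sketch below. The `History` and `Experiment` classes here are stand-ins written for this example, not the dxtbx classes, and the free function `consolidate_histories` only mimics what the method is described as doing. It assumes each line starts with an ISO 8601 UTC timestamp, so a plain string sort is a sort by time.

```python
class History:
    """Toy stand-in for the shared history object (not the dxtbx class)."""

    def __init__(self, lines):
        self.lines = list(lines)


class Experiment:
    """Toy experiment holding a shared reference to a History."""

    def __init__(self, history):
        self.history = history  # shared, like Beam or Detector


def consolidate_histories(experiments):
    """Merge the distinct History objects into one, sorted by timestamp,
    attach it to every experiment and return it."""
    # Experiments may share a History, so count each object only once.
    unique = {id(e.history): e.history for e in experiments}
    merged = sorted(line for h in unique.values() for line in h.lines)
    combined = History(merged)
    for e in experiments:
        e.history = combined
    return combined


shared = History(["2024-05-01T10:00:00Z|dials.import|v0.0|"])
other = History(["2024-05-01T09:00:00Z|dials.import|v0.0|"])
expts = [Experiment(shared), Experiment(shared), Experiment(other)]

history = consolidate_histories(expts)
# One append is now visible from every experiment.
history.lines.append("2024-05-01T11:00:00Z|dials.index|v0.0|")
```

Because every experiment ends up pointing at the same object, the new line only has to be appended once.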
|
Hey @dagewa, @yangha7 and I are getting this error: Anything you've seen before? |
|
Hmm interesting. I tried to check the libtbx build but it actually failed for me much earlier (that is, the build failed), so I've only checked the cmake build. I'll have another look |
|
My bad, I failed to put the line in |
|
Hi @dagewa, this is working for me now. I ran test_stills_process and got dials.stills_process in the history. Very nice. I also ran an xfel_regression test and the history for cctbx.xfel.merge looks like this: I believe this is because cctbx.xfel.merge doesn't use entry points based on my read of your history generation code. What do you think about this diff? With that I get this history: Which seems better, though the version string isn't great. The diff also accounts for running a custom script by putting the script name in. |
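As a rough illustration of the cases discussed here (libtbx dispatcher, entry point, custom script, interactive session), a name lookup could be sketched as follows. This is not the diff from this thread. The `program_name` helper and the `"interactive"` fallback are hypothetical, and reading the dispatcher name from a `LIBTBX_DISPATCHER_NAME` environment variable is an assumption about how libtbx builds expose it.

```python
import os
import sys


def program_name(argv=None, environ=None):
    """Guess the program name to record in a history line."""
    argv = sys.argv if argv is None else argv
    environ = os.environ if environ is None else environ

    # Assumption: libtbx dispatchers export their name in this variable.
    dispatcher = environ.get("LIBTBX_DISPATCHER_NAME")
    if dispatcher:
        return dispatcher

    # Entry-point wrappers and "python my_script.py" both leave a path
    # in argv[0], so its base name covers both cases.
    if argv and argv[0]:
        return os.path.basename(argv[0])

    # An interactive session has an empty argv[0].
    return "interactive"


print(program_name())
```

Passing `argv` and `environ` explicitly keeps the function easy to test without touching the real process environment.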
|
Thanks for checking @phyy-nx. I thought I had handled that case, but clearly no. I'm surprised that Is there a standard way to get a version number for |
- correctly get dispatcher names for libtbx builds - get script name when run as main
|
No standard version for a dev build of any kind that I know about. cctbx.xfel (at present) isn't versioned regardless, something I've been thinking about in context of setting up a conda build for it. |
|
@phyy-nx are you happy for this to go in? It gets the basic structure in place even if we tweak things like version number capturing a bit more later |
|
Yep! |
…ctbx#816)

* Working towards History as an object that can be shared between experiments
* Make History pickleable
* Add History to Experiment as a shareable object
* Function to get unique set of history objects in the experiment list
* Add append_history_item function that controls the format of the history string, and includes a UTC timestamp
* bug fix
* Add functions to (de)serialize and consolidate history.
  At the moment, consolidation is just sorting by timestamp. We could start combining history items though for cases including parallel file writes (dials.stills_process)
* Change constructor to require history lines
* Test for history
* tidying
* Add a type annotation and a docstring
* News
* Rename newsfragments/xxx.feature to newsfragments/816.feature
* Bugfix for experiment lists with zero length history
* Tidy up consolidation of histories
  This is better as a method of the ExperimentList class, providing an easier interface to history management. When you want to append a new history line to each experiment, just call el.consolidate_histories(), which gives you the now unique History object attached to each experiment and append the line to that object.
* Missed line in SConscript
* Fix issue when an ExperimentList is saved in an interactive session
* Fix idiotic error in 7f1f770
* Changes based on @phyy-nx's suggestion to:
  - correctly get dispatcher names for libtbx builds
  - get script name when run as main

---------

Co-authored-by: DiamondLightSource-build-server <DiamondLightSource-build-server@users.noreply.github.com>
Third time's a charm?
Following @phyy-nx's suggestion, `History` is now attached to `Experiment` with a shared pointer, in the same way that the experimental models like `Beam`, `Detector` and so on are. This avoids the problem of losing history when it was part of the experiment list only (#811) and avoids excessive duplication of history data when there are lots of experiments (#814).

When an `ExperimentList` is saved, the histories of its component `Experiment`s are consolidated. At the moment, that just means combined into a single `History` object and sorted by timestamp. If necessary, this consolidation could be made more aggressive, for example keeping only one entry per program. I'm not sure if this is necessary though.

Some DIALS changes are needed following this PR (but not too many). One issue is that the `dials.import` history line is thrown away by `dials.index`. Another thing to do is that `ExperimentList.as_json` now has a couple of extra options: `history_as_integrated=False` and `history_as_scaled=False`. Setting these to `True` adds a flag to the history line written on that file save to indicate that this was an integration job, or scaling, or both. Being explicit about that will make it much easier to extract the relevant lines for the MTZ history, for dials/dials#2861.

With just this PR, a standard DIALS processing run for a single rotation data set ends up with `scaled.expt` containing a block like this

I will investigate what happens with more complex multi-experiment cases, like `dials.stills_process` and `xia2.multiplex`.
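For readers who want a feel for what a flagged history line could look like, here is a small stand-alone sketch. It does not call dxtbx. The `format_history_line` helper, the `|` separator, the field order and the flag spellings are all assumptions for illustration; only the ideas of a UTC timestamp and optional integrated/scaled flags come from the description above.

```python
from datetime import datetime, timezone


def format_history_line(program, version, integrated=False, scaled=False, now=None):
    """Build one history line with a UTC timestamp and optional flags.

    Hypothetical layout: "timestamp|program|version|flags".
    """
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    # Collect whichever flags were requested; both may be set at once.
    flags = [
        name
        for name, wanted in (("integrated", integrated), ("scaled", scaled))
        if wanted
    ]
    return "|".join([stamp, program, version, ",".join(flags)])


# A fixed time keeps the example reproducible.
fixed = datetime(2024, 5, 1, 10, 0, 0, tzinfo=timezone.utc)
print(format_history_line("dials.scale", "v0.0", scaled=True, now=fixed))
```

Keeping the flag in its own field means a later export step can filter on it without having to guess from the program name.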