Add support for Queueserver Agent #212
Thanks for the PR @whs92! I should be able to review and get this merged sometime this week. For the longer-term integration with bluesky-queueserver (or any external job queue), I am still thinking about a good design. Ax has some very informative docs on how they handle this in their framework, which you may find interesting: https://ax.dev/docs/orchestration and, as an example, this tutorial: https://ax.dev/docs/tutorials/automating/
Thanks @thopkins32. The notes on the Ax external job queue are really useful, particularly https://ax.dev/docs/orchestration/#orchestration-in-the-api (orchestration in the API). I think we can take these concepts on board once we have something like the class I have suggested in place. I'm not tied to anything I've written; it was just a first go.
thopkins32
left a comment
Some minor changes and one point of discussion about the evaluation function. Otherwise it looks good to me!
For the tutorial, can you move it to the docs/wip/ directory? Otherwise, the docs won't build. I can try to write a follow-up PR that will allow it to run with local services (queueserver, zmq, tiled, etc.). I would like to avoid an external container, if possible, to make the setup explicit in the tutorial notebook.
    # Variables used to keep track of the current optimization
    self.current_itteration = 0
    self.agent_suggestion_uid = None
Is it possible to get the BlueskyRun uid from the plan that is run by the queueserver? That way you won't have to search the Tiled Catalog for this custom agent_suggestion_uid from the metadata but can index the catalog directly.
Unfortunately, I don't think it's possible yet, in part because a plan might not be executed immediately. You would have to subscribe to changes and be told when a plan was running and what its uid is. @tacaswell do you know otherwise?
Another approach would be to specify the uid in the md dict, but then it wouldn't be guaranteed to be unique.
Adding my own key and then searching for it seemed like the easiest solution. Also, when I used this in a real application, I needed to submit multiple plans, keep track of them, and reconstruct the data that was acquired. Adding various keys was helpful for this.
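The tagging approach described above can be sketched roughly as follows. This is a minimal, stdlib-only illustration and not the code from this PR: `new_suggestion_md`, `find_runs`, and the in-memory `runs` list are hypothetical stand-ins, and the sketch assumes the metadata supplied with a plan ends up in the run's start document. With a real Tiled catalog, the list filter would be replaced by a metadata search on the same key.

```python
import uuid


def new_suggestion_md(iteration: int) -> dict:
    """Build the metadata dict sent along with a submitted plan (hypothetical helper)."""
    return {"agent_suggestion_uid": str(uuid.uuid4()), "iteration": iteration}


def find_runs(runs: list[dict], suggestion_uid: str) -> list[dict]:
    """Return the start documents that carry the given tag.

    `runs` is a plain list standing in for a catalog of start documents.
    """
    return [start for start in runs if start.get("agent_suggestion_uid") == suggestion_uid]


# Simulate three plans submitted for one suggestion, plus one unrelated run.
md = new_suggestion_md(iteration=0)
runs = [{"uid": f"run-{i}", **md} for i in range(3)]
runs.append({"uid": "other-run", "plan_name": "count"})

matching = find_runs(runs, md["agent_suggestion_uid"])
print([start["uid"] for start in matching])
```

Because the tag is generated by the agent before submission, it is known up front, whereas the run uid only exists once the plan actually starts.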
I added another comment where I think you can use the start document to get the uid if you prefer doing that. This can eliminate the search in your tutorial's evaluation function.
But otherwise, I don't see a problem with allowing arbitrary identifiers in the blop.protocols.EvaluationFunction protocol. I'm fine with both ways.
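As a rough illustration of what "arbitrary identifiers" could mean for an evaluation function, here is a hedged sketch. `EvaluationFunctionSketch` is a hypothetical stand-in and not the actual definition of `blop.protocols.EvaluationFunction`; only the keyword names `uid` and `suggestions` are taken from the call shown in this PR, while the argument types and the shape of the returned outcomes are assumptions.

```python
from typing import Any, Protocol


class EvaluationFunctionSketch(Protocol):
    """Hypothetical protocol; the real blop protocol may differ."""

    def __call__(self, uid: str, suggestions: list[dict[str, Any]]) -> list[dict[str, Any]]: ...


def evaluate_by_tag(uid: str, suggestions: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Treat `uid` as an opaque identifier (a run uid or a custom tag).

    A real implementation would look up data using `uid`; here the
    identifier is simply echoed into each (assumed) outcome dict.
    """
    return [{"identifier": uid, "suggestion": s} for s in suggestions]


# The function satisfies the sketched protocol structurally.
evaluate: EvaluationFunctionSketch = evaluate_by_tag
outcomes = evaluate(uid="my-custom-tag", suggestions=[{"x": 1.0}, {"x": 2.0}])
print(outcomes)
```

The point of the sketch is only that the caller decides what the identifier means; the evaluation function does not need to know whether it is a run uid or a custom metadata key.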
Co-authored-by: Thomas Hopkins <thomashopkins000@gmail.com>
    def _stop_doc_callback(self, start_doc, stop_doc):
        """
        In here we can decide whether our experiment requested has completed

        If it has completed, we can digest the data from it and move on to the next point.
        """

        if self._listen_to_events:
            # Mark the current acquisition as finished

            logger.info("A stop document has been received, evaluating")

            # Evaluate it with the evaluation function
            outcomes = self.optimization_problem.evaluation_function(uid=self.agent_suggestion_uid, suggestions=self.trials)
start_doc should have a uid if you prefer to use that over searching. And this is called when the stop doc is generated so it should exist in the database at this point.
The problem arises if this callback is called for a different start doc; then we get the wrong uid.
If the queue is not empty, we'll get a few start and stop docs before the one we want. Without being able to dictate or inspect the uid of a plan submitted to the QS, I think there is no way around searching.
Ah I see, so the callback is registered to the queue in general and not the specific plan you sent. Apologies for the confusion, I am not familiar with the queueserver api.
Yep, that's it. I'll add something to the tutorial to clarify this, since it's not obvious.
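To make the issue in this thread concrete, here is a minimal sketch of a stop-document handler that ignores runs belonging to other queue items. It is not the code from this PR: `StopDocFilter` and `on_stop` are hypothetical names, and the sketch assumes the custom `agent_suggestion_uid` key submitted in the plan metadata is present in the start document.

```python
class StopDocFilter:
    """Hypothetical sketch: react only to the run tagged with the expected identifier."""

    def __init__(self, expected_tag: str):
        self.expected_tag = expected_tag
        self.matched_run_uid = None

    def on_stop(self, start_doc: dict, stop_doc: dict) -> bool:
        # The callback fires for every run coming off the queue,
        # so check the custom tag before trusting this start document.
        if start_doc.get("agent_suggestion_uid") != self.expected_tag:
            return False
        # Only now is start_doc["uid"] known to belong to our plan.
        self.matched_run_uid = start_doc["uid"]
        return True


handler = StopDocFilter(expected_tag="tag-42")

# Two runs queued ahead of ours finish first and are ignored.
results = [
    handler.on_stop({"uid": "a"}, {"run_start": "a"}),
    handler.on_stop({"uid": "b", "agent_suggestion_uid": "tag-41"}, {"run_start": "b"}),
    handler.on_stop({"uid": "c", "agent_suggestion_uid": "tag-42"}, {"run_start": "c"}),
]
print(results, handler.matched_run_uid)
```

Under that assumption, the custom key is still needed to recognise the right run, but once a start document matches, its uid could be used to index the catalog directly instead of searching.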
thopkins32
left a comment
LGTM. Thanks @whs92 !
In this merge request I try to address #110 by:
EvaluationFunction
What's missing is a link to a set of containers to start the required services. I think the instructions will probably be sufficient for people already using the queueserver, but I am interested to hear what others think.
I have tried to keep to the structure set out in v0.9.0 of blop where possible.