Add support for Queueserver Agent #212

Merged
thopkins32 merged 28 commits into bluesky:main from whs92:whs_qserver_zmq_v0.9.0
Dec 18, 2025

@whs92
Member

@whs92 whs92 commented Dec 14, 2025

In this merge request I try to address #110 by:

  • Adding an agent which
    • Is based on the Ax Agent class but emits JSON to a Queueserver rather than messages to a RunEngine (RE)
    • Is connected to a Tiled server through the EvaluationFunction
    • Observes documents through a ZMQ buffer to work out whether the requested plans have finished
  • Adding a tutorial document which describes:
    • Changes required to a Queueserver Environment
    • Running plans remotely and observing the results using the agent
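
To make the "emits JSON to a Queueserver" bullet concrete, here is a minimal sketch of turning one suggestion into a JSON plan item tagged with a tracking key. The dictionary layout follows the queue item shape used by bluesky-queueserver (`item_type`, `name`, `args`, `kwargs`), but the plan name `agent_move_and_count`, the helper `build_plan_item`, and the assumption that the plan accepts an `md` keyword are illustrative and not taken from this PR.

```python
import json
import uuid


def build_plan_item(suggestion, plan_name="agent_move_and_count"):
    """Return (tracking uid, JSON string) for one suggested point.

    `plan_name` is a hypothetical plan; it is assumed to accept the
    suggested values as keyword arguments plus an `md` dict.
    """
    suggestion_uid = str(uuid.uuid4())
    item = {
        "item_type": "plan",
        "name": plan_name,
        "args": [],
        "kwargs": {
            **suggestion,
            # Custom metadata key used later to find the resulting run.
            "md": {"agent_suggestion_uid": suggestion_uid},
        },
    }
    return suggestion_uid, json.dumps(item)


uid, payload = build_plan_item({"motor_x": 1.5, "motor_y": -0.2})
print(payload)
```

The returned uid is what the agent would remember while it waits for the matching run to finish.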

What's missing is a link to a set of containers to start the required services. I think the instructions will probably be sufficient for people already using the queueserver, but I am interested to hear what others think.

I have tried to keep to the structure set out in v0.9.0 of blop where possible.

@thopkins32
Collaborator

Thanks for the PR @whs92 ! I should be able to review and get this merged sometime this week.

For the longer term integration with bluesky-queueserver (or any external job queue), I am still thinking about a good design. Ax has some very informative docs related to how they do this in their framework which you may be interested in learning about: https://ax.dev/docs/orchestration

And also this tutorial as an example: https://ax.dev/docs/tutorials/automating/

@whs92
Member Author

whs92 commented Dec 15, 2025

Thanks @thopkins32. The notes on the Ax external job queue are really useful, particularly: https://ax.dev/docs/orchestration/#orchestration-in-the-api

I think we can take these concepts on board once we have something like the class I have suggested in place. I'm not tied to anything I've written; it was just a first go.

Collaborator

@thopkins32 thopkins32 left a comment

Some minor changes and one point of discussion about the evaluation function. Otherwise it looks good to me!

For the tutorial, can you move it to the docs/wip/ directory? Otherwise, the docs won't build. I can try to write a follow-up PR that will allow it to run with local services (queueserver, zmq, tiled, etc.). I would like to avoid an external container, if possible, to make the setup explicit in the tutorial notebook.

Comment thread docs/source/tutorials/qserver-experiment.md Outdated
Comment thread docs/source/tutorials/qserver-experiment.md Outdated
Comment thread src/blop/ax/qserver_agent.py Outdated

# Variables used to keep track of the current optimization
self.current_itteration = 0
self.agent_suggestion_uid = None
Collaborator

Is it possible to get the BlueskyRun uid from the plan that is run by the queueserver? That way you won't have to search the Tiled Catalog for this custom agent_suggestion_uid from the metadata but can index the catalog directly.

Member Author

Unfortunately, I think it's not possible yet, in part because you might not execute a plan immediately. You would have to subscribe to changes and be told when a plan was running and what its uid is. @tacaswell do you know otherwise?

Another approach would be to specify the uid in the md dict, but then it wouldn't be guaranteed to be unique.

Adding my own key and then searching for it seemed like the easiest solution. Also, when I used this in a real application I needed to submit multiple plans then keep track of them and reconstruct data that was acquired. Adding various keys was helpful for this.
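
The search-by-key approach described here can be sketched without any running services. The example below uses a plain list of start documents as a stand-in for the Tiled catalog; `find_run_by_suggestion` and the sample documents are hypothetical, and against a real catalog the lookup would presumably be a metadata query rather than a loop.

```python
def find_run_by_suggestion(start_docs, suggestion_uid):
    """Return the start document tagged with `suggestion_uid`, or None."""
    matches = [
        doc for doc in start_docs
        if doc.get("agent_suggestion_uid") == suggestion_uid
    ]
    # A user-supplied key is not guaranteed unique, so check explicitly.
    if len(matches) > 1:
        raise ValueError(f"{len(matches)} runs share key {suggestion_uid!r}")
    return matches[0] if matches else None


# Stand-in for catalog metadata: one unrelated run, two agent-tagged runs.
start_docs = [
    {"uid": "run-a", "plan_name": "count"},
    {"uid": "run-b", "plan_name": "scan", "agent_suggestion_uid": "s-1"},
    {"uid": "run-c", "plan_name": "scan", "agent_suggestion_uid": "s-2"},
]

run = find_run_by_suggestion(start_docs, "s-2")
print(run["uid"])  # run-c
```

Because the key is chosen by the agent, the same pattern extends to extra keys for grouping several submitted plans, which is the bookkeeping use mentioned above.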

Collaborator

I added another comment where I think you can use the start document to get the uid if you prefer doing that. This can eliminate the search in your tutorial's evaluation function.

But otherwise, I don't see a problem with allowing arbitrary identifiers in the blop.protocols.EvaluationFunction protocol. I'm fine with both ways.

whs92 and others added 3 commits December 16, 2025 00:08
Co-authored-by: Thomas Hopkins <thomashopkins000@gmail.com>
Co-authored-by: Thomas Hopkins <thomashopkins000@gmail.com>
Co-authored-by: Thomas Hopkins <thomashopkins000@gmail.com>
Comment on lines +174 to +187
def _stop_doc_callback(self, start_doc, stop_doc):
    """
    In here we can decide whether our experiment requested has completed

    If it has completed, we can digest the data from it and move on to the next point.
    """

    if self._listen_to_events:
        # Mark the current acquisition as finished

        logger.info("A stop document has been received, evaluating")

        # Evaluate it with the evaluation function
        outcomes = self.optimization_problem.evaluation_function(uid=self.agent_suggestion_uid, suggestions=self.trials)
Collaborator

start_doc should have a uid if you prefer to use that over searching. And this is called when the stop doc is generated so it should exist in the database at this point.

Member Author

@whs92 whs92 Dec 16, 2025

The problem arises if this callback is called for a different start doc, then we get the wrong uid.

If the queue is not empty, we'll get a few start and stop docs before the one we want. Without being able to dictate or inspect the uid of a plan submitted to the QS, I think there is no way around searching.
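
One way to guard against this is to have the callback ignore every run that does not carry the agent's own key. The sketch below is a standalone illustration and not the code in this PR: `StopDocFilter`, `expect`, and the `evaluate` callable are hypothetical names, and it assumes the tracking key ends up in the start document because it was passed through `md`.

```python
class StopDocFilter:
    """Evaluate only the run that matches the pending suggestion."""

    def __init__(self, evaluate):
        self.evaluate = evaluate          # hypothetical evaluation callable
        self.agent_suggestion_uid = None  # key of the plan we are waiting for
        self.outcomes = []

    def expect(self, suggestion_uid):
        self.agent_suggestion_uid = suggestion_uid

    def stop_doc_callback(self, start_doc, stop_doc):
        """Return True if this run was ours and has been evaluated."""
        if self.agent_suggestion_uid is None:
            return False  # nothing pending
        if start_doc.get("agent_suggestion_uid") != self.agent_suggestion_uid:
            return False  # another plan from the shared queue
        if stop_doc.get("exit_status") != "success":
            return False  # our plan ran but did not finish cleanly
        self.outcomes.append(self.evaluate(uid=self.agent_suggestion_uid))
        self.agent_suggestion_uid = None
        return True


watcher = StopDocFilter(evaluate=lambda uid: {"uid": uid, "objective": 0.0})
watcher.expect("s-2")

# A plan queued by someone else finishes first and is skipped.
print(watcher.stop_doc_callback({"uid": "run-a"}, {"exit_status": "success"}))
# The agent's own plan finishes and is evaluated.
print(watcher.stop_doc_callback(
    {"uid": "run-c", "agent_suggestion_uid": "s-2"}, {"exit_status": "success"}
))
```

The start document's own uid is still available inside the callback once the match succeeds, so the evaluation step could use it to index the catalog directly instead of searching again.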

Collaborator

Ah I see, so the callback is registered to the queue in general and not to the specific plan you sent. Apologies for the confusion; I am not familiar with the queueserver API.

Member Author

Yep, that's it. I'll add something to clarify this in the tutorial. It's not obvious.


@whs92
Member Author

whs92 commented Dec 18, 2025

I've added the docs to the wip/docs directory, and now they build. I have opened some follow-up issues (#214, #213).

Collaborator

@thopkins32 thopkins32 left a comment

LGTM. Thanks @whs92 !

@thopkins32 thopkins32 merged commit 8ee29a3 into bluesky:main Dec 18, 2025
9 checks passed
2 participants