backend/processing/segmenter: Simplify deployment/usage for batch processing #378

Open
6 tasks
Tracked by #163
ethanjli opened this issue Mar 12, 2024 · 3 comments

@ethanjli
Member

ethanjli commented Mar 12, 2024

This is a tracking issue for our project to make the PlanktoScope segmenter easier to deploy/use for batch processing outside the PlanktoScope's Raspberry Pi.

Motivation

Currently, the segmenter is implemented as a worker/server which requires an MQTT broker, as well as something to generate an MQTT command for the segmenter. As a result, deploying the PlanktoScope segmenter via https://github.com/PlanktoScope/pallet-segmenter is somewhat complicated, since the deployment has to bring up both an MQTT broker and a GUI to generate that command. This complexity is unnecessary if the user just wants to process some datasets without a GUI. It may also make deployment on HPC clusters more challenging, because at a minimum port 1883 must be available for MQTT communication, even though batch processing does not inherently need it. If we could instead initiate batch processing by launching the segmenter differently (e.g. with different command-line arguments), we could avoid these complexities and constraints in a common use-case: running the segmenter headlessly outside of the PlanktoScope's Raspberry Pi.

The relevance of this use-case is reflected by what Katie Crider & Margaret Mulholland are trying to do with PlanktoScopes for HABs monitoring (they want to batch-process datasets on an HPC cluster), and the issue of unnecessary complexity with MQTT for running the segmenter on other computers is validated by the changes Salima Rafai made in her version of the segmenter to run as a Python script without depending on MQTT.

Goals

  • Make it possible to invoke the segmenter for batch processing via command-line arguments, without any MQTT involved
  • Make it possible to run the segmenter for batch processing in a Jupyter Notebook
  • Enable multiple instances of the segmenter to be launched for batch-processing different directories in parallel
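
As a rough illustration of the last goal, and not the segmenter's actual API, processing several dataset directories in parallel could look something like the sketch below. `segment_directory` is a hypothetical stand-in for whatever function a refactored segmenter would expose; here it only counts JPEG files so that the example runs on its own. The `executor_cls` parameter is likewise an assumption, added so that the same code can run with threads instead of processes.

```python
from concurrent.futures import Executor, ProcessPoolExecutor
from pathlib import Path


def segment_directory(dataset_dir: str) -> int:
    """Hypothetical placeholder for the real segmentation entry point.

    It only counts the JPEG files in the directory; no MQTT is involved.
    """
    return sum(1 for _ in Path(dataset_dir).glob("*.jpg"))


def segment_in_parallel(
    dataset_dirs: list[str],
    max_workers: int = 2,
    executor_cls: type[Executor] = ProcessPoolExecutor,
) -> dict[str, int]:
    """Run segment_directory on each directory using a pool of workers."""
    with executor_cls(max_workers=max_workers) as pool:
        # map() returns results in the same order as dataset_dirs.
        counts = pool.map(segment_directory, dataset_dirs)
        return dict(zip(dataset_dirs, counts))
```

When called from a script with the default process pool, the call would normally sit under an `if __name__ == "__main__":` guard, since some platforms start worker processes by re-importing the main module.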

Steps

  • Refactor the segmenter to separate MQTT from image-processing functionality
  • Make a nice Python API for image-processing functionality
  • Refactor the segmenter so that it can be launched as either a worker/server or a batch-processing command (i.e. make a command-line interface for the segmenter)
  • Ensure that the segmenter won't get blocked if nothing is receiving its object preview MJPEG stream (or if it does get blocked, fix that!)
  • Make it possible to install & launch the segmenter via pipx? (though maybe system libraries would still be needed for numpy, opencv, etc.)
  • ???
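
To make the refactoring steps above a bit more concrete, here is a minimal sketch of a command-line interface that can start either mode. Everything in it is an assumption made for illustration: the `serve` and `batch` subcommand names, the option names, and the `process_dataset` function are hypothetical and do not exist in the current segmenter. The point is only that the batch path never touches MQTT.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with hypothetical `serve` and `batch` subcommands."""
    parser = argparse.ArgumentParser(prog="segmenter")
    subcommands = parser.add_subparsers(dest="command", required=True)

    # Worker/server mode: only this mode would need an MQTT broker.
    serve = subcommands.add_parser("serve", help="run as an MQTT worker")
    serve.add_argument("--mqtt-host", default="localhost")
    serve.add_argument("--mqtt-port", type=int, default=1883)

    # Batch mode: process the given directories and exit, with no MQTT.
    batch = subcommands.add_parser("batch", help="process datasets and exit")
    batch.add_argument("dataset_dirs", nargs="+")
    batch.add_argument("--output-dir", default="objects")

    return parser


def process_dataset(dataset_dir: str, output_dir: str) -> str:
    """Hypothetical image-processing entry point with no MQTT dependency."""
    return f"{dataset_dir} -> {output_dir}"


def main(argv: Optional[list[str]] = None) -> list[str]:
    """Dispatch to batch processing or (not sketched here) worker mode."""
    args = build_parser().parse_args(argv)
    if args.command == "batch":
        return [process_dataset(d, args.output_dir) for d in args.dataset_dirs]
    # Worker mode would import and start the MQTT client here, so that
    # batch users never need an MQTT library or a free port 1883.
    raise NotImplementedError("worker/server mode is omitted from this sketch")
```

With a layout like this, a batch run might be invoked as `segmenter batch /path/to/dataset1 /path/to/dataset2 --output-dir out`, and the same `process_dataset` function could be imported directly from a Jupyter Notebook.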

Unresolved Questions

  • Maybe it would also be useful to launch the segmenter to handle HTTP requests instead of MQTT commands? I haven't yet seen any concrete use-case (or any request for help with such a use-case) where this would be the simplest solution though, so this doesn't seem like an important thing to do yet.
@ethanjli ethanjli added this to the Backlog milestone Mar 12, 2024
@ethanjli ethanjli changed the title backend/processing/segmenter: Enable launching as a script for batch processing backend/processing/segmenter: Simply deployment/usage for batch processing Mar 12, 2024
@ethanjli ethanjli self-assigned this Mar 12, 2024
@ethanjli
Member Author

This issue was discussed at the 2024-03-14 software meeting. We decided by consensus that we will not prioritize this item for now; FairScope might assign an incoming intern to work on this, depending on the interests of the interns.

@tstilwel

Would it be possible to publish the Python script without the MQTT dependencies mentioned above, for those not in the PlanktoScope Slack channel? Thanks!

@ethanjli
Member Author

ethanjli commented Mar 18, 2024

Here is what was posted on the PlanktoScope Slack:

Note that I have not attempted to review the code in these attached files, and I have not attempted to run these scripts. From the descriptions, it sounds like these versions replace MQTT with a simple Python GUI.

If anyone else reading this discussion needs a way to run the segmenter without MQTT and the above files don't work for you, please add a reply comment describing your intended use-case! We use feedback from people running PlanktoScopes to help us (re)prioritize what we need to work on.

@ethanjli ethanjli changed the title backend/processing/segmenter: Simply deployment/usage for batch processing backend/processing/segmenter: Simplify deployment/usage for batch processing Mar 19, 2024