Process and agent state management #64
Comments
While re-reading this I noticed that I should put this in writing too: I guess if we could live with saying that it is the user's responsibility to always have exactly one process of each type running, we could skip the .pid file part and go straight to the internal state database.
My suggestion would be to have the processes launched separately by systemd anyway. That way the system can ensure that things are restarted, … If you want to use SysV-init (or whatever) instead, then let that create the pid files. A status table makes sense, though here are a few thoughts:
A keep-alive would be perfect for #48.
Continuing the keep-alive discussion in #76 to keep this tidy.
While looking at #53 and thinking about how this could be implemented the best way, I've noticed a few issues:
Completely separate processes
With #52 we gained the ability to launch the capture, schedule and ingest processes separately. I think that this is an important feature (the ability to run as independent services on the system), but it makes it hard to know which processes are actually running.
My proposal for this is to create .pid files for each service (even if we use `run_all`), which have to be checked before startup: if the file exists, check whether the process with that pid is still alive. If so, exit; otherwise start the process. Obviously we should only ever have one process of each type, otherwise we will place ourselves in a special kind of hell.
Agent state management
This is a tricky one, as it is tied to the constraints of capture agent states in Opencast: we are only ever able to define one state.
But what if we start recording while an ingest process is still working, and the ingest process then finishes and sets the state to `idle` although we have not finished recording yet? This is a possible scenario with tightly clocked events and a slow uplink.
My proposal for this is to implement an internal state table for each process (`working` or `idle`), which will then be used to set the agent state according to a priority list:
1. `offline`
2. `capturing`
3. `uploading`
4. `shutting_down` (not used anywhere yet)
5. `idle`
Say every process but the capture process is `idle`, then we would set the agent state to `capturing`. Now our ingest process starts `working`. We do not change the agent state to `uploading`, because `capturing` supersedes it. As soon as the capture process is `idle` again, the ingest process is the first `working` process in the priority list, so the agent state is now `uploading`, etc.
With respect to my comment in #53, the behaviour for the scheduler process needs to be special: if the scheduler process exists, the internal state is `idle`. If there is no scheduler process, the internal state is `working` (okay, that is a bad name. Suggestions?) and we are absolutely `offline`. If any other process took precedence over this, it would give the illusion that the agent is ready to fetch new scheduled events.