Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HICAT-994 Device servers using Python shared memory managers #243

Merged
merged 18 commits into from
Feb 9, 2022

Conversation

jamienoss
Copy link
Contributor

@jamienoss jamienoss commented Nov 30, 2021

Upstream PR for https://github.com/spacetelescope/hicat-package/pull/607.

See catkit/testbed/README.md

Changes to existing key features:

  • Infrastructure code has been factored out of Experiment into catkit.testbed.experiment.Testbed. This is mainly just the safety tests, however, the hicat derivation will also include ownership of all of the remote servers, i.e., the exception-handler, all device servers, the testbed state server, and the sim server.
  • Experiment.start() (that on the parent process) doesn't run the safety checks, these are now run on a child process by Experiment.Testbed.
  • Synchronization between processes, especially for halting upon a safety exception, is now implemented using events.
  • For concurrency to work, nothing should ideally be run on the parent process such that a call to Experiment.start() doesn't block the start of another experiment.
  • Experiment.start() doesn't wait and block until the experiment completes, it returns just as Process.start() does.
  • Experiment.join() has been added and mimics Process.join().
  • Experiment.run_experiment() (that run on the child process) monitors synchronization events from a dedicated monitor thread. This monitor thread will interrupt the main thread (that running Experiment.experiment()) upon certain events.
  • The outer script pattern is now:
with Testbed():
  experiment.start()
  experiment.join()
  • For concurrent usage, the out script pattern is now:
with Testbed():
  with experiment1:
    with experiment2:
      pass  # experment2.join() will then be called.
  • In the case that the desired start and wait/join order prevents the use of nested context managers, multiple try-finally blocks must be implemented such that all joins are executed in the correct finally statement.

@jamienoss jamienoss self-assigned this Nov 30, 2021
@jamienoss jamienoss force-pushed the feature/HICAT-994-device-server-using-mem-manager branch from 38095ad to 859fd71 Compare December 1, 2021 05:37
@@ -15,3 +18,24 @@ def apply_shape_to_both(self, dm1_shape, dm2_shape):
@abstractmethod
def apply_shape(self, dm_shape, dm_num):
"""Forms a command for a single DM, with zeros padded for the DM not in use."""

# # A "static" (non-autoproxy) proxy example. NOTE: Using this makes no difference to performance.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kept as an example.

@jamienoss jamienoss force-pushed the feature/HICAT-994-device-server-using-mem-manager branch 3 times, most recently from f39ad0f to 7c762ac Compare December 3, 2021 22:04
@ehpor ehpor self-requested a review February 4, 2022 16:33
ehpor
ehpor previously approved these changes Feb 4, 2022
Copy link
Collaborator

@ehpor ehpor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside from the (trivial) merge conflict during the merge of #240, I don't see much wrong with this PR. I approve this PR for merging, and will not do detailed notes since that's totally uninteresting for everyone.

Some general comments.

  • A lot of the code in this PR is very advanced, to the point that maintenance will likely (read: certainly) become an issue in the future. However, I feel that this is to some degree unavoidable with the current state of catkit.
  • The mutex patterns are still unclear to me. This is the main concern for me for future unexpected bugs (from a user perspective at least).
  • I really like the new Testbed class, making it do the safety checks rather than an experiment. Great separation of concerns.
  • Thanks for the slight refactor of the data log into a context manager.
  • Documentation is a bit sparse. Most internal comments are not accessible to a non-expert, and I presume even an expert that is reading it for the first time. However, catkit/testbed/README.md is a good start for people wanting to learn about the internals of catkit multiprocessing. Especially the example in "Hardware access full stack flow" is a good first step.
  • The signal/events for experiment lifetime and safety are very well structured. See also the arguments of Experiment.__init__() for a further description.

Comment on lines +11 to +14
STOP_EVENT = "catkit_stop_event"
FINISH_EVENT = "catkit_soft_stop_event"
SAFETY_EVENT = "catkit_safety_event"
SAFETY_BARRIER = "catkit_safety_barrier"
Copy link
Collaborator

@ehpor ehpor Feb 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No enum? No need to change this; just curious.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤣 Touché

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll punt this to another PR.

Signed-off-by: James Noss <jnoss@stsci.edu>
Signed-off-by: James Noss <jnoss@stsci.edu>
Signed-off-by: James Noss <jnoss@stsci.edu>
Signed-off-by: James Noss <jnoss@stsci.edu>
Signed-off-by: James Noss <jnoss@stsci.edu>
Signed-off-by: James Noss <jnoss@stsci.edu>
Signed-off-by: James Noss <jnoss@stsci.edu>
Signed-off-by: James Noss <jnoss@stsci.edu>
Signed-off-by: James Noss <jnoss@stsci.edu>
Signed-off-by: James Noss <jnoss@stsci.edu>
Signed-off-by: James Noss <jnoss@stsci.edu>
 * Allow for daemonic experiments.
 * Return iterator proxy for Camera.stream_exposures.
 * Add base Instrument proxy that doesn't mutex all client sides calls.
 * Remove possible deadlock that could result in devices not being closed.
 * Add NamedEvent
 * Add FINISH_EVENT to aid user with synchronizing when to finish/stop concurrent experiments.

Signed-off-by: James Noss <jnoss@stsci.edu>
Signed-off-by: James Noss <jnoss@stsci.edu>
Signed-off-by: James Noss <jnoss@stsci.edu>
…e we used to

Signed-off-by: James Noss <jnoss@stsci.edu>
Signed-off-by: James Noss <jnoss@stsci.edu>
@jamienoss jamienoss force-pushed the feature/HICAT-994-device-server-using-mem-manager branch from 47dc705 to e38a6da Compare February 4, 2022 22:35
@jamienoss jamienoss marked this pull request as ready for review February 4, 2022 22:35
Signed-off-by: James Noss <jnoss@stsci.edu>
@jamienoss jamienoss marked this pull request as draft February 4, 2022 22:39
@jamienoss jamienoss marked this pull request as ready for review February 4, 2022 22:39
@jamienoss jamienoss marked this pull request as draft February 5, 2022 21:54
@jamienoss jamienoss marked this pull request as ready for review February 5, 2022 21:54
Signed-off-by: James Noss <jnoss@stsci.edu>
@jamienoss jamienoss force-pushed the feature/HICAT-994-device-server-using-mem-manager branch from a26dc27 to b1532c2 Compare February 6, 2022 15:06
@jamienoss jamienoss marked this pull request as draft February 6, 2022 15:07
@jamienoss jamienoss marked this pull request as ready for review February 6, 2022 15:07
@jamienoss
Copy link
Contributor Author

@ehpor Just for the sake of it, would you mind re-approving this PR, please? Note that it's just the upstream hicat CI that has failed, as expected.

@ehpor ehpor self-requested a review February 9, 2022 18:21
Copy link
Collaborator

@ehpor ehpor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reapproved.

@jamienoss jamienoss merged commit 54cd177 into develop Feb 9, 2022
@jamienoss jamienoss deleted the feature/HICAT-994-device-server-using-mem-manager branch February 9, 2022 19:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants