Update lock_module decorator to handle abandoned locks #1276

Merged: 14 commits, Jun 5, 2023


BradleySappington
Collaborator

@BradleySappington BradleySappington commented May 31, 2023

Update @lock_module decorator

When any protected module starts, it checks whether its lock file exists.

  • If the lock file DOES exist, it extracts the PID from the file and does the following:
      • If the PID is currently running, the command exits because the module is locked.
      • If the PID isn’t running:
          • It deletes the erroneous lock file.
          • It emails jwql@stsci.edu to let us know that something happened with the previous run (no action is needed on our end, but it will help us answer questions from scientists as to why their data is available).
          • It starts a fresh run.
      • If it is a bad lock file without a PID in it, we receive an email with instructions to investigate manually, and the process will NOT run. (This allows us to manually lock something for any reason we want by creating an empty lock file.) We also get emailed on every run until the bad lock file is removed, so we won’t forget about it.
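
The branching described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the `lock_path` and `notify` parameters, the `pid_is_running` helper, and the message wording are all assumptions made for the example, and the PID check relies on POSIX `os.kill` semantics.

```python
import functools
import os


def pid_is_running(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence and permissions
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def lock_module(lock_path, notify=print):
    """Hypothetical decorator factory; `lock_path` and `notify` are assumptions."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if os.path.exists(lock_path):
                with open(lock_path) as lock_file:
                    contents = lock_file.read().strip()
                if not contents.isdigit():
                    # Bad or manually created lock: report it and do not run.
                    notify(f"No PID found in {lock_path}; investigate manually")
                    return None
                if pid_is_running(int(contents)):
                    return None  # a live process holds the lock
                # Abandoned lock: remove it, report it, then run fresh.
                os.remove(lock_path)
                notify(f"Removed abandoned lock {lock_path} (PID {contents})")
            with open(lock_path, "w") as lock_file:
                lock_file.write(str(os.getpid()))
            try:
                return func(*args, **kwargs)
            finally:
                os.remove(lock_path)  # a completed run releases its lock
        return wrapper
    return decorator
```

In this sketch a module would be protected with something like `@lock_module("/path/to/module.lock", notify=send_email)`, where `send_email` is whatever notification function the project uses. Checking and then creating the file is not atomic, so two processes starting at the same instant could both proceed; the sketch ignores that case.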

@BradleySappington BradleySappington self-assigned this May 31, 2023
@BradleySappington BradleySappington linked an issue May 31, 2023 that may be closed by this pull request
@BradleySappington BradleySappington changed the title from "WIP: Clean locks" to "Update lock_module decorator to handle abandoned locks" Jun 1, 2023
@BradleySappington
Collaborator Author

Ready for review

Collaborator

@melanieclarke melanieclarke left a comment


I tested this locally with the notification address changed to my email, and it looks good. It behaved appropriately for all conditions I checked:

  • missing PID sent email
  • running after a killed process sent email
  • a successfully completed process removed its lock file
  • running again while previous process was running did nothing

However, the test in test_protect_module.py needs to be modified. It currently sends a "No PID found" email every time it runs.
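
One common way to keep a test like this from sending real mail is to hand the code under test a mock notifier and assert on the calls it receives. The sketch below is only an illustration of that pattern: `handle_lock_contents` and its `notify` argument are hypothetical stand-ins and are not taken from `test_protect_module.py` or the decorator itself.

```python
from unittest import mock


def handle_lock_contents(contents, notify):
    """Hypothetical helper: return True if the lock holds a PID, else notify."""
    if contents.strip().isdigit():
        return True
    notify("No PID found in lock file; investigate manually")
    return False


def test_missing_pid_notifies_via_mock():
    fake_notify = mock.Mock()  # stands in for the real email sender
    assert handle_lock_contents("", fake_notify) is False
    fake_notify.assert_called_once()


def test_valid_pid_sends_nothing():
    fake_notify = mock.Mock()
    assert handle_lock_contents("12345\n", fake_notify) is True
    fake_notify.assert_not_called()
```

If the notifier is imported at module level rather than passed in, `unittest.mock.patch` can replace it for the duration of the test instead.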

@BradleySappington
Collaborator Author

@melanieclarke thanks for the test reminder!

@BradleySappington
Collaborator Author

@melanieclarke - tests updated

@mfixstsci
Collaborator

Thanks for reviewing and approving @melanieclarke. @BradleySappington are you okay with me updating the branch and merging? Do you have any other features to add?

@BradleySappington
Collaborator Author

@mfixstsci - This is ready for merge

@mfixstsci mfixstsci merged commit 853ee7c into spacetelescope:develop Jun 5, 2023
6 checks passed
@BradleySappington BradleySappington deleted the clean_locks branch June 30, 2023 17:05
Development

Successfully merging this pull request may close these issues.

Automated check for failed jobs
3 participants