In BackgroundJobQueue.run_pending_job, out-of-memory errors are not caught and handled (java.lang.OutOfMemoryError is not a StandardError, so Ruby's default rescue does not catch it), so the watchdog thread continues reporting that the job is running, but it will never complete because the actual job has been destroyed in memory. Restarting the backend will pick the job back up and likely hit the same error. If the job is cancelled, the service still needs to be restarted because jobs are no longer being processed.
In my testing, I was able to add handling for java.lang.OutOfMemoryError specifically (I actually rescued java.lang.VirtualMachineError instead, but that was as far as I was comfortable going at the time), which allowed us to properly fail the job and continue processing.
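To illustrate the rescue-hierarchy issue in plain Ruby: a bare `rescue StandardError` silently misses exceptions that descend directly from Exception. This sketch uses MRI's built-in NoMemoryError as a stand-in for JRuby's java.lang.OutOfMemoryError (which likewise is not a StandardError); `run_job_with_rescue` is a hypothetical helper, not ArchivesSpace code.

```ruby
# Sketch: StandardError does not catch NoMemoryError, so an explicit
# rescue clause is needed to mark the job failed and keep the queue alive.
# (In JRuby the analogous clause would rescue Java::JavaLang::VirtualMachineError.)
def run_job_with_rescue
  results = []
  begin
    raise NoMemoryError, "simulated OOM"   # stand-in for the real allocation failure
  rescue StandardError
    results << :standard   # never reached: NoMemoryError is not a StandardError
  rescue NoMemoryError
    results << :oom        # reached: the job can be failed here and processing continues
  end
  results
end
```

With only the StandardError clause, the exception would propagate and kill the worker thread, which is exactly the stuck-watchdog behavior described above.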
That said, there is a lot of discussion about whether out-of-memory errors should be "handled" at all, so I didn't want to submit a PR without further discussion.
@cposton I would like to close this issue. Our philosophy is that out-of-memory errors should not be handled by the ArchivesSpace application but should instead be managed by the institution installing ArchivesSpace. That being said, are you still having issues with this?
I understand and agree with the stated philosophy. We have since modified memory allocations and, as far as I am aware, have not had to deal with the issue since.