- Remove ETS-specific path cleaning code (#35)
- Module paths are now prepended to `sys.path` instead of appended (#26)
- `JobMonitor` is now more resilient to SMTP settings problems (#34)
- Improve exception handling when trying to send back job results.
- Heartbeats start before fetching input to prevent invalid crash detection
- Test against 3.4 instead of 3.3 on Travis
- Updated copyright notices to say 2014
- Add INFO-level logging messages about how jobs are running
- Switch to using `importlib` instead of using
- Job name is no longer set using DRMAA native specification and instead uses
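
The path and import changes listed above can be illustrated roughly as follows. This is a minimal sketch, not GridMap's actual code: `import_from_dir` is a hypothetical helper, and the demo module is written to a temporary directory purely for illustration.

```python
import importlib
import os
import sys
import tempfile


def import_from_dir(module_name, module_dir):
    """Import module_name, searching module_dir before any other path entry.

    Hypothetical helper for illustration only.
    """
    if module_dir not in sys.path:
        # Prepending (rather than appending) gives this directory priority
        # over entries that were already on sys.path.
        sys.path.insert(0, module_dir)
    # Make sure the import system notices files created after startup.
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


# Demo: create a throwaway module and import it by name.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "gridmap_demo_mod.py"), "w") as handle:
    handle.write("ANSWER = 42\n")

demo_module = import_from_dir("gridmap_demo_mod", demo_dir)
print(demo_module.ANSWER)
```

With appending, a same-named module earlier on `sys.path` would shadow the one the job actually needs. Prepending avoids that at the cost of possibly shadowing other modules instead.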
Just fixed a couple minor issues.
- `exception` is now properly set as the cause of death when a job encounters an exception.
- Fixed a potential memory leak in the `qmaster` process caused by not cleaning up job info as recommended in the DRMAA Python documentation.
- Changed default to `None` to be more Pythonic, instead of -1 like it was before.
With the previous release, things could still go wrong if a process died at just the wrong moment while we were trying to get its status, so I've added some exception handling to take care of that. I've also:
- Added a
- Fixed an issue where log files weren't being attached to error reports.
- Changed the wording of some logging messages.
This release mostly features greatly improved reliability of stalled job detection, but also includes some refactoring. Here's the complete list:
- CPU load calculations used to determine if a job is stalled now include all of the children of a process. Before, if a parent process was sleeping and its children were doing all the work, the job would be incorrectly detected as stalled and resubmitted. This was particularly problematic for SKLL.
- CPU usage and memory histories are now reset when a job is resubmitted. This means error emails will contain more sensible graphs for resubmitted jobs.
- Now raise a `JobException` if we give up on a job instead of ending up in a bad state.
- `SEND_ERROR_MAILS` environment variable to
- Removed deprecated `pg_map` function. It was replaced by
- Removed `runner` module from generated API documentation, because no one should really need to use it directly.
- Added missing
- Added a bunch more unit tests.
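
To make the stalled-job change above concrete, here is a minimal sketch of summing CPU usage over a process and all of its descendants. It is not GridMap's implementation: `total_cpu_percent` and `FakeProcess` are hypothetical names, and `FakeProcess` only mimics the two `psutil.Process` methods the sketch relies on (`cpu_percent()` and `children(recursive=True)`) so that the example runs without psutil installed.

```python
def total_cpu_percent(process):
    """Sum the CPU usage of a process and all of its descendants.

    `process` is assumed to behave like a psutil.Process, i.e. to offer
    cpu_percent() and children(recursive=True).
    """
    total = process.cpu_percent()
    for child in process.children(recursive=True):
        total += child.cpu_percent()
    return total


class FakeProcess:
    """Hypothetical stand-in for psutil.Process, for illustration only."""

    def __init__(self, cpu, children=()):
        self._cpu = cpu
        self._children = list(children)

    def cpu_percent(self):
        return self._cpu

    def children(self, recursive=False):
        found = list(self._children)
        if recursive:
            # Walk the whole subtree, not just direct children.
            for child in self._children:
                found.extend(child.children(recursive=True))
        return found


# A sleeping parent whose children (and one grandchild) do all the work.
workers = [FakeProcess(80.0), FakeProcess(75.0, [FakeProcess(10.0)])]
parent = FakeProcess(0.0, workers)

print(parent.cpu_percent())       # the parent alone looks idle
print(total_cpu_percent(parent))  # the whole tree clearly is not
```

Judged by the parent alone, this job would look stalled. Judged by the whole tree, it is busy. With real processes, a child can exit between being listed and being queried, so production code would also need to handle that case.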
Bug-fix release. Changes are:
- `JobMonitor` is now a context manager so that all jobs get killed when an exception occurs.
- All jobs are now killed if a single job encounters an exception.
- Jobs no longer pass back exceptions as strings, where they go totally unnoticed.
- Cleaned up debug output a bit.
- Much prettier tracebacks for job exceptions.
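
The context-manager behaviour described above can be sketched as follows. `MiniMonitor` is a hypothetical stand-in, not the real `JobMonitor`: it only records which jobs it would kill, where a real monitor would actually terminate them.

```python
class MiniMonitor:
    """Hypothetical stand-in for a job monitor, for illustration only."""

    def __init__(self):
        self.running_jobs = set()
        self.killed_jobs = []

    def submit(self, job_id):
        self.running_jobs.add(job_id)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # __exit__ runs on normal exit and on exceptions alike,
        # so no job is left running if something goes wrong.
        for job_id in sorted(self.running_jobs):
            # A real monitor would terminate the job here.
            self.killed_jobs.append(job_id)
        self.running_jobs.clear()
        return False  # do not swallow the exception


monitor = MiniMonitor()
try:
    with monitor:
        monitor.submit("job-1")
        monitor.submit("job-2")
        raise RuntimeError("a job failed")
except RuntimeError as error:
    print("propagated:", error)

print(monitor.killed_jobs)
```

Returning `False` from `__exit__` lets the original exception propagate to the caller after cleanup, rather than going unnoticed.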
Just a minor bugfix release. Changes are:
- Switched to using official version of drmaa-python, because it is now up-to-date on PyPI.
- Added .gitattributes file to keep line endings normalized.
- Fixed issue where jobs that exceeded the resubmission limit would send error reports indefinitely.
- We now check to see if the DRMAA C library was imported correctly and switch to local mode if it wasn't, instead of just crashing (#19).
- Switch to using psutil for process CPU and memory stats (#18).
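
The fallback to local mode when the DRMAA C library cannot be loaded might look roughly like this. It is a sketch under assumptions: `DRMAA_PRESENT` and `choose_mode` are hypothetical names, and the set of exception types caught is an assumption about how a failed `drmaa` import surfaces.

```python
try:
    import drmaa  # noqa: F401  (only checking that it loads)
    DRMAA_PRESENT = True
except (ImportError, OSError, RuntimeError):
    # Assumption: a missing package or an unloadable C library
    # shows up as one of these exception types at import time.
    DRMAA_PRESENT = False


def choose_mode(force_local=False):
    """Return 'grid' only when DRMAA is usable, otherwise 'local'.

    Hypothetical helper for illustration only.
    """
    if force_local or not DRMAA_PRESENT:
        return "local"
    return "grid"


print(choose_mode())
```

Checking once at import time means every later submission can consult a simple flag instead of crashing when it first touches the library.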