GridMap 0.13.0

@dan-blanchard dan-blanchard released this Oct 8, 2014 · 36 commits to master since this release

Fixes:

  • Remove ETS-specific path cleaning code (#35)
  • Module paths are now prepended to sys.path instead of appended (#26)
  • Made JobMonitor more resilient to SMTP settings problems (#34)
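
The difference between prepending and appending to sys.path (#26) can be illustrated with a minimal sketch. The helper name `prepend_module_path` and the example paths are hypothetical and are not taken from GridMap's source; the point is only that an entry at index 0 is searched first, so it wins over an installed module of the same name.

```python
import sys


def prepend_module_path(path, search_path=None):
    """Move or add `path` to the front of a module search path.

    `search_path` defaults to sys.path; a plain list can be passed
    instead so the behaviour can be inspected without side effects.
    """
    search_path = sys.path if search_path is None else search_path
    # Drop an existing entry so the path is not listed twice.
    if path in search_path:
        search_path.remove(path)
    # Index 0 is searched first, so this path now takes precedence.
    search_path.insert(0, path)
    return search_path


# Hypothetical paths, used only for illustration.
demo_path = ["/usr/lib/python3/site-packages"]
prepend_module_path("/home/user/project", demo_path)
print(demo_path)
```

With append, a same-named module found earlier on the path would shadow the job's own module; with prepend, the job's module is found first.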

Improvements:

  • Improve exception handling when trying to send back job results.
  • Heartbeats start before fetching input to prevent invalid crash detection
  • Test against 3.4 instead of 3.3 on Travis
  • Updated copyright notices to say 2014
  • Add INFO-level logging messages about how jobs are running
  • Switch to using importlib instead of using __import__
  • Job name is no longer set using DRMAA native specification and instead uses JobTemplate.jobName.
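
As a rough illustration of why importlib is more convenient than __import__ for dotted module names, the sketch below loads a function by name. The helper `load_function` is hypothetical and is not GridMap's actual code; the standard-library module `json.decoder` is used only as a stand-in.

```python
import importlib


def load_function(module_name, function_name):
    """Import `module_name` and return the attribute `function_name`."""
    # import_module returns the submodule itself, even for dotted names.
    module = importlib.import_module(module_name)
    return getattr(module, function_name)


# importlib gives back the module that was asked for.
submodule = importlib.import_module("json.decoder")
print(submodule.__name__)

# __import__ with a dotted name returns the top-level package instead,
# so the caller would have to walk down to the submodule by hand.
top_level = __import__("json.decoder")
print(top_level.__name__)

decoder_class = load_function("json.decoder", "JSONDecoder")
print(decoder_class().decode("[1, 2]"))
```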

GridMap 0.12.5

@dan-blanchard dan-blanchard released this Aug 4, 2014 · 56 commits to master since this release

Fix issue where _process_jobs_locally would not work with max_processes > 1

Version 0.12.4

@dan-blanchard dan-blanchard released this May 12, 2014 · 59 commits to master since this release

  • Added max_processes argument to grid_map function for consistency.

Version 0.12.3

@dan-blanchard dan-blanchard released this Feb 25, 2014 · 62 commits to master since this release

Fixes local mode fallback when DRMAA Python isn't available.

Version 0.12.2

@dan-blanchard dan-blanchard released this Feb 25, 2014 · 67 commits to master since this release

Just fixed a couple minor issues.

  • The exception is now properly set as the cause of death when a job encounters an exception.
  • Fixed a potential memory leak in the qmaster process caused by not cleaning up job info as recommended in the DRMAA Python documentation.
  • Changed default session_id in JobMonitor to None to be more Pythonic, instead of -1 like it was before.

Version 0.12.1

@dan-blanchard dan-blanchard released this Dec 19, 2013 · 81 commits to master since this release

With the previous release, things could still go wrong if a process died at just the wrong moment while we were trying to get its status, so I've added some exception handling to take care of that. I've also:

  • Added a --version option for gridmap_web
  • Fixed an issue where log files weren't being attached to error reports.
  • Changed the wording of some logging messages.

Version 0.12.0

@dan-blanchard dan-blanchard released this Dec 13, 2013 · 86 commits to master since this release

This release mostly features greatly improved reliability of stalled job detection, but also includes some refactoring. Here's the complete list:

  • CPU load calculations used to determine whether a job is stalled now include all of the children of a process. Before, if a parent process was sleeping and children were doing all the work, the job would be incorrectly detected as stalled and resubmitted. This was particularly problematic for SKLL.
  • CPU usage and memory histories are now reset when a job is resubmitted. This means error emails will contain more sensible graphs for resubmitted jobs.
  • Now raise a JobException if we give up on a job instead of ending up in a bad state.
  • Renamed SEND_ERROR_MAILS environment variable to SEND_ERROR_MAIL.
  • Removed deprecated pg_map function. It was replaced by grid_map in 0.9.2.
  • Removed runner module from generated API documentation, because no one should really need to use it directly.
  • Renamed Job.job_id to Job.id
  • Added missing local option to grid_map.
  • Added a bunch more unit tests.
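
The stalled-job fix described in the first item can be sketched as follows. This is not GridMap's implementation: the function names, the dictionaries standing in for real process statistics, and the 1.0 percent threshold are all assumptions made so the example is self-contained.

```python
def total_cpu_percent(pid, cpu_by_pid, children_by_pid):
    """Sum CPU usage for `pid` and every process descended from it."""
    total = 0.0
    stack = [pid]
    while stack:
        current = stack.pop()
        total += cpu_by_pid.get(current, 0.0)
        # Visit children, grandchildren, and so on.
        stack.extend(children_by_pid.get(current, []))
    return total


def looks_stalled(pid, cpu_by_pid, children_by_pid, threshold=1.0):
    """Treat a job as stalled only if the whole process tree is idle."""
    return total_cpu_percent(pid, cpu_by_pid, children_by_pid) < threshold


# Hypothetical numbers: the parent (pid 1) sleeps while two children work.
cpu = {1: 0.0, 2: 95.0, 3: 80.0}
children = {1: [2, 3]}

print(looks_stalled(1, cpu, {}))        # parent only, as in the old check
print(looks_stalled(1, cpu, children))  # parent plus children
```

Looking at the parent alone reports the job as stalled, while summing over the tree shows it is busy, which is the misdetection the release note describes.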

Version 0.11.4

@dan-blanchard dan-blanchard released this Dec 13, 2013 · 124 commits to master since this release

Fix typo in gridmap.runner.get_memory_usage

Version 0.11.3

@dan-blanchard dan-blanchard released this Dec 9, 2013 · 131 commits to master since this release

Bug-fix release. Changes are:

  • JobMonitor is now a context manager so that all jobs get killed when an exception occurs.
  • All jobs are now killed if a single job encounters an exception.
  • Jobs no longer pass back exceptions as strings, where they go totally unnoticed.
  • Cleaned up debug output a bit.
  • Much prettier tracebacks for job exceptions.
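
The context-manager behaviour described above can be sketched roughly as below. `JobMonitorSketch` is a hypothetical stand-in, not the real JobMonitor class: "killing" a job here only records its name, and the job names are invented for illustration.

```python
class JobMonitorSketch:
    """Toy monitor that cleans up all running jobs if the block fails."""

    def __init__(self):
        self.running = set()
        self.killed = []

    def submit(self, name):
        self.running.add(name)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is not None:
            # One failure takes every remaining job down with it.
            for name in sorted(self.running):
                self.killed.append(name)
            self.running.clear()
        # Returning False lets the original exception propagate.
        return False


monitor = JobMonitorSketch()
try:
    with monitor:
        monitor.submit("job-1")
        monitor.submit("job-2")
        raise RuntimeError("job-1 failed")
except RuntimeError as err:
    print(err, monitor.killed)
```

Because `__exit__` runs even when the body raises, cleanup cannot be skipped by an unexpected error, and the exception still reaches the caller rather than being swallowed.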

Version 0.11.2

@dan-blanchard dan-blanchard released this Dec 9, 2013 · 157 commits to master since this release

Just a minor bugfix release. Changes are:

  • Switched to using official version of drmaa-python, because it is now up-to-date on PyPI.
  • Added .gitattributes file to keep line endings normalized.
  • Fixed issue where jobs that exceed resubmission limit would infinitely send error reports.
  • We now check to see if the DRMAA C library was imported correctly and switch to local mode if it wasn't, instead of just crashing (#19).
  • Switch to using psutil for process CPU and memory stats (#18).
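
The fallback to local mode (#19) might look something like the sketch below. The helper names, the "grid"/"local" labels, and the exact set of exceptions caught are assumptions rather than GridMap's actual code; a deliberately missing module name stands in for a broken DRMAA installation so the example runs anywhere.

```python
import importlib


def grid_backend_available(module_name="drmaa"):
    """Return True if the grid bindings import cleanly, False otherwise."""
    try:
        importlib.import_module(module_name)
    except (ImportError, OSError, RuntimeError):
        # Assumption: a missing package or an unloadable C library
        # surfaces as one of these exception types at import time.
        return False
    return True


def choose_mode(module_name="drmaa"):
    """Pick local execution when the grid bindings are unusable."""
    return "grid" if grid_backend_available(module_name) else "local"


# Hypothetical module name that is not expected to exist.
print(choose_mode("hypothetical_missing_drmaa_module"))
```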
