Releases: TACC/remora

2.0.0

18 Nov 23:44

New Features:

  • Binary metric collectors
  • Unit testing for collection and plot creation
  • Snapshot capabilities and monitor selection

Release v1.8.5

27 Jan 20:04
b46128d

Converted all scripts to bash. Updated temperature module. Created unit tests and updated graphics for some modules.

Minor release to fix problems

25 Apr 19:50

This release fixes:

  • Provide OPA support (changes in install.sh and fixes in opt)
  • Put links to the impi fraction and breakdown in remora_summary.html
  • Fixed and changed numa to show THP hits (use foreign instead of miss, which are the real misses)
  • Updated ib to include hfi1 devices

Minor release to fix a couple of annoying problems

08 Aug 18:49

This release fixes:

  • The help message. Some people really care about help messages, so it's a good idea to cater for them too. We broke this a couple of releases back.
  • If users are running locally, it's good to add a couple of extra checks when retrieving the hostname. If nothing works, set the hostname to localhost and give it a go.
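The hostname fallback described above can be sketched as follows. This is only an illustration in Python, not REMORA's own code (REMORA is written in bash), and the function names `resolve_hostname` and `_hostname_command` are hypothetical. It tries a few probes in order and falls back to `localhost` when none of them yields a usable name.

```python
import socket
import subprocess


def _hostname_command():
    # Ask the system `hostname` command; it may be missing or fail on some hosts.
    result = subprocess.run(
        ["hostname"], capture_output=True, text=True, timeout=5, check=True
    )
    return result.stdout


def resolve_hostname(candidates=None):
    """Return the first non-empty hostname found, or "localhost" as a last resort."""
    if candidates is None:
        # Assumed probe order for this sketch: Python's socket API, then the CLI.
        candidates = (socket.gethostname, _hostname_command)
    for probe in candidates:
        try:
            name = probe()
        except Exception:
            # A failing probe is not fatal; move on to the next one.
            continue
        if name and name.strip():
            return name.strip()
    # Nothing worked: set the hostname to localhost and give it a go.
    return "localhost"


if __name__ == "__main__":
    print(resolve_hostname())
```

Passing the probes as a list keeps the fallback order explicit and makes it easy to exercise the `localhost` path without touching the real system.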

House cleaning (and some bug fixing)

30 Jun 14:31

This is a minor version that fixes some recently identified issues (#37, #40, #41, #42, #43) and partially fixes #39. When filesystem latency is very high, it is still possible to find files in the wrong places after REMORA has finished. That's work in progress.

Adding MPI statistics

21 Mar 15:21

In this version we add MPI statistics for mvapich2 and Intel MPI. We are working on openmpi and cray-mpich, and they will be added in a future release. Two new plots are generated in the MPI directory: a pie plot with the percentage of time spent in MPI (aggregated communications and MPI-IO), and a bar plot with the top 5 most time-consuming MPI calls.
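As a rough illustration of the two quantities behind those plots, the sketch below computes the percentage of runtime spent in MPI and the top 5 calls by time. It is not REMORA's implementation: the function name `mpi_summary`, the input layout (a dictionary of seconds per MPI call), and the sample numbers are all assumptions made up for this example.

```python
def mpi_summary(call_seconds, total_runtime, top_n=5):
    """Return (percent of runtime spent in MPI, top_n calls sorted by time)."""
    if total_runtime <= 0:
        raise ValueError("total_runtime must be positive")
    # Aggregate communication and MPI-IO calls into one MPI total.
    mpi_time = sum(call_seconds.values())
    mpi_percent = 100.0 * mpi_time / total_runtime
    # Sort calls by time spent, largest first, and keep the top_n entries.
    top_calls = sorted(
        call_seconds.items(), key=lambda item: item[1], reverse=True
    )[:top_n]
    return mpi_percent, top_calls


if __name__ == "__main__":
    # Hypothetical timings in seconds, invented for illustration only.
    timings = {
        "MPI_Allreduce": 12.0,
        "MPI_File_write": 7.0,
        "MPI_Recv": 5.0,
        "MPI_Send": 3.0,
        "MPI_Bcast": 2.0,
        "MPI_Barrier": 1.0,
    }
    percent, top = mpi_summary(timings, total_runtime=100.0)
    print(f"MPI share of runtime: {percent:.1f}%")
    for name, seconds in top:
        print(f"{name}: {seconds:.1f} s")
```

The percentage feeds the pie plot (MPI versus everything else), and the sorted list is what a top-5 bar plot would display.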

Internally, we have also changed the way some of the scripts are invoked in order to be consistent. This change has the added benefit of allowing for a much cleaner and better working verbose mode. If you are thinking of adding a new module, nothing changes for you: this modification took place only at the top level of the script hierarchy.

Now with csh / tcsh support

07 Dec 21:51

This is a bug fix release that extends remora support for users with default csh or tcsh shells. No other changes in functionality have been made.

Bells and whistles

01 Dec 21:05

Version 1.7.0 is a feature release that adds power and temperature monitoring, support for Infiniband devices other than mlx4, support for PBS schedulers, improved Infiniband and Ethernet network monitoring, full support for Intel Knights Landing with automated NUMA node detection, protection from output directory being overwritten, and improved graphics.

This version also includes two new scripts, one that can generate a summary after a code crash, and one that can monitor memory utilization and kill the job before it exceeds the available memory on the node.

1.6.0.1

15 Apr 20:15
Pre-release

Minor release not intended for production systems.

Introducing real-time monitoring

11 Mar 22:45

In this release we introduce a real-time monitoring mode that should be useful to track jobs that are suspected to be problematic. We also fixed a minor bug that could produce unnecessary warnings when using over 32GB of memory. This was due to a hardcoded value that is now calculated on the fly.
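To illustrate what replacing a hardcoded memory limit with a value calculated on the fly can look like, here is a minimal Python sketch that reads the node's total memory from Linux's `/proc/meminfo`. Whether REMORA uses this file is an assumption; the release note does not say how the value is computed, and the function names are hypothetical.

```python
def total_memory_gb(meminfo_text):
    """Parse /proc/meminfo-style text and return MemTotal in GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # The line looks like "MemTotal:       67108864 kB" (units of 1024 bytes).
            kib = int(line.split()[1])
            return kib / (1024 ** 2)
    raise ValueError("MemTotal entry not found")


def read_total_memory_gb(path="/proc/meminfo"):
    # Linux-only: read the live value instead of assuming a fixed node size.
    with open(path) as handle:
        return total_memory_gb(handle.read())


if __name__ == "__main__":
    print(f"Total memory: {read_total_memory_gb():.1f} GiB")
```

Any warning threshold can then be derived from the value returned for the node the job is running on, rather than from a fixed constant such as 32 GB.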