2018 06 19

Andre Merzky edited this page Jun 19, 2018 · 2 revisions

Agenda / Notes

- use cases -> requirements -> architecture -> implementation -> testing -> release
- Milestones:
  - Use cases, Requirements (early April)
  - Feasibility, Prototype  (early June)
  - Implementation          (end of June)

  • updates:
    • Team 1: Ioannis, Will, Jumana
      • ticket
      • use case document
      • TODO Will: tests for slurm LRMS (other LRMS missing)
      • JD: data from BW, Titan, Stampede-2
        • TODO: use an RP script (BoT df /tmp/) to get LSF size out of band
        • goal: conceptual solution, possibly prototype
      • node failure rates are not public (specifically for Comet)
    • Team 2: Vivek, Srinivas, George
      • literature study
      • MPI/non-MPI implementation and tests done; tagging, placeholder done/tested
      • WIP: integration locally
      • remote integration testing: Team 1?
        • TODO JD (BW)
        • begin with dummy workload
        • if possible, choose a workload which could improve performance
    • toward integration
      • TODO IP: sync with devel
      • TODO ALL: reviews
      • TODO: press forward with integration
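
Sketch for the Team 1 TODO (BoT `df /tmp/`): one possible per-node task that reports the size of the filesystem behind `/tmp`, so the numbers can be collected out of band. This is only an illustration under assumptions: the function names `tmp_usage` and `format_report` and the output format are hypothetical, and submitting the script as a bag of tasks through RP is not shown.

```python
import shutil
import socket


def tmp_usage(path="/tmp"):
    """Return host name plus total/used/free bytes of the filesystem holding `path`."""
    usage = shutil.disk_usage(path)  # same numbers `df` reports, in bytes
    return {
        "host": socket.gethostname(),
        "path": path,
        "total": usage.total,
        "used": usage.used,
        "free": usage.free,
    }


def format_report(report):
    """Render one line per node so the outputs of all tasks can be concatenated."""
    gib = 1024 ** 3
    return "{host} {path} total={total:.1f}GiB free={free:.1f}GiB".format(
        host=report["host"],
        path=report["path"],
        total=report["total"] / gib,
        free=report["free"] / gib,
    )


if __name__ == "__main__":
    # Each task in the bag would run this on its node and write one line to stdout.
    print(format_report(tmp_usage()))
```

Running one such task per node and concatenating the output lines would give a per-node table of `/tmp` sizes without involving the batch system.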

