2014 05 15

Who: Mark, Matteo, Shantenu, Ole, AndreM, Antons
Agenda:
- open TODOs
  - TODO OW: get quantitative EnMDTK requirements
  - TODO AM: micro benchmarks for RP
  - TODO AM: re-check 2048 ceiling
  - DONE OW: provide MPI prototype for stampede
  - TODO MS: base MPI agent on this
  - TODO OW: start on Cray agent, based on AT's scripts
  - DONE AM/OW: set up regular 10min meets between AT and OW
  - TODO AT: expand scripts toward MPI jobs, and further to inter-node MPI jobs
  - TODO SJ: check pipeline example (MTMS)
  - DONE MS: repost data proposal on list
  - TODO ALL: provide feedback
- MS-7 checkpoints:
  - May 8:
    - OW: simple MPI support for Stampede complete (prototype)
    - AT, OW: draft architecture for Cray agent
  - May 15:
    - MS: implementation proposal for MPI support beyond stampede
    - AM: MPI integration tests set up
    - OW: first prototype of non-MPI agent for cray
    - ALL: agree on implementation plan for Cray agent
- status reports
- discussion on Mark's data proposal
- benchmarking plans
- (?) what role does scheduling play at the agent level?
Notes:
- TODO MS, OW: check module load / shell startup issues
- ibrun vs. mpiexec
- TODO OW: bootstrap for agent on archer
- mongodb on headnode of archer
- DONE OW: email about port forwarding to Iain(?)
- Antons integrates scripts in agent, expands towards MPI / aprun
- Antons: might not need agent hierarchy
- data feedback:
  - OW: clunky, decoupled from CU (cannot refer to data from other CUs)
  - MS: it acts within the sandbox, which was not possible before; it's a building block
  - MS: CU deps can / will be addressed above
  - OW: actual deps are out of scope anyway...
  - MS: next: higher abstraction, implicit data locations for intermediate data
  - MS: lifetime management of staging area is up to higher levels
  - implementation: now agent can also pull data and copy/link/move
  - adds saga dependency to agent: should be optional then
  - OW: staging-area is transient, may want to use proper object?
  - next steps: come up with serious pilot data
- MS: agent is very stand-alone in terms of code: it shares neither constants nor the data-db abstraction layer; should be addressed in the long run
- benchmarking: benchmarks != tracing
- MT: want cancel() on any state
- TODO OW: yes, makes sense, will do
- MS: RP state model: doesn't easily cover actively staging agent
- TODO MS: proposal
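
Illustration for the "agent can also pull data and copy/link/move" note above: a minimal sketch of what such a staging step might look like, not the actual agent code. The function name `stage`, the action constants, and the argument layout are all hypothetical; the notes only name the three actions.

```python
import os
import shutil

# Hypothetical action names; the notes only mention copy/link/move.
COPY, LINK, MOVE = "copy", "link", "move"


def stage(source, target, action=COPY):
    """Place `source` at `target` (e.g. inside a CU sandbox) using `action`.

    Assumption: both paths are local to the agent; pulling remote data
    (the optional saga dependency mentioned above) is not covered here.
    """
    target_dir = os.path.dirname(target)
    if target_dir:
        # Create the sandbox directory on demand.
        os.makedirs(target_dir, exist_ok=True)

    if action == COPY:
        shutil.copy2(source, target)
    elif action == LINK:
        # Link to the absolute path so the link survives a cwd change.
        os.symlink(os.path.abspath(source), target)
    elif action == MOVE:
        shutil.move(source, target)
    else:
        raise ValueError("unknown staging action: %r" % (action,))
    return target
```

A call such as `stage("staging_area/in.dat", "sandbox/in.dat", LINK)` would then expose a staged file to a CU without duplicating it; who cleans up the staging area remains a question for the higher levels, as noted above.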
