2023.03.09 Meeting Notes

Philipp Grete edited this page Mar 9, 2023 · 2 revisions

Agenda

  • Individual/group updates
  • Intel update (phydro runs on Ponte Vecchio!)
  • review non-WIP PRs

Individual updates

JM

  • couple of PRs in flight
  • found "bug" in prolong/restrict (PR with guard rails open) caused by inconsistent Metadata (setting properties versus handling logic in the constructor); issue open to discuss a "fix" to Metadata
  • cleaned up and generalized integrators (now a class), including SSPRK4; PR open, needs review
    • would be nice to also support RKL(2) super-time-stepping as part of the machinery; PM and PG will check
  • presenting in monthly institutional computing group meeting for Parthenon + downstream codes
    • if someone wants to highlight things there, send material to JM
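
To illustrate the "integrators as a class" idea, here is a minimal sketch of a stage-table-driven SSP integrator for a scalar ODE. It is not the class from the open PR: the names `Stage`, `SSPIntegrator`, `MakeSSPRK1/2/3` and `IntegrateDecay` are hypothetical, and it only covers the well-known forward Euler, SSPRK2 and SSPRK3 schemes in Shu-Osher form (not SSPRK4 or RKL2).

```cpp
#include <cmath>
#include <functional>
#include <utility>
#include <vector>

// One stage in Shu-Osher form (hypothetical layout):
//   u_stage = a * u0 + b * (u_prev + dt * rhs(u_prev))
struct Stage {
  double a;
  double b;
};

// Generic integrator: the scheme is fully described by its stage table.
class SSPIntegrator {
 public:
  explicit SSPIntegrator(std::vector<Stage> stages) : stages_(std::move(stages)) {}

  // Advance a scalar state u0 by one time step dt.
  double Step(const std::function<double(double)>& rhs, double u0, double dt) const {
    double u = u0;
    for (const auto& s : stages_) {
      u = s.a * u0 + s.b * (u + dt * rhs(u));
    }
    return u;
  }

 private:
  std::vector<Stage> stages_;
};

// Forward Euler (first order).
SSPIntegrator MakeSSPRK1() {
  return SSPIntegrator(std::vector<Stage>{Stage{0.0, 1.0}});
}

// Two-stage, second-order SSP scheme (Heun).
SSPIntegrator MakeSSPRK2() {
  return SSPIntegrator(std::vector<Stage>{Stage{0.0, 1.0}, Stage{0.5, 0.5}});
}

// Three-stage, third-order SSP scheme (Shu-Osher).
SSPIntegrator MakeSSPRK3() {
  return SSPIntegrator(std::vector<Stage>{
      Stage{0.0, 1.0}, Stage{0.75, 0.25}, Stage{1.0 / 3.0, 2.0 / 3.0}});
}

// Test problem du/dt = -u with u(0) = 1; returns u after nsteps steps of size dt.
double IntegrateDecay(const SSPIntegrator& integ, double dt, int nsteps) {
  double u = 1.0;
  for (int i = 0; i < nsteps; ++i) {
    u = integ.Step([](double x) { return -x; }, u, dt);
  }
  return u;
}
```

With this layout, adding another scheme of the same form only requires a new stage table, e.g. `IntegrateDecay(MakeSSPRK3(), 0.01, 100)` approximates exp(-1).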

BP

  • new PR for deallocating MPI comms in reductions
  • depending on the MPI library, not freeing them resulted in the host running out of memory or even failing without an error message

LR

  • running sims with 64 sparse vars (that indeed save memory)
  • still ran OOM; tracked down to LoadBalancing, which didn't differentiate between (different) sparse and dense vars when rebuilding the tree. implemented and tested two approaches:
    • reduced buffer size approach <- seems to be faster
    • completely buffer free approach
  • code currently lives in riot branch, so will eventually end up in develop
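
A rough sketch of why the distinction matters for buffer sizing during load balancing. This is not the code from the riot branch: `VarInfo`, `NaiveBufferSize` and `SparseAwareBufferSize` are hypothetical names, and the model assumes every allocated variable stores the same kind of flat array per block.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified description of one variable on a mesh block.
struct VarInfo {
  bool is_sparse;      // sparse variables may be unallocated on a given block
  bool is_allocated;   // assumed always true for dense variables
  std::size_t ncells;  // number of values stored when the variable is allocated
};

// Treats every variable as if it were allocated (no sparse/dense distinction).
std::size_t NaiveBufferSize(const std::vector<VarInfo>& vars) {
  std::size_t n = 0;
  for (const auto& v : vars) {
    n += v.ncells;
  }
  return n;
}

// Counts dense variables plus only those sparse variables that are allocated.
std::size_t SparseAwareBufferSize(const std::vector<VarInfo>& vars) {
  std::size_t n = 0;
  for (const auto& v : vars) {
    if (!v.is_sparse || v.is_allocated) {
      n += v.ncells;
    }
  }
  return n;
}
```

For a block with one dense variable and 64 sparse variables of which only two are allocated, the naive count covers 65 variables while the sparse-aware count covers 3.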

PM

  • fighting MPI issues (not related to downstream codes) like
    • GTL_DEBUG: [69] cudaEventQuery: uncorrectable NVLink error detected during the execution
    • MPICH ERROR [Rank 69] - Abort(373423874) (rank 69 in comm 0): Fatal error in PMPI_Test: Invalid count, error stack

FG

  • figuring out copyright/open sourcing issues

JS

  • having fun with Polaris and different Kokkos versions
  • observed performance regression (2x) from Kokkos 3.7 to 4.0; need to investigate in Parthenon, too

BW

  • would be interested in SDC (spectral deferred correction) integrator
  • could be used to control the error in operator splitting or for arbitrarily high-order integration

PG

  • catching up on PR review/issue backlog
  • Ascent PR ready for merge

Intel update

  • an internal (to be released) version of oneAPI contains the fixes required to make parthenon(-hydro) compile in ahead-of-time (AOT) mode
  • works with Kokkos 4.0
  • PG will now work with Intel on getting performance numbers and profiling data for more detailed comparison

non-WIP PR

next meeting 23 Mar