Zack Galbreath edited this page Apr 21, 2023 · 1 revision

Attendees

  • Aashish Chaudhary
  • Alec Scott
  • Dan LaManna
  • Jacob Nesbitt
  • John Parent
  • Massimiliano Culpo
  • Michael VanDenburgh
  • Ryan Krattiger
  • Scott Wittenburg
  • Tammy Grimmett
  • Todd Gamblin
  • Zack Galbreath

cache.spack.io improvements

  • PR for improvements now open! https://github.com/spack/cache.spack.io/pull/3
  • Suggestions:
    • show variant values (rather than just true/false)
    • monospace hash font
    • split OS out of arch (for filtering)
    • more general details / gentle introduction on top-level page: what caches are, etc.
  • Additional discussion should take place on the PR

Release process improvements

  • In preparation for Spack v0.20 (and particularly the inevitable v0.20.1 patch release) we'd like to improve our process around backporting PRs. Zack to investigate how ghostflow-director currently achieves this, and whether or not it would be a good fit for Spack's existing workflow.
  • We also want to update our release branch pipelines to be "rebuild everything", while our release tag pipelines will become "copy only". For more details, see the "Proposed Solution" section of spack-infra issue #473.

AWS cost reduction & monitoring

  • We will need some relatively brief downtime (hours, not days) to address the missing NAT gateways issue discussed previously. We plan to schedule this work next week.
  • We are also setting up automatic budget alerts for our AWS account to help us keep better track of our spending rates.

ParallelCluster testing

  • This is going well. We are still on target to demonstrate this capability prior to ISC.

Costs per Job

  • We attempted to capture EC2 instance type information from our GitLab post-build webhook in spack-infra PR #475, but this approach appears to be infeasible because the builder pod is already gone by the time our webhook fires. The next solution we're going to investigate is envars-from-node-labels.
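
As a rough illustration of what the "costs per job" metric could look like once the instance type is known, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Job` fields, the `job_cost` helper, and the hourly prices are hypothetical and do not come from spack-infra. It also assumes a job has the whole instance to itself, which will overstate costs when several jobs share a node.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical hourly prices in USD, for illustration only.
# Real numbers would have to come from AWS pricing or billing data.
HOURLY_PRICE_USD: Dict[str, float] = {
    "m5.xlarge": 0.20,
    "c5.2xlarge": 0.35,
}


@dataclass
class Job:
    job_id: int
    instance_type: str       # e.g. exposed to the job via an env var set from a node label
    duration_seconds: float  # wall-clock time of the build job


def job_cost(job: Job, prices: Dict[str, float] = HOURLY_PRICE_USD) -> Optional[float]:
    """Estimate the cost of one job in USD.

    Returns None when the instance type has no known price, so that
    unknown types are surfaced rather than silently counted as free.
    """
    price = prices.get(job.instance_type)
    if price is None:
        return None
    # Simplification: the job is billed for the whole instance while it runs.
    return price * job.duration_seconds / 3600.0
```

For example, `job_cost(Job(1, "m5.xlarge", 1800))` estimates the cost of a 30-minute job at the assumed `m5.xlarge` price.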

GitLab upgrade

  • We have a staging cluster deployed, complete with GitLab, runners, and Karpenter provisioners. This will allow us to practice upgrading gitlab.spack.io to help ensure that the upgrade goes smoothly in production.

Secrets management

  • spack-infra PR #462 will allow us to safely store Kubernetes secrets under version control without leaking sensitive information.

Priorities

  • Continue pushing on goals for ISC:
    • pcluster runners + stack
    • cache.spack.io refresh
    • help with the v0.20 release process, particularly buildcache population
  • Deploy missing NAT gateways to reduce data transfer costs
  • Continue working towards the "costs per job" metric
  • Upgrade GitLab
  • Draft a strategy for implementing periodic, date-based "snapshot" buildcaches
  • Try to capture concrete numbers comparing costs over time:
    • on-demand vs. spot
    • consolidation vs. emptiness
    • These details would be useful in a blog post about our CI adventures