Skip to content

A python package to orchestrate experiments across remote nodes.

License

Notifications You must be signed in to change notification settings

pbhandar2/expK8

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

18 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

expK8

A python package to orchestrate experiments across remote nodes. I am writing this for my PhD thesis where I have to run a large number (1000s) of long running experiments (upto 7 days for a single run) in many ephemeral nodes (1000s of nodes leased and expired). I run expK8 on my personal laptop (control node). The data and experiment output which require high-capacity storage is stored in a cluster in my university (data node). The experiments run on Cloudlab (compute node), a public cloud platform commonly used in academia. I wanted to do the following from my personal laptop (control node) without having to login to any remote resources:

  • schedule experiments to compute nodes
  • cleanup compute nodes after experiments
  • track live experiments, overall progress, node health
  • kill experiments / reset node safely
  • add/remove nodes which automatically adjusts scheduling
  • transfer data to/from data node to compute node

Note that the control node, data node and compute node can be the same machine as well. It just happens to be different in my case.

About

A python package to orchestrate experiments across remote nodes.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages