A demonstration of an automated workflow for rapid DNN training using remote AI system resources.
The sample DNN used in the demo can be obtained from the ddp
branch of BraggNN.

Dependencies:
- funcx_endpoint=0.3.2
- PyTorch=1.9.0
- horovod=0.22.1
- h5py=2.10.0
- numpy=1.19.2
- globus-automate-client=0.12.0
- funcx=0.3.2
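
The pinned versions above can be checked against a local environment before running the demo. The following is a minimal sketch, not part of the original demo: the PyPI distribution names (for example `torch` for PyTorch and `funcx-endpoint` for funcx_endpoint) are assumptions, and `installed_version` and `check_versions` are hypothetical helper names.

```python
from importlib import metadata

# Versions copied from the list above. The PyPI distribution names
# (e.g. "torch" for PyTorch, "funcx-endpoint") are assumptions.
PINNED = {
    "funcx-endpoint": "0.3.2",
    "torch": "1.9.0",
    "horovod": "0.22.1",
    "h5py": "2.10.0",
    "numpy": "1.19.2",
    "globus-automate-client": "0.12.0",
    "funcx": "0.3.2",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_versions(pinned, lookup=installed_version):
    """Return {name: (expected, found)} for every package that does not match."""
    mismatches = {}
    for name, expected in pinned.items():
        found = lookup(name)
        # Ignore local build suffixes such as "1.9.0+cu111".
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in check_versions(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

Running the script prints one line per package whose installed version differs from the pin, and prints nothing when the environment matches.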
More details can be found at https://arxiv.org/abs/2105.13967.
Please reach out to zhengchun.liu#@#anl.gov if you run into problems.