This reposity provides an example of training codes for ABCI with support for multi-node training.
cd ./docker
./build_docker.bash
./docker2singularity.bash
# simple.sif was created here.
This repository contains a simple training code with a simple model with support for multi-node training.
You have to specify your group id and compute nodes you want to use.
qsub -j y -g gce12345 -l rt_AF=1 -l h_rt=0:30:00 ./job/train.bash