yukara-ikemiya/abci-code-sample

A simple code sample for distributed training on ABCI

This repository provides example training code for ABCI, with support for multi-node training.

1. Create a Singularity image file (SIF)

cd ./docker
./build_docker.bash
./docker2singularity.bash
# simple.sif was created here.

2. Prepare training code and job scripts

This repository contains simple training code for a simple model, with support for multi-node training. The job script used in step 3 is ./job/train.bash.
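Multi-node training generally requires each process to know how many nodes take part and which node serves as the rendezvous point. The following is a minimal sketch, not taken from this repository, that assumes the scheduler provides a file listing one allocated hostname per line. The function name `read_hostfile` and the variables `NUM_NODES` and `MASTER_ADDR` are hypothetical, and `./job/train.bash` may do this differently. Grid Engine based systems often expose such a file through an environment variable; check the ABCI documentation for the exact name.

```shell
#!/usr/bin/env bash
# Hypothetical sketch (not part of this repository): derive the node count
# and a rendezvous host from a hostfile with one hostname per line.
read_hostfile() {
    local hostfile="$1"

    if [ ! -r "$hostfile" ]; then
        echo "cannot read hostfile: $hostfile" >&2
        return 1
    fi

    # Drop blank lines and duplicate hosts while keeping the original order.
    local hosts
    hosts="$(awk 'NF && !seen[$1]++ { print $1 }' "$hostfile")"

    # Count the unique hosts; use the first one as the rendezvous point.
    NUM_NODES="$(printf '%s\n' "$hosts" | awk 'NF { n++ } END { print n + 0 }')"
    MASTER_ADDR="$(printf '%s\n' "$hosts" | head -n 1)"
    export NUM_NODES MASTER_ADDR
}

# Usage example with a temporary hostfile standing in for the scheduler's file.
example_hostfile="$(mktemp)"
printf 'node01\nnode02\n' > "$example_hostfile"
read_hostfile "$example_hostfile"
echo "${NUM_NODES} nodes, rendezvous host: ${MASTER_ADDR}"
rm -f "$example_hostfile"
```

A launcher inside the job script could then pass these two values to the training processes on every node, so that all of them connect to the same host.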

3. Run a job on ABCI nodes

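Specify your ABCI group ID with -g and the type and number of compute nodes with -l (for example, -l rt_AF=1 requests one rt_AF node). The -l h_rt option sets the wall-time limit for the job.

Example command: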

qsub -j y -g gce12345 -l rt_AF=1 -l h_rt=0:30:00 ./job/train.bash
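The options in the command above can be parameterized in a small wrapper. The following is a minimal sketch, not part of this repository: the function name `build_qsub_cmd` and its arguments are hypothetical. It only prints the command, so the result can be reviewed before anything is submitted.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository): assembles the qsub
# command line from a group ID, node count, and wall-time limit.
# It prints the command only; it does not submit a job.
build_qsub_cmd() {
    local group="$1"                       # ABCI group ID, e.g. gce12345
    local nodes="${2:-1}"                  # number of rt_AF nodes to request
    local walltime="${3:-0:30:00}"         # wall-time limit (h:mm:ss)
    local script="${4:-./job/train.bash}"  # job script to submit

    if [ -z "$group" ]; then
        echo "usage: build_qsub_cmd GROUP [NODES] [WALLTIME] [SCRIPT]" >&2
        return 1
    fi

    # -j y merges stderr into stdout; -g selects the group;
    # -l rt_AF=N requests N nodes; -l h_rt sets the wall-time limit.
    echo "qsub -j y -g ${group} -l rt_AF=${nodes} -l h_rt=${walltime} ${script}"
}

# Reproduces the example command shown above.
build_qsub_cmd gce12345 1 0:30:00
```

To request two nodes for one hour instead, call `build_qsub_cmd gce12345 2 1:00:00` and run the printed command once it looks right.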
