chainer-cfn: CloudFormation Template for ChainerMN on AWS

This template automates building a ChainerMN cluster on AWS. It creates the following AWS resources:

  • VPC and Subnet in which the cluster is placed (you can configure an existing VPC/Subnet instead)
  • S3 Bucket for sharing the ephemeral SSH key that the MPI processes in the cluster use to communicate
  • Placement group for optimizing network performance
  • ChainerMN cluster, which consists of:
    • 1 master EC2 instance
    • N (>=0) worker instances (via AutoScalingGroup)
    • a chainer user on each instance to run MPI jobs
    • a hostfile on each instance to run MPI jobs
    • All instances are launched from the Chainer AMI
  • (Optional) Amazon Elastic File System (you can configure an existing file system)
    • This is mounted on cluster instances automatically to share your code and data.
  • The required SecurityGroups and IAM Role

Please see template/main.py for detailed resource definitions.

The Latest Published Template

Quick Start

Please also refer to our blog: ChainerMN on AWS with CloudFormation

launch stack

Development Manual

How to build a template

make build

How to test

# Configure AWS account properly first.

# Create a stack from the template you built.
make create-stack TEST_STACK=YOUR_TEST_STACK_NAME KEY_PAIR_NAME=YOUR_KEY_PAIR_NAME

# Run ChainerMN's train_mnist.py on the stack.
make e2e-test TEST_STACK=YOUR_TEST_STACK_NAME KEY_PAIR_NAME=YOUR_KEY_PAIR_NAME

# Clean up the stack.
make delete-stack TEST_STACK=YOUR_TEST_STACK_NAME KEY_PAIR_NAME=YOUR_KEY_PAIR_NAME
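
A test stack keeps costing money until it is deleted, so it can help to wrap the three targets in one function that always attempts cleanup. The following is a minimal sketch, not part of this repository: the function name run_e2e is hypothetical, and it assumes each make target exits with a non-zero status on failure.

```shell
#!/usr/bin/env bash
# Sketch only: run_e2e is a hypothetical helper, not shipped with chainer-cfn.
# Assumes the make targets above exit non-zero when they fail.
set -euo pipefail

run_e2e() {
  local stack="$1" key_pair="$2" status=0

  # Create the stack and run the test; record a failure instead of aborting,
  # so that the cleanup step below still runs.
  {
    make create-stack TEST_STACK="$stack" KEY_PAIR_NAME="$key_pair" &&
      make e2e-test TEST_STACK="$stack" KEY_PAIR_NAME="$key_pair"
  } || status=$?

  # Always attempt to delete the stack, even if creation or the test failed.
  make delete-stack TEST_STACK="$stack" KEY_PAIR_NAME="$key_pair" || status=$?

  return "$status"
}
```

After sourcing the file, call it as: run_e2e YOUR_TEST_STACK_NAME YOUR_KEY_PAIR_NAME. The function returns non-zero if any step failed.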

How to release

# Configure AWS account properly first.

# build template
make build

# perform e2e test
make create-stack TEST_STACK=YOUR_TEST_STACK_NAME KEY_PAIR_NAME=YOUR_KEY_PAIR_NAME
make e2e-test TEST_STACK=YOUR_TEST_STACK_NAME KEY_PAIR_NAME=YOUR_KEY_PAIR_NAME
make delete-stack TEST_STACK=YOUR_TEST_STACK_NAME KEY_PAIR_NAME=YOUR_KEY_PAIR_NAME

# Publish to a stage; STAGE must be either production or staging.
make publish STAGE=staging
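
Because STAGE accepts only two values, a small guard can reject typos before anything is published. This is a sketch under assumptions: the function name publish_to is hypothetical, and how the Makefile itself treats an unknown STAGE is not described here.

```shell
#!/usr/bin/env bash
# Sketch only: publish_to is a hypothetical wrapper, not shipped with chainer-cfn.
set -euo pipefail

publish_to() {
  local stage="${1:-}"

  case "$stage" in
    production|staging)
      # Only the two stage names listed above are passed through to make.
      make publish STAGE="$stage"
      ;;
    *)
      echo "usage: publish_to (production|staging)" >&2
      return 2
      ;;
  esac
}
```

For example, publish_to staging runs the publish target, while publish_to prod prints the usage line and returns 2 without calling make.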

Release Notes

Version 0.1.0

License

MIT License (see LICENSE file).
