Using SigOpt, NVIDIA, and MXNet to Tune Machine Learning Pipelines

Learn more at the associated blog post: Fast CNN Tuning with AWS GPU Instances and SigOpt.

Limited Access: For those interested in replicating this blog post, let us know so that we can provide a SigOpt account for you beyond your free trial!

AWS Setup

This section will get the following up and running:

  • OS: Ubuntu 16.04
  • DNN Library: MXNet 0.9.3
  • GPU Libraries: CUDA 8.0, cuDNN 5.1
  • GPU: NVIDIA K80 GPU
  • Server: Amazon EC2’s P2 instances

  1. Sign up for Amazon Web Services.

  2. Select the US East (N. Virginia) region from the dropdown menu at the top right of the EC2 Dashboard. This example was built for that region.

  3. Create a key pair in the region where you will launch your instance.

  • Tip: To make it easier to ssh or scp with this key pair, add it to the authentication agent on your machine: ssh-add /path/to/key.pem

  4. Pick an instance type and image. This example uses the p2.xlarge instance type, though any AWS EC2 P2 instance should work.

  5. Launch an Amazon EC2 instance with sufficient storage. One simple way to do this is to install the AWS CLI and run the command below.

  • The community image SigOpt made and used to generate this example is ami-193e860f. (This image is based on image id ami-2757f631, found through Ubuntu Cloud's Amazon EC2 AMI Locator.)

  •    aws ec2 run-instances \
         --image-id ami-193e860f \
         --instance-type p2.xlarge \
         --key-name <key_name> \
         --ebs-optimized \
         --block-device-mappings \
         "[ { \"DeviceName\": \"/dev/sda1\", \"Ebs\": { \"VolumeSize\": 32 } } ]"

  6. Wait about a minute, then log in to your instance and reboot it.
  • If you are not on a private network, look up the Public DNS of your instance. Otherwise, use its private hostname.
  • ssh ubuntu@<hostname>
  • The OS usually requests a system restart when this image is first launched. Run sudo reboot to be safe.
  7. Verify that the NVIDIA GPU is up and running by entering nvidia-smi. You should see the status of the GPU and its driver, as in the nvidia-smi screenshot below.

[Screenshot: nvidia-smi output]
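
The hand-escaped JSON in the launch command is easy to get wrong. As a rough sketch, the same `aws ec2 run-instances` arguments can be assembled in Python and the block device mapping serialized with `json.dumps`. The helper name `build_run_instances_command` and the key pair name `my-key` are made up for illustration. The image id, instance type, device name, and volume size are the ones from this README, and the option is spelled `--block-device-mappings`, as in the AWS CLI documentation.

```python
import json
import shlex


def build_run_instances_command(key_name, image_id="ami-193e860f",
                                instance_type="p2.xlarge", volume_size_gb=32):
    """Return the argument list for `aws ec2 run-instances` (illustrative helper)."""
    # Root volume mapping, serialized as JSON so no manual quote escaping is needed.
    block_devices = [
        {"DeviceName": "/dev/sda1", "Ebs": {"VolumeSize": volume_size_gb}}
    ]
    return [
        "aws", "ec2", "run-instances",
        "--image-id", image_id,
        "--instance-type", instance_type,
        "--key-name", key_name,
        "--ebs-optimized",
        "--block-device-mappings", json.dumps(block_devices),
    ]


if __name__ == "__main__":
    # "my-key" is a placeholder; use the name of the key pair you created.
    cmd = build_run_instances_command("my-key")
    # Print a shell-safe version of the command for review; nothing is launched here.
    print(" ".join(shlex.quote(part) for part in cmd))
```

This only prints the command. To launch the instance from Python instead, the same list could be passed to `subprocess.run(cmd, check=True)`, assuming the AWS CLI is installed and configured.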

Replicate Blog Post

  1. Sign up for a SigOpt account through our website or AWS Marketplace.

  2. Copy this repository over to your new MXNet + NVIDIA Ubuntu instance!

  • scp -r dnn-tuning-nvidia-mxnet/ ubuntu@<hostname>:/home/ubuntu
  3. ssh back into your machine and install the SigOpt Python client:
  • sudo pip install sigopt
  4. Add your API key as an environment variable (get it from the API tokens page):
  • export SIGOPT_API_TOKEN=<YOUR_API_KEY>
  5. Run the example!
  • cd dnn-tuning-nvidia-mxnet
  • python run_experiments.py
  • python run_experiments.py --with-architecture
  • Protip: Use nohup so you don't have to stay logged in.
    • nohup python run_experiments.py &> logs-no-architecture.out &
    • nohup python run_experiments.py --with-architecture &> logs-with-architecture.out &
  6. Check out your experiment dashboard and view your experiment progress from anywhere!
  • As a reminder, we run useful analytics so you can start exploring how your choice of hyperparameters impacts your objective function.
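
A forgotten `export SIGOPT_API_TOKEN=...` is an easy mistake on a fresh instance. The sketch below is one possible wrapper that checks for the token and then runs the two experiment commands one after the other, writing each run's output to the log file names used in the nohup examples. The names `RUNS`, `require_token`, and `run_all` are illustrative and not part of this repository, and the sequential ordering is an assumption of this sketch.

```python
import os
import subprocess
import sys

# The two runs from the steps above, paired with the log names from the nohup examples.
RUNS = [
    (["run_experiments.py"], "logs-no-architecture.out"),
    (["run_experiments.py", "--with-architecture"], "logs-with-architecture.out"),
]


def require_token(env=os.environ):
    """Return the SigOpt API token, or raise if SIGOPT_API_TOKEN is unset or empty."""
    token = env.get("SIGOPT_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError("SIGOPT_API_TOKEN is not set; export it before running.")
    return token


def run_all(python=sys.executable):
    """Run each experiment script in turn, sending stdout and stderr to its log file."""
    require_token()
    for args, log_name in RUNS:
        with open(log_name, "wb") as log:
            # check=True stops the sequence if a run exits with a non-zero status.
            subprocess.run([python] + args, stdout=log, stderr=subprocess.STDOUT, check=True)
```

Calling `run_all()` from the `dnn-tuning-nvidia-mxnet` directory starts the runs. The wrapper does not detach from the terminal, so it would still need to be started under nohup to survive a logout.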