Commit 013b811: update Dockerfile

- newer go
- improved go get syntax
- cleaner apt-get usage

Traun Leyden committed Apr 28, 2015
1 parent 484c278 commit 013b811
Showing 10 changed files with 650 additions and 370 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -161,7 +161,7 @@ Here's what will be created:
```
$ vagrant ssh core-01
$ docker run --name sync-gateway -P couchbase/sync-gateway sync-gw-start -c feature/forestdb_bucket -g https://fixme.com
-$ docker run --name elastic-thought -P --link sync-gateway:sync-gateway tleyden5iwx/elastic-thought-cpu-develop bash -c 'refresh-elastic-thought-refresher; refresh-elastic-thought; httpd'
+$ docker run --name elastic-thought -P --link sync-gateway:sync-gateway tleyden5iwx/elastic-thought-cpu-develop bash -c 'refresh-elastic-thought; elastic-thought'
```
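A note on the `bash -c '…; …'` form used above: the `;` separator runs the second command regardless of whether the first one succeeded, unlike `&&`, which short-circuits. A quick stand-in demonstration (the commands here are placeholders, not the real refresh scripts):

```shell
# Stand-in commands: with ';', the second command runs even if the
# first one fails; with '&&', it is skipped on failure.
chained=$(bash -c 'false; echo ran-anyway')
short_circuited=$(bash -c 'false && echo not-reached; echo done')
echo "$chained"
echo "$short_circuited"
```

This is why a failed `refresh-elastic-thought` would not prevent the server from starting.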


30 changes: 13 additions & 17 deletions docker/cpu/develop/Dockerfile
@@ -4,34 +4,30 @@ FROM tleyden5iwx/caffe-cpu-master
MAINTAINER Traun Leyden tleyden@couchbase.com

ENV GOPATH /opt/go
-ENV PATH $GOPATH/bin:/usr/local/go/bin:$PATH
+ENV GOROOT /usr/local/go
+ENV PATH $PATH:$GOPATH/bin:$GOROOT/bin

# Get dependencies
RUN apt-get update && \
-    apt-get -q -y install mercurial && \
-    apt-get -q -y install make && \
-    apt-get -q -y install binutils && \
-    apt-get -q -y install bison && \
-    apt-get -q -y install build-essential
+    apt-get -q -y install \
+        mercurial \
+        make \
+        binutils \
+        bison \
+        build-essential

RUN mkdir -p $GOPATH

-# Install Go 1.3 manually (since Go 1.3 is required, and ubuntu 14.04 still uses Go 1.2)
-RUN curl -O https://storage.googleapis.com/golang/go1.3.1.linux-amd64.tar.gz && \
-    tar -C /usr/local -xzf go1.3.1.linux-amd64.tar.gz
+# Download and install Go 1.4
+RUN wget http://golang.org/dl/go1.4.2.linux-amd64.tar.gz && \
+    tar -C /usr/local -xzf go1.4.2.linux-amd64.tar.gz && \
+    rm go1.4.2.linux-amd64.tar.gz

# Add refresh script
ADD scripts/refresh-elastic-thought /usr/local/bin/
-ADD scripts/refresh-elastic-thought-refresher /usr/local/bin/

# Go get ElasticThought
-RUN go get -u -v -t github.com/tleyden/elastic-thought && \
-    go get -u -v -t github.com/tleyden/elastic-thought/cli/httpd && \
-    go get -u -v -t github.com/tleyden/elastic-thought/cli/worker && \
+RUN go get -u -v -t github.com/tleyden/elastic-thought/... && \
cd $GOPATH/src/github.com/tleyden/elastic-thought && \
git log -3

-# Copy binaries
-RUN cp /opt/go/bin/worker /usr/local/bin && \
-    cp /opt/go/bin/httpd /usr/local/bin
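The Go download step in this Dockerfile hard-codes the tarball name in several places. A sketch of how it could be parameterized — the `GO_VERSION` variable and URL construction are illustrative, not part of this commit:

```shell
# Hypothetical parameterization of the Dockerfile's Go download step.
# The URL base matches the one used in the commit (golang.org/dl).
GO_VERSION=1.4.2
GO_TARBALL="go${GO_VERSION}.linux-amd64.tar.gz"
GO_URL="http://golang.org/dl/${GO_TARBALL}"
echo "$GO_URL"
```

Bumping Go versions later would then be a one-line change.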

217 changes: 146 additions & 71 deletions docker/cpu/develop/README.md
@@ -1,4 +1,4 @@
-[![Join the chat at https://gitter.im/tleyden/elastic-thought](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/tleyden/elastic-thought?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
+[![Build Status](https://drone.io/github.com/tleyden/elastic-thought/status.png)](https://drone.io/github.com/tleyden/elastic-thought/latest) [![GoDoc](https://godoc.org/github.com/tleyden/elastic-thought?status.png)](https://godoc.org/github.com/tleyden/elastic-thought) [![Coverage Status](https://coveralls.io/repos/tleyden/elastic-thought/badge.svg?branch=master)](https://coveralls.io/r/tleyden/elastic-thought?branch=master) [![Join the chat at https://gitter.im/tleyden/elastic-thought](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/tleyden/elastic-thought?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)

Scalable REST API wrapper for the [Caffe](http://caffe.berkeleyvision.org) deep learning framework.

@@ -36,15 +36,15 @@ If running on AWS, each [CoreOS](https://coreos.com/) instance would be running

Although not shown, all components would be running inside of [Docker](https://www.docker.com/) containers.

-[CoreOS Fleet](https://coreos.com/docs/launching-containers/launching/launching-containers-fleet/) would be leveraged to auto-restart any failed components, including Caffe workers.
+It would be possible to start more nodes which only had Caffe GPU workers running.

## Roadmap

*Current Status: everything under heavy construction, not ready for public consumption yet*

1. **[done]** Working end-to-end with IMAGE_DATA caffe layer using a single test set with a single training set, and ability to query trained set.
-1. **[in progress]** ---> Support LEVELDB / LMDB data formats, to run mnist example.
-1. Support the majority of caffe use cases
+1. **[done]** Support LEVELDB / LMDB data formats, to run mnist example.
+1. **[in progress]** Support the majority of caffe use cases
1. Package everything up to make it easy to deploy <-- initial release
1. Ability to auto-scale worker instances up and down based on how many jobs are in the message queue.
1. Attempt to add support for other deep learning frameworks: pylearn2, cuda-convnet, etc.
@@ -63,17 +63,18 @@ Although not shown, all components would be running inside of [Docker](https://www.docker.com/) containers.

* [REST API](http://docs.elasticthought.apiary.io/)
* [Godocs](http://godoc.org/github.com/tleyden/elastic-thought)
+* This README

-## Grid Computing
+## System Requirements

-ElasticThought is not trying to be a grid computing (aka distributed computation) solution.
+ElasticThought requires CoreOS to run.

-For that, check out:
+If you want to access the GPU, you will need to do extra work to get [CoreOS working with Nvidia CUDA GPU Drivers](http://tleyden.github.io/blog/2014/11/04/coreos-with-nvidia-cuda-gpu-drivers/)

-* [ParameterServer](http://parameterserver.org/)
-* [Caffe Issue 876](https://github.com/BVLC/caffe/issues/876)

-## Kick things off: Aws
+## Installing elastic-thought on AWS (Production mode)

+It should be possible to install elastic-thought anywhere that CoreOS is supported. Currently, there are instructions for AWS and Vagrant (below).

### Launch EC2 instances via CloudFormation script

@@ -83,6 +84,16 @@ For that, check out:
* Choose 3 node cluster with m3.medium or g2.2xlarge (GPU case) instance type
* All other values should be default

### Verify CoreOS cluster

Run:

```
$ fleetctl list-machines
```

This should show all the CoreOS machines in your cluster. (It uses etcd under the hood, so it also validates that etcd is set up correctly.)
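If you want to script this check, you can count the machines in the output. The listing below is a stand-in pasted from the expected `fleetctl list-machines` output later in this README:

```shell
# Stand-in output for `fleetctl list-machines` (machine IDs and IPs
# taken from the expected output shown later in this document).
output='MACHINE		IP		METADATA
ce0fec18...	172.17.8.102	-
d6402b24...	172.17.8.101	-'
# Skip the header row and count the remaining machine rows.
num_machines=$(printf '%s\n' "$output" | tail -n +2 | wc -l)
echo "$num_machines"
```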

### Kick off ElasticThought

Ssh into one of the machines (doesn't matter which): `ssh -A core@ec2-54-160-96-153.compute-1.amazonaws.com`
@@ -99,30 +110,62 @@ It should look like this:

```
UNIT MACHINE ACTIVE SUB
cbfs_announce@1.service 2340c553.../10.225.17.229 active running
cbfs_announce@2.service fbd4562e.../10.182.197.145 active running
cbfs_announce@3.service 0f5e2e11.../10.168.212.210 active running
cbfs_node@1.service 2340c553.../10.225.17.229 active running
cbfs_node@2.service fbd4562e.../10.182.197.145 active running
cbfs_node@3.service 0f5e2e11.../10.168.212.210 active running
couchbase_bootstrap_node.service 0f5e2e11.../10.168.212.210 active running
couchbase_bootstrap_node_announce.service 0f5e2e11.../10.168.212.210 active running
couchbase_node.1.service 2340c553.../10.225.17.229 active running
couchbase_node.2.service fbd4562e.../10.182.197.145 active running
elastic_thought_gpu@1.service 2340c553.../10.225.17.229 active running
elastic_thought_gpu@2.service fbd4562e.../10.182.197.145 active running
elastic_thought_gpu@3.service 0f5e2e11.../10.168.212.210 active running
sync_gw_announce@1.service 2340c553.../10.225.17.229 active running
sync_gw_announce@2.service fbd4562e.../10.182.197.145 active running
sync_gw_announce@3.service 0f5e2e11.../10.168.212.210 active running
sync_gw_node@1.service 2340c553.../10.225.17.229 active running
sync_gw_node@2.service fbd4562e.../10.182.197.145 active running
sync_gw_node@3.service 0f5e2e11.../10.168.212.210 active running
```
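A deploy script could fail fast if any unit has not yet reached `running`; a minimal sketch over a stand-in listing (unit names taken from the output above):

```shell
# Stand-in fleet unit listing: third column is the SUB state.
listing='cbfs_node@1.service	active	running
sync_gw_node@1.service	active	running
elastic_thought_gpu@1.service	active	running'
# Count units whose SUB state is anything other than "running".
not_running=$(printf '%s\n' "$listing" | awk '$3 != "running"' | wc -l)
echo "$not_running"
```

A non-zero count would mean the cluster is not ready yet.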

At this point you should be able to access the [REST API](http://docs.elasticthought.apiary.io/) on the public IP of any of the three Sync Gateway machines.

-## Kick things off: Vagrant
+## Installing elastic-thought on a single CoreOS host (Development mode)

If you are on OSX, you'll first need to install Vagrant, VirtualBox, and CoreOS. See [CoreOS on Vagrant](https://coreos.com/docs/running-coreos/platforms/vagrant/) for instructions.

Here's what will be created:

┌─────────────────────────────────────────────────────────┐
│ CoreOS Host │
│ ┌──────────────────────────┐ ┌─────────────────────┐ │
│ │ Docker Container │ │ Docker Container │ │
│ │ ┌───────────────────┐ │ │ ┌────────────┐ │ │
│ │ │ Elastic Thought │ │ │ │Sync Gateway│ │ │
│ │ │ Server │ │ │ │ Database │ │ │
│ │ │ ┌───────────┐ │ │ │ │ │ │ │
│ │ │ │In-process │ │◀─┼──┼───▶│ │ │ │
│ │ │ │ Caffe │ │ │ │ │ │ │ │
│ │ │ │ worker │ │ │ │ │ │ │ │
│ │ │ └───────────┘ │ │ │ └────────────┘ │ │
│ │ └───────────────────┘ │ └─────────────────────┘ │
│ └──────────────────────────┘ │
└─────────────────────────────────────────────────────────┘

```
$ vagrant ssh core-01
$ docker run --name sync-gateway -P couchbase/sync-gateway sync-gw-start -c feature/forestdb_bucket -g https://fixme.com
$ docker run --name elastic-thought -P --link sync-gateway:sync-gateway tleyden5iwx/elastic-thought-cpu-develop bash -c 'refresh-elastic-thought; elastic-thought'
```


## Installing elastic-thought on Vagrant

### Update Vagrant

@@ -133,71 +176,103 @@

```
$ vagrant -v
1.7.1
```

### Install CoreOS on Vagrant

Clone the coreos-vagrant fork that has been customized for running ElasticThought (see https://coreos.com/docs/running-coreos/platforms/vagrant/ for general background on running CoreOS under Vagrant):

```
$ cd ~/Vagrant
$ git clone git@github.com:tleyden/coreos-vagrant.git
$ cd coreos-vagrant
$ cp config.rb.sample config.rb
$ cp user-data.sample user-data
```

By default this will run a **two node** cluster. If you want to change this, update the `$num_instances` variable in the `config.rb` file.

### Update cloud-config

Open the user-data file, and add:

```
write_files:
  - path: /etc/systemd/system/docker.service.d/increase-ulimit.conf
    owner: core:core
    permissions: 0644
    content: |
      [Service]
      LimitMEMLOCK=infinity
  - path: /var/lib/couchbase/data/.README
    owner: core:core
    permissions: 0644
    content: |
      Couchbase Data files are stored here
  - path: /var/lib/couchbase/index/.README
    owner: core:core
    permissions: 0644
    content: |
      Couchbase Index files are stored here
  - path: /var/lib/cbfs/data/.README
    owner: core:core
    permissions: 0644
    content: |
      CBFS files are stored here
```

### Run CoreOS

```
$ vagrant up
```
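To script the `$num_instances` change mentioned above, something like the following could work — the file contents here are a stand-in for the real `config.rb`, so treat the pattern as illustrative:

```shell
# Hypothetical: bump the cluster size in config.rb with sed.
# A temp file stands in for the real coreos-vagrant config.rb.
cfg=$(mktemp)
printf '$num_instances=2\n' > "$cfg"
sed 's/^\$num_instances=.*/$num_instances=3/' "$cfg" > "$cfg.new"
cat "$cfg.new"
```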

Ssh in:

```
$ vagrant ssh core-01 -- -A
```

If you see:

```
Failed Units: 1
  user-cloudinit@var-lib-coreos\x2dvagrant-vagrantfile\x2duser\x2ddata.service
```

Jump to **Workaround CoreOS + Vagrant issues** below.

Verify things started up correctly:

```
core@core-01 ~ $ fleetctl list-machines
```

If you get errors like:

```
2015/03/26 16:58:50 INFO client.go:291: Failed getting response from http://127.0.0.1:4001/: dial tcp 127.0.0.1:4001: connection refused
2015/03/26 16:58:50 ERROR client.go:213: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 100ms
```

Jump to **Workaround CoreOS + Vagrant issues** below.

### Increase RAM size of VMs

Couchbase Server wants a lot of RAM. Bump up the vm memory size to 2GB by editing your Vagrantfile:

```
$vb_memory = 2048
```

### Setup port forwarding for Couchbase UI (optional)

This is only needed if you want to be able to connect to the Couchbase web UI from a browser on your host OS (ie, OSX).

Add the following snippet to your Vagrantfile:

```
if i == 1
  # create a port forward mapping to view couchbase web ui
  config.vm.network "forwarded_port", guest: 8091, host: 5091
end
```

### Disable Transparent Huge Pages (optional)

Not sure how crucial this is, but I'll mention it just in case. After the CoreOS machines start up, ssh into each one and run:

```
$ sudo bash
# echo never > /sys/kernel/mm/transparent_hugepage/enabled && echo never > /sys/kernel/mm/transparent_hugepage/defrag
```

### Workaround CoreOS + Vagrant issues

First exit out of CoreOS:

```
core@core-01 ~ $ exit
```

On your OSX workstation, try the following workaround:

```
$ sed -i '' 's/420/0644/' user-data
$ sed -i '' 's/484/0744/' user-data
$ vagrant reload --provision
```
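Why `420` and `484`? The cloudinit bug appears to leave the file modes serialized as decimal integers: decimal 420 is octal 0644, and decimal 484 is octal 0744, which is exactly what the `sed` commands above restore. You can check the correspondence yourself:

```shell
# Decimal-to-octal correspondence behind the sed workaround:
# 420 (decimal) == 0644 (octal), 484 (decimal) == 0744 (octal).
printf '%o\n' 420
printf '%o\n' 484
```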

Ssh back in:

```
$ vagrant ssh core-01 -- -A
```

Verify it worked:

```
core@core-01 ~ $ fleetctl list-machines
```

You should see:

```
MACHINE IP METADATA
ce0fec18... 172.17.8.102 -
d6402b24... 172.17.8.101 -
```

I filed [CoreOS cloudinit issue 328](https://github.com/coreos/coreos-cloudinit/issues/328) to figure out why this error is happening (possibly related issues: [CoreOS cloudinit issue 261](https://github.com/coreos/coreos-cloudinit/issues/261) or [CoreOS cloudinit issue 190](https://github.com/coreos/bugs/issues/190))


### Continue steps above

Scroll up to the **Installing elastic-thought on AWS** section and start with **Verify CoreOS cluster**.

## FAQ

* Is this useful for grid computing / distributed computation? **Ans**: No, this is not trying to be a grid computing (aka distributed computation) solution. You may want to check out [Caffe Issue 876](https://github.com/BVLC/caffe/issues/876) or [ParameterServer](http://parameterserver.org/)

## License

Apache 2
30 changes: 13 additions & 17 deletions docker/cpu/master/Dockerfile
@@ -4,34 +4,30 @@ FROM tleyden5iwx/caffe-cpu-master
MAINTAINER Traun Leyden tleyden@couchbase.com

ENV GOPATH /opt/go
-ENV PATH $GOPATH/bin:/usr/local/go/bin:$PATH
+ENV GOROOT /usr/local/go
+ENV PATH $PATH:$GOPATH/bin:$GOROOT/bin

# Get dependencies
RUN apt-get update && \
-    apt-get -q -y install mercurial && \
-    apt-get -q -y install make && \
-    apt-get -q -y install binutils && \
-    apt-get -q -y install bison && \
-    apt-get -q -y install build-essential
+    apt-get -q -y install \
+        mercurial \
+        make \
+        binutils \
+        bison \
+        build-essential

RUN mkdir -p $GOPATH

-# Install Go 1.3 manually (since Go 1.3 is required, and ubuntu 14.04 still uses Go 1.2)
-RUN curl -O https://storage.googleapis.com/golang/go1.3.1.linux-amd64.tar.gz && \
-    tar -C /usr/local -xzf go1.3.1.linux-amd64.tar.gz
+# Download and install Go 1.4
+RUN wget http://golang.org/dl/go1.4.2.linux-amd64.tar.gz && \
+    tar -C /usr/local -xzf go1.4.2.linux-amd64.tar.gz && \
+    rm go1.4.2.linux-amd64.tar.gz

# Add refresh script
ADD scripts/refresh-elastic-thought /usr/local/bin/
-ADD scripts/refresh-elastic-thought-refresher /usr/local/bin/

# Go get ElasticThought
-RUN go get -u -v -t github.com/tleyden/elastic-thought && \
-    go get -u -v -t github.com/tleyden/elastic-thought/cli/httpd && \
-    go get -u -v -t github.com/tleyden/elastic-thought/cli/worker && \
+RUN go get -u -v -t github.com/tleyden/elastic-thought/... && \
cd $GOPATH/src/github.com/tleyden/elastic-thought && \
git log -3

-# Copy binaries
-RUN cp /opt/go/bin/worker /usr/local/bin && \
-    cp /opt/go/bin/httpd /usr/local/bin
