This is the btrfs branch, with the following change:
- The registry cannot use the winnfs share (possibly a Windows problem), so it no longer writes to the local files. As a workaround, the registry uses a mounted filesystem created in your shared file folder. The downside is that you have to choose its size by setting
VARLIBDOCKER_GB (in whole GB) in config/project.env. It turns out that any program that needs advanced access to the filesystem cannot use the NFS share, so I am thinking of making this change part of the master branch.
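As a rough illustration of the VARLIBDOCKER_GB setting mentioned above, the line in config/project.env might look like the sketch below. The value 20 is an arbitrary example, the plain KEY=VALUE form is an assumption about the .env format, and `is_whole_gb` is a hypothetical helper that is not part of the project.

```shell
# Line for config/project.env. 20 is an arbitrary example value.
VARLIBDOCKER_GB=20

# Hypothetical helper (not part of the project): succeeds only for a
# positive whole number, because the size is given in integer GB.
is_whole_gb() {
    case $1 in
        ''|*[!0-9]*|0) return 1 ;;
        *) return 0 ;;
    esac
}

is_whole_gb "$VARLIBDOCKER_GB" || echo "VARLIBDOCKER_GB must be a whole number" >&2
```

A value such as 2.5 would be rejected by this check, so round up when in doubt.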
CoreOS-based personal compute cloud
formal writeup in appendix A of thesis
git clone --recursive https://github.com/majidaldo/personal-compute-cloud.git
Because scientific computing (some explanation). Briefly, the goal is to cater to a workflow that starts with local development and seamlessly brings in more compute power on demand.
What it Does
Two types of machines are started to support the scientific computing workflow (using Docker). There is a local virtualized controller machine (called init) providing coordination and services, and there are compute machines that are more ephemeral. A local compute machine is brought up for 'development'. When a remote compute machine is acquired, it uses the same (Ansible) setup script. Therefore, the local compute machine is really a stand-in for a remote machine.
The controller and compute machines together provide:
- global network addressing of docker containers across clouds (thanks to weave)
- private docker registry accessible on all compute hosts (started on boot). The images in the registry persist across instantiations of the machines because they are stored on the local file system.
- automatic building of Dockerfiles and pushing them to the registry (on boot)
- global NFS file share, so no messing with sending and receiving files (it functions, though not properly, but seems fine for working with code)
- automatic configuration of ssh access
- CUDA installation (if machine has NVIDIA gpu)
- Saving of compute machine state in EC2 or Vagrant for quick resumption of work.
- Linux: duh. windows users can use (plain)
cygwin. but i prefer
- Ansible: tested with 1.9. works on windows with cygwin with
setup/cygwin/install-myansible.sh. But as of 8/'15, you'll have to get my version of Ansible even on Linux until this gets figured out.
- Vagrant: windows users should install vagrant-winnfsd (see
setup/install-vagrant.bat). Kill the winnfs.exe process if you have NFS mounting issues.
Project-level variables are located in
.env files in the
config/ folder. CoreOS-specific variables are in
config/coreos. Ansible-specific variables are in their appropriate Ansible best practice location in
ansible/. There is no immediate need for changing these variables as I tried to make everything as automatic and reasonable as possible.
Exceptions: You may want to remove the line
control_path = /tmp in
ansible/ansible.cfg as it is a cygwin hack. Also, NFS mount options can be overridden by specifying them in
config/coreos/global.env if you are having trouble with NFS mounting (an attempt is made to automatically set them). On a related note,
config/coreos/init.env is hard-coded to correspond with
ansible/library/vagrant. Change as needed.
So all you have to do is add your Dockerfiles in the
docker/ folder like
docker/999-mybusybox. The build script only builds folders whose names start with a number followed by a hyphen, and it builds them in order. Make use of this behavior to satisfy Docker image dependencies.
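The folder-naming rule above can be sketched as a small shell function. This is not the project's build script: `list_build_dirs` is a hypothetical name, and the sketch assumes that "in order" means ascending order of the numeric prefix.

```shell
#!/bin/sh
# Print the sub-folders of $1 whose names are "<digits>-<anything>",
# sorted by the numeric prefix. Assumption: the real build script may
# order folders differently (for example, lexically).
list_build_dirs() {
    for dir in "$1"/*/; do
        [ -d "$dir" ] || continue
        name=$(basename "$dir")
        case $name in
            [0-9]*-*) ;;            # starts with a digit and has a hyphen
            *) continue ;;
        esac
        prefix=${name%%-*}          # text before the first hyphen
        case $prefix in
            *[!0-9]*) continue ;;   # prefix must consist of digits only
        esac
        printf '%s\n' "$name"
    done | sort -t - -k 1,1n
}

# Usage: show what would be built, in order.
for name in $(list_build_dirs docker); do
    echo "would build docker/$name"
done
```

With folders named 2-base, 10-app and notes, this lists 2-base before 10-app and skips notes, so a base image can be given a lower number than the images that depend on it.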
Run setup/setup.sh from within its directory.
cd ansible. Start the init machine:
ansible-playbook init.yml. Now you can
Then acquire the machines with the provided Ansible playbooks, using any of the following providers.
EC2 (suggested method)
Set up your EC2 account. Add the following, substituting your credentials, to
ansible-playbook ec2.yml. To get a GPU machine:
ansible-playbook ec2.yml -e type=gpu.
Compute Machine Setup
After getting the machines, set them up:
ansible-playbook setup.yml -e hosts=ansiblepattern.
ansiblepattern is usually going to be the provider name. You can also use any of the groups defined in
After setup you can
ssh ec2hostname or
ssh vagrant because hosts are automatically added to
~/.ssh/config. Furthermore, hosts are aliased with a prefix made of a group name followed by a hyphen. So,
ssh cpu-vagrant or
ssh ec2-someec2host will work, since there are groups for providers (e.g. vagrant or EC2) and compute type (cpu or gpu). EC2 machines have more groups than the ones defined in the hosts file, such as instance type and instance ID. (Depending on your shell, you might be able to just hit tab after partially issuing the
ssh command to complete the command.)
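The provisioning and setup commands above can be chained in a small wrapper. The sketch below only uses the playbook invocations quoted in this README; `run` and `bring_up` are hypothetical helper names, and passing the provider name as the `hosts` pattern is an assumption based on the remark that the pattern is usually the provider name. It prints the commands by default; set DRY_RUN=0 and run it from the ansible directory to execute them.

```shell
#!/bin/sh
# Print (DRY_RUN=1, the default here) or execute (DRY_RUN=0) one command.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Acquire and then set up compute machines for one provider.
# $1: provider playbook name without .yml (ec2 in the text above)
# $2: optional compute type, e.g. gpu
bring_up() {
    provider=$1
    kind=${2:-}
    if [ -n "$kind" ]; then
        run ansible-playbook "$provider.yml" -e "type=$kind"
    else
        run ansible-playbook "$provider.yml"
    fi
    # Assumption: the provider name doubles as the Ansible host pattern.
    run ansible-playbook setup.yml -e "hosts=$provider"
}

bring_up ec2 gpu
```

Keeping the dry-run default makes it easy to check what would be launched before paying for a remote machine.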
- Shortcut local machine setup: ansible/all-local.sh. Sets up init machine and a (local) vagrant compute machine.
ansible/destroy-acomputeprovider.sh to decommission its hosts.
$REGISTRY_HOST is a variable on all machines to access the private docker registry like
docker pull $REGISTRY_HOST/mybusybox. See note about setting up your dockerfiles in the Setup section.
- Use the build script
docker/build.sh to iterate on your dockerfiles.
- Make use of
- Make use of the file share on
vagrant commands on the local machine.
cuda docker image to build your CUDA application.
- Clean out your old hosts by removing entries in the directory
~/.ssh/config file (just delete them if you're feeling brave. todo: automate this)
ansible/save-aprovider.sh to save the state of its machines. Resume by running the corresponding provisioning and setup programs.
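The $REGISTRY_HOST tip in the list above can be wrapped so that a missing variable fails early instead of producing a malformed image name. This is a minimal sketch: `registry_ref` and `pull_from_registry` are hypothetical helpers, not part of the project, and only the `docker pull $REGISTRY_HOST/mybusybox` form comes from this README.

```shell
#!/bin/sh
# Build "<registry>/<image>" from $REGISTRY_HOST; fail if it is unset.
registry_ref() {
    if [ -z "${REGISTRY_HOST:-}" ]; then
        echo "REGISTRY_HOST is not set on this machine" >&2
        return 1
    fi
    printf '%s/%s\n' "$REGISTRY_HOST" "$1"
}

# Pull an image from the private registry, e.g. pull_from_registry mybusybox
pull_from_registry() {
    ref=$(registry_ref "$1") || return 1
    docker pull "$ref"
}
```

On a compute machine, `pull_from_registry mybusybox` would then be equivalent to the `docker pull` command shown in the tip.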
- No claims are made as to the security (or lack thereof) of this setup. Convenience (in the form of simplicity and automation) takes priority over security measures.
- fleet and etcd, part of CoreOS, have been disabled. I don't see a use for them for the intended workflow.
- Given hardware-assisted virtualization (enabled in VirtualBox), performance should be close to bare-metal performance. Unfortunately, GPU passthrough (for the local compute machine) is not a simple matter (help!).