Building the QIIME AMI
Building the QIIME AMI (i.e., the QIIME Amazon Virtual Machine)
We begin with the StarCluster base x86_64 AMI. The latest StarCluster AMI can be found on the StarCluster website. For QIIME 1.9.1, we used
ami-765b3e1f. This is an Ubuntu 12.04 virtual machine.
Next, we launch an instance of that AMI (since the Amazon interface changes, we're assuming that the reader knows how to do that). We launch an
m1.large for building the instance, which provides sufficient CPU, memory, and storage for building and testing QIIME 1.9.1. Log into that instance over
Add a repository in order to obtain a newer version of R than is available by default on the system:
echo "deb http://cran.rstudio.com/bin/linux/ubuntu precise/" | sudo tee -a /etc/apt/sources.list
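Because `tee -a` appends unconditionally, re-running the step above adds the repository line a second time. The following is a minimal sketch of an idempotent append, shown on a scratch file rather than the real `/etc/apt/sources.list`; the helper name `append_once` is hypothetical and not part of QIIME or StarCluster.

```shell
# append_once LINE FILE: append LINE to FILE only if FILE does not already
# contain that exact line. (Hypothetical helper, for illustration only.)
append_once() {
    line=$1
    file=$2
    # -q: quiet, -x: match the whole line, -F: treat LINE as a fixed string
    if grep -qxF -- "$line" "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s\n' "$line" >> "$file"
}

# Demonstrate on a scratch file instead of the system package list.
demo_list=$(mktemp)
repo='deb http://cran.rstudio.com/bin/linux/ubuntu precise/'
append_once "$repo" "$demo_list"
append_once "$repo" "$demo_list"   # second call is a no-op
```

Writing to the real file would additionally require root privileges, which this sketch deliberately leaves out.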
Update your system to retrieve the latest list of packages (this step is especially important because we added a new package repository in the previous step):
sudo apt-get -y update
Remove OpenBLAS from the image. OpenBLAS would produce annoying warnings when any QIIME command was run (see QIIME's #1704 for more detail). Also remove matplotlib, which comes pre-installed via aptitude (failure to remove matplotlib results in a broken installation when installing/upgrading via pip below):
sudo apt-get -y remove libopenblas-base python-matplotlib*
Install QIIME's requisite libraries:
sudo apt-get --force-yes -y install python-dev libncurses5-dev libssl-dev libzmq-dev libgsl0-dev openjdk-6-jdk libxml2 libxslt1.1 libxslt1-dev ant git subversion build-essential zlib1g-dev libpng12-dev libfreetype6-dev mpich2 libreadline-dev gfortran unzip libmysqlclient18 libmysqlclient-dev ghc sqlite3 libsqlite3-dev libc6-i386 libbz2-dev tcl-dev tk-dev r-base r-base-dev libatlas-dev libatlas-base-dev liblapack-dev swig libhdf5-serial-dev
echo "options(warn=2); install.packages(c('randomForest', 'optparse', 'vegan', 'biom', 'ape', 'RColorBrewer'), repos='http://cran.r-project.org')" | sudo R --slave --vanilla
echo "options(warn=2); source('http://bioconductor.org/biocLite.R'); biocLite(c('metagenomeSeq', 'DESeq2'))" | sudo R --slave --vanilla
Perform a base QIIME install:
sudo easy_install -U distribute pip
sudo pip install numpy numexpr h5py ipython[all] --upgrade
sudo pip install qiime
Add a matplotlib config file to specify a non-GUI backend:
mkdir -p $HOME/.config/matplotlib
echo 'backend : agg' > $HOME/.config/matplotlib/matplotlibrc
Prepare to run qiime-deploy:
sudo mkdir /qiime_software
sudo chown $USER /qiime_software
sudo chgrp $USER /qiime_software
cd /qiime_software/
git clone git://github.com/qiime/qiime-deploy.git
git clone git://github.com/qiime/qiime-deploy-conf.git
cd qiime-deploy
Run qiime-deploy. If this fails, try re-running it, since package downloads sometimes time out:
python qiime-deploy.py /qiime_software/ -f /qiime_software/qiime-deploy-conf/qiime-1.9.1/qiime.conf --force-remove-failed-dirs --force-remove-previous-repos
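The advice to re-run the deploy after a failed download can be automated with a small retry wrapper. This is a sketch only: the `retry` function and its arguments are hypothetical, and the attempt count and delay shown in the comment are arbitrary choices, not values recommended by the QIIME developers.

```shell
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Run COMMAND until it succeeds or MAX_ATTEMPTS is reached.
# (Hypothetical helper, for illustration only.)
retry() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        echo "attempt $attempt failed; retrying in ${delay}s" >&2
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Possible use with the deploy command from this page (not executed here):
#   retry 3 30 python qiime-deploy.py /qiime_software/ \
#       -f /qiime_software/qiime-deploy-conf/qiime-1.9.1/qiime.conf \
#       --force-remove-failed-dirs --force-remove-previous-repos
```

A bounded number of attempts keeps a genuinely broken package from looping forever; a persistent failure still needs to be investigated by hand.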
Adding sourcing of the QIIME activation script to
sed -i '$ d' ~/.bashrc
echo -e '\nSOFTWARE_HOME=/qiime_software\n. $SOFTWARE_HOME/activate.sh' | sudo tee -a /etc/profile
mkdir /qiime_software/.bash_completion.d
pyqi make-bash-completion --command-config-module biom.interfaces.optparse.config --driver-name biom -o /qiime_software/.bash_completion.d/biom
echo -e '\nfor f in /qiime_software/.bash_completion.d/*;\ndo\n source $f;\ndone' | sudo tee -a /etc/profile
Modify QIIME config file's temp directory:
mkdir ~/temp
echo -e 'temp_dir\t$HOME/temp/' >> /qiime_software/qiime_config
Set up IPython Notebook server to be accessible from a web browser (modified from instructions in #1367):
ipython profile create
sed -i "s/# c.NotebookApp.ip = 'localhost'/c.NotebookApp.ip = '*'/" ~/.ipython/profile_default/ipython_notebook_config.py
sed -i "s/# c.NotebookApp.open_browser = True/c.NotebookApp.open_browser = False/" ~/.ipython/profile_default/ipython_notebook_config.py
sed -i "s/# c.NotebookApp.password = u''/c.NotebookApp.password = u'sha1:8f4908e22921:69d3122a66fba11bf1922b116fbb290c1ffec501'/" ~/.ipython/profile_default/ipython_notebook_config.py
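A `sed` substitution exits successfully even when its pattern matches nothing, so if the commented-out defaults in the generated config differ from what the commands above expect, the edit is silently skipped. The sketch below shows one way to make such an edit fail loudly. The helper `set_option` is hypothetical, and it runs against a scratch file that stands in for `ipython_notebook_config.py`.

```shell
# set_option FILE OLD NEW
# Replace the line that is exactly OLD with NEW in FILE; return non-zero if
# OLD is absent. (Hypothetical helper, for illustration only.)
set_option() {
    file=$1
    old=$2
    new=$3
    if ! grep -qxF -- "$old" "$file"; then
        echo "expected line not found in $file: $old" >&2
        return 1
    fi
    # awk compares whole lines as literal strings, so no regex escaping
    # of characters such as '.' or '*' is needed.
    OLD_LINE=$old NEW_LINE=$new awk '
        $0 == ENVIRON["OLD_LINE"] { print ENVIRON["NEW_LINE"]; next }
        { print }
    ' "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Scratch copy standing in for the real notebook config file.
demo_cfg=$(mktemp)
printf '%s\n' "# c.NotebookApp.ip = 'localhost'" \
              "# c.NotebookApp.open_browser = True" > "$demo_cfg"

set_option "$demo_cfg" "# c.NotebookApp.ip = 'localhost'" "c.NotebookApp.ip = '*'"
set_option "$demo_cfg" "# c.NotebookApp.open_browser = True" "c.NotebookApp.open_browser = False"
```

Unlike the `sed` commands, this variant matches whole lines only, which is an assumption about how the defaults appear in the generated file.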
The following text should be included as the message of the day (so it is printed when users log into a QIIME virtual machine instance). To do this, first remove the StarCluster message of the day (we instead credit StarCluster in ours):
sudo rm /etc/update-motd.d/00-starcluster
Then create a new file,
/etc/update-motd.d/00-qiime, with the following contents:
#!/bin/sh
cat<<"EOF"
   ___      _____  _____  ____    ____  ________
 .'   `.   |_   _||_   _||_   \  /   _||_   __  |
/  .-.  \    | |    | |    |   \/   |    | |_ \_|
| |   | |    | |    | |    | |\  /| |    |  _| _
\  `-'  \_  _| |_  _| |_  _| |_\/_| |_  _| |__/ |
 `.___.\__||_____||_____||_____||_____||________|
QIIME 1.9.1 AMI (derived from the StarCluster Ubuntu 12.04 AMI)
www.qiime.org
Getting help: help.qiime.org
QIIME script index: scripts.qiime.org
QIIME workshops: workshops.qiime.org
QIIME help videos: videos.qiime.org
StarCluster (building AWS-based clusters): star.mit.edu/cluster
IPython, and the IPython Notebook: ipython.org
Software Carpentry (educational resources for Linux and scientific computing): software-carpentry.org
QIIME is powered by scikit-bio: scikit-bio.org
Qiita, QIIME-powered microbiome data storage and analysis: qiita.microbio.me
biocore, collaboratively developed bioinformatics software: github.com/biocore
To print configuration and version info for QIIME and its dependencies, run: print_qiime_config.py
Current System Stats:
EOF
landscape-sysinfo | grep -iv 'graph this data'
Add execute permissions to the file since it is a script:
sudo chmod a+x /etc/update-motd.d/00-qiime
Log out and log back in.
Before finalizing the instance and creating the AMI, run print_qiime_config.py to ensure that everything is installed correctly.
This command should report only a single failure, for usearch. Be sure that no warnings are printed in the output (we saw warnings at the top of the output when testing the release candidate).
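Warnings at the top of a long report are easy to scroll past, so it can help to capture the output in a file and search it. The sketch below assumes that warnings contain the word "warning" in some letter case (as Python's `UserWarning` and `DeprecationWarning` messages do); the helper `has_warnings` is hypothetical, and the scratch file contains made-up text rather than real `print_qiime_config.py` output.

```shell
# has_warnings FILE: succeed if any line of FILE mentions "warning"
# (case-insensitive), which also matches names such as UserWarning.
# (Hypothetical helper, for illustration only.)
has_warnings() {
    grep -qi 'warning' "$1"
}

# On the instance, the output would be captured first, for example:
#   print_qiime_config.py > config-output.txt 2>&1
# Here a scratch file with made-up content stands in for that output.
demo_out=$(mktemp)
printf '%s\n' 'System information' 'QIIME library version: 1.9.1' > "$demo_out"

if has_warnings "$demo_out"; then
    echo "warnings found; inspect the output before building the AMI"
else
    echo "no warnings found"
fi
```

A clean search does not replace reading the output; it only guards against overlooking a warning that matches this assumed pattern.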
Next, run scikit-bio's test suite (this caught some installation errors while testing the release candidate):
nosetests skbio --with-doctest -s -I DONOTIGNOREANYTHING
Finally, run QIIME's full test suite:
cd ~/temp
wget https://pypi.python.org/packages/source/q/qiime/qiime-1.9.1.tar.gz
tar xvf qiime-1.9.1.tar.gz
We recommend running the tests from within a screen session, redirecting all output to a file:
screen
python qiime-1.9.1/tests/all_tests.py &> all-tests-output.txt
All tests except for those related to usearch, sfffile, sffinfo, and torque must pass.
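Separating the tolerated failures from real ones can be sketched as a simple filter. This assumes the failing tests have already been extracted into a file with one name per line; that format, the helper `unexpected_failures`, and the test names in the demo are all hypothetical, and the actual layout of `all-tests-output.txt` may differ.

```shell
# unexpected_failures FILE: print the lines of FILE that do NOT mention
# usearch, sfffile, sffinfo, or torque. Exits non-zero when nothing is
# printed, i.e. when every listed failure is one of the tolerated ones.
# (Hypothetical helper, for illustration only.)
unexpected_failures() {
    grep -Eiv 'usearch|sfffile|sffinfo|torque' "$1"
}

# Made-up list of failing tests, one per line.
demo_failures=$(mktemp)
cat > "$demo_failures" <<'EOF'
test_usearch.py
test_sffinfo.py
test_torque.py
EOF

if unexpected_failures "$demo_failures"; then
    echo "unexpected failures listed above; do not build the AMI yet"
else
    echo "only tolerated failures (usearch, sfffile, sffinfo, torque)"
fi
```

Matching on substrings is deliberately loose; a stricter check would compare against exact test names once the real output format is known.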
Remove history (this must be the final step before creating the AMI with StarCluster):
rm -rf ~/.bash_history ~/.viminfo ~/.lesshst ~/temp/*
Log out and create the AMI with starcluster ebsimage. Using StarCluster is important because it cleans up private information that is stored on the instance, such as your public key. WARNING: This will prevent you (and anyone else) from ever logging into this instance again!
starcluster ebsimage <instance ID> qiime-191
From the AWS console, terminate the instance.
Tag the new AMI and corresponding snapshot to indicate it is the QIIME 1.9.1 AMI.
Start a new instance using the AMI you just created. Follow the instructions in the QIIME AWS tutorial to start an IPython Notebook server. Make sure that you can log in to the server from a web browser and that running QIIME commands works as expected (e.g.,
From the AWS console, terminate the instance.
Test out a few QIIME commands on the cluster, including serial and parallel commands. For example, you could run the commands in the Illumina overview tutorial. Be sure to pass -aO <n> to the parallel commands (e.g., pick_open_reference_otus.py) to enable parallel job execution. Make sure that jobs are being submitted to the queue (e.g., using qstat) and that processes are being run on the worker nodes (e.g., using htop). Test submitting jobs (serial and parallel) to the queue using start_parallel_jobs_sc.py, as well as running commands directly on the master node.
Using StarCluster, terminate the cluster.
Once done testing, make the AMI public (so that it is available under Community AMIs).
Update the QIIME resources page to include the new AMI.
Notify users via email, blog post, forum post, Twitter, etc.