Add bio-linux packages; allow selective installation of package groups; test all packages on latest Ubuntu 10.04
1 parent d7c96be commit 7060666bd8cf5948d320e8e8c4307e22b8cbeb0d @chapmanb committed May 1, 2010
4 ec2/biolinux/README
@@ -14,7 +14,7 @@ Amazon's command line tools.
Initial set up
--------------
The first time using EC2, you'll need to install the toolkit and credentials
-for connecting. These basic directions follow:
+for connecting. Follow these basic directions:
http://docs.amazonwebservices.com/AWSEC2/latest/GettingStartedGuide/
Login to Amazon EC2 account (http://aws.amazon.com/account/) and go to
@@ -59,7 +59,7 @@ http://docs.amazonwebservices.com/AWSEC2/latest/GettingStartedGuide/running-an-i
The first step is to pick an AMI to use. For instance:
* bioperl-max (http://fortinbras.us/bioperl-max/) -- ami-1ad03273
-* ubuntu (http://alestic.com/) -- ami-6743ae0e
+* ubuntu (http://alestic.com/) -- ami-714ba518
Using this AMI, you begin an instance and ensure that it is running:
29 ec2/biolinux/config/main.yaml
@@ -0,0 +1,29 @@
+---
+# Top level configuration file that specifies which groups of programs
+# should be installed. New sections that are added to individual config
+# files should go here. Comment out any groups you don't want to have
+# installed.
+packages:
+ - programming
+ - amazon
+ - python
+ - ruby
+ - r
+ - perl
+ - java
+ - erlang
+ - haskell
+ - databases
+ - math
+ - viz
+ - web
+ - bio_general
+ - bio_search
+ - bio_alignment
+ - bio_nextgen
+ - bio_sequencing
+ - bio_annotation
+ - bio_microarray
+ - bio_visualization
+ - bio_utils
+ - phylogeny
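
The group names listed above are matched against the top-level keys of packages.yaml, so a misspelled name would select nothing rather than raise an error. A minimal sketch of a sanity check, written for Python 3 with plain dictionaries standing in for the parsed YAML. The helper name `unknown_groups` and the sample data are illustrative assumptions, not part of this commit:

```python
def unknown_groups(selected, package_tree):
    """Return the selected group names that are not top-level keys of the tree."""
    return sorted(set(selected) - set(package_tree))


# Stand-ins for the parsed main.yaml and packages.yaml contents.
selected = ["programming", "python", "bio_serach"]  # deliberate typo
package_tree = {
    "programming": {"build": ["swig"]},
    "python": ["python-yaml"],
    "bio_search": ["blast2", "hmmer"],
}

print(unknown_groups(selected, package_tree))
```

Running a check like this before the install starts would surface a typo in main.yaml early instead of producing a silently smaller package list.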
174 ec2/biolinux/config/packages.yaml
@@ -9,11 +9,15 @@
#
# http://fortinbras.us/bioperl-max/
#
+# and biolinux:
+#
+# http://www.jcvi.org/cms/research/projects/jcvi-cloud-biolinux/included-software/
+#
# Package names are the terminal symbols in the tree (the text on any
# line which begins with whitespace followed by a hyphen and a space).
#
# The package list is organized taxonomically so that parts of it can
-# be selectively installed/ignored.
+# be selectively installed/ignored. See main.yaml for top level configuration.
programming:
editors:
- emacs
@@ -29,12 +33,12 @@ programming:
     - exuberant-ctags
   build:
     - swig
-utils:
-  - tree
-lang:
-  - aspell
-  - dictionaries-common
-  - libaspell-dev
+  util:
+    - tree
+  lang:
+    - aspell
+    - dictionaries-common
+    - libaspell-dev
amazon:
- s3cmd
- ec2-ami-tools
@@ -79,11 +83,11 @@ python:
- python-biggles
# FIXME please organize this huge list better!
misc:
+ - python-beautifulsoup
- python-constraint
- python-cxx
- python-cxx-dev
- python-excelerator
- - python-extclass
- python-extended-threading
- python-fuse
- python-gadfly
@@ -96,20 +100,14 @@ python:
- python-irclib
- python-jaxml
- python-json
- - python-ll-core
- python-matplotlib
- python-matplotlib-data
- python-mode
- python-networkx
- python-nose
- - python-numarray
- - python-numarray-ext
- - python-numeric-tutorial
- python-numpy-ext
- python-plplot
- - python-pmock
- python-pprocess
- - python-processing
- python-psycopg2
- python-pycha
- python-pychart
@@ -135,13 +133,11 @@ python:
- python-sqlrelay
- python-statgrab
- python-stats
- - python-syck
- python-symeig
- python-sympy
- python-testresources
- python-visual
- python-xlrd
- - python-xml
- python-yaml
- python-yapgvb
- python-lxml
@@ -180,8 +176,11 @@ java:
- sun-java6-bin
- sun-java6-jre
- sun-java6-jdk
+ - openjdk-6-jdk
+ - openjdk-6-jre
- ant
- libbiojava-java
+ - eclipse
erlang:
- erlang
- erlang-base
@@ -209,42 +208,143 @@ databases:
postgres:
- postgresql
- postgresql-client
- - postgresql-plpython-8.3
- - postgresql-plperl-8.3
+ - postgresql-plpython-8.4
+ - postgresql-plperl-8.4
math:
- prover9
- octave3.0
viz:
- x11-apps
- - mayavi
+ - mayavi2
- mtasc # for modest maps
- graphviz
- libgraphviz-dev
web:
- apache2
-bioinformatics:
+bio_general:
+ - emboss
+ - emboss-data
+ - emboss-lib
+ - primer3
+ - readseq
+ - bio-linux-taverna
+bio_search:
- blast2
+ - hmmer
- ncbi-tools-bin
+ - bio-linux-blast+
+ - bio-linux-blixem
+ - bio-linux-fasta
+ - bio-linux-mspcrunch
+ - bio-linux-mview
+ - bio-linux-nrdb
+bio_alignment:
- clustalw
- - t-coffee
+ - exonerate
- mafft
+ - muscle
+ - mummer
- probcons
- - emboss
- - emboss-data
- - emboss-lib
- - exonerate
- - hmmer
+ - t-coffee
+ - seaview
+ - bio-linux-clustal
+ - bio-linux-dotter
+ - bio-linux-jalview
+ - bio-linux-pfaat
+ - bio-linux-prank
+ - bio-linux-squint
+ - bio-linux-wise2
+bio_nextgen:
- maq
- - muscle
- - phylip
- - phyml
- - primer3
- samtools
+bio_sequencing:
+ - bio-linux-assembly-conversion-tools
+ - bio-linux-cap3
+ - bio-linux-dust
+ - bio-linux-mira
+ - bio-linux-mira-3rd-party
+ - bio-linux-msatfinder
+ - bio-linux-staden
+ - bio-linux-stars
+ - bio-linux-trace2dbest
+bio_annotation:
+ - mcl
- tigr-glimmer
-# ToDo, from BioPerl max
-# bwa
-# hyphy
-# bioperl-db
-# more perl modules
-# Add Bio-linux packages
-# http://www.jcvi.org/cms/research/projects/jcvi-cloud-biolinux/included-software/
+ - bio-linux-act
+ - bio-linux-artemis
+ - bio-linux-big-blast
+ - bio-linux-cd-hit
+ - bio-linux-estscan
+ - bio-linux-glimmer3
+ - bio-linux-partigene
+ - bio-linux-priam
+ - bio-linux-prot4est
+ - bio-linux-rbs-finder
+ - bio-linux-transterm-hp
+ - bio-linux-trnascan
+ - bio-linux-yamap
+ - bio-linux-tetra
+bio_microarray:
+ - bio-linux-ocount
+ - bio-linux-oligoarray
+ - bio-linux-oligoarrayaux
+bio_visualization:
+ - rasmol
+ - bio-linux-clcworkbench
+ - bio-linux-maxd
+bio_utils:
+ - bio-linux-das-prep
+ - bio-linux-exchanger
+ - bio-linux-genquery
+ - bio-linux-handlebar
+ - bio-linux-keyring
+ - bio-linux-base-directories
+ - bio-linux-backups
+ - bio-linux-bldp-files
+ - bio-linux-pedro
+ - bio-linux-envbase-for-pedro
+ - bio-linux-sampledata
+ - bio-linux-sequin
+ - bio-linux-taxinspector
+ - bio-linux-themes
+ - bio-linux-themes-v5
+ - bio-linux-usb-maker
+phylogeny:
+ - phylip
+ - phyml
+ - mrbayes
+ - njplot
+ - tree-puzzle
+ - bio-linux-coalesce
+ - bio-linux-dendroscope
+ - bio-linux-fastDNAml
+ - bio-linux-fluctuate
+ - bio-linux-forester
+ - bio-linux-happy
+ - bio-linux-mesquite
+ - bio-linux-migrate
+ - bio-linux-mrbayes-multi
+ - bio-linux-mothur
+ - bio-linux-paml
+ - bio-linux-omegamap
+ - bio-linux-qtlcart
+ - bio-linux-recombine
+ - bio-linux-splitstree
+ - bio-linux-treeview
+# To Add
+# from bioperl-max:
+# bwa
+# hyphy
+# bioperl-db
+# more perl modules
+# from bio-linux:
+# Celera Assembler
+# arb
+# ape
+# genespring-2
+# gsrint
+# lamarc
+# lucy
+# peptidemapper
+# pftools
+# transterm
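
The header comment describes package names as the terminal symbols of a tree. A small sketch of walking such a tree and reporting each package together with the group path it sits under, written for Python 3. The function name `iter_packages` and the sample tree are assumptions made for illustration; the real file would be parsed from YAML first:

```python
def iter_packages(node, path=()):
    """Yield (group_path, package_name) for every leaf in a nested tree."""
    if isinstance(node, dict):
        # Visit sub-groups in alphabetical order for reproducible output.
        for key in sorted(node):
            yield from iter_packages(node[key], path + (key,))
    elif isinstance(node, (list, tuple)):
        for name in node:
            yield path, name
    elif node is not None:
        # Anything other than a group, a package list or an empty group.
        raise ValueError(node)


tree = {
    "programming": {
        "build": ["swig"],
        "util": ["tree"],
        "lang": ["aspell", "dictionaries-common"],
    },
    "amazon": ["s3cmd", "ec2-ami-tools"],
}

for group_path, package in iter_packages(tree):
    print("/".join(group_path), package)
```

Keeping the path alongside each name makes it easy to see which group a package would be installed with.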
63 ec2/biolinux/fabfile.py
@@ -17,33 +17,40 @@
from fabric.contrib.files import *
import yaml
+env.config_dir = os.path.join(os.getcwd(), "config")
+
 def ec2_ubuntu_environment():
     """Setup default environmental variables for Ubuntu EC2 servers.
+
+    Works on a US EC2 server running Ubuntu 10.04 lucid. This should be pretty
+    general but should support system specific things for other platform
+    targets.
     """
     env.user = "ubuntu"
     env.sources_file = "/etc/apt/sources.list"
     env.std_sources = [
-      "deb http://us.archive.ubuntu.com/ubuntu/ karmic universe",
-      "deb http://us.archive.ubuntu.com/ubuntu/ karmic universe",
-      "deb-src http://us.archive.ubuntu.com/ubuntu/ karmic universe",
-      "deb http://us.archive.ubuntu.com/ubuntu/ karmic-updates universe",
-      "deb-src http://us.archive.ubuntu.com/ubuntu/ karmic-updates universe",
-      "deb http://us.archive.ubuntu.com/ubuntu/ karmic multiverse",
-      "deb-src http://us.archive.ubuntu.com/ubuntu/ karmic multiverse",
-      "deb http://us.archive.ubuntu.com/ubuntu/ karmic-updates multiverse",
-      "deb-src http://us.archive.ubuntu.com/ubuntu/ karmic-updates multiverse",
+      "deb http://us.archive.ubuntu.com/ubuntu/ lucid universe",
+      "deb-src http://us.archive.ubuntu.com/ubuntu/ lucid universe",
+      "deb http://us.archive.ubuntu.com/ubuntu/ lucid-updates universe",
+      "deb-src http://us.archive.ubuntu.com/ubuntu/ lucid-updates universe",
+      "deb http://us.archive.ubuntu.com/ubuntu/ lucid multiverse",
+      "deb-src http://us.archive.ubuntu.com/ubuntu/ lucid multiverse",
+      "deb http://us.archive.ubuntu.com/ubuntu/ lucid-updates multiverse",
+      "deb-src http://us.archive.ubuntu.com/ubuntu/ lucid-updates multiverse",
+      "deb http://archive.canonical.com/ lucid partner",
     ]
 def install_biolinux():
     """Main entry point for installing Biolinux on a remote server.
     """
     ec2_ubuntu_environment()
-    _apt_packages()
+    pkg_install = _read_main_config()
+    _apt_packages(pkg_install)
-def _apt_packages():
+def _apt_packages(to_install):
     """Install packages available via apt-get.
     """
-    pkg_config = os.path.join(os.getcwd(), "config", "packages.yaml")
+    pkg_config = os.path.join(env.config_dir, "packages.yaml")
     # Setup and update apt sources on the remote host
     # latest R versions and Bio-Linux. debian-med should already be there.
     sources_add = [
@@ -55,40 +62,54 @@ def _apt_packages():
         append(source, env.sources_file, use_sudo=True)
     sudo("apt-get update")
     # Retrieve packages to get and install each of them
-    packages = _yaml_to_packages(pkg_config)
-    _setup_licenses()
+    packages = _yaml_to_packages(pkg_config, to_install)
+    _setup_automation()
     for package in packages:
         sudo("apt-get -y --force-yes install %s" % package)
-def _setup_licenses():
-    """Handle automated license acceptance for things like Sun java.
+def _setup_automation():
+    """Setup the environment to be fully automated for installs.
+    Sun Java license acceptance:
     http://www.davidpashley.com/blog/debian/java-license
+
+    MySQL root password questions:
+    http://snowulf.com/archives/540-Truly-non-interactive-unattended-apt-get-install.html
     """
+    run("export DEBIAN_FRONTEND=noninteractive")
     license_info = [
-        "sun-java5-jdk shared/accepted-sun-dlj-v1-1 select true",
-        "sun-java5-jre shared/accepted-sun-dlj-v1-1 select true",
         "sun-java6-jdk shared/accepted-sun-dlj-v1-1 select true",
         "sun-java6-jre shared/accepted-sun-dlj-v1-1 select true",
         "sun-java6-bin shared/accepted-sun-dlj-v1-1 select true",
     ]
     for l in license_info:
         sudo("echo %s | /usr/bin/debconf-set-selections" % l)
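
A note on the `DEBIAN_FRONTEND` export: Fabric runs each `run`/`sudo` call in its own shell session, so a variable exported in one call is presumably gone by the time the later `apt-get` calls execute. One way around this is to carry the variable inside each command string. The sketch below (Python 3) only builds the strings; the helper names `apt_install_command` and `debconf_command` are hypothetical and the result is assumed to be passed to Fabric's `sudo()`:

```python
import shlex


def apt_install_command(package):
    """Build an apt-get command that carries DEBIAN_FRONTEND with it."""
    return ("DEBIAN_FRONTEND=noninteractive "
            "apt-get -y --force-yes install %s" % shlex.quote(package))


def debconf_command(selection):
    """Build a command that feeds one pre-seeded answer to debconf."""
    return "echo %s | /usr/bin/debconf-set-selections" % shlex.quote(selection)


print(apt_install_command("sun-java6-jdk"))
print(debconf_command("sun-java6-jdk shared/accepted-sun-dlj-v1-1 select true"))
```

Quoting the arguments keeps the multi-word debconf selection intact as a single `echo` argument when the command passes through the remote shell.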
-def _yaml_to_packages(yaml_file):
+def _yaml_to_packages(yaml_file, to_install):
     """Read a list of packages from a nested YAML configuration file.
     """
     with open(yaml_file) as in_handle:
         full_data = yaml.load(in_handle)
-    data = full_data.values()
+    # filter the data based on what we have configured to install
+    data = [(k, v) for (k,v) in full_data.iteritems() if k in to_install]
+    data.sort()
+    data = [v for (_, v) in data]
     packages = []
     while len(data) > 0:
         cur_info = data.pop(0)
         if cur_info:
             if isinstance(cur_info, (list, tuple)):
-                packages.extend(cur_info)
+                packages.extend(sorted(cur_info))
             elif isinstance(cur_info, dict):
                 data.extend(cur_info.values())
             else:
                 raise ValueError(cur_info)
     return packages
+
+def _read_main_config():
+    """Pull a list of groups to install based on our main configuration YAML.
+    """
+    yaml_file = os.path.join(env.config_dir, "main.yaml")
+    with open(yaml_file) as in_handle:
+        full_data = yaml.load(in_handle)
+    return full_data['packages']
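
The selection logic in `_yaml_to_packages` can be sketched independently of Fabric and YAML. This is a Python 3 rendering under stated assumptions: the name `select_packages` and the sample data are made up, plain dictionaries stand in for the parsed files, and nested sub-group keys are sorted here, which the commit's version leaves to dictionary ordering:

```python
from collections import deque


def select_packages(full_data, to_install):
    """Flatten the selected top-level groups into one list of package names."""
    # Keep only the requested groups, visiting them in alphabetical order.
    queue = deque(full_data[k] for k in sorted(full_data) if k in to_install)
    packages = []
    while queue:
        cur_info = queue.popleft()
        if not cur_info:
            continue  # empty group, nothing to install
        if isinstance(cur_info, (list, tuple)):
            packages.extend(sorted(cur_info))
        elif isinstance(cur_info, dict):
            # Nested sub-groups are queued behind the groups already waiting.
            queue.extend(cur_info[k] for k in sorted(cur_info))
        else:
            raise ValueError(cur_info)
    return packages


full_data = {
    "programming": {"build": ["swig"], "util": ["tree"]},
    "amazon": ["s3cmd", "ec2-ami-tools"],
    "web": ["apache2"],
}

print(select_packages(full_data, ["web", "programming", "amazon"]))
print(select_packages(full_data, ["web"]))
```

Because nested groups go to the back of the queue, packages from a flat group such as `web` are listed before those from the sub-groups of `programming`, even though `programming` sorts first.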
54 ec2/biolinux/utils/get_biolinux_packages.py
@@ -0,0 +1,54 @@
+"""Scrape the Biolinux website to retrieve a list of packages they install.
+
+http://www.jcvi.org/cms/research/projects/jcvi-cloud-biolinux/included-software
+
+This needs to run on a machine with an apt system to check for the existence of
+package names.
+"""
+import sys
+import urllib2
+import re
+import subprocess
+import StringIO
+
+from BeautifulSoup import BeautifulSoup
+
+def main():
+    url = "http://www.jcvi.org/cms/research/projects/jcvi-cloud-biolinux/included-software"
+    in_handle = urllib2.urlopen(url)
+    soup = BeautifulSoup(in_handle)
+    tables = soup.findAll("table", {"class": "contenttable"})
+    to_check = []
+    for t in tables:
+        for row in t.findAll("tr", {"class" : re.compile("tableRow.*")}):
+            for i, item in enumerate(row.findAll("p", {"class": "bodytext"})):
+                if i == 0:
+                    to_check.append(str(item.contents[0]))
+    to_check = list(set(to_check))
+    packages = [get_package(n) for n in to_check]
+    not_ported = [to_check[i] for i, p in enumerate(packages) if p is None]
+    packages = [p for p in packages if p]
+    print len(to_check), len(packages)
+    with open("biolinux-packages.txt", "w") as out_handle:
+        out_handle.write("\n".join(sorted(packages)))
+    with open("biolinux-missing.txt", "w") as out_handle:
+        out_handle.write("\n".join(sorted(not_ported)))
+
+def get_package(pname):
+    """Try and retrieve a standard or biolinux package for the package name.
+    """
+    # custom hacking for painfully general names that take forever
+    if pname in ["act", "documentation"]:
+        pname = "bio-linux-%s" % pname
+    print 'In', pname
+    cl = subprocess.Popen(["apt-cache", "search", pname], stdout=subprocess.PIPE)
+    stdout, _ = cl.communicate()
+    for line in stdout.splitlines():
+        package = line.split()[0]
+        if package == pname or package == "bio-linux-%s" % pname:
+            print 'Out', package
+            return package
+    return None
+
+if __name__ == "__main__":
+    main(*sys.argv[1:])
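
The name matching in `get_package` can be tried out without an apt system by separating the parsing from the `apt-cache` call. A minimal Python 3 sketch; the function name `match_package` and the sample output text are assumptions for illustration, shaped like the "name - description" lines that `apt-cache search` prints:

```python
def match_package(search_output, pname):
    """Return the exact or bio-linux-prefixed package name found in the output."""
    wanted = {pname, "bio-linux-%s" % pname}
    for line in search_output.splitlines():
        fields = line.split()
        # Only the first field of a line is a package name.
        if fields and fields[0] in wanted:
            return fields[0]
    return None


# Made-up search output, one package per line.
sample = (
    "muscle - multiple alignment program of protein sequences\n"
    "bio-linux-cap3 - sequence assembly program\n"
    "python-biopython - Python library for bioinformatics\n"
)

print(match_package(sample, "cap3"))
print(match_package(sample, "alignment"))
```

Matching on the first field of each line keeps words from package descriptions, such as "alignment" in the sample, from being mistaken for package names.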
