Personal Cancer Genome Reporter deployment recipes
Cancer reporting systems require prepopulating several gigabytes of genomic reference data and provisioning all the software pieces, Docker containers and configuration.
PCGR eases that, and pcgr-deploy simplifies it further.
This Ansible playbook contains tasks to deploy PCGR into Amazon and OpenStack clouds, with HPC-specific tasks (mainly NFS mounting) added as a module.
Quickstart
Edit the roles section of ansible/site.yml according to your needs (are you an HPC or an AWS user?).
The following lines will install the deployment modules, deploy PCGR and run its built-in example as a validation:
python3 -m venv venv && source venv/bin/activate && pip install ansible
ansible-playbook aws.yaml -e 'ansible_python_interpreter=/usr/bin/python3'
ssh ubuntu@<AWS INSTANCE>
cd /mnt/pcgr
./pcgr.py --input_vcf examples/tumor_sample.COAD.vcf.gz --input_cna examples/tumor_sample.COAD.cna.tsv /mnt/pcgr-* output tumor_sample.COAD
Amazon or OpenStack or HPC?
This playbook supports all of them; it has been tested on the Tenjin private cloud of the Australian NCI supercomputing centre.
The only changes needed are in ansible/group_vars/all, as mentioned in the Quickstart, and in site.yml, which must be rearranged so that it includes the hpc role after databundle. Running the playbook in the following way should then deploy PCGR in your (OpenStack?) VM:
ansible-playbook site.yml -e 'ansible_python_interpreter=/usr/bin/python3' -i <YOUR CLUSTER IP/HOSTNAME>,
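The trailing comma after the host in the command above matters: it tells Ansible to treat the -i value as a literal host list and not as a path to an inventory file. The following is a minimal sketch of building that command from a list of hosts. The helper names inline_inventory and playbook_command are illustrative and not part of pcgr-deploy, and 203.0.113.10 is a placeholder documentation address.

```python
import shlex


def inline_inventory(hosts):
    """Join hosts into an inline inventory string.

    The trailing comma marks the value as a host list, so Ansible does not
    look for an inventory file with that name.
    """
    cleaned = [h.strip() for h in hosts if h.strip()]
    if not cleaned:
        raise ValueError("at least one host is required")
    return ",".join(cleaned) + ","


def playbook_command(hosts, playbook="site.yml", interpreter="/usr/bin/python3"):
    """Build the ansible-playbook argument list used in this README."""
    return [
        "ansible-playbook", playbook,
        "-e", f"ansible_python_interpreter={interpreter}",
        "-i", inline_inventory(hosts),
    ]


if __name__ == "__main__":
    # Placeholder host; replace it with your cluster IP or hostname.
    print(shlex.join(playbook_command(["203.0.113.10"])))
```

The resulting list can be passed directly to subprocess.run, which avoids shell quoting issues with the -e argument.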
Alternatively, if you have python3 already installed in your virtual environment, instantiating and deploying to OpenStack is as easy as:
Assuming you are employed by the University of Melbourne and running on Tenjin, that's all you need to do ;)
(Optional) Amazon: Saving money with Spot instances
The following script, included in ansible, queries AWS's spot price history and determines whether the instance we are asking for will be available. For instance, running the script with an asking price of 0.08 for c4.large gives us:
python ~/bin/get_spot_duration.py \
    --region ap-southeast-2 \
    --product-description 'Linux/UNIX' \
    --bids c4.large:0.08
That is up to 168 hours of uptime at that particular asking price (for c4.large in ap-southeast-2c), which is ~87% savings at the time of writing this:
$ ./get_spot_duration.sh
Duration  Instance Type  Availability Zone
168.0     c4.large       ap-southeast-2c
108.2     c4.large       ap-southeast-2a
15.7      c4.large       ap-southeast-2b
This is an open-ended experiment for now; there are some errors that still need attention.
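The duration report shown above can also be consumed programmatically, for example to pick the availability zone with the longest expected uptime. This is a minimal sketch, assuming the three-column layout shown in the sample output (duration in hours, instance type, availability zone). The function names parse_spot_durations and best_zone are illustrative and not part of pcgr-deploy.

```python
def parse_spot_durations(report):
    """Parse the whitespace-separated report into (hours, type, zone) tuples.

    Assumes each data row has exactly three fields; the header and any
    other lines are skipped.
    """
    rows = []
    for line in report.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # header, prompt line or blank line
        try:
            hours = float(fields[0])
        except ValueError:
            continue  # first field is not a duration
        rows.append((hours, fields[1], fields[2]))
    return rows


def best_zone(report):
    """Return the row with the longest duration, or None if there is none."""
    rows = parse_spot_durations(report)
    if not rows:
        return None
    return max(rows, key=lambda row: row[0])


# Sample copied from the output shown in this README.
SAMPLE = """\
Duration Instance Type Availability Zone
168.0 c4.large ap-southeast-2c
108.2 c4.large ap-southeast-2a
15.7 c4.large ap-southeast-2b
"""

if __name__ == "__main__":
    print(best_zone(SAMPLE))  # (168.0, 'c4.large', 'ap-southeast-2c')
```

Selecting on duration alone is a simplification; the report only reflects the price history at the time the script was run.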
The error "ERROR: package is not a legal parameter in an Ansible task or handler" is a symptom of an Ansible version that is too old (probably 1.9.x). You need Ansible >=2.x to deploy this.
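To catch this before running the playbook, the installed version can be checked up front. The following is a minimal sketch, assuming the first line of the `ansible --version` output contains the version number (for example "ansible 1.9.4" or "ansible [core 2.12.1]"). The function names are illustrative and not part of pcgr-deploy.

```python
import re
import subprocess


def ansible_major_version(version_output):
    """Extract the major version from the text printed by `ansible --version`.

    Assumes the first non-empty line carries a dotted version number.
    Returns None if no version can be found.
    """
    lines = version_output.strip().splitlines()
    if not lines:
        return None
    match = re.search(r"(\d+)\.(\d+)", lines[0])
    return int(match.group(1)) if match else None


def is_supported(version_output, minimum_major=2):
    """True if the reported Ansible major version is at least minimum_major."""
    major = ansible_major_version(version_output)
    return major is not None and major >= minimum_major


def installed_ansible_output():
    """Return the output of `ansible --version`, or "" if it is not installed."""
    try:
        result = subprocess.run(
            ["ansible", "--version"], capture_output=True, text=True
        )
    except FileNotFoundError:
        return ""
    return result.stdout


if __name__ == "__main__":
    # Sample strings only; call installed_ansible_output() for the real value.
    print(is_supported("ansible 1.9.4"))
    print(is_supported("ansible 2.9.6"))
```

If the check fails, upgrading Ansible inside the virtual environment from the Quickstart (pip install --upgrade ansible) is usually the simplest fix.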