This app encodes the steps of creating bcbio CWL and running a bcbio workflow on the platform. It also makes use of assets for efficient caching of the bcbio software and Docker image.
Building the app
The public bcbio_resources project on DNAnexus contains reference genomes, assets and applets used in bcbio analysis.
Setting up environment
dx login
dx select bcbio_resources
Build the bcbio Docker asset
On a Linux machine with the dx-toolkit and Docker installed:
# bcbio_resources
PROJECT=project-F541fX00f5v9vKJjJ34gvgbv
dx select $PROJECT
docker pull quay.io/bcbio/bcbio-vc
dx-docker create-asset quay.io/bcbio/bcbio-vc --output_path $PROJECT:/containers/
dx ls -l /containers | head -6
dx describe `dx ls containers/ | head -1 | cut -d ' ' -f 3`
From the last two commands, identify the latest Docker build record (record-NNN) and the Docker file reference that this record contains.
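As a convenience, the IDs can be picked out of command output with a small filter. This is only a sketch: `first_dx_id` is a hypothetical helper, not part of the dx-toolkit, and it assumes the IDs appear in the text output as `record-` or `file-` followed by letters and digits. The pattern is loose, so an object name that happens to contain `record-` or `file-` would also match.

```shell
# Hypothetical helper (not part of dx-toolkit): print the first ID of the
# given class ("record" or "file") found on standard input.
# Assumption: IDs look like <class>-<letters and digits>.
first_dx_id() {
  grep -oE "$1-[A-Za-z0-9]+" | head -n 1
}
```

For example, `dx ls -l /containers | first_dx_id record` would print the first record ID in the listing, if one is present. Check the result against the full output before copying it into dxapp.json.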
Build the bcbio-vm asset
The bcbio-vm asset installs the runtime requirements using bioconda recipes. If bcbio, bcbio-vm or dx-cwl have changed, update the recipes before rebuilding so that the asset includes the latest changes.
dx select bcbio_resources
dx build_asset bcbio-dnanexus-wrapper/bcbio-vm-asset -d bcbio_resources:/bcbio_assets
dx mv bcbio-vm-asset /bcbio_assets
dx ls -l bcbio_resources:/bcbio_assets
From the last command, note the record ID of the latest bcbio-vm-asset.
Adding the assets to the app
Update the version in dxapp.json
Add the bcbio-vm record asset ID and Docker record asset ID from above to assetDepends in dxapp.json
Add the Docker file ID to
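After editing, a quick check can confirm that the new IDs actually made it into dxapp.json. The sketch below is not part of the dx-toolkit: `check_ids_in_spec` is a hypothetical helper that does a plain text search, so it only confirms that each ID string appears somewhere in the file, not that it sits under the right key.

```shell
# Hypothetical check (not part of dx-toolkit): confirm that every given ID
# appears somewhere in the app spec file.
# Returns 0 if all IDs are found, 1 if any is missing.
check_ids_in_spec() {
  spec="$1"
  shift
  missing=0
  for id in "$@"; do
    # -F: treat the ID as a fixed string, -q: no output, only exit status
    if ! grep -qF -- "$id" "$spec"; then
      echo "missing from $spec: $id" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A call such as `check_ids_in_spec dxapp.json record-NNN`, with the real IDs substituted for the placeholder, prints any ID that is not present and returns a non-zero status.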
Building the applet
Build the bcbio applet in
dx select bcbio_resources
dx build -a -d bcbio_resources:/applets/ bcbio-dnanexus-wrapper/bcbio-run-workflow
You can run this directly for testing in the same way as versioned apps.
Release the app
To release a new version of the app, publish it and make it public:
dx build --publish --app bcbio-dnanexus-wrapper/bcbio-run-workflow -b org-az_cgr_services
dx add users app-bcbio-run-workflow PUBLIC
R&D mode and reuse existing workflow results
In an R&D or pre-production mode you generally want to test a pipeline on anywhere from a handful to hundreds of samples. There may still be bugs or issues that require changes to the underlying bcbio Docker image. In that case, you want to reuse results up to the point of failure while running the remaining steps with the modified Docker image.
Rather than use a cached asset on the platform (as described above), when running the workflow for the first time, provide this option to the app:
dx run bcbio-run-workflow -ipull_from_docker_registry=true ...
where '...' are the remaining options you would typically supply to the app. This option ensures that the compiled workflow pulls directly from the Docker registry instead of using a cached asset. This is somewhat less efficient and robust than using a cached asset, but for tens to hundreds of runs it may be preferable because it accelerates iteration for R&D purposes.
Now, if you notice a bug and subsequently modify the Docker image, you can reuse this workflow instead of compiling a new one, referring to it either by ID or by path:
dx run bcbio-run-workflow -ireuse_workflow=workflow-XXXX ...
dx run bcbio-run-workflow -ireuse_workflow=path/to/workflow-name ...
Running the app this way reuses results already computed for the workflow and uses the modified Docker image for any remaining jobs.
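The two R&D invocations differ only in one option, so they can be wrapped in a small helper. This is a sketch: `rnd_run_option` is a hypothetical name, and it assumes, as in the commands above, that a rerun passes only `-ireuse_workflow` and not `-ipull_from_docker_registry`.

```shell
# Hypothetical helper: choose the app option for an R&D iteration.
# No argument: first run, pull the image from the Docker registry.
# One argument (workflow ID or path): reuse that compiled workflow.
rnd_run_option() {
  if [ -z "${1:-}" ]; then
    printf '%s\n' "-ipull_from_docker_registry=true"
  else
    printf '%s\n' "-ireuse_workflow=$1"
  fi
}
```

A first run would then be `dx run bcbio-run-workflow $(rnd_run_option) ...`, and a rerun after changing the image `dx run bcbio-run-workflow $(rnd_run_option workflow-XXXX) ...`, where '...' again stands for the remaining app options.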