# Running BLAST from a container with Docker

**Author:** Hiren Ghosh  
**Affiliation:** Uniklinik Freiburg

---
## Introduction
We’ll run a BLAST (Basic Local Alignment Search Tool) example using a container from BioContainers. BLAST compares a query nucleotide or protein sequence against a database of known sequences, and it’s one of the most widely used tools in bioinformatics.

### Installation
To run BLAST from a Docker container, first install Docker. If you don't have it yet, download and install Docker for your operating system from the official Docker website (https://www.docker.com/get-started).
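Before moving on, it can help to confirm that the `docker` executable is actually reachable from the notebook's environment. The snippet below is a minimal sketch using only the Python standard library; the helper name `docker_available` is hypothetical and not part of Docker or this tutorial.

```python
import shutil


def docker_available(executable: str = "docker") -> bool:
    """Return True if the given executable can be found on PATH."""
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if docker_available():
        print("docker found on PATH")
    else:
        print("docker not found on PATH; install Docker first")
```

This only checks that the command exists; it does not verify that the Docker daemon is running.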


In [4]:
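# Install Docker on macOS with Homebrew (current Homebrew uses the --cask flag; the old `brew cask` subcommand was removed)
#!brew install --cask docker
#!docker version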

In [5]:
!docker images

REPOSITORY            TAG           IMAGE ID       CREATED       SIZE
ubuntu                latest        3b418d7b466a   2 weeks ago   77.8MB
biocontainers/blast   v2.2.31_cv2   5b25e08b9871   4 years ago   2.03GB
kodekloud/webapp      latest        1a45ba829f10   5 years ago   432MB


In [6]:
# Remove an image that is no longer needed (replace the ID with one shown by `docker images`)
#!docker rmi -f 0cbfc4f213a4

In [8]:
#Pull the BLAST container image: use the Docker command-line interface (CLI) to pull the image from a Docker registry.
#For example, you can pull the BioContainers BLAST image used in this tutorial with the following command:

!docker pull biocontainers/blast:v2.2.31_cv2
#!docker run biocontainers/blast:v2.2.31_cv2 blastp -help

v2.2.31_cv2: Pulling from biocontainers/blast
Digest: sha256:238717ec69830ec62a19fc05c6f70183f218a13f7678864060f0157dc63dc54f
Status: Image is up to date for biocontainers/blast:v2.2.31_cv2
docker.io/biocontainers/blast:v2.2.31_cv2


In [None]:
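#Let’s download some data to start blasting:
#!mkdir blast_example
#Use %cd rather than !cd: `!cd` runs in a subshell and does not change the notebook's working directory.
#%cd blast_example
!wget http://www.uniprot.org/uniprot/P04156.fasta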

In [None]:
#This is a human prion FASTA sequence. We’ll also need a reference database to blast against; here we use zebrafish (D. rerio) RefSeq proteins:
!curl -O ftp://ftp.ncbi.nih.gov/refseq/D_rerio/mRNA_Prot/zebrafish.1.protein.faa.gz
!gunzip zebrafish.1.protein.faa.gz

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 23.3M  100 23.3M    0     0  7618k      0  0:00:03  0:00:03 --:--:-- 7618k
zebrafish.1.protein.faa already exists -- do you wish to overwrite (y or n)? 
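
After downloading and decompressing, a quick sanity check is to count the records in the FASTA file. The following is a minimal sketch that assumes a plain-text FASTA file in which every record starts with a `>` header line; the function name `count_fasta_records` is hypothetical.

```python
def count_fasta_records(path):
    """Count sequences in a FASTA file by counting header lines that start with '>'."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for line in handle if line.startswith(">"))
```

For example, `count_fasta_records("zebrafish.1.protein.faa")` would report how many protein sequences the reference file contains, and a count of zero would suggest a failed or truncated download.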

### Mounting host directories inside the Docker container

To analyze data inside a Docker container, mount the host directory that holds the data into the container. For example, suppose you have a FASTQ file called `my_sample.fq.gz` in your current working directory and want to analyze it with software inside a container. You can mount the current working directory at `/data` inside the container.

The following command mounts the current working directory and executes a command within the container to process the data:

```bash
docker run --rm -v "$(pwd)":/data your_docker_image command_to_process_data /data/my_sample.fq.gz
```
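
The same mount pattern can be assembled programmatically, which makes it harder to mistype the `host:container` mapping. This is a sketch only: `docker_run_command` is a hypothetical helper, `/tmp/project` is a placeholder host path, and the image and tool names are the same placeholders used in the command above. It builds the argument list without running anything.

```python
import os
import shlex


def docker_run_command(image, tool_args, host_dir=None, container_dir="/data"):
    """Build a `docker run` argument list that bind-mounts host_dir at container_dir."""
    # Docker expects an absolute host path for bind mounts.
    host_dir = os.path.abspath(host_dir or os.getcwd())
    return [
        "docker", "run", "--rm",
        "-v", f"{host_dir}:{container_dir}",
        image,
        *tool_args,
    ]


cmd = docker_run_command(
    "your_docker_image",
    ["command_to_process_data", "/data/my_sample.fq.gz"],
    host_dir="/tmp/project",  # placeholder path for illustration
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` when you actually want to launch the container.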


In [None]:
# Build a protein BLAST database from the zebrafish sequences in the mounted directory
!docker run -v `pwd`:/data/ biocontainers/blast:v2.2.31_cv2 makeblastdb -in zebrafish.1.protein.faa -dbtype prot

In [None]:
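# Search the human prion protein against the zebrafish database; write tab-separated hits (format 6) with the listed columns
# Note: wget saves the query as P04156.fasta; a ".1" suffix only appears if the file is downloaded a second time
!docker run -v `pwd`:/data/ biocontainers/blast:v2.2.31_cv2 blastp -query P04156.fasta -db zebrafish.1.protein.faa -out results.txt -outfmt "6 qseqid sseqid pident length mismatch evalue"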

In [3]:
!head results.txt

sp|P04156|PRIO_HUMAN	XP_017207509.1	42.50	80	31	2e-04
sp|P04156|PRIO_HUMAN	XP_017207511.1	42.50	80	31	2e-04
sp|P04156|PRIO_HUMAN	XP_021323434.1	42.50	80	31	3e-04
sp|P04156|PRIO_HUMAN	XP_017207510.1	42.50	80	31	3e-04
sp|P04156|PRIO_HUMAN	XP_021323433.1	42.50	80	31	3e-04
sp|P04156|PRIO_HUMAN	XP_009291733.1	42.50	80	31	3e-04
sp|P04156|PRIO_HUMAN	NP_001268391.1	24.84	157	103	0.072
sp|P04156|PRIO_HUMAN	XP_009291898.1	24.84	157	103	0.075
sp|P04156|PRIO_HUMAN	XP_021323367.1	24.84	157	103	0.075
sp|P04156|PRIO_HUMAN	XP_021323366.1	24.84	157	103	0.076
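
The tab-separated output above can be loaded into Python for filtering. The sketch below assumes the six columns requested in the `-outfmt` string (`qseqid sseqid pident length mismatch evalue`), in that order. The function name `parse_blast_tabular` is hypothetical, and the E-value cutoff of `1e-3` is an arbitrary illustrative threshold, not a recommendation. Two rows from the output are embedded as sample input so the block runs on its own.

```python
import csv
import io

# Column order taken from the -outfmt string used in the blastp command.
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "evalue"]


def parse_blast_tabular(handle):
    """Parse BLAST tabular (format 6) rows into dicts with typed numeric fields."""
    hits = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        hit = dict(zip(COLUMNS, row))
        hit["pident"] = float(hit["pident"])
        hit["length"] = int(hit["length"])
        hit["mismatch"] = int(hit["mismatch"])
        hit["evalue"] = float(hit["evalue"])
        hits.append(hit)
    return hits


sample = (
    "sp|P04156|PRIO_HUMAN\tXP_017207509.1\t42.50\t80\t31\t2e-04\n"
    "sp|P04156|PRIO_HUMAN\tNP_001268391.1\t24.84\t157\t103\t0.072\n"
)
hits = parse_blast_tabular(io.StringIO(sample))

# Keep only subject IDs whose E-value is below the illustrative cutoff.
significant = [h["sseqid"] for h in hits if h["evalue"] < 1e-3]
print(significant)  # prints ['XP_017207509.1']
```

To process the real file, replace the `io.StringIO(sample)` argument with an open handle, for example `open("results.txt", newline="")`.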
