- Version: 0.9.3
- Maintainer: Michael Barton mail@michaelbarton.me.uk
- Outline
- Inputs
- General Definition
- Description
- Mounts
- fastq
- fragment_size
- General Definition
- Outputs
- fasta
- Signature
- Example
This specification describes the interface for containerised short-read genome assemblers. A genome assembler converts one or more FASTQ files of DNA short reads into larger contiguous regions of DNA ('contigs'). In addition to the specifications described below, this container MUST implement the specifications defined in 'Generic bioinformatics container'.
A biobox requires an input YAML file that follows the definition below and is valid according to this schema.
---
version: NUMBER.NUMBER.NUMBER
arguments:
  - fastq: LIST
  - fragment_size: LIST
- version: The current version is specified directly under the heading.
- arguments: The arguments field consists of the following fields:
  - fastq
  - fragment_size
You can find a definition for every field below this section.
- The .yaml MUST be mounted to /bbx/input/biobox.yaml.
- Your output directory MUST be mounted to /bbx/output.
- Your input files MUST be mounted to /bbx/input.
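The three mounts above can be expressed as Docker volume flags. The following is a minimal sketch, assuming the container is started with `docker run`. The function name `volume_args`, the host paths, and the choice to mount the inputs read-only are illustrative assumptions and are not part of this specification.

```python
# Container-side mount points taken from the list above.
INPUT_YAML = "/bbx/input/biobox.yaml"
INPUT_DIR = "/bbx/input"
OUTPUT_DIR = "/bbx/output"


def volume_args(host_yaml, host_input_dir, host_output_dir):
    """Return `docker run` volume flags for the three required mounts.

    All host paths are supplied by the caller (hypothetical here).
    Mounting inputs read-only (':ro') is an assumption, not a requirement.
    """
    return [
        f"--volume={host_input_dir}:{INPUT_DIR}:ro",
        f"--volume={host_yaml}:{INPUT_YAML}:ro",
        f"--volume={host_output_dir}:{OUTPUT_DIR}:rw",
    ]
```

For example, `volume_args("/tmp/biobox.yaml", "/tmp/reads", "/tmp/out")` yields three flags that can be placed before the image name on the `docker run` command line.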
- value: STRING
  id: STRING or NUMBER
  type: paired or single
- value: A path that MUST begin with a slash ('/') and that points to a gzipped FASTQ file. This file has to be mounted to a path that is prefixed by /bbx/input.
- id: A unique id for every entry in the fastq list.
- type: Two options:
  - paired: Paired-end FASTQ reads. When this type is chosen, the value field has to point to an interleaved gzipped FASTQ file.
  - single: Single-end FASTQ reads.
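The rules for a fastq entry can be checked mechanically before a container is started. The sketch below is one possible check, assuming the YAML has already been parsed into a list of dictionaries. The function name `fastq_errors` and the error messages are hypothetical. The sketch does not inspect file contents, so it cannot tell whether a file is really gzipped or interleaved.

```python
INPUT_PREFIX = "/bbx/input"
FASTQ_TYPES = {"paired", "single"}


def fastq_errors(entries):
    """Return a list of problems found in a parsed fastq list."""
    errors = []
    seen_ids = set()
    for index, entry in enumerate(entries):
        # value: must start with '/' and sit below /bbx/input.
        value = entry.get("value")
        if not isinstance(value, str) or not value.startswith("/"):
            errors.append(f"entry {index}: value must be a path beginning with '/'")
        elif not value.startswith(INPUT_PREFIX + "/"):
            errors.append(f"entry {index}: value must be prefixed by {INPUT_PREFIX}")

        # id: must be present and unique within the list.
        entry_id = entry.get("id")
        if entry_id is None:
            errors.append(f"entry {index}: id is missing")
        elif entry_id in seen_ids:
            errors.append(f"entry {index}: id {entry_id!r} is not unique")
        else:
            seen_ids.add(entry_id)

        # type: only the two documented options are accepted.
        if entry.get("type") not in FASTQ_TYPES:
            errors.append(f"entry {index}: type must be 'paired' or 'single'")
    return errors
```

An empty list means that no problem was found by these particular checks.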
- id: STRING
  value: NUMBER
- id: The specified id MUST match exactly one entry in the fastq entry list.
- value: The fragment size, given as a number.
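The requirement that each fragment_size id matches exactly one fastq entry can be sketched as follows. This assumes both lists are already parsed into dictionaries; the function name `unmatched_fragment_ids` is hypothetical.

```python
from collections import Counter


def unmatched_fragment_ids(fastq_entries, fragment_entries):
    """Return fragment_size ids that do not match exactly one fastq entry."""
    # Count how often each id occurs in the fastq list.
    fastq_id_counts = Counter(entry["id"] for entry in fastq_entries)
    # A Counter returns 0 for unknown ids, so missing ids are reported too.
    return [
        entry["id"]
        for entry in fragment_entries
        if fastq_id_counts[entry["id"]] != 1
    ]
```

An id is reported both when no fastq entry carries it and when more than one does.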
---
version: NUMBER.NUMBER.NUMBER
arguments:
  - fasta: LIST
On a successful run, this YAML file, named biobox.yaml, will be available in your mounted output directory.
- version: The current version is specified directly under the heading.
- arguments: The arguments field consists of the fasta field.
- If the directory /bbx/metadata is mounted, then the following files should be placed inside the directory:
  - log.txt: Logging information that is generated by the application inside the container.
- value: STRING
  id: STRING or NUMBER
  type: contig or scaffold
- value: The path to a FASTA file containing the contigs, given relative to your mounted output directory.
- id: A unique id for every entry in the fasta list.
- type: Two options:
  - contig
  - scaffold
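The output structure described above can be assembled as a plain dictionary before it is serialised to biobox.yaml. This is a sketch under stated assumptions: the function name `output_document` is hypothetical, rejecting paths that start with '/' is one reading of "relative to your mounted output directory", and writing the YAML file itself is left out because it needs a YAML library.

```python
FASTA_TYPES = {"contig", "scaffold"}


def output_document(version, fasta_entries):
    """Build the content of the output biobox.yaml as a dictionary."""
    seen_ids = set()
    for entry in fasta_entries:
        if entry["type"] not in FASTA_TYPES:
            raise ValueError(f"unknown fasta type: {entry['type']!r}")
        # Assumption: a relative path must not start with a slash.
        if entry["value"].startswith("/"):
            raise ValueError("value must be relative to the output directory")
        if entry["id"] in seen_ids:
            raise ValueError(f"id {entry['id']!r} is not unique")
        seen_ids.add(entry["id"])
    # Mirrors the definition: 'arguments' is a list holding one 'fasta' mapping.
    return {"version": version, "arguments": [{"fasta": list(fasta_entries)}]}
```

For instance, `output_document("0.9.0", [{"value": "contigs.fa", "id": 1, "type": "contig"}])` returns a dictionary with the same shape as the definition; the file name `contigs.fa` is only an illustration.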
Any biobox based assembler accepts at least one of the following signatures:
[fastq A], [Maybe fragment_size A] -> contigs B, scaffolds C
[fastq A], [fragment_size A] -> contigs B, scaffolds C
where Maybe indicates an optional value.
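One way to read the optional value in the first signature is that a fastq entry may or may not have a corresponding fragment_size entry. The sketch below models this with `Optional`; the function name `fragment_size_for` is hypothetical and the lists are assumed to be already parsed.

```python
from typing import Optional, Union


def fragment_size_for(fastq_id, fragment_entries) -> Optional[Union[int, float]]:
    """Return the fragment size for a fastq id, or None if none is given."""
    for entry in fragment_entries:
        if entry["id"] == fastq_id:
            return entry["value"]
    # No matching entry: the fragment size is simply absent for this library.
    return None
```

An assembler that supports only the second signature would treat a `None` result as an error, whereas one that supports the first could fall back to its own estimate.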
This is an example biobox.yaml file:
---
version: 0.9.0
arguments:
  - fastq:
    - value: "/path/to/lib1"
      id: "pe_1"
      type: paired
    - value: "/path/to/lib2"
      id: "pe_2"
      type: paired
    - value: "/path/to/lib2"
      id: "lmp_1"
      type: paired
  - fragment_size:
    - value: 240
      id: pe_1
    - value: 5000
      id: lmp_1