Replace screenshots with code block examples
- Also show the input code which leads to the error
  so learners can reproduce them.
gcapes committed Jan 21, 2022
1 parent 1b65cc6 commit 0c1aa08
Showing 4 changed files with 260 additions and 6 deletions.
266 changes: 260 additions & 6 deletions _episodes/debug.md
@@ -39,15 +39,168 @@ cwltool --debug CWL_SCRIPT.cwl
The first kind of error is a mistake in the YAML syntax. When writing code, it is very easy to make such a mistake.

Some very common YAML errors are:
- Using tabs instead of spaces. In YAML files, indentation must be made with spaces, not tabs.
  Errors caused by tabs will show `found character \t`.
![]({{page.root}}/fig/YAML_error_tab.png)
- Typos in field names. It is very easy to forget, for example, the capital letters in field names.
  Errors caused by typos in field names will show `invalid field`.

~~~
cwlVersion: v1.2
class: Workflow

inputs:
  rna_reads_human: File
  ref_genome: Directory
  annotations: File

steps:
  quality_control:
    run: bio-cwl-tools/fastqc/fastqc_2.cwl
    in:
      reads_file: rna_reads_human
    out: [html_file]

  mapping_reads:
    requirements:
      ResourceRequirement:
        ramMin: 9000
    run: bio-cwl-tools/STAR/STAR-Align.cwl
    in:
      RunThreadN: {default: 4}
      GenomeDir: ref_genome
      ForwardReads: rna_reads_human
      OutSAMtype: {default: BAM}
      SortedByCoordinate: {default: true}
      OutSAMunmapped: {default: Within}
    out: [alignment]

  index_alignment:
    run: bio-cwl-tools/samtools/samtools_index.cwl
    in:
      bam_sorted: mapping_reads/alignment
    out: [bam_sorted_indexed]

  count_reads:
    requirements:
      ResourceRequirement:
        ramMin: 500
    run: bio-cwl-tools/subread/featureCounts.cwl
    in:
      mapped_reads: index_alignment/bam_sorted_indexed
      annotations: annotations
    out: [featurecounts]

outputs:
  qc_html:
    type: File
    outputsource: quality_control/html_file
  bam_sorted_indexed:
    type: File
    outputSource: index_alignment/bam_sorted_indexed
  featurecounts:
    type: File
    outputSource: count_reads/featurecounts
~~~
{: .language-yaml}


~~~
$ cwltool rna_seq_workflow.cwl workflow_input.yml
~~~
{: .language-bash}

~~~
ERROR Tool definition failed validation:
rna_seq_workflow.cwl:1:1: Object `rna_seq_workflow.cwl` is not valid because
tried `Workflow` but
rna_seq_workflow.cwl:46:1: the `outputs` field is not valid because
rna_seq_workflow.cwl:47:3: item is invalid because
rna_seq_workflow.cwl:49:5: invalid field `outputsource`, expected one of: 'label',
'secondaryFiles', 'streamable', 'doc', 'id', 'format', 'outputSource',
'linkMerge', 'pickValue', 'type'
~~~
{: .error}

- Typos in variable names. As with field names, it is easy to make a mistake when referencing a variable.
  These errors will show `references unknown identifier`.

~~~
cwlVersion: v1.2
class: Workflow

inputs:
  rna_reads_human: File
  ref_genome: Directory
  annotations: File

steps:
  quality_control:
    run: bio-cwl-tools/fastqc/fastqc_2.cwl
    in:
      reads_file: rna_reads_human
    out: [html_file]

  mapping_reads:
    requirements:
      ResourceRequirement:
        ramMin: 9000
    run: bio-cwl-tools/STAR/STAR-Align.cwl
    in:
      RunThreadN: {default: 4}
      GenomeDir: ref_genome
      ForwardReads: rna_reads_human
      OutSAMtype: {default: BAM}
      SortedByCoordinate: {default: true}
      OutSAMunmapped: {default: Within}
    out: [alignment]

  index_alignment:
    run: bio-cwl-tools/samtools/samtools_index.cwl
    in:
      bam_sorted: mapping_reads/alignments
    out: [bam_sorted_indexed]

  count_reads:
    requirements:
      ResourceRequirement:
        ramMin: 500
    run: bio-cwl-tools/subread/featureCounts.cwl
    in:
      mapped_reads: index_alignment/bam_sorted_indexed
      annotations: annotations
    out: [featurecounts]

outputs:
  qc_html:
    type: File
    outputSource: quality_control/html_file
  bam_sorted_indexed:
    type: File
    outputSource: index_alignment/bam_sorted_indexed
  featurecounts:
    type: File
    outputSource: count_reads/featurecounts
~~~
{: .language-yaml}

~~~
$ cwltool rna_seq_workflow.cwl workflow_input.yml
~~~
{: .language-bash}

~~~
ERROR Tool definition failed validation:
rna_seq_workflow.cwl:9:1: checking field `steps`
rna_seq_workflow.cwl:30:3: checking object `rna_seq_workflow.cwl#index_alignment`
rna_seq_workflow.cwl:32:5: checking field `in`
rna_seq_workflow.cwl:33:7: checking object `rna_seq_workflow.cwl#index_alignment/bam_sorted`
Field `source` references unknown identifier
`mapping_reads/alignments`, tried
file:///.../rna_seq_workflow.cwl#mapping_reads/alignments
~~~
{: .error}
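
As a sketch of how the two typos above could be fixed, the fragments below show only the affected parts of the workflow with the spelling corrected. Both spellings come from the examples themselves: `outputSource` is in the list of expected fields in the first error message, and `alignment` is the name declared in `out: [alignment]` of the `mapping_reads` step. The rest of the workflow is assumed to be unchanged.

~~~
outputs:
  qc_html:
    type: File
    outputSource: quality_control/html_file  # capital "S"; `outputsource` is rejected
~~~
{: .language-yaml}

~~~
steps:
  index_alignment:
    run: bio-cwl-tools/samtools/samtools_index.cwl
    in:
      bam_sorted: mapping_reads/alignment  # matches `out: [alignment]` of mapping_reads
    out: [bam_sorted_indexed]
~~~
{: .language-yaml}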

### Wiring error
Wiring errors often occur when you forget to add an output from a workflow's step to the `outputs` section.
@@ -61,13 +214,114 @@ When you declare a variable in the `inputs` section, the type of this variable h
and the type used in one of the workflow's steps.
The error message shown when this error occurs tells you on which line the mismatch happens.

~~~
cwlVersion: v1.2
class: Workflow

inputs:
  rna_reads_human: int
  ref_genome: Directory
  annotations: File

steps:
  quality_control:
    run: bio-cwl-tools/fastqc/fastqc_2.cwl
    in:
      reads_file: rna_reads_human
    out: [html_file]

  mapping_reads:
    requirements:
      ResourceRequirement:
        ramMin: 9000
    run: bio-cwl-tools/STAR/STAR-Align.cwl
    in:
      RunThreadN: {default: 4}
      GenomeDir: ref_genome
      ForwardReads: rna_reads_human
      OutSAMtype: {default: BAM}
      SortedByCoordinate: {default: true}
      OutSAMunmapped: {default: Within}
    out: [alignment]

  index_alignment:
    run: bio-cwl-tools/samtools/samtools_index.cwl
    in:
      bam_sorted: mapping_reads/alignment
    out: [bam_sorted_indexed]

  count_reads:
    requirements:
      ResourceRequirement:
        ramMin: 500
    run: bio-cwl-tools/subread/featureCounts.cwl
    in:
      mapped_reads: index_alignment/bam_sorted_indexed
      annotations: annotations
    out: [featurecounts]

outputs:
  qc_html:
    type: File
    outputSource: quality_control/html_file
  bam_sorted_indexed:
    type: File
    outputSource: index_alignment/bam_sorted_indexed
  featurecounts:
    type: File
    outputSource: count_reads/featurecounts
~~~
{: .language-yaml}

~~~
$ cwltool rna_seq_workflow.cwl workflow_input.yml
~~~
{: .language-bash}

~~~
ERROR Tool definition failed validation:
rna_seq_workflow.cwl:5:3: Source 'rna_reads_human' of type "int" is incompatible
rna_seq_workflow.cwl:24:7: with sink 'ForwardReads' of type ["File", {"type": "array", "items":
"File"}]
rna_seq_workflow.cwl:5:3: Source 'rna_reads_human' of type "int" is incompatible
rna_seq_workflow.cwl:13:7: with sink 'reads_file' of type ["File"]
~~~
{: .error}
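
A possible fix, sketched under the assumption that `rna_reads_human` is meant to be the reads file consumed by both steps: declare the input as `File`, the type that the `reads_file` and `ForwardReads` sinks named in the error message accept. Only the `inputs` section changes.

~~~
inputs:
  rna_reads_human: File  # was `int`; both sinks in the error message expect a File
  ref_genome: Directory
  annotations: File
~~~
{: .language-yaml}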

### Format error
Some input files must have their format declared in the YAML inputs file, for example the FASTQ file in the RNA-seq analysis.
If you don't specify a format for such a file, an error will occur. To declare the format you can, for example, use an identifier from the [EDAM](https://www.ebi.ac.uk/ols/ontologies/edam) ontology.

~~~
rna_reads_human:
  class: File
  location: rnaseq/raw_fastq/Mov10_oe_1.subset.fq
ref_genome:
  class: Directory
  location: rnaseq/hg19-chr1-STAR-index
annotations:
  class: File
  location: rnaseq/reference_data/chr1-hg19_genes.gtf
~~~
{: .language-yaml}

~~~
$ cwltool rna_seq_workflow.cwl workflow_input.yml
~~~
{: .language-bash}

~~~
ERROR Exception on step 'mapping_reads'
ERROR [step mapping_reads] Cannot make job: Expected value of 'ForwardReads' to have format http://edamontology.org/format_1930 but
File has no 'format' defined: {
"class": "File",
"location": "file:///home/mbexegc2/Documents/projects/bioexcel/follow-cwl-novice-tutorial/novice-tutorial-exercises/rnaseq/raw_fastq/Mov10_oe_1.subset.fq",
"size": 75706556,
"basename": "Mov10_oe_1.subset.fq",
"nameroot": "Mov10_oe_1.subset",
"nameext": ".fq"
}
~~~
{: .error}
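
One way to resolve this, as a sketch: add a `format` field to the `rna_reads_human` entry in `workflow_input.yml`, using the identifier that the error message says `ForwardReads` expects. The other two entries are left as they were, on the assumption that the tools do not require a format for them.

~~~
rna_reads_human:
  class: File
  location: rnaseq/raw_fastq/Mov10_oe_1.subset.fq
  format: http://edamontology.org/format_1930  # the format named in the error message
ref_genome:
  class: Directory
  location: rnaseq/hg19-chr1-STAR-index
annotations:
  class: File
  location: rnaseq/reference_data/chr1-hg19_genes.gtf
~~~
{: .language-yaml}
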
{% include links.md %}
Binary file removed fig/Format_error.png
Binary file not shown.
Binary file removed fig/YAML_error_typo_fieldname.png
Binary file not shown.
Binary file removed fig/YAML_error_typo_variable.png
Binary file not shown.
