diff --git a/_episodes/debug.md b/_episodes/debug.md index 3a2b7a4..07f3652 100644 --- a/_episodes/debug.md +++ b/_episodes/debug.md @@ -39,15 +39,168 @@ cwltool --debug CWL_SCRIPT.cwl First of all, errors in the YAML syntax. When writing a piece of code, it is very easy to make a mistake. Some very common YAML errors are: -- Using tabs instead of spaces. In YAML files indentations are made using spaces, not tabs. -Errors when using tabs will show `found character \t`. +- Using tabs instead of spaces. In YAML files indentations are made using spaces, not tabs. +Errors when using tabs will show `found character \t`. ![]({{page.root}}/fig/YAML_error_tab.png) - Typos in field names. It is very easy to forget for example the capital letters in field names. Errors with typos in field names will show `invalid field`. -![]({{page.root}}/fig/YAML_error_typo_fieldname.png) + + ~~~ + cwlVersion: v1.2 + class: Workflow + + inputs: + rna_reads_human: File + ref_genome: Directory + annotations: File + + steps: + quality_control: + run: bio-cwl-tools/fastqc/fastqc_2.cwl + in: + reads_file: rna_reads_human + out: [html_file] + + mapping_reads: + requirements: + ResourceRequirement: + ramMin: 9000 + run: bio-cwl-tools/STAR/STAR-Align.cwl + in: + RunThreadN: {default: 4} + GenomeDir: ref_genome + ForwardReads: rna_reads_human + OutSAMtype: {default: BAM} + SortedByCoordinate: {default: true} + OutSAMunmapped: {default: Within} + out: [alignment] + + index_alignment: + run: bio-cwl-tools/samtools/samtools_index.cwl + in: + bam_sorted: mapping_reads/alignment + out: [bam_sorted_indexed] + + count_reads: + requirements: + ResourceRequirement: + ramMin: 500 + run: bio-cwl-tools/subread/featureCounts.cwl + in: + mapped_reads: index_alignment/bam_sorted_indexed + annotations: annotations + out: [featurecounts] + + outputs: + qc_html: + type: File + outputsource: quality_control/html_file + bam_sorted_indexed: + type: File + outputSource: index_alignment/bam_sorted_indexed + 
featurecounts: + type: File + outputSource: count_reads/featurecounts + ~~~ + {: .language-yaml} + + + ~~~ + $ cwltool rna_seq_workflow.cwl workflow_input.yml + ~~~ + {: .language-bash} + + ~~~ + ERROR Tool definition failed validation: + rna_seq_workflow.cwl:1:1: Object `rna_seq_workflow.cwl` is not valid because + tried `Workflow` but + rna_seq_workflow.cwl:46:1: the `outputs` field is not valid because + rna_seq_workflow.cwl:47:3: item is invalid because + rna_seq_workflow.cwl:49:5: invalid field `outputsource`, expected one of: 'label', + 'secondaryFiles', 'streamable', 'doc', 'id', 'format', 'outputSource', + 'linkMerge', 'pickValue', 'type' + ~~~ + {: .error} + - Typos in variable names. Similar to typos in field names, it is easy to make a mistake in referencing to a variable. These errors will show `Field references unknown identifier.` -![]({{page.root}}/fig/YAML_error_typo_variable.png) + + ~~~ + cwlVersion: v1.2 + class: Workflow + + inputs: + rna_reads_human: File + ref_genome: Directory + annotations: File + + steps: + quality_control: + run: bio-cwl-tools/fastqc/fastqc_2.cwl + in: + reads_file: rna_reads_human + out: [html_file] + + mapping_reads: + requirements: + ResourceRequirement: + ramMin: 9000 + run: bio-cwl-tools/STAR/STAR-Align.cwl + in: + RunThreadN: {default: 4} + GenomeDir: ref_genome + ForwardReads: rna_reads_human + OutSAMtype: {default: BAM} + SortedByCoordinate: {default: true} + OutSAMunmapped: {default: Within} + out: [alignment] + + index_alignment: + run: bio-cwl-tools/samtools/samtools_index.cwl + in: + bam_sorted: mapping_reads/alignments + out: [bam_sorted_indexed] + + count_reads: + requirements: + ResourceRequirement: + ramMin: 500 + run: bio-cwl-tools/subread/featureCounts.cwl + in: + mapped_reads: index_alignment/bam_sorted_indexed + annotations: annotations + out: [featurecounts] + + outputs: + qc_html: + type: File + outputSource: quality_control/html_file + bam_sorted_indexed: + type: File + outputSource: 
index_alignment/bam_sorted_indexed + featurecounts: + type: File + outputSource: count_reads/featurecounts + ~~~ + {: .language-yaml} + + ~~~ + $ cwltool rna_seq_workflow.cwl workflow_input.yml + ~~~ + {: .language-bash} + + ~~~ + ERROR Tool definition failed validation: + rna_seq_workflow.cwl:9:1: checking field `steps` + rna_seq_workflow.cwl:30:3: checking object `rna_seq_workflow.cwl#index_alignment` + rna_seq_workflow.cwl:32:5: checking field `in` + rna_seq_workflow.cwl:33:7: checking object `rna_seq_workflow.cwl#index_alignment/bam_sorted` + Field `source` references unknown identifier + `mapping_reads/alignments`, tried + file:///.../rna_seq_workflow.cwl#mapping_reads/alignments + + ~~~ + {: .error} ### Wiring error Wiring errors often occur when you forget to add an output from a workflow's step to the `outputs` section. @@ -61,13 +214,114 @@ When you declare a variable in the `inputs` section, the type of this variable h and the type used in one of the workflows steps. The error message that is shown when this error occurs will tell you on which line the mismatch happens. 
-![]({{page.root}}/fig/Type_error.png) +~~~ +cwlVersion: v1.2 +class: Workflow + +inputs: + rna_reads_human: int + ref_genome: Directory + annotations: File + +steps: + quality_control: + run: bio-cwl-tools/fastqc/fastqc_2.cwl + in: + reads_file: rna_reads_human + out: [html_file] + + mapping_reads: + requirements: + ResourceRequirement: + ramMin: 9000 + run: bio-cwl-tools/STAR/STAR-Align.cwl + in: + RunThreadN: {default: 4} + GenomeDir: ref_genome + ForwardReads: rna_reads_human + OutSAMtype: {default: BAM} + SortedByCoordinate: {default: true} + OutSAMunmapped: {default: Within} + out: [alignment] + + index_alignment: + run: bio-cwl-tools/samtools/samtools_index.cwl + in: + bam_sorted: mapping_reads/alignment + out: [bam_sorted_indexed] + + count_reads: + requirements: + ResourceRequirement: + ramMin: 500 + run: bio-cwl-tools/subread/featureCounts.cwl + in: + mapped_reads: index_alignment/bam_sorted_indexed + annotations: annotations + out: [featurecounts] + +outputs: + qc_html: + type: File + outputSource: quality_control/html_file + bam_sorted_indexed: + type: File + outputSource: index_alignment/bam_sorted_indexed + featurecounts: + type: File + outputSource: count_reads/featurecounts +~~~ +{: .language-yaml} + +~~~ +$ cwltool rna_seq_workflow.cwl workflow_input.yml +~~~ +{: .language-bash} + +~~~ +ERROR Tool definition failed validation: + +rna_seq_workflow.cwl:5:3: Source 'rna_reads_human' of type "int" is incompatible +rna_seq_workflow.cwl:24:7: with sink 'ForwardReads' of type ["File", {"type": "array", "items": + "File"}] +rna_seq_workflow.cwl:5:3: Source 'rna_reads_human' of type "int" is incompatible +rna_seq_workflow.cwl:13:7: with sink 'reads_file' of type ["File"] +~~~ +{: .error} ### Format error Some files need a specific format that needs to be specified in the YAML inputs file, for example the fastq file in the RNA-seq analysis. When you don't specify a format, an error will occur. 
You can for example use the [EDAM](https://www.ebi.ac.uk/ols/ontologies/edam) ontology. -![]({{page.root}}/fig/Format_error.png) +~~~ +rna_reads_human: + class: File + location: rnaseq/raw_fastq/Mov10_oe_1.subset.fq +ref_genome: + class: Directory + location: rnaseq/hg19-chr1-STAR-index +annotations: + class: File + location: rnaseq/reference_data/chr1-hg19_genes.gtf +~~~ +{: .language-yaml} +~~~ +$ cwltool rna_seq_workflow.cwl workflow_input.yml +~~~ +{: .language-bash} +~~~ +ERROR Exception on step 'mapping_reads' +ERROR [step mapping_reads] Cannot make job: Expected value of 'ForwardReads' to have format http://edamontology.org/format_1930 but + File has no 'format' defined: { + "class": "File", + "location": "file:///home/mbexegc2/Documents/projects/bioexcel/follow-cwl-novice-tutorial/novice-tutorial-exercises/rnaseq/raw_fastq/Mov10_oe_1.subset.fq", + "size": 75706556, + "basename": "Mov10_oe_1.subset.fq", + "nameroot": "Mov10_oe_1.subset", + "nameext": ".fq" +} +~~~ +{: .error} {% include links.md %} diff --git a/fig/Format_error.png b/fig/Format_error.png deleted file mode 100644 index a1456e3..0000000 Binary files a/fig/Format_error.png and /dev/null differ diff --git a/fig/YAML_error_typo_fieldname.png b/fig/YAML_error_typo_fieldname.png deleted file mode 100644 index 267c6d6..0000000 Binary files a/fig/YAML_error_typo_fieldname.png and /dev/null differ diff --git a/fig/YAML_error_typo_variable.png b/fig/YAML_error_typo_variable.png deleted file mode 100644 index c40b85a..0000000 Binary files a/fig/YAML_error_typo_variable.png and /dev/null differ