cwlVersion: v1.0
class: Workflow
requirements:
- class: StepInputExpressionRequirement
- class: InlineJavascriptRequirement
inputs:
  bam_file:
    type: File
    label: "Input BAM file"
    doc: "Input BAM file, sorted by coordinates"
  chrom_length_file:
    type: File
    label: "Chromosome length file"
    doc: "Tab delimited chromosome length file: <chromName><TAB><chromSize>"
  scale:
    type: float?
    label: "Genome coverage scaling coefficient"
    doc: "Coefficient to scale the genome coverage by a constant factor"
  mapped_reads_number:
    type: int?
    label: "Mapped reads number"
    doc: |
      Parameter to calculate scale as 1000000/mapped_reads_number. Ignored by bedtools-genomecov.cwl in
      bam_to_bedgraph step if scale is provided
  pairchip:
    type: boolean?
    label: "Enable paired-end genome coverage calculation"
    doc: "Enable paired-end genome coverage calculation"
  fragment_size:
    type: int?
    label: "Fixed fragment size"
    doc: "Set fixed fragment size for genome coverage calculation"
  strand:
    type: string?
    label: "Enable strand specific genome coverage calculation"
    doc: "Calculate genome coverage of intervals from a specific strand"
  bigwig_filename:
    type: string?
    label: "bigWig output filename"
    doc: "Output filename for generated bigWig"
  bedgraph_filename:
    type: string?
    label: "bedGraph output filename"
    doc: "Output filename for generated bedGraph"
  split:
    type: boolean?
    label: "Split reads by 'N' and 'D'"
    doc: "Calculate genome coverage separately for each part of a read split by 'N' and 'D' CIGAR operations"
  dutp:
    type: boolean?
    label: "Enable dUTP"
    doc: "Change strand of the mate read, so both reads come from the same strand"
outputs:
  bigwig_file:
    type: File
    outputSource: sorted_bedgraph_to_bigwig/bigwig_file
    label: "bigWig output file"
    doc: "bigWig output file"
  bedgraph_file:
    type: File
    outputSource: sort_bedgraph/sorted_file
    label: "bedGraph output file"
    doc: "bedGraph output file"
steps:
  bam_to_bedgraph:
    run: ../tools/bedtools-genomecov.cwl
    in:
      input_file: bam_file
      depth:
        default: "-bg"
      split:
        source: split
        valueFrom: |
          ${
            if (self == null){
              return true;
            } else {
              return self;
            }
          }
      output_filename: bedgraph_filename
      pairchip: pairchip
      fragment_size: fragment_size
      scale: scale
      mapped_reads_number: mapped_reads_number
      strand: strand
      du: dutp
    out: [genome_coverage_file]
  sort_bedgraph:
    run: ../tools/linux-sort.cwl
    in:
      unsorted_file: bam_to_bedgraph/genome_coverage_file
      key:
        default: ["1,1","2,2n"]
    out: [sorted_file]
  sorted_bedgraph_to_bigwig:
    run: ../tools/ucsc-bedgraphtobigwig.cwl
    in:
      bedgraph_file: sort_bedgraph/sorted_file
      chrom_length_file: chrom_length_file
      output_filename: bigwig_filename
    out: [bigwig_file]
$namespaces:
  s: http://schema.org/
$schemas:
- http://schema.org/docs/schema_org_rdfa.html
s:name: "bam-bedgraph-bigwig"
s:downloadUrl: https://raw.githubusercontent.com/Barski-lab/workflows/master/workflows/bam-bedgraph-bigwig.cwl
s:codeRepository: https://github.com/Barski-lab/workflows
s:license: http://www.apache.org/licenses/LICENSE-2.0
s:isPartOf:
  class: s:CreativeWork
  s:name: Common Workflow Language
  s:url: http://commonwl.org/
s:creator:
- class: s:Organization
  s:legalName: "Cincinnati Children's Hospital Medical Center"
  s:location:
  - class: s:PostalAddress
    s:addressCountry: "USA"
    s:addressLocality: "Cincinnati"
    s:addressRegion: "OH"
    s:postalCode: "45229"
    s:streetAddress: "3333 Burnet Ave"
    s:telephone: "+1(513)636-4200"
  s:logo: "https://www.cincinnatichildrens.org/-/media/cincinnati%20childrens/global%20shared/childrens-logo-new.png"
  s:department:
  - class: s:Organization
    s:legalName: "Allergy and Immunology"
    s:department:
    - class: s:Organization
      s:legalName: "Barski Research Lab"
      s:member:
      - class: s:Person
        s:name: Michael Kotliar
        s:email: mailto:misha.kotliar@gmail.com
        s:sameAs:
        - id: http://orcid.org/0000-0002-6486-3898
      - class: s:Person
        s:name: Andrey Kartashov
        s:email: mailto:Andrey.Kartashov@cchmc.org
        s:sameAs:
        - id: http://orcid.org/0000-0001-9102-5681
doc: |
  Workflow converts input BAM file into bigWig and bedGraph files
s:about: |
  Workflow converts input BAM file into bigWig and bedGraph files.
  Input BAM file should be sorted by coordinates (required by `bam_to_bedgraph` step).
  If the `split` input is not provided, `true` is used by default. The default is implemented in the `valueFrom` field of the `split`
  input inside the `bam_to_bedgraph` step to avoid a possible cwltool bug when setting default values for workflow inputs.
  `scale` takes priority over `mapped_reads_number`. The latter is used to calculate the `-scale` parameter for
  `bedtools genomecov` (step `bam_to_bedgraph`) only when the `scale` input is not provided. All of this logic is
  implemented inside `bedtools-genomecov.cwl`.
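  A minimal sketch of how these two inputs are expected to interact, based only on the description above. The values are
  made up for illustration, and the actual behaviour is defined in `bedtools-genomecov.cwl`, which is not shown here.

  ```yaml
  # Case 1: only mapped_reads_number is set.
  # Expected scale: 1000000 / 20000000 = 0.05
  mapped_reads_number: 20000000   # hypothetical read count
  ---
  # Case 2: both are set. scale is used as given and
  # mapped_reads_number is expected to be ignored.
  scale: 0.5                      # hypothetical coefficient
  mapped_reads_number: 20000000
  ```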
  `bigwig_filename` sets the output name of the generated bigWig file only. `bedgraph_filename` sets the output name
  of the generated bedGraph file and can also affect the generated bigWig filename when `bigwig_filename` is not provided.
  Workflow inputs and outputs have no `format` field, to avoid format incompatibility errors when the workflow is used
  as a subworkflow.
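  A job file for this workflow might look like the sketch below. All paths and filenames are hypothetical placeholders.
  Only `bam_file` and `chrom_length_file` are required; every other input is optional.

  ```yaml
  # job.yml (hypothetical name): input object for bam-bedgraph-bigwig.cwl
  bam_file:
    class: File
    path: sample.sorted.bam          # placeholder, must be coordinate-sorted
  chrom_length_file:
    class: File
    path: chrom.sizes                # placeholder: <chromName><TAB><chromSize>
  mapped_reads_number: 20000000      # optional, made-up value
  split: false                       # optional, treated as true when omitted
  bedgraph_filename: sample.bedGraph # optional, placeholder
  bigwig_filename: sample.bigWig     # optional, placeholder
  ```

  With cwltool installed, such a job could be run as `cwltool bam-bedgraph-bigwig.cwl job.yml`.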