cwlVersion: v1.0
class: Workflow

requirements:
  - class: SubworkflowFeatureRequirement

inputs:
  sra_input_file:
    type: File
    format: "http://edamontology.org/format_3698"
  illumina_adapters_file:
    type: File
    format: "http://edamontology.org/format_1929"
  bowtie2_indices_folder:
    type: Directory
  chr_length_file:
    type: File
    format: "http://edamontology.org/format_2330"
  threads:
    type: int?
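
# Example job file (a minimal sketch, not part of the original workflow).
# The inputs declared above could be supplied in a YAML job file like the one
# below. Every path and the threads value are hypothetical placeholders. The
# format IRIs are copied from the input declarations, on the assumption that
# the CWL runner checks them against the declared formats.
#
#   sra_input_file:
#     class: File
#     path: /path/to/sample.sra            # hypothetical path
#     format: http://edamontology.org/format_3698
#   illumina_adapters_file:
#     class: File
#     path: /path/to/adapters.fa           # hypothetical path
#     format: http://edamontology.org/format_1929
#   bowtie2_indices_folder:
#     class: Directory
#     path: /path/to/bowtie2_indices       # hypothetical path
#   chr_length_file:
#     class: File
#     path: /path/to/chrom.sizes           # hypothetical path
#     format: http://edamontology.org/format_2330
#   threads: 4                             # optional input (type: int?)
#
# Assuming the job file is saved as job.yml (hypothetical name) and cwltool is
# installed, the workflow could then be run with:
#
#   cwltool xenbase-chipseq-se.cwl job.yml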

outputs:
  bowtie2_log:
    type: File
    outputSource: fastq_to_bigwig/bowtie2_log
  picard_metrics:
    type: File
    outputSource: fastq_to_bigwig/picard_metrics
  bam_file:
    type: File
    outputSource: fastq_to_bigwig/bam_file
  bamtools_log:
    type: File
    outputSource: fastq_to_bigwig/bamtools_log
  bed:
    type: File
    outputSource: fastq_to_bigwig/bed
  bigwig:
    type: File
    outputSource: fastq_to_bigwig/bigwig

steps:
  sra_to_fastq:
    run: ../subworkflows/xenbase-sra-to-fastq-se.cwl
    in:
      sra_input_file: sra_input_file
      illumina_adapters_file: illumina_adapters_file
      threads: threads
    out: [fastq]
  fastq_to_bigwig:
    run: ../subworkflows/xenbase-fastq-bowtie-bigwig-se-pe.cwl
    in:
      upstream_fastq: sra_to_fastq/fastq
      bowtie2_indices_folder: bowtie2_indices_folder
      chr_length_file: chr_length_file
      threads: threads
    out:
      - bowtie2_log
      - picard_metrics
      - bam_file
      - bamtools_log
      - bed
      - bigwig

$namespaces:
  s: http://schema.org/

$schemas:
  - http://schema.org/docs/schema_org_rdfa.html

s:name: "xenbase-chipseq-se"
s:downloadUrl: https://raw.githubusercontent.com/Barski-lab/workflows/master/workflows/xenbase-chipseq-se.cwl
s:codeRepository: https://github.com/Barski-lab/workflows
s:license: http://www.apache.org/licenses/LICENSE-2.0

s:isPartOf:
  class: s:CreativeWork
  s:name: Common Workflow Language
  s:url: http://commonwl.org/

s:creator:
  - class: s:Organization
    s:legalName: "Cincinnati Children's Hospital Medical Center"
    s:location:
      - class: s:PostalAddress
        s:addressCountry: "USA"
        s:addressLocality: "Cincinnati"
        s:addressRegion: "OH"
        s:postalCode: "45229"
        s:streetAddress: "3333 Burnet Ave"
        s:telephone: "+1(513)636-4200"
    s:logo: "https://www.cincinnatichildrens.org/-/media/cincinnati%20childrens/global%20shared/childrens-logo-new.png"
    s:department:
      - class: s:Organization
        s:legalName: "Allergy and Immunology"
        s:department:
          - class: s:Organization
            s:legalName: "Barski Research Lab"
            s:member:
              - class: s:Person
                s:name: Michael Kotliar
                s:email: mailto:misha.kotliar@gmail.com
                s:sameAs:
                  - id: http://orcid.org/0000-0002-6486-3898

doc: |
  XenBase workflow for analysing ChIP-Seq single-end data

s:about: |
  1. Convert the input SRA file into a FASTQ file (run fastq-dump)
  2. Analyze the quality of the FASTQ file (run fastqc)
  3. If any of the following fields in the fastqc report is marked as failed:
       "Per base sequence quality",
       "Per sequence quality scores",
       "Overrepresented sequences",
       "Adapter Content",
     then trim adapters (run trimmomatic)
  4. Align the original or trimmed FASTQ file to the reference genome (run Bowtie2)
  5. Sort and index the BAM file generated by Bowtie2 (run samtools sort, samtools index)
  6. Remove duplicates from the sorted BAM file (run picard)
  7. Sort and index the BAM file after duplicate removal (run samtools sort, samtools index)
  8. Count the number of mapped reads in the sorted BAM file (run bamtools stats)
  9. Generate a genome coverage BED file (run bedtools genomecov)
  10. Sort the generated BED file (run sort)
  11. Generate a genome coverage bigWig file from the BED file (run bedGraphToBigWig)