NAME
====
fatool


VERSION
=======

0.3.1

LICENSE
=======
Apache 2.0, as specified in the LICENSE.md file

INTRODUCTION
============

fatool is a Python 2.7 package and command-line tool that operates on FASTA-type files (.fa, .fasta, etc.). To install the package, run: python setup.py install


PREREQUISITES
=============
Python 2.7

USAGE
=====



COMMAND LINE
============

usage: cmdfatool.py [-h] [-v]
                    {cut,extractNames,extractContigs,remContigs,join,split,reverse,validate,stats}

optional arguments:
-h, --help     show this help message and exit
-v, --version  display version number and exit

fatool commands:
{cut,extractNames,extractContigs,remContigs,join,split,reverse,validate,stats} each has its own parameters; for more details use: command -h

cut             split the supplied sequence into smaller parts, according to the given parameters
extractNames    extract contig names only
extractContigs  extract the contigs specified in a file (output to a new file)
remContigs      remove the contigs specified in a file (output to a new file)
join            join two or more files; duplicates are not verified yet
split           save each contig into a separate file
reverse         reverse all sequences in the file
validate        validate the fa file
stats           show statistics of the fa file
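
The help text above has the shape produced by Python's argparse with sub-commands. As a rough illustration only (this is not fatool's source), the sketch below rebuilds the top-level parser and the cut command from the options documented in this README. The integer types for --range and --step are assumptions.

```python
import argparse


def build_parser():
    """Rebuild a small part of the documented interface (illustrative only)."""
    parser = argparse.ArgumentParser(prog="cmdfatool.py")
    parser.add_argument("-v", "--version", action="version", version="0.3.1")
    commands = parser.add_subparsers(dest="command", title="fatool commands")

    # Options copied from the documented "cut" usage; the types are assumed.
    cut = commands.add_parser("cut")
    cut.add_argument("-f", "--fafile", required=True)
    cut.add_argument("-r", "--range", required=True, type=int)
    cut.add_argument("-o", "--output", default="output.fa")
    cut.add_argument("-s", "--step", type=int, default=1)
    cut.add_argument("--report")
    cut.add_argument("--operator")
    return parser


args = build_parser().parse_args(["cut", "-f", "in.fa", "-r", "100"])
print(args.command, args.fafile, args.range, args.step, args.output)
```

Each sub-command owns its options, which is why the per-command help is requested with the command name followed by -h.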


cut

usage: cmdfatool.py cut [-h] -f FAFILE -r RANGE [-o OUTPUT] [-s STEP]
                        [--report REPORT] [--operator OPERATOR]

optional arguments:
-h, --help                  show this help message and exit
-f FAFILE, --fafile FAFILE  file to be cut, usually *.fa
-r RANGE, --range RANGE     length of each cut sequence
-o OUTPUT, --output OUTPUT  output file; default: output.fa
-s STEP, --step STEP        step length; default: 1
--report REPORT             log file; if not supplied, stdout
--operator OPERATOR         user who ran the script; noted in the log
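
The README does not spell out how RANGE and STEP interact. A minimal sketch, assuming RANGE is the window length and STEP is the distance between consecutive window starts (the function name cut_windows is hypothetical and not part of fatool):

```python
def cut_windows(sequence, length, step=1):
    """Return every window of `length` characters, advancing by `step`."""
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    last_start = len(sequence) - length
    # A sequence shorter than `length` yields no windows at all.
    return [sequence[i:i + length] for i in range(0, last_start + 1, step)]


print(cut_windows("ACGTAC", 4))          # three overlapping windows
print(cut_windows("ACGTAC", 4, step=2))  # every second window
```

With the default step of 1, consecutive windows overlap in all but one position, so the output can be much larger than the input.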


extractNames

usage: cmdfatool.py extractNames [-h] -f FAFILE [-o OUTPUT] [--report REPORT]
                                 [--operator OPERATOR]

optional arguments:
-h, --help                  show this help message and exit
-f FAFILE, --fafile FAFILE  input file, usually *.fa
-o OUTPUT, --output OUTPUT  output file; if not supplied, stdout
--report REPORT             log file; if not supplied, stdout
--operator OPERATOR         user who ran the script; noted in the log
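
To make the idea of contig names concrete, here is a small sketch of reading FASTA records and listing their names. It is not fatool's parser; the helper names read_fasta and extract_names are hypothetical, and the name is assumed to be everything after the leading ">" on a header line.

```python
def read_fasta(lines):
    """Parse FASTA lines into an ordered list of (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    for raw in lines:
        line = raw.strip()  # also drops a trailing \r from Windows files
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def extract_names(lines):
    """Return only the contig names, in file order."""
    return [name for name, _ in read_fasta(lines)]


sample = [">c1 first contig\r\n", "ACGT\r\n", "AC\n", ">c2\n", "GG\n"]
print(extract_names(sample))
```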


extractContigs

usage: cmdfatool.py extractContigs [-h] -f FAFILE --list LIST -o OUTPUT
                                   [--report REPORT] [--operator OPERATOR]
                                   [--multifile]

optional arguments:
-h, --help                  show this help message and exit
-f FAFILE, --fafile FAFILE  input file, usually *.fa
--list LIST                 file containing the list of contigs, one contig per line
-o OUTPUT, --output OUTPUT  output file; if --multifile is set, output directory
--report REPORT             log file; if not supplied, stdout
--operator OPERATOR         user who ran the script; noted in the log
--multifile                 if this flag is set, each contig is saved in a separate file


remContigs

usage: cmdfatool.py remContigs [-h] -f FAFILE --list LIST -o OUTPUT
                               [--report REPORT] [--operator OPERATOR]

optional arguments:
-h, --help                  show this help message and exit
-f FAFILE, --fafile FAFILE  input file, usually *.fa
--list LIST                 file containing the list of contigs, one contig per line
-o OUTPUT, --output OUTPUT  output file; if not supplied, stdout
--report REPORT             log file; if not supplied, stdout
--operator OPERATOR         user who ran the script; noted in the log
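
extractContigs and remContigs are both driven by a list file with one contig name per line. The sketch below shows the two as complementary filters over (name, sequence) pairs. It assumes names must match exactly; fatool's actual matching rules are not described in this README, and all function names here are hypothetical.

```python
def load_names(list_lines):
    """Read one contig name per line, ignoring blank lines."""
    return set(line.strip() for line in list_lines if line.strip())


def extract_contigs(records, names):
    """Keep only the records whose name is in `names`."""
    return [(name, seq) for name, seq in records if name in names]


def remove_contigs(records, names):
    """Drop the records whose name is in `names`."""
    return [(name, seq) for name, seq in records if name not in names]


records = [("c1", "ACGT"), ("c2", "GGCC"), ("c3", "TTAA")]
wanted = load_names(["c1\n", "\n", "c3\r\n"])
print(extract_contigs(records, wanted))
print(remove_contigs(records, wanted))
```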


join

usage: cmdfatool.py join [-h] -f FAFILE -o OUTPUT
                         [--files [FILES [FILES ...]]] [--overwrite]
                         [--report REPORT] [--operator OPERATOR]

optional arguments:
-h, --help                   show this help message and exit
-f FAFILE, --fafile FAFILE   input file, usually *.fa
-o OUTPUT, --output OUTPUT   output file; if not supplied, stdout
--files [FILES [FILES ...]]  files to be joined
--overwrite                  if set, overwrites contigs with the same name
--report REPORT              log file; if not supplied, stdout
--operator OPERATOR          user who ran the script; noted in the log


split

usage: cmdfatool.py split [-h] -f FAFILE -d OUTPUTDIR [--report REPORT]
                          [--operator OPERATOR]

optional arguments:
-h, --help                           show this help message and exit
-f FAFILE, --fafile FAFILE           input file, usually *.fa
-d OUTPUTDIR, --outputDir OUTPUTDIR  output directory where the split contigs will be saved
--report REPORT                      log file; if not supplied, stdout
--operator OPERATOR                  user who ran the script; noted in the log


reverse

usage: cmdfatool.py reverse [-h] -f FAFILE -o OUTPUT [--report REPORT]
                            [--operator OPERATOR]

optional arguments:
-h, --help                  show this help message and exit
-f FAFILE, --fafile FAFILE  input file, usually *.fa
-o OUTPUT, --output OUTPUT  output file
--report REPORT             log file; if not supplied, stdout
--operator OPERATOR         user who ran the script; noted in the log
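
The README describes this command only as reversing all sequences. The sketch below shows plain reversal and, for contrast, reverse complement, which is a different operation. Which of the two fatool performs is not stated here, so treat the pairing as an illustration rather than a description of the tool.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse(seq):
    """Return the sequence read backwards."""
    return seq[::-1]


def reverse_complement(seq):
    """Return the complementary strand read backwards (DNA only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


print(reverse("AACG"))
print(reverse_complement("AACG"))
```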


validate

usage: cmdfatool.py validate [-h] -f FAFILE -t TYPE [--details]

optional arguments:
-h, --help                  show this help message and exit
-f FAFILE, --fafile FAFILE  input file, usually *.fa
-t TYPE, --type TYPE        type of sequence: 0 - general, 1 - DNA, 2 - amino acid
--details                   set to see detailed validation info
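
One plausible reading of the TYPE option is an alphabet check per sequence type. The sketch below is written under that assumption. The allowed character sets and the decision to accept anything for type 0 are guesses for illustration, not fatool's rules, and invalid_positions is a hypothetical name.

```python
DNA_ALPHABET = set("ACGTN")                     # assumed alphabet for type 1
AMINO_ALPHABET = set("ACDEFGHIKLMNPQRSTVWYX*")  # assumed alphabet for type 2


def invalid_positions(seq, seq_type):
    """Return (index, character) pairs that fall outside the alphabet."""
    if seq_type == 1:
        allowed = DNA_ALPHABET
    elif seq_type == 2:
        allowed = AMINO_ALPHABET
    else:
        return []  # type 0 (general) is treated here as "anything goes"
    return [(i, ch) for i, ch in enumerate(seq.upper()) if ch not in allowed]


print(invalid_positions("ACGTX", 1))
print(invalid_positions("MKV", 2))
```

Returning positions rather than a plain yes or no is the kind of information a detailed report could list.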


stats

usage: cmdfatool.py stats [-h] -f FAFILE [--report REPORT]
                          [--operator [OPERATOR [OPERATOR ...]]]

optional arguments:
-h, --help                            show this help message and exit
-f FAFILE, --fafile FAFILE            file to show statistics for, usually *.fa
--report REPORT                       log file; if not supplied, stdout
--operator [OPERATOR [OPERATOR ...]]  user who ran the script; noted in the log
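
This README does not list which statistics are reported. As an illustration of typical assembly summaries, the sketch below computes the contig count, total length, shortest and longest contig, and N50 from a list of contig lengths. The choice of metrics and the name contig_stats are assumptions.

```python
def contig_stats(lengths):
    """Summarise contig lengths: count, total, shortest, longest, N50."""
    if not lengths:
        return {"count": 0, "total": 0, "min": 0, "max": 0, "n50": 0}
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    n50 = 0
    for length in ordered:
        running += length
        # N50: length of the contig at which the running sum of the
        # longest contigs first reaches half of the total length.
        if running * 2 >= total:
            n50 = length
            break
    return {"count": len(ordered), "total": total,
            "min": ordered[-1], "max": ordered[0], "n50": n50}


print(contig_stats([8, 5, 3, 2]))
```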

findPrimer

usage: cmdfatool.py findPrimer [-h] -f FAFILE --start START --stop STOP
                               --mode {FF,FR} [--minlen MINLEN] [--maxlen MAXLEN]
                               [--mml MML] [--report REPORT]
                               [--operator [OPERATOR [OPERATOR ...]]]

optional arguments:
-h, --help                            show this help message and exit
-f FAFILE, --fafile FAFILE            input file, usually *.fa
--start START                         first sequence to be found
--stop STOP                           last sequence to be found
--mode {FF,FR}                        FF (start forward-oriented, stop forward-oriented) or FR (start forward-oriented, stop reverse-oriented)
--minlen MINLEN                       minimum length (default 50bp)
--maxlen MAXLEN                       maximum length (default 1000bp)
--mml MML                             mismatch level: number of allowed mismatches in primers (default 0)
--report REPORT                       report results into a file; if not supplied, stdout
--operator [OPERATOR [OPERATOR ...]]  user who ran the script; noted in the report
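
A rough sketch of the FF case: locate both primers with a bounded number of mismatches, then keep the fragments whose length falls within the limits. Several points are assumptions made for illustration: the fragment length is counted from the first base of start to the last base of stop, mismatches are counted per primer, and FR mode is not handled. None of the function names come from fatool.

```python
def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def find_sites(seq, primer, mml=0):
    """Return the start indexes where primer matches with <= mml mismatches."""
    k = len(primer)
    return [i for i in range(len(seq) - k + 1)
            if mismatches(seq[i:i + k], primer) <= mml]


def find_fragments(seq, start, stop, minlen=50, maxlen=1000, mml=0):
    """Return fragments that begin with start and end with stop (FF only)."""
    hits = []
    for s in find_sites(seq, start, mml):
        for e in find_sites(seq, stop, mml):
            end = e + len(stop)
            # stop must begin after start ends, and the length must fit.
            if e >= s + len(start) and minlen <= end - s <= maxlen:
                hits.append(seq[s:end])
    return hits


seq = "TTACGTAAAAGGCCTT"
print(find_fragments(seq, "ACGT", "GGCC", minlen=5, maxlen=20))
```

The nested loop is quadratic in the number of primer hits, which is acceptable for a sketch but would need care on large inputs.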


cutNameMarker

usage: cmdfatool.py cutNameMarker [-h] -f FAFILE -m MARKER -l LENGTH
                                  --keepMarker KEEPMARKER [-o OUTPUT]

optional arguments:
-h, --help                  show this help message and exit
-f FAFILE, --fafile FAFILE  input file, usually *.fa
-m MARKER, --marker MARKER  marker that indicates the start of the cut
-l LENGTH, --length LENGTH  length of the cut
--keepMarker KEEPMARKER     whether to keep the marker; default 1 (yes)
-o OUTPUT, --output OUTPUT  output file; default: output.fa

translateDNA2Proteins

usage: cmdfatool.py translateDNA2Proteins [-h] -f FAFILE [-o OUTPUT]
                                          [--startCodons [STARTCODONS [STARTCODONS ...]]]
                                          [--stopCodons [STOPCODONS [STOPCODONS ...]]]
                                          [--tdict {STD,VMTO,YMTO,BAPP}]
                                          [--nss] [--report REPORT]
                                          [--operator [OPERATOR [OPERATOR ...]]]

optional arguments:
-h, --help                                     show this help message and exit
-f FAFILE, --fafile FAFILE                     input file, usually *.fa
-o OUTPUT, --output OUTPUT                     output file; default: output.fa
--startCodons [STARTCODONS [STARTCODONS ...]]  list of start codons separated by spaces
--stopCodons [STOPCODONS [STOPCODONS ...]]     list of stop codons separated by spaces
--tdict {STD,VMTO,YMTO,BAPP}                   dictionary to use for translation: STD - standard, VMTO - Vertebrate Mitochondrial, YMTO - Yeast Mitochondrial, BAPP - Bacterial, Archaeal, Plant and Plastid
--nss                                          No Start Stop
--report REPORT                                report results into a file; if not supplied, stdout
--operator [OPERATOR [OPERATOR ...]]           user who ran the script; noted in the report
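
For orientation, here is a minimal sketch of codon-table translation with configurable start and stop codons. It uses the standard genetic code only, translates a single forward frame from the first start codon, and marks unknown codons as X. Those choices, the default codon lists, and the name translate are assumptions; the other dictionaries and the --nss behaviour are not modelled.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order with the third base
# changing fastest (TTT, TTC, TTA, TTG, TCT, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STD = dict(("".join(codon), aa)
           for codon, aa in zip(product(BASES, repeat=3), AMINO))


def translate(dna, start_codons=("ATG",),
              stop_codons=("TAA", "TAG", "TGA"), table=STD):
    """Translate from the first start codon up to the first in-frame stop."""
    dna = dna.upper()
    begin = -1
    for i in range(len(dna) - 2):
        if dna[i:i + 3] in start_codons:
            begin = i
            break
    if begin < 0:
        return ""  # no start codon found
    protein = []
    for i in range(begin, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if codon in stop_codons:
            break
        protein.append(table.get(codon, "X"))  # X marks an unknown codon
    return "".join(protein)


print(translate("CCATGGCTTAAGG"))
```

Swapping the table argument for a different dictionary is all a mitochondrial or plastid code would need in this sketch.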