Skip to content

Latest commit

 

History

History
82 lines (68 loc) · 1.16 KB

README.md

File metadata and controls

82 lines (68 loc) · 1.16 KB

splitmasked

pytest-badge

splitmasked splits sequence records in FAST(A/Q) files based on their masking status. What constitutes masking can be defined with the --maskchar option (eg. N or lowercase). Both masked and unmasked parts can be retained and written to separate output files.

Installation

pip install splitmasked

Usage

splitmasked \
    --maskchar lowercase \
    --minlength_masked 100 \
    --minlength_unmasked 20 \
    --outfile_masked /dev/null \
    --outfile_unmasked unmasked.fastq \
    input.fastq

Examples

Input

@Seq1 comment1
aaaaaTTTTTTAAgatgatgatgAATGAA
+
AAAAAAAAAAAAAAAAAAAAAAAAAAAAA
@Seq2 comment2
ATGATAGAgagagtTTTATA
+
HHHHHHHHHHHHHHHHHHHH

Output

With --maskchar lowercase:

unmasked.fastq

@Seq1_part2 comment1
TTTTTTAA
+
AAAAAAAA
@Seq1_part4 comment1
AATGAA
+
AAAAAA
@Seq2_part1 comment2
ATGATAGA
+
HHHHHHHH
@Seq2_part3 comment2
TTTATA
+
HHHHHH

masked.fastq

@Seq1_part1 comment1
aaaaa
+
AAAAA
@Seq1_part3 comment1
gatgatgatg
+
AAAAAAAAAA
@Seq2_part2 comment2
gagagt
+
HHHHHH