BCsummarizer
masikol edited this page May 26, 2023 · 3 revisions
BCsummarizer.py (makes a summary of basecalling) -- this script generates a brief summary of basecalling. It determines which FASTQ files contain the reads from each FAST5 file.
This script is useful because basecallers (the popular Guppy in particular) often misassign the names of input FAST5 and output FASTQ files. As a result, a source FAST5 file and a basecalled FASTQ file may contain different reads even though their names match.
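The idea of matching reads to files can be illustrated with a minimal sketch. It is not the script's actual implementation: the function names (`index_fastq_dir`, `summarize`), the recognized file extensions, and the `"missing"` label are all hypothetical. It assumes four-line FASTQ records and assumes the read IDs of each FAST5 file have already been extracted (for example with h5py) into a dictionary.

```python
import gzip
import os
from collections import defaultdict


def open_fastq(path):
    """Open a plain or gzipped FASTQ file in text mode."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def fastq_read_ids(path):
    """Yield read IDs from a FASTQ file, assuming 4 lines per record."""
    with open_fastq(path) as handle:
        for i, line in enumerate(handle):
            # Every 4th line is a header such as "@<read_id> <description>".
            if i % 4 == 0 and line.strip():
                yield line[1:].split()[0]


def index_fastq_dir(fastq_dir):
    """Map every read ID to the name of the FASTQ file that contains it."""
    # Assumed set of extensions; other files in the directory are skipped.
    extensions = (".fastq", ".fq", ".fastq.gz", ".fq.gz")
    read_to_fastq = {}
    for name in sorted(os.listdir(fastq_dir)):
        if name.endswith(extensions):
            for read_id in fastq_read_ids(os.path.join(fastq_dir, name)):
                read_to_fastq[read_id] = name
    return read_to_fastq


def summarize(fast5_reads, read_to_fastq):
    """Count, for each FAST5 file, how many of its reads landed in each FASTQ file.

    fast5_reads is a hypothetical input: {FAST5 file name: iterable of read IDs}.
    Reads that are absent from every FASTQ file are counted under "missing".
    """
    summary = {}
    for fast5_name, read_ids in fast5_reads.items():
        counts = defaultdict(int)
        for read_id in read_ids:
            counts[read_to_fastq.get(read_id, "missing")] += 1
        summary[fast5_name] = dict(counts)
    return summary
```

If a FAST5 file's reads all map to the FASTQ file with the matching name, the basecaller named the files consistently. Reads spread across other FASTQ files indicate the misassignment described above.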
Pre-requirements: h5py
This Python package is required for working with FAST5 files. See the Pre-requirements section above for installation details.
-h (--help) --- show help message;
-v (--version) --- show version;
-5 (--fast5-dir) --- directory that contains the FAST5 files to be processed. It may also contain files of other types;
-q (--fastq-dir) --- directory that contains the FASTQ files to be processed. It may also contain files of other types. FASTQ files can be gzipped;
-o (--outfile) --- output summary file.
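For readers who want to wrap or reimplement this interface, the options listed above could be declared with `argparse` roughly as follows. This is only a sketch: the script's real argument handling is not shown on this page, the version string is a placeholder, and the `None` defaults are assumptions.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the options documented above (hypothetical)."""
    parser = argparse.ArgumentParser(
        prog="BCsummarizer.py",
        description="Make a brief summary of basecalling.",
    )
    # -h/--help is added by argparse automatically.
    parser.add_argument(
        "-v", "--version", action="version",
        version="%(prog)s (version placeholder)",  # real version is not given here
    )
    parser.add_argument(
        "-5", "--fast5-dir", dest="fast5_dir", default=None,
        help="directory that contains the FAST5 files to be processed",
    )
    parser.add_argument(
        "-q", "--fastq-dir", dest="fastq_dir", default=None,
        help="directory that contains the FASTQ files (may be gzipped)",
    )
    parser.add_argument(
        "-o", "--outfile", dest="outfile", default=None,
        help="output summary file (default is an assumption: None)",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.fast5_dir, args.fastq_dir, args.outfile)
```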
- FAST5 files are in the directory F5_dir. Basecalled FASTQ files are in the directory FQ_dir:
./BCsummarizer.py -5 F5_dir -q FQ_dir
- FAST5 and basecalled FASTQ files are in the working directory. Write the results to the file /tmp/seq_summ.txt:
./BCsummarizer.py -5 ./ -q ./ -o /tmp/seq_summ.txt