A toolkit for processing COI amplicons sequenced on the Cyclone sequencing platform
- Clone from GitHub
$ git clone https://github.com/BioEarthDigital/CycCOI.git
from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord
import numpy as np
from collections import defaultdict
import argparse
import logging
import sys
import os
from multiprocessing import Pool
import multiprocessing
from icecream import ic
primer.list
for TAAACTTCTGGATGTCCAAAAAATCA
rev TTTCAACAAATCATAAAGATATTGG
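Amplicon reads can come off the sequencer in either orientation, so it is often useful to have the reverse complement of each primer at hand. The following is a minimal sketch, not code from this repository: it assumes primer.list is a two-column, whitespace-separated file (name, sequence), and the helper names `read_primers` and `reverse_complement` are hypothetical.

```python
# Complement table for unambiguous DNA bases only.
# IUPAC ambiguity codes (R, Y, N, ...) are NOT handled in this sketch.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def read_primers(lines):
    """Parse 'name sequence' lines into a dict {name: sequence}.

    Assumption: two whitespace-separated columns per line, as in the
    primer.list example above. Lines with another shape are skipped.
    """
    primers = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue
        name, seq = fields
        primers[name] = seq.upper()
    return primers


# The two lines shown in primer.list above.
primer_lines = [
    "for TAAACTTCTGGATGTCCAAAAAATCA",
    "rev TTTCAACAAATCATAAAGATATTGG",
]

primers = read_primers(primer_lines)
for name, seq in primers.items():
    print(name, seq, reverse_complement(seq))
```

To read a real file, pass an open file handle instead of the list, for example `read_primers(open("primer.list"))`.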
plate.index.tsv
P1 TCGGTCTTAGACG
P2 TGTGAAGTTGCCA
P3 AGATTCTACACAA
P4 ATGCGATTAATTG
P5 GGCTGTTACAACA
cell.index.tsv
001 AAAGC
002 AACAG
003 AACCT
004 AACTC
005 AAGCA
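The plate and cell (well) index tables above share the same two-column layout. As a rough illustration of how such tables could be loaded for lookup, here is a small sketch. It is not taken from pcr_demultiplex.py; the function name `read_index_table` and the choice to key the dictionary by index sequence are assumptions made for this example.

```python
def read_index_table(lines):
    """Parse 'label sequence' lines into a dict {sequence: label}.

    Assumption: two whitespace-separated columns per line (tabs or
    spaces), as in plate.index.tsv and cell.index.tsv above.
    Duplicate index sequences are rejected because they would make
    the assignment ambiguous.
    """
    table = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 2:
            raise ValueError(f"expected 2 columns, got: {line!r}")
        label, seq = fields
        seq = seq.upper()
        if seq in table:
            raise ValueError(f"duplicate index sequence: {seq}")
        table[seq] = label
    return table


# The example rows shown above.
plate_lines = [
    "P1 TCGGTCTTAGACG",
    "P2 TGTGAAGTTGCCA",
    "P3 AGATTCTACACAA",
    "P4 ATGCGATTAATTG",
    "P5 GGCTGTTACAACA",
]
cell_lines = [
    "001 AAAGC",
    "002 AACAG",
    "003 AACCT",
    "004 AACTC",
    "005 AAGCA",
]

plates = read_index_table(plate_lines)
wells = read_index_table(cell_lines)

# Exact-match lookup of a plate index and a well index.
print(plates["TCGGTCTTAGACG"], wells["AAAGC"])
```

Keying by sequence makes an exact-match lookup a single dictionary access. Tolerating mismatches or indels would need an additional distance comparison, which this sketch does not attempt.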
- QC: filter reads by length and GC content, and generate report figures
python3 bin/CycFqFilter.py -q 7 -l 700 -L 770 -g 0.2 -G 0.6 -o test.clean test.fastq.gz
- Assign reads by plate index and well index
$ python3 ../bin/pcr_demultiplex.py -p primer.txt --plate-index plate.index.tsv --well-index cell.index.tsv -f test.fa -o out --primer_max_mismatch 3 --primer_max_indel 3 --index_max_mismatch 2 --index_max_indel 1
- Make a consensus sequence for each well
python3 ../bin/cluster_consensus.py -i assigned.fa -o cluster --id_threshold 0.95 --keep_temp_files
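How cluster_consensus.py builds its consensus is not described here, so the following is only a generic illustration of what a consensus sequence is, not the script's method. It assumes the reads of one well have already been aligned to equal length, and takes a per-column majority vote. The function name `majority_consensus` is hypothetical.

```python
from collections import Counter


def majority_consensus(aligned_reads):
    """Return the per-column majority base of equal-length aligned reads.

    Assumption: reads are pre-aligned and all have the same length.
    On a tie, the base seen first in that column wins, because
    Counter.most_common keeps first-encountered order for equal counts.
    """
    if not aligned_reads:
        raise ValueError("no reads given")
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")

    consensus = []
    # zip(*reads) iterates over alignment columns.
    for column in zip(*aligned_reads):
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# Toy example: two reads each carry one isolated error.
reads = ["ACGTAC", "ACGTAC", "ACCTAC", "ACGTAT"]
print(majority_consensus(reads))
```

In the toy input, the single-base differences in the third and fourth reads are outvoted by the other reads, which is the basic reason a consensus is more accurate than any individual read.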