GitHub - maxibor/DNA_tools: A set of tools to play with/analyze genomics data

DNA TOOLS

A set of tools to play with/analyze genomics data

random_DNA.py : creates a random DNA sequence of length K

usage : python random_DNA.py K
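
As an illustration of the idea, here is a minimal sketch of a random DNA generator. It is not the script's actual code; the function name `random_dna`, the uniform draw over `ACGT`, and the optional `seed` parameter are assumptions made for this example.

```python
import random


def random_dna(length, alphabet="ACGT", seed=None):
    """Return a random DNA string of the requested length.

    Assumption: every base is drawn uniformly and independently.
    """
    # A local generator keeps runs reproducible when a seed is given.
    rng = random.Random(seed)
    return "".join(rng.choice(alphabet) for _ in range(length))
```

Calling `random_dna(20)` returns a 20-base string; passing the same `seed` twice returns the same sequence.
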
fasta_length.py : computes the length of DNA sequences in a Fasta file

usage : python fasta_length.py file.fa
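
Computing sequence lengths from a Fasta file can be sketched as below. This is a hedged illustration rather than the script itself: the function name `fasta_lengths` is hypothetical, and it assumes the sequence name is the first whitespace-delimited token of each header.

```python
def fasta_lengths(lines):
    """Map each sequence name to its length, given an iterable of FASTA lines."""
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Assumption: keep the identifier, drop the free-text description.
            fields = line[1:].split()
            name = fields[0] if fields else ""
            lengths[name] = 0
        elif name is not None:
            # Sequences may be wrapped over several lines, so accumulate.
            lengths[name] += len(line)
    return lengths
```

An open file handle can be passed directly, for example `fasta_lengths(open("file.fa"))`.
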
reverse_complement.py : computes the reverse complement of a DNA sequence

usage : python reverse_complement.py CGGGTA
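
A reverse complement can be computed in a couple of lines. The sketch below is one common way to do it, not necessarily how the script does; the names `COMPLEMENT` and `reverse_complement` are chosen for this example, and characters other than A, C, G, T (such as N) are assumed to pass through unchanged.

```python
# Translation table for the four bases, in upper and lower case.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]
```

For the sequence used in the usage line, `reverse_complement("CGGGTA")` gives `TACCCG`.
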
faStats : computes the sequence length of all sequences in a fasta file

usage : python faStats file.fa
melting_temp.py : computes the melting temperature of a sequence

usage : python melting_temp.py DnaSequence
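
This README does not say which formula is used for the melting temperature. As one possible illustration, the sketch below applies the Wallace rule, Tm = 2 x (A+T) + 4 x (G+C) in degrees Celsius, which is only a rough estimate and is usually reserved for short oligonucleotides. The function name `wallace_tm` is an assumption, and the script may well use a different method.

```python
def wallace_tm(seq):
    """Estimate the melting temperature (degrees C) with the Wallace rule.

    Assumption: the input is a short oligo containing only A, C, G and T.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc
```
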
entrez_specie.py : returns the species/organism name given an Entrez ID.

usage : python entrez_specie.py 1567
fastq_split.py : splits a merged paired-end Illumina {basename}.fastq (or compressed fastq.gz) file into {basename}.R1.fastq and {basename}.R2.fastq

usage : python fastq_split.py paired_end_file.fastq
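
The splitting step can be sketched as follows. This is an illustration under stated assumptions, not the script's code: it assumes the merged file is interleaved (each forward read is immediately followed by its reverse mate) and that every record spans exactly four lines. The helper names `open_fastq` and `split_interleaved` are hypothetical.

```python
import gzip
from itertools import islice


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


def split_interleaved(lines):
    """Yield (read1, read2) pairs, each a list of four FASTQ lines."""
    it = iter(lines)
    while True:
        pair = list(islice(it, 8))  # two 4-line records at a time
        if not pair:
            return
        if len(pair) != 8:
            raise ValueError("truncated input: incomplete read pair")
        yield pair[:4], pair[4:]
```

Each yielded `read1` can then be written to the R1 file and each `read2` to the R2 file with `writelines`.
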
centrifuge2krona : converts a centrifuge output file to a krona visualisation using centrifuge-kreport and ktImportTaxonomy.

usage : centrifuge2krona centrifuge_file.out
sam_filter : filters a sam file on identity percentage and alignment length.

usage: sam_filter file.sam
bam_filter : filters a bam file on identity percentage

usage: bam_filter file.bam
filterFastaByLength : filters a fasta file on sequence length (min and max)

usage: filterFastaByLength -min 1 -max 66 file.fa
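
Filtering on length amounts to parsing each record and keeping those within the bounds. The sketch below is a hedged example, not the tool's implementation: the names `read_fasta` and `filter_by_length` are made up for illustration, and treating both bounds as inclusive is an assumption.

```python
def read_fasta(lines):
    """Yield (header, sequence) tuples from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records, min_len=1, max_len=66):
    """Keep records whose length lies in [min_len, max_len] (assumed inclusive)."""
    return [(h, s) for h, s in records if min_len <= len(s) <= max_len]
```
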
krakenTometaphlan : converts a [Kraken] style report to a [Metaphlan] style report

usage: krakenTometaphlan -o metaphlan_report.txt kraken_report.txt
consensusMaker : creates a consensus fasta from a samtools mpileup file

usage : consensusMaker -o myconsensus.fa infile.mpileup
bed2coverage : computes the 10th percentile coverage for each feature in a BED file

usage : bed2coverage infile.bed
filterFastaByName: filters a fasta file given a list of sequence names

usage : filterFastaByName infile.fasta seqnames_to_keep.txt -o outfile.fa
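
Filtering by name can be done in a single streaming pass over the file. The following sketch shows one way to do it and is not taken from the tool: the function name `filter_by_name` is hypothetical, and matching on the first whitespace-delimited token of the header is an assumption.

```python
def filter_by_name(fasta_lines, names_to_keep):
    """Yield only the lines belonging to sequences whose name is in names_to_keep."""
    keep = set(names_to_keep)
    writing = False
    for line in fasta_lines:
        if line.startswith(">"):
            # Assumption: the sequence name is the first token after ">".
            fields = line[1:].split()
            writing = bool(fields) and fields[0] in keep
        if writing:
            yield line
```

The names would typically be read from the list file, one per line, with surrounding whitespace stripped.
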
eslfasta2fastq: extracts the headers of a fasta file formatted by Easel (hmmer toolkit) and gets the matching fastq records

usage : eslfasta2fastq fasta_input forward.fq -fq2 reverse.fq
parallel_download: downloads files from a list of files (one file per line) in parallel, using multiprocessing and subprocess calls to wget

usage: parallel_download list_of_files.txt
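
The combination of multiprocessing and subprocess described above can be sketched roughly as follows. This is an illustration only: the function names, the default of four worker processes, and the use of wget's `-q` (quiet) flag are assumptions, not details taken from the tool.

```python
import subprocess
from multiprocessing import Pool


def read_urls(lines):
    """Return the non-empty, stripped entries from an iterable of lines."""
    return [line.strip() for line in lines if line.strip()]


def download(url):
    """Fetch one URL with wget and return its exit code (0 means success)."""
    return subprocess.run(["wget", "-q", url]).returncode


def download_all(urls, processes=4):
    """Download every URL using a pool of worker processes."""
    with Pool(processes) as pool:
        return pool.map(download, urls)
```

On platforms that start worker processes with spawn, `download_all` should be called from under an `if __name__ == "__main__":` guard.
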
fasta_split: splits fasta sequences into shorter sequences, using a negative binomial distribution

usage: python fasta_split.py -m 800 input_sequences.fa
compare_fasta_seqs: compares two or more sequences in a fasta file

usage: compare_fasta_seqs multifasta.fa

To add the tools to your command prompt

Option 1 : use aliases in your ~/.bash_profile or ~/.bashrc file to add each tool one by one (replace /path/to/DNA_tools/ with the location of the tool)

example :

echo "alias revcom='python /path/to/DNA_tools/reverse_complement.py'" >> ~/.bashrc
source ~/.bashrc

Option 2 : Add all the DNA tools at once to your $PATH environment variable (replace /path/to/ with the location of the DNA_tools directory)

example :

echo 'export PATH="$PATH:/path/to/DNA_tools/"' >> ~/.bashrc
source ~/.bashrc
