nielshanson edited this page Sep 13, 2013 · 22 revisions

Welcome to the Wiki for the utilities repo. This page collects documentation and usage examples for some of the analytical scripts.

NCBI Submission

Analysis

  • calculate_4mer_freq.py: Calculates tetranucleotide (4-mer) frequencies from a set of .fasta files and writes a matrix covering all 256 possible 4-mers to a tab-delimited file, ready for loading into R.
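
To illustrate the idea behind a tetranucleotide frequency matrix, here is a minimal Python sketch. It is not the code of calculate_4mer_freq.py: the function names (`read_fasta`, `kmer_frequencies`, `to_tsv`), the `sequence` column label, and the choice to report relative frequencies and skip windows containing ambiguous bases such as N are all assumptions made for the example.

```python
from collections import Counter
from itertools import product

# All 4**4 = 256 possible 4-mers over the alphabet A, C, G, T, in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def kmer_frequencies(seq):
    """Return the relative frequency of each 4-mer, in KMERS order.

    Windows containing anything other than A, C, G, T are ignored
    (an assumption of this sketch).
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[kmer] for kmer in KMERS)
    return [counts[kmer] / total if total else 0.0 for kmer in KMERS]


def to_tsv(records):
    """Build a tab-delimited matrix: one row per sequence, one column per 4-mer."""
    rows = ["\t".join(["sequence"] + KMERS)]
    for name, seq in records:
        freqs = kmer_frequencies(seq)
        rows.append("\t".join([name] + [f"{f:.6f}" for f in freqs]))
    return "\n".join(rows)


if __name__ == "__main__":
    example = [">seq1", "ACGTACGT", ">seq2", "AAAAAAAA"]
    print(to_tsv(read_fasta(example)))
```

A table laid out this way, with a header row and tab separators, can be loaded in R with `read.table(file, header = TRUE, sep = "\t")`.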

Database Preparation

  • prepare_gg.py: The GreenGenes 16S rRNA database was updated in May 2013 and has a new home on the web. However, its format has changed: it no longer ships with a .fasta file that is amenable to BLAST database creation. This script combines the available taxonomy, GenBank, and sequence references to construct such a file.
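
The general idea of merging taxonomy into sequence headers can be sketched as follows. This is only an illustration under stated assumptions, not the logic of prepare_gg.py: it assumes a hypothetical tab-delimited taxonomy file (`id<TAB>lineage`) and sequences whose headers begin with the same ID, it does not handle GenBank records, and the function names and the `unclassified` fallback label are invented for the example.

```python
def parse_taxonomy(lines):
    """Map sequence ID -> lineage from 'id<TAB>lineage' lines (assumed format)."""
    taxonomy = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        seq_id, _, lineage = line.partition("\t")
        taxonomy[seq_id] = lineage
    return taxonomy


def annotate_fasta(fasta_lines, taxonomy):
    """Return FASTA lines whose headers carry the ID followed by its lineage."""
    out = []
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            fields = line[1:].split()
            seq_id = fields[0] if fields else ""
            # Hypothetical fallback label for IDs missing from the taxonomy table.
            lineage = taxonomy.get(seq_id, "unclassified")
            out.append(f">{seq_id} {lineage}")
        else:
            out.append(line)
    return out


if __name__ == "__main__":
    tax = parse_taxonomy(["1\tk__Bacteria; p__Firmicutes\n"])
    for row in annotate_fasta([">1\n", "ACGT\n"], tax):
        print(row)
```

A FASTA file with annotated headers like these is the kind of input that BLAST database creation expects.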
