Scripts and data associated with T7SS adaptation paper
Preprint: Mortimer TD, Weber AM, Pepperell CS. 2016. Evolutionary thrift: mycobacteria repurpose plasmid diversity during adaptation of type VII secretion systems. bioRxiv:067207. http://biorxiv.org/content/early/2016/08/01/067207
.faa files containing amino acid sequences for chromosomes and plasmids (output of prokka)
Alignments used for phylogenetic analysis
Concatenated alignment of amino acid sequences of all core genes of Actinobacteria used in this project
Alignment of eccA from plasmids
Alignment of eccB from plasmids
Alignment of espI from plasmids
Concatenated alignment of amino acid sequences of ESX-4 genes from ESX-4 and ESX-4 bis loci in Actinobacteria
Concatenated alignment of amino acid sequences of ESX genes for all ESX loci
Concatenated alignment of nucleotide sequences of ESX genes for all ESX loci
Concatenated alignment of nucleotide sequences of ESX genes for plasmid-borne ESX loci
Alignment of tcpC from plasmids
Concatenated alignment of amino acid sequences of genes common to all M. ulcerans plasmids.
Alignment of virB from plasmids
Contains input and output of HyPhy analysis of alignment trimmed with GBlocks
Contains input and output of HyPhy analysis of alignment trimmed with Guidance
Contains input and output of HyPhy analysis of untrimmed alignment
.ffn files containing nucleotide sequences for chromosomes and plasmids (output of prokka)
All orthologous groups output by OrthoMCL
Orthologous groups corresponding to the core genome
Orthologous groups containing ESX genes
Orthologous groups in plasmids
Checks that all ESX genes identified on a plasmid are on the same component in the plasmidSPAdes assembly
Usage: checkPlasmidESX.py [-h] group plasmid
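The check described above can be sketched as follows. This is a minimal illustration, not the code in checkPlasmidESX.py. It assumes that plasmidSPAdes contig names end in `_component_N`, and that a mapping from each ESX gene to the contig it was annotated on is already available. The gene names, contig names, and the `on_single_component` helper are hypothetical.

```python
import re

# Assumption: plasmidSPAdes contig names end in "_component_<N>".
COMPONENT = re.compile(r"_component_(\d+)$")


def component_of(contig_name):
    """Return the component number encoded at the end of a contig name."""
    match = COMPONENT.search(contig_name)
    if match is None:
        raise ValueError("no component suffix in contig name: " + contig_name)
    return int(match.group(1))


def on_single_component(gene_to_contig):
    """True if every gene maps to a contig from the same component."""
    components = {component_of(contig) for contig in gene_to_contig.values()}
    return len(components) == 1


# Hypothetical gene-to-contig assignments for one assembled plasmid.
same = {
    "eccA": "NODE_3_length_52000_cov_31.2_component_1",
    "eccB": "NODE_3_length_52000_cov_31.2_component_1",
    "mycP": "NODE_7_length_8100_cov_29.8_component_1",
}
split = dict(same, eccC="NODE_9_length_4000_cov_12.0_component_2")

print(on_single_component(same))   # all genes on component 1
print(on_single_component(split))  # eccC sits on component 2
```

Genes may sit on different contigs and still pass, as long as those contigs belong to the same component.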
Concatenates fasta format sequences of eccCa1 and eccCb1
Usage: concatenateEccC1.py [-h] eccCa1 eccCb1
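As an illustration of the concatenation step, the sketch below joins two sets of FASTA records end to end. It is not the code in concatenateEccC1.py. It assumes that both inputs list the same loci in the same order and keeps the header of the first input. The record names and sequences are made up.

```python
def read_fasta(lines):
    """Parse FASTA lines into an ordered {header: sequence} dict."""
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Keep only the first word of the header as the record name.
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def concatenate(first, second):
    """Join sequences record by record (assumes matching record order)."""
    if len(first) != len(second):
        raise ValueError("inputs contain different numbers of sequences")
    return {a: first[a] + second[b] for a, b in zip(first, second)}


# Toy records standing in for the eccCa1 and eccCb1 input files.
ecc_ca1 = read_fasta([">locusA_eccCa1", "MKT", "AAG", ">locusB_eccCa1", "MRS"])
ecc_cb1 = read_fasta([">locusA_eccCb1", "LLV", ">locusB_eccCb1", "PQE"])

joined = concatenate(ecc_ca1, ecc_cb1)
for name, seq in joined.items():
    print(">" + name)
    print(seq)
```

With real data, the file contents can be passed to `read_fasta` by iterating over an open file handle.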
Creates fasta files of nucleotide and amino acid sequences for alignment of ESX genes. Requires an input file formatted as follows:
species1_locus1 eccA_gene_name eccB_gene_name eccC_gene_name eccD_gene_name eccE_gene_name mycP_gene_name
species2_locus1 eccA_gene_name eccB_gene_name eccC_gene_name eccD_gene_name eccE_gene_name mycP_gene_name
species2_locus2 eccA_gene_name eccB_gene_name eccC_gene_name eccD_gene_name eccE_gene_name mycP_gene_name
Usage: esxFASTAs_oneLocus.py [-h] esx nucleotide protein
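The input table shown above could be read along these lines. This is a sketch under assumptions, not the parser used by esxFASTAs_oneLocus.py. It assumes whitespace-delimited columns (spaces or tabs) in the order locus name, eccA, eccB, eccC, eccD, eccE, mycP. The locus tags in the example are invented.

```python
# Column order taken from the example table; the first column is the locus name.
GENES = ["eccA", "eccB", "eccC", "eccD", "eccE", "mycP"]


def parse_locus_table(lines):
    """Map each locus name to a {gene: locus_tag} dict."""
    loci = {}
    for line in lines:
        fields = line.split()  # splits on spaces or tabs
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(GENES) + 1:
            raise ValueError(
                "expected %d columns, got %d: %r"
                % (len(GENES) + 1, len(fields), line)
            )
        loci[fields[0]] = dict(zip(GENES, fields[1:]))
    return loci


# Hypothetical rows; real files would list prokka locus tags.
example = [
    "species1_locus1 sp1_0010 sp1_0011 sp1_0012 sp1_0013 sp1_0014 sp1_0015",
    "species2_locus1 sp2_0200 sp2_0201 sp2_0202 sp2_0203 sp2_0204 sp2_0205",
]

table = parse_locus_table(example)
print(table["species1_locus1"]["eccC"])
print(sorted(table))
```

Rejecting rows with the wrong number of columns catches a missing gene name early, before any sequences are written.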
Creates fasta files of nucleotide and amino acid sequences for alignment. Creates separate files for each ESX locus. Each ESX locus requires an input file as described for esxFASTAs_oneLocus.py.
Usage: esxFASTAs.py [-h] esx1 esx2 esx3 esx4 esx5 nucleotide protein
Performs treescape analysis on MrBayes output
Downloads genomes from NCBI and names them genus_species_strain.fasta
Usage: getNCBIGenome.py [-h] accession email
Creates Actinobacteria core phylogeny with ESX presence/absence matrix.
Creates ESX-4 tree with bootstrap values colored according to support.
Runs prokka v 1.11 and compresses the output directory
Usage: run_prokka.sh genus species strain genus_species_strain.fasta
Uses OrthoMCL output to separate genes on the same component as the ESX locus from genes on other components, and outputs new fasta files
Usage: separateComponents.py [-h] group plasmid
Calculates the number of shared genes between all pairs of plasmids in OrthoMCL output.
Usage: sharedGeneContent.py [-h] group categories
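One way to count shared content between pairs of plasmids is sketched below. It is not the code in sharedGeneContent.py and does not model its `categories` argument. It assumes the OrthoMCL groups format `group_id: taxon|gene taxon|gene ...` and counts, for each pair of taxa, the orthologous groups in which both appear. The group and plasmid names are invented.

```python
from collections import Counter
from itertools import combinations


def taxa_per_group(lines):
    """Map each group ID to the set of taxa with a member in that group."""
    groups = {}
    for line in lines:
        if ":" not in line:
            continue  # skip blank or malformed lines
        group_id, members = line.split(":", 1)
        # Members look like "taxon|gene"; keep only the taxon part.
        groups[group_id.strip()] = {m.split("|", 1)[0] for m in members.split()}
    return groups


def shared_group_counts(groups):
    """Count the groups shared by every unordered pair of taxa."""
    counts = Counter()
    for taxa in groups.values():
        for pair in combinations(sorted(taxa), 2):
            counts[pair] += 1
    return counts


# Hypothetical groups file contents for three plasmids.
example = [
    "grp1: pA|g1 pB|g7 pC|g3",
    "grp2: pA|g2 pB|g8",
    "grp3: pB|g9 pC|g4 pC|g5",
]

counts = shared_group_counts(taxa_per_group(example))
for pair, n in sorted(counts.items()):
    print(pair, n)
```

Using a set of taxa per group means paralogs within one plasmid (such as the two pC genes in grp3) count once per group.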
Submit files and DAGs to run scripts on HTCondor
List of gene names of ESX genes in M. tuberculosis
Table containing strain/locus names and locus tags for paralogs/orthologs of eccA, eccB, eccC, eccD, eccE, and mycP
Same as above, but only plasmid-borne loci
Presence/absence matrix of ESX loci in genomes in the core genome phylogeny
Accession numbers of finished plasmids
Table matching the short names used by OrthoMCL to full names
OrthoMCL group names and the M. tuberculosis ESX genes they contain
ESX genes in M. tuberculosis with their Rv numbers and ESX locus
SRR/ERR accessions, genus, species, and strain information for assembled plasmids
Comparison of the number of shared genes between ESX-containing plasmids and whether or not they are in the same group (e.g. ESX-1P, ESX-2P, etc.)
List of all genomes in analysis
Table with genomes and accessions used in analysis
Contains trees from MrBayes phylogenetic analysis of eccA, eccB, virB, tcpC, and espI in plasmids
Core genome phylogeny
ESX-4 phylogeny
ESX phylogeny from finished genomes trimmed with Guidance
ESX phylogeny from finished genomes trimmed with Gblocks
Phylogeny of M. ulcerans plasmids