
tatumdmortimer/t7ss


t7ss

Scripts and data associated with T7SS adaptation paper

Preprint: Mortimer TD, Weber AM, Pepperell CS. 2016. Evolutionary thrift: mycobacteria repurpose plasmid diversity during adaptation of type VII secretion systems. bioRxiv:067207. http://biorxiv.org/content/early/2016/08/01/067207

aa_fastas

.faa files containing amino acid sequences for chromosomes and plasmids (output of prokka)

alignments

Alignments used for phylogenetic analysis

core_alignment.fasta

Concatenated alignment of amino acid sequences of all core genes of Actinobacteria used in this project

eccA_align_trim.fasta

Alignment of eccA from plasmids

eccB_align_trim.fasta

Alignment of eccB from plasmids

espI_align_trim.fasta

Alignment of espI from plasmids

esx4_aa_alignment.fasta

Concatenated alignment of amino acid sequences of ESX-4 genes from ESX-4 and ESX-4 bis loci in Actinobacteria

esx_aa_alignmnet.fasta

Concatenated alignment of amino acid sequences of ESX genes for all ESX loci

esx_nucleotide_alignment.fasta

Concatenated alignment of nucleotide sequences of ESX genes for all ESX loci

plasmid_ESX_nucleotide_align.fasta

Concatenated alignment of nucleotide sequences of ESX genes for plasmid-borne ESX loci

tcpC_align_trim.fasta

Alignment of tcpC from plasmids

ulcerans_plasmid_core_alignment.fasta

Concatenated alignment of amino acid sequences of genes common to all M. ulcerans plasmids.

virB_align_trim.fasta

Alignment of virB from plasmids

hyphy

gblocks

Contains input and output of HyPhy analysis of alignment trimmed with GBlocks

guidance

Contains input and output of HyPhy analysis of alignment trimmed with Guidance

untrimmed

Contains input and output of HyPhy analysis of untrimmed alignment

nucleotide_fastas

.ffn files containing nucleotide sequences for chromosomes and plasmids (output of prokka)

orthomcl

orthomcl_groups.txt

All orthologous groups output by OrthoMCL

orthomcl_core_groups.txt

Orthologous groups corresponding to the core genome

orthomcl_esx_groups.txt

Orthologous groups containing ESX genes

orthomcl_plasmids_groups.txt

Orthologous groups in plasmids

scripts

checkPlasmidESX.py

Checks that all ESX genes identified on a plasmid are on the same component in the plasmidSPAdes assembly

Usage: checkPlasmidESX.py [-h] group plasmid
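
A minimal sketch of the kind of check described above, not the code in checkPlasmidESX.py. It assumes contig names end in `_component_N`, as in plasmidSPAdes output, and that a gene-to-contig mapping is already available. The function names and the example mapping are hypothetical.

```python
import re

# Assumed naming: plasmidSPAdes contig headers end with "_component_<N>".
COMPONENT_RE = re.compile(r"_component_(\d+)$")


def component_of(contig_name):
    """Return the component number encoded in a contig name, or None."""
    match = COMPONENT_RE.search(contig_name)
    return int(match.group(1)) if match else None


def esx_on_single_component(gene_to_contig, esx_genes):
    """True if every listed ESX gene maps to a contig in the same component."""
    components = {component_of(gene_to_contig[g]) for g in esx_genes}
    return len(components) == 1 and None not in components


# Hypothetical example mapping (gene -> contig it was annotated on)
gene_to_contig = {
    "eccA": "NODE_1_length_52000_cov_30.1_component_0",
    "eccB": "NODE_4_length_8000_cov_28.7_component_0",
    "mycP": "NODE_9_length_3000_cov_5.2_component_2",
}

print(esx_on_single_component(gene_to_contig, ["eccA", "eccB"]))
print(esx_on_single_component(gene_to_contig, ["eccA", "mycP"]))
```

A contig name without a component suffix is treated as a failure rather than silently ignored, so unexpected input is flagged.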

concatenateEccC1.py

Concatenates fasta format sequences of eccCa1 and eccCb1

Usage: concatenateEccC1.py [-h] eccCa1 eccCb1
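
The concatenation step could be sketched as follows. This is an illustration only: it assumes both fasta files use the same identifier (the first word of the header) for sequences from the same strain, which may not match how concatenateEccC1.py pairs records. The helper names and example sequences are hypothetical.

```python
def read_fasta(lines):
    """Parse FASTA-formatted lines into a dict of {identifier: sequence}."""
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the identifier.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def concatenate(ecc_ca1, ecc_cb1):
    """Join eccCa1 and eccCb1 sequences for identifiers present in both."""
    return {name: ecc_ca1[name] + ecc_cb1[name]
            for name in ecc_ca1 if name in ecc_cb1}


# Hypothetical example records; real input would be read from open files.
ca1 = read_fasta([">strainA", "MKT", "AAL", ">strainB", "MRS"])
cb1 = read_fasta([">strainA", "GGV", ">strainC", "PPL"])
merged = concatenate(ca1, cb1)

for name, seq in merged.items():
    print(f">{name}\n{seq}")
```

Identifiers found in only one of the two files are dropped here, so every output record contains both halves.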

esxFASTAs_oneLocus.py

Creates fasta files of nucleotide and amino acid sequences for alignment of ESX genes. Requires input file formatted as follows:

species1_locus1 eccA_gene_name  eccB_gene_name  eccC_gene_name  eccD_gene_name  eccE_gene_name  mycP_gene_name
species2_locus1 eccA_gene_name  eccB_gene_name  eccC_gene_name  eccD_gene_name  eccE_gene_name  mycP_gene_name
species2_locus2 eccA_gene_name  eccB_gene_name  eccC_gene_name  eccD_gene_name  eccE_gene_name  mycP_gene_name

Usage: esxFASTAs_oneLocus.py [-h] esx nucleotide protein
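
As an illustration of the input format above, the table can be read into a dictionary keyed by locus. This sketch assumes the columns are tab-separated and always appear in the order eccA, eccB, eccC, eccD, eccE, mycP. The function name and example identifiers are hypothetical and are not taken from esxFASTAs_oneLocus.py.

```python
import csv
import io

# Column order assumed from the format description above.
GENES = ["eccA", "eccB", "eccC", "eccD", "eccE", "mycP"]


def read_locus_table(handle):
    """Map each locus name to a {gene: identifier} dict."""
    loci = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue
        locus, names = row[0], row[1:]
        if len(names) != len(GENES):
            raise ValueError(
                f"{locus}: expected {len(GENES)} gene names, got {len(names)}"
            )
        loci[locus] = dict(zip(GENES, names))
    return loci


# Hypothetical two-row table standing in for a real input file.
example = io.StringIO(
    "species1_locus1\tA1\tB1\tC1\tD1\tE1\tP1\n"
    "species2_locus1\tA2\tB2\tC2\tD2\tE2\tP2\n"
)
table = read_locus_table(example)
print(table["species2_locus1"]["mycP"])
```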

esxFASTAs.py

Creates fasta files of nucleotide and amino acid sequences for alignment. Creates separate files for each ESX locus. Each ESX locus requires an input file as described for esxFASTAs_oneLocus.py.

Usage: esxFASTAs.py [-h] esx1 esx2 esx3 esx4 esx5 nucleotide protein

esx_treescape.R

Performs treescape analysis on MrBayes output

getNCBIGenome.py

Downloads genomes from NCBI and names them genus_species_strain.fasta

Usage: getNCBIGenome.py [-h] accession email

plotCoreGenomeTree.R

Creates Actinobacteria core phylogeny with ESX presence/absence matrix.

plot_esx4_tree.R

Creates ESX-4 tree with bootstrap values colored according to support.

run_prokka.sh

Runs prokka v1.11 and compresses the output directory

Usage: run_prokka.sh genus species strain genus_species_strain.fasta

separateComponents.py

Uses OrthoMCL output to separate genes on the same component as the ESX locus from genes on other components, and writes new fasta files

Usage: separateComponents.py [-h] group plasmid

sharedGeneContent.py

Calculates the number of shared genes between all pairs of plasmids in OrthoMCL output.

Usage: sharedGeneContent.py [-h] group categories
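
A pairwise comparison of this kind can be sketched as below. It assumes the OrthoMCL groups file has one group per line in the form `group_id: taxon|gene taxon|gene ...`, with each taxon prefix identifying one plasmid. It counts shared orthologous groups per pair, which may differ from how sharedGeneContent.py counts genes. The function name and example groups are hypothetical.

```python
from collections import Counter
from itertools import combinations


def shared_group_counts(lines):
    """Count, for every pair of taxa, the groups in which both occur."""
    counts = Counter()
    for line in lines:
        if ":" not in line:
            continue
        _, members = line.split(":", 1)
        # A taxon counts once per group, however many paralogs it has.
        taxa = sorted({m.split("|", 1)[0] for m in members.split()})
        for pair in combinations(taxa, 2):
            counts[pair] += 1
    return counts


# Hypothetical groups standing in for lines of an OrthoMCL groups file.
groups = [
    "grp1: pA|g1 pB|g7 pC|g2",
    "grp2: pA|g3 pA|g4 pB|g9",
    "grp3: pC|g5",
]
counts = shared_group_counts(groups)

for (first, second), n in sorted(counts.items()):
    print(first, second, n)
```

Sorting the taxa before pairing gives each pair a single canonical key, so `("pA", "pB")` and `("pB", "pA")` are never counted separately.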

submit_files

Submit files and DAGs to run scripts on HTCondor

get_genomes.dag
get_genomes.submit
prokka.dag

tables

esx_gene_names.txt

List of gene names of ESX genes in M. tuberculosis

esx_genes_allCopies.txt

Table containing strain/locus names and locus tags for paralogs/orthologs of eccA, eccB, eccC, eccD, eccE, and mycP

esx_genes_plasmids.txt

Same as above, but only plasmid-borne loci

esxMatrix.txt

Presence/absence matrix of ESX loci in genomes in the core genome phylogeny

finished_plasmids_accessions.txt

Accession numbers of finished plasmids

genomeNames.txt

Table matching the short names used by OrthoMCL to full genome names

group_mtb_genes.txt

OrthoMCL group names and the M. tb ESX genes they contain

mtb_esx_genes.txt

ESX genes in M.tb with their Rv numbers and ESX locus

plasmid_sra.txt

SRR/ERR accessions, genus, species, and strain information for assembled plasmids

shared_plasmid_genes.txt

Comparison of the number of shared genes between ESX-containing plasmids and whether or not they are in the same group (e.g. ESX-1P, ESX-2P, etc.)

strains.txt

List of all genomes in analysis

T7SS_accessions.tsv

Table with genomes and accessions used in analysis

trees

mrbayes

Contains trees from MrBayes phylogenetic analysis of eccA, eccB, virB, tcpC, and espI in plasmids

RAxML_bipartitionsBranchLabels.core_longNames

Core genome phylogeny

RAxML_bipartitionsBranchLabels_esx4.newick

ESX-4 phylogeny

RAxML_bipartitionsBranchLabels.esx_combine_guidance

ESX phylogeny from finished genomes trimmed with Guidance

RAxML_bipartitionsBranchLabels.esx_combine_trimmed

ESX phylogeny from finished genomes trimmed with Gblocks

RAxML_bipartitionsBranchLabels.ulceransCore_combine

Phylogeny of M. ulcerans plasmids
