Genomic analysis of Mycobacterium brumae sustains its nonpathogenic and immunogenic phenotype

Chantal Renau-Mínguez¹, Paula Herrero-Abadía², Vicente Sentandreu³, Paula Ruiz-Rodriguez¹, Eduard Torrents⁴,⁵, Álvaro Chiner-Oms⁶, Manuela Torres-Puente⁶, Iñaki Comas⁶, Esther Julián²* and Mireia Coscolla¹*

¹ I²SysBio, University of Valencia-CSIC, FISABIO Joint Research Unit Infection and Public Health, Valencia, Spain
² Genetics and Microbiology Department, Faculty of Biosciences, Autonomous University of Barcelona, 08193, Bellaterra, Barcelona, Spain
³ Genomics Unit, Central Service for Experimental Research (SCSIE), University of Valencia, Spain
⁴ Bacterial Infections and Antimicrobial Therapies Group, Institute for Bioengineering of Catalonia (IBEC), Baldiri Reixac 15-21, 08028 Barcelona, Spain
⁵ Microbiology Section, Department of Genetics, Microbiology and Statistics, Biology Faculty, Universitat de Barcelona, 08028 Barcelona, Spain
⁶ Instituto de Biomedicina de Valencia (IBV), CSIC, 46010, Valencia, Spain

* Correspondence: mireia.coscolla@uv.es (Mireia Coscolla); Esther.Julian@uab.cat (Esther Julián)

Aim of this repository

The main purpose of this repository is to share the scripts written for this academic work so that the analysis can be reproduced. It also makes the closed genome of Mycobacterium brumae ATCC 51384ᵀ publicly available.

Scripts

This folder contains several subfolders, each holding the scripts for one specific analysis.

Dependencies

Python version 3
BLAST+
MUMmer 3

blast_analysis

Scripts that calculate the identity of Mycobacterium tuberculosis H37Rv proteins in the analyzed genomes of interest.

run_blast.py: script that runs a tblastn analysis. It extracts the genes of interest (amino acid sequences) and searches for them in a FASTA file, calculates the coincidence percentage between each gene and its target, and finally filters the genes by three coincidence percentage thresholds: 80, 70 and 60.

Command

python3 run_blast.py -g gene.txt -f brumae.fasta -p h37rv.fasta -n brumae_find
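
The threshold filtering described above can be illustrated with a minimal Python sketch. It is not the code of run_blast.py: it assumes the tblastn hits are available in BLAST+ tabular format (-outfmt 6), uses the percent identity column (pident) as a stand-in for the coincidence percentage, and the function names and gene IDs are hypothetical.

```python
import io

# Assumption: hits come in BLAST+ tabular format (-outfmt 6), where
# column 0 is the query id (qseqid) and column 2 is percent identity (pident).
QUERY_COL, PIDENT_COL = 0, 2
THRESHOLDS = (80.0, 70.0, 60.0)


def best_identity_per_gene(handle):
    """Return {query id: highest percent identity} from tabular BLAST rows."""
    best = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = line.split("\t")
        gene, pident = cols[QUERY_COL], float(cols[PIDENT_COL])
        best[gene] = max(pident, best.get(gene, 0.0))
    return best


def filter_by_thresholds(best, thresholds=THRESHOLDS):
    """Map each threshold to the sorted genes whose best hit reaches it."""
    return {t: sorted(g for g, p in best.items() if p >= t) for t in thresholds}


# Made-up rows for illustration only; geneA, geneB and geneC are placeholders.
example = io.StringIO(
    "geneA\tcontig1\t85.2\n"
    "geneB\tcontig1\t72.0\n"
    "geneB\tcontig1\t64.5\n"
    "geneC\tcontig1\t41.3\n"
)
for threshold, genes in filter_by_thresholds(best_identity_per_gene(example)).items():
    print(threshold, genes)
```

Keeping only the best hit per gene is a design choice of this sketch; the actual script may compute the percentage differently, for example over the full gene length.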

duplicated_genes

Scripts that identify genes with less than 300 bp repeated, so that these genes can be excluded from the Illumina genomic analysis.

clean_genes.py: script that converts a GFF file into a tab-separated file with gene ID, orientation, start nucleotide and end nucleotide.
multifasta.py: script that extracts the nucleotide sequence of each gene from a FASTA file and a TSV file.
process_mummer_output.py: script that processes the MUMmer output.
command_mummer.sh: bash script that runs, in sequential order, the steps to get repeated genes from a GFF file and a FASTA file; it also includes the MUMmer command "run-mummer3".
duplicated_genescoord.tsv: tab-separated file with the genes to exclude from the Illumina analysis, one gene per line: gene ID + "\t" + orientation + "\t" + start + "\t" + end + "\n"
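
As a rough illustration of the first two steps listed above (GFF to tab-separated coordinates, then per-gene nucleotide sequences), here is a minimal Python sketch. It is not the code of clean_genes.py or multifasta.py. It assumes standard 9-column GFF3 input with an ID attribute, "+" and "-" as orientation values, and 1-based inclusive coordinates; reverse-complementing minus-strand genes is also an assumption, and all function names are hypothetical.

```python
import io

# Translation table used to complement a nucleotide sequence.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def gff_gene_rows(handle):
    """Yield (gene_id, strand, start, end) for each 'gene' feature in a GFF3 stream."""
    for line in handle:
        cols = line.rstrip("\n").split("\t")
        if line.startswith("#") or len(cols) != 9 or cols[2] != "gene":
            continue  # skip comments, malformed lines and non-gene features
        attrs = dict(f.split("=", 1) for f in cols[8].split(";") if "=" in f)
        yield attrs.get("ID", "unknown"), cols[6], int(cols[3]), int(cols[4])


def format_row(gene_id, strand, start, end):
    """Format one line as gene id, orientation, start and end separated by tabs."""
    return f"{gene_id}\t{strand}\t{start}\t{end}\n"


def gene_sequence(genome, strand, start, end):
    """Slice a 1-based inclusive interval; reverse-complement minus-strand genes."""
    seq = genome[start - 1:end]
    return seq.translate(COMPLEMENT)[::-1] if strand == "-" else seq


# Made-up input for illustration only.
gff = io.StringIO(
    "##gff-version 3\n"
    "chr\tsrc\tgene\t2\t5\t.\t-\t.\tID=gene1;Name=demo\n"
    "chr\tsrc\tCDS\t2\t5\t.\t-\t0\tID=cds1\n"
)
genome = "AAACGG"
for gene_id, strand, start, end in gff_gene_rows(gff):
    print(format_row(gene_id, strand, start, end), end="")
    print(gene_sequence(genome, strand, start, end))
```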

Closed genome

This folder contains the files for the closed genome analyzed in this study: Mycobacterium brumae ATCC 51384ᵀ

brumae.fasta: FASTA file with the genomic sequence of Mycobacterium brumae ATCC 51384ᵀ.
brumae.gff: annotation file used in this study.
brumae.sqn: file for submission to NCBI.

Repository: PathoGenOmics/mbrumae_closedgenome
