PathoGenOmics/mbrumae_closedgenome

Genomic analysis of Mycobacterium brumae sustains its nonpathogenic and immunogenic phenotype

Chantal Renau-Mínguez1, Paula Herrero-Abadía2, Vicente Sentandreu3, Paula Ruiz-Rodriguez1, Eduard Torrents4,5, Álvaro Chiner-Oms6, Manuela Torres-Puente6, Iñaki Comas6, Esther Julián2* and Mireia Coscolla1*

  1. I2SysBio, University of Valencia-CSIC, FISABIO Joint Research Unit Infection and Public Health, Valencia, Spain

  2. Genetics and Microbiology Department, Faculty of Biosciences, Autonomous University of Barcelona, 08193, Bellaterra, Barcelona, Spain

  3. Genomics Unit, Central Service for Experimental Research (SCSIE), University of Valencia, Spain

  4. Bacterial Infections and Antimicrobial Therapies Group, Institute for Bioengineering of Catalonia (IBEC), Baldiri Reixac 15-21, 08028 Barcelona, Spain

  5. Microbiology Section, Department of Genetics, Microbiology and Statistics, Biology Faculty, Universitat de Barcelona, 08028 Barcelona, Spain

  6. Instituto de Biomedicina de Valencia (IBV), CSIC, 46010, Valencia, Spain

* Correspondence:

mireia.coscolla@uv.es (Mireia Coscolla); Esther.Julian@uab.cat (Esther Julián)

Aim of this repository

The main purpose of this repository is to share the scripts written for this academic work so that the analyses can be reproduced. It also makes the closed genome of Mycobacterium brumae ATCC 51384T publicly available.

Scripts

This folder contains several subfolders, each holding the scripts written for a specific analysis.

Dependencies

Python version 3
BLAST+
MUMmer 3

blast_analysis

Scripts to compute the protein identity of Mycobacterium tuberculosis H37Rv genes in the analyzed genomes of interest.

  • run_blast.py: script that runs a tblastn analysis. It extracts the selected genes as amino acid (aa) sequences and searches for them in a FASTA file. It then calculates the coincidence percentage of each gene with the target and filters the genes by three coincidence percentage thresholds: 80, 70 and 60.

Command

python3 run_blast.py -g gene.txt -f brumae.fasta -p h37rv.fasta -n brumae_find
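
The thresholding step described for run_blast.py can be sketched as follows. This is a minimal illustration, not the repository's script. It assumes tblastn was run with tabular output (`-outfmt 6`), where the third column is the percent identity, and that the "coincidence percentage" corresponds to that value. The function name `bin_hits_by_identity` and the sample rows are hypothetical.

```python
THRESHOLDS = (80, 70, 60)  # thresholds named in the README


def bin_hits_by_identity(blast_lines, thresholds=THRESHOLDS):
    """Group query ids by identity threshold.

    blast_lines: iterable of BLAST tabular (outfmt 6) rows, assumed layout:
    qseqid, sseqid, pident, length, mismatch, gapopen,
    qstart, qend, sstart, send, evalue, bitscore.
    Returns {threshold: sorted list of queries whose best hit >= threshold}.
    """
    best = {}  # best percent identity seen for each query
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        query, pident = fields[0], float(fields[2])
        best[query] = max(pident, best.get(query, 0.0))
    return {t: sorted(q for q, p in best.items() if p >= t) for t in thresholds}


# Hypothetical rows, for illustration only (not results from the study).
sample = [
    "Rv0001\tcontig1\t85.20\t507\t75\t0\t1\t507\t1\t1521\t0.0\t900",
    "Rv0002\tcontig1\t72.50\t402\t110\t1\t1\t402\t2052\t3257\t1e-150\t520",
    "Rv0002\tcontig1\t61.00\t100\t39\t0\t1\t100\t5000\t5300\t1e-20\t90",
    "Rv0003\tcontig1\t55.00\t200\t90\t0\t1\t200\t7000\t7600\t1e-10\t60",
]

for threshold, genes in bin_hits_by_identity(sample).items():
    print(threshold, genes)
```

Keeping only the best hit per query is a design choice of this sketch; the actual script may treat multiple hits differently.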

duplicated_genes

Scripts to identify genes with less than 300 bp repeated, so that these genes can be excluded from the Illumina genomic analysis.

  • clean_genes.py: script that converts a GFF file into a tab-separated file with gene id, orientation, start nt and end nt.
  • multifasta.py: script that extracts the nt sequence of each gene from a FASTA file and a TSV file.
  • process_mummer_output.py: script to process the MUMmer output.
  • command_mummer.sh: bash script that runs the steps in sequential order to get the repeated genes from a GFF file and a FASTA file. It also includes the MUMmer command "run-mummer3".
  • duplicated_genescoord.tsv: tab-separated file with the genes to exclude from the Illumina analysis, one gene per line: gene id + "\t" + orientation + "\t" + start + "\t" + end + "\n"
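
The GFF-to-table step attributed to clean_genes.py can be sketched as below. This is an illustrative sketch, not the repository's code. It assumes a standard GFF3 file with nine tab-separated columns, that gene rows carry `gene` in the type column, and that the gene id is stored in the `ID` attribute. The function names `gff_to_rows` and `rows_to_tsv` and the example records are hypothetical. The output layout follows the one described for duplicated_genescoord.tsv: gene id, orientation, start, end.

```python
def gff_to_rows(gff_lines, feature_type="gene"):
    """Return (gene_id, strand, start, end) tuples for rows of the given type."""
    rows = []
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, headers and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != feature_type:
            continue  # keep only well-formed rows of the requested type
        start, end, strand = int(cols[3]), int(cols[4]), cols[6]
        # Column 9 holds key=value pairs separated by semicolons.
        attrs = dict(
            item.split("=", 1) for item in cols[8].split(";") if "=" in item
        )
        rows.append((attrs.get("ID", "."), strand, start, end))
    return rows


def rows_to_tsv(rows):
    """Format rows as: gene id, orientation, start, end (tab-separated)."""
    return "".join(f"{g}\t{o}\t{s}\t{e}\n" for g, o, s, e in rows)


# Hypothetical GFF3 records, for illustration only.
example_gff = [
    "##gff-version 3",
    "chr\texample\tgene\t1\t1524\t.\t+\t.\tID=gene_0001;Name=dnaA",
    "chr\texample\tCDS\t1\t1524\t.\t+\t0\tID=cds_0001;Parent=gene_0001",
    "chr\texample\tgene\t2052\t3260\t.\t-\t.\tID=gene_0002",
]

print(rows_to_tsv(gff_to_rows(example_gff)), end="")
```

Coordinates are kept exactly as they appear in the GFF (1-based, inclusive), so any script that later slices the FASTA sequence needs to account for that convention.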

Closed genome

This folder contains the files related to the closed genome analyzed in this study, Mycobacterium brumae ATCC 51384T:

  • brumae.fasta: fasta file with the genomic sequence of Mycobacterium brumae ATCC 51384T
  • brumae.gff: annotation file used in this study.
  • brumae.sqn: file for submission to NCBI.
