DELEAT v0.1 (deletion design by essentiality analysis tool)

DELEAT is a bioinformatic analysis pipeline for the design of large-scale genome deletions in bacterial genomes.

An article describing implementation details and a usage example has been published in BMC Bioinformatics. Please cite as: Solana, J., Garrote-Sánchez, E. & Gil, R. DELEAT: gene essentiality prediction and deletion design for bacterial genome reduction. BMC Bioinformatics 22, 444 (2021). https://doi.org/10.1186/s12859-021-04348-5

DELEAT uses a logistic regression classifier, trained on a selection of organisms from the Database of Essential Genes (DEG), to assign an essentiality score to each gene in the genome. It then uses these scores to determine non-essential regions.
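As a rough illustration of how a logistic regression model turns gene features into a score between 0 and 1, here is a minimal sketch. The features, weights and bias are hypothetical placeholders, not the values or features of the trained DELEAT model.

```python
import math


def essentiality_score(features, weights, bias):
    """Logistic function applied to a weighted sum of gene features.

    All inputs here are hypothetical; the real classifier's features
    and coefficients are described in the DELEAT article.
    """
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# Two made-up features for one gene, with made-up coefficients.
score = essentiality_score(features=[1.2, 0.4], weights=[2.0, 1.5], bias=-1.0)
```

A score close to 1 would mark the gene as likely essential, and a score close to 0 as likely dispensable.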

Once deletions are designed, DELEAT provides reports and a circular genome plot to visualise the potential genome reduction.

Installation

With Conda:

1. Clone the repository: git clone https://github.com/jime-sg/deleat.git deleat && cd deleat
2. Create a Conda env from the deleat_env.txt file: conda create --name deleat-v0.1 --file deleat_env.txt. This will install all dependencies.
3. Make the deleat command available: edit your ~/.bashrc file to include alias deleat="python /your/path/to/deleat/deleat-v0.1/deleat.py" (change /your/path/to/ to the appropriate path).

With Docker + Conda:

1. Clone the repository: git clone https://github.com/jime-sg/deleat.git deleat && cd deleat
2. Build the Docker image: docker build -t deleat .
3. Run the container with an interactive shell, mounting your GenBank annotation file for analysis: docker run -it -v <genbank_file_path>:/home/genbank/<genbank_filename> deleat

Usage

First, activate the Conda env: conda activate deleat-v0.1

General usage: deleat <step name> <step arguments>

Steps (use deleat <step name> -h to see command-line usage for each step):

predict-essentiality
define-deletions
revise-deletions
summarise
design-all-primers / design-primers

Step 1: predict-essentiality

Get predicted essentiality scores for all genes in a bacterial genome.

Usage: deleat predict-essentiality -g <GenBank file> -o <output dir> [-p1 <ori>] [-p2 <ter>] -n <n_threads>

The only required input is the GenBank annotation file of your organism of interest. Essentiality scores will be calculated for all genes annotated with a unique locus_tag identifier. If you know the coordinates of the replication origin (ori) and terminus (ter), you should provide them as well; otherwise they will be inferred by the GC skew method.
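For readers unfamiliar with the GC skew method, the following is a minimal sketch of the general idea, not DELEAT's own code. It assumes the common convention that the cumulative GC skew, (G - C) / (G + C) summed over consecutive windows, reaches its minimum near ori and its maximum near ter. The function names and the window size are illustrative, and coordinates are 0-based.

```python
def cumulative_gc_skew(seq, window=1000):
    """Return (window end positions, cumulative GC skew after each window)."""
    positions, cumulative = [], []
    total = 0.0
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window].upper()
        g, c = chunk.count("G"), chunk.count("C")
        if g + c:  # skip windows with no G or C to avoid dividing by zero
            total += (g - c) / (g + c)
        positions.append(min(start + window, len(seq)))
        cumulative.append(total)
    return positions, cumulative


def infer_ori_ter(seq, window=1000):
    """Estimate ori (cumulative minimum) and ter (cumulative maximum)."""
    positions, cumulative = cumulative_gc_skew(seq, window)
    ori = positions[cumulative.index(min(cumulative))]
    ter = positions[cumulative.index(max(cumulative))]
    return ori, ter


# Toy sequence: C-rich, then G-rich, then C-rich again.
toy = "C" * 3000 + "G" * 4000 + "C" * 3000
ori, ter = infer_ori_ter(toy, window=1000)
```

On real genomes the skew signal is noisy, so known ori and ter coordinates are preferable whenever they are available.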

Results are saved to a modified-I GenBank file (.gbm1), which includes essentiality scores. This file can be visualised with genome browsing tools such as Artemis, and scores can be edited manually if necessary.

Step 2: define-deletions

Compute table of proposed deletions.

Usage: deleat define-deletions -g1 <modified-I GenBank file> -o <output dir> -l <min deletion length> -e <essentiality score threshold> [-m <non-coding margin>]

An essentiality threshold of about 0.75 is recommended, but feel free to adjust it for your specific case.

By default, a non-coding margin of 200 bp is retained around essential genes. You can change this setting if you know of shorter or longer cis-regulatory elements in your genome.

Results are saved to a modified-II GenBank file (.gbm2), which includes essentiality scores and proposed deletions, and a deletion table in CSV format.
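The interplay between threshold, margin and minimum length can be sketched as follows. This is a simplified illustration under stated assumptions, not the DELEAT algorithm: it treats genes scoring at or above the threshold as essential, handles the genome as linear (no wrap-around at the origin), and uses 0-based half-open coordinates. The function name and the tuple layout are hypothetical.

```python
def propose_deletions(genes, genome_length, min_length, threshold=0.75, margin=200):
    """Return candidate deletions as (start, end) intervals.

    genes: iterable of (start, end, essentiality_score) tuples.
    Regions between protected essential genes that are at least
    min_length long are reported as candidate deletions.
    """
    # Essential genes plus their non-coding margin are protected.
    protected = sorted(
        (max(0, start - margin), min(genome_length, end + margin))
        for start, end, score in genes
        if score >= threshold
    )
    deletions = []
    cursor = 0  # end of the last protected region seen so far
    for p_start, p_end in protected:
        if p_start - cursor >= min_length:
            deletions.append((cursor, p_start))
        cursor = max(cursor, p_end)
    if genome_length - cursor >= min_length:
        deletions.append((cursor, genome_length))
    return deletions


example_genes = [(1000, 2000, 0.9), (3000, 4000, 0.1), (9000, 9500, 0.8)]
candidates = propose_deletions(example_genes, genome_length=12000, min_length=1000)
```

Raising the threshold protects fewer genes and so tends to yield longer candidate deletions, while a larger margin shortens them.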

Step 3: revise-deletions

Redefine deletion list after manual curation.

Usage: deleat revise-deletions -g2 <modified-II GenBank file> -t <modified table of proposed deletions> -o <output dir>

Edit the output table from step 2 to manually curate the list of deletions. To eliminate a deletion, either delete the corresponding line or comment it out (#). Step 3 takes the first three columns (deletion name and coordinates) as input and updates the output files accordingly.
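To make the curation rules concrete, the sketch below reads such a table the way the paragraph above describes: commented lines are skipped and only the first three columns are used. It is an illustration rather than DELEAT's parser. The header names and the sample rows are made up, and it assumes a comma-separated file in which any header row has non-numeric coordinate fields.

```python
import csv
import io


def read_curated_deletions(handle):
    """Return (name, start, end) for every active row of a deletion table."""
    deletions = []
    for row in csv.reader(handle):
        # Skip blank, commented-out (#) and incomplete rows.
        if len(row) < 3 or row[0].lstrip().startswith("#"):
            continue
        name, start, end = (field.strip() for field in row[:3])
        # Skip a header row (assumed to have non-numeric coordinates).
        if not (start.isdigit() and end.isdigit()):
            continue
        deletions.append((name, int(start), int(end)))
    return deletions


# Hypothetical table contents; D2 has been commented out by the curator.
sample_table = io.StringIO(
    "name,start,end,length\n"
    "D1,100,5000,4900\n"
    "#D2,6000,9000,3000\n"
    "D3,12000,20000,8000\n"
)
curated = read_curated_deletions(sample_table)
```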

Step 4: summarise

Generate final reports about deletion design and genome reduction process.

Usage: deleat summarise -g3 <modified-II GenBank file> -o <output dir> [-p1 <ori>] [-p2 <ter>]

Step 5: design-all-primers / design-primers

Design primers for large genome deletions by megapriming.

Usage:

deleat design-all-primers -g3 <modified-II GenBank file> -o <output dir> -e <restriction enzyme> [-L <min length for homologous recombination>]
deleat design-primers -g <genome sequence (FASTA)> -o <output dir> -n <deletion name> -d1 <del start> -d2 <del end> -e <restriction enzyme> [-L <min length for homologous recombination>]
