panISa

panISa is a program that identifies insertion sequences (IS) in bacterial genomes from resequencing data (BAM file).

Idea

panISa searches for insertion sequences ab initio (i.e. with a database-free approach) in bacterial genomes from short-read NGS data. Briefly, the software identifies the signature of an insertion in the alignment by counting clipped reads at the start and end positions of the potential IS. These clipped reads overlap the direct repeats created by the IS insertion. Finally, panISa reconstructs the beginning of both sides of the IS (IRL and IRR) and validates the IS by searching these reconstructions for inverted repeat regions.

Principle of panISa
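
The clipped-read signature described above can be illustrated with a short Python sketch. This is not panISa's own code: the function names, the input format (tuples of 1-based position, CIGAR string and mapping quality) and the pairing rule are assumptions made for illustration. The default thresholds mirror the documented defaults of the -q, -m and -s options.

```python
import re
from collections import Counter

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference
CLIP_OPS = set("SH")          # soft and hard clipping


def clip_positions(pos, cigar):
    """Return (left, right) reference positions where a read is clipped.

    `pos` is the 1-based leftmost aligned position (SAM POS field).
    Either value is None when the read is not clipped on that side.
    """
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]
    if not ops:
        return None, None
    ref_len = sum(n for n, op in ops if op in REF_CONSUMING)
    left = pos if ops[0][1] in CLIP_OPS else None
    right = pos + ref_len - 1 if ops[-1][1] in CLIP_OPS else None
    return left, right


def count_clipped(reads, min_mapq=20):
    """Count clipped reads per position, skipping low-quality alignments."""
    starts, ends = Counter(), Counter()
    for pos, cigar, mapq in reads:
        if mapq < min_mapq:
            continue
        left, right = clip_positions(pos, cigar)
        if left is not None:
            starts[left] += 1   # read clipped at its beginning
        if right is not None:
            ends[right] += 1    # read clipped at its end
    return starts, ends


def candidate_sites(starts, ends, min_clipped=10, max_repeat=20):
    """Pair well-supported start/end positions separated by a short direct repeat.

    Assumed rule: the start position is the first base of the direct repeat
    and the end position its last base, so 0 <= end - start < max_repeat.
    """
    sites = []
    for end, n_end in sorted(ends.items()):
        if n_end < min_clipped:
            continue
        for start, n_start in sorted(starts.items()):
            if n_start >= min_clipped and 0 <= end - start < max_repeat:
                sites.append((start, end, n_start, n_end))
    return sites
```

In a real run the (position, CIGAR, quality) tuples would come from the BAM file, for example through pysam. Here they are kept as plain tuples so that the sketch has no dependencies.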

Requirements and Installation

Requirements

The program uses the Python libraries pysam (>=0.9) and requests (>=2.12).

You also need to install the EMBOSS package.

On Debian, type:

sudo apt-get install python-pysam python-requests emboss
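
To confirm that the Python requirements are met after installation, a check along the following lines may help. It is only a sketch: the helper names are hypothetical, the version parsing is deliberately simple, and it does not check for EMBOSS.

```python
from importlib import metadata

# Minimum versions stated in the requirements above.
MINIMUM_VERSIONS = {"pysam": (0, 9), "requests": (2, 12)}


def parse_version(text):
    """Turn a string such as '0.15.2' into (0, 15, 2).

    Only the leading digits of each dot-separated part are used, and
    parsing stops at the first part that does not start with a digit.
    """
    parts = []
    for chunk in text.split("."):
        digits = ""
        for ch in chunk:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_requirements(minimums=MINIMUM_VERSIONS):
    """Return {package: 'ok' | 'too old' | 'missing'} for each requirement."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = parse_version(metadata.version(name))
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if installed >= minimum else "too old"
    return report
```

Calling check_requirements() with no arguments reports the status of pysam and requests in the current environment.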

Installation

Download the current tarball and extract it.

Verify the installation using the test file:

python panISa.py test/test.bam

Alternatively, you can install panISa from PyPI:

pip install panisa

Command and Options

python panISa.py [options] bam
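
When panISa is driven from another Python script, the command line can be assembled programmatically. The helper below is a hypothetical convenience, not part of panISa. It uses only the short flags documented in the Options section and assumes that panISa.py is in the working directory.

```python
import sys


def build_panisa_command(bam, output=None, min_quality=None, min_clipped=None,
                         max_repeat=None, min_fraction=None, script="panISa.py"):
    """Build the argument list for one panISa run.

    Options left as None are omitted, so panISa applies its own defaults.
    """
    cmd = [sys.executable, script]
    flags = (
        ("-o", output),
        ("-q", min_quality),
        ("-m", min_clipped),
        ("-s", max_repeat),
        ("-p", min_fraction),
    )
    for flag, value in flags:
        if value is not None:
            cmd += [flag, str(value)]
    cmd.append(bam)  # the BAM file is the final positional argument
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True) to launch the analysis.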

Options

-h Show this help message and exit
-o Write the list of IS insertions found in the alignment to this file [stdout]
-q Minimum alignment quality for a clipped read to be kept [20]
-m Minimum number of clipped reads at a position to search for an IS there [10]
-s Maximum size of the direct repeat region [20bp]
-p Minimum proportion of identical bases required to build the consensus [0.8]
-v Show the program's version number and exit
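
The effect of the -p threshold can be pictured with a column-wise consensus over the clipped parts of the reads. The sketch below is an illustration under assumptions, not panISa's implementation: the function name is hypothetical, sequences are assumed to be left-aligned (sequences clipped at the beginning of a read would need to be reversed first), and positions below the threshold are reported as N.

```python
from collections import Counter


def consensus(sequences, min_fraction=0.8, unknown="N"):
    """Build a consensus from left-aligned sequences of possibly unequal length.

    At each column the most frequent base is kept if it accounts for at
    least `min_fraction` of the sequences covering that column; otherwise
    `unknown` is emitted.
    """
    if not sequences:
        return ""
    length = max(len(seq) for seq in sequences)
    result = []
    for i in range(length):
        column = [seq[i] for seq in sequences if i < len(seq)]
        base, count = Counter(column).most_common(1)[0]
        result.append(base if count / len(column) >= min_fraction else unknown)
    return "".join(result)
```

With the default of 0.8, a column where four of five reads agree keeps its base, while a column split evenly between two bases becomes N.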

Output

panISa returns its results in tabular format with the following columns:

Chromosome:
chromosome id
End position:
position of the last base of the direct repeat, which is the left boundary of the potential IS (IRL)
End clipped reads:
number of clipped reads (end position)
Direct repeat:
nucleotide sequence of the direct repeat
Start position:
position of the first base of the direct repeat, which is the right boundary of the potential IS (IRR)
Start clipped reads:
number of clipped reads (start position)
Inverted repeats:
nucleotide sequences of the inverted repeats and their positions
IS left sequence:
reconstruction of the left boundary of the potential IS (IRL)
IS right sequence:
reconstruction of the right boundary of the potential IS (IRR)
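
For downstream analysis, the table can be loaded into Python records. The sketch below rests on assumptions that should be checked against a real output file: fields are tab-separated, columns appear in the order listed above, and the first line is a header. The field names and the function name are hypothetical.

```python
import csv

# Field names chosen for this sketch, in the column order documented above.
COLUMNS = [
    "chromosome", "end_position", "end_clipped_reads", "direct_repeat",
    "start_position", "start_clipped_reads", "inverted_repeats",
    "is_left_sequence", "is_right_sequence",
]
INTEGER_FIELDS = (
    "end_position", "end_clipped_reads",
    "start_position", "start_clipped_reads",
)


def parse_panisa_table(handle, has_header=True):
    """Read a panISa result table from an open text handle into dicts."""
    reader = csv.reader(handle, delimiter="\t")
    if has_header:
        next(reader, None)  # skip the assumed header line
    records = []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        record = dict(zip(COLUMNS, row))
        for key in INTEGER_FIELDS:
            record[key] = int(record[key])
        records.append(record)
    return records
```

A typical use would be to open the file written with the -o option and pass the handle to parse_panisa_table, then filter the records by the number of clipped reads.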

Validation

panISa results can be searched for homology against ISFinder to identify the IS family, using the script ISFinder_search.py:

python ISFinder_search.py [options] panISa results

Recommendation

panISa works well with alignments produced by BWA.
