The SWORD2 partitioning algorithm produces multiple alternative domain assignments for a given protein structure. This unique approach handles ambiguous protein structure partitioning, admitting several solutions. The decomposition of the protein structure into domains is achieved through the hierarchical clustering of Protein Units, evolutionarily preserved structural descriptors at the interface between secondary structures and domains.
This is the repository of the standalone version of the corresponding webserver:
https://www.dsimb.inserm.fr/SWORD2/index.html
(Easy & recommended)
Install the conda environment using the environment.yml file:
conda env create -f environment.yml
or
mamba env create -f environment.yml
Compile necessary dependencies:
bash install.sh
Alternatively, a Docker image is available:
docker pull dsimb/sword2
Launch the Docker version:
# This mounts the local `results` directory inside the Docker container and runs the program there as you (the local user),
# so that the generated results are owned by you and have the correct permissions.
docker run -it -e LOCAL_UID=$(id -u $USER) -e LOCAL_GID=$(id -g $USER) -v $(pwd)/results:/app/results dsimb/sword2 -p 1jx4 -o results
First, activate the working environment:
conda activate sword2
Then, launch SWORD2:
./SWORD2.py -p 1jx4 -o results
./SWORD2.py -u Q76EI6 -o results
./SWORD2.py -m MGYP000936678158 -o results
./SWORD2.py -i ./structure.pdb -o results
./SWORD2.py -i ./structure.pdb -d 2 -o results
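The invocations above can also be assembled programmatically, for example when running SWORD2 over many structures. The following is a minimal sketch, not part of SWORD2 itself: the helper name `build_sword2_command` and its parameter names are hypothetical, and only the flags documented in the help message (`-p`, `-u`, `-m`, `-i`, `-c`, `-d`, `-x`, `-o`) are used.

```python
import shlex


def build_sword2_command(output, pdb_id=None, uniprot_id=None, mgnify_id=None,
                         input_file=None, chain=None, model=None, cpu=None,
                         script="./SWORD2.py"):
    """Return the argument list for one SWORD2 run (hypothetical helper)."""
    # Exactly one input source must be given, mirroring the mutually
    # exclusive group (-u | -m | -p | -i) shown in the usage line.
    sources = [("-p", pdb_id), ("-u", uniprot_id),
               ("-m", mgnify_id), ("-i", input_file)]
    chosen = [(flag, value) for flag, value in sources if value is not None]
    if len(chosen) != 1:
        raise ValueError(
            "exactly one of pdb_id, uniprot_id, mgnify_id, input_file is required"
        )
    flag, value = chosen[0]
    command = [script, flag, str(value)]
    # Optional arguments are only added when explicitly set, so SWORD2
    # keeps its own defaults otherwise.
    for opt_flag, opt_value in (("-c", chain), ("-d", model), ("-x", cpu)):
        if opt_value is not None:
            command += [opt_flag, str(opt_value)]
    command += ["-o", str(output)]
    return command


if __name__ == "__main__":
    # Print the commands instead of running them; pass a list to
    # subprocess.run(command, check=True) to actually launch SWORD2.
    for pdb_id in ("1jx4",):
        print(shlex.join(build_sword2_command("results", pdb_id=pdb_id)))
```

Building the command as a list rather than a single string avoids shell quoting problems with paths that contain spaces.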
To get the full help:
$ ./SWORD2.py --help
usage: SWORD2.py [-h] (-u UNIPROT_ID | -m MGNIFY_ID | -p PDB_ID | -i INPUT_FILE) [-c PDB_CHAIN] [-d MODEL] [-x CPU] -o OUTPUT
SWORD2: SWift and Optimized Recognition of protein Domains.
The SWORD2 partitioning algorithm produces multiple alternative
domain assignments for a given protein structure.
This unique approach handles ambiguous protein structure partitioning,
admitting several solutions. The decomposition of the protein structure
into domains is achieved through the hierarchical clustering of Protein Units,
evolutionarily preserved structural descriptors at the interface between
secondary structures and domains.
options:
-h, --help show this help message and exit
-u UNIPROT_ID, --uniprot-id UNIPROT_ID
AlphaFold Uniprot Accession Id.
The corresponding predicted structure will be downloaded from the AlphaFold database.
-m MGNIFY_ID, --mgnify-id MGNIFY_ID
MGnify Id.
The corresponding predicted structure will be downloaded from the ESM Metagenomic Atlas database.
-p PDB_ID, --pdb-id PDB_ID
PDB id.
The corresponding structure will be downloaded from the PDB database.
-i INPUT_FILE, --input-file INPUT_FILE
Path to an input PDB or mmCIF file.
optional arguments:
-c PDB_CHAIN, --pdb-chain PDB_CHAIN
PDB chain. Default is A.
-d MODEL, --model MODEL
Model to parse. Especially usefull for NMR files which contain several models. Default is 1.
-x CPU, --cpu CPU How many CPUs to use.
Default all (0).
Max on this computer is: 32
required arguments:
-o OUTPUT, --output OUTPUT
Output directory.
Results will be generated inside in a dedicated directory
named after OUTPUT/PDBID_CHAIN/
Example:
$ ./SWORD2.py -p 1jx4 -o results
2022/11/03 02:00:04 INFO Fetch PDB ID: 1jx4
2022/11/03 02:00:04 DEBUG Connecting wwPDB FTP server RCSB PDB (USA).
2022/11/03 02:00:08 DEBUG 1jx4 downloaded (results/1jx4_A/1jx4.pdb)
2022/11/03 02:00:08 DEBUG PDB download via FTP completed (1 downloaded, 0 failed).
2022/11/03 02:00:08 DEBUG 3162 atoms and 1 coordinate set(s) were parsed in 0.03s.
2022/11/03 02:00:08 INFO
2022/11/03 02:00:08 INFO >>> Estimated runtime: 4 minutes
2022/11/03 02:00:08 INFO >>> Using 8 cpus
2022/11/03 02:00:08 INFO
2022/11/03 02:00:08 INFO Write a clean version of the PDB: remove non standard residues
2022/11/03 02:00:08 INFO Launch SWORD
2022/11/03 02:00:23 INFO Parse SWORD output
2022/11/03 02:00:23 INFO Calculate pseudo-energies of Domains
2022/11/03 02:01:57 INFO Write the SWORD results
2022/11/03 02:01:57 INFO Calculate pseudo-energies of PUs
2022/11/03 02:02:37 INFO Write the Peeling results
2022/11/03 02:02:37 INFO Generate histogram of SWORD2 domains consistency
2022/11/03 02:03:09 INFO Calculate junctions consistencies
2022/11/03 02:03:09 INFO Clean and prepare results
2022/11/03 02:03:09 INFO Results can be found here: results/1jx4_A
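To see where the runtime goes, the timestamps in a log like the one above can be turned into per-step durations. This is an illustrative sketch only: the function `step_durations` is hypothetical, and it assumes the log keeps the `YYYY/MM/DD HH:MM:SS LEVEL message` layout shown in the example.

```python
from datetime import datetime

# A few lines copied from the example run above.
LOG = """\
2022/11/03 02:00:08 INFO Launch SWORD
2022/11/03 02:00:23 INFO Parse SWORD output
2022/11/03 02:00:23 INFO Calculate pseudo-energies of Domains
2022/11/03 02:01:57 INFO Write the SWORD results
"""


def step_durations(log_text):
    """Return (message, seconds until the next INFO message) pairs."""
    entries = []
    for line in log_text.splitlines():
        parts = line.split(maxsplit=3)
        # Skip DEBUG lines and INFO lines that carry no message.
        if len(parts) < 4 or parts[2] != "INFO":
            continue
        try:
            stamp = datetime.strptime(f"{parts[0]} {parts[1]}", "%Y/%m/%d %H:%M:%S")
        except ValueError:
            continue  # not a timestamped log line
        entries.append((stamp, parts[3].strip()))
    return [
        (message, (next_stamp - stamp).total_seconds())
        for (stamp, message), (next_stamp, _) in zip(entries, entries[1:])
    ]


if __name__ == "__main__":
    for message, seconds in step_durations(LOG):
        print(f"{seconds:6.0f} s  {message}")
```

The last logged step has no following timestamp, so it is not reported.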
A JSON summary, SWORD2_summary.json, is also generated to simplify downstream analysis.
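A minimal sketch of loading SWORD2_summary.json for downstream analysis follows. The schema of the file is not described here, so the code makes no assumption about its keys and only reports the top-level structure. The function names `load_summary` and `describe` are hypothetical, and the file is searched for recursively because its exact location inside the output directory is an assumption.

```python
import json
from pathlib import Path


def load_summary(results_dir, filename="SWORD2_summary.json"):
    """Load the first summary file found under results_dir (hypothetical helper)."""
    matches = sorted(Path(results_dir).rglob(filename))
    if not matches:
        raise FileNotFoundError(f"no {filename} found under {results_dir}")
    with matches[0].open(encoding="utf-8") as handle:
        return json.load(handle)


def describe(summary):
    """Map each top-level key to the type name of its value."""
    if isinstance(summary, dict):
        return {key: type(value).__name__ for key, value in summary.items()}
    return type(summary).__name__


# Example usage, assuming a finished run in results/1jx4_A:
#     summary = load_summary("results/1jx4_A")
#     print(describe(summary))
```

Inspecting the top-level keys first is a cautious way to explore the file before writing code that depends on specific fields.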