a bio-/chemoinformatics pipeline for drug repositioning applied to schistosomiasis
Q. What is drug repositioning?
A. The use of a known drug for a different therapeutic indication. If you are not familiar with this at all, try Wikipedia.
Q. What is schistosomiasis?
A. A very nasty parasitic disease affecting over 200 million people. Learn more about schistosomiasis on the World Health Organization website
Q. How does the tool work?
A. By mapping! known drugs -> their targets -> their domain architecture -> parasite targets
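The chain of mappings can be pictured as a series of dictionary lookups. The snippet below is only a minimal Python 3 sketch of that idea, not the code in drug_repo.py: the function name `chain_mappings`, the three dictionaries and every identifier in the toy data are made up for illustration.

```python
from collections import defaultdict


def chain_mappings(drug_to_targets, target_to_arch, arch_to_parasite):
    """Follow drug -> target -> domain architecture -> parasite target.

    All three arguments are plain dicts (hypothetical structures):
      drug_to_targets:  drug ID -> set of target IDs
      target_to_arch:   target ID -> domain architecture string
      arch_to_parasite: domain architecture string -> set of parasite target IDs
    """
    drug_to_parasite = defaultdict(set)
    for drug, targets in drug_to_targets.items():
        for target in targets:
            arch = target_to_arch.get(target)
            if arch is None:
                continue  # no domain architecture known for this target
            hits = arch_to_parasite.get(arch)
            if hits:
                drug_to_parasite[drug].update(hits)
    return dict(drug_to_parasite)


# Toy data; every identifier here is a placeholder, not a real record.
drug_to_targets = {"DRUG_A": {"TARGET_1"}, "DRUG_B": {"TARGET_2"}}
target_to_arch = {"TARGET_1": "ARCH_X", "TARGET_2": "ARCH_Y"}
arch_to_parasite = {"ARCH_X": {"PARASITE_1", "PARASITE_2"}}

result = chain_mappings(drug_to_targets, target_to_arch, arch_to_parasite)
print(result)
```

In this toy run only DRUG_A reaches parasite targets, because no parasite protein shares the architecture of DRUG_B's target.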
Q. I am reading this README on my local machine, why is the formatting all weird?
A. This README is formatted in GitHub markdown; please open it on GitHub. I will include an instructions-only plain-text README soon.
|drug_repo.py||Python script that reads the input files (ChEMBL/DrugBank), filters the data and extracts the information needed for mapping to domain architectures. It is currently under development.||n/a|
|README.md||this readme file||n/a|
|chembl_drugs.txt||ChEMBL drugs; downloaded from ChEMBL||30/04/2014|
|chembl_drugtargets.txt||ChEMBL drug targets; downloaded from ChEMBL, manually edited to strip a newline character at lines 383/384.||30/04/2014|
|chembl_uniprot_mapping.txt||ChEMBL to UniProt mapping (ChEMBL IDs to UniProt codes); downloaded from the ChEMBL 18 release page: ftp://ftp.ebi.ac.uk/pub/databases/chembl/ChEMBLdb/releases/chembl_18/||25/04/2014|
|small_molecule_target_ids_all.csv||DrugBank Drug Target Identifier/Small Molecule Drugs; downloaded from DrugBank (if necessary, all_target_ids_all.csv is also available)||06/05/2014|
|uniprot_pdb.csv||UniProt to PDB mapping file; downloaded from SIFTS (if necessary, a tsv version is also available)||12/06/2014|
|lig_pairs.lst||PDB to ligand mapping file; downloaded from PDBsum downloads (if necessary, the het_pairs.lst version is also available)||17/06/2014|
|Components-smiles-oa.smi||chemical components dictionary in SMILES format; downloaded from RCSB Ligand Expo Downloads, in the SMILES/InChI data files (if necessary, stereo versions and CACTVS-generated versions are available)||20/06/2014|
|pointless_het.csv||list of 'pointless' het ligands, including amino acids, nucleotides, metals and crystallographic solvents/aids||n/a|
|all.sdf||DrugBank drugs in SDF format; downloaded from DrugBank||16/07/2014|
|speclist.txt||taxonomic codes and mnemonic codes for all species; downloaded from UniProt||07/06/2014|
|pdb_pfam_mapping.txt||PDB IDs to Pfam domains and residue numbers; downloaded from ftp://ftp.ebi.ac.uk/pub/databases/Pfam/mappings/||20/08/2014|
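As an illustration of how pointless_het.csv could be used, here is a minimal Python 3 sketch that removes 'pointless' het codes from a PDB-to-ligand mapping. It is not taken from drug_repo.py. It assumes the CSV holds one or more comma-separated het codes per line and that the PDB-to-ligand data has already been parsed into a dictionary; the real file layouts may differ. The function names and the PDB IDs are made up.

```python
import csv
import io


def load_het_codes(handle):
    """Collect het codes from a CSV handle (assumed layout: codes separated by commas)."""
    codes = set()
    for row in csv.reader(handle):
        codes.update(cell.strip().upper() for cell in row if cell.strip())
    return codes


def drop_pointless(pdb_to_ligands, pointless):
    """Return a copy of the mapping without pointless ligands.

    PDB entries left with no ligand at all are dropped.
    """
    kept = {}
    for pdb_id, ligands in pdb_to_ligands.items():
        useful = [lig for lig in ligands if lig.upper() not in pointless]
        if useful:
            kept[pdb_id] = useful
    return kept


# In-memory stand-ins for pointless_het.csv and a parsed PDB-to-ligand file.
pointless = load_het_codes(io.StringIO("HOH,SO4\nGOL\nZN\n"))
pdb_to_ligands = {"1abc": ["HOH", "STI"], "2xyz": ["GOL", "ZN"]}  # made-up entries

filtered = drop_pointless(pdb_to_ligands, pointless)
print(filtered)
```

Dropping entries that end up empty keeps later steps from handling structures whose only ligands are solvents or ions.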
- Biopython - freely available on the Biopython website (we have used release 1.64)
- ArchIndex/ArchSchema - kindly provided by Dr Laskowski. For more information, please visit the ArchSchema website, or read the main reference for ArchSchema
- SMSD (Small Molecule Subgraph Detector). For more information, please visit the SMSD website, the GitHub repository, or read the main reference for SMSD
- MODELLER, for homology modelling (only for step 10). See the MODELLER website
- arch_schema_cath.tsv (UniProt IDs to CATH domains and residue numbers), to be downloaded from ftp://ftp.biochem.ucl.ac.uk/pub/gene3d_data/CURRENT_RELEASE/
- clone the repository
- check requirements
- modify the config.py file according to your needs
- run the script (>python drug_repo.py)
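The "check requirements" step can be partly automated. The following is a minimal Python 3 sketch, not part of the repository: it checks whether Biopython can be imported and which of the input files listed above are missing from a directory. The names `REQUIRED_FILES`, `has_module` and `missing_files` are hypothetical, and the directory holding the data files is an assumption (the paths are actually set in config.py).

```python
import importlib.util
from pathlib import Path

# Input files named in the table above; extend or trim to match your config.py.
REQUIRED_FILES = (
    "chembl_drugs.txt",
    "chembl_drugtargets.txt",
    "chembl_uniprot_mapping.txt",
    "small_molecule_target_ids_all.csv",
    "uniprot_pdb.csv",
    "lig_pairs.lst",
    "Components-smiles-oa.smi",
    "pointless_het.csv",
    "all.sdf",
    "speclist.txt",
    "pdb_pfam_mapping.txt",
)


def has_module(name):
    """Return True if a top-level module can be imported (Biopython installs as 'Bio')."""
    return importlib.util.find_spec(name) is not None


def missing_files(data_dir, names=REQUIRED_FILES):
    """Return the names from `names` that are not regular files inside data_dir."""
    data_dir = Path(data_dir)
    return [name for name in names if not (data_dir / name).is_file()]


if __name__ == "__main__":
    print("Biopython available:", has_module("Bio"))
    print("Missing input files:", missing_files("."))  # "." is an assumed location
```

ArchSchema, SMSD and MODELLER are external tools, so they are not covered by this check.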
Copyright © 2014 Sandra Giuliani
This repository is licensed under the terms of the MIT license. Please see the license file for more information. The MIT license is approved by the Open Source Initiative
THIS IS A WORK IN PROGRESS. The main script (drug_repo.py) is currently being developed at the London School of Hygiene and Tropical Medicine, under the supervision of Dr Nick Furnham.