Join GitHub today
GitHub is home to over 28 million developers working together to host and review code, manage projects, and build software together.Sign up
generating various molecular representations for chemicals, proteins, DNAs/RNAs and their interactions.
Fetching latest commit…
Cannot retrieve the latest commit at this time.
|Failed to load latest commit information.|
# BioMedR ## Introduction The BioMedR package offers an R/Bioconductor package generating various molecular representations for chemicals, proteins, DNAs/RNAs and their interactions. See `vignette(‘BioMedR’)` for the comprehensive user guide. ## Installation To install the BioMedR package in R, simply type source('http://bioconductor.org/biocLite.R') biocLite(‘BioMedR’) To make the BioMedR package fully functional (especially the Open Babel related functionalities), we recommend the users also install the _Enhances_ packages by using: source('http://bioconductor.org/biocLite.R') biocLite('Rcpi', dependencies = c('Imports', 'Enhances')) Several dependencies of the BioMedR package may require some system-level libraries, check the corresponding manuals of these packages for detailed installation guides. ## Features Rcpi implemented and integrated the state-of-the-art protein sequence descriptors and molecular descriptors/fingerprints with R. For protein sequences, the Rcpi package could * Calculate six protein descriptor groups composed of fourteen types of commonly used structural and physicochemical descriptors that include 9920 descriptors. * Calculate six types of generalized scales-based descriptors derived by various dimensionality reduction methods for proteochemometric (PCM) modeling. * Parallellized pairwise similarity computation derived by protein sequence alignment and Gene Ontology (GO) semantic similarity measures within a list of proteins. For small molecules, the Rcpi package could * Calculate 307 molecular descriptors (2D/3D), including constitutional, topological, geometrical, and electronic descriptors, etc. * Calculate more than ten types of molecular fingerprints, including FP4 keys, E-state fingerprints, MACCS keys, etc., and parallelized chemical similarity search. * Calculate three nucleic acid composition features describing the local sequence information by means of kmers (subsequences of DNA sequences). * Calculate six autocorrelation features describing the level of correlation between two oligonucleotides along a DNA sequence in terms of their specific physicochemical properties. * Calculate two pseudo nucleotide composition features, which can be used to represent a DNA sequence with a discrete model or vector yet still keep considerable sequence order information, particularly the global or long-range sequence order information, via the physicochemical properties of its constituent oligonucleotides. * Parallelized pairwise similarity computation derived by fingerprints and maximum common substructure search within a list of small molecules. By combining various types of descriptors for drugs, proteins and DNA/RNA in different methods, interaction descriptors representing protein-protein, compound-compound, DNA-DNA, compound-DNA compound-protein and DNA-protein interactions could be conveniently generated withBioMedR, including: * Two types of compound-protein interaction (CPI) descriptors * Two types of compound-DNA interaction (CDI) descriptors * Two types of DNA-protein interaction (DPI) descriptors * Three types of protein-protein interaction (PPI) descriptors * Three types of compound-compound interaction (CCI) descriptors * Three types of DNA-DNA interaction (DDI) descriptors Several useful auxiliary utilities are also shipped with BioMedR: * Parallelized molecule and protein sequence retrieval from several online databases, like PubChem, ChEMBL, KEGG, DrugBank, UniProt, RCSB PDB, genBank, etc. * Loading molecules stored in SMILES/SDF files and loading protein sequences from FASTA/PDB files * Molecular file format conversion The computed protein sequence descriptors, molecular descriptors/fingerprints, interaction descriptors and pairwise similarities are widely used in various research fields relevant to drug disvery, primarily bioinformatics, chemoinformatics, proteochemometrics and chemogenomics. ## Links * Bioconductor Page: http://bioconductor.org/packages/release/bioc/html/BioMedR.html * Track Devel: https://github.com/wind22zhu/BioMedR * Report Bugs: https://github.com/wind22zhu/BioMedR/issues ## Contact The BioMedR package is developed by Computational Biology and Drug Design Group, Central South University, China. * Minfeng Zhu <email@example.com> * Dong Jie <firstname.lastname@example.org> * Dongsheng Cao <email@example.com>