TMscoreAlign

TMscoreAlign is an R package that implements the TM-align program (Zhang & Skolnick, 2005). TM-align is a method for aligning proteins based on TM-score (Template Modeling score), a measure of topological similarity between two protein structures (Zhang & Skolnick, 2004). TM-score improves upon existing means of structural comparison such as RMSD (root-mean-square deviation) because it is length-independent and more sensitive to global similarities between structures than to local deviations. The package is aimed at structural biologists, who may use it to investigate different conformations of the same protein and so inform a structural basis for protein function. It includes functions that perform structure alignment, calculate TM-scores and RMSD, and visualize aligned structures. TMscoreAlign was developed using R version 4.3.1 (2023-06-16), Platform: aarch64-apple-darwin20 (64-bit), running under macOS Ventura 13.2.1.
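For reference, the TM-score as defined by Zhang & Skolnick (2004) can be written as below. The symbol names are chosen here for illustration: L_N is the length of the target structure, L_ali is the number of aligned residue pairs, d_i is the distance between the i-th pair after superposition, and the maximum is taken over superpositions. How the package itself normalizes and parameterizes the score is not shown here.

```latex
\mathrm{TM\text{-}score} = \max\left[ \frac{1}{L_N} \sum_{i=1}^{L_{\mathrm{ali}}}
  \frac{1}{1 + \left( d_i / d_0(L_N) \right)^2} \right],
\qquad
d_0(L_N) = 1.24 \sqrt[3]{L_N - 15} - 1.8
```

Because each term is bounded by 1 and the distance scale d_0 grows with chain length, the score stays in (0, 1] and does not depend strongly on protein size, whereas RMSD has no such normalization.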

Installation

To install the latest version of the package:

install.packages("devtools")
library("devtools")
devtools::install_github("kevqyzhu/TMscoreAlign", build_vignettes = TRUE)
library("TMscoreAlign")

To run the Shiny app:

TMscoreAlign::runTMscoreAlign()

Overview

ls("package:TMscoreAlign")
data(package = "TMscoreAlign") 
browseVignettes("TMscoreAlign")

TMscoreAlign provides 7 functions:

get_alignment: This function performs structure alignment between two protein structures specified by PDB files. First, it selects common residues using a user-specified method (either based on ‘index’ or ‘alignment’). Then, the alignment parameters, including translation and rotation values, are initialized. These values can be further optimized for TM-score (Template Modeling score) before being returned to the user.

optimize_alignment: This function optimizes the alignment of two protein structures based on their atomic coordinates. It takes the translation and rotation values as input and searches for the parameters that maximize the TM-score. The optimization can start from default values or continue from a given set of values. It is performed via the limited-memory Broyden–Fletcher–Goldfarb–Shanno algorithm.
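The idea behind this optimization can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the helper names (transform_sketch, neg_tm_sketch), the six-parameter layout (three translations plus rotations about the x, y and z axes), and the normalization by the number of paired residues are all hypothetical. Only stats::optim and its "L-BFGS-B" method are taken as given.

```r
# Hypothetical parameterisation: par = c(tx, ty, tz, ax, ay, az),
# a translation plus rotations (radians) about the x, y and z axes.
transform_sketch <- function(par, coords) {
  ax <- par[4]; ay <- par[5]; az <- par[6]
  Rx <- matrix(c(1, 0,        0,
                 0, cos(ax), -sin(ax),
                 0, sin(ax),  cos(ax)), 3, 3, byrow = TRUE)
  Ry <- matrix(c( cos(ay), 0, sin(ay),
                  0,       1, 0,
                 -sin(ay), 0, cos(ay)), 3, 3, byrow = TRUE)
  Rz <- matrix(c(cos(az), -sin(az), 0,
                 sin(az),  cos(az), 0,
                 0,        0,       1), 3, 3, byrow = TRUE)
  R <- Rz %*% Ry %*% Rx
  # Rotate every row vector, then add the translation to each column
  sweep(coords %*% t(R), 2, par[1:3], "+")
}

# optim() minimises, so the objective is the negated TM-score
neg_tm_sketch <- function(par, target, mobile, d0) {
  d <- sqrt(rowSums((target - transform_sketch(par, mobile))^2))
  -mean(1 / (1 + (d / d0)^2))
}

set.seed(42)
target <- matrix(rnorm(150, sd = 5), ncol = 3)   # 50 toy "residues"
# Displace a copy of the target by a small rigid-body motion
mobile <- transform_sketch(c(0.5, -0.5, 0.3, 0.05, -0.03, 0.04), target)
d0 <- 1.24 * (nrow(target) - 15)^(1/3) - 1.8     # Zhang & Skolnick (2004)

start <- rep(0, 6)                               # identity transform
tm_start <- -neg_tm_sketch(start, target, mobile, d0)
fit <- optim(par = start, fn = neg_tm_sketch,
             target = target, mobile = mobile, d0 = d0,
             method = "L-BFGS-B")
tm_fit <- -fit$value                             # TM-score after optimisation
```

Comparing tm_start with tm_fit shows how much the optimizer improved the superposition. Passing fit$par back in as the starting point corresponds to continuing from a given set of values.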

get_tmscore: This function calculates the TM-score from the alignment parameters and coordinates obtained in a structural alignment between two protein structures.
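As a rough illustration of the quantity involved, the sketch below computes a TM-score from two sets of paired, already superposed coordinates using the formula of Zhang & Skolnick (2004). The function names d0_sketch and tm_score_sketch are hypothetical and are not the package's API. The package works from alignment parameters rather than pre-transformed coordinates, and its choice of normalizing length may differ.

```r
# Length-dependent distance scale from Zhang & Skolnick (2004)
d0_sketch <- function(L) {
  1.24 * (L - 15)^(1/3) - 1.8
}

# coords1, coords2: N x 3 matrices of paired atoms, already superposed.
# L_target: normalising length (assumed here to default to N).
tm_score_sketch <- function(coords1, coords2, L_target = nrow(coords1)) {
  stopifnot(all(dim(coords1) == dim(coords2)), ncol(coords1) == 3)
  d0 <- d0_sketch(L_target)
  stopifnot(isTRUE(d0 > 0))                    # fails for very short chains
  d <- sqrt(rowSums((coords1 - coords2)^2))    # per-pair distances
  local <- 1 / (1 + (d / d0)^2)                # per-pair terms in (0, 1]
  list(local = local, tm = sum(local) / L_target)
}

set.seed(1)
a <- matrix(rnorm(150, sd = 5), ncol = 3)      # 50 toy "residues"
same <- tm_score_sketch(a, a)
same$tm                                        # identical structures give 1

# Shifting every atom by exactly d0 makes each term 0.5
shifted <- sweep(a, 2, c(d0_sketch(50), 0, 0), "+")
half <- tm_score_sketch(a, shifted)
half$tm                                        # 0.5
```

The local vector holds the individual terms of the sum, which shows where along the chain the two structures agree or diverge.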

get_tm_samples: This function calculates the local TM-scores from the alignment parameters and coordinates obtained in a structural alignment between two protein structures.

get_rmsd: This function calculates the RMSD (root-mean-square deviation) between two protein structures based on their alignment parameters.
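For comparison with the TM-score, RMSD over paired coordinates can be sketched in a few lines of base R. The function name rmsd_sketch is hypothetical, and the sketch assumes the second structure has already been transformed, whereas get_rmsd works from the alignment parameters.

```r
# coords1, coords2: N x 3 matrices of paired, already superposed atoms
rmsd_sketch <- function(coords1, coords2) {
  stopifnot(all(dim(coords1) == dim(coords2)), ncol(coords1) == 3)
  sqrt(mean(rowSums((coords1 - coords2)^2)))
}

set.seed(1)
a <- matrix(rnorm(150, sd = 5), ncol = 3)   # 50 toy "residues"
b <- sweep(a, 2, c(3, 4, 0), "+")           # every atom moved by 5 units
rmsd_uniform <- rmsd_sketch(a, b)           # 5

# A single badly placed atom dominates the RMSD
outlier <- a
outlier[1, ] <- outlier[1, ] + c(100, 0, 0)
rmsd_outlier <- rmsd_sketch(a, outlier)     # sqrt(100^2 / 50), about 14.1
```

The second case illustrates why RMSD is sensitive to local deviations: 49 of the 50 atoms match exactly, yet the RMSD is large.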

write_pdb: This function writes the transformed coordinates obtained from a structure alignment to a new PDB file. The transformed coordinates are generated by applying the alignment parameters to the original coordinates of the second protein structure. The resulting PDB file includes the transformed coordinates appended to the original coordinates of the first structure.

visualize_alignment_pdb: This function visualizes the structural alignment of two protein structures by displaying the aligned coordinates in a 3D viewer. The viewer is set up using 3Dmol, and the alignment is represented by color-coded cartoon-style structures for each protein chain. The user can specify the alignment parameters and chain coloration.

The package also contains two example PDB files of decoy protein structures from the 3DRobot dataset (Deng, Jia, & Zhang, 2016) for demonstration purposes. Refer to package vignettes for more details. An overview of the package is illustrated below. Note that for clarity, not all functions are shown.

Contributions

The package was created by Kevin Zhu. get_alignment calls a helper function that uses read.pdb, clean.pdb, pdbseq, and atom.select from the bio3d package to read the input PDB files. The pairwiseAlignment and subject functions from Biostrings, as well as start from BiocGenerics, are used to generate the sequence alignment. The optimize_alignment function uses the optim function from the stats package to optimize the alignment parameters for maximum TM-score. When generating the transformation matrix that represents the alignment parameters, the cross function from the pracma package is used to calculate the vector cross product. The write_pdb function uses trim.pdb, cat.pdb, and write.pdb from the bio3d package to write the aligned outputs. Visualization is accomplished using r3dmol. Color-picking in the Shiny app is accomplished using the colourInput function from colourpicker. All functions in this package were written by the author. No generative AI tools were used in the development of this package.

References

Acknowledgements

This package was developed as part of an assessment for the 2023 BCB410H: Applied Bioinformatics course at the University of Toronto, Toronto, Canada. TMscoreAlign welcomes issues, enhancement requests, and other contributions. To submit an issue, use GitHub issues. Many thanks to those who provided feedback to improve this package.

Tree Structure

- TMscoreAlign
  |- .gitignore
  |- .Rbuildignore
  |- .RData
  |- .Rhistory
  |- data
    |- sample_pdb1.rda
    |- sample_pdb2.rda
  |- DESCRIPTION
  |- inst
    |- CITATION
    |- extdata
      |- 1LNIA_decoy1_4.pdb
      |- 1LNIA_decoy2_180.pdb
      |- overview.png
    |- shiny-scripts
      |- app.R
  |- LICENSE
  |- LICENSE.md
  |- man
    |- calculate_dist_samples.Rd
    |- calculate_rmsd.Rd
    |- calculate_tm_samples.Rd
    |- calculate_tmscore.Rd
    |- estimate_d0.Rd
    |- get_alignment.Rd
    |- get_default_values.Rd
    |- get_matrix.Rd
    |- get_rmsd.Rd
    |- get_tm_samples.Rd
    |- get_tmscore.Rd
    |- load_data_alignment.Rd
    |- optimize_alignment.Rd
    |- runTMscoreAlign.Rd
    |- sample_pdb1.Rd
    |- sample_pdb2.Rd
    |- visualize_alignment_pdb.Rd
    |- write_pdb.Rd
  |- NAMESPACE
  |- TMscoreAlign.Rproj
  |- R
    |- aligner.R
    |- data.R
    |- pdb_alignment.R
    |- scoring.R
    |- transformation_matrix.R
    |- runTMscoreAlign.R
  |- README.md
  |- README.RMD
  |- tests
    |- testthat
      |- test-aligner.R
      |- test-pdb_alignment.R
      |- test-scoring.R
      |- test-transformation_matrix.R
    |- testthat.R
  |- vignettes
    |- .gitignore
    |- Introduction_TMscoreAlign.Rmd
