Persistent Tor-algebra for protein-protein interaction analysis

This manual documents the code accompanying the paper "Persistent Tor-algebra for protein-protein interaction analysis".

Software configuration

    Platform: Python>=3.6
    Packages needed: math (standard library), numpy>=1.18.1, scipy>=1.4.1, scikit-learn>=0.22.1, gudhi

Persistent Tor-algebra based machine learning model

Details about each step

We use the SKEMPI S1131 dataset as an example to illustrate the procedure of our model.

Data

We first need the atom coordinates of the protein-protein complexes. These can be extracted with the TopNetTree code (DOI: 10.24433/CO.0537487.v1).
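As a rough illustration of what "getting the atom coordinates" involves, the sketch below reads ATOM records from PDB-formatted text and keeps only the atoms of selected chains and element types. It is not the TopNetTree code. The function names `read_atoms` and `format_atom_line` are hypothetical, and the sketch assumes the input follows the standard fixed-column PDB layout (x, y, z in columns 31-54, chain ID in column 22, element symbol in columns 77-78).

```python
def format_atom_line(serial, name, res_name, chain, res_seq, x, y, z, element):
    """Build one fixed-width PDB ATOM record (hypothetical helper for the demo)."""
    return ("{:<6}{:>5} {:<4}{:1}{:>3} {:1}{:>4}{:1}   "
            "{:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}          {:>2}").format(
        "ATOM", serial, name, "", res_name, chain, res_seq, "",
        x, y, z, 1.0, 0.0, element)


def read_atoms(lines, chains=None, elements=None):
    """Return (x, y, z) tuples for ATOM records matching the given filters.

    chains / elements: optional sets of chain IDs / element symbols to keep;
    None means "keep everything".
    """
    coords = []
    for line in lines:
        if not line.startswith("ATOM"):
            continue
        chain = line[21]                        # column 22
        element = line[76:78].strip().upper()   # columns 77-78
        if chains is not None and chain not in chains:
            continue
        if elements is not None and element not in elements:
            continue
        # Columns 31-38, 39-46 and 47-54 hold x, y and z.
        coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    return coords


# Tiny synthetic example: two atoms on chain A, one on chain B.
lines = [
    format_atom_line(1, "N", "ALA", "A", 1, 1.0, 2.0, 3.0, "N"),
    format_atom_line(2, "CA", "ALA", "A", 1, 1.5, 2.5, 3.5, "C"),
    format_atom_line(3, "CA", "GLY", "B", 1, 10.0, 0.0, -1.25, "C"),
]
carbon_on_a = read_atoms(lines, chains={"A"}, elements={"C"})
all_carbon = read_atoms(lines, elements={"C"})
```

Filtering by chain and element in one pass makes it easy to build one point cloud per element combination, which is the input the featurization step expects.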

Algebraic representation and persistent Tor-algebra featurization

For each protein-protein complex, element-specific atom combinations are used to generate Vietoris-Rips complexes, and the persistent Tor-algebra is computed from these simplicial complexes. You can directly run the script "code/skempi-tor-feature-generation.py" to get the persistent Tor-algebra features for the SKEMPI S1131 dataset. The auxiliary features can be generated with the code from https://codeocean.com/capsule/2202829/tree/v1
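To make the Vietoris-Rips step concrete, here is a minimal sketch that builds the complex up to dimension 2 at a single filtration value, using only the standard library. It is not the repository's implementation: the function name `vietoris_rips` and the convention that an edge is included when the distance is at most the threshold are assumptions made for illustration. The repository lists gudhi as a dependency, and gudhi's `RipsComplex` class is the usual way to build the full filtration.

```python
from itertools import combinations
from math import sqrt


def euclidean(p, q):
    """Euclidean distance between two points given as coordinate tuples."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def vietoris_rips(points, max_edge_length):
    """Return (vertices, edges, triangles) of the Rips complex at one scale.

    A simplex is included when all pairwise distances among its vertices
    are <= max_edge_length.
    """
    n = len(points)
    vertices = [(i,) for i in range(n)]
    edges = [
        (i, j)
        for i, j in combinations(range(n), 2)
        if euclidean(points[i], points[j]) <= max_edge_length
    ]
    edge_set = set(edges)
    # A triangle is present exactly when all three of its edges are present.
    triangles = [
        (i, j, k)
        for i, j, k in combinations(range(n), 3)
        if (i, j) in edge_set and (i, k) in edge_set and (j, k) in edge_set
    ]
    return vertices, edges, triangles


# Three nearby points and one far-away point.
cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (5.0, 5.0, 5.0)]
small_scale = vietoris_rips(cloud, 1.0)   # edge (1, 2) has length sqrt(2), so no triangle yet
large_scale = vietoris_rips(cloud, 1.5)   # all three short edges present, triangle appears
```

Sweeping `max_edge_length` from small to large gives the nested family of complexes on which persistent invariants are computed.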

Machine learning

We use ensemble learning for the prediction. More specifically, we have two base learners, a 1D CNN and a GBT, and a meta learner, which is also a GBT. For the two base learners, run "code/skempi-cnn.py" and "code/skempi-gbt.py" to generate the base learner features. For the meta learner, run "code/tenfold-CV.py" to get the final prediction.
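The stacking idea can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the repository's scripts: the data are synthetic, the helper `out_of_fold_predictions` is hypothetical, and a `Ridge` regressor stands in for the 1D CNN so that the example needs no deep learning framework. Each base learner produces out-of-fold predictions under ten-fold cross-validation, and those predictions become the input features of the GBT meta learner.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold


def out_of_fold_predictions(make_model, X, y, n_splits=10, seed=0):
    """Predict every sample with a model that never saw it during training."""
    oof = np.zeros(len(y))
    folds = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for train_idx, test_idx in folds.split(X):
        model = make_model()
        model.fit(X[train_idx], y[train_idx])
        oof[test_idx] = model.predict(X[test_idx])
    return oof


# Synthetic regression data standing in for the real features and labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 8))
y = X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=120)

# Base learner 1: Ridge here is only a stand-in for the 1D CNN.
base_a = out_of_fold_predictions(lambda: Ridge(alpha=1.0), X, y)
# Base learner 2: gradient boosting trees (GBT).
base_b = out_of_fold_predictions(
    lambda: GradientBoostingRegressor(n_estimators=50, random_state=0), X, y)

# Meta learner: a GBT trained on the two base-learner outputs.
meta_X = np.column_stack([base_a, base_b])
final = out_of_fold_predictions(
    lambda: GradientBoostingRegressor(n_estimators=50, random_state=0), meta_X, y)
```

Using out-of-fold predictions as meta features keeps the meta learner from being trained on base-learner outputs that already saw the target values.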

For new dataset

To apply our persistent Tor-algebra model to a new dataset, first generate the 3D coordinates of your point cloud data. Then use the following function to generate the persistent Tor-algebra features:

def get_tor_algebra_I_J(point,J,outfile,typ):
    # Generate persistent Tor-algebra from point cloud data.
    # point: 3-D coordinates of the point cloud data
    # J: type of Tor-algebra; choose from 2, 3, 4, 5, 98, 99, 100. (You can add more by revising our code.)
    # outfile: file path of the output file
    # typ: 0

The above function can be found in "code/skempi-tor-algebra-feature-generation.py". Persistent Tor-algebra can also be computed from graph data and distance matrix data. Contact me if you are interested.

Repository: LiuXiangMath/PTA-SEL