Implementation of SVC modeling of protein function based on HMM domain scoring

trappedInARibosome/go-model


#go-model README File

This is a library for building Gene Ontology (GO) support vector classifiers (SVCs) from protein domain scores and then using them to predict the function of candidate proteins.

#Objective

Identifying enzymes by sequence homology tends to produce a signal-to-noise problem: it can be difficult to determine which candidates are genuine functional homologs and which are false positives.

The go_preprocess script uses HMMER to search a protein domain HMM database (Pfam is the best known, but others are possible) and saves the scores.
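
As an illustration of what saving the scores could involve, here is a minimal sketch that reads HMMER's tabular output and keeps the best full-sequence bit score per protein and domain. It assumes the table was written by `hmmscan --tblout`, where the first column is the domain model and the third is the query protein; the function name `parse_tblout` and the sample rows (DomA, DomB, protA, protB) are made up for the example and are not taken from go_preprocess.

```python
from collections import defaultdict


def parse_tblout(lines):
    """Return {protein_id: {domain_name: best_bit_score}}.

    Assumes hmmscan --tblout layout: whitespace-separated columns with the
    target (domain) name in column 1, the query (protein) name in column 3,
    and the full-sequence bit score in column 6. Comment lines start with '#'.
    """
    scores = defaultdict(dict)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        domain, protein, bit_score = fields[0], fields[2], float(fields[5])
        previous = scores[protein].get(domain)
        # Keep only the highest score if a domain is reported more than once.
        if previous is None or bit_score > previous:
            scores[protein][domain] = bit_score
    return dict(scores)


# Hypothetical rows in tblout layout (not real search results).
SAMPLE = """\
# target name  accession  query name  accession  E-value  score  bias
DomA  -  protA  -  1.2e-50  170.3  0.0  1.5e-50  170.0  0.0  1.0  1  0  0  1  1  1  1  hypothetical domain A
DomB  -  protA  -  3.0e-20   70.1  0.1  4.1e-20   69.7  0.1  1.0  1  0  0  1  1  1  1  hypothetical domain B
DomA  -  protB  -  2.0e-05   22.4  0.0  2.5e-05   22.1  0.0  1.0  1  0  0  1  1  1  1  hypothetical domain A
""".splitlines()

domain_scores = parse_tblout(SAMPLE)
print(domain_scores)
```

If the search were run with `hmmsearch` instead, the target and query columns would swap roles, so the column indices above would need to be exchanged.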

The model_test script takes the protein HMM scores and existing gene ontology classifications and trains support vector classifiers by grid search through a parameter space.
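
The training step described above can be sketched with scikit-learn, which is an assumption here since the README does not name the library that model_test uses. The data is synthetic (random scores standing in for domain bit scores, with a binary label standing in for one GO term), and the parameter grid is illustrative only.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in data: 40 proteins x 3 domain scores.
# Positives score high on the first domain, negatives do not.
X_pos = rng.normal(loc=[80.0, 10.0, 5.0], scale=5.0, size=(20, 3))
X_neg = rng.normal(loc=[10.0, 10.0, 5.0], scale=5.0, size=(20, 3))
X = np.vstack([X_pos, X_neg])
y = np.array([1] * 20 + [0] * 20)  # 1 = annotated with the GO term

# Illustrative parameter space; the values are not taken from model_test.
param_grid = {
    "C": [0.1, 1, 10],
    "gamma": [0.001, 0.01, 0.1],
    "kernel": ["rbf"],
}

# Exhaustive grid search with 5-fold cross-validation.
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)

model = search.best_estimator_  # refit on all data with the best parameters
print(search.best_params_, search.best_score_)
```

One classifier per GO term is a common way to frame this kind of problem, but whether model_test does so is not stated in the README.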

The go_prediction script takes the SVC generated by model building and predicts the gene ontology classification of a candidate protein from its primary sequence.
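
Before a trained classifier can score a candidate, the candidate's domain scores need to be arranged in the same column order that was used for training. The sketch below shows one way to do that. The helper `to_feature_vector`, the domain names, and the choice of 0.0 for domains with no hit are all assumptions made for illustration, not details of go_prediction.

```python
def to_feature_vector(domain_scores, feature_order, missing=0.0):
    """Arrange one protein's domain scores in the training column order.

    Domains with no reported hit get `missing`; domains that were not part
    of the training features are ignored.
    """
    return [domain_scores.get(domain, missing) for domain in feature_order]


# Hypothetical training column order and candidate scores.
FEATURE_ORDER = ["DomA", "DomB", "DomC"]
candidate = {"DomC": 41.5, "DomA": 170.3, "DomX": 12.0}

vector = to_feature_vector(candidate, FEATURE_ORDER)
print(vector)  # [170.3, 0.0, 41.5]
```

The resulting list can then be passed, wrapped in an outer list as a single-row matrix, to the `predict` method of a fitted classifier.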

#Requirements
