StackGlyEmbed

Selected Feature Group

Training framework

Prediction framework

Data availability

All training and independent datasets are provided in the Dataset folder.

Environments

OS: Ubuntu 22.04.4 LTS

Python version: Python 3.9.19

Used libraries:

numpy==1.26.4
pandas==2.2.1
pytorch==2.2.2
xgboost==2.0.3
pickle5==0.0.11
scikit-learn==1.2.2
matplotlib==3.8.2
PyQt5==5.15.10
imblearn==0.0
skops==0.9.0
shap==0.45.1
IPython==8.18.1
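
To confirm that an environment matches the versions listed above, a small check along the following lines may help. This is only a sketch: `parse_pins` and `check_pins` are hypothetical helper names, not part of this repository, and the names in the list may differ from the installed distribution names (for example, PyTorch is usually installed as `torch`), so a "not installed" result should be checked by hand.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_pins(text):
    """Parse 'name==version' lines into a dict; other lines are skipped."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, pinned = line.split("==", 1)
        pins[name.strip()] = pinned.strip()
    return pins


def check_pins(pins):
    """Return {name: (pinned, installed_or_None, matches)} for each pin."""
    report = {}
    for name, pinned in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            # The distribution may be missing or installed under another name.
            installed = None
        report[name] = (pinned, installed, installed == pinned)
    return report


if __name__ == "__main__":
    # Two entries copied from the list above, used purely as an illustration.
    example = "numpy==1.26.4\nshap==0.45.1\n"
    for name, (pinned, installed, ok) in check_pins(parse_pins(example)).items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: pinned {pinned}, installed {installed} [{status}]")
```

The same functions can be pointed at the contents of requirements.txt by reading the file and passing its text to `parse_pins`.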

Reproduce results

First, download all features; see readme.txt in the all_features folder.
The N-GlycositeAtlas and N-GlyDE folders contain the code for reproducing the results, along with the training scripts. Where a folder includes a readme.txt, follow its instructions.

Prediction

Prerequisites

ProteinBert must be installed. Install it as follows:

pip3 install tensorflow tensorflow_addons numpy pandas h5py lxml pyfaidx
git clone https://github.com/nadavbra/protein_bert.git
cd protein_bert
git submodule init
git submodule update
python setup.py install

transformers, PyTorch, and TensorFlow are needed to extract the embeddings.
For more information, see the following GitHub repositories:

ProtT5-XL-U50

ProteinBert

ESM2

Steps

First, fill in dataset.txt using the pattern shown below:

Protein_id,site_position_1,site_position_2,...,site_position_n
Fasta
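
The pattern above can be checked before running the pipeline with a short parser such as the one below. It is a minimal sketch under stated assumptions, not the repository's own loader: it assumes each protein occupies exactly two lines (an ID line followed by a single-line sequence) and that site positions are 1-based integers. `ProteinEntry`, `parse_dataset`, and `residues_at_sites` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProteinEntry:
    protein_id: str
    sites: List[int]   # assumed to be 1-based positions
    sequence: str


def parse_dataset(text: str) -> List[ProteinEntry]:
    """Parse 'id,pos1,...,posn' / sequence line pairs from dataset.txt text."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) % 2 != 0:
        raise ValueError("expected pairs of lines: ID line, then sequence line")
    entries = []
    for header, sequence in zip(lines[0::2], lines[1::2]):
        protein_id, *positions = [field.strip() for field in header.split(",")]
        sites = [int(p) for p in positions if p]
        for pos in sites:
            if not 1 <= pos <= len(sequence):
                raise ValueError(f"{protein_id}: position {pos} is outside the sequence")
        entries.append(ProteinEntry(protein_id, sites, sequence.upper()))
    return entries


def residues_at_sites(entry: ProteinEntry) -> List[str]:
    """Return the residue at each listed position (1-based assumption)."""
    return [entry.sequence[pos - 1] for pos in entry.sites]
```

Because N-linked glycosylation occurs on asparagine, every residue returned by `residues_at_sites` should be `N`; anything else suggests that the positions use a different indexing convention or contain a typo.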

To predict N-linked glycosylation sites from a protein sequence, first run extractFeatures.py to generate the features, then run predict.py to make the predictions.
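
The two-step order described here can be wrapped in a small driver so that prediction only starts if feature extraction succeeds. The sketch below rests on assumptions that should be checked against the repository: that both scripts sit in the `prediction` folder, that they take no command-line arguments, and that they read dataset.txt from that folder. `build_commands` and `run_pipeline` are hypothetical names.

```python
import subprocess
import sys


def build_commands(python=sys.executable):
    """Return the two commands in the order they must run."""
    # Assumption: neither script needs command-line arguments.
    return [
        [python, "extractFeatures.py"],
        [python, "predict.py"],
    ]


def run_pipeline(workdir="prediction", dry_run=False):
    """Run feature extraction, then prediction, stopping on the first failure."""
    commands = build_commands()
    for cmd in commands:
        if dry_run:
            print("would run:", " ".join(cmd), "in", workdir)
            continue
        # check=True raises CalledProcessError if a script exits non-zero,
        # so predict.py never runs on missing or partial features.
        subprocess.run(cmd, cwd=workdir, check=True)
    return commands
```

Calling `run_pipeline(dry_run=True)` prints the planned commands without executing anything, which is a quick way to confirm the working directory before a real run.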

Reproduce previous paper metrics

The Previous Paper codes folder contains scripts for reproducing the results of the previous papers.
nafcoder/StackGlyEmbed