sambitmishra0628/siRNA_activity_prediction

Prediction of siRNA activity for RNA interference

Description

The scripts in this repository use the siRNA dataset created by Huesken et al. (Huesken, Nature Biotechnology, 2005). The goal is to use this dataset to develop two models: a regression model that predicts the efficacy of a given siRNA molecule for RNA interference, and a classification model that predicts whether a given siRNA molecule has the desired potency for RNA interference.
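To illustrate how one set of measured activities can serve both models, here is a minimal sketch. The regression target is the raw activity value and the classification label is a thresholded version of it. The function name `make_targets` and the cutoff `POTENCY_THRESHOLD = 0.7` are assumptions made for illustration; the repository does not state which cutoff defines "desired potency".

```python
from typing import List, Tuple

# Hypothetical cutoff for "potent"; the repository does not state the value used.
POTENCY_THRESHOLD = 0.7


def make_targets(
    activities: List[float], threshold: float = POTENCY_THRESHOLD
) -> Tuple[List[float], List[int]]:
    """Derive regression targets and binary class labels from measured activities."""
    # Regression: predict the measured activity itself.
    regression_targets = list(activities)
    # Classification: 1 if the activity reaches the cutoff, otherwise 0.
    class_labels = [1 if a >= threshold else 0 for a in activities]
    return regression_targets, class_labels


# Made-up activity values, for illustration only.
y_reg, y_cls = make_targets([0.91, 0.42, 0.70, 0.15])
```

Keeping the threshold as a parameter makes it easy to check how sensitive the classification results are to the chosen cutoff.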

Dataset

The dataset comprises guide siRNA sequences from mouse and human, together with their measured activity against the targeted mRNA sequences.

[Image: dataset]

Approach

[Image: approach]

List of features

The features were calculated using the calc_features_v2 script.

[Image: feature_table]
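As a rough idea of what sequence-derived features can look like, the sketch below computes a few simple ones from a guide strand: GC content, per-base fractions, and a one-hot encoding of the 5' terminal nucleotide. This is not the calc_features_v2 script, and it does not reproduce the repository's feature table. The function name `sequence_features`, the feature names, and the example sequence are all hypothetical.

```python
from typing import Dict


def sequence_features(guide: str) -> Dict[str, float]:
    """Compute a few illustrative features from a guide siRNA sequence."""
    # Normalize to upper-case RNA so DNA-style input (T) is also accepted.
    seq = guide.upper().replace("T", "U")
    if not seq or set(seq) - set("ACGU"):
        raise ValueError(f"not a valid RNA sequence: {guide!r}")

    n = len(seq)
    features: Dict[str, float] = {
        "gc_content": (seq.count("G") + seq.count("C")) / n,
    }
    for base in "ACGU":
        # Overall fraction of each nucleotide in the guide strand.
        features[f"frac_{base}"] = seq.count(base) / n
        # One-hot encoding of the nucleotide at the 5' end (position 1).
        features[f"pos1_{base}"] = 1.0 if seq[0] == base else 0.0
    return features


# Made-up 21-nt sequence, for illustration only.
feats = sequence_features("UUCGAAGUACUCAGCGUAAGU")
```

Returning a flat dictionary of named values keeps each sequence easy to convert into one row of a feature table.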

Results (classification model)

Performance with all features and performance convergence

[Image: classification_auc_and_convergence]
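For readers unfamiliar with AUC, the metric named in the figure above, here is a minimal pure-Python sketch of how the area under the ROC curve can be computed from labels and predicted scores. It uses the pairwise (Mann-Whitney) formulation: the fraction of positive/negative pairs in which the positive example receives the higher score, with ties counted as half. The function name `roc_auc` and the example values are illustrative, not taken from the repository.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive ranked above negative
            elif p == q:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Made-up labels and scores, for illustration only.
auc = roc_auc([1, 0, 1, 0], [0.9, 0.4, 0.35, 0.1])
```

The double loop is quadratic in the number of examples, which is fine for a sketch; a rank-based implementation would be preferable for large datasets.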

Top features and performance with a subset of 92 features

[Image: classification_feature_importance_and_metrics_new]
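Reducing a model to its top features amounts to ranking the features by importance and keeping the first k (92 in the results above). The sketch below shows that selection step under simple assumptions. The importance values and feature names are hypothetical, and how the repository actually computes importances is not stated here.

```python
from typing import Dict, List


def top_k_features(importances: Dict[str, float], k: int) -> List[str]:
    """Return the names of the k most important features, highest first."""
    if k < 0:
        raise ValueError("k must be non-negative")
    # Sort by descending importance; break ties by name for a stable result.
    ranked = sorted(importances.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:k]]


# Hypothetical importance scores, for illustration only.
example_importances = {
    "gc_content": 0.30,
    "pos1_U": 0.45,
    "frac_A": 0.05,
    "frac_G": 0.20,
}
selected = top_k_features(example_importances, k=2)
```

A model retrained on only the `selected` columns can then be compared against the full-feature model.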

Results (regression model)

Performance, convergence, and important features for the regression model

[Image: Regression_performance_feat_imp]
