MESA

Multimodal Epigenetic Sequencing Analysis (MESA) is a flexible and sensitive method of capturing and integrating multimodal epigenetic information of cfDNA using a single experimental assay.

@ Modified by: Chaorong Chen

@ Modified time: 2023-02-11 02:27:49

For the original MESA paper, please refer to this tutorial: https://rpubs.com/LiYumei/926228.

Dependencies

Python >=3.6
deepTools
bedtools
DANPOS2
BSMAP
UCSC tools
Python packages
- pandas
- numpy
- scikit-learn = 0.24.2
- joblib
- itertools (part of the Python standard library; no installation needed)
- boruta_py
- deep-forest
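
To see which of the Python packages above are already installed, a short check like the following may help. It is only a sketch: the distribution names in `REQUIRED` are assumed PyPI names (in particular `Boruta` for boruta_py), and `importlib.metadata` requires Python 3.8 or newer.

```python
from importlib import metadata  # standard library, Python 3.8+

# Assumed PyPI distribution names; adjust them to match your environment.
REQUIRED = ["pandas", "numpy", "scikit-learn", "joblib", "Boruta", "deep-forest"]


def installed_versions(names):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```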

Installation

Clone the repository with git:

git clone https://github.com/ChaorongC/MESA
cd MESA

Or download the repository with wget:

wget -O MESA-main.zip https://github.com/ChaorongC/MESA/archive/refs/heads/main.zip
unzip MESA-main.zip
cd MESA-main

Usage

The Python script MESA.py in the root directory is the main program for MESA. It provides two functions: MESA_single() runs the analysis on a single feature type, and MESA_integration() combines the results from different feature types and returns the multimodal prediction result.

Example

Check the Jupyter notebook demo.ipynb for a tutorial on how to run MESA.

Parameters

MESA_single(X,
        y,
        estimator,
        classifiers=[],
        cv=5,
        random_state=0,
        min_feature=10,
        n_jobs=-1,
        scoring='roc_auc',
        boruta_top_n_feature=1000)

X : dataframe of shape (n_features, n_samples)

Input samples. A matrix containing features as rows with samples as columns.

y : array-like of shape (n_samples,)

Target values/labels/stages. Usually, we use 0 and 1 for 'normal/negative' and 'cancer/positive' samples.
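
The orientation of X is easy to get wrong, because scikit-learn itself expects samples as rows. The sketch below builds a toy X and y in the layout described above; the feature names, sample names, and values are hypothetical and only illustrate the shapes.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical toy data: 6 features (rows) measured in 4 samples (columns).
feature_names = [f"region_{i}" for i in range(6)]
sample_names = ["normal_1", "normal_2", "cancer_1", "cancer_2"]
X = pd.DataFrame(rng.random((6, 4)), index=feature_names, columns=sample_names)

# One label per sample, in the same order as the columns of X.
y = np.array([0, 0, 1, 1])
assert X.shape[1] == len(y)

# For comparison: scikit-learn estimators expect samples as rows.
X_sklearn = X.T
print(X.shape, X_sklearn.shape)  # (6, 4) (4, 6)
```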

estimator : estimator object/model implementing ‘fit’

The object used to fit the data. This model evaluates feature subsets in each iteration of sequential backward selection (SBS).

classifiers : a list of estimator objects/models implementing ‘fit’ and ‘predict_proba’

The objects used to evaluate on the test set at the end. Each model is trained on the final selected feature subset and then tested on the testing set.
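
As an illustration, the two arguments could be built from scikit-learn models as below. The specific models are an arbitrary choice for this sketch, not a recommendation from the MESA authors, and `missing_methods` is a hypothetical helper that only checks for the methods named above.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

# Model used to score feature subsets during selection (must implement fit).
estimator = RandomForestClassifier(n_estimators=100, random_state=0)

# Models trained on the final feature subset (must implement fit and predict_proba).
classifiers = [
    RandomForestClassifier(n_estimators=100, random_state=0),
    LogisticRegression(max_iter=1000),
]


def missing_methods(model, methods):
    """Return the names in `methods` that `model` does not provide."""
    return [name for name in methods if not callable(getattr(model, name, None))]


assert missing_methods(estimator, ["fit"]) == []
for clf in classifiers:
    assert missing_methods(clf, ["fit", "predict_proba"]) == []
```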

cv : int, cross-validation generator or an iterable, default=5

(Adapted from sklearn.model_selection.cross_val_score) Determines the cross-validation splitting strategy. Possible inputs for cv are:
- None, to use the default 5-fold cross-validation;
- int, to specify the number of folds in a (Stratified)KFold;
- a CV splitter;
- an iterable yielding (train, test) splits as arrays of indices.
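
Since cv follows the scikit-learn convention, the accepted forms can be tried directly with `cross_val_score`. The sketch below uses synthetic data in scikit-learn's samples-as-rows layout (named `X_sk` to distinguish it from MESA's X); it does not call MESA itself.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic data, samples as rows (scikit-learn layout).
X_sk, y = make_classification(n_samples=60, n_features=8, random_state=0)
model = LogisticRegression(max_iter=1000)

# 1) An int: number of folds in a (Stratified)KFold.
scores_int = cross_val_score(model, X_sk, y, cv=5, scoring="roc_auc")

# 2) A CV splitter object.
splitter = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
scores_splitter = cross_val_score(model, X_sk, y, cv=splitter, scoring="roc_auc")

# 3) An iterable of (train, test) index arrays; here even vs. odd sample indices.
indices = np.arange(len(y))
even, odd = indices[indices % 2 == 0], indices[indices % 2 == 1]
scores_iterable = cross_val_score(
    model, X_sk, y, cv=[(even, odd), (odd, even)], scoring="roc_auc"
)

print(len(scores_int), len(scores_splitter), len(scores_iterable))
```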

random_state : int, RandomState instance or None, default=0

Controls the pseudo random number generation for shuffling the data.

min_feature : int, default=10

The minimum number of features that SBS should consider.
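
To make the role of min_feature concrete, here is a generic greedy sequential backward selection written with scikit-learn. It is a sketch of the general technique, not MESA's own implementation: the function name, the tie-breaking rule, and the choice to return the best-scoring subset are all assumptions.

```python
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def sequential_backward_selection(estimator, X, y, min_feature=10, cv=5, scoring="roc_auc"):
    """Greedy SBS on X (samples x features); never goes below min_feature features.

    Returns (column indices of the best-scoring subset seen, its mean CV score).
    """
    def cv_score(columns):
        return cross_val_score(clone(estimator), X[:, columns], y, cv=cv, scoring=scoring).mean()

    selected = list(range(X.shape[1]))
    best_subset, best_score = list(selected), cv_score(selected)
    while len(selected) > min_feature:
        # Try removing each remaining feature and keep the removal that scores best.
        candidates = []
        for feature in selected:
            subset = [f for f in selected if f != feature]
            candidates.append((cv_score(subset), subset))
        score, selected = max(candidates, key=lambda item: item[0])
        if score >= best_score:  # prefer smaller subsets on ties (an assumption)
            best_score, best_subset = score, list(selected)
    return best_subset, best_score


X_sk, y = make_classification(n_samples=60, n_features=8, random_state=0)
subset, score = sequential_backward_selection(
    LogisticRegression(max_iter=1000), X_sk, y, min_feature=4
)
print(subset, round(score, 3))
```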

n_jobs : int, default=-1

Number of jobs to run in parallel. When evaluating a new feature to add or remove, the cross-validation procedure is parallel over the folds. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.

scoring : str or callable, default='roc_auc'

The scoring metric for the SBS process: a str (see the scikit-learn model evaluation documentation) or a scorer callable object/function with the signature scorer(estimator, X, y), which should return only a single value. Compatible with sklearn.model_selection.cross_val_score.
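
A callable scorer with the required signature might look like the sketch below. The name `auc_scorer` is hypothetical; it is demonstrated with `cross_val_score` on synthetic samples-as-rows data rather than with MESA itself.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score


def auc_scorer(estimator, X, y):
    """Scorer with the signature scorer(estimator, X, y); returns a single float."""
    positive_probability = estimator.predict_proba(X)[:, 1]
    return roc_auc_score(y, positive_probability)


X_sk, y = make_classification(n_samples=60, n_features=8, random_state=0)
model = LogisticRegression(max_iter=1000)

# A string name and a callable are both accepted by cross_val_score.
scores_by_name = cross_val_score(model, X_sk, y, cv=5, scoring="roc_auc")
scores_by_callable = cross_val_score(model, X_sk, y, cv=5, scoring=auc_scorer)
print(scores_by_name.mean(), scores_by_callable.mean())
```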

boruta_top_n_feature : int, default=1000

The number of top-ranked features that Boruta passes on to SBS. Features are first ranked by the Boruta algorithm, and the top boruta_top_n_feature features are then passed to SBS for further selection.

MESA_integration(X_list, 
                  y, 
                  feature_selected, 
                  classifiers)

X_list : list of dataframes, each of shape (n_features, n_samples)

Input samples. One matrix per feature type, each containing features as rows with samples as columns.

y : array-like of shape (n_samples,)

Target values/labels/stages. Usually, we use 0 and 1 for 'normal/negative' and 'cancer/positive' samples.

feature_selected : list of tuples (n_samples)

Features selected in each leave-one-out (LOO) iteration, in the same order as X_list.

classifiers : a list of estimator object/model implementing ‘fit’ and 'predict_proba'

The objects used to evaluate on the test set at the end.
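
The general idea of combining per-modality leave-one-out predictions can be sketched as follows. This is not MESA_integration() itself. It rests on two loudly stated assumptions: that feature_selected[m][i] holds the row labels of X_list[m] chosen in the LOO iteration that holds out sample i, and that modalities are combined by simply averaging predicted probabilities. MESA may lay out the inputs or combine the results differently.

```python
import numpy as np
import pandas as pd
from sklearn.base import clone
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut


def loo_average_integration(X_list, y, feature_selected, classifier):
    """LOO prediction per modality, averaged across modalities (assumed scheme)."""
    y = np.asarray(y)
    n_samples = len(y)
    combined = np.zeros(n_samples)
    placeholder = np.zeros((n_samples, 1))  # LeaveOneOut only needs the sample count
    for i, (train_idx, test_idx) in enumerate(LeaveOneOut().split(placeholder)):
        probabilities = []
        for X, selected in zip(X_list, feature_selected):
            # Rows are features: pick the selected rows, then transpose to samples x features.
            data = X.loc[list(selected[i])].T.to_numpy()
            model = clone(classifier).fit(data[train_idx], y[train_idx])
            probabilities.append(model.predict_proba(data[test_idx])[0, 1])
        combined[test_idx[0]] = np.mean(probabilities)
    return combined


# Hypothetical toy data: two modalities, 5 features each, 12 samples.
rng = np.random.default_rng(0)
samples = [f"s{i}" for i in range(12)]
y = np.array([0] * 6 + [1] * 6)
X_list = [
    pd.DataFrame(rng.random((5, 12)) + y,
                 index=[f"{name}_{j}" for j in range(5)], columns=samples)
    for name in ("modality_a", "modality_b")
]
# Same three features in every LOO iteration, purely for illustration.
feature_selected = [[tuple(X.index[:3])] * len(y) for X in X_list]

combined = loo_average_integration(
    X_list, y, feature_selected, LogisticRegression(max_iter=1000)
)
print(np.round(combined, 2))
```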

Authors

Yumei Li (yumei.li@uci.edu)
JianFeng Xu (Jianfeng@heliohealth.com)
Chaorong Chen (chaoronc@uci.edu)
Wei Li (wei.li@uci.edu)
