PRAD_EMSeq is a code repository containing the R scripts and processed files needed to interpret and visualize EM-seq data.
PRAD_EMSeq_interpretation.R is the main script of the pipeline. The machine_learning folder contains three R files, one for each machine learning algorithm used in this project: Elastic Net Regression, XGBoost, and Logistic Regression.
All_merged_DMRs_sig_methy_group_clinics_infor.csv, located in the processed_data folder, is the main data file used for visualization. It contains the methylation percentages of all significant DMRs together with the clinical information for each patient.
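As a quick way to inspect this file before running the pipeline, the table can be loaded with base R. The snippet below is only a minimal sketch: the helper name `load_dmr_table` and its `repo_dir` argument are hypothetical (they are not part of the repository), and it assumes the CSV has a header row and that the working directory is the repository root. It makes no assumption about the column names.

```r
# Hypothetical helper (not part of the repository): load the processed DMR table.
# Assumes `repo_dir` points at the root of a local copy of PRAD_EMSeq.
load_dmr_table <- function(repo_dir = ".") {
  csv_path <- file.path(repo_dir, "processed_data",
                        "All_merged_DMRs_sig_methy_group_clinics_infor.csv")
  if (!file.exists(csv_path)) {
    stop("File not found: ", csv_path)
  }
  # check.names = FALSE keeps the column headers exactly as written in the file
  read.csv(csv_path, check.names = FALSE, stringsAsFactors = FALSE)
}

# Usage (assumes the working directory is the repository root):
# dmr <- load_dmr_table()
# dim(dmr)   # number of rows and columns
# str(dmr)   # column names and types
```

Calling `str()` on the result is a simple way to see which columns hold methylation percentages and which hold clinical information, since that layout is not documented here.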
The relevant manuscript is under editorial review at Biology Open and has not yet been published.