CFA (CRM Function Annotator)

Related paper:

Tzu-Hsien Yang*, Yu-Huai Yu+, Shang-Hang Wu+, Fang-Yuan Zhang+, "CFA: an explainable deep learning model for cis-regulatory module transcriptional role annotation based on epigenetic codes", Computers in Biology and Medicine, 2022.

+: These authors contributed equally.

Prepare the Environment

Suggested running environments: Linux Ubuntu 16.04.6, Python 3.8.13

We recommend using Conda to create a new environment. Conda will automatically install the specified Python version; the Python packages that CFA requires are installed in a later step.

Here is an example:

Install the Conda package for your system. Installation instructions for the package can be found here.
Create the CFA Conda environment. This may take a while, depending on the network status.

conda create -n "CFA" python=3.8.13

Activate your CFA Conda environment.

conda activate CFA

Steps to Use CFA

Download the code from the following link. Skip this step if you have already done it.

wget https://cobis-fs.bme.ncku.edu.tw/CFA/CFA_Model.tar.gz

Unzip the file.

tar -zxvf CFA_Model.tar.gz

Change the working directory.

cd CFA_Model

Download the processed epigenetic datasets from the following link.

wget https://cobis-fs.bme.ncku.edu.tw/CFA/CFA_Dataset.tar.gz

Unzip the file.

tar -zxvf CFA_Dataset.tar.gz

If this is your first time using CFA, run the following command to install the necessary packages.

pip install -r requirements.txt

CFA also supports GPU acceleration. To use a GPU, run the following command instead:

pip install -r requirements_gpu.txt

Prepare the input Drosophila CRM chromosomal regions (ver. DM6). Multiple chromosomal regions can be provided in the same input file.

Each line of the input file SHOULD follow this format: chromosome,start,end

For example (an input file named input_Test.csv):

Note: The input chromosomal regions start from the 5' end.
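Before running CFA, it can help to check that an input file matches the chromosome,start,end layout described above. The following is a minimal sketch, not part of CFA itself: the function name parse_regions is hypothetical, and the checks assume that the file has no header row, that start and end are integers, and that start is smaller than end. The row in the usage example is the region X_8305146_8307354 referred to later in this README.

```python
import csv
import io


def parse_regions(text):
    """Parse 'chromosome,start,end' lines into (chrom, start, end) tuples.

    Hypothetical helper: it assumes no header row and start < end,
    which the README does not state explicitly.
    """
    regions = []
    for line_no, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not row:  # csv.reader yields an empty list for blank lines
            continue
        if len(row) != 3:
            raise ValueError(f"line {line_no}: expected 3 fields, got {len(row)}")
        chrom, start, end = (field.strip() for field in row)
        try:
            start, end = int(start), int(end)
        except ValueError:
            raise ValueError(
                f"line {line_no}: start and end must be integers"
            ) from None
        if start >= end:
            raise ValueError(f"line {line_no}: start must be smaller than end")
        regions.append((chrom, start, end))
    return regions


if __name__ == "__main__":
    print(parse_regions("X,8305146,8307354\n"))
```

To check a real file, read its contents first, for example parse_regions(open("input_Test.csv").read()).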
Predict the identification probability of CRM transcriptional roles (promoter, enhancer, and insulator).

python main.py -i <input_csv_file> -o <output_directory>

Required arguments:

-i: The input file for the CFA Model.

-o: The output directory of the predicted results and SHAP bar plots.

The probability results will be saved to the file named "predicted_result.csv" in the specified output directory.

CFA also provides the top five determining epigenetic profiling types for every CRM. The SHAP value plots and the standardized ChIP-seq scores are stored in the specified output directory.

Output Results

If we use the following as our inputs with the example command:

python main.py -i input_Test.csv -o output/

output/predicted_result.csv:

Output format explanation:

CRM_location,0,0,0 --> Contains nothing.

CRM_location,1,0,0 --> Contains promoter.

CRM_location,0,1,0 --> Contains enhancer.

CRM_location,0,0,1 --> Contains insulator.

CRM_location,1,1,0 --> Contains promoter and enhancer.

CRM_location,1,0,1 --> Contains promoter and insulator.

CRM_location,0,1,1 --> Contains enhancer and insulator.

CRM_location,1,1,1 --> Contains promoter, enhancer and insulator.
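The mapping above can be expressed as a small lookup. This is a minimal sketch, not CFA code: the names ROLES and decode_roles are hypothetical, and it assumes the three values after CRM_location are 0/1 flags in the order promoter, enhancer, insulator, as the list above implies.

```python
# Column order inferred from the explanation above (an assumption).
ROLES = ("promoter", "enhancer", "insulator")


def decode_roles(flags):
    """Return the role names whose flag equals 1, in column order."""
    if len(flags) != len(ROLES):
        raise ValueError(f"expected {len(ROLES)} flags, got {len(flags)}")
    return [role for role, flag in zip(ROLES, flags) if int(flag) == 1]


if __name__ == "__main__":
    print(decode_roles((1, 0, 1)))
    print(decode_roles((0, 0, 0)))
```

An empty list corresponds to the "Contains nothing" case.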

Output Plots:

Take X_8305146_8307354 from the above input as an example.

The CRM is identified as having promoter and insulator functions. The output plots for this CRM are in the subdirectory output/X_8305146_8307354.

The SHAP value plots

The standardized ChIP-seq score plots
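Judging from the example output/X_8305146_8307354, the per-CRM subdirectory name appears to be the chromosome, start, and end joined by underscores. The sketch below builds that path under this assumption; crm_plot_dir is a hypothetical helper, not a function provided by CFA.

```python
from pathlib import PurePosixPath


def crm_plot_dir(output_dir, chrom, start, end):
    """Build the assumed plot subdirectory path for one CRM region."""
    # Naming pattern inferred from the single example in this README.
    return PurePosixPath(output_dir) / f"{chrom}_{start}_{end}"


if __name__ == "__main__":
    print(crm_plot_dir("output/", "X", 8305146, 8307354))
```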
