CRMnet: a deep learning model for predicting gene expression from large regulatory sequence datasets

.
├── preprocessed_data/                          # Preprocessed data folder
├── data_preprocessing.py                       # Data preprocessing code
├── train.py                                    # Model training code
├── model.py                                    # Model construction code
├── utils.py                                    # Utility functions
├── requirements.txt                            # Required Python packages
└── README.md

This repository contains the code for "CRMnet: a deep learning model for predicting gene expression from large regulatory sequence datasets".

To set up the environment on a TPU VM (using v2-8 as an example):

Create a TPU VM with TensorFlow 2.8.0 preinstalled:

 gcloud alpha compute tpus tpu-vm create tpu_v2 --zone=asia-east1-c --accelerator-type=v2-8 --version=tpu-vm-tf-2.8.0

Install the supporting packages:
```
 pip install -r requirements.txt
```
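
After installation, it can help to confirm that the environment matches the pinned versions. The following is a minimal sketch, not part of this repository: it assumes requirements.txt uses simple `name==version` pins (the file's actual contents are not shown here), and `parse_pins` and `find_mismatches` are hypothetical helper names.

```python
from importlib import metadata
from typing import Dict


def parse_pins(text: str) -> Dict[str, str]:
    """Collect 'name==version' pins, skipping comments and unpinned lines."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins: Dict[str, str]) -> Dict[str, str]:
    """Return {package: problem} for pins that are missing or differ."""
    problems = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if installed != wanted:
            problems[name] = f"installed {installed}, pinned {wanted}"
    return problems


# Hypothetical requirements text, used only to illustrate the parsing.
example = "numpy==1.22.0  # hypothetical pin\n\n# comment\npandas\n"
pins = parse_pins(example)
```

To check a real environment, pass the file contents instead, for example `find_mismatches(parse_pins(open("requirements.txt").read()))`; an empty dictionary means every pinned package is installed at the pinned version.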

To download the original data and preprocess the data from scratch:
1. Download the original data from: https://zenodo.org/record/4436477#.Y4a_PS0RoUE
2. Copy "complex_media_training_data_Glu.txt" to "./Yeast_Original_Data/"
3. Preprocess the data:
```
 python3 data_preprocessing.py
```
4. The final dataset, in tf.data.Dataset format, is about 250 GB.
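
To give a sense of what preprocessing a regulatory sequence dataset involves, here is a minimal sketch of turning one raw record into a numeric encoding. It assumes each line of the input file holds a DNA sequence and an expression value separated by a tab, and that sequences are one-hot encoded over A, C, G, T with any other base mapped to an all-zero row. Neither assumption is confirmed by this README, and `parse_record` and `one_hot` are hypothetical helpers, not functions from data_preprocessing.py.

```python
from typing import List, Tuple

BASES = "ACGT"  # assumed channel order; the real script may differ


def parse_record(line: str) -> Tuple[str, float]:
    """Split one 'sequence<TAB>expression' line (assumed file format)."""
    sequence, expression = line.rstrip("\n").split("\t")
    return sequence.upper(), float(expression)


def one_hot(sequence: str) -> List[List[int]]:
    """Encode a DNA string as a len(sequence) x 4 matrix.

    Bases outside A/C/G/T (for example N) become all-zero rows.
    """
    return [[1 if base == b else 0 for b in BASES] for base in sequence]


# Hypothetical record, used only to illustrate the encoding.
seq, expr = parse_record("ACGTN\t11.5\n")
encoded = one_hot(seq)
```

The actual pipeline additionally writes the encoded examples in tf.data.Dataset format, which this sketch does not attempt to reproduce.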

To use the trained model:
1. Download the model weights from: https://zenodo.org/record/7375243#.Y9W0iS0RoUG
2. Load the trained model:
```
 import tensorflow as tf
 model = tf.keras.models.load_model(PATH)  # PATH: location of the downloaded model
```

To train the model from scratch:
1. Run the training script. On a TPU v2-8, training takes about 4 hours to converge:
```
 python3 train.py
```
  The trained model will be saved in the folder "/saved_model".

jiayuwen/CRMnet

Folders and files

Latest commit

History

Repository files navigation

CRMnet: a deep learning model for predicting gene expression from large regulatory sequence datasets

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 2

Uh oh!

Languages

Packages