Degron Landscape of Fusion Genes in Cancer

This repository contains a pipeline to analyze the impact of gene fusions on protein stability through degron loss.

Download miscellaneous data

As some degrons require a post-translational modification (PTM), you will need to download the PTM data from the PhosphoSitePlus database (https://www.phosphosite.org/staticDownloads). Once the data are downloaded, change the variables PHOS_PATH, ACETYL_PATH and UBIQ_PATH to point to your downloaded phosphorylation, acetylation and ubiquitination files, respectively.
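Before starting a long run, it can help to confirm that each PTM path points at a real file. The following is a minimal sketch, not part of the pipeline: check_ptm_file is a hypothetical helper, and it assumes the three variables are visible in your current shell (the text above does not say where the pipeline defines them).

```shell
# check_ptm_file LABEL PATH
# Hypothetical helper (not part of the pipeline): succeeds only if PATH is
# non-empty and names a readable, non-empty file.
check_ptm_file() {
    label=$1
    path=${2:-}
    if [ -z "$path" ]; then
        echo "$label: path is empty or unset" >&2
        return 1
    fi
    if [ ! -r "$path" ] || [ ! -s "$path" ]; then
        echo "$label: missing, unreadable, or empty file: $path" >&2
        return 1
    fi
    echo "$label: OK ($path)"
}
```

For example, `check_ptm_file phosphorylation "$PHOS_PATH"` prints an OK line or an error, and the same call can be repeated for ACETYL_PATH and UBIQ_PATH.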

Next, you will need to download data from SNVBox.

$ cd data
$ wget "https://www.dropbox.com/s/gglvpfm06gqzghu/fusion_snvbox_data.zip?dl=1" -O fusion_snvbox_data.zip
$ unzip fusion_snvbox_data.zip
$ cd ..

Install environment

This code runs only on Linux and has been tested only on CentOS release 6.10 with the software versions listed in the environment.yaml config. No non-standard hardware is needed.

Next, you will need to install the appropriate conda environment (https://docs.conda.io/en/latest/).

$ conda env create -f environment.yaml

Software dependencies are specified in the environment.yaml config file, and installing them may take about 30 minutes to 1 hour. Once installation finishes, activate the "fusion" conda environment.

$ conda activate fusion
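To guard against running the pipeline in the wrong environment, you can check which environment is active. This is a minimal sketch: require_conda_env is a hypothetical helper, and it relies on conda setting the CONDA_DEFAULT_ENV variable to the name of the active environment when you activate it.

```shell
# require_conda_env NAME
# Hypothetical helper: succeeds only if the active conda environment is NAME.
# conda exports CONDA_DEFAULT_ENV with the active environment's name.
require_conda_env() {
    expected=$1
    active=${CONDA_DEFAULT_ENV:-}
    if [ "$active" != "$expected" ]; then
        echo "expected conda environment '$expected', found '${active:-none}'" >&2
        return 1
    fi
    echo "conda environment '$expected' is active"
}
```

Calling `require_conda_env fusion` before launching the analysis returns a non-zero status if a different environment (or none) is active.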

Analysis

Run the pipeline with the run.sh script. By default, it uses the fusion calls that were analyzed in the paper.

$ ./run.sh output

You can replace "output" with any directory where you want the results to be written.

The script cannot run from start to finish unattended. You will need to stop midway through the shell script (line #47), and then use the "motif/wt_motif_hits_canonical.snvbox_input.txt" file as input to SNVBox. Once SNVBox has finished, copy its output to "degron_pred/wt_motif_hits_canonical.snvbox_annot.txt" within your output directory and run the remaining portions of the shell script.
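The copy step can be sketched as a small helper. This is an illustration under assumptions, not part of the pipeline: install_snvbox_annot is a hypothetical function, and its first argument stands for wherever you saved the SNVBox output, which the text above does not name. Only the destination path comes from the instructions.

```shell
# install_snvbox_annot ANNOTATED_FILE OUTPUT_DIR
# Hypothetical helper: copies the SNVBox output (ANNOTATED_FILE, a path you
# supply) to the location the rest of the script expects inside OUTPUT_DIR.
install_snvbox_annot() {
    annotated=$1
    outdir=$2
    dest="$outdir/degron_pred/wt_motif_hits_canonical.snvbox_annot.txt"
    if [ ! -s "$annotated" ]; then
        echo "SNVBox output not found or empty: $annotated" >&2
        return 1
    fi
    # Create the target directory if the earlier steps have not already done so.
    mkdir -p "$outdir/degron_pred" || return 1
    cp "$annotated" "$dest" || return 1
    echo "copied to $dest"
}
```

With the default output directory, the call would look like `install_snvbox_annot /path/to/snvbox_result.txt output`, where the first path is a placeholder for your own file.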

Running all of the commands may take ~10-20 hours on a normal computer without parallelization.
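Given that runtime, you may prefer to launch each portion of the script detached from your terminal and capture its output in a log. The following is a generic sketch using nohup, not something the pipeline provides: run_in_background and the log file name are hypothetical.

```shell
# run_in_background LOGFILE COMMAND [ARGS...]
# Hypothetical helper: starts COMMAND under nohup so it survives a closed
# terminal, sends stdout and stderr to LOGFILE, and stores the process ID
# in BG_PID.
run_in_background() {
    logfile=$1
    shift
    nohup "$@" > "$logfile" 2>&1 &
    BG_PID=$!
}
```

For example, `run_in_background run.log ./run.sh output` starts the run, and `tail -f run.log` follows its progress. Because the SNVBox step is manual, this applies to each portion of the script separately.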

Ultimately, you should get a result similar to this.
