WormMap

The predicted complexes Cytoscape file and elution profile files can be found at https://github.com/BaderLab/EPIC/tree/master/WormMap. (Make sure to update Cytoscape to version 3.6.0; elution profile files can be opened in Java TreeView.) Elution files can be found at https://github.com/LucasHu/Worm-co-frac-data

EPIC

Elution Profile-Based Inference of Protein Complexes

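Example paths and options used in the instructions below:

Example elution profiles: EPIC/test_data/elution_profiles/

Input folder: ../InputFolder/

Reference complexes file: -c ../complexes.txt

Output folder: ../OutputFolder/

Output prefix: -o PrefixName

Functional evidence file: -f FILE -F ../fun_anno_file.txt

Example functional evidence file: EPIC/test_data/WormNetV3_noZeros_no_physical_interactions.txt.zip

Scores file: -P ../scores.txt

Full example command: python /EPIC/src/main.py -s 11101001 ../InputFolder/ -c ../complexes.txt ../OutputFolder/ -o PrefixName -M RF -n 6 -m COMB -f STRING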


Installation

To install EPIC, first make sure you have Python 2.7 and the scikit-learn package installed. One correlation score ("wcc") uses R for its computation, so R and rpy2 must also be installed on your computer.

We recommend using an Anaconda environment to run EPIC and install the associated libraries. Anaconda can be downloaded from "https://conda.io/docs/user-guide/install/download.html#anaconda-or-miniconda"

Create an Anaconda environment: type "conda create -n EPIC python=2.7 anaconda"

If the system says "conda: command not found", you probably need to type "export PATH=~/anaconda2/bin:$PATH" first

Activate your Anaconda EPIC environment: type "source activate EPIC"

Install "rpy2": type "conda install rpy2"

Install "requests": type "conda install requests"

Install "scikit-learn": type "conda install scikit-learn"

Install "beautifulsoup4": type "conda install beautifulsoup4"

Install "mock": type "conda install mock"

Install "R": type "conda install -c r r"

Install "kohonen": type "conda install -c r r-kohonen"

Install "numpy": type "pip install numpy"

If this step requires you to install "msgpack" or "argparse", just type "pip install msgpack" and "pip install argparse"

The package "wccsom" can be downloaded from https://cran.r-project.org/src/contrib/Archive/wccsom/wccsom_1.2.11.tar.gz

If the directory containing this package is "/wccsom_directory", start R (type "R" on the command line) and run: install.packages('/wccsom_directory/wccsom_1.2.11.tar.gz', type = 'source'). This will install the "wccsom" package.

(Sometimes you need to run install.packages('class') and then install.packages('kohonen') in R before the step above.)

Install "matplotlib": type "python -mpip install -U matplotlib"

In case you missed any packages, you can usually install them by typing "conda install missing_package_name".
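To check which Python packages are still missing after the steps above, a small helper along these lines may be useful. This is only a sketch: the function name `find_missing` is hypothetical, and the import names listed in the usage note (for example `sklearn` for scikit-learn and `bs4` for beautifulsoup4) are assumptions based on how those packages are normally imported, not something EPIC itself provides.

```python
import importlib


def find_missing(modules):
    """Return the names in `modules` that cannot be imported."""
    missing = []
    for name in modules:
        try:
            importlib.import_module(name)
        except ImportError:
            # The import name may differ from the conda/pip package name.
            missing.append(name)
    return missing
```

For example, calling find_missing(["numpy", "sklearn", "rpy2", "requests", "bs4", "mock", "matplotlib"]) inside the activated EPIC environment returns the list of modules that still need to be installed.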

First, open a terminal and go to your desired directory. Then clone the repository there: type "git clone https://github.com/BaderLab/EPIC"

Prepare Input Data

Make a folder and put all the elution profile files into it. There are a couple of examples in the folder EPIC/test_data/elution_profiles/

UniProt protein identifiers should be used in these elution profile files. Assume the input elution profiles are stored at ../InputFolder/ (make sure this directory contains only elution profile files and no other folders).
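Because the input directory should contain only elution profile files, it can help to check for stray sub-folders before starting a long run. The following is a minimal sketch; the helper name `subfolders_in` is hypothetical and not part of EPIC.

```python
import os


def subfolders_in(input_dir):
    """Return the names of sub-directories found directly inside input_dir."""
    return sorted(
        name for name in os.listdir(input_dir)
        if os.path.isdir(os.path.join(input_dir, name))
    )
```

An empty list from subfolders_in("../InputFolder/") means no sub-folders were found in the input directory.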

Run EPIC

Specify the correlation scores to be used in EPIC. Eight different correlation scores are implemented in EPIC, in order: Mutual Information, Bayes Correlation, Euclidean Distance, Weighted Cross-Correlation, Jaccard Score, PCCN, Pearson Correlation Coefficient, and Apex Score.

"0" indicates that a correlation score is not used and "1" indicates that it is used. For example, 11101001 means we will use Mutual Information, Bayes Correlation, Euclidean Distance, Jaccard Score and Apex Score. To specify the correlation scores to use, pass this string with "-s", for example "-s 11101001".
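To illustrate how the eight-character score string maps onto the score names, here is a small decoding sketch. The names `SCORE_NAMES` and `selected_scores` are hypothetical and only mirror the order given in this README; this is not EPIC's own parsing code.

```python
# Order of the eight correlation scores as listed in this README.
SCORE_NAMES = [
    "Mutual Information",
    "Bayes Correlation",
    "Euclidean Distance",
    "Weighted Cross-Correlation",
    "Jaccard Score",
    "PCCN",
    "Pearson Correlation Coefficient",
    "Apex Score",
]


def selected_scores(flags):
    """Return the score names switched on by a string such as '11101001'."""
    if len(flags) != len(SCORE_NAMES) or set(flags) - set("01"):
        raise ValueError("expected a string of eight 0/1 characters")
    return [name for name, flag in zip(SCORE_NAMES, flags) if flag == "1"]
```

With this sketch, selected_scores("11101001") returns Mutual Information, Bayes Correlation, Euclidean Distance, Jaccard Score and Apex Score, matching the example above.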

There are two ways of generating gold standard protein complexes.

The first way is to provide your own reference complexes file using "-c", for example "-c ../complexes.txt".

The second way is to automatically download protein complexes from public databases (CORUM, IntAct and GO). In this case, you only need to specify the Taxonomy ID of the species you are studying using "-t". For instance, for C. elegans (Taxonomy ID 6239), we use "-t 6239".

(When you use both of the above options, the reference will come from your input reference complexes file rather than being curated from the Internet.)
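The precedence rule described above can be sketched as follows. The function name `reference_source` and its return values are hypothetical, and raising an error when neither option is given is an assumption made for this illustration rather than documented EPIC behaviour.

```python
def reference_source(complexes_file=None, taxonomy_id=None):
    """Pick where the gold standard complexes come from.

    A user-supplied complexes file takes precedence over downloading
    complexes for a Taxonomy ID, as described in this README.
    """
    if complexes_file is not None:
        return ("file", complexes_file)
    if taxonomy_id is not None:
        return ("download", taxonomy_id)
    # Assumption for this sketch: at least one source must be given.
    raise ValueError("give a reference complexes file or a Taxonomy ID")
```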

Specify the directory for storing output files.

We need to specify a folder that will store all the output files. This argument needs to be given after the input folder. Let's assume the output folder is ../OutputFolder/

Specify the prefix name of output files.

You can specify a prefix name for all the output files to distinguish between different runs, for example "-o PrefixName". The default is "Out".

EPIC currently supports two machine learning classifiers: support vector machine ("SVM") and random forest ("RF"). You can specify the machine learning classifier as "-M SVM" or "-M RF".

You can specify the number of cores used to run EPIC; the default is 1. For example, to use six cores, give "-n 6".

(Note that STRING and GeneMANIA cover only a limited number of species; an error will be reported if your species is not in these databases.)

If you use your own curated functional evidence data, you should also specify the location of the functional evidence data file. Assuming the file is "fun_anno_file.txt" and it is stored in the folder "../", you can use the following parameters: "-f FILE -F ../fun_anno_file.txt"

There is an example file that shows the format of "fun_anno_file.txt" at EPIC/test_data/WormNetV3_noZeros_no_physical_interactions.txt.zip

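Overall, you can run a command like the following (we take default values for many parameters):

python /EPIC/src/main.py -s 11101001 ../InputFolder/ -c ../complexes.txt ../OutputFolder/ -o PrefixName -M RF -n 6 -m COMB -f STRING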
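When launching several runs with different settings, the argument list can also be assembled in Python. This is a minimal sketch: `build_epic_command` and its parameter names are hypothetical, it uses only flags that appear in this README, and the values for "-m" and "-f" are passed through unchanged without any validation.

```python
def build_epic_command(input_dir, output_dir, scores, complexes=None,
                       prefix=None, classifier=None, cores=None,
                       mode=None, functional=None,
                       main_py="/EPIC/src/main.py"):
    """Build the argument list for an EPIC run as a list of strings."""
    cmd = ["python", main_py, "-s", scores, input_dir]
    if complexes is not None:
        cmd += ["-c", complexes]
    # The output folder is given after the input folder.
    cmd.append(output_dir)
    optional = (("-o", prefix), ("-M", classifier), ("-n", cores),
                ("-m", mode), ("-f", functional))
    for flag, value in optional:
        if value is not None:
            cmd += [flag, str(value)]
    return cmd
```

The returned list can be handed to subprocess.call(cmd) from the standard library. Options left as None are omitted, so EPIC's defaults apply to them.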

Output

Once EPIC finishes (which may take many hours), you will find all the output files in the output folder, here ../OutputFolder/

Contact

If you have problems, you can contact Lucas (lucasming.huATmail.utoronto.ca) or Florian (florian.goebelsATgooglemail.com).
