The research and processing repo for import of CPTAC data into cBioPortal


This repository holds all of the data processing scripts used to transfer CPTAC data into cBioPortal, written as part of Google Summer of Code 2016. Beyond producing flat text files for import into the cBioPortal database, the scripts also support data exploration and cross-dataset normalization. The work has been incorporated into the cBioPortal visualization interface.


This is a fairly specialized package, so we designed it to be easy to use on the fly. First, clone the repo and cd into it:

git clone
cd CPTAC-proteomics-pipeline

If you would like to have all the CPTAC files we used, please run the wget script:


Please visit the tutorial, which goes through all the elements of the API.

NOTE: As shown in the tutorial, to import the classes, add the location of the scripts, relative to your current working directory, to the Python path. For example, since the tutorial is nested inside the repo:

import sys
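A minimal sketch of the path setup the note describes, assuming the scripts sit one directory above the tutorial. The helper name `add_repo_to_path` and the `".."` location are illustrative assumptions, not part of the package; adjust the relative location to match where you run your own code.

```python
import os
import sys


def add_repo_to_path(relative_location, base_dir=None):
    """Put a directory, given relative to base_dir (default: cwd), on sys.path.

    Hypothetical helper for illustration; it is not part of the package.
    """
    base_dir = os.getcwd() if base_dir is None else base_dir
    repo_dir = os.path.abspath(os.path.join(base_dir, relative_location))
    # Avoid adding the same directory twice if this runs more than once.
    if repo_dir not in sys.path:
        sys.path.insert(0, repo_dir)
    return repo_dir


# Assumption: the tutorial lives one level below the repo root,
# so the scripts are reachable via "..".
repo_dir = add_repo_to_path("..")

# After this, the package's classes can be imported by module name;
# see the tutorial for the actual module and class names.
```

Resolving the location to an absolute path keeps the import working even if the working directory changes later in the session.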


Thanks to my PI David Fenyo and the GSoC mentors at MSKCC, JJ Gao and Zack Heins, for guidance.
