
Commit

Merge pull request #594 from isb-cgc/Deena-Staging-Theme
Deena staging theme
DeenaBleich committed Jun 24, 2021
2 parents 35c1897 + 5772a96 commit a1dec3d
Showing 2 changed files with 40 additions and 5 deletions.
6 changes: 3 additions & 3 deletions docs/source/sections/Hosted-Data.rst
@@ -61,7 +61,7 @@ Clinical, biospecimen and processed -omics data (such as RNASeq, etc.) are avail
-
* - `FM <data/FM_about.html>`_
- |checkmark|
- |checkmark| *
- |checkmark|
- |checkmark|
* - `GENIE <data/GENIE_about.html>`_
- |checkmark|
@@ -73,7 +73,7 @@ Clinical, biospecimen and processed -omics data (such as RNASeq, etc.) are avail
-
* - `MMRF <data/MMRF_about.html>`_
- |checkmark|
- |checkmark| *
- |checkmark|
- |checkmark|
* - `NCICCR <data/NCICCR_about.html>`_
- |checkmark|
@@ -101,7 +101,7 @@ Clinical, biospecimen and processed -omics data (such as RNASeq, etc.) are avail
- |checkmark|
* - `VAREPOP <data/VAREPOP_about.html>`_
- |checkmark|
- |checkmark| *
- |checkmark|
-
* - `WCDT <data/WCDT_about.html>`_
- |checkmark|
39 changes: 37 additions & 2 deletions docs/source/sections/data/CPTAC_about.rst
@@ -23,6 +23,38 @@ For more information on the CPTAC data, please refer to these sites:

ISB-CGC also has proteomic CPTAC data, obtained from the `Proteomics Data Commons (PDC) <https://pdc.cancer.gov/pdc/>`_ API. This includes clinical and protein expression data for breast, ovarian, colon, liver, lung, uterine and other cancers.

The NCI CPTAC has generated a tremendous amount of valuable quantitative proteomics data derived from clinical cancer specimens and makes them publicly accessible to the community. We have imported the data into Google BigQuery, where they can be queried via SQL and easily joined with data tables from TCGA using the BigQuery interface or programmatically with the BigQuery API.

Which studies are available?

- CCRCC - Clear cell renal cell carcinoma
- GBM - glioblastoma multiforme
- HNSCC - Head and neck squamous cell carcinoma
- LUAD - lung adenocarcinoma
- UCEC - Uterine Corpus Endometrial Carcinoma
- Breast cancer
- Colon cancer
- Ovarian cancer

Most studies have both whole-proteome and phosphoproteome data. A few studies also have acetylome and glycoproteome data.

What processing of the raw data is available here?

- Most data have been processed by the original producers and presented in publications.
- The same raw data have been processed uniformly through the CPTAC Common Data Analysis Pipeline (CDAP).
- We provide here the results from the CDAP sourced from the PDC API.

Important considerations:

- All abundances are presented as log2 ratios as computed by the CDAP.
- Abundances are comparable within each study since the same reference was used within each study.
- However, different controls were used for different studies, and therefore extreme caution should be used when comparing abundance values between different studies.
- Some PDC datasets are embargoed, which means that the data may be examined prior to the end of the embargo period, but no manuscripts may be published until the embargo expires. Currently, ISB-CGC does not host any embargoed data in its BigQuery datasets.
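
The considerations above about log2 ratios and study-specific references can be illustrated with a small numeric sketch. This is a simplified model that assumes each reported value is just log2(sample / reference); the actual CDAP computation may involve additional normalization steps. All abundance values and function names below are hypothetical and chosen only for illustration.

```python
import math


def log2_ratio(sample_abundance: float, reference_abundance: float) -> float:
    """Simplified log2 ratio of a sample against a study-specific reference."""
    return math.log2(sample_abundance / reference_abundance)


def fold_change(log2_value: float) -> float:
    """Convert a log2 ratio (or a difference of log2 ratios) to a linear fold change."""
    return 2.0 ** log2_value


# Hypothetical raw abundances and references (not real CPTAC values).
reference_study_a = 100.0
reference_study_b = 400.0
sample_1 = 200.0
sample_2 = 800.0  # truly 4x more abundant than sample_1

# Same reference: the reference cancels out, so the difference of the
# log2 ratios equals log2(sample_2 / sample_1).
within_study = log2_ratio(sample_2, reference_study_a) - log2_ratio(sample_1, reference_study_a)

# Different references: the difference also contains the unknown offset
# log2(reference_study_a / reference_study_b), which hides the real change.
across_studies = log2_ratio(sample_2, reference_study_b) - log2_ratio(sample_1, reference_study_a)

print(within_study, fold_change(within_study))
print(across_studies, fold_change(across_studies))
```

In this toy case the within-study difference recovers the true 4-fold change, while the cross-study difference suggests no change at all, which is why values from different studies should not be compared directly.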

Python Jupyter Notebooks showing examples of queries of PDC CPTAC data are available at:

* `How do I explore CPTAC protein abundances? <https://nbviewer.jupyter.org/github/isb-cgc/Community-Notebooks/blob/master/Notebooks/How_to_explore_CPTAC_protein_abundances.ipynb>`_

Accessing the NCI Clinical Proteomic Tumor Analysis Consortium Data on the Cloud
----------------------------------------------------------------------------------

@@ -51,19 +83,22 @@ Here is an example to find CPTAC-3 GDC files:
Accessing the CPTAC Data in Google BigQuery
------------------------------------------------

ISB-CGC has GDC CPTAC data, such as clinical, RNA-Seq and somatic mutation, and PDC CPTAC data, such as clinical and protein expression, stored in Google BigQuery tables. Information about these tables can be found using the `ISB-CGC BigQuery Table Search <https://isb-cgc.appspot.com/bq_meta_search/>`_ with CPTAC2 and/or CPTAC3 selected for filter PROGRAM.
ISB-CGC has GDC CPTAC data, such as clinical, RNA-Seq and somatic mutation, and PDC CPTAC data, such as clinical and protein expression, stored in Google BigQuery tables.

Information about these tables can be found using the `ISB-CGC BigQuery Table Search <https://isb-cgc.appspot.com/bq_meta_search/>`_ with CPTAC2 and/or CPTAC3 selected for filter PROGRAM.
To learn more about this tool, see the `ISB-CGC BigQuery Table Search documentation <../BigQueryTableSearchUI.html>`_.

The CPTAC tables are in project isb-cgc-bq.

- Data set ``isb-cgc-bq.CPTAC`` contains the latest tables for each data type.
- Data set ``isb-cgc-bq.CPTAC_versioned`` contains previously released tables, as well as the most current table.

Note that some data are part of a CPTAC2 retrospective study of TCGA data. These tables are labeled as both program CPTAC2 and TCGA and can be found be filtering for either. The tables are in project isb-cgc-bq.
Note that some data are part of a CPTAC2 retrospective study of TCGA data. These tables are labeled as both program CPTAC2 and TCGA and can be found by filtering for either. The tables are in project isb-cgc-bq.

- Data set ``isb-cgc-bq.TCGA`` contains the latest tables for each data type.
- Data set ``isb-cgc-bq.TCGA_versioned`` contains previously released tables, as well as the most current table.

In addition, there are some tables with CPTAC data derived from the 2016 paper `Proteogenomics connects somatic mutations to signalling in breast cancer <https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5102256/>`_. These are in data set ``isb-cgc.hg19_data_previews``. They are labeled with programs CPTAC2 and TCGA and source LIT (for literature).

To learn more about how to view and query tables in the Google BigQuery console, see the `ISB-CGC BigQuery Tables documentation <../BigQuery.html>`_.
Here is an example of a PDC CPTAC table viewed in the Google BigQuery console: `quant_acetylome_prospective_breast_BI_pdc_current <https://console.cloud.google.com/bigquery?p=isb-cgc-bq&d=CPTAC&t=quant_acetylome_prospective_breast_BI_pdc_current&page=table>`__
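
The same table can also be previewed programmatically. The following is a minimal sketch, assuming the ``google-cloud-bigquery`` Python package is installed and that you have a Google Cloud project of your own to bill the query to. The helper names ``build_preview_query`` and ``run_preview`` and the project name ``your-billing-project`` are hypothetical; only the table path comes from the text above. ``SELECT *`` is used so that no column names need to be assumed.

```python
def build_preview_query(project: str, dataset: str, table: str, limit: int = 10) -> str:
    """Build a small preview query against a fully qualified BigQuery table."""
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    return f"SELECT *\nFROM `{project}.{dataset}.{table}`\nLIMIT {limit}"


def run_preview(sql: str, billing_project: str) -> list:
    """Run the query and return the rows as a list of dicts.

    Requires the google-cloud-bigquery package and valid credentials;
    billing_project is your own Google Cloud project (an assumption here).
    """
    from google.cloud import bigquery  # imported lazily so the builder works without it

    client = bigquery.Client(project=billing_project)
    return [dict(row.items()) for row in client.query(sql).result()]


if __name__ == "__main__":
    sql = build_preview_query(
        "isb-cgc-bq", "CPTAC", "quant_acetylome_prospective_breast_BI_pdc_current"
    )
    print(sql)
    # Uncomment to execute against BigQuery (hypothetical billing project name):
    # rows = run_preview(sql, billing_project="your-billing-project")
```

Replacing ``CPTAC`` with ``CPTAC_versioned`` (and the table name with a versioned table name) would target a previously released table instead of the current one.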
