Commit

Merge pull request #683 from isb-cgc/staging
Staging
DeenaBleich authored Jul 11, 2022
2 parents 791ab13 + 36a2b78 commit c240e8b
Showing 5 changed files with 164 additions and 0 deletions.
22 changes: 22 additions & 0 deletions docs/source/sections/Hosted-Data.rst
@@ -59,6 +59,10 @@ Clinical, biospecimen and processed -omics data (such as RNASeq, etc.) are avail
- |checkmark|
- |checkmark|
-
* - `Exceptional Responders <data/EXC_RESPOND_about.html>`_
-
- |checkmark| **
-
* - `FM <data/FM_about.html>`_
- |checkmark|
- |checkmark|
@@ -75,6 +79,10 @@ Clinical, biospecimen and processed -omics data (such as RNASeq, etc.) are avail
- |checkmark|
- |checkmark|
- |checkmark|
* - `MP2PRT <data/MP2PRT_about.html>`_
-
- |checkmark| **
-
* - `NCICCR <data/NCICCR_about.html>`_
- |checkmark|
- |checkmark| *
@@ -87,6 +95,10 @@ Clinical, biospecimen and processed -omics data (such as RNASeq, etc.) are avail
- |checkmark|
- |checkmark|
-
* - `REBC <data/REBC_about.html>`_
- |checkmark|
- |checkmark| *
-
* - `TARGET <data/TARGET_top.html>`_
- |checkmark|
- |checkmark|
@@ -99,6 +111,10 @@ Clinical, biospecimen and processed -omics data (such as RNASeq, etc.) are avail
- |checkmark|
- |checkmark|
- |checkmark|
* - `TRIO <data/TRIO_about.html>`_
- |checkmark|
- |checkmark| *
-
* - `VAREPOP <data/VAREPOP_about.html>`_
- |checkmark|
- |checkmark|
@@ -112,6 +128,8 @@ Clinical, biospecimen and processed -omics data (such as RNASeq, etc.) are avail

*Clinical and metadata only available

**Clinical data only available

.. toctree::
:maxdepth: 1
:hidden:
@@ -122,16 +140,20 @@ Clinical, biospecimen and processed -omics data (such as RNASeq, etc.) are avail
data/CMI_about
data/CPTAC_about
data/CTSP_about
data/EXC_RESPOND_about
data/FM_about
data/GENIE_about
data/HCMI_about
data/MMRF_about
data/MP2PRT_about
data/NCICCR_about
data/OHSU_about
data/ORGANOID_about
data/REBC_about
data/TARGET_top
data/TCGA_top
data/TCGA-images
data/TRIO_about
data/VAREPOP_about
data/WCDT_about

29 changes: 29 additions & 0 deletions docs/source/sections/data/EXC_RESPOND_about.rst
@@ -0,0 +1,29 @@
*******************************
Exceptional Responders Data Set
*******************************

About Exceptional Responders
------------------------------------------------------------------------

The Exceptional Responders Initiative is a pilot study to investigate the underlying molecular factors driving exceptional treatment responses of cancer patients to drug therapies.

About Exceptional Responders Data
---------------------------------------------------------------------------------

Exceptional Responders has one project, EXCEPTIONAL_RESPONDERS-ER, with 84 cases spanning nine disease types and 20 primary sites. Data categories include sequencing reads, transcriptome profiling and simple nucleotide variation.

For more information on Exceptional Responders data, please refer to the site below:

- `GDC Data Portal <https://portal.gdc.cancer.gov/projects?filters=%7B%22op%22%3A%22and%22%2C%22content%22%3A%5B%7B%22op%22%3A%22in%22%2C%22content%22%3A%7B%22field%22%3A%22projects.program.name%22%2C%22value%22%3A%5B%22EXCEPTIONAL_RESPONDERS%22%5D%7D%7D%5D%7D>`_


Accessing the Exceptional Responders Data in Google BigQuery
------------------------------------------------------------------------

ISB-CGC has Exceptional Responders data, such as clinical data, stored in Google BigQuery tables. Information about these tables can be found using the `ISB-CGC BigQuery Table Search <https://isb-cgc.appspot.com/bq_meta_search/>`_ with EXCEPTIONAL RESPONDERS selected for the PROGRAM filter.
To learn more about this tool, see the `ISB-CGC BigQuery Table Search documentation <../BigQueryTableSearchUI.html>`_.

The Exceptional Responders tables are in project isb-cgc-bq. To learn more about how to view and query tables in the Google BigQuery console, see the `ISB-CGC BigQuery Tables documentation <../BigQuery.html>`_.

- Data set ``isb-cgc-bq.EXC_RESPONDERS`` contains the latest tables for each data type.
- Data set ``isb-cgc-bq.EXC_RESPONDERS_versioned`` contains previously released tables, as well as the most current table.
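
As a rough sketch of how these data sets can be explored from the BigQuery console, the query below lists the tables in ``isb-cgc-bq.EXC_RESPONDERS``. It relies on BigQuery's standard ``INFORMATION_SCHEMA.TABLES`` view and assumes your account is allowed to read the data set's metadata. No specific table names are assumed.

.. code-block:: sql

   -- List every table in the current Exceptional Responders data set
   SELECT table_name, table_type
   FROM `isb-cgc-bq.EXC_RESPONDERS.INFORMATION_SCHEMA.TABLES`
   ORDER BY table_name
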
28 changes: 28 additions & 0 deletions docs/source/sections/data/MP2PRT_about.rst
@@ -0,0 +1,28 @@
*****************
MP2PRT Data Set
*****************

About MP2PRT
------------------------------------------------------------------------

The Molecular Profiling to Predict Response to Treatment (MP2PRT) program is part of the NCI's Cancer Moonshot Initiative. This study, "Identification of Genetic Changes Associated with Relapse and/or Adaptive Resistance in Patients Registered as Favorable Histology Wilms Tumor on AREN03B2", performs genomic characterization on trio cases (normal tissue, tumor tissue at time of diagnosis, tumor tissue at time of relapse) from patients who relapsed with Favorable Histology Wilms Tumor.

About MP2PRT Data
---------------------------------------------------------------------------------

The MP2PRT data set includes one project, MP2PRT-WT, with 52 cases. Data categories include sequencing reads, transcriptome profiling, simple nucleotide variation and copy number variation.

For more information on MP2PRT data, please refer to the site below:

- `GDC Data Portal <https://portal.gdc.cancer.gov/projects?filters=%7B%22op%22%3A%22and%22%2C%22content%22%3A%5B%7B%22op%22%3A%22in%22%2C%22content%22%3A%7B%22field%22%3A%22projects.program.name%22%2C%22value%22%3A%5B%22MP2PRT%22%5D%7D%7D%5D%7D>`_


Accessing the MP2PRT Data in Google BigQuery
------------------------------------------------

ISB-CGC has MP2PRT data, such as clinical data, stored in Google BigQuery tables. Information about these tables can be found using the `ISB-CGC BigQuery Table Search <https://isb-cgc.appspot.com/bq_meta_search/>`_ with MP2PRT selected for the PROGRAM filter. To learn more about this tool, see the `ISB-CGC BigQuery Table Search documentation <../BigQueryTableSearchUI.html>`_.

The MP2PRT tables are in project isb-cgc-bq. To learn more about how to view and query tables in the Google BigQuery console, see the `ISB-CGC BigQuery Tables documentation <../BigQuery.html>`_.

- Data set ``isb-cgc-bq.MP2PRT`` contains the latest tables for each data type.
- Data set ``isb-cgc-bq.MP2PRT_versioned`` contains previously released tables, as well as the most current table.
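
To see which releases are held in the versioned data set, one option is to sort its tables by creation time. This is only a sketch: it uses BigQuery's standard ``INFORMATION_SCHEMA.TABLES`` view, assumes your account can read the data set's metadata, and treats ``creation_time`` (when the table was created in BigQuery) as a stand-in for release order, which may not match the official release labels exactly.

.. code-block:: sql

   -- List the versioned MP2PRT tables, newest first
   SELECT table_name, creation_time
   FROM `isb-cgc-bq.MP2PRT_versioned.INFORMATION_SCHEMA.TABLES`
   ORDER BY creation_time DESC
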
42 changes: 42 additions & 0 deletions docs/source/sections/data/REBC_about.rst
@@ -0,0 +1,42 @@
*****************
REBC Data Set
*****************

About REBC
------------------------------------------------------------------------

The REBC study is a comprehensive genomic characterization of radiation-related papillary thyroid cancer in Ukraine following the 1986 Chernobyl nuclear power plant accident. This accident released radioactive contaminants into the surrounding areas in Ukraine, Belarus, and Russia, causing an increased occurrence of thyroid cancer among individuals who were children at the time of the accident or born not long afterwards.

About REBC Data
---------------------------------------------------------------------------------

The REBC data set includes one project, REBC-THYR, with 440 cases. Data categories include sequencing reads, transcriptome profiling, simple nucleotide variation and copy number variation.

For more information on REBC data, please refer to the site below:

- `GDC Data Portal <https://portal.gdc.cancer.gov/projects?filters=%7B%22op%22%3A%22and%22%2C%22content%22%3A%5B%7B%22op%22%3A%22in%22%2C%22content%22%3A%7B%22field%22%3A%22projects.program.name%22%2C%22value%22%3A%5B%22REBC%22%5D%7D%7D%5D%7D>`_

Accessing the REBC Data on the Cloud
-------------------------------------------------------------------------------------------

Besides accessing the files on the GDC Data Portal, you can also access them from the GDC Google Cloud Storage Bucket, which means that you don’t need to download them to perform analysis. ISB-CGC stores the cloud file locations in tables in the ``isb-cgc-bq.GDC_case_file_metadata`` data set in BigQuery.

- To access these metadata tables, go to the Google BigQuery console.
- Perform SQL queries to find the REBC files. Here is an example:

.. code-block:: sql

   -- List active REBC files together with their Google Cloud Storage URLs
   SELECT active.*, file_gdc_url
   FROM `isb-cgc-bq.GDC_case_file_metadata.fileData_active_current` AS active
   JOIN `isb-cgc-bq.GDC_case_file_metadata.GDCfileID_to_GCSurl_current` AS GCSurl
     ON active.file_gdc_id = GCSurl.file_gdc_id
   WHERE program_name = 'REBC'

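
Before relying on the cloud copies, it can be useful to check how many REBC files actually have a Cloud Storage location. The sketch below reuses the same two tables and columns as the example above, with a ``LEFT JOIN`` so that files without a URL are still counted. It assumes, as the example above implies, that ``program_name`` belongs to the file metadata table.

.. code-block:: sql

   -- Compare the number of active REBC files with the number that have a GCS URL
   SELECT
     COUNT(DISTINCT active.file_gdc_id) AS total_files,
     COUNT(DISTINCT GCSurl.file_gdc_id) AS files_with_gcs_url
   FROM `isb-cgc-bq.GDC_case_file_metadata.fileData_active_current` AS active
   LEFT JOIN `isb-cgc-bq.GDC_case_file_metadata.GDCfileID_to_GCSurl_current` AS GCSurl
     ON active.file_gdc_id = GCSurl.file_gdc_id
   WHERE program_name = 'REBC'

If the two counts differ, some files are listed in the metadata but have no recorded cloud location.
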
Accessing the REBC Data in Google BigQuery
------------------------------------------------

ISB-CGC has REBC data, such as clinical data and metadata, stored in Google BigQuery tables. Information about these tables can be found using the `ISB-CGC BigQuery Table Search <https://isb-cgc.appspot.com/bq_meta_search/>`_ with REBC selected for the PROGRAM filter. To learn more about this tool, see the `ISB-CGC BigQuery Table Search documentation <../BigQueryTableSearchUI.html>`_.

The REBC tables are in project isb-cgc-bq. To learn more about how to view and query tables in the Google BigQuery console, see the `ISB-CGC BigQuery Tables documentation <../BigQuery.html>`_.

- Data set ``isb-cgc-bq.REBC`` contains the latest tables for each data type.
- Data set ``isb-cgc-bq.REBC_versioned`` contains previously released tables, as well as the most current table.
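
To get a quick overview of what each REBC table contains without opening it, the columns can be listed in a single query. This is a sketch based on BigQuery's standard ``INFORMATION_SCHEMA.COLUMNS`` view and assumes your account can read the data set's metadata. It does not assume any particular table or column names.

.. code-block:: sql

   -- Show every column of every table in the current REBC data set
   SELECT table_name, column_name, data_type
   FROM `isb-cgc-bq.REBC.INFORMATION_SCHEMA.COLUMNS`
   ORDER BY table_name, ordinal_position
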
43 changes: 43 additions & 0 deletions docs/source/sections/data/TRIO_about.rst
@@ -0,0 +1,43 @@
*****************
TRIO Data Set
*****************

About TRIO
------------------------------------------------------------------------

The Ukrainian National Research Center for Radiation Medicine Trio Study contains epidemiologic data of trios of parents (exposed to the radiation from the Chernobyl accident) and their unexposed offspring. The purpose of the study is to investigate the transgenerational effects following nuclear accidents to understand the consequences of parental exposure to ionizing radiation.


About the TRIO Data
---------------------------------------------------------------------------------

The TRIO data set includes whole genome sequencing (WGS) reads for 339 cases in the project TRIO-CRU.

For more information on TRIO data, please refer to the site below:

- `GDC Data Portal <https://portal.gdc.cancer.gov/projects?filters=%7B%22op%22%3A%22and%22%2C%22content%22%3A%5B%7B%22op%22%3A%22in%22%2C%22content%22%3A%7B%22field%22%3A%22projects.program.name%22%2C%22value%22%3A%5B%22TRIO%22%5D%7D%7D%5D%7D>`_

Accessing the TRIO Data on the Cloud
-------------------------------------------------------------------------------------------

Besides accessing the files on the GDC Data Portal, you can also access them from the GDC Google Cloud Storage Bucket, which means that you don’t need to download them to perform analysis. ISB-CGC stores the cloud file locations in tables in the ``isb-cgc-bq.GDC_case_file_metadata`` data set in BigQuery.

- To access these metadata tables, go to the Google BigQuery console.
- Perform SQL queries to find the TRIO files. Here is an example:

.. code-block:: sql

   -- List active TRIO files together with their Google Cloud Storage URLs
   SELECT active.*, file_gdc_url
   FROM `isb-cgc-bq.GDC_case_file_metadata.fileData_active_current` AS active
   JOIN `isb-cgc-bq.GDC_case_file_metadata.GDCfileID_to_GCSurl_current` AS GCSurl
     ON active.file_gdc_id = GCSurl.file_gdc_id
   WHERE program_name = 'TRIO'

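
When only the file locations are needed, for example to build a list of inputs for an analysis pipeline, the same join can be reduced to the URL column. This is a sketch that uses only the tables and columns from the example above. Listing a URL does not by itself grant access to the underlying file.

.. code-block:: sql

   -- Produce a de-duplicated, sorted list of Cloud Storage URLs for TRIO files
   SELECT DISTINCT GCSurl.file_gdc_url
   FROM `isb-cgc-bq.GDC_case_file_metadata.fileData_active_current` AS active
   JOIN `isb-cgc-bq.GDC_case_file_metadata.GDCfileID_to_GCSurl_current` AS GCSurl
     ON active.file_gdc_id = GCSurl.file_gdc_id
   WHERE program_name = 'TRIO'
   ORDER BY GCSurl.file_gdc_url
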
Accessing the TRIO Data in Google BigQuery
------------------------------------------------

ISB-CGC has TRIO data, such as clinical data and metadata, stored in Google BigQuery tables. Information about these tables can be found using the `ISB-CGC BigQuery Table Search <https://isb-cgc.appspot.com/bq_meta_search/>`_ with TRIO selected for the PROGRAM filter. To learn more about this tool, see the `ISB-CGC BigQuery Table Search documentation <../BigQueryTableSearchUI.html>`_.

The TRIO tables are in project isb-cgc-bq. To learn more about how to view and query tables in the Google BigQuery console, see the `ISB-CGC BigQuery Tables documentation <../BigQuery.html>`_.

- Data set ``isb-cgc-bq.TRIO`` contains the latest tables for each data type.
- Data set ``isb-cgc-bq.TRIO_versioned`` contains previously released tables, as well as the most current table.
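
To compare the current and versioned data sets side by side, their table listings can be combined in one result. This is a sketch using BigQuery's standard ``INFORMATION_SCHEMA.TABLES`` view. It assumes your account can read both data sets' metadata and that both data sets are stored in the same BigQuery location, which a single query requires.

.. code-block:: sql

   -- List the tables of both TRIO data sets, labelled by the data set they come from
   SELECT 'TRIO' AS data_set, table_name
   FROM `isb-cgc-bq.TRIO.INFORMATION_SCHEMA.TABLES`
   UNION ALL
   SELECT 'TRIO_versioned' AS data_set, table_name
   FROM `isb-cgc-bq.TRIO_versioned.INFORMATION_SCHEMA.TABLES`
   ORDER BY data_set, table_name
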
