Commit

extend File-Formats.md with mutational signature datatype
MatthijsPon committed May 12, 2023
1 parent 584008b commit 86fd419
Showing 1 changed file with 31 additions and 0 deletions.
31 changes: 31 additions & 0 deletions docs/File-Formats.md
@@ -21,6 +21,7 @@
* [Study Tags file](#study-tags-file)
* [Generic Assay](#generic-assay)
* [Arm Level CNA Data](#arm-level-cna-data)
* [Mutational Signature Data](#mutational-signature-data)
* [Resource Data](#resource-data)
* [Custom namespace columns](#custom-namespace-columns)

@@ -1599,6 +1600,36 @@ Allowed values for Arm-level copy-number data are `Loss`, `Gain`, and `Unchanged

Please find example file format here: [Meta file example](https://github.com/cBioPortal/cbioportal-frontend/blob/master/end-to-end-test/local/studies/lgg_ucsf_2014_test_generic_assay/meta_armlevel_CNA.txt) and [Data file example](https://github.com/cBioPortal/cbioportal-frontend/blob/master/end-to-end-test/local/studies/lgg_ucsf_2014_test_generic_assay/data_armlevel_CNA.txt)

### Mutational Signature Data
Mutational Signature data is a predefined subtype of Generic Assay Data. Setting `generic_assay_type: MUTATIONAL_SIGNATURE`
in the meta file will make cBioPortal interpret the data as Mutational Signature data.

#### Mutational Signature meta files
The mutational signature meta files follow the same convention as the [Generic Assay Meta file](#generic-assay-meta-file),
with some key differences:
- `generic_assay_type` should be set to `MUTATIONAL_SIGNATURE`
- `stable_id` values should end with `{datatype}_{identifier}`, where:
    - `datatype` is one of `contribution`, `pvalue` or `matrix`
    - `identifier` is consistent between files belonging to the same analysis
- Multiple signatures can be added to a single study, as long as they have different identifiers in their stable id
(e.g., `contribution_SBS` and `contribution_DBS`)
- In `generic_entity_meta_properties` the `NAME` value is required. The `DESCRIPTION` and `URL` values can be added
to display more information and link to external resources in the mutational signatures tab.
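
The `stable_id` suffix convention above can be checked with a few lines of Python. This is a minimal sketch, not part of cBioPortal: the function names are hypothetical, and it assumes the identifier is the final underscore-free token of the stable id, which the text does not state.

```python
import re
from collections import defaultdict

# Datatypes named in the list above.
SIGNATURE_DATATYPES = ("contribution", "pvalue", "matrix")

# Assumption: the identifier is the last token and contains no underscore.
_SUFFIX_RE = re.compile(
    r"(?:^|_)(%s)_([^_]+)$" % "|".join(SIGNATURE_DATATYPES)
)


def parse_signature_stable_id(stable_id):
    """Return (datatype, identifier), or None if the suffix does not match."""
    match = _SUFFIX_RE.search(stable_id)
    if match is None:
        return None
    return match.group(1), match.group(2)


def group_by_identifier(stable_ids):
    """Group stable ids by identifier, mapping each datatype to its stable id."""
    groups = defaultdict(dict)
    for stable_id in stable_ids:
        parsed = parse_signature_stable_id(stable_id)
        if parsed is not None:
            datatype, identifier = parsed
            groups[identifier][datatype] = stable_id
    return dict(groups)
```

Grouping by identifier makes it easy to see which files belong to the same analysis, for example whether an SBS analysis has a contribution profile alongside its p-value profile.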

#### Mutational Signature data files
The mutational signature data files follow the same convention as the [Generic Assay Data file](#generic-assay-data-file).
Each collection of mutational signatures can consist of up to three different data files, each with an accompanying meta file.
- Signature contribution file (**required**)
    - Data file containing the contribution of each signature-sample pair. Values are expected to be 0 ≤ x ≤ 1.
- Signature p-values file (optional)
    - Data file containing p-values for each signature-sample pair. Values below 0.05 will be treated as significant.
- Mutation matrix file (optional)
    - Data file containing nucleotide changes of a sample. cBioPortal has specific visualisation options for single-base
    substitutions (96 channels), double-base substitutions (72 channels) and insertion/deletions (83 channels),
    following the signatures defined by [COSMIC](https://cancer.sanger.ac.uk/signatures/). Other channels can also
    be used. Values are expected to be positive integers.
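
The value ranges described above can be expressed as simple checks on individual cells of a data file. The sketch below is illustrative only and is not cBioPortal's validator: the function names are hypothetical, cells are assumed to arrive as strings, and accepting `0` as a matrix count by default is an assumption (the text asks for positive integers).

```python
SIGNIFICANCE_THRESHOLD = 0.05  # p-values below this are treated as significant


def is_valid_contribution(raw):
    """True if the cell parses as a number x with 0 <= x <= 1."""
    try:
        value = float(raw)
    except ValueError:
        return False
    # NaN and infinities fail this comparison and are rejected.
    return 0.0 <= value <= 1.0


def is_significant(raw_pvalue):
    """True if the p-value is strictly below the 0.05 threshold."""
    return float(raw_pvalue) < SIGNIFICANCE_THRESHOLD


def is_valid_matrix_count(raw, allow_zero=True):
    """True if the cell is an integer count.

    Assumption: zero counts are accepted by default; pass allow_zero=False
    to require strictly positive integers as the text states.
    """
    if not (raw.isascii() and raw.isdigit()):
        return False
    return int(raw) >= (0 if allow_zero else 1)
```

Running checks like these before import can catch out-of-range contributions or non-integer matrix counts early.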


## Resource Data

The resource data is used to capture resource data in patients, samples and studies. The resources will be represented by URLs with meta data. The types of resources include: