Vlad edited this page Jan 19, 2023 · 1 revision

COSMIC vocabulary

Overview

This vocabulary is based on data from the Catalogue Of Somatic Mutations In Cancer (COSMIC), one of the most comprehensive resources for exploring the impact of somatic mutations in human cancer.

Sources

The source data is provided by the COSMIC developers in TSV format. Only the Cancer Mutation Census data source was ingested into the OMOP terminologies.

Transformation

The procedures for transforming Concepts from the source to the OMOP Standard Vocabularies can be found on the OHDSI GitHub.

Concept Names

All concept names are concatenations of the gene acronym with both the amino acid residue alteration and the change in the related coding sequence.
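The naming rule can be illustrated with a minimal Python sketch. The function name `build_concept_name` and the exact separators (a colon and parentheses) are assumptions made for illustration; the actual format produced by the vocabulary build may differ.

```python
def build_concept_name(gene: str, protein_change: str, cds_change: str) -> str:
    """Concatenate gene acronym, amino acid change and coding-sequence change.

    The ":" separator and the parentheses are assumed for illustration only;
    they are not taken from the COSMIC vocabulary build itself.
    """
    parts = [gene.strip(), protein_change.strip(), cds_change.strip()]
    if not all(parts):
        raise ValueError("gene, protein change and CDS change are all required")
    return f"{parts[0]}:{parts[1]} ({parts[2]})"


# Well-known example variant: BRAF V600E, coding change c.1799T>A
example_name = build_concept_name("BRAF", "p.V600E", "c.1799T>A")
print(example_name)
```

Requiring all three components mirrors the statement that names carry both the protein-level and the coding-sequence-level change.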

Concept Synonyms

To support composing OMOP Genomic concepts from the results of the KOIOS tool, all relevant HGVS expressions are included as synonyms. The language of the concepts is recorded as Genetic nomenclature.

Concept Code

Concept codes are taken from the COSMIC-derived genomic_mutation_id.

Standard Concepts

All COSMIC concepts are non-standard.

Concept Classes and Domains

All COSMIC concepts belong to the Variant concept_class_id within the Measurement domain.
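Taken together, the attributes described so far could be assembled roughly as in the sketch below. The sample TSV content, the `gene_name` column, the `COSV00000001` identifier, and the `COSMIC` value for vocabulary_id are all assumptions for illustration; only the use of genomic_mutation_id as the concept code, the Variant class, the Measurement domain, and the non-standard status come from this page.

```python
import csv
import io

# Hypothetical one-row excerpt; the real COSMIC export has more columns
# and its header names may differ. The identifier below is made up.
SAMPLE_TSV = (
    "gene_name\tgenomic_mutation_id\n"
    "BRAF\tCOSV00000001\n"
)


def to_concept_row(record: dict) -> dict:
    """Build the CONCEPT attributes described on this page for one source row."""
    return {
        "concept_code": record["genomic_mutation_id"],
        "vocabulary_id": "COSMIC",      # assumed vocabulary_id value
        "domain_id": "Measurement",
        "concept_class_id": "Variant",
        "standard_concept": None,       # NULL marks a non-standard concept
    }


reader = csv.DictReader(io.StringIO(SAMPLE_TSV), delimiter="\t")
concept_rows = [to_concept_row(row) for row in reader]
print(concept_rows[0]["concept_code"], concept_rows[0]["concept_class_id"])
```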

Concept Relationships

No relationships exist within the vocabulary. No relationships between COSMIC and other OMOP vocabularies exist at this time.

Instructions for ETL

All COSMIC concepts are non-Standard, so they should be mapped to the corresponding Standard Concepts using the CONCEPT_RELATIONSHIP table ("Maps to" and occasionally "Maps to value" records). Most of them will map to a single OMOP Genomic Concept once the OMOP Genomic vocabulary is released.
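A lookup of this kind might look like the following sketch, which uses an in-memory SQLite database in place of a real vocabulary schema. All concept IDs, concept codes, and the single "Maps to" record are hypothetical, and the tables carry only the columns needed here. A source concept without a mapping falls back to concept_id 0, following the usual OMOP convention for unmapped source values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
    concept_id INTEGER PRIMARY KEY,
    concept_code TEXT,
    vocabulary_id TEXT,
    standard_concept TEXT
);
CREATE TABLE concept_relationship (
    concept_id_1 INTEGER,
    concept_id_2 INTEGER,
    relationship_id TEXT,
    invalid_reason TEXT
);
-- Hypothetical rows: two COSMIC source concepts, one standard target.
INSERT INTO concept VALUES (1001, 'COSV00000001', 'COSMIC', NULL);
INSERT INTO concept VALUES (1002, 'COSV00000002', 'COSMIC', NULL);
INSERT INTO concept VALUES (2001, 'HYPOTHETICAL-1', 'OMOP Genomic', 'S');
INSERT INTO concept_relationship VALUES (1001, 2001, 'Maps to', NULL);
""")


def map_to_standard(conn, source_code, vocabulary_id="COSMIC"):
    """Return (source_concept_id, standard_concept_id) pairs for a source code.

    standard_concept_id is 0 when no valid "Maps to" record exists.
    """
    return conn.execute(
        """
        SELECT src.concept_id, COALESCE(cr.concept_id_2, 0)
        FROM concept AS src
        LEFT JOIN concept_relationship AS cr
               ON cr.concept_id_1 = src.concept_id
              AND cr.relationship_id = 'Maps to'
              AND cr.invalid_reason IS NULL
        WHERE src.concept_code = ? AND src.vocabulary_id = ?
        """,
        (source_code, vocabulary_id),
    ).fetchall()


mapped = map_to_standard(conn, "COSV00000001")
unmapped = map_to_standard(conn, "COSV00000002")
print(mapped, unmapped)
```

The LEFT JOIN keeps the source concept in the result even when no mapping exists, so the ETL can still populate the source concept field while the standard concept field is set to 0.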
