codecov.io R-CMD-check Lifecycle:Experimental

CodelistGenerator

Introduction

CodelistGenerator creates a candidate set of codes to help define patient cohorts in data mapped to the OMOP common data model. The process is a little like a systematic review: for a specified search strategy, CodelistGenerator identifies a set of concepts that may be relevant, and these are then screened to remove any irrelevant codes.

Installation

You can install the development version of CodelistGenerator like so:

install.packages("remotes")
remotes::install_github("darwin-eu/CodelistGenerator")

Connecting to the OMOP CDM vocabularies

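# packages used unprefixed in the examples below
library(dplyr)
library(knitr)
library(CodelistGenerator)

# example with postgres database connection details
server_dbi <- Sys.getenv("server")
user <- Sys.getenv("user")
password <- Sys.getenv("password")
port <- Sys.getenv("port")
host <- Sys.getenv("host")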

db <- DBI::dbConnect(RPostgres::Postgres(),
                dbname = server_dbi,
                port = port,
                host = host,
                user = user,
                password = password)

# name of vocabulary schema
vocabulary_database_schema <- Sys.getenv("vocabulary_schema")
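
Because `Sys.getenv()` returns an empty string for an unset variable, a missing setting only shows up later as a connection error. A minimal sketch of a check that could be run before connecting is shown below. The helper `missing_env_vars()` is hypothetical and not part of CodelistGenerator; the variable names are the ones used above.

```r
# Hypothetical helper (not part of CodelistGenerator): return the names of
# environment variables that are unset or empty.
missing_env_vars <- function(vars) {
  values <- Sys.getenv(vars, unset = "")
  vars[values == ""]
}

# Environment variables read in the connection example above
required <- c("server", "user", "password", "port", "host", "vocabulary_schema")

missing <- missing_env_vars(required)
if (length(missing) > 0) {
  message("Unset environment variables: ", paste(missing, collapse = ", "))
}
```

These variables could be set, for example, in a project-level `.Renviron` file so that credentials stay out of the script.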

Example search

Every codelist is specific to a version of the OMOP CDM vocabularies, so we can first check the version.

dplyr::tbl(db, dplyr::sql(paste0(
    "SELECT * FROM ",
    vocabulary_database_schema,
    ".vocabulary"
    ))) %>%
    dplyr::rename_with(tolower) %>%
    dplyr::filter(.data$vocabulary_id == "None") %>%
    dplyr::select("vocabulary_version") %>%
    dplyr::collect() %>%
    dplyr::pull()
#> [1] "v5.0 13-JUL-21"

We can then search for asthma like so:

asthma_1 <- get_candidate_codes(
  keywords = "asthma",
  domains = "Condition",
  db = db,
  vocabulary_database_schema = vocabulary_database_schema
)
kable(head(asthma_1, 10))

| concept_id | concept_name                       | domain_id | vocabulary_id |
|-----------:|:-----------------------------------|:----------|:--------------|
|     761844 | Inhaled steroid-dependent asthma   | Condition | SNOMED        |
|     764677 | Persistent asthma                  | Condition | SNOMED        |
|     764949 | Persistent asthma, well controlled | Condition | SNOMED        |
|    3661412 | Thunderstorm asthma                | Condition | SNOMED        |
|    4015819 | Asthma disturbs sleep weekly       | Condition | SNOMED        |
|    4015947 | Asthma causing night waking        | Condition | SNOMED        |
|    4017025 | Asthma disturbing sleep            | Condition | SNOMED        |
|    4017026 | Asthma not limiting activities     | Condition | SNOMED        |
|    4017182 | Asthma disturbs sleep frequently   | Condition | SNOMED        |
|    4017183 | Asthma not disturbing sleep        | Condition | SNOMED        |

We may want to exclude certain concepts as part of the search strategy, in which case an exclusion term can be added like so:

asthma_2 <- get_candidate_codes(
  keywords = "asthma",
  domains = "Condition",
  exclude = "Poisoning by antiasthmatic",
  db = db,
  vocabulary_database_schema = vocabulary_database_schema
)
kable(head(asthma_2, 10))

| concept_id | concept_name                       | domain_id | vocabulary_id |
|-----------:|:-----------------------------------|:----------|:--------------|
|     761844 | Inhaled steroid-dependent asthma   | Condition | SNOMED        |
|     764677 | Persistent asthma                  | Condition | SNOMED        |
|     764949 | Persistent asthma, well controlled | Condition | SNOMED        |
|    3661412 | Thunderstorm asthma                | Condition | SNOMED        |
|    4015819 | Asthma disturbs sleep weekly       | Condition | SNOMED        |
|    4015947 | Asthma causing night waking        | Condition | SNOMED        |
|    4017025 | Asthma disturbing sleep            | Condition | SNOMED        |
|    4017026 | Asthma not limiting activities     | Condition | SNOMED        |
|    4017182 | Asthma disturbs sleep frequently   | Condition | SNOMED        |
|    4017183 | Asthma not disturbing sleep        | Condition | SNOMED        |
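
To illustrate why an exclusion term can be needed, here is a minimal sketch of keyword inclusion and exclusion on a small in-memory table. It is not the implementation used by `get_candidate_codes()`, which queries the vocabulary tables in the database; the helper `filter_concepts()` and the toy data are assumptions made for illustration only.

```r
# Toy concept table. The first two names appear in the output above; the
# last two rows are illustrative only.
concepts <- data.frame(
  concept_name = c(
    "Persistent asthma",
    "Thunderstorm asthma",
    "Poisoning by antiasthmatic",
    "Fracture of femur"
  ),
  domain_id = "Condition",
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows whose name contains any keyword and does
# not contain any exclusion term (case-insensitive, literal matching).
filter_concepts <- function(concepts, keywords, exclude = character(0)) {
  name <- tolower(concepts$concept_name)
  matches_any <- function(terms) {
    Reduce(
      `|`,
      lapply(tolower(terms), function(term) grepl(term, name, fixed = TRUE)),
      rep(FALSE, length(name))
    )
  }
  keep <- matches_any(keywords) & !matches_any(exclude)
  concepts[keep, , drop = FALSE]
}

# "asthma" is a substring of "antiasthmatic", so the poisoning concept is
# picked up by the keyword alone and dropped once the exclusion is applied.
filter_concepts(concepts, keywords = "asthma")
filter_concepts(concepts, keywords = "asthma", exclude = "Poisoning by antiasthmatic")
```

In this sketch the exclusion does not change the order of the remaining rows, which is consistent with the first ten rows of `asthma_1` and `asthma_2` above being identical.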

We can also see the source codes that these concepts are mapped from, for example:

asthma_icd_mappings <- show_mappings(
  candidate_codelist = asthma_2,
  source_vocabularies = "ICD10CM",
  db = db,
  vocabulary_database_schema = vocabulary_database_schema
)
kable(head(
  asthma_icd_mappings %>%
    select(
      standard_concept_name,
      standard_vocabulary_id,
      source_concept_name,
      source_vocabulary_id
    ),
  10
))

| standard_concept_name              | standard_vocabulary_id | source_concept_name                                    | source_vocabulary_id |
|:-----------------------------------|:-----------------------|:-------------------------------------------------------|:---------------------|
| Eosinophilic asthma                | SNOMED                 | Pulmonary eosinophilia, not elsewhere classified       | ICD10CM              |
| Eosinophilic asthma                | SNOMED                 | Eosinophilic asthma                                    | ICD10CM              |
| Eosinophilic asthma                | SNOMED                 | Other pulmonary eosinophilia, not elsewhere classified | ICD10CM              |
| Eosinophilic asthma                | SNOMED                 | Pulmonary eosinophilia, not elsewhere classified       | ICD10CM              |
| Cryptogenic pulmonary eosinophilia | SNOMED                 | Chronic eosinophilic pneumonia                         | ICD10CM              |
| Simple pulmonary eosinophilia      | SNOMED                 | Acute eosinophilic pneumonia                           | ICD10CM              |
| Asthma                             | SNOMED                 | Asthma                                                 | ICD10CM              |
| Asthma                             | SNOMED                 | Other and unspecified asthma                           | ICD10CM              |
| Asthma                             | SNOMED                 | Unspecified asthma                                     | ICD10CM              |
| Asthma                             | SNOMED                 | Other asthma                                           | ICD10CM              |
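
When screening mappings, it can help to see how many distinct source concepts map to each standard concept. The sketch below does this in base R on the ten rows shown above, retyped by hand into a data frame named `mappings` (a name chosen here for illustration). With a live connection, the same summary could be computed on `asthma_icd_mappings` directly.

```r
# The ten rows shown above, retyped as a small data frame
mappings <- data.frame(
  standard_concept_name = c(
    rep("Eosinophilic asthma", 4),
    "Cryptogenic pulmonary eosinophilia",
    "Simple pulmonary eosinophilia",
    rep("Asthma", 4)
  ),
  source_concept_name = c(
    "Pulmonary eosinophilia, not elsewhere classified",
    "Eosinophilic asthma",
    "Other pulmonary eosinophilia, not elsewhere classified",
    "Pulmonary eosinophilia, not elsewhere classified",
    "Chronic eosinophilic pneumonia",
    "Acute eosinophilic pneumonia",
    "Asthma",
    "Other and unspecified asthma",
    "Unspecified asthma",
    "Other asthma"
  ),
  stringsAsFactors = FALSE
)

# Number of distinct source concept names per standard concept
n_sources <- vapply(
  split(mappings$source_concept_name, mappings$standard_concept_name),
  function(x) length(unique(x)),
  integer(1)
)
n_sources
```

Counting distinct names rather than rows matters here because the same source concept name can appear more than once for a given standard concept, as it does for "Eosinophilic asthma" in the rows above.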
