Add function to generate table of descriptive statistics #196

Open
AshlinHarris opened this issue Dec 2, 2022 · 2 comments
@AshlinHarris
Contributor

Input: a concept group c

Output:

  • A table with columns for individuals in c, individuals outside c, unmatched individuals, and the total, each reported as a count and a percentage
  • Rows for the overall totals and for breakdowns by sex and by age (in 5-year bins)
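
A minimal sketch of such a table in plain Python, to make the shape of the output concrete. Everything named here is an assumption for illustration: the function `descriptive_table`, the record keys `sex`, `year_of_birth`, and `status`, the three status labels, and the choice to express each percentage relative to its column total. Age is approximated as a reference year minus the year of birth.

```python
from collections import Counter

# Hypothetical status labels for the three non-total columns.
STATUSES = ("in_group", "outside_group", "unmatched")


def age_bin(age, width=5):
    """Return a label such as '30-34' for the bin containing `age`."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def descriptive_table(people, reference_year):
    """Build {row: {column: (count, percent_of_column_total)}}.

    `people` is an iterable of dicts with the assumed keys
    'sex', 'year_of_birth', and 'status' (one of STATUSES).
    """
    counts = Counter()
    for person in people:
        age = reference_year - person["year_of_birth"]
        rows = ("Total", f"Sex: {person['sex']}", f"Age: {age_bin(age)}")
        for row in rows:
            counts[(row, person["status"])] += 1
            counts[(row, "total")] += 1

    columns = STATUSES + ("total",)
    table = {}
    for row in sorted({row for row, _ in counts}):
        table[row] = {}
        for col in columns:
            n = counts[(row, col)]
            denominator = counts[("Total", col)]
            percent = round(100 * n / denominator, 1) if denominator else 0.0
            table[row][col] = (n, percent)
    return table


people = [
    {"sex": "F", "year_of_birth": 1990, "status": "in_group"},
    {"sex": "M", "year_of_birth": 1988, "status": "in_group"},
    {"sex": "F", "year_of_birth": 1960, "status": "outside_group"},
    {"sex": "M", "year_of_birth": 1991, "status": "unmatched"},
]
table = descriptive_table(people, reference_year=2022)
for row, cells in table.items():
    print(row, cells)
```

Whether percentages should be taken per column or over the whole data set is not settled above; the denominator is a single line that can be changed.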
@AshlinHarris AshlinHarris added the enhancement New feature or request label Dec 2, 2022
@AshlinHarris AshlinHarris self-assigned this Dec 2, 2022
@AshlinHarris
Contributor Author

The function should be run on

  • the entire data set
  • COVID-positive individuals
  • vaccinated individuals, broken down by vaccine type

What if individuals have received more than one vaccine type?
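
One possible convention, sketched below, is to count an individual under every vaccine type they received and to track individuals with more than one type as a separate group. This is only an illustration of that option, not a decision; the function name `vaccine_groups`, the `(person_id, vaccine_type)` input format, and the group label are all hypothetical.

```python
from collections import defaultdict


def vaccine_groups(exposures):
    """Group person ids by vaccine type.

    `exposures` is an iterable of (person_id, vaccine_type) pairs,
    possibly with repeats (for example, two doses of the same type).
    """
    types_by_person = defaultdict(set)
    for person_id, vaccine_type in exposures:
        types_by_person[person_id].add(vaccine_type)

    groups = defaultdict(set)
    for person_id, types in types_by_person.items():
        # Count the person once under each distinct type received.
        for vaccine_type in types:
            groups[vaccine_type].add(person_id)
        # Also track people who appear under more than one type.
        if len(types) > 1:
            groups["more than one type"].add(person_id)
    return dict(groups)


exposures = [
    (1, "Pfizer"), (1, "Pfizer"),   # two doses, same type
    (2, "Moderna"),
    (3, "Pfizer"), (3, "Moderna"),  # mixed types
]
groups = vaccine_groups(exposures)
print({name: sorted(ids) for name, ids in groups.items()})
```

With this convention the per-type counts can sum to more than the number of vaccinated individuals, so the total row would need to be computed from distinct person ids.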

@AshlinHarris
Contributor Author

Quote from https://www.ohdsi.org/web/wiki/doku.php?id=documentation:vocabulary:gender:

The Gender domain captures all concepts about the sex of a person, denoting the biological and physiological characteristics. In fact, the Domain (and field in the PERSON table) should probably should be called “sex” rather than “gender”, as gender refers to behaviors, roles, expectations, and activities in society.

The domain contains only two standard concepts: FEMALE (concept_id=8532) and MALE (concept_id=8507). Many data sources contain other codes, such as “Unknown”, “Refused to tell”, “Hermaphrodite”, as well as transgender constellations (“male to female”, etc.). For the current purposes of the OMOP CDM, the gender concepts are used to stratify patients by their biological make-up or to adjust analytical results for the influence of the biological sex. Therefore, all those other genders are denoted as concept_id=0 (unknown information).

Quote from https://www.ohdsi.org/web/wiki/doku.php?id=documentation:vocabulary:ethnicity:

The race field contains races and ethnic backgrounds, while for Ethnicity there are only two categories for data on ethnicity: “Hispanic or Latino” (concept_id=38003563) and “Not Hispanic or Latino” (concept_id=38003564). This means, the two categories are orthogonal to each other, and both Latinos and non-Latinos can have any racial or ethnic background.

This is a very US-centric solution, and hence the terminology might be confusing to non-US data owners. If [you] belong to the latter group, you can probably ignore this field entirely.

There are no relationships defined for Ethnicity.

Relevant data fields:

  • :ETHNICITY_CONCEPT_ID
    • "Hispanic or Latino": 38003563
    • "Not Hispanic or Latino": 38003564
  • :GENDER_CONCEPT_ID
    • Female: 8532
    • Male: 8507
    • Other or Unknown: 0
  • :YEAR_OF_BIRTH
  • :DRUG_CONCEPT_ID
    • Johnson & Johnson: 739906
    • Pfizer: 37003436
    • Moderna: 37003518
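
For reference, the concept IDs listed above can be kept in lookup tables so that rows of the statistics table carry readable labels. The sketch below is an assumption about how that might look: the dictionary names and the helper `label_for` are hypothetical, and the fallback label for unrecognized IDs is a parameter rather than a fixed choice.

```python
# Concept IDs copied from the list above.
SEX_LABELS = {8532: "Female", 8507: "Male"}
ETHNICITY_LABELS = {
    38003563: "Hispanic or Latino",
    38003564: "Not Hispanic or Latino",
}
VACCINE_LABELS = {
    739906: "Johnson & Johnson",
    37003436: "Pfizer",
    37003518: "Moderna",
}


def label_for(mapping, concept_id, default):
    """Return the label for `concept_id`, or `default` if it is not listed."""
    return mapping.get(concept_id, default)


print(label_for(SEX_LABELS, 8532, "Other or Unknown"))
print(label_for(SEX_LABELS, 0, "Other or Unknown"))
print(label_for(VACCINE_LABELS, 37003518, "Unlisted"))
```

Concept ID 0 needs no entry of its own in `SEX_LABELS`, since it falls through to the default label.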

Identifying COVID-positive individuals requires a concept set. How should concept sets be handled in general?
