# `CONDITION_CONCEPT`

`CONDITION_CONCEPT` is a table in the `effect_nsides` database that stores conditions/outcomes.
It contains only those conditions/outcomes that appear in `CONDITION_OCCURRENCE`.
The table has the following schema:

```mysql
CREATE TABLE CONDITION_CONCEPT (
    concept_id int,
    concept_name varchar(255),
    meddra_concept_id int,
    snomed_concept_id int
);
```

Fields:
* `concept_id` is the OMOP CDM `concept_id` for each condition
* `concept_name` is the condition's OMOP CDM `concept_name`
* `meddra_concept_id` is the condition's ID from MedDRA
* `snomed_concept_id` is the condition's ID from SNOMED CT
    * Many of the concepts do not have a `snomed_concept_id` in the table. We were primarily concerned with the MedDRA concept IDs, as these were the codes used to compute PRR and related statistics.
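
As a minimal sketch of how the missing `snomed_concept_id` values could be counted, the example below loads the schema into an in-memory SQLite database. SQLite is only a stand-in for the MySQL `effect_nsides` database, and the third row is a hypothetical record added for illustration. `COUNT(column)` skips `NULL` values, so the difference between the two counts is the number of concepts without a SNOMED CT ID.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption)
conn = sqlite3.connect(':memory:')
conn.execute("""
    CREATE TABLE CONDITION_CONCEPT (
        concept_id int,
        concept_name varchar(255),
        meddra_concept_id int,
        snomed_concept_id int
    )
""")

rows = [
    (36516812, 'Arthralgia', 10003239, 77074),
    (35708093, 'Diarrhoea', 10012735, 196523),
    (1, 'Hypothetical condition', 2, None),  # illustrative row without a SNOMED CT ID
]
conn.executemany("INSERT INTO CONDITION_CONCEPT VALUES (?, ?, ?, ?)", rows)

# COUNT(*) counts every row; COUNT(snomed_concept_id) counts only non-NULL values
total, with_snomed = conn.execute(
    "SELECT COUNT(*), COUNT(snomed_concept_id) FROM CONDITION_CONCEPT"
).fetchone()
missing_snomed = total - with_snomed
print(total, missing_snomed)
```

The same `SELECT` statement should also work unchanged against the MySQL table.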

In [1]:
import pandas as pd

## Load original files

In [2]:
outcomes_df = pd.read_csv('../../data/meta_formatted/outcomes_table.csv.xz')
outcomes_df.head(2)

| | report_id | outcome_concept_id | snomed_outcome_concept_id | report_index | outcome_index |
|---|---|---|---|---|---|
| 0 | 100033001 | 36516812 | 77074.0 | 4394326 | 10544 |
| 1 | 100033001 | 35708093 | 196523.0 | 4394326 | 3612 |


In [3]:
concepts_df = pd.read_csv('../../data/athena_maps/CONCEPT.csv', sep='\t')
concepts_df.head(2)

| | concept_id | concept_name | domain_id | vocabulary_id | concept_class_id | standard_concept | concept_code | valid_start_date | valid_end_date | invalid_reason |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 45956931 | Self-emulsifying glyceryl monostearate | Observation | SNOMED | Substance | S | 3578611000001105 | 19700101 | 20991231 | |
| 1 | 45956935 | Sibutramine hydrochloride | Observation | SNOMED | Substance | S | 3579011000001108 | 19700101 | 20991231 | |


## Format original files to build `CONDITION_CONCEPT` table

In [4]:
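# Keep one row per distinct (outcome concept, SNOMED concept) pair
outcomes_df = (
    outcomes_df
    .rename(columns={'snomed_outcome_concept_id': 'snomed_concept_id'})
    .filter(items=['outcome_concept_id', 'snomed_concept_id'])
    .drop_duplicates()
)

# Keep only the columns needed for the merge. `concept_code` is renamed on the
# expectation that the outcome concepts matched below are MedDRA concepts.
concepts_df = (
    concepts_df
    .filter(items=['concept_id', 'concept_name', 'vocabulary_id', 'concept_code'])
    .rename(columns={'concept_code': 'meddra_concept_id'})
)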

## Combine original files and save to `data/tables/`

In [5]:
# Left merge: every outcome is kept, even if it has no match in the concept table
condition_concept_df = (
    outcomes_df
    .merge(concepts_df, left_on='outcome_concept_id', right_on='concept_id', how='left')
    .filter(items=['concept_id', 'concept_name', 'meddra_concept_id', 'snomed_concept_id'])
)

condition_concept_df.to_csv('../../data/tables/condition_concept.csv.xz',
                            compression='xz', index=False)

condition_concept_df.head(2)

| | concept_id | concept_name | meddra_concept_id | snomed_concept_id |
|---|---|---|---|---|
| 0 | 36516812 | Arthralgia | 10003239 | 77074.0 |
| 1 | 35708093 | Diarrhoea | 10012735 | 196523.0 |
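
Because the merge is a left join, any outcome whose `outcome_concept_id` is absent from the concept table ends up with an empty `concept_id` and `concept_name`. The following is a minimal sketch of how that could be checked, not part of the original pipeline. It uses small toy frames rather than the real files, and the ID `99999999` is a hypothetical unmatched concept.

```python
import pandas as pd

# Toy stand-ins for the formatted inputs (the last outcome row is hypothetical)
outcomes = pd.DataFrame({
    'outcome_concept_id': [36516812, 35708093, 99999999],
    'snomed_concept_id': [77074.0, 196523.0, None],
})
concepts = pd.DataFrame({
    'concept_id': [36516812, 35708093],
    'concept_name': ['Arthralgia', 'Diarrhoea'],
    'meddra_concept_id': ['10003239', '10012735'],
})

merged = outcomes.merge(
    concepts,
    left_on='outcome_concept_id',
    right_on='concept_id',
    how='left',
    validate='many_to_one',  # raises MergeError if concept_id repeats in `concepts`
    indicator=True,          # adds a `_merge` column: 'both' or 'left_only'
)

# Outcomes that found no matching concept row
unmatched = merged.loc[merged['_merge'] == 'left_only', 'outcome_concept_id'].tolist()

# Rows without a SNOMED CT ID
n_missing_snomed = int(merged['snomed_concept_id'].isna().sum())

print(unmatched, n_missing_snomed)
```

If `unmatched` is non-empty on the real data, those rows could be dropped or investigated before the table is written out.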
