# `CONDITION_OCCURRENCE`


`CONDITION_OCCURRENCE` is a table in the `effect_nsides` database that links each report to its outcomes ("condition occurrences"), with one row per report-outcome pair.
The table has the following schema:

```mysql
CREATE TABLE CONDITION_OCCURRENCE (
    report_id int,
    condition_concept_id int
);
```

Fields:
* `report_id` is the ID of each report, as assigned by the FDA. This is a foreign key referencing `REPORT.report_id`.
* `condition_concept_id` is the OMOP CDM `concept_id` of each condition. This is a foreign key referencing `CONDITION_CONCEPT.concept_id`.
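
The two foreign keys described above can be illustrated with a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in for the MySQL `effect_nsides` database, so the syntax and enforcement rules are SQLite's, not MySQL's. The `REPORT` and `CONDITION_CONCEPT` tables are reduced to their key columns here, which is an assumption: their full schemas are not shown in this document.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Parent tables reduced to their key columns (hypothetical, simplified schemas).
conn.execute("CREATE TABLE REPORT (report_id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE CONDITION_CONCEPT (concept_id INTEGER PRIMARY KEY)")
conn.execute(
    """
    CREATE TABLE CONDITION_OCCURRENCE (
        report_id INTEGER REFERENCES REPORT (report_id),
        condition_concept_id INTEGER REFERENCES CONDITION_CONCEPT (concept_id)
    )
    """
)

# One report with two conditions, using the IDs from this document's preview rows.
conn.execute("INSERT INTO REPORT VALUES (100033001)")
conn.executemany(
    "INSERT INTO CONDITION_CONCEPT VALUES (?)", [(36516812,), (35708093,)]
)
conn.executemany(
    "INSERT INTO CONDITION_OCCURRENCE VALUES (?, ?)",
    [(100033001, 36516812), (100033001, 35708093)],
)

# A row that points at a report missing from REPORT should be rejected.
try:
    conn.execute("INSERT INTO CONDITION_OCCURRENCE VALUES (999, 36516812)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

n_rows = conn.execute("SELECT COUNT(*) FROM CONDITION_OCCURRENCE").fetchone()[0]
```

In this sketch, the insert that references the unknown report ID `999` fails, so only the two valid rows remain in `CONDITION_OCCURRENCE`.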

```python
import pandas as pd
```

## Load original file

```python
outcomes_df = pd.read_csv('../../data/meta_formatted/outcomes_table.csv.xz')

outcomes_df.head(2)
```

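|   | report_id | outcome_concept_id | snomed_outcome_concept_id | report_index | outcome_index |
|---|-----------|--------------------|---------------------------|--------------|---------------|
| 0 | 100033001 | 36516812 | 77074.0 | 4394326 | 10544 |
| 1 | 100033001 | 35708093 | 196523.0 | 4394326 | 3612 |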


## Subset and save resulting table

```python
condition_occurrence = (
    outcomes_df
    .rename(columns={'outcome_concept_id': 'condition_concept_id'})
    .filter(items=['report_id', 'condition_concept_id'])
)

condition_occurrence.to_csv('../../data/tables/condition_occurrence.csv.xz', index=False,
                            compression='xz')

condition_occurrence.head(2)
```

|   | report_id | condition_concept_id |
|---|-----------|----------------------|
| 0 | 100033001 | 36516812 |
| 1 | 100033001 | 35708093 |

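The rename, filter, and save steps can be checked on a small scale. The following is a minimal sketch, not part of the original pipeline: it builds a toy frame from the two preview rows, applies the same transformation, writes the result to a temporary `.csv.xz` file, and reads it back. The names `toy_outcomes`, `toy_table`, and `reloaded` are hypothetical.

```python
import os
import tempfile

import pandas as pd

# Toy stand-in for outcomes_df, built from the two preview rows.
toy_outcomes = pd.DataFrame({
    'report_id': [100033001, 100033001],
    'outcome_concept_id': [36516812, 35708093],
    'snomed_outcome_concept_id': [77074.0, 196523.0],
})

# Same transformation as in the notebook: rename, then keep two columns.
toy_table = (
    toy_outcomes
    .rename(columns={'outcome_concept_id': 'condition_concept_id'})
    .filter(items=['report_id', 'condition_concept_id'])
)

# Round-trip through an xz-compressed CSV in a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'condition_occurrence.csv.xz')
    toy_table.to_csv(path, index=False, compression='xz')
    reloaded = pd.read_csv(path)  # compression is inferred from the .xz extension

columns_ok = list(reloaded.columns) == ['report_id', 'condition_concept_id']
round_trip_ok = reloaded.equals(toy_table)
has_missing = bool(reloaded.isna().any().any())
```

Because the file is written with `index=False`, the reloaded frame contains only the two schema columns and no extra index column.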