# DICOM to OMOP: create custom vocabularies

This notebook extracts DICOM attributes and values and adds them to the OMOP CDM vocabulary as custom concepts.
1. Restructure the data: DICOM harvest to OMOP structure
2. Extract information for `CONCEPT_RELATIONSHIP` from DICOM Standard Part 3

Links to OMOP CDM
- [Create a custom vocabulary](https://forums.ohdsi.org/t/how-to-add-a-custom-vocabulary-to-the-omop-vocabulary-table/12440/3)
- [OMOP CDM v.5.4 VOCABULARY](https://ohdsi.github.io/CommonDataModel/cdm54.html#vocabulary)
- [OMOP CDM v.5.4 CONCEPT_CLASS](https://ohdsi.github.io/CommonDataModel/cdm54.html#concept_class)
- [OMOP CDM v.5.4 CONCEPT](https://ohdsi.github.io/CommonDataModel/cdm54.html#concept)

Links to DICOM Standards
- [Value Representation (VR)](https://dicom.nema.org/medical/dicom/current/output/chtml/part05/sect_6.2.html)
- [Part 3](https://dicom.nema.org/medical/dicom/current/output/html/part03.html)

## 1. Restructure data: DICOM harvest to OMOP Structure

<blockquote>
<strong>This section is for reference only. You can skip it and use the flat files in the `files` directory of this repository. Instructions for updating your OMOP database are in a separate notebook, "upload_dicom_to_omop.ipynb".</strong>
</blockquote>

In [2]:
import pandas as pd

attributes = pd.read_csv("./files/DICOM Standard/part6_attributes.csv")
valuesets = pd.read_csv("./files/DICOM Standard/part16_fhir_valuesets.csv")
part3 = pd.read_pickle('./files/DICOM Standard/part3_mapping.pkl')

In [3]:
part3_att = part3[part3['CID']!=''].merge(attributes, left_on = 'Tag', right_on = 'Tag_cleaned', how = 'left')
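
The merge above joins Part 3 tags against `Tag_cleaned`, which appears to be the Part 6 tag with the parentheses and comma stripped (for example `(0008,0020)` becomes `00080020`). The following is a minimal sketch of that normalization, assuming this is how the column was produced; the helper name `clean_tag` and the toy frame are hypothetical.

```python
import pandas as pd


def clean_tag(tags: pd.Series) -> pd.Series:
    """Strip parentheses and commas from DICOM tags: '(0008,0020)' -> '00080020'."""
    return tags.str.replace(r"[(),]", "", regex=True)


# Toy input using three tags that appear in the Part 6 listing of this notebook.
example = pd.DataFrame({"Tag": ["(0008,0020)", "(0020,000D)", "(300A,0135)"]})
example["Tag_cleaned"] = clean_tag(example["Tag"])
print(example)
```

Keeping the cleaned tag as a string (not a number) matters, because many tags start with zeros or contain hexadecimal letters.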

In [10]:
# part 3 includes 1590 attributes
part3['Tag'].nunique()

1590

In [11]:
# part 6 includes 5190 attributes
attributes['Tag'].nunique()

5190

In [44]:
attributes[attributes['Name'].isna()]

Unnamed: 0,Tag,Name,Keyword,VR,VM,Unnamed: 5,Tag_cleaned,concept_id
86,"(0008,0202)",,,,,RET (2020c),00080202,2128000087
810,"(0018,0061)",,,DS,1.0,RET (2015c),00180061,2128000811
1461,"(0018,9445)",,,,,RET (2004) - See Note,00189445,2128001462
2089,"(0028,0020)",,,,,RET (2007) - See Note,00280020,2128002090
3751,"(0400,0315)",,,FL,1.0,RET (2015c),04000315,2128003752
4337,"(300A,0135)",,,,,RET,300A0135,2128004338
4743,"(300A,0782)",,,US,1.0,RET,300A0782,2128004744


In [102]:
attributes[attributes['Tag_cleaned']=="00190010"]

Unnamed: 0,Tag,Name,Keyword,VR,VM,Unnamed: 5,Tag_cleaned,concept_id


In [None]:
# # Version 2
# attributes_cid = part3_att['Tag_x'].unique()
# included_VR = ['AT', 'CS', 'DA', 'DT', 'DS', 'FL', 'FD', 'IS', 'SL', 'SS', 'SV', 'TM', 'UL', 'US', 'UV']
# attributes_included = attributes[(attributes['VR'].isin(included_VR)) | (attributes['Tag_cleaned'].isin(attributes_cid))]
# attributes_included #2915 -> 2983

# version 3 -> we are going to include all attributes

In [7]:
# columns for the imaging extension tables
mi_cdm = ["0020000D", "0020000E", "00080020", "00100020", "00080060", "00180015"]
attributes[(attributes['Tag_cleaned'].isin(mi_cdm))][['Tag', 'Name', 'VR', 'VM']]

Unnamed: 0,Tag,Name,VR,VM
16,"(0008,0020)",Study Date,DA,1
40,"(0008,0060)",Modality,CS,1
267,"(0010,0020)",Patient ID,LO,1
783,"(0018,0015)",Body Part Examined,CS,1
1676,"(0020,000D)",Study Instance UID,UI,1
1677,"(0020,000E)",Series Instance UID,UI,1


In [None]:
# DICOM attributes
# concept_id: 2128000000 + sequential number in range of 10-5999
# concept_name: 'Name'
# domain_id: Candidates - 'Measurement', 'Meas Value', 'Meas/Procedure', 'Type Concept'
# vocabulary_id: 'DICOM'
# concept_class_id: 'DICOM Attributes'
# standard_concept: NULL
# concept_code: Tag
# valid_start_date: 19930101
# valid_end_date: 20991231
# invalid_reason: NULL

In [8]:
import numpy as np
import pandas as pd

sequential_numbers = range(1, len(attributes)+1)
attributes.loc[:, 'concept_id'] = [2128000000 + num for num in sequential_numbers]

columns = ['concept_id', 'concept_name', 'domain_id', 'vocabulary_id', 'concept_class_id', 'standard_concept', 'concept_code', 'valid_start_date', 'valid_end_date','invalid_reason']
attribute_table_omop = pd.DataFrame(columns = columns)

attribute_table_omop['concept_id'] = attributes['concept_id']
attribute_table_omop['concept_name'] = attributes['Name']
attribute_table_omop['domain_id'] = 'Measurement'
attribute_table_omop['vocabulary_id'] = 'DICOM'
attribute_table_omop['concept_class_id'] = 'DICOM Attributes'
attribute_table_omop['concept_code'] = attributes['Tag_cleaned']
attribute_table_omop['valid_start_date'] = 19930101
attribute_table_omop['valid_end_date'] = 20991231

attribute_table_omop = attribute_table_omop.reset_index(drop=True)

In [9]:
attribute_table_omop

Unnamed: 0,concept_id,concept_name,domain_id,vocabulary_id,concept_class_id,standard_concept,concept_code,valid_start_date,valid_end_date,invalid_reason
0,2128000001,Length to End,Measurement,DICOM,DICOM Attributes,,00080001,19930101,20991231,
1,2128000002,Specific Character Set,Measurement,DICOM,DICOM Attributes,,00080005,19930101,20991231,
2,2128000003,Language Code Sequence,Measurement,DICOM,DICOM Attributes,,00080006,19930101,20991231,
3,2128000004,Image Type,Measurement,DICOM,DICOM Attributes,,00080008,19930101,20991231,
4,2128000005,Recognition Code,Measurement,DICOM,DICOM Attributes,,00080010,19930101,20991231,
...,...,...,...,...,...,...,...,...,...,...
5185,2128005186,Digital Signatures Sequence,Measurement,DICOM,DICOM Attributes,,FFFAFFFA,19930101,20991231,
5186,2128005187,Data Set Trailing Padding,Measurement,DICOM,DICOM Attributes,,FFFCFFFC,19930101,20991231,
5187,2128005188,Item,Measurement,DICOM,DICOM Attributes,,FFFEE000,19930101,20991231,
5188,2128005189,Item Delimitation Item,Measurement,DICOM,DICOM Attributes,,FFFEE00D,19930101,20991231,
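
The same column-by-column construction is repeated in this notebook for attributes, value sets, Code String values, and body parts. As a hedged sketch, the pattern could be wrapped in one helper; the function name `build_concept_table`, its parameters, and the toy input are assumptions for illustration, not part of the original notebook.

```python
import pandas as pd

CONCEPT_COLUMNS = [
    "concept_id", "concept_name", "domain_id", "vocabulary_id", "concept_class_id",
    "standard_concept", "concept_code", "valid_start_date", "valid_end_date",
    "invalid_reason",
]


def build_concept_table(source, name_col, code_col, concept_class_id,
                        first_concept_id, domain_id="Measurement"):
    """Return a CONCEPT-shaped DataFrame with consecutive concept_ids."""
    n = len(source)
    table = pd.DataFrame({
        "concept_id": range(first_concept_id, first_concept_id + n),
        "concept_name": source[name_col].to_numpy(),
        "domain_id": domain_id,
        "vocabulary_id": "DICOM",
        "concept_class_id": concept_class_id,
        "standard_concept": None,   # left NULL, as in the layout described above
        "concept_code": source[code_col].to_numpy(),
        "valid_start_date": 19930101,
        "valid_end_date": 20991231,
        "invalid_reason": None,
    })
    return table[CONCEPT_COLUMNS]


# Toy input with two attributes taken from the Part 6 listing.
toy_attributes = pd.DataFrame({
    "Name": ["Study Date", "Modality"],
    "Tag_cleaned": ["00080020", "00080060"],
})
demo_table = build_concept_table(
    toy_attributes, "Name", "Tag_cleaned", "DICOM Attributes", 2128000001
)
print(demo_table)
```

Passing `first_concept_id` explicitly keeps the ID ranges of the different concept groups visible at the call site.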


In [None]:
# DICOM Value Sets
# concept_id: 2128000000 + sequential number in range of 6000-999999
# concept_name: display
# domain_id: Candidates - 'Measurement', 'Meas Value', 'Meas/Procedure', 'Type Concept', 'Condition', 'Observation'
# vocabulary_id: 'DICOM'
# concept_class_id: 'DICOM Value Sets'
# standard_concept: NULL
# concept_code: code
# valid_start_date: 19930101
# valid_end_date: 20991231
# invalid_reason: NULL

In [45]:
# part 16 shape (this is only CID portions)
valuesets.shape

(26825, 8)

In [10]:
valuesets_dicom = valuesets[valuesets['system']=='http://dicom.nema.org/resources/ontology/DCM']
valuesets_dicom #5223

Unnamed: 0,code,display,system,id,version,status,description,cid
31,110504,Patient died,http://dicom.nema.org/resources/ontology/DCM,dicom-cid-9301-ModalityPPSDiscontinuationReason,20140419,active,Transitive closure of CID 9301 ModalityPPSDisc...,9301
32,110515,Patient condition prevented continuing,http://dicom.nema.org/resources/ontology/DCM,dicom-cid-9301-ModalityPPSDiscontinuationReason,20140419,active,Transitive closure of CID 9301 ModalityPPSDisc...,9301
33,110503,Patient allergic to media/contrast,http://dicom.nema.org/resources/ontology/DCM,dicom-cid-9301-ModalityPPSDiscontinuationReason,20140419,active,Transitive closure of CID 9301 ModalityPPSDisc...,9301
34,110514,Incorrect worklist entry selected,http://dicom.nema.org/resources/ontology/DCM,dicom-cid-9301-ModalityPPSDiscontinuationReason,20140419,active,Transitive closure of CID 9301 ModalityPPSDisc...,9301
35,110502,Incorrect procedure ordered,http://dicom.nema.org/resources/ontology/DCM,dicom-cid-9301-ModalityPPSDiscontinuationReason,20140419,active,Transitive closure of CID 9301 ModalityPPSDisc...,9301
...,...,...,...,...,...,...,...,...
26820,128129,Plane through Posterior Extent,http://dicom.nema.org/resources/ontology/DCM,dicom-cid-1010-ReferenceGeometryPlane,20160905,active,Transitive closure of CID 1010 ReferenceGeomet...,1010
26821,128128,Plane through Anterior Extent,http://dicom.nema.org/resources/ontology/DCM,dicom-cid-1010-ReferenceGeometryPlane,20160905,active,Transitive closure of CID 1010 ReferenceGeomet...,1010
26822,128130,Plane through Center,http://dicom.nema.org/resources/ontology/DCM,dicom-cid-1010-ReferenceGeometryPlane,20160905,active,Transitive closure of CID 1010 ReferenceGeomet...,1010
26823,128121,Plane through Inferior Extent,http://dicom.nema.org/resources/ontology/DCM,dicom-cid-1010-ReferenceGeometryPlane,20160905,active,Transitive closure of CID 1010 ReferenceGeomet...,1010


In [48]:
valuesets_dicom_agg = valuesets_dicom.groupby(['code', 'display', 'system']).agg(counts = ('id', 'count')).reset_index()
valuesets_dicom_agg[valuesets_dicom_agg['counts']>1].head(20)

Unnamed: 0,code,display,system,counts
15,109016,A wave peak pressure,http://dicom.nema.org/resources/ontology/DCM,2
27,109034,V wave peak pressure,http://dicom.nema.org/resources/ontology/DCM,2
48,109071,Indicator mean transit time,http://dicom.nema.org/resources/ontology/DCM,18
49,109072,Tau,http://dicom.nema.org/resources/ontology/DCM,18
55,109091,Cardiac Stress State,http://dicom.nema.org/resources/ontology/DCM,2
79,109134,Prior to voiding,http://dicom.nema.org/resources/ontology/DCM,2
80,109135,Post voiding,http://dicom.nema.org/resources/ontology/DCM,2
158,109843,TG18-UNL10 Pattern,http://dicom.nema.org/resources/ontology/DCM,2
159,109844,TG18-UNL80 Pattern,http://dicom.nema.org/resources/ontology/DCM,2
232,110002,Quality Control,http://dicom.nema.org/resources/ontology/DCM,4


In [49]:
# Number of Part 16 elements where the same code is used in multiple CIDs
valuesets_dicom_agg[valuesets_dicom_agg['counts']>1].shape

(1064, 4)

In [50]:
# Number of Part 16 elements after deduplication
valuesets_unique = valuesets_dicom[["code", "display", "system"]].drop_duplicates().reset_index(drop=True)
valuesets_unique.shape #3295

(3295, 3)

In [16]:
valuesets_dicom_agg = valuesets_unique.groupby('code').agg(counts=('code', 'count')).reset_index()
print(valuesets_dicom_agg[valuesets_dicom_agg['counts']>1]['code'].nunique())
dicom_code_duplicates = valuesets_dicom_agg[valuesets_dicom_agg['counts']>1]['code'].unique()

14


In [17]:
valuesets_dicom[valuesets_dicom['code'].isin(dicom_code_duplicates)][['code', 'display', 'version']].drop_duplicates().sort_values('code')

Unnamed: 0,code,display,version
22470,110828,Flow velocity,20030327
3600,110828,Flow velocity,20191108
16938,110828,Flow Velocity,20200920
14758,110828,Flow Velocity,20191108
11846,110828,Flow Velocity,20141110
20890,111101,Image quality,20030108
19749,111101,Image Quality,20020904
3865,111101,Image quality,20220922
8974,111101,Image Quality,20050110
604,111209,Positioning,20020904
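
The listing above shows codes such as 110828 and 111101 whose display names differ only in capitalization across versions. The following sketch shows one way such case-only variants could be detected automatically; it runs on a toy frame built from the values shown above, and the column names `display_key`, `spellings`, and `distinct_keys` are hypothetical.

```python
import pandas as pd

# Toy data copied from the duplicate listing above, plus one unambiguous code.
toy = pd.DataFrame({
    "code": ["110828", "110828", "111101", "111101", "110504"],
    "display": ["Flow velocity", "Flow Velocity", "Image quality",
                "Image Quality", "Patient died"],
})

unique_pairs = toy.drop_duplicates()
# Case-insensitive key: spellings that differ only in case share the same key.
unique_pairs = unique_pairs.assign(display_key=unique_pairs["display"].str.casefold())

variants = unique_pairs.groupby("code").agg(
    spellings=("display", "nunique"),
    distinct_keys=("display_key", "nunique"),
).reset_index()

# More than one spelling but a single case-insensitive key: case-only duplicates.
case_only = variants[(variants["spellings"] > 1) & (variants["distinct_keys"] == 1)]
print(case_only)
```

Codes with more than one `distinct_keys` value would differ in wording, not just case, and still need manual review.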


In [11]:
# display names to drop later: variants of the same code that differ only in capitalization or version
delete_duplicates = ['Flow velocity', '3D Manufacturing Modeling System', 'Laser Surface Scan', 'left ventricle apical anterolateral segment', 'RT Prescription Result', 'Preliminary report',
                     'Analysis or measurements for current procedure', 'Source Image for Image Processing Operation', 'No filter', 'Positioning', 'Image Quality']

In [17]:
import numpy as np
import pandas as pd

sequential_numbers = range(6000, len(valuesets_unique)+6000)
valuesets_unique.loc[:, 'concept_id'] = [2128000000 + num for num in sequential_numbers]

columns = ['concept_id', 'concept_name', 'domain_id', 'vocabulary_id', 'concept_class_id', 'standard_concept', 'concept_code', 'valid_start_date', 'valid_end_date','invalid_reason']
valuesets_table_omop = pd.DataFrame(columns = columns)

valuesets_table_omop['concept_id'] = valuesets_unique['concept_id']
valuesets_table_omop['concept_name'] = valuesets_unique['display']
valuesets_table_omop['domain_id'] = 'Measurement'
valuesets_table_omop['vocabulary_id'] = 'DICOM'
valuesets_table_omop['concept_class_id'] = 'DICOM Value Sets'
valuesets_table_omop['concept_code'] = valuesets_unique['code']
valuesets_table_omop['valid_start_date'] = 19930101
valuesets_table_omop['valid_end_date'] = 20991231

valuesets_table_omop = valuesets_table_omop.reset_index(drop=True)

In [18]:
valuesets_table_omop

Unnamed: 0,concept_id,concept_name,domain_id,vocabulary_id,concept_class_id,standard_concept,concept_code,valid_start_date,valid_end_date,invalid_reason
0,2128006000,Patient died,Measurement,DICOM,DICOM Value Sets,,110504,19930101,20991231,
1,2128006001,Patient condition prevented continuing,Measurement,DICOM,DICOM Value Sets,,110515,19930101,20991231,
2,2128006002,Patient allergic to media/contrast,Measurement,DICOM,DICOM Value Sets,,110503,19930101,20991231,
3,2128006003,Incorrect worklist entry selected,Measurement,DICOM,DICOM Value Sets,,110514,19930101,20991231,
4,2128006004,Incorrect procedure ordered,Measurement,DICOM,DICOM Value Sets,,110502,19930101,20991231,
...,...,...,...,...,...,...,...,...,...,...
3290,2128009290,Plane through Posterior Extent,Measurement,DICOM,DICOM Value Sets,,128129,19930101,20991231,
3291,2128009291,Plane through Anterior Extent,Measurement,DICOM,DICOM Value Sets,,128128,19930101,20991231,
3292,2128009292,Plane through Center,Measurement,DICOM,DICOM Value Sets,,128130,19930101,20991231,
3293,2128009293,Plane through Inferior Extent,Measurement,DICOM,DICOM Value Sets,,128121,19930101,20991231,


### Code String values: DICOM Defined Terms and Enumerated Values

In [27]:
import pandas as pd

modality = pd.read_csv('./files/DICOM Standard/part3_modality.csv')
patient_position = pd.read_csv('./files/DICOM Standard/part3_patient_position.csv')
lossy_image_comp_methods = pd.read_csv('./files/DICOM Standard/part3_lossy_image_comp_methods.csv')
#other_values = pd.read_csv('./files/DICOM Standard/part3_other_values.csv')
body_part = pd.read_pickle('./files/DICOM Standard/part16_body_part_examined.pkl')

In [28]:
#other_values['concept_id'] = pd.to_numeric(other_values['concept_id'], errors='coerce').astype('Int64')
modality['concept_id'] = pd.to_numeric(modality['concept_id'], errors='coerce').astype('Int64')

In [29]:
combined_values = pd.concat([modality, patient_position, lossy_image_comp_methods])
combined_values = combined_values.rename(columns={'concept_id': 'syn_concept_id'})
combined_values

Unnamed: 0,code,description,syn_concept_id
0,ANN,Annotation,
1,AR,Autorefraction,
2,ASMT,Content Assessment Results,
3,AU,Audio,
4,BDUS,Bone Densitometry (ultrasound),
...,...,...,...
3,ISO_15444_15,High-Throughput JPEG 2000 Irreversible Compres...,
4,ISO_18181_1,JPEG XL Image Coding System - Part 1 Core Codi...,
5,ISO_13818_2,MPEG2 Compression[ISO/IEC 13818-2],
6,ISO_14496_10,MPEG-4 AVC/H.264 Compression[ISO/IEC 14496-10],


In [30]:
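# Note: `index` already points past the last value-set concept_id, and the range below
# starts at 6000 again, so the new IDs begin at index + 6000 (a gap of 6000 unused IDs).
index = valuesets_table_omop['concept_id'].max() + 1
sequential_numbers = range(6000, len(combined_values)+6000)
combined_values.loc[:, 'concept_id'] = [index + num for num in sequential_numbers]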

columns = ['concept_id', 'concept_name', 'domain_id', 'vocabulary_id', 'concept_class_id', 'standard_concept', 'concept_code', 'valid_start_date', 'valid_end_date','invalid_reason']
cs_value_table_omop = pd.DataFrame(columns = columns)

cs_value_table_omop['concept_id'] = combined_values['concept_id']
cs_value_table_omop['concept_name'] = combined_values['description']
cs_value_table_omop['domain_id'] = 'Measurement'
cs_value_table_omop['vocabulary_id'] = 'DICOM'
cs_value_table_omop['concept_class_id'] = 'DICOM Value Sets'
cs_value_table_omop['concept_code'] = combined_values['code']
cs_value_table_omop['valid_start_date'] = 19930101
cs_value_table_omop['valid_end_date'] = 20991231

cs_value_table_omop = cs_value_table_omop.reset_index(drop=True)

In [31]:
cs_value_table_omop

Unnamed: 0,concept_id,concept_name,domain_id,vocabulary_id,concept_class_id,standard_concept,concept_code,valid_start_date,valid_end_date,invalid_reason
0,2128015295,Annotation,Measurement,DICOM,DICOM Value Sets,,ANN,19930101,20991231,
1,2128015296,Autorefraction,Measurement,DICOM,DICOM Value Sets,,AR,19930101,20991231,
2,2128015297,Content Assessment Results,Measurement,DICOM,DICOM Value Sets,,ASMT,19930101,20991231,
3,2128015298,Audio,Measurement,DICOM,DICOM Value Sets,,AU,19930101,20991231,
4,2128015299,Bone Densitometry (ultrasound),Measurement,DICOM,DICOM Value Sets,,BDUS,19930101,20991231,
...,...,...,...,...,...,...,...,...,...,...
98,2128015393,High-Throughput JPEG 2000 Irreversible Compres...,Measurement,DICOM,DICOM Value Sets,,ISO_15444_15,19930101,20991231,
99,2128015394,JPEG XL Image Coding System - Part 1 Core Codi...,Measurement,DICOM,DICOM Value Sets,,ISO_18181_1,19930101,20991231,
100,2128015395,MPEG2 Compression[ISO/IEC 13818-2],Measurement,DICOM,DICOM Value Sets,,ISO_13818_2,19930101,20991231,
101,2128015396,MPEG-4 AVC/H.264 Compression[ISO/IEC 14496-10],Measurement,DICOM,DICOM Value Sets,,ISO_14496_10,19930101,20991231,


In [32]:
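# concept_ids of Code String values whose code already exists among the DCM value sets
# (this reuses, and overwrites, the name `dicom_code_duplicates` from the earlier cell)
dicom_code_duplicates = cs_value_table_omop[cs_value_table_omop['concept_code'].isin(valuesets_table_omop['concept_code'])]['concept_id']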
len(dicom_code_duplicates)

74

### For Concept_relationship

In [None]:
#other_values.head()

In [None]:
# # Pivot (melt) the DataFrame to go from wide to long format
# other_values_long = pd.melt(other_values, id_vars=['code', 'description', 'concept_id'], 
#                   value_vars=['tag_1', 'tag_2', 'tag_3'], 
#                   var_name='tag_type', value_name='tags')

# # Drop the 'tag_type' column as it is not needed in the final format
# other_values_long = other_values_long.drop('tag_type', axis=1)
# other_values_long = other_values_long[~other_values_long['tags'].isna()]
# other_values_long['Tag'] = other_values_long['tags'].str.replace(r'[(),]', '', regex = True)
# other_values_long = other_values_long.rename(columns={'concept_id': 'syn_concept_id'})
# other_values_long.head()

In [None]:
# other_values_long = other_values_long.merge(cs_value_table_omop[['concept_id', 'concept_code']], how = 'left', right_on = 'concept_code', left_on= 'code')
# other_values_long = other_values_long.drop(columns=['tags', 'concept_code'])
# other_values_long = other_values_long.rename(columns={'concept_id': 'concept_id_2'}) #value sets' concept ID is concept_id_2 for Concept_relationship table
# other_values_long.head()

In [None]:
# other_values_long = other_values_long.merge(attribute_table_omop[['concept_id', 'concept_code']], how='left', left_on = 'Tag', right_on = 'concept_code')
# other_values_long = other_values_long.drop(columns=['concept_code'])
# other_values_long = other_values_long.rename(columns={'concept_id': 'concept_id_1'}) #attributes' concept ID is concept_id_1 for Concept_relationship table
# other_values_long

In [None]:
#other_values_long.merge(attributes, how='left', left_on = 'Tag', right_on = 'Tag_cleaned')['Name'].unique()

In [None]:
# other_values_maps_to_value = other_values_long[['concept_id_1', 'concept_id_2']].copy()
# other_values_maps_to_value['relationship_id'] = 'Maps to value'
# other_values_maps_to = other_values_long[~other_values_long['syn_concept_id'].isna()][['concept_id_2', 'syn_concept_id']].reset_index(drop=True)
# other_values_maps_to['relationship_id'] = 'Maps to'
# other_values_maps_to = other_values_maps_to.drop_duplicates()
# other_values_maps_to = other_values_maps_to.rename(columns = {'concept_id_2': 'concept_id_1', 'syn_concept_id': 'concept_id_2'})
# other_values_relationship = pd.concat([other_values_maps_to_value, other_values_maps_to])
# other_values_relationship['concept_id_1'] = other_values_relationship['concept_id_1'].astype('Int64')
# other_values_relationship.head()

In [None]:
#other_values_relationship['relationship_id'].value_counts()

In [54]:
body_part = body_part[body_part['Body Part Examined']!=""].copy()

In [55]:
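# Note: as above, the range starts at 6000 on top of `index`, so the body part IDs
# begin at index + 6000 and leave a gap of 6000 unused IDs.
index = cs_value_table_omop['concept_id'].max() + 1
sequential_numbers = range(6000, len(body_part)+6000)
body_part.loc[:, 'concept_id'] = [index + num for num in sequential_numbers]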

columns = ['concept_id', 'concept_name', 'domain_id', 'vocabulary_id', 'concept_class_id', 'standard_concept', 'concept_code', 'valid_start_date', 'valid_end_date','invalid_reason']
body_part_table_omop = pd.DataFrame(columns = columns)

body_part_table_omop['concept_id'] = body_part['concept_id']
body_part_table_omop['concept_name'] = body_part['Code Meaning']
body_part_table_omop['domain_id'] = 'Measurement'
body_part_table_omop['vocabulary_id'] = 'DICOM'
body_part_table_omop['concept_class_id'] = 'DICOM Value Sets'
body_part_table_omop['concept_code'] = body_part['Body Part Examined']
body_part_table_omop['valid_start_date'] = 19930101
body_part_table_omop['valid_end_date'] = 20991231

body_part_table_omop = body_part_table_omop.reset_index(drop=True)
body_part_table_omop

Unnamed: 0,concept_id,concept_name,domain_id,vocabulary_id,concept_class_id,standard_concept,concept_code,valid_start_date,valid_end_date,invalid_reason
0,2128021398,Abdomen,Measurement,DICOM,DICOM Value Sets,,ABDOMEN,19930101,20991231,
1,2128021399,Abdomen and Pelvis,Measurement,DICOM,DICOM Value Sets,,ABDOMENPELVIS,19930101,20991231,
2,2128021400,Abdominal aorta,Measurement,DICOM,DICOM Value Sets,,ABDOMINALAORTA,19930101,20991231,
3,2128021401,Acromioclavicular joint,Measurement,DICOM,DICOM Value Sets,,ACJOINT,19930101,20991231,
4,2128021402,Adrenal gland,Measurement,DICOM,DICOM Value Sets,,ADRENAL,19930101,20991231,
...,...,...,...,...,...,...,...,...,...,...
313,2128021711,Vein,Measurement,DICOM,DICOM Value Sets,,VEIN,19930101,20991231,
314,2128021712,Vertebral artery,Measurement,DICOM,DICOM Value Sets,,VERTEBRALA,19930101,20991231,
315,2128021713,Vulva,Measurement,DICOM,DICOM Value Sets,,VULVA,19930101,20991231,
316,2128021714,Wrist joint,Measurement,DICOM,DICOM Value Sets,,WRIST,19930101,20991231,


### Combine DICOM concepts

In [56]:
print(attribute_table_omop.shape, valuesets_table_omop.shape, cs_value_table_omop.shape, body_part_table_omop.shape)

(5190, 10) (3295, 10) (103, 10) (318, 10)


In [57]:
omop_table_staging = pd.concat([attribute_table_omop, valuesets_table_omop, cs_value_table_omop, body_part_table_omop], ignore_index=True)
omop_table_staging['standard_concept'] = ''
print(omop_table_staging.shape)
omop_table_staging = omop_table_staging[~omop_table_staging['concept_id'].isin([2128009022, 2128008704, 2128007273])] #same concept_code, different names (DCM Value Sets)

(8906, 10)


In [58]:
drop_concept_ids = omop_table_staging[(omop_table_staging['concept_id'].isin(dicom_code_duplicates)) | 
                                      (omop_table_staging['concept_name'].isin(delete_duplicates)) |
                                      (omop_table_staging['concept_name'].isna())]['concept_id']
print(len(drop_concept_ids))
omop_table_staging = omop_table_staging[~omop_table_staging['concept_id'].isin(drop_concept_ids)]
omop_table_staging.to_csv('./files/OMOP CDM Staging/omop_table_staging_v3.csv', index=False)
omop_table_staging

90


Unnamed: 0,concept_id,concept_name,domain_id,vocabulary_id,concept_class_id,standard_concept,concept_code,valid_start_date,valid_end_date,invalid_reason
0,2128000001,Length to End,Measurement,DICOM,DICOM Attributes,,00080001,19930101,20991231,
1,2128000002,Specific Character Set,Measurement,DICOM,DICOM Attributes,,00080005,19930101,20991231,
2,2128000003,Language Code Sequence,Measurement,DICOM,DICOM Attributes,,00080006,19930101,20991231,
3,2128000004,Image Type,Measurement,DICOM,DICOM Attributes,,00080008,19930101,20991231,
4,2128000005,Recognition Code,Measurement,DICOM,DICOM Attributes,,00080010,19930101,20991231,
...,...,...,...,...,...,...,...,...,...,...
8901,2128021711,Vein,Measurement,DICOM,DICOM Value Sets,,VEIN,19930101,20991231,
8902,2128021712,Vertebral artery,Measurement,DICOM,DICOM Value Sets,,VERTEBRALA,19930101,20991231,
8903,2128021713,Vulva,Measurement,DICOM,DICOM Value Sets,,VULVA,19930101,20991231,
8904,2128021714,Wrist joint,Measurement,DICOM,DICOM Value Sets,,WRIST,19930101,20991231,


In [59]:
omop_table_staging.groupby('concept_class_id').count()

Unnamed: 0_level_0,concept_id,concept_name,domain_id,vocabulary_id,standard_concept,concept_code,valid_start_date,valid_end_date,invalid_reason
concept_class_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1
DICOM Attributes,5183,5183,5183,5183,5183,5183,5183,5183,0
DICOM Value Sets,3630,3630,3630,3630,3630,3630,3630,3630,0
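
Before loading a staging table like this into the database, a few sanity checks can catch problems early. The sketch below is an assumption about which checks are useful, not part of the original workflow: it tests for unique `concept_id` values, IDs above 2,000,000,000 (the OHDSI convention for locally defined concepts), missing names, and repeated `concept_code` values within a vocabulary. The function name `check_concept_staging` and the toy frames are hypothetical.

```python
import pandas as pd


def check_concept_staging(table: pd.DataFrame) -> list:
    """Return a list of human-readable problems found in a CONCEPT staging table."""
    problems = []
    if table["concept_id"].duplicated().any():
        problems.append("duplicate concept_id")
    if (table["concept_id"] <= 2_000_000_000).any():
        problems.append("concept_id not above 2,000,000,000")
    if table["concept_name"].isna().any():
        problems.append("missing concept_name")
    if table.duplicated(subset=["vocabulary_id", "concept_code"]).any():
        problems.append("duplicate (vocabulary_id, concept_code)")
    return problems


# Toy tables: one clean, one with a repeated ID and a missing name.
toy_ok = pd.DataFrame({
    "concept_id": [2128000001, 2128006000],
    "concept_name": ["Length to End", "Patient died"],
    "vocabulary_id": ["DICOM", "DICOM"],
    "concept_code": ["00080001", "110504"],
})
toy_bad = pd.DataFrame({
    "concept_id": [2128000001, 2128000001],
    "concept_name": ["Length to End", None],
    "vocabulary_id": ["DICOM", "DICOM"],
    "concept_code": ["00080001", "00080202"],
})

ok_problems = check_concept_staging(toy_ok)
bad_problems = check_concept_staging(toy_bad)
print(ok_problems, bad_problems)
```

In the notebook this would be called as `check_concept_staging(omop_table_staging)`; whether every check should pass for the real data is for the reader to decide.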


## 2. Set Concept_relationship from Part 3

### Attribute & Value Sets with CIDs: Attributes = concept_id_1, Value Sets = concept_id_2
This relationship includes mapping to standard coding systems, such as SNOMED and LOINC.

In [61]:
part3_cid = part3[part3['CID']!=''].merge(attributes[['Tag_cleaned', 'concept_id']], how = 'inner', left_on = 'Tag', right_on = 'Tag_cleaned')
part3_cid['cid'] = pd.to_numeric(part3_cid['CID'], errors='coerce').astype('Int64')
part3_cid = part3_cid.rename(columns={'concept_id':'concept_id_1'})
part3_cid.head()

Unnamed: 0,xml_id,iod,IE,Module,Reference,Usage,Usage_code,Reference_adjusted,Attribute Name,Tag,Type,Attribute Description,CID,SOP Class UID,Tag_cleaned,concept_id_1,cid
0,table_A.2-1,Computed Radiography Image IOD Modules,Patient,Patient,sect_C.7.1.1,M,M,sect_C.7.1.1,Ethnic Group Code Sequence,102161,3,{},6099,1.2.840.10008.5.1.4.1.1.1,102161,2128000323,6099
1,table_A.2-1,Computed Radiography Image IOD Modules,Patient,Patient,sect_C.7.1.1,M,M,sect_C.7.1.1,Patient Species Code Sequence,102202,1C,{},7454,1.2.840.10008.5.1.4.1.1.1,102202,2128000331,7454
2,table_A.2-1,Computed Radiography Image IOD Modules,Patient,Patient,sect_C.7.1.1,M,M,sect_C.7.1.1,Patient Breed Code Sequence,102293,2C,{},7480,1.2.840.10008.5.1.4.1.1.1,102293,2128000335,7480
3,table_A.2-1,Computed Radiography Image IOD Modules,Patient,Patient,sect_C.7.1.1,M,M,sect_C.7.1.1,De-identification Method Code Sequence,120064,1C,{},7050,1.2.840.10008.5.1.4.1.1.1,120064,2128000365,7050
4,table_A.2-1,Computed Radiography Image IOD Modules,Study,General Study,sect_C.7.2.1,M,M,sect_C.7.2.1,Requesting Service Code Sequence,321034,3,{},7030,1.2.840.10008.5.1.4.1.1.1,321034,2128002343,7030


In [62]:
part3_cid_val = part3_cid[['iod', 'Module', 'Attribute Name', 'Tag', 'cid', 'concept_id_1']].merge(valuesets[['code', 'cid', 'system']], how = 'left', on = 'cid')
part3_cid_val = part3_cid_val.rename(columns={'code': 'concept_code'})
part3_cid_val.head()

Unnamed: 0,iod,Module,Attribute Name,Tag,cid,concept_id_1,concept_code,system
0,Computed Radiography Image IOD Modules,Patient,Ethnic Group Code Sequence,102161,6099,2128000323,C41219,http://ncit.nci.nih.gov
1,Computed Radiography Image IOD Modules,Patient,Ethnic Group Code Sequence,102161,6099,2128000323,413490006,http://snomed.info/sct
2,Computed Radiography Image IOD Modules,Patient,Ethnic Group Code Sequence,102161,6099,2128000323,413581001,http://snomed.info/sct
3,Computed Radiography Image IOD Modules,Patient,Ethnic Group Code Sequence,102161,6099,2128000323,413773004,http://snomed.info/sct
4,Computed Radiography Image IOD Modules,Patient,Ethnic Group Code Sequence,102161,6099,2128000323,413600007,http://snomed.info/sct


In [63]:
part3_tag_agg = part3.groupby(['Tag', 'Attribute Name']).agg(
    unique_iod_count = ('iod', 'nunique'),
    unique_cid_count = ('CID', 'nunique')
    ).reset_index()
print(part3_tag_agg.shape, part3_tag_agg[part3_tag_agg['unique_cid_count']>1].shape)
part3_tag_agg[part3_tag_agg['unique_cid_count']>1]

(1594, 4) (14, 4)


Unnamed: 0,Tag,Attribute Name,unique_iod_count,unique_cid_count
98,00082218,Anatomic Region Sequence,15,3
99,00082228,Primary Anatomic Structure Sequence,5,2
108,00089215,Derivation Code Sequence,71,2
773,00220015,Acquisition Device Type Code Sequence,5,2
774,00220016,Illumination Type Code Sequence,5,2
806,00221423,Acquisition Method Algorithm Sequence,2,2
823,00221612,Derivation Algorithm Sequence,4,2
994,00400275,Request Attributes Sequence,139,3
1018,00409096,Real World Value Mapping Sequence,43,2
1021,0040A043,Concept Name Code Sequence,29,2


In [64]:
part3_cid_val.shape

(609019, 8)

In [65]:
part3_cid_val['system'].value_counts()

system
http://snomed.info/sct                                                 590255
http://dicom.nema.org/resources/ontology/DCM                            12195
http://sig.biostr.washington.edu/projects/fm/AboutFM.html                2928
http://www.nlm.nih.gov/research/umls                                     1170
http://braininfo.rprc.washington.edu/aboutBrainInfo.aspx#NeuroNames       748
doi:10.1016/S0735-1097(99)00126-6                                         544
http://www.itis.gov                                                       513
http://ncit.nci.nih.gov                                                   303
http://www.radlex.org                                                      80
http://unitsofmeasure.org                                                  61
http://www.nlm.nih.gov/research/umls/rxnorm                                40
http://loinc.org                                                           10
Name: count, dtype: int64

In [66]:
mapping = {
    'http://snomed.info/sct': 'SNOMED',
    'http://dicom.nema.org/resources/ontology/DCM': 'DICOM',
    'http://unitsofmeasure.org': 'UCUM',
    'http://loinc.org': 'LOINC',
}

part3_cid_val['vocabulary_id'] = part3_cid_val['system'].map(mapping)
part3_cid_val.head()

Unnamed: 0,iod,Module,Attribute Name,Tag,cid,concept_id_1,concept_code,system,vocabulary_id
0,Computed Radiography Image IOD Modules,Patient,Ethnic Group Code Sequence,102161,6099,2128000323,C41219,http://ncit.nci.nih.gov,
1,Computed Radiography Image IOD Modules,Patient,Ethnic Group Code Sequence,102161,6099,2128000323,413490006,http://snomed.info/sct,SNOMED
2,Computed Radiography Image IOD Modules,Patient,Ethnic Group Code Sequence,102161,6099,2128000323,413581001,http://snomed.info/sct,SNOMED
3,Computed Radiography Image IOD Modules,Patient,Ethnic Group Code Sequence,102161,6099,2128000323,413773004,http://snomed.info/sct,SNOMED
4,Computed Radiography Image IOD Modules,Patient,Ethnic Group Code Sequence,102161,6099,2128000323,413600007,http://snomed.info/sct,SNOMED
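
Only four coding systems are mapped to an OMOP `vocabulary_id`, so rows from every other system get a missing value, as the first row above shows. The following sketch lists which systems are left unmapped and how many rows each contributes; it uses a toy frame (the UCUM code `mm` is an illustrative value, not taken from the data) and the hypothetical name `system_to_vocabulary` for the mapping.

```python
import pandas as pd

system_to_vocabulary = {
    "http://snomed.info/sct": "SNOMED",
    "http://dicom.nema.org/resources/ontology/DCM": "DICOM",
    "http://unitsofmeasure.org": "UCUM",
    "http://loinc.org": "LOINC",
}

# Toy rows: three mapped systems and one unmapped system (NCIt).
toy = pd.DataFrame({
    "concept_code": ["C41219", "413490006", "110504", "mm"],
    "system": [
        "http://ncit.nci.nih.gov",
        "http://snomed.info/sct",
        "http://dicom.nema.org/resources/ontology/DCM",
        "http://unitsofmeasure.org",
    ],
})
toy["vocabulary_id"] = toy["system"].map(system_to_vocabulary)

# Systems with no OMOP vocabulary assigned, with their row counts.
unmapped = toy.loc[toy["vocabulary_id"].isna(), "system"].value_counts()
print(unmapped)
```

Applied to `part3_cid_val`, the same two lines make it explicit which systems cannot be matched to `concept_id_2` in the steps that follow.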


In [68]:
# import concept table from the SQL database
# *** This was run after uploading the DICOM custom concepts ***

import psycopg2

# Connect to your database
conn = psycopg2.connect(
    database="",
    user="",
    password="",
    host="",
    port="",
    connect_timeout = 6000
)
cursor = conn.cursor()

sql = "select * from dbo.concept"
concept_df = pd.read_sql_query(sql, conn)
concept_df.head()

# close the cursor and connection
cursor.close()
conn.close()
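
Reading the whole `concept` table can be slow and memory-hungry, and only four columns and four vocabularies are used in the merge that follows. As a hedged alternative, the sketch below loads just those rows with a parameterized query. The function `load_concepts` and its arguments are hypothetical; the demo uses an in-memory SQLite database with toy rows so it can run without a server. With psycopg2 the placeholder is `%s` (the default here), with `sqlite3` it is `?`.

```python
import sqlite3

import pandas as pd


def load_concepts(conn, vocabulary_ids, table="dbo.concept", placeholder="%s"):
    """Load only the columns and vocabularies needed for the concept merge.

    `table` is interpolated into the SQL text, so pass trusted constants only.
    """
    marks = ", ".join([placeholder] * len(vocabulary_ids))
    sql = (
        "SELECT concept_id, concept_name, concept_code, vocabulary_id "
        f"FROM {table} WHERE vocabulary_id IN ({marks})"
    )
    return pd.read_sql_query(sql, conn, params=list(vocabulary_ids))


# Demo against an in-memory SQLite database with toy rows.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(
    "CREATE TABLE concept (concept_id INTEGER, concept_name TEXT, "
    "concept_code TEXT, vocabulary_id TEXT)"
)
demo_conn.executemany(
    "INSERT INTO concept VALUES (?, ?, ?, ?)",
    [
        (4184966, "American Indian or Alaska native", "413490006", "SNOMED"),
        (2128006000, "Patient died", "110504", "DICOM"),
        (1, "Toy row from another vocabulary", "X", "Other"),
    ],
)
try:
    demo = load_concepts(demo_conn, ["SNOMED", "DICOM"], table="concept", placeholder="?")
finally:
    demo_conn.close()  # close the connection even if the query fails
print(demo)
```

For the PostgreSQL connection above, the call would look like `load_concepts(conn, ["SNOMED", "DICOM", "UCUM", "LOINC"])`, assuming the table is still `dbo.concept`.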



In [69]:
part3_cid_val_concept = part3_cid_val.merge(concept_df[['concept_id', 'concept_name', 'concept_code', 'vocabulary_id']], how = 'left', on = ['concept_code', 'vocabulary_id'])
part3_cid_val_concept = part3_cid_val_concept.rename(columns={'concept_id': 'concept_id_2'})
part3_cid_val_concept['concept_id_2'] = part3_cid_val_concept['concept_id_2'].astype('Int64')

In [70]:
part3_cid_val_concept.shape

(609019, 11)
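
The row count is unchanged after the merge (609019 before and after), which indicates that no `(concept_code, vocabulary_id)` key matched more than one concept. pandas can enforce this directly with the `validate` argument of `merge`. The sketch below demonstrates it on toy frames; the concept IDs are taken from the output in this notebook, and the duplicated frame is a made-up failure case.

```python
import pandas as pd

codes = pd.DataFrame({
    "concept_code": ["413490006", "413581001", "C41219"],
    "vocabulary_id": ["SNOMED", "SNOMED", None],  # NCIt code has no vocabulary_id
})
concepts = pd.DataFrame({
    "concept_id": [4184966, 4184984],
    "concept_code": ["413490006", "413581001"],
    "vocabulary_id": ["SNOMED", "SNOMED"],
})

# many_to_one: raises MergeError if the right-hand keys are not unique.
merged = codes.merge(
    concepts, how="left", on=["concept_code", "vocabulary_id"], validate="many_to_one"
)

# Failure case: the same key appears twice on the right-hand side.
concepts_dup = pd.concat([concepts, concepts.iloc[[0]]], ignore_index=True)
raised = False
try:
    codes.merge(
        concepts_dup, how="left", on=["concept_code", "vocabulary_id"],
        validate="many_to_one",
    )
except pd.errors.MergeError:
    raised = True

print(len(merged), raised)
```

With `validate` in place, a silent row multiplication becomes an explicit error instead of something to spot by comparing shapes.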

In [71]:
part3_cid_val_concept.head()

Unnamed: 0,iod,Module,Attribute Name,Tag,cid,concept_id_1,concept_code,system,vocabulary_id,concept_id_2,concept_name
0,Computed Radiography Image IOD Modules,Patient,Ethnic Group Code Sequence,102161,6099,2128000323,C41219,http://ncit.nci.nih.gov,,,
1,Computed Radiography Image IOD Modules,Patient,Ethnic Group Code Sequence,102161,6099,2128000323,413490006,http://snomed.info/sct,SNOMED,4184966.0,American Indian or Alaska native
2,Computed Radiography Image IOD Modules,Patient,Ethnic Group Code Sequence,102161,6099,2128000323,413581001,http://snomed.info/sct,SNOMED,4184984.0,Asian or Pacific islander
3,Computed Radiography Image IOD Modules,Patient,Ethnic Group Code Sequence,102161,6099,2128000323,413773004,http://snomed.info/sct,SNOMED,4185154.0,Caucasian
4,Computed Radiography Image IOD Modules,Patient,Ethnic Group Code Sequence,102161,6099,2128000323,413600007,http://snomed.info/sct,SNOMED,4186705.0,Australian aborigine


In [72]:
part3_val_agg = part3_cid_val_concept.groupby(['concept_id_1', 'concept_code'])['concept_id_2'].nunique().reset_index()
print('No concept_id:', part3_val_agg[part3_val_agg['concept_id_2']==0].shape)
print('Exactly one concept_id:', part3_val_agg[part3_val_agg['concept_id_2']==1].shape)
print('Multiple concept_ids:', part3_val_agg[part3_val_agg['concept_id_2']>1].shape)

No concept_id: (219, 3)
Exactly one concept_id: (7101, 3)
Multiple concept_ids: (0, 3)
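
The `== 0` filter works because `nunique()` ignores missing values by default, so a group whose `concept_id_2` is entirely `<NA>` reports zero distinct IDs. A minimal sketch on a toy frame (`toy` and `agg` are illustrative names):

```python
import pandas as pd

toy = pd.DataFrame({
    "concept_id_1": [1, 1, 1, 2],
    "concept_code": ["a", "a", "b", "c"],
    "concept_id_2": pd.array([10, 10, pd.NA, 20], dtype="Int64"),
})

# Count distinct non-missing concept_id_2 values per (attribute, code) pair.
agg = (
    toy.groupby(["concept_id_1", "concept_code"])["concept_id_2"]
    .nunique()
    .reset_index()
)
print(agg)
```

The pair `(1, "b")` has only a missing ID and therefore lands in the "No concept_id" bucket.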


In [73]:
part3_cid_val_concept[part3_cid_val_concept['concept_id_2'].isna()].groupby('system')['concept_code'].nunique()

system
doi:10.1016/S0735-1097(99)00126-6                                       8
http://braininfo.rprc.washington.edu/aboutBrainInfo.aspx#NeuroNames    11
http://ncit.nci.nih.gov                                                 6
http://sig.biostr.washington.edu/projects/fm/AboutFM.html              43
http://snomed.info/sct                                                 10
http://unitsofmeasure.org                                              21
http://www.itis.gov                                                     3
http://www.nlm.nih.gov/research/umls                                    7
http://www.nlm.nih.gov/research/umls/rxnorm                             1
http://www.radlex.org                                                   2
Name: concept_code, dtype: int64

In [74]:
print(part3_cid_val_concept[~part3_cid_val_concept['concept_id_2'].isna()].shape) #601825
part3_cid_val_concept_omop = part3_cid_val_concept[~part3_cid_val_concept['concept_id_2'].isna()]

(601825, 11)


In [75]:
columns = ['concept_id_1', 'concept_id_2', 'relationship_id', 'valid_start_date', 'valid_end_date']
concept_relationship_staging = pd.DataFrame(columns = columns)

concept_relationship_staging['concept_id_1'] = part3_cid_val_concept_omop['concept_id_1']
concept_relationship_staging['concept_id_2'] = part3_cid_val_concept_omop['concept_id_2']
concept_relationship_staging['relationship_id'] = 'Maps to value'
concept_relationship_staging['valid_start_date'] = 19930101
concept_relationship_staging['valid_end_date'] = 20991231

concept_relationship_staging = concept_relationship_staging.reset_index(drop=True)
concept_relationship_staging.head()

Unnamed: 0,concept_id_1,concept_id_2,relationship_id,valid_start_date,valid_end_date
0,2128000323,4184966,Maps to value,19930101,20991231
1,2128000323,4184984,Maps to value,19930101,20991231
2,2128000323,4185154,Maps to value,19930101,20991231
3,2128000323,4186705,Maps to value,19930101,20991231
4,2128000323,4185920,Maps to value,19930101,20991231
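
OMOP CDM defines `valid_start_date` and `valid_end_date` in `CONCEPT_RELATIONSHIP` as date columns, while the cell above stores them as integers. Whether the upload notebook expects integers is not shown here, so the conversion below is only a sketch of one option, applied to a toy frame named `staging`.

```python
import pandas as pd

# One toy row in the same layout as concept_relationship_staging.
staging = pd.DataFrame({
    "concept_id_1": [2128000323],
    "concept_id_2": [4184966],
    "relationship_id": ["Maps to value"],
    "valid_start_date": [19930101],
    "valid_end_date": [20991231],
})

# Parse YYYYMMDD integers into Python date objects.
for col in ["valid_start_date", "valid_end_date"]:
    staging[col] = pd.to_datetime(staging[col].astype(str), format="%Y%m%d").dt.date

print(staging[["valid_start_date", "valid_end_date"]])
```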


In [76]:
concept_relationship_staging.shape

(601825, 5)

In [77]:
concept_relationship_staging.drop_duplicates().shape

(7101, 5)
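
The drop from 601825 rows to 7101 is presumably because the same attribute and value pair is listed under many IODs and modules, and those columns are not carried into the staging table. The result matches the number of pairs with exactly one concept_id reported earlier. A toy illustration (the frame `rows` and its IOD labels are made up for the example):

```python
import pandas as pd

# The same attribute/value pair appears under two different IODs.
rows = pd.DataFrame({
    "iod": ["CR Image", "CT Image", "MR Image"],
    "concept_id_1": [2128000323, 2128000323, 2128000323],
    "concept_id_2": [4184966, 4184966, 4184984],
})

# Once the IOD column is dropped, repeated pairs collapse to one row each.
pairs = rows[["concept_id_1", "concept_id_2"]]
unique_pairs = pairs.drop_duplicates().reset_index(drop=True)
print(len(pairs), len(unique_pairs))
```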

In [78]:
concept_relationship_staging = concept_relationship_staging.drop_duplicates()
concept_relationship_staging.to_pickle('./files/OMOP CDM Staging/part3_to_part16_relationship_via_CID.pkl')

### Collect Value Sets other than CIDs

In [79]:
body_part.head()

Unnamed: 0,Coding Scheme Designator,Code Value,Code Meaning,Body Part Examined,SNOMED-RT ID (Retired),FMA Code Value,UMLS Concept UniqueID,concept_id
0,SCT,818981001,Abdomen,ABDOMEN,,,,2128021398
1,SCT,818982008,Abdomen and Pelvis,ABDOMENPELVIS,,,,2128021399
2,SCT,7832008,Abdominal aorta,ABDOMINALAORTA,T-42500,,,2128021400
3,SCT,85856004,Acromioclavicular joint,ACJOINT,T-15420,,,2128021401
4,SCT,23451007,Adrenal gland,ADRENAL,T-B3000,,,2128021402


In [80]:
body_part['Coding Scheme Designator'].value_counts()

Coding Scheme Designator
SCT    313
         5
Name: count, dtype: int64

In [81]:
body_part = body_part.rename(columns={'concept_id': 'concept_id_1'})

In [82]:
body_part_maps_to = body_part[body_part['Coding Scheme Designator']=="SCT"].merge(
    concept_df[concept_df['vocabulary_id']=="SNOMED"][['concept_code', 'concept_id']],
    how = 'left', left_on = 'Code Value', right_on = 'concept_code')
body_part_maps_to['concept_id'] = body_part_maps_to['concept_id'].astype('Int64')
body_part_maps_to = body_part_maps_to.rename(columns={'concept_id':'concept_id_2'})
body_part_maps_to['relationship_id'] = 'Maps to'
body_part_maps_to = body_part_maps_to[~body_part_maps_to['concept_id_2'].isna()].copy().reset_index(drop=True)
body_part_maps_to

Unnamed: 0,Coding Scheme Designator,Code Value,Code Meaning,Body Part Examined,SNOMED-RT ID (Retired),FMA Code Value,UMLS Concept UniqueID,concept_id_1,concept_code,concept_id_2,relationship_id
0,SCT,818981001,Abdomen,ABDOMEN,,,,2128021398,818981001,37303869,Maps to
1,SCT,818982008,Abdomen and Pelvis,ABDOMENPELVIS,,,,2128021399,818982008,37303868,Maps to
2,SCT,7832008,Abdominal aorta,ABDOMINALAORTA,T-42500,,,2128021400,7832008,4301737,Maps to
3,SCT,85856004,Acromioclavicular joint,ACJOINT,T-15420,,,2128021401,85856004,4311928,Maps to
4,SCT,23451007,Adrenal gland,ADRENAL,T-B3000,,,2128021402,23451007,4051774,Maps to
...,...,...,...,...,...,...,...,...,...,...,...
302,SCT,29092000,Vein,VEIN,T-48000,,,2128021711,29092000,4104340,Maps to
303,SCT,85234005,Vertebral artery,VERTEBRALA,T-45700,,,2128021712,85234005,4310816,Maps to
304,SCT,45292006,Vulva,VULVA,T-81000,,,2128021713,45292006,4166066,Maps to
305,SCT,74670003,Wrist joint,WRIST,T-15460,,,2128021714,74670003,4254083,Maps to


In [88]:
body_part_maps_to_value = body_part[['concept_id_1']].copy()
body_part_maps_to_value = body_part_maps_to_value.rename(columns = {'concept_id_1': 'concept_id_2'})
body_part_maps_to_value['concept_id_1'] = 2128000784 # Body Part Examined
body_part_maps_to_value['relationship_id'] = 'Maps to value'
body_part_maps_to_value

Unnamed: 0,concept_id_2,concept_id_1,relationship_id
0,2128021398,2128000784,Maps to value
1,2128021399,2128000784,Maps to value
2,2128021400,2128000784,Maps to value
3,2128021401,2128000784,Maps to value
4,2128021402,2128000784,Maps to value
...,...,...,...
390,2128021711,2128000784,Maps to value
393,2128021712,2128000784,Maps to value
395,2128021713,2128000784,Maps to value
396,2128021714,2128000784,Maps to value


In [89]:
body_part_maps_to_value_2 = body_part_maps_to_value[['concept_id_2', 'relationship_id']].copy()
body_part_maps_to_value_2['concept_id_1'] = 2128000226 # Anatomic Region Sequence
body_part_maps_to_value_2

Unnamed: 0,concept_id_2,relationship_id,concept_id_1
0,2128021398,Maps to value,2128000226
1,2128021399,Maps to value,2128000226
2,2128021400,Maps to value,2128000226
3,2128021401,Maps to value,2128000226
4,2128021402,Maps to value,2128000226
...,...,...,...
390,2128021711,Maps to value,2128000226
393,2128021712,Maps to value,2128000226
395,2128021713,Maps to value,2128000226
396,2128021714,Maps to value,2128000226


In [84]:
modality_maps_to = combined_values[(combined_values['code'].isin(modality['code'])) & (~combined_values['syn_concept_id'].isna())].copy().reset_index(drop=True)
modality_maps_to['relationship_id'] = 'Maps to'
modality_maps_to = modality_maps_to.rename(columns={'concept_id':'concept_id_1', 'syn_concept_id': 'concept_id_2'})
print(modality_maps_to.shape)
modality_maps_to.head()

(36, 5)


Unnamed: 0,code,description,concept_id_2,concept_id_1,relationship_id
0,CFM,Confocal Microscopy,42628535,2128015302,Maps to
1,CT,Computed Tomography,4300757,2128015304,Maps to
2,DMS,Dermoscopy,40486413,2128015306,Maps to
3,DG,Diaphanography,4082994,2128015307,Maps to
4,DX,Digital Radiography,4178367,2128015309,Maps to


In [96]:
modality_maps_to_value = combined_values[(combined_values['code'].isin(modality['code']))][['code', 'description', 'concept_id']].copy().reset_index(drop=True)
modality_maps_to_value['relationship_id'] = 'Maps to value'
modality_maps_to_value['concept_id_1'] = 2128000041 # Modality
modality_maps_to_value = modality_maps_to_value.rename(columns={'concept_id':'concept_id_2'})
print(modality_maps_to_value.shape)
modality_maps_to_value.head()

(79, 5)


Unnamed: 0,code,description,concept_id_2,relationship_id,concept_id_1
0,ANN,Annotation,2128015295,Maps to value,2128000041
1,AR,Autorefraction,2128015296,Maps to value,2128000041
2,ASMT,Content Assessment Results,2128015297,Maps to value,2128000041
3,AU,Audio,2128015298,Maps to value,2128000041
4,BDUS,Bone Densitometry (ultrasound),2128015299,Maps to value,2128000041


In [97]:
patient_position_maps_to_value = combined_values[(combined_values['code'].isin(patient_position['code']))][['code', 'description', 'concept_id']].copy().reset_index(drop=True)
patient_position_maps_to_value['relationship_id'] = 'Maps to value'
patient_position_maps_to_value['concept_id_1'] = 2128001090 # patient position
patient_position_maps_to_value = patient_position_maps_to_value.rename(columns={'concept_id':'concept_id_2'})
print(patient_position_maps_to_value.shape)
patient_position_maps_to_value.head()

(16, 5)


Unnamed: 0,code,description,concept_id_2,relationship_id,concept_id_1
0,HFP,Head First-Prone,2128015374,Maps to value,2128001090
1,HFS,Head First-Supine,2128015375,Maps to value,2128001090
2,HFDR,Head First-Decubitus Right,2128015376,Maps to value,2128001090
3,HFDL,Head First-Decubitus Left,2128015377,Maps to value,2128001090
4,FFDR,Feet First-Decubitus Right,2128015378,Maps to value,2128001090


In [98]:
lossy_image_comp_methods_maps_to_value = combined_values[(combined_values['code'].isin(lossy_image_comp_methods['code']))][['code', 'description', 'concept_id']].copy().reset_index(drop=True)
lossy_image_comp_methods_maps_to_value['relationship_id'] = 'Maps to value'
lossy_image_comp_methods_maps_to_value['concept_id_1'] = 2128002223 # lossy image compression
lossy_image_comp_methods_maps_to_value = lossy_image_comp_methods_maps_to_value.rename(columns={'concept_id':'concept_id_2'})
print(lossy_image_comp_methods_maps_to_value.shape)
lossy_image_comp_methods_maps_to_value.head()

(8, 5)


Unnamed: 0,code,description,concept_id_2,relationship_id,concept_id_1
0,ISO_10918_1,JPEG Lossy Compression[ISO/IEC 10918-1],2128015390,Maps to value,2128002223
1,ISO_14495_1,JPEG-LS Near-lossless Compression[ISO/IEC 1449...,2128015391,Maps to value,2128002223
2,ISO_15444_1,JPEG 2000 Irreversible Compression[ISO/IEC 154...,2128015392,Maps to value,2128002223
3,ISO_15444_15,High-Throughput JPEG 2000 Irreversible Compres...,2128015393,Maps to value,2128002223
4,ISO_18181_1,JPEG XL Image Coding System - Part 1 Core Codi...,2128015394,Maps to value,2128002223
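
The modality, patient position, and lossy compression cells follow the same pattern: take the value concepts, attach the attribute's concept ID as `concept_id_1`, and label the rows `Maps to value`. That pattern could be factored into a small helper. The function name `build_maps_to_value` and its parameters are hypothetical and not part of the notebook. The toy input reuses two patient position rows from the output above.

```python
import pandas as pd


def build_maps_to_value(values, attribute_concept_id, value_col="concept_id"):
    """Return 'Maps to value' rows linking one attribute to each value concept."""
    return pd.DataFrame({
        "concept_id_1": attribute_concept_id,            # attribute concept (scalar, broadcast)
        "concept_id_2": values[value_col].to_numpy(),    # one row per value concept
        "relationship_id": "Maps to value",
    })


# Toy stand-in for a filtered slice of combined_values.
toy_values = pd.DataFrame({
    "code": ["HFP", "HFS"],
    "concept_id": [2128015374, 2128015375],
})
patient_position_rows = build_maps_to_value(toy_values, 2128001090)
print(patient_position_rows)
```

The helper returns the three columns in the order used by the later `pd.concat` calls, so the per-frame column selection would not be needed.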


In [99]:
cs_values_maps_to = pd.concat([body_part_maps_to[['concept_id_1', 'concept_id_2', 'relationship_id']], modality_maps_to[['concept_id_1', 'concept_id_2', 'relationship_id']]])
cs_values_maps_to

Unnamed: 0,concept_id_1,concept_id_2,relationship_id
0,2128021398,37303869,Maps to
1,2128021399,37303868,Maps to
2,2128021400,4301737,Maps to
3,2128021401,4311928,Maps to
4,2128021402,4051774,Maps to
...,...,...,...
31,2128015368,4231544,Maps to
32,2128015369,4230801,Maps to
33,2128015370,4056269,Maps to
34,2128015371,4299523,Maps to


In [94]:
cs_values_maps_to.to_csv('./files/OMOP CDM Staging/cs_values_maps_to.csv')

In [100]:
cs_values_maps_to_value = pd.concat([body_part_maps_to_value[['concept_id_1', 'concept_id_2', 'relationship_id']], body_part_maps_to_value_2[['concept_id_1', 'concept_id_2', 'relationship_id']],
                                     modality_maps_to_value[['concept_id_1', 'concept_id_2', 'relationship_id']], patient_position_maps_to_value[['concept_id_1', 'concept_id_2', 'relationship_id']],
                                      lossy_image_comp_methods_maps_to_value[['concept_id_1', 'concept_id_2', 'relationship_id']]])
cs_values_maps_to_value

Unnamed: 0,concept_id_1,concept_id_2,relationship_id
0,2128000784,2128021398,Maps to value
1,2128000784,2128021399,Maps to value
2,2128000784,2128021400,Maps to value
3,2128000784,2128021401,Maps to value
4,2128000784,2128021402,Maps to value
...,...,...,...
3,2128002223,2128015393,Maps to value
4,2128002223,2128015394,Maps to value
5,2128002223,2128015395,Maps to value
6,2128002223,2128015396,Maps to value


In [101]:
cs_values_maps_to_value.to_csv('./files/OMOP CDM Staging/cs_values_maps_to_value.csv')