# Playground JSON Generation

The code in this folder, and in particular the notebook, serves as a playground for JSON generation in our dataflow pipeline.


The individual functions are broken down to the patient/series level and are completely decoupled. You can try out extensions of the functionality here, but the JsonGeneration repo remains the tested and deployed code. **Some functionality of the main code may be missing here or may be represented in a modified form.**

Missing functionality: 
- URI entry

Also important:
- The location of the CSVs (used for loading databulk information) is hard-coded to the server. If needed, this can be changed in csv_databulk_loaders.py.
- The location of the t-heart models is also hard-coded to the server.



In [1]:
# obligatory imports
import os
from pydicom import dcmread
from backbone_functions import *

In the first step, the path to a series is specified and the human ID is defined.

In [19]:
human = "122"
path = "/home/database/datadrive/122/STD00001/SER00012/"

### serialize()
The function serialize() goes through all images in the series and collects the nominal_phase, image_comments and series_description values, which later allow a decision between CINE and NORMAL.

In [20]:
description = serialize(path)

In [21]:
description

[[], ['158cm  62kg'], ['CT(1)']]
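
As a rough illustration of what such a collection step could look like, the sketch below gathers the distinct values of three attributes across a series. It is not the code of serialize(): the helper name `collect_unique` and the DICOM keywords `NominalPercentageOfCardiacPhase`, `ImageComments` and `SeriesDescription` are assumptions chosen to mirror the three lists in the output above. Plain stand-in objects replace real DICOM datasets so that the example runs without image files.

```python
from types import SimpleNamespace

# Assumed DICOM keywords; the real serialize() may read different tags.
KEYWORDS = ("NominalPercentageOfCardiacPhase", "ImageComments", "SeriesDescription")


def collect_unique(datasets, keywords=KEYWORDS):
    """Return one list of distinct, non-empty values per keyword, in first-seen order."""
    collected = [[] for _ in keywords]
    for ds in datasets:
        for values, keyword in zip(collected, keywords):
            value = getattr(ds, keyword, None)  # missing attributes are skipped
            if value not in (None, "") and value not in values:
                values.append(value)
    return collected


# Stand-ins for two images of the same series (hypothetical values).
images = [
    SimpleNamespace(ImageComments="158cm  62kg", SeriesDescription="CT(1)"),
    SimpleNamespace(ImageComments="158cm  62kg", SeriesDescription="CT(1)"),
]
description_sketch = collect_unique(images)
print(description_sketch)  # [[], ['158cm  62kg'], ['CT(1)']]
```

Because the helper only relies on `getattr`, it should also accept datasets returned by `dcmread`, which expose their elements as attributes named after the DICOM keywords.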

### cine_normal_classification()
Given this information, it is possible to decide whether the series is CINE or NORMAL:

In [22]:
series_type, phase_n, phase_c, phase_s = cine_normal_classification(description[0], description[1], description[2])

In [23]:
print(series_type)
print(phase_n)
print(phase_c)
print(phase_s)

NORMAL
{''}
0
0


### normal_init() & cine_init()
Depending on the series type, the patient dictionary is created with the matching initializer.

In [24]:
if series_type == "NORMAL":
    patient_dict = normal_init(path, phase_n)
elif series_type == "CINE":
    patient_dict = cine_init(path, human, phase_n, phase_c, phase_s)

In [25]:
patient_dict

{'datetime_creation': '06-04-2022 13:47:40',
 'versioning': {'version': 1,
  'base_patient_id': None,
  'edited_fields': [None, None]},
 'institution_id': '',
 'institution_name': '',
 'internal_info': {'human_id': '', 'series': ''},
 'age': '',
 'height': None,
 'weight': '',
 'bsa': None,
 'bmi': None,
 'gender': '',
 'body_rois': [{'catalog_tag': None, 'extra_tag': None}],
 'es_timestamp': None,
 'ed_timestamp': None,
 'pathology': None,
 'origin_location': None,
 'origin_ethnicity': None,
 'additional_hist': '',
 'imaging_data': None,
 'models': None,
 'imaging': {'datetime_creation': '06-04-2022 13:47:40',
  'image_files_list': [{'timestamp': None,
    'image_files': ['IMG02341.dcm',
     'IMG02342.dcm',
     'IMG02343.dcm',
     'IMG02344.dcm',
     'IMG02345.dcm',
     'IMG02346.dcm',
     'IMG02347.dcm',
     'IMG02348.dcm',
     'IMG02349.dcm',
     'IMG02350.dcm',
     'IMG02351.dcm',
     'IMG02352.dcm',
     'IMG02353.dcm',
     'IMG02354.dcm',
     'IMG02355.dcm',
     'IM

### elements_extraction_and_storing()
Now the fields can be filled from the DICOM metadata of each image.

In [26]:
for image in os.listdir(path):
    metadata = dcmread(os.path.join(path, image), force=True)
    patient_dict = elements_extraction_and_storing(metadata, patient_dict)

In [27]:
patient_dict

{'datetime_creation': '06-04-2022 13:47:40',
 'versioning': {'version': 1,
  'base_patient_id': None,
  'edited_fields': [None, None]},
 'institution_id': '200825MST8^MSR 87J female',
 'institution_name': 'decidemedical',
 'internal_info': {'human_id': '', 'series': ''},
 'age': '086Y',
 'height': None,
 'weight': '0',
 'bsa': None,
 'bmi': None,
 'gender': 'F',
 'body_rois': [{'catalog_tag': None, 'extra_tag': None}],
 'es_timestamp': None,
 'ed_timestamp': None,
 'pathology': None,
 'origin_location': None,
 'origin_ethnicity': None,
 'additional_hist': '',
 'imaging_data': None,
 'models': None,
 'imaging': {'datetime_creation': '06-04-2022 13:47:40',
  'image_files_list': [{'timestamp': None,
    'image_files': ['IMG02341.dcm',
     'IMG02342.dcm',
     'IMG02343.dcm',
     'IMG02344.dcm',
     'IMG02345.dcm',
     'IMG02346.dcm',
     'IMG02347.dcm',
     'IMG02348.dcm',
     'IMG02349.dcm',
     'IMG02350.dcm',
     'IMG02351.dcm',
     'IMG02352.dcm',
     'IMG02353.dcm',
     '

### elements_correction()
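Next, the fields are corrected: raw values are normalized (for example, the age '086Y' becomes 86 and the gender 'F' becomes 'female'), and fields such as height, weight, bsa and bmi are filled in.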

In [28]:
if phase_n != 0 and series_type == "CINE":
    phase_n = sorted(phase_n, key=float)
    
patient_dict = elements_correction(patient_dict, path, human, phase_n)
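
The `key=float` argument in the cell above matters because the phases are held as strings, and a plain sort would order them character by character. A small illustration with made-up phase values (they are not taken from the series used in this notebook):

```python
# Hypothetical nominal phases as they might appear in a CINE series (strings).
phases = {"5", "10", "0", "95"}

lexicographic = sorted(phases)       # compares character by character
numeric = sorted(phases, key=float)  # compares the numeric values

print(lexicographic)  # ['0', '10', '5', '95']
print(numeric)        # ['0', '5', '10', '95']
```

Note that `float("")` raises a `ValueError`, so this sort only works when every phase is a numeric string.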

In [29]:
patient_dict

{'datetime_creation': '06-04-2022 13:47:40',
 'versioning': {'version': 1,
  'base_patient_id': None,
  'edited_fields': [None, None]},
 'institution_id': '200825MST8^MSR 87J female',
 'institution_name': 'decidemedical',
 'internal_info': {'human_id': 122, 'series': 'SER00012'},
 'age': 86,
 'height': 158.0,
 'weight': 62.0,
 'bsa': 1.65,
 'bmi': 24.84,
 'gender': 'female',
 'body_rois': [{'catalog_tag': 'heart_and_thorax', 'extra_tag': None}],
 'es_timestamp': None,
 'ed_timestamp': None,
 'pathology': 'aortic_stenosis',
 'origin_location': 'europe',
 'origin_ethnicity': None,
 'additional_hist': '',
 'imaging_data': None,
 'models': None,
 'imaging': {'datetime_creation': '06-04-2022 13:47:40',
  'image_files_list': [{'timestamp': None,
    'image_files': ['IMG02341.dcm',
     'IMG02342.dcm',
     'IMG02343.dcm',
     'IMG02344.dcm',
     'IMG02345.dcm',
     'IMG02346.dcm',
     'IMG02347.dcm',
     'IMG02348.dcm',
     'IMG02349.dcm',
     'IMG02350.dcm',
     'IMG02351.dcm',
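
The corrected values above can be reproduced from the raw inputs with a few lines. This is a sketch and not the code of elements_correction(): all helper names are hypothetical, the height and weight are assumed to come from the image comment '158cm  62kg' returned by serialize(), and the Mosteller formula for the BSA is an assumption that happens to reproduce the value 1.65 shown above.

```python
import math
import re
from pathlib import PurePosixPath


def parse_age(raw):
    """Convert a DICOM age string in years, such as '086Y', to an integer."""
    match = re.fullmatch(r"(\d{3})Y", raw)
    return int(match.group(1)) if match else None


def parse_height_weight(comment):
    """Extract height (cm) and weight (kg) from a comment like '158cm  62kg'."""
    match = re.search(r"(\d+(?:\.\d+)?)\s*cm\s+(\d+(?:\.\d+)?)\s*kg", comment)
    if match is None:
        return None, None
    return float(match.group(1)), float(match.group(2))


def bmi(height_cm, weight_kg):
    """Body mass index in kg/m^2, rounded to two decimals."""
    return round(weight_kg / (height_cm / 100) ** 2, 2)


def bsa_mosteller(height_cm, weight_kg):
    """Body surface area in m^2 (Mosteller formula), rounded to two decimals."""
    return round(math.sqrt(height_cm * weight_kg / 3600), 2)


height, weight = parse_height_weight("158cm  62kg")
age = parse_age("086Y")
# The series name is the last folder of the series path.
series = PurePosixPath("/home/database/datadrive/122/STD00001/SER00012/").name

print(age, height, weight, bmi(height, weight), bsa_mosteller(height, weight), series)
# 86 158.0 62.0 24.84 1.65 SER00012
```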
    

### model_init()
Next, a model dictionary can be created if desired.

In [30]:
model_dict = model_init(human)

In [31]:
model_dict

{'datetime_creation': '06-04-2022 13:47:42',
 'URI': 'https://virtonomydatamanaged0.blob.core.windows.net/',
 'models': [{'timestamp': None,
   'sub_models': [{'blob': None, 'name': None}],
   'landmarks': None}]}

### collections_creation()
Last, the remaining collections are created.

In [32]:
imaging_collection, patient_collection, patient_reidentification_collection = collections_creation(patient_dict)

In [33]:
imaging_collection

{'datetime_creation': '06-04-2022 13:47:40',
 'image_files_list': [{'timestamp': None,
   'image_files': ['IMG02341.dcm',
    'IMG02342.dcm',
    'IMG02343.dcm',
    'IMG02344.dcm',
    'IMG02345.dcm',
    'IMG02346.dcm',
    'IMG02347.dcm',
    'IMG02348.dcm',
    'IMG02349.dcm',
    'IMG02350.dcm',
    'IMG02351.dcm',
    'IMG02352.dcm',
    'IMG02353.dcm',
    'IMG02354.dcm',
    'IMG02355.dcm',
    'IMG02356.dcm',
    'IMG02357.dcm',
    'IMG02358.dcm',
    'IMG02359.dcm',
    'IMG02360.dcm',
    'IMG02361.dcm',
    'IMG02362.dcm',
    'IMG02363.dcm',
    'IMG02364.dcm',
    'IMG02365.dcm',
    'IMG02366.dcm',
    'IMG02367.dcm',
    'IMG02368.dcm',
    'IMG02369.dcm',
    'IMG02370.dcm',
    'IMG02371.dcm',
    'IMG02372.dcm',
    'IMG02373.dcm',
    'IMG02374.dcm',
    'IMG02375.dcm',
    'IMG02376.dcm',
    'IMG02377.dcm',
    'IMG02378.dcm',
    'IMG02379.dcm',
    'IMG02380.dcm',
    'IMG02381.dcm',
    'IMG02382.dcm',
    'IMG02383.dcm',
    'IMG02384.dcm',
    'IMG02385.dcm'

In [34]:
patient_collection

{'datetime_creation': '06-04-2022 13:47:40',
 'versioning': {'version': 1,
  'base_patient_id': None,
  'edited_fields': [None, None]},
 'internal_info': {'human_id': 122, 'series': 'SER00012'},
 'age': 86,
 'height': 158.0,
 'weight': 62.0,
 'bsa': 1.65,
 'bmi': 24.84,
 'gender': 'female',
 'body_rois': [{'catalog_tag': 'heart_and_thorax', 'extra_tag': None}],
 'es_timestamp': None,
 'ed_timestamp': None,
 'pathology': 'aortic_stenosis',
 'origin_location': 'europe',
 'origin_ethnicity': None,
 'additional_hist': '',
 'imaging_data': None,
 'models': None}

In [35]:
patient_reidentification_collection

{'patient_id': None,
 'institution_id': '200825MST8^MSR 87J female',
 'institution_name': 'decidemedical'}
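
Judging only from the three outputs above, the split appears to separate the imaging data and the institution fields from the patient record. The following sketch mimics that behaviour on a shortened dictionary with made-up values. `split_collections` is a hypothetical stand-in and not the code of collections_creation().

```python
import copy

# Keys that, in the outputs above, move to the re-identification collection.
REIDENTIFICATION_KEYS = ("institution_id", "institution_name")


def split_collections(patient_dict):
    """Split one patient dictionary into imaging, patient and re-identification parts."""
    patient = copy.deepcopy(patient_dict)  # leave the caller's dictionary untouched
    imaging = patient.pop("imaging", None)
    reidentification = {"patient_id": None}
    for key in REIDENTIFICATION_KEYS:
        reidentification[key] = patient.pop(key, None)
    return imaging, patient, reidentification


example = {
    "institution_id": "ID-0001",           # made-up value
    "institution_name": "example clinic",  # made-up value
    "age": 86,
    "gender": "female",
    "imaging": {"image_files_list": [{"timestamp": None, "image_files": ["IMG02341.dcm"]}]},
}
imaging, patient, reidentification = split_collections(example)
print(patient)           # {'age': 86, 'gender': 'female'}
print(reidentification)  # {'patient_id': None, 'institution_id': 'ID-0001', 'institution_name': 'example clinic'}
```

Working on a deep copy keeps the original dictionary intact, which is convenient in a playground where cells are often re-run.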