# Medical Image Data Format
Medical images follow Digital Imaging and Communications in Medicine (DICOM), the standard for storing
and exchanging medical image data. The first version of this standard was released in 1985,
and it has been revised several times since. The standard defines both a file format and a communications protocol.

File Format: All patient medical images are saved in the DICOM file format.
This format holds PHI (protected health information) about the patient, such as name, sex, and age, in addition to
other image-related data such as the equipment used to capture the image and some context for the medical treatment.
Medical imaging equipment creates DICOM files. Doctors use DICOM viewers, software applications
that can display DICOM images, to read the images and diagnose the findings in them.

Communications Protocol: The DICOM communication protocol is used to search for imaging studies in the archive
and retrieve them to the workstation for display.
All medical imaging applications that are connected to the hospital network use the DICOM protocol to
exchange information, mainly DICOM images but also patient and procedure information.
There are also more advanced network commands that are used to control and follow the treatment, schedule procedures,
report statuses, and share the workload between doctors and imaging devices.

The following code snippet reads two DICOM files available at https://medistim.com/dicom/
and writes the DICOM tags from these files to two separate text files.

In [2]:
import pydicom as dicom
# pydicom is a pure python package for parsing DICOM files. 
# pydicom makes it easy to read these complex files into natural python structures for easy manipulation. 
# Modified datasets can be re-written to DICOM format files.
# In Anaconda, pydicom can be installed with: conda install -c conda-forge pydicom
# Outside Anaconda, use: pip install -U pydicom
import re

In [3]:
# Dataset is the main pydicom object and behaves like a Python dictionary (i.e. key: value pairs)
# key: the DICOM (group, element) tag (as a Tag object)
# value: a DataElement instance. It stores:
#   tag: a DICOM tag
#   VR: DICOM value representation (see http://dicom.nema.org/dicom/2013/output/chtml/part05/sect_6.2.html)
#   VM: value multiplicity
#   value: the actual value
ttfm = dicom.dcmread("input/ttfm.dcm")
bmode = dicom.dcmread("input/bmode.dcm") 

A DICOM tag is a 4-byte code, usually written in hexadecimal, composed of a 2-byte "group" number and a 2-byte "element" number. The group number is an identifier that tells you what information entity the tag applies to (for example, group 0010 refers to the patient and group 0020 refers to the study). The element number identifies the interpretation of the value (items such as the patient's ID number, the series description, etc.).
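
The group/element split can be illustrated with plain integer arithmetic. This is a minimal sketch that does not depend on pydicom; the helper names `split_tag` and `format_tag` are hypothetical and chosen only for this example.

```python
from typing import Tuple


def split_tag(tag: int) -> Tuple[int, int]:
    """Split a 32-bit tag into its 16-bit group and element numbers."""
    group = tag >> 16        # upper two bytes
    element = tag & 0xFFFF   # lower two bytes
    return group, element


def format_tag(group: int, element: int) -> str:
    """Render a tag in the conventional (gggg,eeee) hexadecimal form."""
    return f"({group:04X},{element:04X})"


# (0010,0020) is the Patient ID tag: group 0010, element 0020
group, element = split_tag(0x00100020)
print(format_tag(group, element))  # (0010,0020)
```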

In [75]:
with open('output/ttfm_tags.txt', 'w') as f:
    for key in ttfm.dir():
        value = getattr(ttfm, key, '')
        if isinstance(value, dicom.sequence.Sequence) and len(value) > 0:
            # Sequence: pattern-match the parenthesized tags in the first item and write them out
            s = str(value[0])
            print(re.findall(r'\(.*?\)', s), file=f)
        else:
            # Plain element (or empty sequence): access the DataElement and write out its tag
            print(ttfm.data_element(key).tag, file=f)

In [76]:
with open('output/bmode_tags.txt', 'w') as f:
    for key in bmode.dir():
        value = getattr(bmode, key, '')
        if isinstance(value, dicom.sequence.Sequence) and len(value) > 0:
            # Sequence: pattern-match the parenthesized tags in the first item and write them out
            s = str(value[0])
            print(re.findall(r'\(.*?\)', s), file=f)
        else:
            # Plain element (or empty sequence): access the DataElement and write out its tag
            print(bmode.data_element(key).tag, file=f)
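
The pattern `\(.*?\)` used in the two cells above matches any parenthesized text, so it can also pick up parentheses that appear inside element values. The sketch below compares it with a stricter pattern that accepts only two 4-digit hexadecimal numbers; the sample string is a hypothetical stand-in for the text of a sequence item, not output from the ttfm or bmode files.

```python
import re

# Hypothetical text resembling the string form of a sequence item
sample = (
    "(0008, 0100) Code Value    SH: 'T-D0050'\n"
    "(0008, 0104) Code Meaning  LO: 'Tissue (soft)'"
)

# Loose pattern: any shortest run of text between parentheses
loose = re.findall(r"\(.*?\)", sample)

# Stricter pattern: exactly (gggg, eeee) with hexadecimal digits
strict = re.findall(r"\([0-9A-Fa-f]{4},\s?[0-9A-Fa-f]{4}\)", sample)

print(loose)   # ['(0008, 0100)', '(0008, 0104)', '(soft)']
print(strict)  # ['(0008, 0100)', '(0008, 0104)']
```

Whether the stricter pattern is worth using depends on the files at hand; it only matters when element values themselves contain parentheses.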