# Philips Digital Diagnost - Image Data Table Parser

This notebook takes the output of the QA Tool > Image Data Table export from a Philips Digital Diagnost and produces a .csv file that can be appended to the Reject Analysis and Dose Metric Dashboard database.

Let's start by importing the libraries the parser needs.

In [1]:
import pandas as pd
import os

from ipywidgets import Button
from tkinter import Tk, filedialog
from IPython.display import clear_output, display

The template for the Reject Analysis and Dose Metric Dashboard is as follows:

In [3]:
cols_list_final = ['Asset Number','DeviceID','Manufacturer','Model','Image Date','Image Time','Body Part','View','Exposure Index','KAP (uGy.m2)','kVp','Exposure (mAs)','Exposure time (ms)','Image Status','Reject Reason']

df_template = pd.DataFrame(columns=cols_list_final)
df_template

Unnamed: 0,Asset Number,DeviceID,Manufacturer,Model,Image Date,Image Time,Body Part,View,Exposure Index,KAP (uGy.m2),kVp,Exposure (mAs),Exposure time (ms),Image Status,Reject Reason


To clean up an export, select the file exported from the Philips QA Tool using the "File select" button below.
<br>
You will also be asked for your asset number so that we can group systems in the Dashboard by facility.

In [16]:
# Select input Image Data Table file as exported from Philips QA tool

def select_files(b):
    clear_output()
    root = Tk()
    root.withdraw() # Hide the main window.
    root.call('wm', 'attributes', '.', '-topmost', True) # Raise the root to the top of all windows.
    b.files = filedialog.askopenfilename() # Path of the selected file is stored on the button's "files" attribute.
    print(b.files) # Print the path of the selected file.

fileselect = Button(description="File select")
fileselect.on_click(select_files)

display(fileselect)

C:/Users/bernardm/GitHub/JupyterNotebooks/rejectAnalysis/inputdata/Reject_Analysis_Table_QHSCHDXC05_20201202_1_DEM_Xray_Room1.csv


In the next section, the file that you've selected will be printed out and you'll be asked to input the Asset Number of the system. Type the Asset Number in the box provided and then press "ENTER".

In [17]:
files = fileselect.files
print("The file you've selected is: ", files)
print()
print("What is the asset number of the system?")

AssetNumber = input()


The file you've selected is:  C:/Users/bernardm/GitHub/JupyterNotebooks/rejectAnalysis/inputdata/Reject_Analysis_Table_QHSCHDXC05_20201202_1_DEM_Xray_Room1.csv

What is the asset number of the system?
123


The default filename created by the Philips QA tool includes the DeviceID.
<br>
Let's extract the DeviceID of the Philips Digital Diagnost from the filename of the Image Data Table you exported.

In [18]:
# Get the DeviceID from the filename of the .csv file

fn = os.path.basename(files)
DeviceID = fn[22:32] # The 10 characters that follow the 22-character "Reject_Analysis_Table_" prefix
print("The DeviceID is",DeviceID)

The DeviceID is QHSCHDXC05
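The fixed slice above only works while the filename keeps its default prefix and the DeviceID is exactly 10 characters long. As a rough alternative, here is a minimal sketch that pulls out the token after the prefix with a regular expression. The helper name `device_id_from_filename` is hypothetical, and the sketch assumes the DeviceID itself contains no underscores.

```python
import os
import re

def device_id_from_filename(path):
    """Return the token that follows 'Reject_Analysis_Table_' in the file name, or None."""
    fn = os.path.basename(path)
    # Assumption: the DeviceID contains no underscores and is followed by one.
    match = re.match(r"Reject_Analysis_Table_([^_]+)_", fn)
    return match.group(1) if match else None

example = ("C:/Users/bernardm/GitHub/JupyterNotebooks/rejectAnalysis/inputdata/"
           "Reject_Analysis_Table_QHSCHDXC05_20201202_1_DEM_Xray_Room1.csv")
print("The DeviceID is", device_id_from_filename(example))
```

If the file has been renamed, the helper returns `None` rather than a wrong slice of the name, so the result can be checked before carrying on.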


Let's do some clean-up on the .csv file:
- Skip the first four rows of the Image Data Table since they're blank
- Define the separator. By default, this will be ";". However, this can be set to something different during export (| or , or -). Change the code below if your separator is something other than ";".
- Any lines containing extra separators (more fields than expected) will be skipped.
- The table should have 19 columns in total. If there are fewer or more, the separator is probably incorrect.

In [19]:
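# Note: error_bad_lines was removed in pandas 2.0; from pandas 1.3 onwards use on_bad_lines='warn' instead.
df = pd.read_csv(files, sep=';', skiprows=4, error_bad_lines=False)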
df

b'Skipping line 6403: expected 19 fields, saw 20\nSkipping line 6404: expected 19 fields, saw 20\nSkipping line 9164: expected 19 fields, saw 21\nSkipping line 9165: expected 19 fields, saw 21\nSkipping line 9542: expected 19 fields, saw 20\nSkipping line 11996: expected 19 fields, saw 20\nSkipping line 11997: expected 19 fields, saw 20\nSkipping line 11998: expected 19 fields, saw 20\nSkipping line 12473: expected 19 fields, saw 20\nSkipping line 12474: expected 19 fields, saw 20\nSkipping line 12475: expected 19 fields, saw 20\nSkipping line 15419: expected 19 fields, saw 20\nSkipping line 15420: expected 19 fields, saw 20\nSkipping line 19690: expected 19 fields, saw 20\nSkipping line 19691: expected 19 fields, saw 20\nSkipping line 19692: expected 19 fields, saw 20\n'


Unnamed: 0,Image date,Image time,Study description,Protocol step name step,kVp,Exposure [mAs],Exposure time [ms],Relative x-ray exposure,Image dose area product [µGy m²],Image comments,Reject reason,Status,Modality,Operators name,Image link,Patient's age,Patient sex,Pregnancy status,Unnamed: 18
0,02/07/2020,12:52:04 AM,Forearm R,Lateral,55,3,10,320,4.68,,,confirmed,DX,user,[Image],015Y,F,Unknown,
1,02/07/2020,12:50:56 AM,Forearm R,AP,55,3,10,410,4.49,,,confirmed,DX,user,[Image],015Y,F,Unknown,
2,02/07/2020,2:56:17 AM,Soft Tissue Neck,Lateral,77,6,17,234,6.72,,,confirmed,DX,user,[Image],057Y,F,Unknown,
3,02/07/2020,6:15:30 AM,Chest,Lateral L,125,4,7,419,13.14,,,confirmed,DX,user,[Image],020Y,M,Unknown,
4,02/07/2020,6:14:40 AM,Chest,PA,125,1,2,370,5.09,,,confirmed,DX,user,[Image],020Y,M,Unknown,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
19718,01/12/2020,11:05:43 PM,Hand L,PA,52,2,10,167,0.92,,,confirmed,DX,user,[Image],070Y,F,Unknown,
19719,01/12/2020,11:34:15 PM,Toes R,Oblique,40,2,12,100,0.44,Image rejected. Patient Moved,Patient Moved,rejected,DX,user,[Image],002Y,F,Unknown,
19720,01/12/2020,11:34:29 PM,Toes R,Lateral,40,2,12,113,0.44,,,confirmed,DX,user,[Image],002Y,F,Unknown,
19721,01/12/2020,11:33:55 PM,Toes R,AP,40,2,12,168,0.44,,,confirmed,DX,user,[Image],002Y,F,Unknown,
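The clean-up steps listed before this cell can also be sketched in a way that guesses the separator and checks the column count. This is only an illustration on a tiny made-up export, not the notebook's own method: the names `guess_separator` and `read_image_data_table` are hypothetical, the sample has 4 columns rather than 19, and `on_bad_lines` assumes pandas 1.3 or newer.

```python
import io
import pandas as pd

CANDIDATE_SEPARATORS = [";", "|", ",", "-"]  # the separators mentioned in the list above

def guess_separator(header_line):
    """Return the candidate separator that occurs most often in the header line."""
    return max(CANDIDATE_SEPARATORS, key=header_line.count)

def read_image_data_table(text, expected_columns=19):
    """Read an export held in a string, skipping malformed rows (needs pandas >= 1.3)."""
    header_line = text.splitlines()[4]  # the header follows the four skipped rows
    sep = guess_separator(header_line)
    df = pd.read_csv(io.StringIO(text), sep=sep, skiprows=4, on_bad_lines="skip")
    if len(df.columns) != expected_columns:
        print(f"Warning: expected {expected_columns} columns, found {len(df.columns)}")
    return df

# Made-up export: four blank rows, a header, one good row and one row with an extra field.
sample = (
    "\n\n\n\n"
    "Image date;Image time;kVp;\n"
    "02/07/2020;12:52:04 AM;55;\n"
    "02/07/2020;12:50:56 AM;55;extra;\n"
)
df_sample = read_image_data_table(sample, expected_columns=4)
print(df_sample.shape)
```

Counting separators in the header is a crude heuristic. It would pick the wrong character if, say, the column names contained more hyphens than real separators, so it is worth eyeballing the resulting table as well.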


If the .csv file has been read in correctly above, you should see a table with all the values from the log. 
<br>
Let's add a few identifiers for filtering purposes:
- Asset Number
- Device ID (e.g. unique system ID, serial number or even room number)
- Manufacturer
- Model


In [20]:
cols_list = ['Asset Number','DeviceID','Manufacturer','Model']
df = df.reindex(columns=[*cols_list,*df.columns.tolist()])

df['Asset Number'] = AssetNumber
df['DeviceID'] = DeviceID
df['Manufacturer'] = "Philips"
df['Model'] = "Digital Diagnost"
df

Unnamed: 0,Asset Number,DeviceID,Manufacturer,Model,Image date,Image time,Study description,Protocol step name step,kVp,Exposure [mAs],...,Image comments,Reject reason,Status,Modality,Operators name,Image link,Patient's age,Patient sex,Pregnancy status,Unnamed: 18
0,123,QHSCHDXC05,Philips,Digital Diagnost,02/07/2020,12:52:04 AM,Forearm R,Lateral,55,3,...,,,confirmed,DX,user,[Image],015Y,F,Unknown,
1,123,QHSCHDXC05,Philips,Digital Diagnost,02/07/2020,12:50:56 AM,Forearm R,AP,55,3,...,,,confirmed,DX,user,[Image],015Y,F,Unknown,
2,123,QHSCHDXC05,Philips,Digital Diagnost,02/07/2020,2:56:17 AM,Soft Tissue Neck,Lateral,77,6,...,,,confirmed,DX,user,[Image],057Y,F,Unknown,
3,123,QHSCHDXC05,Philips,Digital Diagnost,02/07/2020,6:15:30 AM,Chest,Lateral L,125,4,...,,,confirmed,DX,user,[Image],020Y,M,Unknown,
4,123,QHSCHDXC05,Philips,Digital Diagnost,02/07/2020,6:14:40 AM,Chest,PA,125,1,...,,,confirmed,DX,user,[Image],020Y,M,Unknown,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
19718,123,QHSCHDXC05,Philips,Digital Diagnost,01/12/2020,11:05:43 PM,Hand L,PA,52,2,...,,,confirmed,DX,user,[Image],070Y,F,Unknown,
19719,123,QHSCHDXC05,Philips,Digital Diagnost,01/12/2020,11:34:15 PM,Toes R,Oblique,40,2,...,Image rejected. Patient Moved,Patient Moved,rejected,DX,user,[Image],002Y,F,Unknown,
19720,123,QHSCHDXC05,Philips,Digital Diagnost,01/12/2020,11:34:29 PM,Toes R,Lateral,40,2,...,,,confirmed,DX,user,[Image],002Y,F,Unknown,
19721,123,QHSCHDXC05,Philips,Digital Diagnost,01/12/2020,11:33:55 PM,Toes R,AP,40,2,...,,,confirmed,DX,user,[Image],002Y,F,Unknown,


Let's map the .csv columns onto the template:
- rename the columns of the original .csv file to match the template
- remove any columns we don't need
- rearrange the columns to match the template

In [21]:
df = df.rename(columns={"Study description": "Body Part",
                       "Protocol step name step": "View",
                        "Relative x-ray exposure": "Exposure Index",
                        "Exposure [mAs]": "Exposure (mAs)",
                        "Exposure time [ms]": "Exposure time (ms)",
                        "Status": "Image Status",
                        "Reject reason": "Reject Reason",
                        "Image dose area product [µGy m²]": "KAP (uGy.m2)",
                        "Image date": "Image Date",
                        "Image time": 'Image Time'
                       })

df_out = df[cols_list_final]
df_out

Unnamed: 0,Asset Number,DeviceID,Manufacturer,Model,Image Date,Image Time,Body Part,View,Exposure Index,KAP (uGy.m2),kVp,Exposure (mAs),Exposure time (ms),Image Status,Reject Reason
0,123,QHSCHDXC05,Philips,Digital Diagnost,02/07/2020,12:52:04 AM,Forearm R,Lateral,320,4.68,55,3,10,confirmed,
1,123,QHSCHDXC05,Philips,Digital Diagnost,02/07/2020,12:50:56 AM,Forearm R,AP,410,4.49,55,3,10,confirmed,
2,123,QHSCHDXC05,Philips,Digital Diagnost,02/07/2020,2:56:17 AM,Soft Tissue Neck,Lateral,234,6.72,77,6,17,confirmed,
3,123,QHSCHDXC05,Philips,Digital Diagnost,02/07/2020,6:15:30 AM,Chest,Lateral L,419,13.14,125,4,7,confirmed,
4,123,QHSCHDXC05,Philips,Digital Diagnost,02/07/2020,6:14:40 AM,Chest,PA,370,5.09,125,1,2,confirmed,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
19718,123,QHSCHDXC05,Philips,Digital Diagnost,01/12/2020,11:05:43 PM,Hand L,PA,167,0.92,52,2,10,confirmed,
19719,123,QHSCHDXC05,Philips,Digital Diagnost,01/12/2020,11:34:15 PM,Toes R,Oblique,100,0.44,40,2,12,rejected,Patient Moved
19720,123,QHSCHDXC05,Philips,Digital Diagnost,01/12/2020,11:34:29 PM,Toes R,Lateral,113,0.44,40,2,12,confirmed,
19721,123,QHSCHDXC05,Philips,Digital Diagnost,01/12/2020,11:33:55 PM,Toes R,AP,168,0.44,40,2,12,confirmed,
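If a column name in the export differs slightly from what the rename expects, selecting the template columns fails with a `KeyError`. A small check beforehand can say which columns are missing. The sketch below uses a made-up, empty DataFrame, and the helper `missing_template_columns` is a hypothetical name, not part of the original notebook.

```python
import pandas as pd

cols_list_final = ['Asset Number', 'DeviceID', 'Manufacturer', 'Model',
                   'Image Date', 'Image Time', 'Body Part', 'View',
                   'Exposure Index', 'KAP (uGy.m2)', 'kVp', 'Exposure (mAs)',
                   'Exposure time (ms)', 'Image Status', 'Reject Reason']

def missing_template_columns(df, template_cols):
    """Return the template columns that are absent from df, in template order."""
    return [col for col in template_cols if col not in df.columns]

# Made-up frame in which "Reject reason" was never renamed to "Reject Reason".
demo_cols = [c for c in cols_list_final if c != 'Reject Reason'] + ['Reject reason']
df_demo = pd.DataFrame(columns=demo_cols)

missing = missing_template_columns(df_demo, cols_list_final)
print("Missing template columns:", missing)
```

An empty list means every template column is present and the selection step is safe to run.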


Check the output table above and confirm that the values match the column names.
If everything is OK, let's export the cleaned-up table to an output file.

The .csv file will be named using the Asset Number and a timestamp.

In [22]:
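# Add timestamp to filename (12-hour clock: hour, minute, second, AM/PM)
from datetime import datetime
date = datetime.now().strftime("%Y_%m_%d_%I_%M_%S%p")

# Output folder: change this path to suit your own machine
output_dir = r'C:\Users\BernardM\GitHub\JupyterNotebooks\RejectLogParser\outputdata'
output_path = os.path.join(output_dir, str(AssetNumber) + '_' + date + '.csv')

df_out.to_csv(output_path, index=False, header=True)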

print("The output file has been successfully created.")

The output file has been successfully created.


This output file can now be appended to the Reject Analysis and Dose Metric Dashboard database.
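How the append is done depends on how the Dashboard database is stored, which this notebook does not specify. Assuming, purely for illustration, that the database is itself a single .csv file, a minimal sketch of the append could look like the following. The helper `append_to_database` and the file name `database.csv` are hypothetical.

```python
import os
import tempfile
import pandas as pd

def append_to_database(df_new, database_csv):
    """Append rows to a CSV file, writing the header only when the file is new."""
    write_header = not os.path.exists(database_csv)
    df_new.to_csv(database_csv, mode="a", index=False, header=write_header)

# Demonstrate on a temporary folder with a tiny made-up batch of rows.
with tempfile.TemporaryDirectory() as tmp:
    database_csv = os.path.join(tmp, "database.csv")  # hypothetical file name
    batch = pd.DataFrame({"Asset Number": ["123"], "kVp": [55]})
    append_to_database(batch, database_csv)  # creates the file with a header
    append_to_database(batch, database_csv)  # appends rows without repeating the header
    combined = pd.read_csv(database_csv)

print(len(combined), "rows in the database file")
```

Appending the same export twice would duplicate its rows, so it may be worth dropping duplicates or tracking which exports have already been added.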