# Philips Digital Diagnost - Image Data Table Parser

This notebook takes the output of the QA Tool > Image Data Table export from a Philips Digital Diagnost and produces a .csv file that can be appended to the Reject Analysis and Dose Metric Dashboard database.

Let's start by importing the required libraries.

In [1]:
import pandas as pd
import os

The template for the Reject Analysis and Dose Metric Dashboard is as follows:

In [2]:
cols_list_final = ['Asset Number','DeviceID','Manufacturer','Model','Image Date','Image Time','Body Part','View','Exposure Index','KAP (uGy.m2)','kVp','Exposure (mAs)','Exposure time (ms)','Image Status','Reject Reason']

df_template = pd.DataFrame(columns=[*cols_list_final])
df_template

Unnamed: 0,Asset Number,DeviceID,Manufacturer,Model,Image Date,Image Time,Body Part,View,Exposure Index,KAP (uGy.m2),kVp,Exposure (mAs),Exposure time (ms),Image Status,Reject Reason


To clean up an export, enter the filepath of the .csv file you want to clean up when prompted.
<br>
You will also be asked for the system's asset number so that systems can be grouped by facility in the Dashboard.

In [2]:
print("What is the filepath for the .csv file you want to clean up?")
f = input()
## Test filepath "C:/Users/BernardM/JupyterNotebooks/rejectAnalysis/inputdata/Reject_Analysis_Table_QHSCHDXC01_20201202_1_General_Xray_Room1.csv"

print()

print("What is the asset number of the system?")
AssetNumber = input()


What is the filepath for the .csv file you want to clean up?
C:/Users/BernardM/JupyterNotebooks/rejectAnalysis/inputdata/Reject_Analysis_Table_QHSCHDXC01_20201202_1_General_Xray_Room1.csv

What is the asset number of the system?
123456789
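A mistyped path only surfaces later, when `pd.read_csv` raises a `FileNotFoundError`. As a minimal sketch, the path could be checked straight after it is entered. The helper name `check_csv_path` is hypothetical and not part of the original notebook; it assumes the path may arrive with stray whitespace or surrounding quotes.

```python
import os

def check_csv_path(path):
    """Return a cleaned-up path, or raise if it does not point to an existing .csv file."""
    # Strip whitespace and surrounding quotes (assumption: paths may be pasted with quotes)
    cleaned = path.strip().strip('"').strip("'")
    if not os.path.isfile(cleaned):
        raise FileNotFoundError(f"No file found at: {cleaned}")
    if not cleaned.lower().endswith(".csv"):
        raise ValueError(f"Expected a .csv file, got: {cleaned}")
    return cleaned

# Possible use in the cell above, right after f = input():
# f = check_csv_path(f)
```

Failing early keeps the error message next to the prompt that caused it, rather than in a later cell.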


The default filename created by the Philips QA tool includes the DeviceID, directly after the `Reject_Analysis_Table_` prefix.
<br>
Let's extract the DeviceID of the Philips Digital Diagnost from the filename of the Image Data Table you collected.

In [4]:
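fn = os.path.basename(f)
# The prefix "Reject_Analysis_Table_" is 22 characters long,
# so the next 10 characters (indices 22 to 31) hold the DeviceID
DeviceID = fn[22:32]
print("The DeviceID is",DeviceID)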

The DeviceID is QHSCHDXC01
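The fixed slice above only works when the DeviceID is exactly 10 characters long. A possible alternative, sketched here, reads everything between the `Reject_Analysis_Table_` prefix and the next underscore. The function name `device_id_from_filename` is hypothetical, and the sketch assumes the DeviceID itself never contains an underscore.

```python
import os
import re

def device_id_from_filename(path):
    """Extract the DeviceID that follows the 'Reject_Analysis_Table_' prefix."""
    fn = os.path.basename(path)
    # Assumption: the DeviceID contains no underscore and is followed by one
    match = re.match(r"Reject_Analysis_Table_([^_]+)_", fn)
    if match is None:
        raise ValueError(f"Filename does not follow the default pattern: {fn}")
    return match.group(1)

example = "inputdata/Reject_Analysis_Table_QHSCHDXC01_20201202_1_General_Xray_Room1.csv"
print("The DeviceID is", device_id_from_filename(example))  # The DeviceID is QHSCHDXC01
```

If the export was renamed before being loaded, the function raises a `ValueError` instead of silently returning the wrong characters.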


Let's do some clean-up on the .csv file:
- Skip the first four rows of the Image Data Table since they're blank
- Define the separator. By default, this will be ";". However, this can be set to something different during export (| or , or -). Change the code below if your separator is something other than ";".
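If you are unsure which separator was chosen during export, one rough way to find out is to count each candidate in the header line. The sketch below does this with the standard library only. The helper `guess_separator` is hypothetical; it assumes the header sits on the line right after the four skipped rows and that the file can be read as UTF-8 (undecodable bytes are replaced).

```python
def guess_separator(path, skiprows=4, candidates=(";", "|", ",", "-")):
    """Guess the separator by counting each candidate in the header line."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        # Skip the leading rows, as read_csv(skiprows=4) does
        for _ in range(skiprows):
            next(fh, None)
        header = next(fh, "")
    counts = {sep: header.count(sep) for sep in candidates}
    best = max(counts, key=counts.get)
    if counts[best] == 0:
        raise ValueError("No candidate separator found in the header line")
    return best

# Possible use in the next cell:
# df = pd.read_csv(f, sep=guess_separator(f), skiprows=4)
```

This is a heuristic: a header such as "Relative x-ray exposure" contains a hyphen, so the guess relies on the real separator appearing more often than any other candidate.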

In [5]:
df = pd.read_csv(f, sep = ';', skiprows=4)
df

FileNotFoundError: [Errno 2] No such file or directory: 'C:/Users/BernardM/JupyterNotebooks/rejectAnalysis/inputdata/Reject_Analysis_Table_QHSCHDXC01_20201202_1_General_Xray_Room1.csv'

If the .csv file has been read in correctly above, you should see a table with all the values from the log. 
<br>
Let's add a few identifiers for filtering purposes:
- Asset Number
- Device ID (e.g. unique system ID, serial number or even room number)
- Manufacturer
- Model


In [34]:
cols_list = ['Asset Number','DeviceID','Manufacturer','Model']
df = df.reindex(columns=[*cols_list,*df.columns.tolist()])

df['Asset Number'] = AssetNumber
df['DeviceID'] = DeviceID
df['Manufacturer'] = "Philips"
df['Model'] = "Digital Diagnost"
df

Unnamed: 0,Asset Number,DeviceID,Manufacturer,Model,Image date,Image time,Study description,Protocol step name step,kVp,Exposure [mAs],...,Image comments,Reject reason,Status,Modality,Operators name,Image link,Patient sex,Patient's age,Pregnancy status,Unnamed: 18
0,123456789,QHSCHDXC03,Philips,Digital Diagnost,02/07/2020,11:05:48 AM,Chest,Lateral L,125.0,27.0,...,,,confirmed,DX,user,[Image],M,069Y,Unknown,
1,123456789,QHSCHDXC03,Philips,Digital Diagnost,02/07/2020,11:05:10 AM,Chest,PA,125.0,2.0,...,,,confirmed,DX,user,[Image],M,069Y,Unknown,
2,123456789,QHSCHDXC03,Philips,Digital Diagnost,02/07/2020,12:17:00 PM,Abdomen,AP,77.0,8.0,...,,,confirmed,DX,user,[Image],F,084Y,Unknown,
3,123456789,QHSCHDXC03,Philips,Digital Diagnost,03/07/2020,10:59:09 AM,Chest,AP Landscape,90.0,2.0,...,,,confirmed,DX,user,[Image],M,066Y,Unknown,
4,123456789,QHSCHDXC03,Philips,Digital Diagnost,03/07/2020,11:31:36 AM,Chest,Lateral L,125.0,10.0,...,,,confirmed,DX,BR,[Image],F,066Y,Unknown,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
503,123456789,QHSCHDXC03,Philips,Digital Diagnost,30/10/2020,1:14:36 PM,Chest,Lateral L,125.0,26.0,...,,,confirmed,DX,user,[Image],M,069Y,Unknown,
504,123456789,QHSCHDXC03,Philips,Digital Diagnost,30/10/2020,1:13:05 PM,Chest,AP Landscape,90.0,2.0,...,,,confirmed,DX,user,[Image],M,069Y,Unknown,
505,123456789,QHSCHDXC03,Philips,Digital Diagnost,30/10/2020,2:31:25 PM,Chest,Lateral L,125.0,6.0,...,,,confirmed,DX,user,[Image],M,082Y,Unknown,
506,123456789,QHSCHDXC03,Philips,Digital Diagnost,30/10/2020,2:30:31 PM,Chest,PA,125.0,2.0,...,,,confirmed,DX,user,[Image],M,082Y,Unknown,


Let's match the .csv columns into the template:
- rename the columns of the original .csv file to match the template
- remove any columns we don't need
- rearrange the columns to match the template

In [35]:
df = df.rename(columns={"Study description": "Body Part",
                       "Protocol step name step": "View",
                        "Relative x-ray exposure": "Exposure Index",
                        "Exposure [mAs]": "Exposure (mAs)",
                        "Exposure time [ms]": "Exposure time (ms)",
                        "Status": "Image Status",
                        "Reject reason": "Reject Reason",
                        "Image dose area product [µGy m²]": "KAP (uGy.m2)",
                        "Image date": "Image Date",
                        "Image time": 'Image Time'
                       })

# Keep only the template columns, in the template's order
df_out = df[cols_list_final]
df_out

Unnamed: 0,Asset Number,DeviceID,Manufacturer,Model,Image Date,Image Time,Body Part,View,Exposure Index,KAP (uGy.m2),kVp,Exposure (mAs),Exposure time (ms),Image Status,Reject Reason
0,123456789,QHSCHDXC03,Philips,Digital Diagnost,02/07/2020,11:05:48 AM,Chest,Lateral L,641,104.41,125.0,27.0,53.0,confirmed,
1,123456789,QHSCHDXC03,Philips,Digital Diagnost,02/07/2020,11:05:10 AM,Chest,PA,306,7.48,125.0,2.0,4.0,confirmed,
2,123456789,QHSCHDXC03,Philips,Digital Diagnost,02/07/2020,12:17:00 PM,Abdomen,AP,196,48.55,77.0,8.0,9.0,confirmed,
3,123456789,QHSCHDXC03,Philips,Digital Diagnost,03/07/2020,10:59:09 AM,Chest,AP Landscape,372,18,90.0,2.0,10.0,confirmed,
4,123456789,QHSCHDXC03,Philips,Digital Diagnost,03/07/2020,11:31:36 AM,Chest,Lateral L,358,38.94,125.0,10.0,20.0,confirmed,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
503,123456789,QHSCHDXC03,Philips,Digital Diagnost,30/10/2020,1:14:36 PM,Chest,Lateral L,444,113.7,125.0,26.0,49.0,confirmed,
504,123456789,QHSCHDXC03,Philips,Digital Diagnost,30/10/2020,1:13:05 PM,Chest,AP Landscape,338,9.69,90.0,2.0,10.0,confirmed,
505,123456789,QHSCHDXC03,Philips,Digital Diagnost,30/10/2020,2:31:25 PM,Chest,Lateral L,487,22.95,125.0,6.0,11.0,confirmed,
506,123456789,QHSCHDXC03,Philips,Digital Diagnost,30/10/2020,2:30:31 PM,Chest,PA,382,6.57,125.0,2.0,3.0,confirmed,
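Selecting the template columns raises a `KeyError` if any of them is missing after the rename, for example when an export uses slightly different header text. A small check, sketched below, lists the missing names first. The function `missing_template_columns` and the sample header lists are hypothetical and only illustrate the idea.

```python
def missing_template_columns(columns, template):
    """Return the template column names that are absent from `columns`, in template order."""
    present = set(columns)
    return [name for name in template if name not in present]

# Illustrative (made-up) header lists: the dose column was not renamed
template = ["Asset Number", "Exposure Index", "KAP (uGy.m2)"]
renamed = ["Asset Number", "Exposure Index", "Image dose area product"]
print(missing_template_columns(renamed, template))  # ['KAP (uGy.m2)']

# Possible use in the notebook, before selecting the columns:
# missing = missing_template_columns(df.columns, cols_list_final)
# if missing:
#     raise KeyError(f"Columns missing after rename: {missing}")
```

An empty list means every template column is present and the selection can go ahead.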


Finally, let's export the cleaned-up data to an output file. By default, this creates a new .csv file named "df_out.csv". Change the code below to give it a unique name with a timestamp if preferred.

In [16]:
df_out.to_csv(r'C:\Users\BernardM\JupyterNotebooks\RejectAnalysis\outputdata\df_out.csv',index = False, header = True)
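For a unique, timestamped filename, one option is to build the output path from the DeviceID and the current date and time. The sketch below assumes a naming scheme of `<DeviceID>_<YYYYMMDD>_<HHMMSS>.csv`; both the scheme and the helper `timestamped_output_path` are illustrative choices, not something the Dashboard requires.

```python
import os
from datetime import datetime

def timestamped_output_path(output_dir, device_id, now=None):
    """Build an output path such as <output_dir>/<device_id>_<YYYYMMDD>_<HHMMSS>.csv."""
    # `now` can be passed in explicitly, which makes the function easy to test
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d_%H%M%S")
    return os.path.join(output_dir, f"{device_id}_{stamp}.csv")

# Possible use in the notebook (output_dir is whichever folder you export to):
# out_path = timestamped_output_path(output_dir, DeviceID)
# df_out.to_csv(out_path, index=False, header=True)
```

Including the DeviceID in the name keeps exports from different rooms apart, and the timestamp avoids overwriting an earlier export.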

This output file can now be appended to the Reject Analysis and Dose Metric Dashboard database.