
**Apply Machine Learning Exercise**

In this exercise, you'll be given a real-world situation where a radiologist's worklist needs to be prioritized. The radiologist works in a very busy emergency department (ED) in a major city and often receives hundreds of emergency images to read every day. There is no prioritization among those images: because they all come in through the emergency department, everything is marked as "urgent." Currently, radiologists read these images in a first-in-first-out queue, in the order they arrive.

From a clinical perspective, you know that some urgent cases are truly more urgent than others. From interviewing emergency doctors and radiologists, you have identified that two of the most urgent types of findings on an image are a brain bleed and an aortic dissection. Both of these problems can lead to patient death within minutes, but they can only be detected on imaging, so it is critical these images are read ASAP.

You have used deep learning to create two classification algorithms, one for the detection of brain bleeds on head CT images, and one for the detection of aortic dissection on chest x-ray images. Now, you need to figure out how to integrate these algorithms into the radiologist's workflow so that they are most helpful clinically.

In this exercise you'll be given the following:

1. A list of images that have come in through the ED in order of patient arrival
2. Probabilities of 'brain bleed' for each image, as determined by one of your deep learning algorithms
3. Probabilities of 'aortic dissection' for each image, as determined by one of your deep learning algorithms

You will need to do the following:

1. Calculate the amount of time it will take before each image is read by the radiologist, given the patient arrival queue, assuming each image takes 6 minutes to read
2. Implement a heuristic that uses the probabilities returned by your two algorithms to re-order the priority read list, assuming that brain bleeds and aortic dissections are equally urgent
3. Calculate the time delta for each image between the initial and the re-ordered priority reads


Answer the following questions based on your reprioritization:

1. If your algorithm's goal was to have brain bleeds read 30 minutes faster, how did it do?
2. If your algorithm's goal was to have aortic dissections read 15 minutes faster, how did it do?
3. Were there any cases where your algorithm made it worse for patients who needed an ASAP read? Could anything have been done about this?




In [0]:
import pandas as pd
import numpy as np

In [0]:
!git clone https://github.com/anhle/AI-Healthcare.git

In [0]:
import os
os.chdir('/content/AI-Healthcare/AI_2D/Ex/data')
!pwd

## Worklist prioritization: Emergency Setting

In [0]:
## First, read in the file of the current worklist with the probabilities that your two algorithms have
## generated for the two types of findings you're most concerned with:

worklist = pd.read_csv('probabilities.csv')
worklist.head()

Here, I'm creating a new column to address the first task in the exercise. Every image takes 6 minutes to read and images are read in the order they appear in this list, so the image in position n is read after 6n minutes.

In [0]:
worklist['time_to_read'] = np.arange(6, 6*(len(worklist)+1),6)
worklist.head()
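
As a quick sanity check, the same read-time calculation can be sketched on a toy queue. The three-row frame and the `READ_MINUTES` constant below are made up for illustration and are not part of `probabilities.csv`.

```python
import numpy as np
import pandas as pd

READ_MINUTES = 6  # reading time per image, as stated in the exercise

# Hypothetical three-image queue, listed in arrival order
toy = pd.DataFrame({"Image_Type": ["head_ct", "chest_xray", "head_ct"]})

# The image at 0-based position i is finished after (i + 1) * READ_MINUTES
toy["time_to_read"] = READ_MINUTES * np.arange(1, len(toy) + 1)
print(toy["time_to_read"].tolist())  # [6, 12, 18]
```

This is equivalent to the `np.arange(6, 6*(len(worklist)+1), 6)` call above, just written as a multiple of the position.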

Now, for each image, I want to see whether a brain bleed or an aortic dissection is likely. I'll create a new column holding the larger of the two probabilities.

In [0]:
worklist['max_prob'] = worklist[["Brain_bleed_probability", "Aortic_dissection_probability"]].max(axis=1)
worklist.head()

Great, now I want to re-order my worklist based on probabilities of critical findings:

In [0]:
worklist_prioritized = worklist.sort_values(by=['max_prob'],ascending=False)
worklist_prioritized.head()
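
One detail the sort above leaves open is what happens when two images have the same `max_prob`. A possible refinement, sketched here on made-up data, is to add the original read time as a second sort key so that ties fall back to arrival order. The toy values are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical queue in arrival order; two pairs of images share a max_prob
toy = pd.DataFrame({
    "time_to_read": [6, 12, 18, 24],
    "max_prob": [0.10, 0.80, 0.10, 0.80],
})

# Highest probability first; among equal probabilities, earliest arrival first
toy_prioritized = toy.sort_values(
    by=["max_prob", "time_to_read"], ascending=[False, True]
)
print(toy_prioritized["time_to_read"].tolist())  # [12, 24, 6, 18]
```

With an explicit second key, the result does not depend on the sorting algorithm pandas picks internally.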

In [0]:
worklist_prioritized['time_to_read_prioritized'] = np.arange(6, 6*(len(worklist)+1),6)

In [0]:
worklist_prioritized['time_delta'] = worklist_prioritized['time_to_read'] - worklist_prioritized['time_to_read_prioritized']
worklist_prioritized.head()
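
To make the sign of `time_delta` concrete, here is the whole pipeline on a four-image toy queue. The probabilities are invented for illustration; only the column names mirror the ones used in this notebook. A positive delta means the image is read sooner after re-ordering, a negative delta means it is read later.

```python
import numpy as np
import pandas as pd

READ_MINUTES = 6  # reading time per image, as stated in the exercise

# Hypothetical queue in arrival order (values are made up)
toy = pd.DataFrame({
    "Image_Type": ["chest_xray", "head_ct", "chest_xray", "head_ct"],
    "Brain_bleed_probability": [0.0, 0.2, 0.0, 0.9],
    "Aortic_dissection_probability": [0.1, 0.0, 0.7, 0.0],
})
prob_cols = ["Brain_bleed_probability", "Aortic_dissection_probability"]

toy["time_to_read"] = READ_MINUTES * np.arange(1, len(toy) + 1)
toy["max_prob"] = toy[prob_cols].max(axis=1)

# Re-order by probability, then assign new read times in the new order
ranked = toy.sort_values(by="max_prob", ascending=False).copy()
ranked["time_to_read_prioritized"] = READ_MINUTES * np.arange(1, len(ranked) + 1)
ranked["time_delta"] = ranked["time_to_read"] - ranked["time_to_read_prioritized"]

print(ranked["time_delta"].tolist())  # [18, 6, -6, -18]
print(ranked["time_delta"].sum())     # 0
```

The deltas sum to zero because re-ordering a fixed queue only redistributes the same set of read times: every minute saved for one image is a minute added for another.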

Now, I want to find places where my algorithm saved more than 30 minutes for brain bleeds:

In [0]:
worklist_prioritized[((worklist_prioritized.time_delta>30)&(worklist_prioritized.Image_Type=='head_ct'))]

Looks like there are 14 head CTs that were read more than 30 minutes faster than their original order. All but the last three had a probability of brain bleed < 0.4.

Now I'll do the same analysis for aortic dissections, looking for chest x-rays read at least 15 minutes faster:

In [0]:
worklist_prioritized[((worklist_prioritized.time_delta>=15)&(worklist_prioritized.Image_Type=='chest_xray'))]

In [0]:
len(worklist_prioritized[((worklist_prioritized.time_delta>=15)&(worklist_prioritized.Image_Type=='chest_xray'))])

Looks like there are 28 chest x-rays that were read at least 15 minutes faster than their original order. All but the last nine had a probability of aortic dissection < 0.4.
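
Since many of the sped-up images have low predicted probabilities, it may help to count only the likely positives when judging a goal. The helper below is a hypothetical sketch (`sped_up`, its parameters, and the 0.5 default are my own assumptions), shown on made-up rows standing in for `worklist_prioritized`.

```python
import pandas as pd

def sped_up(df, image_type, prob_col, min_minutes, min_prob=0.5):
    """Rows of one image type that were read at least `min_minutes` sooner
    and whose predicted probability is at least `min_prob`."""
    mask = (
        (df["Image_Type"] == image_type)
        & (df["time_delta"] >= min_minutes)
        & (df[prob_col] >= min_prob)
    )
    return df[mask]

# Made-up rows for illustration only
toy = pd.DataFrame({
    "Image_Type": ["chest_xray", "chest_xray", "head_ct"],
    "Aortic_dissection_probability": [0.9, 0.2, 0.0],
    "time_delta": [24, 18, 36],
})

likely = sped_up(toy, "chest_xray", "Aortic_dissection_probability", 15)
print(len(likely))  # 1
```

On the real data, the same call with `worklist_prioritized` in place of `toy` would separate the images that benefited and were likely positive from those that merely moved up the list.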

Finally, I'll look for any images with a brain bleed or aortic dissection probability of 0.5 or higher that my algorithm caused to be read _slower_.

In [0]:
worklist_prioritized[((worklist_prioritized.time_delta<0)&(worklist_prioritized.max_prob>=0.5))]

Looks like there were two cases where my algorithm caused a likely-critical image to be read later than it would have been in the original arrival order. Given that images with probabilities <0.5 were read faster, it is definitely possible to improve my algorithm by adding some more heuristics.
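
One such heuristic, sketched below under my own assumptions, is to promote only images whose `max_prob` clears a threshold and to keep arrival order within both the promoted and the remaining group. The 0.5 cut-off and the toy probabilities are hypothetical choices for illustration, not values tuned on the real worklist.

```python
import numpy as np
import pandas as pd

READ_MINUTES = 6  # reading time per image, as stated in the exercise
THRESHOLD = 0.5   # hypothetical cut-off for "likely critical"

# Made-up queue in arrival order
toy = pd.DataFrame({"max_prob": [0.6, 0.1, 0.3, 0.9, 0.2]})
toy["time_to_read"] = READ_MINUTES * np.arange(1, len(toy) + 1)

# Flag likely-critical images
toy["flagged"] = toy["max_prob"] >= THRESHOLD

# Flagged images first, then the rest; arrival order is kept inside each group
ranked = toy.sort_values(
    by=["flagged", "time_to_read"], ascending=[False, True]
).copy()
ranked["time_to_read_prioritized"] = READ_MINUTES * np.arange(1, len(ranked) + 1)
ranked["time_delta"] = ranked["time_to_read"] - ranked["time_to_read_prioritized"]

print(ranked["time_delta"].tolist())  # [0, 12, -6, -6, 0]
```

Under this rule a flagged image can only be preceded by flagged images that arrived before it, so it is never read later than in the first-in-first-out queue. The trade-off is that flagged images are no longer ranked by probability among themselves.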