# Welcome

Manually searching for clinical trials is very resource-intensive, and it is a task a computer can automate.

## Problem
Manually checking each patient description and assigning the patient to a clinical trial is very costly in terms of human resources.

## Objective

Create an algorithm that automatically assigns candidate patients to clinical trials.

In [1]:
import pandas as pd

# Data

In [2]:
clinical_trials = pd.read_csv('data/5ksample.csv')
patients = pd.read_csv('patients_15.csv')

In [5]:
clinical_trials.head(2)
clinical_trials.shape

patients.head(2)
patients.shape

| | id | title | summary | gender | min_age | max_age |
|---|---|---|---|---|---|---|
| 0 | NCT01000168 | Treadmill Therapy and Brain Injuries | \n Introduction.\r\n\r\n - There ... | All | 18 Years | |
| 1 | NCT01000194 | Acute Fatty Acid Intervention Study (AFAST) | \n The main objective of this study is to... | Male | 18 Years | 30 Years |


(5000, 6)

| | patient_id | text |
|---|---|---|
| 0 | 1 | \nPatient is a 45-year-old man with a history ... |
| 1 | 2 | \n48 M with a h/o HTN hyperlipidemia, bicuspid... |


(15, 2)

# Let's go easy


Main objective: Match based on **gender** requirements

## Part 1: Divide and conquer

We have a big objective, so let's divide it into smaller sub-objectives that are easier to complete.

<br>

**Sub-objectives**:
- **A**: Detect gender from patient description
- **B**: Go through each trial and assign patients based on their gender

In [79]:
def naive_man_detector(text):
    """Return True if the start of `text` contains a standalone male reference."""
    possible_male_references = ['man', 'male', 'm']

    # convert everything to lower case
    text = text.lower()  # possible exercise!!

    # usually gender is in the first sentence,
    # so let's pick the first 100 characters to find the gender
    first_part = text[:100]

    # naive tokenization: a token with punctuation attached (e.g. 'man,') will not match
    words = first_part.split(" ")

    for word in words:
        if word in possible_male_references:
            return True
    return False

In [80]:
patients['is_male'] = patients.text.apply(naive_man_detector)
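
The detector above only matches tokens separated by single spaces, so a reference such as `man,` or `male.` would be missed. Below is a minimal sketch of a slightly more tolerant variant, assuming a regular expression with word boundaries is acceptable here. The name `regex_man_detector` and the `window` parameter are hypothetical and not part of the original notebook.

```python
import re

# Same three references as the naive detector, matched as whole words only.
# \b word boundaries keep 'woman' and 'female' from matching 'man' / 'male'.
MALE_PATTERN = re.compile(r"\b(man|male|m)\b")


def regex_man_detector(text, window=100):
    """Hypothetical variant: True if a male reference appears in the first `window` characters."""
    first_part = text.lower()[:window]
    return MALE_PATTERN.search(first_part) is not None


# Made-up example sentences, for illustration only
print(regex_man_detector("Patient is a 45-year-old man, with a history of ..."))
print(regex_man_detector("A 32 yo woman presents with ..."))
```

This is still naive: a standalone `m` used as a unit or an abbreviation would also count as a male reference.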

Now that we can classify each patient by gender, let's assign patients to clinical trials.

In [91]:
# Peek at the first row to see what itertuples() yields (a namedtuple per row)
next(patients.itertuples(index=False))

Pandas(patient_id=1, text='\nPatient is a 45-year-old man with a history of anaplastic astrocytoma of the spine complicated by severe lower extremity weakness and urinary retention s/p Foley catheter, high-dose steroids, hypertension, and chronic pain. The tumor is located in the T-L spine, unresectable anaplastic astrocytoma s/p radiation. Complicated by progressive lower extremity weakness and urinary retention. Patient initially presented with RLE weakness where his right knee gave out with difficulty walking and right anterior thigh numbness. MRI showed a spinal cord conus mass which was biopsied and found to be anaplastic astrocytoma. Therapy included field radiation t10-l1 followed by 11 cycles of temozolomide 7 days on and 7 days off. This was followed by CPT-11 Weekly x4 with Avastin Q2 weeks/ 2 weeks rest and repeat cycle. \n', is_male=True)

In [97]:
patient2trials = {}

for patient in patients.itertuples(index=False):

    patient_id = patient.patient_id
    gender = 'male' if patient.is_male else 'female'

    # a trial accepts the patient if it is open to all genders or to the patient's gender
    accepted_genders = ['All', 'Male'] if patient.is_male else ['All', 'Female']

    patient2trials[patient_id] = []
    for trial in clinical_trials.itertuples(index=False):
        if trial.gender in accepted_genders:
            patient2trials[patient_id].append(trial)

    print(f'Patient {patient_id} is believed to be a {gender} and was attributed to {len(patient2trials[patient_id])} trials!')

Patient 1 is believed to be a male and was attributed to 4543 trials!
Patient 2 is believed to be a male and was attributed to 4543 trials!
Patient 3 is believed to be a female and was attributed to 4772 trials!
Patient 4 is believed to be a female and was attributed to 4772 trials!
Patient 5 is believed to be a female and was attributed to 4772 trials!
Patient 6 is believed to be a female and was attributed to 4772 trials!
Patient 7 is believed to be a male and was attributed to 4543 trials!
Patient 8 is believed to be a female and was attributed to 4772 trials!
Patient 9 is believed to be a male and was attributed to 4543 trials!
Patient 10 is believed to be a female and was attributed to 4772 trials!
Patient 11 is believed to be a male and was attributed to 4543 trials!
Patient 12 is believed to be a female and was attributed to 4772 trials!
Patient 13 is believed to be a male and was attributed to 4543 trials!
Patient 14 is believed to be a female and was attributed to 4772 trials!
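
The nested loop checks every patient against every trial one row at a time. As a possible alternative, the same gender filter can be sketched with pandas boolean indexing via `Series.isin`. The toy DataFrames, the helper `eligible_trials`, and all values below are made up for illustration and are not the notebook's data.

```python
import pandas as pd

# Toy stand-ins for clinical_trials and patients (illustrative values only)
toy_trials = pd.DataFrame({
    "id": ["T1", "T2", "T3", "T4"],
    "gender": ["All", "Male", "Female", "All"],
})
toy_patients = pd.DataFrame({
    "patient_id": [1, 2],
    "is_male": [True, False],
})


def eligible_trials(trials, is_male):
    """Hypothetical helper: keep trials open to all genders or to the patient's gender."""
    accepted = ["All", "Male"] if is_male else ["All", "Female"]
    return trials[trials["gender"].isin(accepted)]


# Map each patient id to the list of matching trial ids
toy_patient2trials = {
    row.patient_id: eligible_trials(toy_trials, row.is_male)["id"].tolist()
    for row in toy_patients.itertuples(index=False)
}

print(toy_patient2trials)
```

Because the filter depends only on gender, it could also be computed once per gender and reused for every patient.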