# EN.580.637 Homework

Goal: The goal of the project is to write a piece of software (possible languages are
Python, Julia, and MATLAB) that matches N patients with K doctors. Each patient is allowed to
provide a ranked list of their preferences for doctors; however, doctors are prohibited from
expressing preferences for patients. Thus the code should take in the following:

● A list of ranked preferences, 1 list for each patient

● A maximum capacity for each doctor (can initially assume the same capacity; note that the
total capacity should exceed the number of patients)

And the code should return:

● A list of assignments indicating which doctors are to take care of which patients

Details: For this assignment, please work in groups of at most 3 individuals. Teams can choose
to implement a classical algorithm, such as the Hungarian algorithm; however, other algorithms
are also acceptable, including any of a number of auction or transport optimization algorithms in
the literature. The submission should include:

1. A GitHub repository housing all the code, synced amongst the group
2. Commented and documented code, including references and an explanation of the
algorithm implemented
3. A functioning demo script (can be a Jupyter notebook)

## Import Packages

In [10]:
import numpy as np
from scipy.optimize import linear_sum_assignment

## Load Example

Here is an example:
* Assume each patient gives a unique rank to each doctor
* doctors_capacity holds each doctor's maximum number of patients
* preference[i][j] is the rank that patient i gives to doctor j
* 0 is the most preferred rank; len(doctors_capacity)-1 is the least preferred

In [12]:
doctors_capacity = [2, 3, 1, 5]
doc_name2id = {0:"A", 1:"B", 2:"C", 3:"D"}  # maps doctor index -> doctor name
# Ranks run from 0 (most preferred) to 3 (least preferred)
preference = np.array([[0, 1, 2, 3], [0, 3, 1, 2], [1, 3, 2, 0], [3, 1, 0, 2], [3, 2, 0, 1]])
print(preference)

[[0 1 2 3]
 [0 3 1 2]
 [1 3 2 0]
 [3 1 0 2]
 [3 2 0 1]]
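
The assignment statement asks for one ranked list per patient, whereas the matrix above stores, for each patient, the rank given to every doctor. Below is a minimal sketch of one way to convert between the two, assuming each ranked list names every doctor exactly once. The helper name `rankings_to_rank_matrix` is hypothetical and not part of the notebook.

```python
def rankings_to_rank_matrix(rankings, num_doctors):
    """Convert ranked lists of doctor IDs into a patient-by-doctor rank matrix.

    rankings[i] lists doctor IDs from most to least preferred for patient i.
    The returned matrix[i][j] is the rank patient i gives doctor j (0 = best).
    """
    matrix = []
    for ranked in rankings:
        # Assumption: every doctor appears exactly once in each ranking.
        if sorted(ranked) != list(range(num_doctors)):
            raise ValueError("each ranking must list every doctor exactly once")
        row = [0] * num_doctors
        for rank, doctor_id in enumerate(ranked):
            row[doctor_id] = rank
        matrix.append(row)
    return matrix


# A patient who prefers doctor 0, then 2, then 3, then 1.
example = rankings_to_rank_matrix([[0, 2, 3, 1]], num_doctors=4)
print(example)  # [[0, 3, 1, 2]]
```

The resulting row can be used directly as one row of a cost matrix, because a lower rank means a more preferred doctor.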


## Print preference

In [13]:
# preference[i][j] is a rank, so sort the doctor indices by rank to get each patient's ordering
for i in range(len(preference)):
    ranked_doctors = np.argsort(preference[i])
    pref_name = [doc_name2id[int(j)] for j in ranked_doctors]
    pref_name_list = '>'.join(pref_name)
    print("For patient %d, his or her preference is %s" % (i, pref_name_list))

For patient 0, his or her preference is A>B>C>D
For patient 1, his or her preference is A>C>D>B
For patient 2, his or her preference is D>A>C>B
For patient 3, his or her preference is C>B>D>A
For patient 4, his or her preference is C>D>B>A


## Replicate Columns by Doctors' Capacities

Our program uses the Hungarian algorithm to compute the matching. The Hungarian algorithm solves a one-to-one assignment problem, so it cannot directly handle doctors who can take more than one patient. To work around this, we replicate each doctor's column once per unit of capacity. For instance, a doctor with capacity 4 is represented by 4 identical columns.

In [14]:
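# Duplicate each doctor's column so that the doctor occupies `capacity` adjacent columns
cur_index = 0
for i in range(len(doctors_capacity)):
    capacity = doctors_capacity[i]
    # Insert (capacity - 1) extra copies of the column at position cur_index
    preference = np.insert(preference,
                           (capacity - 1) * [cur_index],
                           preference[:, [cur_index]],
                           axis=1)
    cur_index += capacity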


In [15]:
print(preference)

[[0 0 1 1 1 2 3 3 3 3 3]
 [0 0 3 3 3 1 2 2 2 2 2]
 [1 1 3 3 3 2 0 0 0 0 0]
 [3 3 1 1 1 0 2 2 2 2 2]
 [3 3 2 2 2 0 1 1 1 1 1]]
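
The same expansion can be written more compactly. The sketch below is an alternative, not the notebook's method: it uses `np.repeat` to duplicate each doctor's column according to capacity, and it first checks that the total capacity covers all patients. The name `expand_by_capacity` is hypothetical.

```python
import numpy as np


def expand_by_capacity(rank_matrix, capacities):
    """Repeat column j of rank_matrix capacities[j] times."""
    rank_matrix = np.asarray(rank_matrix)
    # A complete matching is impossible if there are more patients than slots.
    if sum(capacities) < rank_matrix.shape[0]:
        raise ValueError("total capacity is smaller than the number of patients")
    return np.repeat(rank_matrix, capacities, axis=1)


ranks = np.array([[0, 1, 2, 3],
                  [0, 3, 1, 2]])
expanded = expand_by_capacity(ranks, [2, 3, 1, 5])
print(expanded.shape)  # (2, 11)
print(expanded[0])     # [0 0 1 1 1 2 3 3 3 3 3]
```

Unlike the loop above, this version returns a new array and leaves the original rank matrix unchanged, which makes it easier to re-run cells in any order.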


## Run Hungarian algorithm

* Here we run the Hungarian algorithm implemented in HungarianAlgorithm.py

In [22]:
from HungarianAlgorithm import hungarian_algorithm

In [23]:
result = hungarian_algorithm(preference)
# Sort the (patient, column) pairs by patient index, the first element of each tuple
sorted_by_second = sorted(result, key=lambda tup: tup[0])
print(sorted_by_second)

[(0, 1), (1, 0), (2, 6), (3, 2), (4, 5)]


In [24]:
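# Cross-check with SciPy's solver: col_ind[i] is the expanded column assigned to patient row_ind[i]
row_ind, col_ind = linear_sum_assignment(preference)
print(row_ind)
print(col_ind)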

[0 1 2 3 4]
[0 1 6 2 5]


In [25]:
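# Cumulative capacities: doctor j owns the expanded columns below doctors_range[j]
# that are not already owned by an earlier doctor
doctors_range = [sum(doctors_capacity[: i]) for i in range(1, len(doctors_capacity) + 1)]
print(doctors_range)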

[2, 5, 6, 11]
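
Each expanded column index can be mapped back to a doctor by counting how many cumulative boundaries are less than or equal to the index. The sketch below does this with the standard-library `bisect_right`, assuming the boundaries are the cumulative capacities printed above. The helper name `column_to_doctor` is hypothetical.

```python
from bisect import bisect_right


def column_to_doctor(column_idx, boundaries):
    """Return the index of the doctor that owns an expanded column.

    boundaries[j] is the total capacity of doctors 0..j, so doctor j owns
    the columns from boundaries[j-1] up to boundaries[j] - 1.
    """
    return bisect_right(boundaries, column_idx)


boundaries = [2, 5, 6, 11]
doctors = [column_to_doctor(c, boundaries) for c in [0, 1, 6, 2, 5]]
print(doctors)  # [0, 0, 3, 1, 2]
```

With the doctor names A to D, these indices correspond to A, A, D, B, C.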


In [26]:
# Map each patient's expanded column back to the doctor that owns it
result = {}
for patient_idx, doctor_idx in sorted_by_second:
    j = 0
    while j < len(doctors_range) and doctor_idx >= doctors_range[j]:
        j += 1
    result[int(patient_idx)] = doc_name2id.get(j)
print(result)

{0: 'A', 1: 'A', 2: 'D', 3: 'B', 4: 'C'}


## Print Result

In [20]:
for patient_idx in range(len(col_ind)):
    doctor_idx = col_ind[patient_idx]
    j = 0
    while j < len(doctors_range) and doctor_idx >= doctors_range[j]:
        j += 1
    print("patient {} is taken care of by doctor {}".format(patient_idx, doc_name2id.get(j)))

patient 0 is taken care of by doctor A
patient 1 is taken care of by doctor A
patient 2 is taken care of by doctor D
patient 3 is taken care of by doctor B
patient 4 is taken care of by doctor C


In [33]:
# Group patients by their assigned doctor
doctors_to_patients = {}
for patient, doctor in result.items():
    doctors_to_patients.setdefault(doctor, []).append(patient)

for doctor, patients in doctors_to_patients.items():
    print("doctor {} takes care of patients {}".format(doctor, patients))

doctor A takes care of patients [0, 1]
doctor D takes care of patients [2]
doctor B takes care of patients [3]
doctor C takes care of patients [4]
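
A quick sanity check can confirm that a matching respects every doctor's capacity and report how well patients did overall. The sketch below assumes the rank-matrix convention used in this notebook (entry `[i][j]` is the rank patient `i` gives doctor `j`, with 0 best) and takes the assignment as a mapping from patient index to doctor index. `check_matching` is a hypothetical helper, not part of the notebook.

```python
from collections import Counter


def check_matching(assignment, rank_matrix, capacities):
    """Return the total rank cost of a matching; raise if a doctor is over capacity.

    assignment maps patient index -> doctor index.
    """
    load = Counter(assignment.values())
    for doctor, count in load.items():
        if count > capacities[doctor]:
            raise ValueError("doctor %d exceeds capacity" % doctor)
    # Sum the rank each patient gave to the doctor they were assigned.
    return sum(rank_matrix[p][d] for p, d in assignment.items())


rank_matrix = [[0, 1, 2, 3],
               [0, 3, 1, 2],
               [1, 3, 2, 0],
               [3, 1, 0, 2],
               [3, 2, 0, 1]]
capacities = [2, 3, 1, 5]
# The matching printed above, written with doctor indices (A=0, B=1, C=2, D=3).
assignment = {0: 0, 1: 0, 2: 3, 3: 1, 4: 2}
total_cost = check_matching(assignment, rank_matrix, capacities)
print(total_cost)  # 1
```

A total of 1 means four patients received their first choice and one received their second, which is the best possible here because two patients rank doctor C first and C has capacity 1.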


In [30]:
doc_name2id.values()

dict_values(['A', 'B', 'C', 'D'])