# Interview Scheduler

## Assumptions
The scheduler assumes you don't care about the order of your interviews, i.e. you have no preference such as "I would like to go to the UF interview before the FSU interview" when both are chosen. In practice, if you are going to both and one is on, say, 11/12 and the other on 11/13, the scheduler randomly picks which school gets which day. This probably works for virtual interviews, but it's not very realistic for in-person interviews.

## How to Use this Notebook

* First, in the menu at the top of Google Colab, select "Runtime". Select "Change runtime type" and under "Hardware accelerator", select "GPU".

* Each cell containing code needs to be run individually. Next to each cell, there is a play button visible when you hover over the cell. Follow the instructions in the first several blocks, then run the code sequentially, clicking the play button in each cell. Make sure you run the cell immediately beneath these instructions. 

* You have two options for importing a file: from your Google Drive or from your computer. Follow the instructions under "Import Files from Google Drive" or "Import Files from Computer". After that, you'll modify the cells under "User Modified Values". Be sure to use the example formatting if you have blackout dates or selected schools.

In [None]:
!git clone https://github.com/abaily/med_school_scheduling.git
%cd med_school_scheduling/


# Import Files from Google Drive

After you run the following cell, you'll click on a link that redirects you to sign in to your Google account. After that, you'll get an authorization code that you paste into this notebook.

## Uploading the csv

On the left-hand side of Google Colab, click the Files icon (it looks like a folder). Click 'drive', then 'MyDrive', then right-click the csv you want to use and select 'Copy path'.

In [None]:
from google.colab import drive
drive.mount('/content/drive')

# Import Files from Computer

Run the following cell, click the "Choose Files" button, and upload the csv from wherever it is stored on your computer.

In [None]:
from google.colab import files
uploaded = files.upload()
filename = list(uploaded.keys())[0]

# User Modified Values

In [None]:
#Blackout dates - dates you would NOT like to do interviews on for whatever reason (holiday, vacation), in format ['11/12', '11/13', '11/14']
blackout_dates = []
#Selected schools - Schools you've already selected, with an interview date, in the format {'10/25': ['University of Florida'], '10/26': ['Florida State']}
#NOTE# the date and school name must be EXACTLY as they appear in the csv
selected_schools = {}
#If you imported from your computer, ignore this. If you uploaded from google drive, inside the quotes, paste the path of the file you copied above.
file_path = 'COPY_FILE_PATH_HERE' 
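
Because the dates and school names must match the csv exactly, a quick check before running the scheduler can catch typos such as '10:26' in place of '10/26'. The sketch below is not part of the repository: `check_user_values` is a hypothetical helper, and `known_dates` and `known_schools` are stand-ins for whatever dates and names your csv actually contains.

```python
def check_user_values(blackout_dates, selected_schools, known_dates, known_schools):
    """Return a list of human-readable problems; an empty list means none were found.

    Hypothetical helper for illustration, not part of utils.py in the repository.
    """
    problems = []
    for date in blackout_dates:
        if date not in known_dates:
            problems.append(f"blackout date {date!r} is not a date in the csv")
    for date, schools in selected_schools.items():
        if date not in known_dates:
            problems.append(f"selected date {date!r} is not a date in the csv")
        if date in blackout_dates:
            problems.append(f"selected date {date!r} is also a blackout date")
        for school in schools:
            if school not in known_schools:
                problems.append(f"school {school!r} is not a school in the csv")
    return problems


# Toy values for illustration only; note the '10:26' typo in selected_schools
example_problems = check_user_values(
    blackout_dates=['11/12'],
    selected_schools={'10/25': ['University of Florida'], '10:26': ['Florida State']},
    known_dates=['10/25', '10/26', '11/12'],
    known_schools=['University of Florida', 'Florida State'],
)
for problem in example_problems:
    print(problem)
```

With these toy values, the only problem reported is the mistyped '10:26' key.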

In [None]:
import utils
import collections
import numpy as np
import random as rd 

"""modify EXTRA_OUTPUT to 1 if you want to see more stuff printed
"""
EXTRA_OUTPUT = 0
"""Modify this number to run for longer/shorter. Depending how complex the problem is (how many schools you were accepted to interview for, and how many interview
days each school has) you may need to run longer. For 95% of situations, this number is more than large enough"""
GIANT_NUMBER = 10000

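# `filename` only exists if the "Import Files from Computer" cell was run;
# checking for it avoids a NameError when the csv came from Google Drive
if 'filename' in globals():
    file_path = filename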

school_array = np.genfromtxt(file_path, delimiter=',',names=True,dtype=None, encoding=None)

school_name_list, date_list, prio_list = utils.parse_school_array(school_array)
school_choice_dict = {name:0 for name in school_name_list}
unique_date_list = np.sort(utils.get_unique_date_list(date_list))

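# Drop blackout dates before building match_dict so they can never be scheduled
unique_date_list = [date for date in unique_date_list if date not in blackout_dates]

match_dict = {}
for date in unique_date_list:
    match_dict[date] = [row[0] for row in school_array if date in row]
# Dates you've already committed to override whatever the csv lists for them
match_dict.update(selected_schools)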

In [None]:
if EXTRA_OUTPUT:
    for key in match_dict:
        print(f'Date: {key} Schools: {match_dict[key]}')

## Here comes the limit of my technical competence. This scheduling problem was doable if I wanted to look into the CPLEX modules for Python, but uh. We didn't do that. Lots of projects going on and I busted this out in a few hours, so don't hate me or judge me if this gets shared lul. The following cell might take a while to run. If you're on Google Colab, I'd recommend going to "Runtime", then "Change runtime type", and making sure "GPU" is selected.

In [None]:
school_value = 0
print(f'Starting School Value: {school_value}')
# Overwrite the csv priorities with equal weights so every school counts the same
prio_list = np.ones(len(prio_list))

optimal_dict = {}

for i in range(GIANT_NUMBER): #lul don't judge me on how hacky this is

    school_choice_dict.update((k,0) for k in school_choice_dict)
    chosen_list = []
    comparison_dict = {}
    keys = list(match_dict.keys())
    for key in rd.sample(keys,len(keys)):
        reference_row = [value for value in match_dict[key] if value not in chosen_list]
        if reference_row:
            # Pick one of the still-unscheduled schools for this date at random
            choice = rd.choice(reference_row)
            comparison_dict[key] = choice
            chosen_list.append(choice)

    for school in list(school_choice_dict.keys()):
        if school in chosen_list:
            school_choice_dict[school] = 1
    
    iter_value = utils.get_schedule_value(prio_list, school_choice_dict)
    if iter_value > school_value: 
        print(f'School Value for iteration: {i} improved to {iter_value}')
        school_value = iter_value
        optimal_dict = collections.OrderedDict(sorted(comparison_dict.items()))
        if EXTRA_OUTPUT:
            for value in optimal_dict:
                print(f'Date: {value} School: {optimal_dict[value]}')

if EXTRA_OUTPUT:
    print("Dates Not Used: ")
    print([value for value in unique_date_list if value not in list(optimal_dict.keys())])
    print("Schools Not Used:")
    print([value for value in school_name_list if value not in list(optimal_dict.values())])

# Final Schedule from Algorithm
Note: This schedule maximizes the number of interviews found within a fixed number of iterations, but it may not be the best schedule for you. Run the notebook a few times. You will likely see the same number of interviews, but maybe in a different order. You can rerun it until you get an order that you like, or accept whatever order it outputs.
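
Since every school is weighted equally in the search cell, maximizing the number of interviews amounts to a maximum bipartite matching between dates and schools, which can be solved exactly without random restarts. The sketch below is an alternative technique (Kuhn's augmenting-path algorithm), not the method this notebook uses. `max_interviews` is a hypothetical function name, and it only assumes a dict shaped like `match_dict` (date mapped to a list of schools).

```python
def max_interviews(match_dict):
    """Give each date at most one school and each school at most one date,
    maximizing the number of scheduled interviews.

    Uses Kuhn's augmenting-path algorithm for bipartite matching.
    match_dict maps date -> list of schools interviewing on that date.
    Returns a dict mapping date -> school.
    """
    date_for_school = {}  # school -> date currently assigned to it

    def try_assign(date, visited):
        # Try to find a school for `date`, moving earlier dates to
        # other schools when that frees one up.
        for school in match_dict[date]:
            if school in visited:
                continue
            visited.add(school)
            holder = date_for_school.get(school)
            if holder is None or try_assign(holder, visited):
                date_for_school[school] = date
                return True
        return False

    for date in match_dict:
        try_assign(date, set())

    # Flip to date -> school, ordered by date string
    pairs = sorted((date, school) for school, date in date_for_school.items())
    return dict(pairs)


# Toy data for illustration only
toy_match_dict = {
    '10/25': ['School A', 'School B'],
    '10/26': ['School A'],
    '10/27': ['School B', 'School C'],
}
schedule = max_interviews(toy_match_dict)
for date, school in schedule.items():
    print(f'Date For Interview: {date}: School For That Date: {school}')
```

In this toy case all three interviews fit, which a greedy first-come assignment would still manage here but is not guaranteed to in general. Unlike the random search, this always returns the same schedule for the same input, so it won't offer you alternative orderings to choose from.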

In [None]:
print(f'Number of interviews able to schedule: {len(optimal_dict)}')
for item in optimal_dict:
    print(f'Date For Interview: {item}: School For That Date: {optimal_dict[item]}')

if EXTRA_OUTPUT:
    print("Dates Not Used: ")
    print([value for value in unique_date_list if value not in list(optimal_dict.keys())])
    print("Schools Not Used:")
    print([value for value in school_name_list if value not in list(optimal_dict.values())])