# Scheduling Algorithm

In [None]:
import pandas as pd
import os
from utils import COLLEGES, heuristic_function

## Read Data

In [None]:
data = pd.read_csv('data/filtered-data.csv', low_memory=False)

## Define Scheduling Algorithm

The algorithm below is a priority scheduling algorithm: colleges are ranked in descending order of their value, as computed by the heuristic function defined in [utils.py](utils.py).

In [None]:
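def schedule_by_priority(colleges: list[str]) -> pd.DataFrame:
    """Rank colleges from highest to lowest initial heuristic value."""
    # Pair each college with the first element returned by heuristic_function.
    by_value = [(college, heuristic_function(college, data)[0]) for college in colleges]
    # Highest value first; the sort is stable, so ties keep their input order.
    by_value.sort(key=lambda pair: pair[1], reverse=True)

    # TODO: Add a second sort based on projected value (CollegeEducationMotion)?

    ranked = [college for college, _ in by_value]
    return pd.DataFrame(data=ranked, columns=['College (By Initial Value)'])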
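
The TODO in the function above raises the idea of a secondary sort on projected value. One possible way to do that, sketched here with made-up numbers, is to sort on a tuple key so that ties in the initial value are broken by the projected value. The college names, the scores, and the `rank_with_tiebreak` helper are all hypothetical; whether `heuristic_function` exposes a projected value in this form is an assumption.

```python
# Hypothetical (initial_value, projected_value) pairs. Real values would come
# from heuristic_function; its exact return shape is assumed here.
toy_values = {
    "College A": (0.80, 0.10),
    "College B": (0.95, 0.30),
    "College C": (0.80, 0.55),
}


def rank_with_tiebreak(values: dict[str, tuple[float, float]]) -> list[str]:
    """Rank by initial value, breaking ties by projected value (both descending)."""
    # Tuples compare element by element, so a single sort handles both keys.
    return sorted(values, key=lambda name: values[name], reverse=True)


ranking = rank_with_tiebreak(toy_values)
print(ranking)
```

In this toy data, College A and College C share the same initial value, so the projected value decides which of the two comes first.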

Other algorithms we may want to implement in the future:

* An algorithm that orders colleges by travel distance, nearest first. This would be analogous to a "shortest job first" algorithm.

* Similar to "shortest job first", but tailored to visiting all the colleges in a single trip (without returning home in between) as efficiently as possible. This is essentially the [travelling salesman problem](https://en.wikipedia.org/wiki/Travelling_salesman_problem).
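
Both ideas above can be roughed out as follows. This is only a sketch under assumptions: the coordinates are invented, the helper names (`haversine_km`, `schedule_by_distance`, `nearest_neighbor_tour`) are hypothetical, straight-line (great-circle) distance stands in for real travel distance, and the single-trip version uses the greedy nearest-neighbor heuristic, which is a simple approximation and does not in general produce an optimal travelling salesman tour.

```python
from math import asin, cos, radians, sin, sqrt

Coord = tuple[float, float]  # (latitude, longitude) in degrees


def haversine_km(a: Coord, b: Coord) -> float:
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))  # 6371 km: mean Earth radius


def schedule_by_distance(home: Coord, coords: dict[str, Coord]) -> list[str]:
    """Closest college first, treating each visit as a separate trip from home."""
    return sorted(coords, key=lambda name: haversine_km(home, coords[name]))


def nearest_neighbor_tour(home: Coord, coords: dict[str, Coord]) -> list[str]:
    """Greedy single-trip order: always travel to the closest unvisited college."""
    remaining = dict(coords)  # copy so the caller's dict is not modified
    position, tour = home, []
    while remaining:
        nxt = min(remaining, key=lambda name: haversine_km(position, remaining[name]))
        tour.append(nxt)
        position = remaining.pop(nxt)  # continue from the college just visited
    return tour


# Invented coordinates along the equator, purely for illustration.
home = (0.0, 0.0)
toy_coords = {
    "College A": (0.0, 1.0),
    "College B": (0.0, -2.0),
    "College C": (0.0, 3.0),
}

print(schedule_by_distance(home, toy_coords))
print(nearest_neighbor_tour(home, toy_coords))
```

The two orderings differ on this toy data: measured from home, College B is closer than College C, but once the trip has reached College A, College C is the nearer next stop.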

## Run Scheduling Algorithm

In [None]:
schedule = schedule_by_priority(COLLEGES)
display(schedule)

# Create the output directory if it does not already exist.
os.makedirs('output', exist_ok=True)
schedule.to_csv('output/schedule.csv')