**Greedy Algorithms for Minimizing the Weighted Sum of Completion Times**

The input file describes a set of jobs with positive integer weights and lengths. It has the format

[number_of_jobs]

[job_1_weight] [job_1_length]

[job_2_weight] [job_2_length]

...

For example, the third line of the file is "74 59", indicating that the second job has weight 74 and length 59.

You should NOT assume that job weights or lengths are distinct.
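As a minimal sketch of the format, the snippet below parses the same layout from an in-memory string rather than a file. The helper name `parse_jobs` is hypothetical, and the six jobs are the ones printed in the test output later in this notebook; the full input file is not reproduced here.

```python
def parse_jobs(text):
    """Parse a job-count header followed by 'weight length' lines (hypothetical helper)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]  # ignore blank lines
    n = int(lines[0])                                       # header: number of jobs
    jobs = [tuple(int(tok) for tok in ln.split()) for ln in lines[1:]]
    if len(jobs) != n:
        raise ValueError("header says %d jobs, found %d" % (n, len(jobs)))
    return jobs

sample = """6
8 50
74 59
31 73
45 79
24 10
41 66
"""

jobs = parse_jobs(sample)
print(jobs[1])  # second job, taken from the third line: (74, 59)
```

Checking the header against the number of parsed lines is an extra safeguard in this sketch, not something the assignment asks for.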

In [1]:
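def load_file(filename):
    """Read jobs as [weight, length] pairs, skipping the job-count header."""
    data = []
    with open(filename, 'r') as f:
        for line in f:
            fields = line.split()
            # The header has one field and blank lines have none; jobs have two.
            if len(fields) == 2:
                data.append([int(i) for i in fields])

    if DEBUG > 1:
        print(data)

    return data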

**Problem 1**
Your task in this problem is to run the greedy algorithm that schedules jobs in decreasing order of the difference (weight - length). Recall from lecture that this algorithm is not always optimal. IMPORTANT: if two jobs have equal difference (weight - length), you should schedule the job with the higher weight first. Beware: if you break ties in a different way, you are likely to get the wrong answer. You should report the sum of weighted completion times of the resulting schedule (a positive integer) in the box below.

ADVICE: If you get the wrong answer, try out some small test cases to debug your algorithm (and post your test cases to the discussion forum).

In [2]:
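def schedule_diff(datalist):
    # Sort by (weight - length), then by weight to break ties; both descending.
    datasorted = sorted(datalist, key=lambda x: (x[0] - x[1], x[0]),
                        reverse=True)
    if DEBUG > 1:
        print(datasorted)

    # Accumulate weight * completion time over the schedule.
    complete = 0
    total = 0
    for w, l in datasorted:
        complete += l
        total += w * complete
    return total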
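
Following the advice about small test cases, two hand-checkable instances can be sketched as below. The job values and the helper names `cost` and `by_difference` are made up for illustration and are not part of the assignment. The first pair shows the difference rule losing to another order; the second shows that the order of tied jobs changes the reported total.

```python
def cost(order):
    """Sum of weight * completion time for jobs run in the given order."""
    finish, total = 0, 0
    for w, l in order:
        finish += l
        total += w * finish
    return total

def by_difference(jobs):
    """Decreasing (weight - length); ties go to the higher weight."""
    return sorted(jobs, key=lambda j: (j[0] - j[1], j[0]), reverse=True)

# Case 1: differences are -2 and -1, so the rule runs (1, 2) first.
a, b = (3, 5), (1, 2)
print(cost(by_difference([a, b])))  # 1*2 + 3*7 = 23
print(cost([a, b]))                 # 3*5 + 1*7 = 22, so 23 is not optimal

# Case 2: both jobs have difference 2, so only the tie-break decides.
c, d = (5, 3), (4, 2)
print(cost(by_difference([c, d])))  # higher weight first: 5*3 + 4*5 = 35
print(cost([d, c]))                 # other order:         4*2 + 5*5 = 33
```

In the second case the prescribed tie-break happens to give the larger total, which is consistent with the rule being a convention for matching the expected answer rather than a source of optimality.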

**Problem 2**
Your task now is to run the greedy algorithm that schedules jobs in decreasing order of the ratio (weight/length), which yields an optimal schedule. In this algorithm, it does not matter how you break ties. You should report the sum of weighted completion times of the resulting schedule (a positive integer) in the box below.
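
The tie-breaking remark follows from the usual exchange argument, sketched here with notation that is not in the original: $w$ for weight, $\ell$ for length, and $t$ for the time at which the pair starts. Take two jobs $i$ and $j$ scheduled back to back. Swapping them leaves every other completion time unchanged, so only their own terms differ:

$$
\begin{aligned}
\text{cost}(i \text{ then } j) &= w_i\,(t + \ell_i) + w_j\,(t + \ell_i + \ell_j),\\
\text{cost}(j \text{ then } i) &= w_j\,(t + \ell_j) + w_i\,(t + \ell_j + \ell_i),\\
\text{cost}(i \text{ then } j) - \text{cost}(j \text{ then } i) &= w_j\,\ell_i - w_i\,\ell_j .
\end{aligned}
$$

The difference is at most zero exactly when $w_i/\ell_i \ge w_j/\ell_j$, so running the higher-ratio job first never hurts. It is zero when the two ratios are equal, which is why the order of tied jobs does not affect the total.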

In [3]:
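def schedule_ratio(datalist):
    # Sort by weight / length, descending; float() avoids integer division.
    datasorted = sorted(datalist, key=lambda x: x[0] / float(x[1]),
                        reverse=True)
    if DEBUG > 1:
        print(datasorted)

    # Accumulate weight * completion time over the schedule.
    total, complete = 0, 0
    for w, l in datasorted:
        complete += l
        total += w * complete
    return total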
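
One way to check the optimality claim on a small instance is brute force. The sketch below assumes the six jobs shown in the test output of this notebook, tries all 720 orderings, and compares the minimum with the ratio schedule. The names `cost` and `by_ratio` are hypothetical, and `fractions.Fraction` is swapped in for the float key so that ratios are compared exactly; the assignment itself does not require that.

```python
from fractions import Fraction
from itertools import permutations

# The six [weight, length] pairs from the notebook's test output.
JOBS = [(8, 50), (74, 59), (31, 73), (45, 79), (24, 10), (41, 66)]

def cost(order):
    """Sum of weight * completion time for jobs run in the given order."""
    finish, total = 0, 0
    for w, l in order:
        finish += l
        total += w * finish
    return total

def by_ratio(jobs):
    """Decreasing weight / length, compared as exact fractions."""
    return sorted(jobs, key=lambda j: Fraction(j[0], j[1]), reverse=True)

# Exhaustive search is only feasible for tiny inputs (6! = 720 orders here).
best = min(cost(p) for p in permutations(JOBS))

print(cost(by_ratio(JOBS)))          # 32104
print(cost(by_ratio(JOBS)) == best)  # True: no ordering does better
```

Exhaustive search grows factorially, so this is only a sanity check for toy inputs and not a way to validate the full job list.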

In [4]:
# test case
DEBUG = 2
test_data = load_file('./test.txt')

# Difference Method: 32780
assert schedule_diff(test_data) == 32780, "schedule_diff does not pass the test!"
print("schedule_diff passes the test!")

# Ratio Method: 32104
assert schedule_ratio(test_data) == 32104, "schedule_ratio does not pass the test!"
print("schedule_ratio passes the test!")

[[8, 50], [74, 59], [31, 73], [45, 79], [24, 10], [41, 66]]
[[74, 59], [24, 10], [41, 66], [45, 79], [31, 73], [8, 50]]
schedule_diff passes the test!
[[24, 10], [74, 59], [41, 66], [45, 79], [31, 73], [8, 50]]
schedule_ratio passes the test!


In [None]:
DEBUG = 0

data = load_file('./jobs.txt')
print("Weighted completion time for diff: {0}".format(schedule_diff(data)))
print("Weighted completion time for ratio: {0}".format(schedule_ratio(data)))