# Smallest Difference in Sums of Subsets

Given an array of positive integers, divide the array into two subsets such that the difference between the sums of the subsets is as small as possible.

For example, given `[5, 10, 15, 20, 25]`, return the sets `{10, 25}` and `{5, 15, 20}`, whose sums (35 and 40) differ by 5. That is the smallest possible difference: every element is a multiple of 5 and the total, 75, is an odd multiple of 5, so the two sums cannot be equal and must differ by at least 5.

In [85]:
''' Plan... 
- Sort the array
- Distribute the largest numbers into two different arrays..
- At each juncture, calculate the difference in sums...
- Then continue to distribute numbers and compare the differences..?
'''

def small_differences(arr): 
    
    sub1 = []
    sub2 = [] 
    arr.sort() 
    diff_record = 1000 # some large number
    half_floor_range = len(arr) // 2  # floor division already returns an int
    
    for i in range(half_floor_range):
        
        # first check that we can distribute at least two
        if len(arr) <= 1: 
            break
        
        sub1.append(arr.pop()) # distributing largest
        sub2.append(arr.pop()) # distributing second largest
        diff = sum(sub1) - sum(sub2)
        
        # the moment we get worse in the difference, exit 
        if diff <= diff_record: 
            diff_record = diff
        elif diff > diff_record: 
            break
    
    return diff, sub1, sub2

In [86]:
testARR = [5, 10, 15, 20, 25]

In [87]:
small_differences(testARR)

(10, [25, 15], [20, 10])

### Ok, so one bug is that we didn't distribute the rest of the array

- It probably makes more sense to keep the original `arr` as one subset and move items out of it into the other.
- On each pass, pop the second-largest remaining item.

In [88]:
testARR1 = [5, 10, 15, 20, 25]

In [94]:
def small_diff(arr): 
    
    sub1 = []
    arr.sort() 
    diff_record = 1000 # some large number
    half_floor_range = len(arr) // 2  # floor division already returns an int
    
    for i in range(half_floor_range):
        
        # first check that we can distribute at least two
        if len(arr) <= 1: 
            break
        
        # distribute second largest with pop
        # keep the rest of the arr in array
        sub1.append(arr.pop(len(arr)-2))
        diff = abs(sum(arr) - sum(sub1)) 
        
        # the moment we get worse in the difference, exit 
        if diff <= diff_record: 
            diff_record = diff
        elif diff > diff_record: 
            break
    
    return diff, sub1, arr

In [95]:
testARR1 = [5, 10, 15, 20, 25]
small_diff(testARR1) # nice this solves it, but i wonder if we're just getting lucky

(5, [20, 15], [5, 10, 25])

In [96]:
testARR2 = [15,20,33,44,55,66]
# note: 31 is not the optimum here; [15, 33, 66] vs [20, 44, 55] differ by only 5
small_diff(testARR2) # oh ok ... this is the case that needs abs() on diff

(31, [55, 44, 33], [15, 20, 66])

In [99]:
testARR3 = [1,2,3,4,5,6,7]
small_diff(testARR3) # returns 2, but [1, 6, 7] vs [2, 3, 4, 5] gives 0, so not optimal

(2, [6, 5, 4], [1, 2, 3, 7])

In [101]:
testARR4 = [10,1,2,3,4,5,6,7,20]
small_diff(testARR4) # close, but [2, 7, 20] vs [1, 3, 4, 5, 6, 10] gives 0

(2, [10, 7, 6, 5], [1, 2, 3, 4, 20])
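
To check whether the heuristic is just getting lucky, one option is to compare it against an exhaustive search on the same inputs. The sketch below is not from the original cells: `greedy_diff` is a hypothetical, condensed restatement of `small_diff` that returns only the difference (and uses `float("inf")` in place of `1000`), and `best_diff` is a brute-force reference that tries every subset.

```python
from itertools import combinations

def greedy_diff(arr):
    """Condensed restatement of small_diff; returns only the final difference."""
    rest = sorted(arr)            # sorted copy, so the input is not mutated
    sub1 = []
    diff_record = float("inf")
    diff = sum(rest)              # difference before anything is moved
    for _ in range(len(rest) // 2):
        if len(rest) <= 1:
            break
        sub1.append(rest.pop(len(rest) - 2))   # move the second largest
        diff = abs(sum(rest) - sum(sub1))
        if diff > diff_record:    # got worse, stop
            break
        diff_record = diff
    return diff

def best_diff(arr):
    """Smallest achievable difference, found by trying every subset."""
    total = sum(arr)
    return min(
        abs(total - 2 * sum(subset))
        for r in range(len(arr) + 1)
        for subset in combinations(arr, r)
    )

tests = [
    [5, 10, 15, 20, 25],
    [15, 20, 33, 44, 55, 66],
    [1, 2, 3, 4, 5, 6, 7],
    [10, 1, 2, 3, 4, 5, 6, 7, 20],
]
for t in tests:
    print(t, "greedy:", greedy_diff(t), "best:", best_diff(t))
```

On these four inputs only the first one matches. The heuristic is off on the other three (31 vs 5, 2 vs 0, 2 vs 0), so the first result was luck.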

### Alternative answer (look at the style) 

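- Notice that `yield` is used, so `subset_pairs` is a generator.
- Notice that `smallest_diff = inf` is used, which is cleaner than my `1000`, but it requires `from math import inf` (`float('inf')` would avoid the import).
- `combinations` is used in `subset_pairs` to pick the indices of the first subset.

The basic idea is to enumerate every way of splitting the array into two subsets and keep the split with the smallest difference between the sums. This is exhaustive, so it always finds the optimum, but it checks 2^n splits for n items.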

In [4]:
from math import inf
from itertools import combinations

In [5]:
def two_subsets(nums): 
    smallest_diff = inf
    
    result = None
    for subset1, subset2 in subset_pairs(nums): 
        diff = abs(sum(subset1) - sum(subset2))
        if diff < smallest_diff: 
            smallest_diff = diff
            result = (subset1, subset2) 
    return result

In [6]:
def subset_pairs(nums): 
    n = len(nums) 
    for r in range(n+1): 
        for indices in combinations(range(n),r): 
            subset1 = [nums[i] for i in indices]
            # everything not picked for subset1, kept in original order
            chosen = set(indices)
            subset2 = [nums[i] for i in range(n) if i not in chosen]
            yield subset1, subset2
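
Since enumerating every split grows as 2^n, a subset-sum table is one possible way to scale this up when the numbers are modest in size. The sketch below is an assumption on my part, not part of the answer above: `min_partition_diff` is a hypothetical name, and it records one witness list of indices for every reachable sum, then picks the sum closest to half the total.

```python
def min_partition_diff(nums):
    """Return (diff, subset1, subset2) using a table of reachable subset sums."""
    total = sum(nums)
    # reachable[s] = indices of one subset of nums that sums to s
    reachable = {0: []}
    for i, x in enumerate(nums):
        # iterate over a snapshot so each element is used at most once
        for s, idxs in list(reachable.items()):
            if s + x not in reachable:
                reachable[s + x] = idxs + [i]
    # a subset summing to s leaves total - s, so the difference is |total - 2s|
    best = min(reachable, key=lambda s: abs(total - 2 * s))
    chosen = set(reachable[best])
    subset1 = [x for i, x in enumerate(nums) if i in chosen]
    subset2 = [x for i, x in enumerate(nums) if i not in chosen]
    return abs(total - 2 * best), subset1, subset2

diff, left, right = min_partition_diff([5, 10, 15, 20, 25])
print(diff, left, right)
```

The work here is proportional to the number of elements times the number of distinct reachable sums, instead of 2^n, so it only helps when the sums stay reasonably small.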