Given a collection of integers that might contain duplicates, nums, return all possible subsets (the power set).

Note: The solution set must not contain duplicate subsets.

I previously solved 078. Subsets, and this problem requires a modification of that solution.

Essentially, instead of deciding whether each number appears or not, we count how many times each distinct number occurs in the input and then choose how many copies of it to include in a subset.

For example, consider the list `[1, 2, 2, 2, 3, 3, 4]`. The number 1 appears once, the number 2 appears 3 times, the number 3 appears twice, and the number 4 appears once. Thus in any subset, number 1 can appear 0 or 1 time, number 2 can appear 0, 1, 2, or 3 times, and so on. In total there will be `2*4*3*2` subsets. Note that if we require that each number in the list is unique, then we recover the original problem with `2**n` subsets.
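
The counting argument above can be checked with a short sketch. The helper name `count_unique_subsets` is hypothetical (it is not part of the solution below), and `math.prod` assumes Python 3.8 or later.

```python
from collections import Counter
from math import prod

def count_unique_subsets(nums):
    """Number of distinct subsets of nums, treating duplicates as identical."""
    # A value that occurs c times can be included 0..c times: c + 1 choices.
    return prod(count + 1 for count in Counter(nums).values())

print(count_unique_subsets([1, 2, 2, 2, 3, 3, 4]))  # 2*4*3*2 = 48
print(count_unique_subsets([1, 2, 3]))              # all unique: 2**3 = 8
```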

In [4]:
from collections import Counter

class Solution:
    def subsets(self, nums):
        res = [[]]
        c = Counter(nums)
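        # Handle one distinct value at a time, together with its multiplicity.
        for num, count in c.items():
            n = len(res)  # only extend the subsets that existed before this value
            for j in range(n):
                # Append 1..count copies of num to each existing subset.
                for k in range(1, count + 1):
                    res.append(res[j] + [num] * k)

        return res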

The run time seemed slow, so I supposed that starting with the values that have the smallest counts would reduce how much `res` grows in the early steps.

In [5]:
from collections import Counter

class SolutionExtraSort:
    def subsets(self, nums):
        res = [[]]
        c = Counter(nums)
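        # Same construction as before, but visit the values in order of increasing count.
        for num, count in sorted(c.items(), key=lambda x: x[1]):
            n = len(res)  # only extend the subsets that existed before this value
            for j in range(n):
                # Append 1..count copies of num to each existing subset.
                for k in range(1, count + 1):
                    res.append(res[j] + [num] * k)

        return res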
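
One rough way to see what the ordering changes is to count how many times the `for j in range(n)` loop body runs. Both orders append the same number of subsets in total, so this only measures per-iteration loop overhead; it is a proxy, not a timing. The helper `outer_loop_visits` is hypothetical, and the insertion-order result assumes Python 3.7 or later, where `Counter` preserves first-seen order.

```python
from collections import Counter

def outer_loop_visits(counts):
    """Total iterations of the `for j in range(n)` loop for counts in the given order."""
    visits, size = 0, 1  # size tracks len(res), which starts at 1 for [[]]
    for count in counts:
        visits += size      # every existing subset is revisited once
        size *= count + 1   # each of them spawns `count` new subsets
    return visits

counts = list(Counter([1, 2, 2, 2, 3, 3, 4]).values())  # [1, 3, 2, 1]
print(outer_loop_visits(counts))          # insertion order: 1 + 2 + 8 + 24 = 35
print(outer_loop_visits(sorted(counts)))  # ascending counts: 1 + 2 + 4 + 12 = 19
```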

# Comparing Run Time of the Two Solutions

We compare the run times of the two solutions on 10 random lists of length 30, with entries drawn from 0 to 10.

In [33]:
from random import randint
tests = [[randint(0, 10) for _ in range(30)] for _ in range(10)]

In [34]:
%%timeit
s2 = Solution()
[s2.subsets(test) for test in tests]

4.26 s ± 50.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [35]:
%%timeit
s1 = SolutionExtraSort()
[s1.subsets(test) for test in tests]

3.47 s ± 31.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [36]:
print([len(s2.subsets(test)) for test in tests])
print([len(s1.subsets(test)) for test in tests])

[518400, 967680, 480000, 1080000, 583200, 580608, 967680, 777600, 302400, 622080]
[518400, 967680, 480000, 1080000, 583200, 580608, 967680, 777600, 302400, 622080]
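
Equal lengths do not by themselves show that the two solutions return the same subsets. A minimal sketch of a stronger check, on a small input, compares the results after sorting each subset and the list as a whole. The helpers `build_subsets` and `canonical` are hypothetical stand-ins that mirror the two classes so the snippet runs on its own.

```python
from collections import Counter

def build_subsets(nums, sort_by_count=False):
    """Mirror of the construction above; optionally process small counts first."""
    items = list(Counter(nums).items())
    if sort_by_count:
        items.sort(key=lambda item: item[1])
    res = [[]]
    for num, count in items:
        n = len(res)
        for j in range(n):
            for k in range(1, count + 1):
                res.append(res[j] + [num] * k)
    return res

def canonical(subsets):
    """Order-independent form: sort inside each subset, then sort the subsets."""
    return sorted(tuple(sorted(s)) for s in subsets)

nums = [2, 2, 1, 3, 3, 3]
plain = build_subsets(nums)
by_count = build_subsets(nums, sort_by_count=True)

print(canonical(plain) == canonical(by_count))    # same subsets in both orders
print(len(set(canonical(plain))) == len(plain))   # no duplicate subsets
```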


Remark: on LeetCode, the first solution had a runtime of 36 ms (and memory usage of 13.2 MB), while the second solution, with the sort, had a runtime of 28 ms (and the same memory usage) - "faster than 96.42% of Python3 online submissions."

# Test Cases

In [9]:
s = SolutionExtraSort()  # the outputs below follow the count-sorted order (e.g. [4] comes first in the last case)

In [10]:
for nums in [[], [1], [1,2], [1,2,3], [1, 2, 2], [2, 2, 1], [1, 1], [1, 1, 2, 2, 3, 3, 3, 4]]:
    print("Input:", nums)
    output = s.subsets(nums)
    print("Output:", output)
    print("Length:", len(output), "\n")

Input: []
Output: [[]]
Length: 1 

Input: [1]
Output: [[], [1]]
Length: 2 

Input: [1, 2]
Output: [[], [1], [2], [1, 2]]
Length: 4 

Input: [1, 2, 3]
Output: [[], [1], [2], [1, 2], [3], [1, 3], [2, 3], [1, 2, 3]]
Length: 8 

Input: [1, 2, 2]
Output: [[], [1], [2], [2, 2], [1, 2], [1, 2, 2]]
Length: 6 

Input: [2, 2, 1]
Output: [[], [1], [2], [2, 2], [1, 2], [1, 2, 2]]
Length: 6 

Input: [1, 1]
Output: [[], [1], [1, 1]]
Length: 3 

Input: [1, 1, 2, 2, 3, 3, 3, 4]
Output: [[], [4], [1], [1, 1], [4, 1], [4, 1, 1], [2], [2, 2], [4, 2], [4, 2, 2], [1, 2], [1, 2, 2], [1, 1, 2], [1, 1, 2, 2], [4, 1, 2], [4, 1, 2, 2], [4, 1, 1, 2], [4, 1, 1, 2, 2], [3], [3, 3], [3, 3, 3], [4, 3], [4, 3, 3], [4, 3, 3, 3], [1, 3], [1, 3, 3], [1, 3, 3, 3], [1, 1, 3], [1, 1, 3, 3], [1, 1, 3, 3, 3], [4, 1, 3], [4, 1, 3, 3], [4, 1, 3, 3, 3], [4, 1, 1, 3], [4, 1, 1, 3, 3], [4, 1, 1, 3, 3, 3], [2, 3], [2, 3, 3], [2, 3, 3, 3], [2, 2, 3], [2, 2, 3, 3], [2, 2, 3, 3, 3], [4, 2, 3], [4, 2, 3, 3], [4, 2, 3, 3, 3], [4, 2, 2,