# Interview Cake \#16: [The Cake Thief](https://www.interviewcake.com/question/python/cake-thief)

### You are a renowned thief who has recently switched from stealing precious metals to stealing cakes because of the insane profit margins. You end up hitting the jackpot, breaking into the world's largest privately owned stock of cakes—the vault of the Queen of England.

While Queen Elizabeth has a limited number of *types of cake*, she has an *unlimited supply of each type*.

Each type of cake has a weight and a value, stored in a tuple with two indices:
* An integer representing the weight of the cake in kilograms
* An integer representing the monetary value of the cake in British pounds

You brought a duffel bag that can hold limited weight, and you want to make off with the most valuable haul possible.

Write a function `max_duffel_bag_value()` that takes **a list of cake type tuples** and a **weight capacity**, and returns the ***maximum monetary value* the duffel bag can hold.**

In [1]:
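import math

def max_duffel_bag_value(cakes, capacity):
    # haul_values[i] is the best value achievable with a bag of capacity i
    haul_values = [0] * (capacity + 1)
    for weight, value in cakes:
        # A weightless cake with positive value can be taken without limit
        if weight == 0 and value > 0:
            return math.inf
        # Ascending order lets the same cake type be reused at larger capacities
        for i in range(weight, capacity + 1):
            best_haul = haul_values[i]
            new_haul = haul_values[i - weight] + value
            haul_values[i] = max(new_haul, best_haul)
    return haul_values[capacity]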
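
As a cross-check on the bottom-up table, here is a minimal top-down sketch of the same recurrence. The name `max_value_top_down` is made up for illustration, and it assumes every cake has a positive integer weight (so the zero-weight special case is left out).

```python
from functools import lru_cache

def max_value_top_down(cakes, capacity):
    # Hypothetical helper: assumes all weights are positive integers
    cakes = tuple(cakes)

    @lru_cache(maxsize=None)
    def best(remaining):
        # Try taking one more cake of each type that still fits
        options = [best(remaining - weight) + value
                   for weight, value in cakes
                   if 0 < weight <= remaining]
        # Taking nothing (value 0) is always allowed
        return max([0] + options)

    return best(capacity)

print(max_value_top_down([(7, 160), (3, 90), (2, 15)], 20))
```

With a 20 kg bag, this picks six of the 3 kg cakes and one 2 kg cake. Recursion depth grows with the capacity, so the bottom-up version is the safer choice for large bags.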

### We know the *max value* we can carry, but which cakes should we take, and how many? Try adjusting your answer to return this information as well.

In [2]:
def sum_cakes(cakes):
    value = 0
    for cake, quantity in cakes.items():
        value += quantity * cake[1]
    return value

def add_cake_to_haul(cake, haul):
    if cake not in haul:
        haul[cake] = 0
    haul[cake] += 1
    return haul

def steal_cakes(cakes, capacity):
    # hauls[i] maps each cake tuple to its quantity in the best haul for capacity i
    hauls = [{} for _ in range(capacity + 1)]
    for cake_weight, cake_value in cakes:
        # A weightless cake with positive value can be taken without limit
        if cake_weight == 0 and cake_value > 0:
            return { (cake_weight, cake_value) : "infinity" }
        for i in range(cake_weight, capacity + 1):
            best_haul = sum_cakes(hauls[i])
            new_haul = sum_cakes(hauls[i - cake_weight]) + cake_value
            if new_haul > best_haul:
                cake = (cake_weight, cake_value)
                # Copy so the smaller haul is not mutated when the cake is added
                haul_without_cake = hauls[i - cake_weight].copy()
                hauls[i] = add_cake_to_haul(cake, haul_without_cake)
    return hauls[capacity]
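
Copying a dictionary and re-summing it at every capacity adds work on top of the basic table. One possible alternative, sketched below, is to remember only the last cake added at each capacity and walk backwards at the end. The names `steal_cakes_backtrace` and `last_cake` are made up for illustration, and the sketch assumes every cake has a positive integer weight.

```python
def steal_cakes_backtrace(cakes, capacity):
    # Hypothetical variant: assumes all weights are positive integers
    haul_values = [0] * (capacity + 1)
    last_cake = [None] * (capacity + 1)  # last cake added to reach the best value at i
    for weight, value in cakes:
        for i in range(weight, capacity + 1):
            new_haul = haul_values[i - weight] + value
            if new_haul > haul_values[i]:
                haul_values[i] = new_haul
                last_cake[i] = (weight, value)

    # Walk back from the full capacity, removing one cake at a time
    haul = {}
    remaining = capacity
    while last_cake[remaining] is not None:
        cake = last_cake[remaining]
        haul[cake] = haul.get(cake, 0) + 1
        remaining -= cake[0]
    return haul_values[capacity], haul

print(steal_cakes_backtrace([(7, 160), (3, 90), (2, 15)], 20))
```

The walk ends when it reaches a capacity where no cake was ever worth adding, which includes capacity 0.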

### What if we check to see if all the cake weights have a common denominator? Can we improve our algorithm?

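If all the weights share a common divisor greater than 1, only capacities that are multiples of their greatest common divisor (GCD) can ever be filled exactly, so the inner loop can step by the GCD instead of by 1: `for i in range(cake_weight, capacity + 1, gcd)`. For this to work, the capacity first has to be rounded down to a multiple of the GCD, since any leftover space smaller than the GCD can never be used. Equivalently, you can divide every weight and the capacity by the GCD and run the original algorithm on the smaller numbers.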
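
Here is a minimal sketch of that idea, dividing every weight and the capacity by the greatest common divisor of the weights before running the usual table. The function name `max_duffel_bag_value_scaled` is made up for illustration, and the sketch assumes non-negative integer weights.

```python
import math
from functools import reduce

def max_duffel_bag_value_scaled(cakes, capacity):
    # Hypothetical variant: assumes non-negative integer weights
    if any(weight == 0 and value > 0 for weight, value in cakes):
        return math.inf
    # gcd(0, x) == x, so 0 is a safe starting value for the fold
    divisor = reduce(math.gcd, (weight for weight, _ in cakes), 0)
    if divisor == 0:
        return 0  # no cakes, or only weightless cakes with no value
    # Integer division also drops the unusable remainder of the capacity
    scaled_capacity = capacity // divisor
    haul_values = [0] * (scaled_capacity + 1)
    for weight, value in cakes:
        scaled_weight = weight // divisor
        if scaled_weight == 0:
            continue  # weightless cake that is not worth anything
        for i in range(scaled_weight, scaled_capacity + 1):
            haul_values[i] = max(haul_values[i],
                                 haul_values[i - scaled_weight] + value)
    return haul_values[scaled_capacity]

print(max_duffel_bag_value_scaled([(70, 160), (30, 90), (20, 15)], 205))
```

In this example the divisor is 10, so the table has 21 entries instead of 206.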

### A cake that's both *heavier* and *worth less* than another cake would never be in the optimal solution. This idea is called dominance relations. Can you apply this idea to save some time? Hint: dominance relations can apply to sets of cakes, not just individual cakes.

This is a really interesting way to prune the cakes you consider. If one cake weighs less and is worth more than another, you always prefer it: you'd rather take a 4 kg, £50 cake and have one kilogram left over than a 5 kg, £40 cake that leaves you with less money. The cakes that survive this pruning form the [Pareto Frontier](http://oco-carbon.com/metrics/find-pareto-frontiers-in-python/): the subset of cakes that are not dominated by any other cake.

![alt text](https://upload.wikimedia.org/wikipedia/commons/thumb/e/e2/PareoEfficientFrontier1024x1024.png/256px-PareoEfficientFrontier1024x1024.png "Example of the Pareto Frontier")

So what this means for our problem is that we can narrow our cakes to only those on the Pareto Frontier, reducing n, the number of cake types we loop over (how much depends on the distribution of weights and values). To do this, I simply need to write a `get_pareto_cakes()` function, then call it before my main cake loop.

In [3]:
from collections import namedtuple

# For convenience/readability
Cake = namedtuple("Cake", ("weight", "value"))

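def get_pareto_cakes(cakes):
    frontier = []
    # Sorting orders cakes by weight, then by value within equal weights
    cakes = sorted(cakes)
    if not cakes:
        return frontier
    best_cake = cakes[0]
    for cake in cakes:
        # Only a cake worth more than everything lighter can be on the frontier
        if cake.value > best_cake.value:
            # A strictly heavier cake does not dominate best_cake, so keep it
            if cake.weight > best_cake.weight:
                frontier.append(best_cake)
            best_cake = cake
    frontier.append(best_cake)
    return frontier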

# Simple Test
all_cakes = [Cake(1, 3), Cake(2, 7), Cake(3, 9), Cake(4, 9), Cake(5, 8), Cake(3, 10), Cake(1, 4)]
print(get_pareto_cakes(all_cakes))

[Cake(weight=1, value=4), Cake(weight=2, value=7), Cake(weight=3, value=10)]
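
To show where the pruning step fits, here is a minimal sketch that filters the cakes and then runs the usual table over the survivors. It uses plain tuples, and the names `pareto_filter` and `max_value_with_pruning` are made up for illustration. One assumption worth flagging: this filter also drops cakes worth nothing or less, since they can never improve a haul.

```python
import math

def pareto_filter(cakes):
    # Hypothetical helper: keep a cake only if it is worth more than
    # every cake that is lighter or equally heavy
    frontier = []
    best_value = 0
    # Lightest first; among equal weights, most valuable first
    for weight, value in sorted(cakes, key=lambda cake: (cake[0], -cake[1])):
        if value > best_value:
            frontier.append((weight, value))
            best_value = value
    return frontier

def max_value_with_pruning(cakes, capacity):
    haul_values = [0] * (capacity + 1)
    for weight, value in pareto_filter(cakes):
        if weight == 0:
            return math.inf  # the filter only keeps cakes with positive value
        for i in range(weight, capacity + 1):
            haul_values[i] = max(haul_values[i], haul_values[i - weight] + value)
    return haul_values[capacity]

all_cakes = [(1, 3), (2, 7), (3, 9), (4, 9), (5, 8), (3, 10), (1, 4)]
print(pareto_filter(all_cakes))
print(max_value_with_pruning(all_cakes, 5))
```

Here seven cake types shrink to three before the main loop runs. Applying dominance to sets of cakes could prune further: two of the 1 kg cakes are worth 8 together, which beats the single 2 kg cake worth 7.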


### What if we had a tuple for *every individual cake* instead of types of cakes? So now there's not an unlimited supply of a type of cake—there's exactly one of each. This is a similar but harder problem, known as the 0/1 Knapsack problem.

It makes it a bit harder, but you can still use the same general approach! Since you can't use each cake more than once, each pass has to read the values as they were *before* the current cake was considered, so you write to a temporary buffer to make sure the cake isn't counted twice in the same sum. Creating and/or merging the two buffers adds an extra O(m) per pass, where m is the capacity, but since our complexity is already O(n\*m), it doesn't affect our bottom line. I actually already solved this problem in one of the In-Flight Entertainment (Interview Cake \#14) bonus questions, so look there for an example solution.
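
For reference, here is a minimal sketch of the buffered approach described above. It is not the solution from the In-Flight Entertainment notebook; the name `max_duffel_bag_value_01` is made up for illustration, and non-negative integer weights are assumed.

```python
def max_duffel_bag_value_01(cakes, capacity):
    # Hypothetical 0/1 variant: each tuple is one individual cake
    haul_values = [0] * (capacity + 1)
    for weight, value in cakes:
        # Snapshot of the table before this cake was considered
        previous = haul_values[:]
        for i in range(weight, capacity + 1):
            # Reading from the snapshot means this cake is used at most once
            haul_values[i] = max(previous[i], previous[i - weight] + value)
    return haul_values[capacity]

print(max_duffel_bag_value_01([(7, 160), (3, 90), (2, 15)], 20))
```

With only one of each cake, all three fit in a 20 kg bag, so the best haul is simply their combined value. Another common way to get the same effect without the snapshot is to run the inner loop from the capacity downwards.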

If you want to be able to elegantly re-trace your steps to find what elements are included in your knapsack, take a look at [this cool video](https://youtu.be/8LusJS5-AGo) by Tushar Roy.
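
One common way to do this re-tracing is to keep a full two-dimensional table and walk it backwards. The sketch below is my own illustration of that idea, not necessarily the exact method shown in the video; the name `knapsack_01_with_items` is made up, and non-negative integer weights are assumed.

```python
def knapsack_01_with_items(cakes, capacity):
    # table[k][c] is the best value using only the first k cakes at capacity c
    count = len(cakes)
    table = [[0] * (capacity + 1) for _ in range(count + 1)]
    for k, (weight, value) in enumerate(cakes, start=1):
        for c in range(capacity + 1):
            table[k][c] = table[k - 1][c]  # option 1: skip cake k
            if weight <= c:
                # option 2: take cake k on top of the best haul without it
                table[k][c] = max(table[k][c],
                                  table[k - 1][c - weight] + value)

    # Walk back: if the value changed when cake k was allowed, it was taken
    chosen = []
    c = capacity
    for k in range(count, 0, -1):
        if table[k][c] != table[k - 1][c]:
            chosen.append(cakes[k - 1])
            c -= cakes[k - 1][0]
    chosen.reverse()
    return table[count][capacity], chosen

print(knapsack_01_with_items([(7, 160), (3, 90), (2, 15)], 9))
```

The table costs O(n\*m) memory instead of O(m), which is the price of being able to recover the chosen cakes afterwards.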