# Subset Sum Problem

In [1]:
import tensorflow as tf

In [2]:
tf.executing_eagerly()

True

In [3]:
tf.config.list_physical_devices('GPU')

[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

## Naive Implementation (Approximate)

Let the superset be the following vector and our target be $7.0$.

$$
target = 7.0
$$

\begin{equation*}
superset = 
\begin{bmatrix}
1.0 & 2.0 & 3.0 & 4.0 & 5.0 \\
\end{bmatrix}
\end{equation*}

Our goal is to find a binary mask whose dot product with the superset equals the target. Here is an example of such a mask.

\begin{equation*}
mask = 
\begin{bmatrix}
0.0 & 0.0 & 1.0 & 1.0 & 0.0 \\
\end{bmatrix}
\end{equation*}

We can verify that $$ mask \cdot superset = target $$
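
As a quick sanity check, the identity above can be evaluated in plain Python. This is only an illustrative sketch: the list names mirror the symbols in the text and are not part of the TensorFlow code that follows.

```python
superset = [1.0, 2.0, 3.0, 4.0, 5.0]
mask = [0.0, 0.0, 1.0, 1.0, 0.0]
target = 7.0

# Dot product: only the entries where the mask is 1.0 contribute (3.0 + 4.0).
subset_sum = sum(m * s for m, s in zip(mask, superset))
print(subset_sum)
print(subset_sum == target)
```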


### Bistable loss
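The bistable loss $x^2 (x - 1)^2$ is zero only at $x = 0$ and $x = 1$, so minimizing it pushes each mask entry toward a boolean value. See [boolean-satisfiability.ipynb](boolean-satisfiability.ipynb) for more details.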

In [4]:
@tf.function
def bistable_loss_fn(x):
    # x**2 vanishes at 0 and (x - 1)**2 vanishes at 1,
    # so their product is zero only when x is 0 or 1.
    a = x ** 2
    b = (x - 1) ** 2

    return a * b

### Total Loss

To push the optimizer toward boolean-like values rather than merely minimizing the squared difference, we give more weight to the bistable loss by exponentiating it.

$$ loss_{total} = (mask \cdot superset - target) ^ 2 + e^{loss_{bistable}} $$

In [5]:
@tf.function
def total_loss(target, subset_sum, mask):
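    # Squared error between the target and the masked sum.
    l2_loss = tf.math.squared_difference(target, subset_sum)
    # Penalty that is zero only when every mask entry is exactly 0 or 1.
    bistable_loss = tf.reduce_sum(bistable_loss_fn(mask))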
    
    return l2_loss + tf.exp(bistable_loss)
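
To get a feel for the scale of this loss, here is a minimal pure-Python sketch of the same two terms that `total_loss` computes (squared difference plus the exponential of the summed bistable loss). The helper names `bistable` and `total_loss_py` are hypothetical and exist only for this illustration.

```python
import math

def bistable(x):
    # Same formula as bistable_loss_fn, for plain Python floats.
    return (x ** 2) * ((x - 1) ** 2)

def total_loss_py(target, mask, superset):
    # Squared difference plus exp of the summed bistable penalty.
    subset_sum = sum(m * s for m, s in zip(mask, superset))
    l2 = (target - subset_sum) ** 2
    return l2 + math.exp(sum(bistable(m) for m in mask))

superset = [1.0, 2.0, 3.0, 4.0, 5.0]
target = 7.0

loss_all_ones = total_loss_py(target, [1.0] * 5, superset)
loss_exact = total_loss_py(target, [0.0, 0.0, 1.0, 1.0, 0.0], superset)
print(loss_all_ones)  # all-ones starting mask
print(loss_exact)     # an exact solution
```

Because $e^0 = 1$, the loss at an exact boolean solution is $1.0$ rather than $0$, so a loss close to $1.0$ is the best this formulation can reach.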

In [6]:
@tf.function
def subset_sum_fn(mask, container):
    return tf.tensordot(mask, container, axes=1)

In [7]:
container = tf.Variable([1,2,3,4,5],dtype=tf.float32)
mask = tf.Variable(tf.ones(tf.shape(container)),dtype=tf.float32)
target = tf.constant(7.0, dtype=tf.float32)

# persistent=True lets us call tape.gradient() more than once below.
with tf.GradientTape(persistent=True) as tape:
    subset_sum = subset_sum_fn(mask, container)
    loss = total_loss(target, subset_sum, mask)

print(loss)
print(mask.numpy())
print(tape.gradient(loss,mask))
print(tape.gradient(loss,container))

tf.Tensor(65.0, shape=(), dtype=float32)
[1. 1. 1. 1. 1.]
tf.Tensor([16. 32. 48. 64. 80.], shape=(5,), dtype=float32)
tf.Tensor([16. 16. 16. 16. 16.], shape=(5,), dtype=float32)
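
The printed gradients can be checked by hand. Writing $s = mask \cdot superset = 15$ for the all-ones mask, and using the squared difference that `total_loss` computes, the gradient with respect to each mask entry is

$$
\frac{\partial loss}{\partial mask_i} = 2 (s - target) \, superset_i + e^{loss_{bistable}} \cdot 2 \, mask_i (mask_i - 1)(2 \, mask_i - 1)
$$

At $mask_i = 1$ the bistable term vanishes, leaving $2 (15 - 7) \, superset_i = 16 \, superset_i$, which matches the first gradient. Similarly,

$$
\frac{\partial loss}{\partial superset_i} = 2 (s - target) \, mask_i = 16
$$

which matches the second.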


In [8]:
container = tf.Variable([1,2,3,4,5],dtype=tf.float32)
mask = tf.Variable(tf.ones(tf.shape(container)),dtype=tf.float32)
target = tf.constant(7.0, dtype=tf.float32)

# opt = tf.keras.optimizers.Adam(learning_rate=3e-4)
opt = tf.keras.optimizers.Adam()
for i in range(10000):
    with tf.GradientTape() as tape:
        subset_sum = subset_sum_fn(mask, container)
        loss = total_loss(target, subset_sum, mask)
    if i % 1000 == 0:
        # Round the mask to 0/1 and report the sum it would actually select.
        answer = tf.reduce_sum(tf.round(mask) * container)
        print(i, loss.numpy(), mask.numpy(), answer.numpy())
    grads = tape.gradient(loss, mask)
    opt.apply_gradients(zip([grads], [mask]))

0 65.0 [1. 1. 1. 1. 1.] 15.0
1000 1.4589697 [0.49017498 0.4877046  0.4869398  0.48656726 0.48634836] 0.0
2000 1.3628395 [0.46271044 0.46523246 0.46614125 0.4666031  0.46688387] 0.0
3000 1.3617111 [0.44354406 0.46168292 0.46652192 0.46874067 0.4700129 ] 0.0
4000 1.3545029 [0.37869254 0.4552448  0.46950325 0.4751845  0.4782023 ] 0.0
5000 1.3024449 [0.1474902  0.44448635 0.48462448 0.4968005  0.5023167 ] 5.0
6000 1.2715133 [0.00471203 0.36340576 0.5007747  0.52516717 0.532443  ] 12.0
7000 1.2001972 [-0.00185637  0.05919809  0.53151894  0.5876373   0.58854026] 12.0
8000 1.182796 [-0.00845798 -0.01663175  0.44833654  0.6654444   0.6089821 ] 9.0
9000 1.0536028 [-0.01366616 -0.02505278 -0.02220595  0.90308255  0.70674056] 9.0


### Result

Rounding the mask from the last logged step gives

\begin{equation*}
mask = 
\begin{bmatrix}
0.0 & 0.0 & 0.0 & 1.0 & 1.0 \\
\end{bmatrix}
\end{equation*}

This mask gives a sum of $9.0$ instead of the target $7.0$.

The optimizer appears to be stuck in a local optimum, so training further is unlikely to improve the result.
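
For a superset this small, an exhaustive search makes a useful reference point for the gradient-based result. The sketch below is not part of the method above; it simply tries every binary mask, and the name `exact_masks` is chosen here for illustration.

```python
from itertools import product

superset = [1.0, 2.0, 3.0, 4.0, 5.0]
target = 7.0

# Try all 2**5 binary masks and keep those whose selected
# elements sum exactly to the target.
exact_masks = [
    mask
    for mask in product([0, 1], repeat=len(superset))
    if sum(m * s for m, s in zip(mask, superset)) == target
]

for mask in exact_masks:
    print(mask)
```

The search finds three exact masks, including the example mask given at the start of this section, so exact solutions exist even though the optimizer did not reach one.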

**Note: Since our batch size is one, we are training with gradient descent rather than stochastic gradient descent, so a global optimum is not guaranteed.**