# Top K

We consider a simple version of the knapsack problem, where each item has a value and the task is to choose a subset of the items that maximizes the total value. We assume there are 10 items, each with weight 2, and the capacity of the knapsack is 15. For example,

    [2,7,3,5,2,3,8,2,1,5][1,2,3,4,5,6,9]

is a labeled example in which the first list specifies the values of the 10 items and the second list is a solution that specifies the (0-based) indices of the items to be put into the knapsack. Since the capacity of the knapsack is fixed at 15 and each item has weight 2, at most floor(15/2) = 7 items fit, so the solutions always contain 7 items.
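
The arithmetic above can be checked with a short sketch. The helper name `top_k_indices` is hypothetical and not part of the repository; it ranks the indices by value and keeps the first k = floor(capacity / weight). When values tie, several subsets are equally good, so the returned indices may differ from the labeled example while the total value stays the same.

```python
def top_k_indices(values, capacity=15, weight=2):
    """Return 0-based indices of the most valuable items that fit (hypothetical helper)."""
    k = capacity // weight  # 15 // 2 == 7 items fit
    # rank indices by descending value; Python's sort is stable, so ties keep index order
    ranked = sorted(range(len(values)), key=lambda i: -values[i])
    return sorted(ranked[:k])

values = [2, 7, 3, 5, 2, 3, 8, 2, 1, 5]
chosen = top_k_indices(values)
total = sum(values[i] for i in chosen)
print(chosen, total)
```

Here the sketch picks item 0 where the labeled example picks item 4; both have value 2, so the totals agree.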

## Data Format

In dataGen.py, a class named "KsData" is defined as follows.

The KsData class has six attributes: train_data, test_data, valid_data, train_labels, test_labels, valid_labels.

train_data is a numpy array of size (1800, 10). It consists of 1800 data vectors, as follows:

        [
          data,
          ...,
          data
        ]
        
where data is a vector (numpy array) of length 10. For example, the data shown below  

        [2 2 1 3 1 2 8 1 5 1]
        
defines the values of the 10 items.

train_labels is a numpy array of size (1800, 10). It consists of 1800 labels, as follows:

        [
          label,
          ...,
          label
        ]

where label is a vector (numpy array) of length 10 containing k ones and (10-k) zeros, k being the number of chosen items. For example, the label shown below (with k = 1)  

        [0 0 1 0 0 0 0 0 0 0]

means that item 2 (0-based index) is chosen to be put into the knapsack.  
test_data is a numpy array of size (600, 10).   
valid_data is a numpy array of size (600, 10).  
test_labels is a numpy array of size (600, 10).  
valid_labels is a numpy array of size (600, 10).  
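
To make the relation between a labeled example and its data/label pair concrete, here is a minimal sketch. It assumes each raw example has the `[values][indices]` form shown at the top of this page; `parse_example` is a hypothetical helper, and the actual parsing in dataGen.py may differ.

```python
import json

def parse_example(line, n_items=10):
    """Split '[values][indices]' into a value list and a multi-hot label (hypothetical helper)."""
    values_part, indices_part = line.strip().split('][')
    values = json.loads(values_part + ']')
    indices = json.loads('[' + indices_part)
    # position i of the label is 1 exactly when item i is in the solution
    label = [1 if i in indices else 0 for i in range(n_items)]
    return values, label

values, label = parse_example('[2,7,3,5,2,3,8,2,1,5][1,2,3,4,5,6,9]')
print(values)
print(label)
```

The resulting label has seven ones, matching the seven items that fit into the knapsack.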



## Imports

In [1]:
import sys
sys.path.append("../../")
import random
import time

import torch
from torch.autograd import Variable
import numpy as np

from dataGen import KsData
from neurasp import NeurASP
from network import FC

## NeurASP Program for Training and Testing

In [2]:
dprogram='''
% k is the maximum number of items that fit in the knapsack (capacity 15, weight 2 each)
#const k = 7.

nn(in(10, k), [true, false]).

% each chosen item counts 1, so reject any selection with more than k items
:- #sum{1, I : in(k,I,true)} > k.
'''

dprogram_test='''
% k is the maximum number of items that fit in the knapsack (capacity 15, weight 2 each)
#const k = 7.

% each chosen item counts 1, so reject any selection with more than k items
:- #sum{1, I : in(k,I,true)} > k.
'''
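
Because every chosen item contributes 1 to the `#sum` aggregate, the constraint rules out any selection with more than k items. A plain-Python sketch of the same check follows; the function name `violates_constraint` is illustrative and not part of NeurASP.

```python
K = 7  # mirrors '#const k = 7.' in the program

def violates_constraint(label, k=K):
    """True when more than k items are chosen, i.e. the constraint would be violated."""
    return sum(1 for chosen in label if chosen) > k

seven_items = [0, 1, 1, 1, 1, 1, 1, 0, 0, 1]
eight_items = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
print(violates_constraint(seven_items))
print(violates_constraint(eight_items))
```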

## Neural Network Instantiation
- Instantiate neural networks.
- Define nnMapping: a dictionary that maps neural network names (i.e., strings) to the neural network objects (i.e., torch.nn.Module objects).
- Define optimizers: a dictionary that specifies the optimizer for each network (we use the Adam optimizer here).

In [3]:
m = FC(10, *[50, 50, 50, 50, 50], 10)
nnMapping = {'in': m}
optimizer = {'in':torch.optim.Adam(m.parameters(), lr=0.001)}

Neural Network (MLP) Structure: (10, 50, 50, 50, 50, 50, 10)
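
The structure line lists the layer widths: 10 inputs, five hidden layers of 50 units, and 10 outputs. Assuming each consecutive pair of widths is one fully connected layer with a bias term (the FC class in network.py is not shown here and may differ), the layer shapes and the parameter count can be worked out as follows.

```python
sizes = (10, 50, 50, 50, 50, 50, 10)

# assumption: each consecutive pair (n_in, n_out) is one dense layer with bias
layers = list(zip(sizes[:-1], sizes[1:]))
n_params = sum(n_in * n_out + n_out for n_in, n_out in layers)

for n_in, n_out in layers:
    print('Linear({}, {})'.format(n_in, n_out))
print('trainable parameters:', n_params)
```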


## Create NeurASP Object

In [4]:
NeurASPobj = NeurASP(dprogram, nnMapping, optimizer)

## Create dataList and obsList for Training, testDataList and testObsList for Testing
### Create the dataset object

In [6]:
dataset = KsData("data/data.txt",10)

### Construct dataList and obsList

In [7]:
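dataList = []
for d in dataset.train_data:
    # wrap each length-10 value vector as a float tensor under the key 'k',
    # which matches the term used in the nn atom of dprogram
    d_tensor = Variable(torch.from_numpy(d).float(), requires_grad=False)
    dataList.append({'k': d_tensor})
# split the evidence file on the '#evidence' marker to get the observations
with open('data/evidence_train.txt', 'r') as f:
    obsList = f.read().strip().strip('#evidence').split('#evidence')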
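
Note that `str.strip('#evidence')` removes any leading or trailing characters drawn from the set `#`, `e`, `v`, `i`, `d`, `n`, `c` rather than the literal marker, so the one-liner relies on the file not ending in one of those characters. A more defensive variant is sketched below. `split_evidence` is a hypothetical helper, the sample text is made up for illustration, and the real evidence files may look different. Unlike the original one-liner, this helper also trims whitespace around each block.

```python
def split_evidence(text, marker='#evidence'):
    """Return one evidence block per marker, dropping empty pieces (hypothetical helper)."""
    return [chunk.strip() for chunk in text.split(marker) if chunk.strip()]

# made-up sample; the contents of data/evidence_train.txt are not shown in this notebook
sample = '#evidence\n:- not in(k,1,true).\n#evidence\n:- not in(k,2,true).\n'
blocks = split_evidence(sample)
print(len(blocks))
print(blocks[0])
```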


### Construct testDataList and testObsList

In [8]:
testData = []
for d in dataset.test_data:
    # same conversion as for the training data
    d_tensor = Variable(torch.from_numpy(d).float(), requires_grad=False)
    testData.append({'k': d_tensor})
# split the evidence file on the '#evidence' marker to get the test observations
with open('data/evidence_test.txt', 'r') as f:
    testObsLost = f.read().strip().strip('#evidence').split('#evidence')

## Training and Testing

Note that our target is to find the set of items with the maximal sum of values, which is represented by the optimal stable models of the logic program. To find the optimal stable models rather than arbitrary stable models during training, we need to specify "opt=True" in the learning function.
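
As a plain-Python analogy (not how NeurASP or its ASP solver computes anything), one can enumerate every selection that satisfies the cardinality constraint and then keep only those with the largest total value. The first set plays the role of the stable models, the second that of the optimal ones. All names below are illustrative.

```python
from itertools import combinations

values = [2, 7, 3, 5, 2, 3, 8, 2, 1, 5]
k = 7

# every subset with at most k items satisfies the constraint
feasible = [subset
            for size in range(k + 1)
            for subset in combinations(range(len(values)), size)]

# the optimal selections are the feasible ones with maximal total value
best_value = max(sum(values[i] for i in subset) for subset in feasible)
optimal = [s for s in feasible if sum(values[i] for i in s) == best_value]

print(len(feasible), best_value, len(optimal))
```

Most feasible selections are far from optimal, which is why restricting training to the optimal ones matters for this task.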

In [9]:

startTime = time.time()
for i in range(200):
    print('Epoch {}...'.format(i+1))
    time1 = time.time()
    NeurASPobj.learn(dataList=dataList, obsList=obsList, epoch=1, alpha=0, opt=True, smPickle='data/stableModels.pickle')
    time2 = time.time()
    NeurASPobj.testConstraint(testData, testObsLost,[dprogram_test])
    print('--- train time: %s seconds ---' % (time2 - time1))
    print('--- test time: %s seconds ---' % (time.time() - time2))
    print('--- total time from beginning: %s minutes ---' % int((time.time() - startTime)/60) )


Epoch 1...
The accuracy for constraint 1 is 0.13333333333333333
--- train time: 18.8836829662323 seconds ---
--- test time: 1.6695325374603271 seconds ---
--- total time from beginning: 0 minutes ---
Epoch 2...
The accuracy for constraint 1 is 0.21666666666666667
--- train time: 16.198837995529175 seconds ---
--- test time: 1.7106995582580566 seconds ---
--- total time from beginning: 0 minutes ---
Epoch 3...
The accuracy for constraint 1 is 0.34
--- train time: 16.13485550880432 seconds ---
--- test time: 1.663581371307373 seconds ---
--- total time from beginning: 0 minutes ---
Epoch 4...
The accuracy for constraint 1 is 0.3933333333333333
--- train time: 15.817415237426758 seconds ---
--- test time: 1.7325234413146973 seconds ---
--- total time from beginning: 1 minutes ---
Epoch 5...


KeyboardInterrupt: 