# Most Reliable Path

Most Reliable Path (MRP) is a variant of the Shortest Path (SP) problem in which each edge is
randomly assigned a probability (0.512 or 0.8) that denotes the “reliability” of the edge.
The task is to find the most reliable path between the source and the destination node. No
edges are removed this time.

We use the same 5-layer MLP as in the SP experiment as the baseline. We also train the neural network with NeurASP using the simple-path and reachability constraints. In addition, we
use weak constraints to represent the probability of each edge in the grid.
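
As a point of reference for what "most reliable" means, the sketch below solves one instance classically. It is not how the MLP or NeurASP finds the path. It assumes that a path's reliability is the product of its edge probabilities (the notebook does not state this explicitly), so maximizing the product is the same as minimizing the sum of `-log(p)` with Dijkstra's algorithm. The edge numbering is copied from the `sp(i)` rules in the logic program of this notebook; the function name `most_reliable_path` is hypothetical.

```python
import heapq
import math

# Edge index -> (node, node) on the 4x4 grid, as in the sp(i) rules of this notebook.
EDGES = [
    (0, 1), (1, 2), (2, 3), (4, 5), (5, 6), (6, 7),
    (8, 9), (9, 10), (10, 11), (12, 13), (13, 14), (14, 15),
    (0, 4), (4, 8), (8, 12), (1, 5), (5, 9), (9, 13),
    (2, 6), (6, 10), (10, 14), (3, 7), (7, 11), (11, 15),
]


def most_reliable_path(probs, source, target):
    """Return (reliability, sorted edge indices) of one path that maximizes
    the product of edge probabilities (assumed definition of reliability)."""
    adj = {n: [] for n in range(16)}
    for idx, (u, v) in enumerate(EDGES):
        adj[u].append((v, idx))
        adj[v].append((u, idx))

    # Dijkstra on the non-negative weights -log(p).
    dist = {n: math.inf for n in range(16)}
    dist[source] = 0.0
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        if u == target:
            break
        for v, idx in adj[u]:
            nd = d - math.log(probs[idx])
            if nd < dist[v]:
                dist[v] = nd
                prev[v] = (u, idx)
                heapq.heappush(heap, (nd, v))

    # Walk back from the target to collect the edges that were used.
    path = []
    node = target
    while node != source:
        node, idx = prev[node]
        path.append(idx)
    return math.exp(-dist[target]), sorted(path)


# The 24 edge probabilities from the example vector in the Data Format section.
example_probs = [
    0.512, 0.8, 0.512, 0.8, 0.512,
    0.512, 0.8, 0.512, 0.512, 0.8,
    0.8, 0.512, 0.512, 0.8, 0.512,
    0.512, 0.512, 0.512, 0.512, 0.8,
    0.512, 0.8, 0.512, 0.8,
]
reliability, edges = most_reliable_path(example_probs, 0, 11)
print(reliability, edges)
```

Several paths can share the best reliability, so the edges returned here may differ from the single label stored in the dataset for the same instance.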

## Data Format

dataGen.py defines a class named "GridProbData" as follows.

The GridProbData class has 6 attributes: train_data, test_data, valid_data, train_labels, test_labels, valid_labels.

train_data is a NumPy array of size (1800, 40). It consists of 1800 data vectors as follows 

        [  
          data,  
          ...,  
          data  
        ]  

where data is a vector (numpy array) of length 40. For example, the data shown below  

        [  
          0.512 0.8 0.512 0.8 0.512  
          0.512 0.8 0.512 0.512 0.8  
          0.8 0.512 0.512 0.8 0.512  
          0.512 0.512 0.512 0.512 0.8  
          0.512 0.8 0.512 0.8  
          10000 00000 01000 0  
        ]  

defines the probabilities of the 24 edges (the first 24 entries) and specifies that nodes 0 and 11 are the starting and ending nodes (the 1s at positions 0 and 11 of the last 16 entries).  
train_labels is a NumPy array of size (1800, 24). It consists of 1800 labels as follows.  

        [  
          label,  
          ...,  
          label  
        ]  

where label is a vector (numpy array) of length 24. For example, the label shown below  

        [11100 00000 00000 00000 0110]  

means that the edges 0, 1, 2, 21, 22 form a most reliable path.  
test_data is a numpy array of size (600, 40).  
valid_data is a numpy array of size (600, 40).  
test_labels is a numpy array of size (600, 24).  
valid_labels is a numpy array of size (600, 24).  
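
The layout described above can be unpacked with a few lines of Python. This is a minimal sketch that works on plain lists (and equally on NumPy rows); the helper name `decode_example` is hypothetical and not part of dataGen.py.

```python
def decode_example(data, label):
    """Split a length-40 data vector and a length-24 label into readable parts."""
    probs = [float(p) for p in data[:24]]  # reliability of edges 0..23
    endpoints = [i for i, flag in enumerate(data[24:40]) if flag == 1]  # marked nodes
    path_edges = [i for i, bit in enumerate(label) if bit == 1]  # edges on the path
    return probs, endpoints, path_edges


# The example data vector and label shown above.
data = [
    0.512, 0.8, 0.512, 0.8, 0.512,
    0.512, 0.8, 0.512, 0.512, 0.8,
    0.8, 0.512, 0.512, 0.8, 0.512,
    0.512, 0.512, 0.512, 0.512, 0.8,
    0.512, 0.8, 0.512, 0.8,
    1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
]
label = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

probs, endpoints, path_edges = decode_example(data, label)
print(endpoints)   # nodes marked as start/end
print(path_edges)  # edge indices on the labelled path
```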



## Imports

In [37]:
import sys
sys.path.append("../../")
import random
import time

import numpy as np
import torch
from torch.autograd import Variable

from dataGen import GridProbData
from neurasp import NeurASP
from network import FC

In [38]:

import os,sys,inspect
currentdir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
parentdir = os.path.dirname(currentdir)
sys.path.insert(0,parentdir) 




## NeurASP Programs for Training and Testing

In [39]:
dprogram = '''
nn(sp(24, g), [true, false]).

sp(X) :- sp(g,X,true).

sp(0,1) :- sp(0).
sp(1,2) :- sp(1).
sp(2,3) :- sp(2).
sp(4,5) :- sp(3).
sp(5,6) :- sp(4).
sp(6,7) :- sp(5).
sp(8,9) :- sp(6).
sp(9,10) :- sp(7).
sp(10,11) :- sp(8).
sp(12,13) :- sp(9).
sp(13,14) :- sp(10).
sp(14,15) :- sp(11).
sp(0,4) :- sp(12).
sp(4,8) :- sp(13).
sp(8,12) :- sp(14).
sp(1,5) :- sp(15).
sp(5,9) :- sp(16).
sp(9,13) :- sp(17).
sp(2,6) :- sp(18).
sp(6,10) :- sp(19).
sp(10,14) :- sp(20).
sp(3,7) :- sp(21).
sp(7,11) :- sp(22).
sp(11,15) :- sp(23).

sp(X,Y) :- sp(Y,X).

:- X=0..15, #count{Y: sp(X,Y)} = 1.
:- X=0..15, #count{Y: sp(X,Y)} >= 3.
reachable(X, Y) :- sp(X, Y).
reachable(X, Y) :- reachable(X, Z), sp(Z, Y).
:- sp(X, _), sp(Y, _), not reachable(X, Y).
'''

dprogram_test = '''
sp(X) :- sp(g,X,true).

sp(0,1) :- sp(0).
sp(1,2) :- sp(1).
sp(2,3) :- sp(2).
sp(4,5) :- sp(3).
sp(5,6) :- sp(4).
sp(6,7) :- sp(5).
sp(8,9) :- sp(6).
sp(9,10) :- sp(7).
sp(10,11) :- sp(8).
sp(12,13) :- sp(9).
sp(13,14) :- sp(10).
sp(14,15) :- sp(11).
sp(0,4) :- sp(12).
sp(4,8) :- sp(13).
sp(8,12) :- sp(14).
sp(1,5) :- sp(15).
sp(5,9) :- sp(16).
sp(9,13) :- sp(17).
sp(2,6) :- sp(18).
sp(6,10) :- sp(19).
sp(10,14) :- sp(20).
sp(3,7) :- sp(21).
sp(7,11) :- sp(22).
sp(11,15) :- sp(23).

sp(X,Y) :- sp(Y,X).

:- X=0..15, #count{Y: sp(X,Y)} = 1.
:- X=0..15, #count{Y: sp(X,Y)} >= 3.
reachable(X, Y) :- sp(X, Y).
reachable(X, Y) :- reachable(X, Z), sp(Z, Y).
:- sp(X, _), sp(Y, _), not reachable(X, Y).
'''
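
The two `#count` constraints and the `reachable` rules above can be mirrored in plain Python as a sanity check on a predicted label. This is a sketch under assumptions, not part of NeurASP: the helper name `is_simple_path` is hypothetical, and it treats the source and target as the only nodes allowed to have degree 1. The logic program itself requires degree 0 or 2 at every grid node, so the endpoints presumably receive an extra incident `sp` atom from the evidence files, which are not shown in this notebook.

```python
from collections import defaultdict

# Edge index -> (node, node), copied from the sp(i) rules above.
EDGES = [
    (0, 1), (1, 2), (2, 3), (4, 5), (5, 6), (6, 7),
    (8, 9), (9, 10), (10, 11), (12, 13), (13, 14), (14, 15),
    (0, 4), (4, 8), (8, 12), (1, 5), (5, 9), (9, 13),
    (2, 6), (6, 10), (10, 14), (3, 7), (7, 11), (11, 15),
]


def is_simple_path(label, source, target):
    """True if the selected edges form one simple path from source to target."""
    adj = defaultdict(set)
    for idx, bit in enumerate(label):
        if bit == 1:
            u, v = EDGES[idx]
            adj[u].add(v)
            adj[v].add(u)
    if source not in adj or target not in adj:
        return False

    # Degree check: endpoints have degree 1, every other used node degree 2.
    for node, nbrs in adj.items():
        expected = 1 if node in (source, target) else 2
        if len(nbrs) != expected:
            return False

    # Reachability check: every used node must be reachable from the source.
    seen = {source}
    stack = [source]
    while stack:
        for nxt in adj[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen == set(adj)


def bits(edge_indices):
    """Build a length-24 0/1 label from a list of edge indices."""
    return [1 if i in edge_indices else 0 for i in range(24)]


print(is_simple_path(bits([0, 1, 2, 21, 22]), 0, 11))
```

The reachability step matters because the degree check alone would accept a valid path plus a separate cycle elsewhere in the grid.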

## Neural Network Instantiation
- Instantiate neural networks.
- Define nnMapping: a dictionary that maps neural network names (i.e., strings) to the neural network objects (i.e., torch.nn.Module object)
- Define optimizers: a dictionary that specifies the optimizer for each network (we use the Adam optimizer here).

In [40]:
m = FC(40, 50, 50, 50, 50, 50, 24)

nnMapping = {'sp': m}

optimizers = {'sp': torch.optim.Adam(m.parameters(), lr=0.001)}

Neural Network (MLP) Structure: (40, 50, 50, 50, 50, 50, 24)


## Create NeurASP object

In [41]:
NeurASPobj = NeurASP(dprogram, nnMapping, optimizers)

## Create dataList and obsList for Training, testDataList and testObsList for Testing
### Create the dataset object

In [42]:
dataset = GridProbData("data/data.txt")

there are 3000 data in total, 60% training data, 20% validation data, 20% testing data!


### Construct dataList and obsList

In [43]:
dataList = []
obsList = []

for i, d in enumerate(dataset.train_data):
    d_tensor = Variable(torch.from_numpy(d).float(), requires_grad=False)
    dataList.append({'g': d_tensor})

with open('data/evidence_train.txt', 'r') as f:
    obsList = f.read().strip().strip('#evidence').split('#evidence')

### Construct testDataList and testObsList

In [44]:
dataListTest = []
obsListTest = []

for d in dataset.test_data:
    d_tensor = Variable(torch.from_numpy(d).float(), requires_grad=False)
    dataListTest.append({'g': d_tensor})

with open('data/evidence_test.txt', 'r') as f:
    obsListTest = f.read().strip().strip('#evidence').split('#evidence')
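
One caveat about the parsing in the two cells above: `str.strip('#evidence')` removes any leading or trailing characters drawn from the set `#`, `e`, `v`, `i`, `d`, `n`, `c`, not the literal marker. That is harmless as long as every chunk ends in a character outside that set (such as `.`), but it can silently trim content otherwise. The sketch below shows the pitfall and one possible alternative. The helper name `split_evidence` and the sample text are made up for illustration, since the format of the evidence files is not shown in this notebook.

```python
def split_evidence(text, marker="#evidence"):
    """Split on the literal marker and drop chunks that are empty or whitespace."""
    return [chunk for chunk in text.split(marker) if chunk.strip()]


# Pitfall: strip() with an argument works on a set of characters.
print("% done".strip("#evidence"))  # the trailing "n" and "e" are removed

# Hypothetical file content with two observations.
sample = "#evidence\na.\n#evidence\nb.\n"
print([chunk.strip() for chunk in split_evidence(sample)])
```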

## Training and Testing

Our goal is to find the path with the highest probability, which corresponds to an optimal stable model of the logic program. To make training use the optimal stable models rather than all stable models, we pass "opt=True" to the learn function.

In [45]:
startTime = time.time()
for i in range(20):
    print('Continuously training for 10 epochs round {}...'.format(i+1))
    time1 = time.time()
    NeurASPobj.learn(dataList=dataList, obsList=obsList, epoch=10, opt=True, smPickle='data/stableModels.pickle')
    time2 = time.time()
    NeurASPobj.testConstraint(dataList=dataListTest, obsList=obsListTest, mvppList=[dprogram_test])
    print("--- train time: %s seconds ---" % (time2 - time1))
    print("--- test time: %s seconds ---" % (time.time() - time2))
    print('--- total time from beginning: %s minutes ---' % int((time.time() - startTime)/60) )

Continuously training for 10 epochs round 1...
The accuracy for constraint 1 is 0.0
--- train time: 57.582571029663086 seconds ---
--- test time: 5.7079596519470215 seconds ---
--- total time from beginning: 1 minutes ---
Continuously training for 10 epochs round 2...
The accuracy for constraint 1 is 0.0
--- train time: 58.763999938964844 seconds ---
--- test time: 5.77888822555542 seconds ---
--- total time from beginning: 2 minutes ---
Continuously training for 10 epochs round 3...
The accuracy for constraint 1 is 0.0
--- train time: 58.62405037879944 seconds ---
--- test time: 5.654390573501587 seconds ---
--- total time from beginning: 3 minutes ---
Continuously training for 10 epochs round 4...
The accuracy for constraint 1 is 0.0
--- train time: 57.486855268478394 seconds ---
--- test time: 5.713911771774292 seconds ---
--- total time from beginning: 4 minutes ---
Continuously training for 10 epochs round 5...
The accuracy for constraint 1 is 0.0
--- train time: 84.83867859840393

KeyboardInterrupt: 

## Comparison

The following table compares the test-data accuracies of the MLP trained only with cross-entropy
loss ("MLP Only") and of the same MLP trained by NeurASP.

In [None]:
from IPython.display import display
from PIL import Image

path="accuracy_comparison.jpg"
display(Image.open(path))

In the above table, the label accuracy measures agreement with the label on which the neural
network is trained. Note that the MRP problem (like the SP problem and the top-K problem in the
next section) is not “functional”, in the sense that multiple solutions may exist. In such
cases, we select only one of them as the label in the training set, so as to favor neural
network learning.

However, a prediction that differs from the label may still be correct. To account for
this, we consider the ultimate accuracy, which counts such predictions as correct. The near
ultimate accuracy is similar but slightly more relaxed: it also counts near-optimal paths as
correct, namely paths that have the same number of edges as a most reliable path but contain
a 0.512 edge in place of a 0.8 edge.

The constraint accuracy counts whether the prediction satisfies the simple-path and reachability
constraints, regardless of whether it is most reliable.
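
To make the difference between these measures concrete, here is a minimal sketch of how a single prediction could be scored. It is not the evaluation code behind the table. It assumes that reliability is the product of the selected edge probabilities and that the stored label is itself a most reliable path, so a valid path with the same reliability also counts as most reliable. The names `path_reliability` and `score_prediction` are hypothetical, and the path check is passed in as a callable (`is_valid_path`) to keep the example short.

```python
import math


def path_reliability(edge_bits, probs):
    """Product of the probabilities of the selected edges (assumed definition)."""
    return math.prod(p for bit, p in zip(edge_bits, probs) if bit == 1)


def score_prediction(pred, label, probs, is_valid_path):
    """Return which of the three measures a single prediction satisfies."""
    label_ok = list(pred) == list(label)
    constraint_ok = bool(is_valid_path(pred))
    # Ultimate: a valid path that is as reliable as the labelled one.
    ultimate_ok = constraint_ok and math.isclose(
        path_reliability(pred, probs), path_reliability(label, probs)
    )
    return {"label": label_ok, "constraint": constraint_ok, "ultimate": ultimate_ok}


# Edge probabilities from the example in the Data Format section.
probs = [
    0.512, 0.8, 0.512, 0.8, 0.512,
    0.512, 0.8, 0.512, 0.512, 0.8,
    0.8, 0.512, 0.512, 0.8, 0.512,
    0.512, 0.512, 0.512, 0.512, 0.8,
    0.512, 0.8, 0.512, 0.8,
]
label = [1 if i in (0, 1, 2, 21, 22) else 0 for i in range(24)]
# A different path between nodes 0 and 11: 0-4-8-9-10-11 (edges 12, 13, 6, 7, 8).
pred = [1 if i in (12, 13, 6, 7, 8) else 0 for i in range(24)]

# The lambda stands in for a real simple-path check.
print(score_prediction(pred, label, probs, is_valid_path=lambda p: True))
```

In this example the prediction differs from the label, so it fails the label accuracy, yet both paths contain three 0.512 edges and two 0.8 edges and are therefore equally reliable.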
