# Predicting Satisfiability of SAT-3 Problems
## Solving the Classification Task using a ***Long Short-Term Memory (LSTM)*** 

The *Boolean satisfiability problem* (SAT) asks whether a given Boolean expression, built only from variables, parentheses and the logical connectives AND, OR and NOT, has some truth assignment to its variables that makes the entire expression true. **SAT-3** (more commonly written 3-SAT) is a special case of SAT in which the expression must have a strict form, the conjunctive normal form (CNF), with exactly 3 literals in every clause. A CNF is a conjunction of one or more clauses, where a clause is a disjunction of literals; put otherwise, it is an AND of ORs. An example of a SAT-3 problem instance is the following:
$$(x_1 \vee x_2 \vee \neg x_4) \wedge (\neg x_3 \vee x_4 \vee x_2)$$
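
To make the definition concrete, the sketch below checks the example formula by brute force. The encoding (a clause as a list of signed integers, in the spirit of the DIMACS format) and the function names are illustrative choices, not taken from this project's code, and exhaustive search is only feasible for a handful of variables.

```python
from itertools import product

# Each clause is a list of non-zero integers: k means x_k, -k means NOT x_k.
example_cnf = [[1, 2, -4], [-3, 4, 2]]

def evaluate(cnf, assignment):
    """Return True if `assignment` (dict: variable -> bool) satisfies every clause."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )

def brute_force_sat(cnf):
    """Try all 2^n assignments; return a satisfying one, or None if there is none."""
    variables = sorted({abs(lit) for clause in cnf for lit in clause})
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if evaluate(cnf, assignment):
            return assignment
    return None

model = brute_force_sat(example_cnf)
print(model)  # a satisfying assignment, so the example instance is satisfiable
```

The loop visits up to 2^n assignments, which is exactly the cost that dedicated solvers (and, in this notebook, a learned predictor) try to avoid.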

The SAT-3 problem is NP-complete, and many solvers have been developed to prove the (un)satisfiability of an instance and, when it is satisfiable, to produce a satisfying truth assignment of the variables. However, many large instances remain out of reach of traditional solvers, so some researchers have proposed using Machine Learning methods instead. A learned model cannot guarantee a correct answer, but if it generalizes well it could handle some difficult problem instances that the state-of-the-art solvers cannot.

In this assignment we try to predict the satisfiability of the instances using a *Graph Transformer Network* and a *Long Short-Term Memory network (LSTM)*. This notebook explains the process of using an LSTM.

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import random
__counter__ = random.randint(0, int(2e9))  # integer bounds: recent Python versions reject float arguments

from IPython.display import HTML, display

In [3]:
import os, shutil
import json

from tuning import tune_parameters
from data_loader import dataset_processing
from train import training, testing

Device is: cuda:0


### Training and Evaluation process

To train the model, a *training set* is used alongside a *validation set*; the latter measures the performance of the model during training and drives model selection. A separate *test set* is held out to evaluate the model after its training. The set sizes reported by the data-processing step below (11275 / 1410 / 1407) correspond to a split of roughly 80% / 10% / 10%.
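
A minimal sketch of such a three-way split is given here. The function name, the seed and the default fractions are placeholders (set the fractions to the split actually used); the project's own splitting code is not shown in this notebook.

```python
import random

def split_dataset(items, train_frac=0.8, valid_frac=0.1, seed=0):
    """Shuffle a copy of `items` and cut it into train/validation/test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)   # fixed seed keeps the split reproducible
    n_train = int(train_frac * len(shuffled))
    n_valid = int(valid_frac * len(shuffled))
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]     # whatever remains
    return train, valid, test

train, valid, test = split_dataset(range(1000))
print(len(train), len(valid), len(test))
```

Shuffling before cutting matters when the raw data is ordered, for example by problem size or by label.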

**Create the dataset**: The dataset is created from the raw data in a format that can be loaded by the *DataLoader* class of `torch.utils.data`.

In this case, the data was treated as a *time series* with sequence length equal to 2. To make that possible, some data augmentation was performed. Specifically, as noted in the dataset documentation (<a href="https://www.cs.ubc.ca/~hoos/SATLIB/Benchmarks/SAT/RND3SAT/descr.html">Dataset Documentation</a>), for every problem size there is a threshold on the number of clauses above which an instance is highly unlikely to be satisfiable. Below this threshold an instance is most likely satisfiable, which makes it possible to augment the dataset in two ways:
<ol>
    <li>For each instance provide a satisfiable smaller version of it (the instance below the threshold).</li>
    <li>For each unsatisfiable instance provide another unsatisfiable instance (the instance just above the threshold).</li>
</ol>
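
The two augmentation ideas above can be sketched as follows. Everything here is an assumption made for illustration: the helper names, the signed-integer clause encoding, the target ratio of 3.5, and the use of roughly 4.26 clauses per variable as the threshold for random 3-SAT. Note that dropping clauses from a satisfiable formula always leaves it satisfiable, and adding clauses to an unsatisfiable formula always leaves it unsatisfiable, whereas a shrunken version of an unsatisfiable formula is only *likely* to be satisfiable.

```python
import random

# Approximate clause-to-variable ratio at the random 3-SAT threshold (assumed value).
THRESHOLD_RATIO = 4.26

def random_clause(n_vars, rng):
    """Draw one clause over three distinct variables with random polarities."""
    variables = rng.sample(range(1, n_vars + 1), 3)
    return [v if rng.random() < 0.5 else -v for v in variables]

def shrink_below_threshold(clauses, n_vars, ratio=3.5):
    """Keep a prefix of the clauses so that len(result) / n_vars <= ratio."""
    if ratio >= THRESHOLD_RATIO:
        raise ValueError("ratio must lie below the threshold")
    return clauses[:int(ratio * n_vars)]

def grow_with_random_clauses(clauses, n_vars, n_extra, rng):
    """Append n_extra random clauses; extra clauses can never turn an
    unsatisfiable formula into a satisfiable one."""
    return clauses + [random_clause(n_vars, rng) for _ in range(n_extra)]

rng = random.Random(0)
n_vars = 20
original = [random_clause(n_vars, rng) for _ in range(91)]   # 91 / 20 = 4.55

smaller = shrink_below_threshold(original, n_vars)            # ratio 3.5
larger = grow_with_random_clauses(original, n_vars, 5, rng)   # ratio 4.8
print(len(smaller), len(larger))
```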

In [4]:
pos_weight = dataset_processing()

Start the data processing...

Satisfiable CNFs   : 7402
Unsatisfiable CNFs : 5154

Ratio of SAT   : 0.5895
Ratio of UNSAT : 0.4105

Training set size: 11275
Validation set size: 1410
Test set size: 1407

Processing completed.


It should be noted that, as a rough rule of thumb, an LSTM model needs about 300 times more data than it has parameters, and in our case we have 970561 parameters and only 14092 input examples. **Thus, given the amount of data, even after the augmentation, the model is not expected to reach high values in the evaluation metrics**.
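
The size of that gap can be made explicit with a few lines of arithmetic. The factor of 300 is the rule of thumb quoted above, taken at face value rather than as an established bound.

```python
n_parameters = 970_561    # parameter count quoted above
n_samples = 14_092        # 11275 + 1410 + 1407 instances after augmentation
samples_per_parameter = 300

required = samples_per_parameter * n_parameters
shortfall_factor = required / n_samples

print(f"Samples suggested by the rule of thumb: {required}")
print(f"Available samples: {n_samples}")
print(f"Shortfall factor: {shortfall_factor:.0f}x")
```

Under this rule of thumb the dataset is smaller than suggested by several orders of magnitude.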

**Tune parameters**: Tune the hyperparameters of the model (the LSTM itself) as well as those of the training process, e.g. *batch size* and *weight decay*. More specifically, the following parameters are tuned:
<ol>
    <li>Model parameters:
        <ul>
            <li>Number of layers</li>
            <li>Dropout rate</li>
            <li>Hidden units</li>
        </ul>
    </li>
    <li>Training parameters:
        <ul>
            <li>Batch size</li>
            <li>Learning rate</li>
            <li>Weight decay</li>
        </ul>
    </li>
</ol>
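
As an illustration of how the combinations of these six parameters can be enumerated, here is a sketch of an exhaustive grid. The candidate values and the dictionary keys are hypothetical; the real search space, and whether it is an exhaustive grid at all, is defined inside the project's `tuning` module, which is not shown here.

```python
from itertools import product

# Hypothetical candidate values, chosen only for illustration.
search_space = {
    "num_layers": [1, 2],
    "dropout": [0.0, 0.3],
    "hidden_units": [64, 128],
    "batch_size": [32, 64],
    "learning_rate": [1e-3, 1e-4],
    "weight_decay": [0.0, 1e-5],
}

def iter_combinations(space):
    """Yield one dict per point of the Cartesian product of all value lists."""
    keys = list(space)
    for values in product(*(space[key] for key in keys)):
        yield dict(zip(keys, values))

combinations = list(iter_combinations(search_space))
print(len(combinations))   # 2 ** 6 = 64 combinations
print(combinations[0])
```

Each combination would then be trained once and scored by its validation loss; the grid grows multiplicatively with every added value, which is why the tuning log is long.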

The parameters of the final model are the ones that result in the minimum ***validation error***. <br>
Also, ***early stopping*** is used in order to avoid *overfitting*.
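
The interplay between model selection and early stopping can be sketched with a patience-based rule. This is a common variant used here as an assumption; the criterion implemented in the project's `train` module (its log reports a "training and validation loss difference") may well differ. The validation losses are copied from epochs 0 to 14 of the final training run reported in this notebook.

```python
def select_with_early_stopping(val_losses, patience=5):
    """Track the best validation loss and stop once it has not improved
    for `patience` consecutive epochs.

    Returns (best_epoch, best_loss, stopped_at); stopped_at is None if the
    losses run out before the patience is exhausted.
    """
    best_epoch, best_loss = 0, float("inf")
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_epoch, best_loss = epoch, loss    # new best: keep this checkpoint
        elif epoch - best_epoch >= patience:
            return best_epoch, best_loss, epoch    # no improvement for too long
    return best_epoch, best_loss, None

val_losses = [0.5688, 0.5760, 0.5757, 0.5671, 0.5631, 0.5810, 0.5661, 0.5778,
              0.5721, 0.5888, 0.6114, 0.6008, 0.6238, 0.6413, 0.6932]

best_epoch, best_loss, stopped_at = select_with_early_stopping(val_losses)
print(best_epoch, best_loss, stopped_at)   # 4 0.5631 9
```

The selected model is the checkpoint from the best epoch, not the one from the epoch at which training stopped.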

In [5]:
# Tune parameters
best_parameters = tune_parameters(pos_weight=pos_weight)

# Show best parameters
print(f'\nBest hyperparameters were: {best_parameters}\n')
# Store best parameters (the with-block closes the file on exit)
with open('./best_parameters_same_sets.txt', 'w') as f:
    json.dump(best_parameters, f)


Test number 1 | Start testing new parameter-combination...

Dataset loading...
Dataset loading completed

Model loading...
Model loading completed

Number of model parameters: 102889

EPOCH | 0
Training Loss   : 0.5716
Validation Loss : 0.5788

EPOCH | 1
Training Loss   : 0.5710
Validation Loss : 0.5791

EPOCH | 2
Training Loss   : 0.5685
Validation Loss : 0.5769

EPOCH | 3
Training Loss   : 0.5665
Validation Loss : 0.5692

EPOCH | 4
Training Loss   : 0.5620
Validation Loss : 0.5730

EPOCH | 5
Training Loss   : 0.5597
Validation Loss : 0.5733

EPOCH | 6
Training Loss   : 0.5523
Validation Loss : 0.5643

EPOCH | 7
Training Loss   : 0.5466
Validation Loss : 0.5908

EPOCH | 8
Training Loss   : 0.5397
Validation Loss : 0.5674

EPOCH | 9
Training Loss   : 0.5282
Validation Loss : 0.5813

EPOCH | 10
Training Loss   : 0.5179
Validation Loss : 0.5862

EPOCH | 11
Training Loss   : 0.4965
Validation Loss : 0.5967

EPOCH | 12
Training Loss   : 0.4767
Validation Loss : 0.6203

EPOCH | 13
Training

Training Loss   : 0.5477
Validation Loss : 0.5778

EPOCH | 8
Training Loss   : 0.5401
Validation Loss : 0.5721

EPOCH | 9
Training Loss   : 0.5302
Validation Loss : 0.5888

EPOCH | 10
Training Loss   : 0.5182
Validation Loss : 0.6114

EPOCH | 11
Training Loss   : 0.5002
Validation Loss : 0.6008

EPOCH | 12
Training Loss   : 0.4770
Validation Loss : 0.6238

EPOCH | 13
Training Loss   : 0.4483
Validation Loss : 0.6413

EPOCH | 14
Training Loss   : 0.4113
Validation Loss : 0.6932

EPOCH | 15
Early stopping activated, with training and validation loss difference: 0.0008

Test number 9 | Start testing new parameter-combination...

Dataset loading...
Dataset loading completed

Model loading...
Model loading completed

Number of model parameters: 105193

EPOCH | 0
Training Loss   : 0.5705
Validation Loss : 0.5709

EPOCH | 1
Training Loss   : 0.5681
Validation Loss : 0.5783

EPOCH | 2
Training Loss   : 0.5672
Validation Loss : 0.5801

EPOCH | 3
Training Loss   : 0.5658
Validation Loss : 0.5683

Dataset loading completed

Model loading...
Model loading completed

Number of model parameters: 210641

EPOCH | 0
Training Loss   : 0.5717
Validation Loss : 0.5749

EPOCH | 1
Training Loss   : 0.5713
Validation Loss : 0.5803

EPOCH | 2
Training Loss   : 0.5683
Validation Loss : 0.5901

EPOCH | 3
Training Loss   : 0.5668
Validation Loss : 0.5713

EPOCH | 4
Training Loss   : 0.5644
Validation Loss : 0.5684

EPOCH | 5
Training Loss   : 0.5595
Validation Loss : 0.5698

EPOCH | 6
Training Loss   : 0.5584
Validation Loss : 0.5751

EPOCH | 7
Training Loss   : 0.5524
Validation Loss : 0.5839

EPOCH | 8
Training Loss   : 0.5418
Validation Loss : 0.5650

EPOCH | 9
Training Loss   : 0.5338
Validation Loss : 0.5703

EPOCH | 10
Training Loss   : 0.5264
Validation Loss : 0.5900

EPOCH | 11
Early stopping activated, with training and validation loss difference: 0.0033

Test number 18 | Start testing new parameter-combination...

Dataset loading...
Dataset loading completed

Model loading...
Model lo

Training Loss   : 0.5404
Validation Loss : 0.5664

EPOCH | 10
Training Loss   : 0.5280
Validation Loss : 0.5891

EPOCH | 11
Training Loss   : 0.5102
Validation Loss : 0.5894

EPOCH | 12
Early stopping activated, with training and validation loss difference: 0.0014

Test number 26 | Start testing new parameter-combination...

Dataset loading...
Dataset loading completed

Model loading...
Model loading completed

Number of model parameters: 414625

EPOCH | 0
Training Loss   : 0.5783
Validation Loss : 0.5964

EPOCH | 1
Training Loss   : 0.5724
Validation Loss : 0.5739

EPOCH | 2
Training Loss   : 0.5729
Validation Loss : 0.5843

EPOCH | 3
Training Loss   : 0.5681
Validation Loss : 0.5729

EPOCH | 4
Training Loss   : 0.5658
Validation Loss : 0.5847

EPOCH | 5
Training Loss   : 0.5647
Validation Loss : 0.5728

EPOCH | 6
Training Loss   : 0.5626
Validation Loss : 0.5833

EPOCH | 7
Training Loss   : 0.5552
Validation Loss : 0.5649

EPOCH | 8
Training Loss   : 0.5459
Validation Loss : 0.5894



Training Loss   : 0.5577
Validation Loss : 0.5865

EPOCH | 8
Training Loss   : 0.5505
Validation Loss : 0.5602

EPOCH | 9
Training Loss   : 0.5421
Validation Loss : 0.5737

EPOCH | 10
Training Loss   : 0.5303
Validation Loss : 0.5697

EPOCH | 11
Training Loss   : 0.5156
Validation Loss : 0.5990

EPOCH | 12
Early stopping activated, with training and validation loss difference: 0.0026

Test number 34 | Start testing new parameter-combination...

Dataset loading...
Dataset loading completed

Model loading...
Model loading completed

Number of model parameters: 448417

EPOCH | 0
Training Loss   : 0.5763
Validation Loss : 0.5936

EPOCH | 1
Training Loss   : 0.5748
Validation Loss : 0.5722

EPOCH | 2
Training Loss   : 0.5746
Validation Loss : 0.5796

EPOCH | 3
Training Loss   : 0.5701
Validation Loss : 0.5880

EPOCH | 4
Training Loss   : 0.5676
Validation Loss : 0.5800

EPOCH | 5
Training Loss   : 0.5658
Validation Loss : 0.5757

EPOCH | 6
Training Loss   : 0.5614
Validation Loss : 0.5771



Dataset loading completed

Model loading...
Model loading completed

Number of model parameters: 904001

EPOCH | 0
Training Loss   : 0.5929
Validation Loss : 0.6009

EPOCH | 1
Training Loss   : 0.5857
Validation Loss : 0.5984

EPOCH | 2
Training Loss   : 0.5860
Validation Loss : 0.5932

EPOCH | 3
Training Loss   : 0.5786
Validation Loss : 0.5740

EPOCH | 4
Training Loss   : 0.5748
Validation Loss : 0.5892

EPOCH | 5
Training Loss   : 0.5726
Validation Loss : 0.5955

EPOCH | 6
Training Loss   : 0.5687
Validation Loss : 0.5808

EPOCH | 7
Training Loss   : 0.5649
Validation Loss : 0.5874

EPOCH | 8
Training Loss   : 0.5578
Validation Loss : 0.5900

EPOCH | 9
Training Loss   : 0.5529
Validation Loss : 0.5849

EPOCH | 10
Training Loss   : 0.5391
Validation Loss : 0.5811

EPOCH | 11
Training Loss   : 0.5253
Validation Loss : 0.5852

EPOCH | 12
Training Loss   : 0.5020
Validation Loss : 0.6183

EPOCH | 13
Training Loss   : 0.4753
Validation Loss : 0.6091

EPOCH | 14
Early stopping activated, 

Training Loss   : 0.5682
Validation Loss : 0.5768

EPOCH | 7
Training Loss   : 0.5615
Validation Loss : 0.6138

EPOCH | 8
Training Loss   : 0.5568
Validation Loss : 0.5739

EPOCH | 9
Training Loss   : 0.5466
Validation Loss : 0.6015

EPOCH | 10
Training Loss   : 0.5383
Validation Loss : 0.5943

EPOCH | 11
Training Loss   : 0.5169
Validation Loss : 0.5971

EPOCH | 12
Training Loss   : 0.5008
Validation Loss : 0.6016

EPOCH | 13
Early stopping activated, with training and validation loss difference: 0.0021

Test number 49 | Start testing new parameter-combination...

Dataset loading...
Dataset loading completed

Model loading...
Model loading completed

Number of model parameters: 104041

EPOCH | 0
Training Loss   : 0.5458
Validation Loss : 0.5795

EPOCH | 1
Training Loss   : 0.3616
Validation Loss : 0.6428

EPOCH | 2
Training Loss   : 0.1837
Validation Loss : 0.7592

EPOCH | 3
Training Loss   : 0.0977
Validation Loss : 0.8836

EPOCH | 4
Training Loss   : 0.0571
Validation Loss : 0.9890


Training Loss   : 0.5709
Validation Loss : 0.5688

EPOCH | 1
Training Loss   : 0.5679
Validation Loss : 0.5760

EPOCH | 2
Training Loss   : 0.5670
Validation Loss : 0.5757

EPOCH | 3
Training Loss   : 0.5654
Validation Loss : 0.5671

EPOCH | 4
Training Loss   : 0.5639
Validation Loss : 0.5631

EPOCH | 5
Training Loss   : 0.5603
Validation Loss : 0.5810

EPOCH | 6
Training Loss   : 0.5525
Validation Loss : 0.5661

EPOCH | 7
Training Loss   : 0.5477
Validation Loss : 0.5778

EPOCH | 8
Training Loss   : 0.5401
Validation Loss : 0.5721

EPOCH | 9
Training Loss   : 0.5302
Validation Loss : 0.5888

EPOCH | 10
Training Loss   : 0.5182
Validation Loss : 0.6114

EPOCH | 11
Training Loss   : 0.5002
Validation Loss : 0.6008

EPOCH | 12
Training Loss   : 0.4770
Validation Loss : 0.6238

EPOCH | 13
Training Loss   : 0.4483
Validation Loss : 0.6413

EPOCH | 14
Training Loss   : 0.4113
Validation Loss : 0.6932

EPOCH | 15
Early stopping activated, with training and validation loss difference: 0.0008


Training Loss   : 0.4518
Validation Loss : 0.7002

EPOCH | 1
Training Loss   : 0.1617
Validation Loss : 1.0713

EPOCH | 2
Training Loss   : 0.0535
Validation Loss : 1.3698

EPOCH | 3
Training Loss   : 0.0582
Validation Loss : 1.4344

EPOCH | 4
Training Loss   : 0.0390
Validation Loss : 1.5413

EPOCH | 5
Training Loss   : 0.0127
Validation Loss : 1.6513

EPOCH | 6
Training Loss   : 0.0058
Validation Loss : 1.7250

EPOCH | 7
Training Loss   : 0.0044
Validation Loss : 1.7595

EPOCH | 8
Training Loss   : 0.0036
Validation Loss : 1.7671

EPOCH | 9
Training Loss   : 0.0032
Validation Loss : 1.7552

EPOCH | 10
Training Loss   : 0.0030
Validation Loss : 1.7556

EPOCH | 11
Early stopping activated, with training and validation loss difference: 0.2484

Test number 66 | Start testing new parameter-combination...

Dataset loading...
Dataset loading completed

Model loading...
Model loading completed

Number of model parameters: 104041

EPOCH | 0
Training Loss   : 0.4940
Validation Loss : 0.6392

E

Training Loss   : 0.1139
Validation Loss : 0.7695

EPOCH | 9
Training Loss   : 0.0985
Validation Loss : 0.7954

EPOCH | 10
Training Loss   : 0.0865
Validation Loss : 0.8196

EPOCH | 11
Early stopping activated, with training and validation loss difference: 0.0161

Test number 74 | Start testing new parameter-combination...

Dataset loading...
Dataset loading completed

Model loading...
Model loading completed

Number of model parameters: 104041

EPOCH | 0
Training Loss   : 0.5648
Validation Loss : 0.5809

EPOCH | 1
Training Loss   : 0.5045
Validation Loss : 0.5864

EPOCH | 2
Training Loss   : 0.4041
Validation Loss : 0.6038

EPOCH | 3
Training Loss   : 0.3059
Validation Loss : 0.6305

EPOCH | 4
Training Loss   : 0.2315
Validation Loss : 0.6621

EPOCH | 5
Training Loss   : 0.1791
Validation Loss : 0.6922

EPOCH | 6
Training Loss   : 0.1421
Validation Loss : 0.7211

EPOCH | 7
Training Loss   : 0.1157
Validation Loss : 0.7514

EPOCH | 8
Training Loss   : 0.0965
Validation Loss : 0.7817

E

Training Loss   : 0.4279
Validation Loss : 0.7584

EPOCH | 3
Training Loss   : 0.3841
Validation Loss : 0.8422

EPOCH | 4
Training Loss   : 0.3509
Validation Loss : 0.8757

EPOCH | 5
Training Loss   : 0.2984
Validation Loss : 1.0412

EPOCH | 6
Training Loss   : 0.2538
Validation Loss : 1.0502

EPOCH | 7
Training Loss   : 0.2205
Validation Loss : 1.1965

EPOCH | 8
Training Loss   : 0.1952
Validation Loss : 1.1952

EPOCH | 9
Training Loss   : 0.1526
Validation Loss : 1.4399

EPOCH | 10
Training Loss   : 0.1271
Validation Loss : 1.4700

EPOCH | 11
Early stopping activated, with training and validation loss difference: 0.0737

Test number 83 | Start testing new parameter-combination...

Dataset loading...
Dataset loading completed

Model loading...
Model loading completed

Number of model parameters: 104041

EPOCH | 0
Training Loss   : 0.5523
Validation Loss : 0.6486

EPOCH | 1
Training Loss   : 0.5256
Validation Loss : 0.6112

EPOCH | 2
Training Loss   : 0.5023
Validation Loss : 0.6371

E

**Train with the best parameters**: The LSTM is trained using the selected parameters.

In [6]:
# Access the best parameters in order to train final model
with open('best_parameters_same_sets.txt') as f:
    best_parameters_loaded = json.load(f)

print('\nNow training with the best parameters\n')
training(params=best_parameters_loaded, make_err_logs=True)


Now training with the best parameters

Dataset loading...
Dataset loading completed

Model loading...
Model loading completed

Number of model parameters: 104041

EPOCH | 0
Training Loss   : 0.5709
Validation Loss : 0.5688

EPOCH | 1
Training Loss   : 0.5679
Validation Loss : 0.5760

EPOCH | 2
Training Loss   : 0.5670
Validation Loss : 0.5757

EPOCH | 3
Training Loss   : 0.5654
Validation Loss : 0.5671

EPOCH | 4
Training Loss   : 0.5639
Validation Loss : 0.5631

EPOCH | 5
Training Loss   : 0.5603
Validation Loss : 0.5810

EPOCH | 6
Training Loss   : 0.5525
Validation Loss : 0.5661

EPOCH | 7
Training Loss   : 0.5477
Validation Loss : 0.5778

EPOCH | 8
Training Loss   : 0.5401
Validation Loss : 0.5721

EPOCH | 9
Training Loss   : 0.5302
Validation Loss : 0.5888

EPOCH | 10
Training Loss   : 0.5182
Validation Loss : 0.6114

EPOCH | 11
Training Loss   : 0.5002
Validation Loss : 0.6008

EPOCH | 12
Training Loss   : 0.4770
Validation Loss : 0.6238

EPOCH | 13
Training Loss   : 0.4483
Vali

0.5631081468603584

The following Figure shows the ***training and validation set errors***, the epoch where the ***early stopping*** was activated, as well as the epoch of the final ***selected model***.

In [7]:
display(HTML('<img src="plots/train_valid_error.png?%d" height=400 width=400>' % __counter__))

**Test the model**: Predict the satisfiability of the instances in the *Test Set* and get the corresponding metrics.

In [8]:
print('\nResults on the test set:\n')
testing(params=best_parameters_loaded)


Results on the test set:

Dataset loading...
Dataset loading completed

Model loading...

Model loading completed


Test set metrics:

 Confusion matrix: 
 [[699 637]
 [ 34  37]]
F1 Score  : 0.0993
Accuracy  : 0.5231
Precision : 0.5211
Recall    : 0.0549
ROC AUC   : 0.5043
Test Loss with final model : 0.5796111920340494
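
The reported scores can be recomputed from the four counts of the confusion matrix. The sketch assumes that rows are predicted labels and columns are true labels, with class 1 as the positive class; this is the layout under which the printed precision and recall are reproduced. Under the transposed layout (scikit-learn's default, rows as true labels) the two values would swap.

```python
# Counts copied from the printed confusion matrix.
cm = [[699, 637],
      [34, 37]]

# Assumed layout: cm[predicted][true], with class 1 as the positive class.
tn, fn = cm[0]   # predicted negative: true negatives, false negatives
fp, tp = cm[1]   # predicted positive: false positives, true positives

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"F1 Score  : {f1:.4f}")
print(f"Accuracy  : {accuracy:.4f}")
print(f"Precision : {precision:.4f}")
print(f"Recall    : {recall:.4f}")
```

Under this reading the model predicts the positive class for only 71 of the 1407 test instances, which is what drives the recall, and with it the F1 score, so low.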


The following **Figures** show:
<ol>
<li>The <b>confusion matrix</b> for the test-set predictions</li>
<li>The <b>ROC curve</b> (with its AUC) for the test-set predictions</li>
<li>The <b>precision-recall curve</b> for the test-set predictions</li>
</ol>

In [9]:
print("\n1.")
display(HTML('<img src="plots/cm.png?%d" height=500 width=500>' % __counter__))
print("2.")
display(HTML('<img src="plots/roc_auc.png?%d" height=450 width=450>' % __counter__))
print("3.")
display(HTML('<img src="plots/pr.png?%d" height=450 width=450>' % __counter__))


1.


2.


3.


As the metrics above show, ***the results obtained with this model and this architecture are not very promising*** given the current amount of data. For that reason, the model is not tested on a test set drawn from a different data distribution.