# Refactored BinaPs Demo

Copyright 2022 Bernardo C. Rodrigues

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later
version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the
implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
details. You should have received a copy of the GNU General Public License along with this program. If not, 
see <https://www.gnu.org/licenses/>. 

This notebook demonstrates an overhauled BinaPs implementation that is intended to be more flexible and
developer-friendly. It consists of the following steps:

**1. Generate synthetic data:** Using one of the provided scripts, we generate synthetic data in which a set of known
patterns is planted.

**2. Run BinaPs:** We run BinaPs over this synthetic dataset and obtain a list of inferred patterns.


In [1]:
import torch

# Check whether a CUDA device is available and report its name
if torch.cuda.is_available():
    print(f"CUDA Device: {torch.cuda.get_device_name(0)}")
else:
    print("No CUDA device available")

CUDA Device: NVIDIA GeForce GTX 1050


In [2]:
from lib.BinapsWrapper import generate_synthetic_data

output_file = "data"

row_quantity = 10000
column_quantity = 20
max_pattern_size = 10
noise = 0.001
density = 0.05

generate_synthetic_data(row_quantity, column_quantity, output_file, max_pattern_size, noise, density)

Rscript binaps/Data/Synthetic_data/generate_toy.R AND 20 10000 10 data 0.001 0.05
[1] 108
[1] 10000   108
[1] "Added noise."
[1] "Converted to dat file."
[1] "Removed rows without content."
         used (Mb) gc trigger (Mb) max used (Mb)
Ncells 367397 19.7     641325 34.3   641325 34.3
Vcells 720519  5.5    8388608 64.0  5384870 41.1
[1] 10000    86
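
The log above reports the matrix shape before and after conversion to a `.dat` file. As a rough sanity check on such a file, the sketch below turns transaction-style lines into a dense 0/1 matrix and computes its density. It assumes that each line lists the space-separated, 1-based indices of the active columns; that format is an assumption made here for illustration and is not confirmed by this notebook. The helper `dat_lines_to_dense` is hypothetical and does not replace `parse_dat_file`.

```python
def dat_lines_to_dense(lines, n_columns):
    """Convert transaction-style lines to a dense 0/1 matrix (list of lists).

    Assumption: each non-empty line holds space-separated, 1-based column indices.
    """
    matrix = []
    for line in lines:
        items = line.split()
        if not items:
            continue  # skip rows without content
        row = [0] * n_columns
        for item in items:
            row[int(item) - 1] = 1  # shift 1-based index to 0-based
        matrix.append(row)
    return matrix


# Tiny hand-made sample instead of the real data file
sample_lines = ["1 2 3", "2 5", "", "4"]
dense = dat_lines_to_dense(sample_lines, n_columns=5)

# Fraction of cells set to 1
density = sum(map(sum, dense)) / (len(dense) * 5)
print(len(dense), "rows, density", density)
```

On real data, the lines would come from reading the generated file, and the measured density could be compared against the `density` and `noise` values passed to the generator.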



In [3]:
%load_ext autoreload
%autoreload 2

In [10]:
import torch

import numpy as np

from lib.binaps.network import learn, get_patterns
from lib.binaps.dataset import BinaryDataset, parse_dat_file, divide_data

data_file = "data.dat"
proportion = 0.9  # split proportion passed to divide_data

# Use the first GPU when available; torch.device is not shadowed by a local name
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

data = parse_dat_file(data_file)
train_data, test_data = divide_data(data, proportion)

train_dataset = BinaryDataset(train_data, device)
test_dataset = BinaryDataset(test_data, device)

batch_size = 64
test_batch_size = 64
hidden_dim = 20
lr = 0.01
weight_decay = 0
gamma = 0.1
epochs = 100

model, weights = learn(train_dataset, 
                       test_dataset,
                       batch_size,
                       test_batch_size,
                       hidden_dim,
                       lr,
                       weight_decay,
                       gamma,
                       epochs)


on 9: Test set: Average loss: 10.157369, Accuracy: 22/683 (3%)
on 19: Test set: Average loss: 10.201138, Accuracy: 13/683 (2%)
on 29: Test set: Average loss: 9.945162, Accuracy: 29/683 (4%)
on 39: Test set: Average loss: 10.445082, Accuracy: 11/683 (2%)
on 49: Test set: Average loss: 9.255984, Accuracy: 11/683 (2%)
on 59: Test set: Average loss: 9.823914, Accuracy: 26/683 (4%)
on 69: Test set: Average loss: 9.510533, Accuracy: 30/683 (4%)
on 79: Test set: Average loss: 10.414618, Accuracy: 32/683 (5%)
on 89: Test set: Average loss: 1

In [11]:
patterns = get_patterns(weights)

for pattern in patterns:
    print(pattern)

[38 39 65 66 67 68 69 70 88]
[46 47 48 49 50 51 52 53 54]
[0 1 2 3 4 5 6 7 8]
[ 9 10 11 12 13 14 45]
[55 56 57 58 59 60 61 62 63 64]
[ 23  24  25  26  27  99 100 101 102 103]
[ 9 10 11 12 13 14]
[15 16 17 28 29 30 38 39 44 45 87 88 89 90]
[46 47 48 49 50 51 52 53 54]
[ 38  39  44  45  87  88  89  90 104 105 106 107]
[18 19 20 21 22 87]
[91 92 93 94 95 96 97 98]
[104 105 106 107]
[71 72 73 74 75 76 77 78]
[31 32 33 34 35 36 37]
[79 80 81 82 83 84 85 86]
[0 1 2 3 4 5 6 7 8]
[40 41 42 43 44]
[55 56 57 58 59 60 61 62 63 64]
[46 47 48 49 50 51 52 53 54]
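
Several of the printed patterns repeat exactly (for example `[46 47 48 49 50 51 52 53 54]`), and some nearly coincide (`[ 9 10 11 12 13 14 45]` and `[ 9 10 11 12 13 14]`). The sketch below shows one possible post-processing step: dropping exact duplicates and scoring overlap with the Jaccard index. The helpers `unique_patterns` and `jaccard` are hypothetical names introduced here and are not part of the BinaPs library.

```python
def unique_patterns(patterns):
    """Remove exact duplicate patterns, keeping first-seen order."""
    seen = set()
    unique = []
    for pattern in patterns:
        # Works for lists and NumPy arrays alike
        key = tuple(sorted(int(i) for i in pattern))
        if key not in seen:
            seen.add(key)
            unique.append(list(key))
    return unique


def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| between two index collections."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


# Small hand-made example rather than the real model output
example = [[46, 47, 48], [0, 1, 2], [46, 47, 48], [9, 10, 11, 45], [9, 10, 11]]
deduplicated = unique_patterns(example)
for pattern in deduplicated:
    print(pattern)

# Overlap between the two near-duplicate patterns
overlap = jaccard([9, 10, 11, 45], [9, 10, 11])
print(overlap)
```

In the notebook, `unique_patterns(patterns)` could be applied directly to the result of `get_patterns(weights)`. Whether near-duplicates should be merged, and at which Jaccard threshold, is a judgment call left to the reader.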
