# Exact Rule Learning via Boolean Compressed Sensing

The approach is best broken into two parts:
1. Rule Mining
2. Rule Selection

### Rule Mining with SLIPPER
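SLIPPER is the algorithm that learns the candidate rules, each with an associated confidence. For this application, the confidence is not needed; only the rules themselves matter. SLIPPER trains its weak learners, one rule per boosting round, on a reweighted distribution over the observations.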

In [1]:
import gurobipy as gp
from gurobipy import GRB

In [2]:
import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.datasets import load_iris

In [3]:
data = load_iris()

In [4]:
data_df = pd.DataFrame(data['data'])
labels = pd.DataFrame(data['target'])

### Work in Progress on Rule Building

In [5]:
def SLIPPER(data, labels, max_iter=1000):
    # Accept DataFrames or arrays; positional indexing below needs NumPy arrays
    data, labels = np.asarray(data), np.asarray(labels)
    obs = data.shape[0]  # number of observations
    D = np.full(obs, 1 / obs)  # initialize to a uniform distribution over all observations

    for t in range(max_iter):
        # 1. Train the weak learner using the current distribution D
        # (a) Split the data into grow (2/3) and prune (1/3) sets
        grow_idx = np.random.choice(obs, int(np.floor((2 / 3) * obs)), replace=False)
        prune_idx = np.setdiff1d(np.arange(obs), grow_idx)

        grow_X, grow_Y = data[grow_idx], labels[grow_idx]
        prune_X, prune_Y = data[prune_idx], labels[prune_idx]

        # (b) GrowRule:
        #

### Make some fake rules

#### Create Measurement Matrix

In [6]:
target = data['target']
temp = data['data']

In [7]:
col1 = temp[:, 0] > 1  # Sepal Length (no rule)
col2 = temp[:, 1] > 1  # Sepal Width (no rule)
col3 = temp[:, 2] <= 5.35  # Petal length
col4 = temp[:, 3] <= 1.7  # Petal width
col5 = temp[:, 3] > 0.875  # Petal width
measurement = np.vstack((col1, col2, col3, col4, col5)).T.astype(int)
measurement.shape

(150, 5)
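
The hand-written columns above can also be generated from a small rule table. The following is a minimal sketch, not part of the original notebook: the helper `build_measurement`, the `RULES` list, and the two demo rows are hypothetical names and made-up values chosen for illustration.

```python
import numpy as np

# Hypothetical rule spec: (column index, comparison, threshold)
RULES = [
    (2, "<=", 5.35),   # petal length
    (3, "<=", 1.7),    # petal width
    (3, ">", 0.875),   # petal width
]

def build_measurement(X, rules):
    """Return an (n_samples, n_rules) 0/1 matrix.

    Entry (i, j) is 1 when sample i satisfies rule j.
    """
    cols = []
    for col, op, thr in rules:
        if op == "<=":
            cols.append(X[:, col] <= thr)
        elif op == ">":
            cols.append(X[:, col] > thr)
        else:
            raise ValueError(f"unsupported comparison: {op}")
    return np.column_stack(cols).astype(int)

# Two made-up rows in iris column order (sepal length, sepal width, petal length, petal width)
X_demo = np.array([[5.1, 3.5, 1.4, 0.2],
                   [6.0, 2.9, 4.5, 1.5]])
M_demo = build_measurement(X_demo, RULES)
print(M_demo)
```

Keeping the rules in a table makes it easier to swap in rules mined by a learner later, since only `RULES` would change.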

#### Create positive and negative matrices

In [65]:
A_p = measurement[np.where(target != 1)].astype(float)
A_n = measurement[np.where(target == 1)].astype(float)

A_p = 1 - A_p
A_n = 1 - A_n
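
One plausible reading of the complement step, assuming the goal is to learn a conjunction (AND) of rules: by De Morgan's law, "all selected rules hold" is the negation of "some selected rule fails", so an AND on the original matrix becomes an OR on the complemented one. The small check below uses made-up values for `A` and `w` and is only a sketch of that identity.

```python
import numpy as np

# Made-up 0/1 measurement rows (samples x rules) and a made-up rule selection
A = np.array([[1, 1, 1],
              [1, 0, 1],
              [0, 1, 1],
              [1, 1, 0]])
w = np.array([1, 1, 0])  # select rules 0 and 1

# Conjunction on the original matrix: every selected rule must hold
and_pred = np.all(A[:, w == 1] == 1, axis=1)

# Disjunction on the complement: at least one selected rule fails
or_on_complement = ((1 - A) @ w) >= 1

# De Morgan: AND over A equals NOT (OR over 1 - A)
assert np.array_equal(and_pred, ~or_on_complement)
print(and_pred)
```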

In [56]:
target.shape

(150,)

In [22]:
np.where(target!=0)[0].shape

(100,)

In [19]:
np.where(target==1)[0].shape

(50,)

### Get the Model 

In [66]:
m = gp.Model("rule-extraction")

In [67]:
w = m.addMVar(shape=measurement.shape[1], name="weights")
psi_p = m.addMVar(shape=A_p.shape[0], name="psi_p")
psi_n = m.addMVar(shape=A_n.shape[0], name="psi_n")

In [68]:
m.addConstr(w <= 1.0)
m.addConstr(w >= 0.0)
m.addConstr(psi_p <= 1)
m.addConstr(psi_p >= 0)
m.addConstr(psi_n >= 0)
m.addConstr(A_p @ w + psi_p >= 1.0)
m.addConstr(A_n @ w == psi_n)
m.update()

In [69]:
m.setObjective(sum(w) + 1000 * (sum(psi_p) + sum(psi_n)), GRB.MINIMIZE)
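
Transcribed from the variables, constraints, and objective defined so far, the model appears to be the following linear program. All variables are continuous, so it is a relaxation rather than a Boolean program, and the penalty of 1000 is the constant used in the objective. The notation $\mathbf{1}$ for the all-ones vector is introduced here for compactness.

$$
\begin{aligned}
\min_{w,\ \psi_p,\ \psi_n} \quad & \mathbf{1}^\top w + 1000\left(\mathbf{1}^\top \psi_p + \mathbf{1}^\top \psi_n\right) \\
\text{s.t.} \quad & A_p w + \psi_p \ge \mathbf{1}, \\
& A_n w = \psi_n, \\
& 0 \le w \le 1, \quad 0 \le \psi_p \le 1, \quad \psi_n \ge 0.
\end{aligned}
$$

The first term favors selecting few rules, while the slack variables $\psi_p$ and $\psi_n$ absorb rows that the selected rules cannot fit.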

In [70]:
m.optimize()

Gurobi Optimizer version 9.0.3 build v9.0.3rc0 (linux64)
Optimize a model with 410 rows, 155 columns and 536 nonzeros
Model fingerprint: 0x62236fb6
Coefficient statistics:
  Matrix range     [1e+00, 1e+00]
  Objective range  [1e+00, 1e+03]
  Bounds range     [0e+00, 0e+00]
  RHS range        [1e+00, 1e+00]
Presolve removed 410 rows and 155 columns
Presolve time: 0.00s
Presolve: All rows and columns removed
Iteration    Objective       Primal Inf.    Dual Inf.      Time
       0    4.0030000e+03   0.000000e+00   1.003000e+03      0s
Extra 4 simplex iterations after uncrush
       4    4.0030000e+03   0.000000e+00   0.000000e+00      0s

Solved in 4 iterations and 0.01 seconds
Optimal objective  4.003000000e+03


In [71]:
m.getVarByName("weights[0]")
for i in range(measurement.shape[1]):
    print(m.getVarByName("weights[" + str(i) + "]"))

<gurobi.Var weights[0] (value 0.0)>
<gurobi.Var weights[1] (value 0.0)>
<gurobi.Var weights[2] (value 1.0)>
<gurobi.Var weights[3] (value 1.0)>
<gurobi.Var weights[4] (value 1.0)>
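
To read the solution as a rule, the nonzero weights can be mapped back to the columns of `measurement`. The sketch below makes two assumptions that the notebook does not state: that the complemented formulation is meant to select a conjunction of rules, and that a weight above 0.5 counts as "selected". The labels in `rule_text` are hypothetical names taken from the comments in the cell that builds the matrix, and the weight values are copied from the printed solution.

```python
import numpy as np

# Hypothetical labels for the five columns of `measurement`
rule_text = [
    "sepal length > 1",
    "sepal width > 1",
    "petal length <= 5.35",
    "petal width <= 1.7",
    "petal width > 0.875",
]

# Weight values copied from the printed solution
w_values = np.array([0.0, 0.0, 1.0, 1.0, 1.0])

# Assumption: treat a weight above 0.5 as "rule selected"
selected = [rule_text[j] for j in np.flatnonzero(w_values > 0.5)]
rule = " AND ".join(selected)
print(rule)

# Arithmetic check on the reported objective: 4003 = sum(w) + 1000 * total slack
total_slack = (4003.0 - w_values.sum()) / 1000
print(total_slack)
```

Under these assumptions the selected rule uses only the petal features, and the reported objective corresponds to a total slack of 4 summed over `psi_p` and `psi_n`.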


In [72]:
for val in m.getVars():
    print(val.varName, val.x)

weights[0] 0.0
weights[1] 0.0
weights[2] 1.0
weights[3] 1.0
weights[4] 1.0
psi_p[0] 0.0
psi_p[1] 0.0
psi_p[2] 0.0
psi_p[3] 0.0
psi_p[4] 0.0
psi_p[5] 0.0
psi_p[6] 0.0
psi_p[7] 0.0
psi_p[8] 0.0
psi_p[9] 0.0
psi_p[10] 0.0
psi_p[11] 0.0
psi_p[12] 0.0
psi_p[13] 0.0
psi_p[14] 0.0
psi_p[15] 0.0
psi_p[16] 0.0
psi_p[17] 0.0
psi_p[18] 0.0
psi_p[19] 0.0
psi_p[20] 0.0
psi_p[21] 0.0
psi_p[22] 0.0
psi_p[23] 0.0
psi_p[24] 0.0
psi_p[25] 0.0
psi_p[26] 0.0
psi_p[27] 0.0
psi_p[28] 0.0
psi_p[29] 0.0
psi_p[30] 0.0
psi_p[31] 0.0
psi_p[32] 0.0
psi_p[33] 0.0
psi_p[34] 0.0
psi_p[35] 0.0
psi_p[36] 0.0
psi_p[37] 0.0
psi_p[38] 0.0
psi_p[39] 0.0
psi_p[40] 0.0
psi_p[41] 0.0
psi_p[42] 0.0
psi_p[43] 0.0
psi_p[44] 0.0
psi_p[45] 0.0
psi_p[46] 0.0
psi_p[47] 0.0
psi_p[48] 0.0
psi_p[49] 0.0
psi_p[50] 0.0
psi_p[51] 0.0
psi_p[52] 0.0
psi_p[53] 0.0
psi_p[54] 0.0
psi_p[55] 0.0
psi_p[56] 1.0
psi_p[57] 0.0
psi_p[58] 0.0
psi_p[59] 0.0
psi_p[60] 0.0
psi_p[61] 0.0
psi_p[62] 0.0
psi_p[63] 0.0
psi_p[64] 0.0
psi_p[65] 0.0
psi_p[66] 0