# EQUIVALENCE CLASS:
A stimulus class (usually produced through conditional discrimination in matching-to-sample) that includes all possible emergent relations among its members. The properties of an equivalence class are derived from the logical relations of reflexivity, symmetry, and transitivity. **Reflexivity** *refers to the matching of a sample to itself*, sometimes called identity matching (AA, BB, CC; in these examples, each letter pair represents a sample and its matching comparison stimulus). **Symmetry** *refers to the reversibility of a relation (if AB, then BA)*. **Transitivity** *refers to the transfer of the relation to new combinations through shared membership (if AB and BC, then AC)*.
If these properties are characteristics of a matching-to-sample performance, then training AB and BC may produce AC, BA, CA, and CB as emergent relations (reflexivity provides the three other possible relations, AA, BB, and CC). Given AB and BC, for example, the combination of symmetry and transitivity implies the CA relation. The emergence of all possible stimulus relations after only AB and BC are trained through contingencies is the criterion for calling the three stimuli members of an equivalence class. The class can be extended by training new stimulus relations (e.g., if CD is learned, then AD, DA, BD, DB, and DC may be created as emergent relations). Stimuli that are members of an equivalence class are likely also to be functionally equivalent. It remains to be seen whether the logical properties of these classes are fully consistent with their behavioral ones. Cf. **EQUIVALENCE RELATION**. ([source](http://www.scienceofbehavior.com/lms/mod/glossary/view.php?id=408&mode=letter&hook=E&sortkey=CREATION&sortorder=asc&fullsearch=0&page=3))
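
The logical side of this definition can be illustrated with a small sketch. It is not part of the experiments below: `emergent_relations` is a hypothetical helper that closes a set of trained (sample, comparison) pairs under reflexivity, symmetry, and transitivity, then reports every relation that was not trained directly.

```python
from itertools import product

def emergent_relations(trained, stimuli):
    """Close the trained (sample, comparison) pairs under reflexivity,
    symmetry and transitivity; return only the untrained relations."""
    relations = set(trained)
    relations |= {(s, s) for s in stimuli}  # reflexivity: AA, BB, CC
    changed = True
    while changed:
        changed = False
        new = {(b, a) for (a, b) in relations}  # symmetry: if AB then BA
        new |= {(a, d)                          # transitivity: if AB and BC then AC
                for (a, b), (c, d) in product(relations, repeat=2)
                if b == c}
        if not new <= relations:
            relations |= new
            changed = True
    return relations - set(trained)

stimuli = ["A", "B", "C"]
trained = {("A", "B"), ("B", "C")}
derived = emergent_relations(trained, stimuli)
print(sorted("".join(pair) for pair in derived))
# ['AA', 'AC', 'BA', 'BB', 'CA', 'CB', 'CC']
```

Passing the extra trained pair `("C", "D")` together with a fourth stimulus `"D"` adds AD, DA, BD, DB, and DC to the derived set, matching the extension described in the definition.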


# Libraries

In [1]:
import numpy as np
import pandas as pd
import random
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

from sklearn.multiclass import OneVsRestClassifier
from sklearn.metrics import average_precision_score

In [2]:
stims={"A1":[1,0,0,0,0,0,0,0,0,0,0,0],
       "A2":[0,1,0,0,0,0,0,0,0,0,0,0],
       "A3":[0,0,1,0,0,0,0,0,0,0,0,0],
       "A4":[0,0,0,1,0,0,0,0,0,0,0,0],
       "B1":[0,0,0,0,1,0,0,0,0,0,0,0],
       "B2":[0,0,0,0,0,1,0,0,0,0,0,0],
       "B3":[0,0,0,0,0,0,1,0,0,0,0,0],
       "B4":[0,0,0,0,0,0,0,1,0,0,0,0],
       "C1":[0,0,0,0,0,0,0,0,1,0,0,0],
       "C2":[0,0,0,0,0,0,0,0,0,1,0,0],
       "C3":[0,0,0,0,0,0,0,0,0,0,1,0],
       "C4":[0,0,0,0,0,0,0,0,0,0,0,1]
      }

options={"O_1":[1,0,0],
         "O_2":[0,1,0],
         "O_3":[0,0,1],
         "O_0":[0,0,0],
        }

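# Every ordered 4-tuple of stimuli (12**4 = 20736 trials).
# labels holds the stimulus names; values_x holds the four one-hot codes concatenated (48 values per trial).
labels=np.array([[i,j,k,l]
                 for i in stims.keys() for j in stims.keys()
                 for k in stims.keys() for l in stims.keys()])
values_x=np.array([np.array(i+j+k+l)
                   for i in stims.values() for j in stims.values()
                   for k in stims.values() for l in stims.values()])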

# Test 1: Identify

### Can a shallow classifier identify the stimulus presented in the set?

The first step is to check whether a classifier can identify which stimuli are present in a set of four stimuli.
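
As a rough illustration of what "identify" means here, the sketch below builds the input and target for a single trial, assuming the same one-hot coding as the `stims` dictionary. The names `STIM_NAMES`, `one_hot`, and `encode_trial` are hypothetical helpers introduced only for this example: the input concatenates the four one-hot codes, and the target is their element-wise OR, a 12-bit vector marking every stimulus that appears in the trial.

```python
import numpy as np

# Same ordering as the stims dictionary: A1..A4, B1..B4, C1..C4
STIM_NAMES = [f"{letter}{n}" for letter in "ABC" for n in range(1, 5)]

def one_hot(name):
    """One-hot code of a stimulus name over the 12 stimuli."""
    vec = np.zeros(len(STIM_NAMES), dtype=int)
    vec[STIM_NAMES.index(name)] = 1
    return vec

def encode_trial(names):
    """Return (x, y): x concatenates the one-hot codes of the stimuli,
    y is their element-wise OR (which stimuli are present)."""
    codes = [one_hot(n) for n in names]
    x = np.concatenate(codes)
    y = np.bitwise_or.reduce(codes)
    return x, y

x, y = encode_trial(["A2", "C2", "B4", "B3"])
print(x.shape)  # (48,)
print(y)        # [0 1 0 0 0 0 1 1 0 1 0 0]
```

Because several bits of the target can be set at once, this is a multi-label problem, which is one plausible reason to wrap each classifier in `OneVsRestClassifier`.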

In [3]:
h = .02  # step size in the mesh

names = [
    "Nearest Neighbors", 
    "Linear SVM", 
    "RBF SVM", 
#    "Gaussian Process",
    "Decision Tree", 
    "Random Forest", 
    "Neural Net", 
    "AdaBoost",
    "Naive Bayes", 
    "QDA"
]

classifiers = [
    KNeighborsClassifier(3),
    SVC(kernel="linear", C=0.025),
    SVC(gamma=2, C=1),
#    GaussianProcessClassifier(1.0 * RBF(1.0), warm_start=True),
    DecisionTreeClassifier(max_depth=5),
    RandomForestClassifier(max_depth=5, n_estimators=10, max_features=1),
    MLPClassifier(alpha=1),
    AdaBoostClassifier(),
    GaussianNB(),
    QuadraticDiscriminantAnalysis()
]

In [4]:
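# Multi-label target: element-wise OR of the four one-hot codes,
# i.e. a 12-bit vector marking which stimuli appear anywhere in the trial.
test1_y=np.array([list(np.bitwise_or(np.bitwise_or(np.bitwise_or(i,j),k),l))
                  for i in stims.values() for j in stims.values()
                  for k in stims.values() for l in stims.values()])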

In [5]:
def test_1_view ():
    # Selects a random stimulus and shows the corresponding encoding, labels and y value
    n_dat=random.randrange(len(values_x))
    print(n_dat)
    print(values_x[n_dat,0:12])
    print(values_x[n_dat,12:24])
    print(values_x[n_dat,24:36])
    print(values_x[n_dat,36:48])
    print(test1_y[n_dat,:])
    print(labels[n_dat,:])

In [6]:
test_1_view ()

6342
[0 1 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 1 0 0]
[0 0 0 0 0 0 0 1 0 0 0 0]
[0 0 0 0 0 0 1 0 0 0 0 0]
[0 1 0 0 0 0 1 1 0 1 0 0]
['A2' 'C2' 'B4' 'B3']


In [7]:
X_train, X_test, y_train, y_test = train_test_split(values_x, test1_y, test_size=.4, random_state=42)

In [8]:
#clsf=OneVsRestClassifier(classifiers[8]).fit(X_train, y_train)  # index 8 is the last entry (QDA) while Gaussian Process is commented out
#average_precision_score(y_test,clsf.predict(X_test))
#print(clsf.predict(X_test[0].reshape(1, -1)))
#print(y_test[0])

In [9]:
avg_scores=[]
for name, clf in zip(names, classifiers):
    clasif=OneVsRestClassifier(clf).fit(X_train, y_train)
    scr=average_precision_score(y_test,clasif.predict(X_test))
    avg_scores.append(scr)
    print(name,scr)


('Nearest Neighbors', 0.86024148519679766)
('Linear SVM', 1.0)
('RBF SVM', 0.43721867758549932)
('Decision Tree', 1.0)
('Random Forest', 0.32453160152933702)
('Neural Net', 1.0)
('AdaBoost', 1.0)
('Naive Bayes', 1.0)
('QDA', 0.72034975041386939)




# Reflexivity test
### Can a shallow classifier mark the correct position of the sample when it is presented among the comparison stimuli?
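
The target for this test can be sketched on stimulus names alone. The function `reflexivity_target` below is a hypothetical helper, not code from this notebook: given a sample and three comparison stimuli, it returns a one-hot list marking the comparison identical to the sample, and `None` when the sample does not appear exactly once (such trials are excluded from the dataset).

```python
def reflexivity_target(sample, comparisons):
    """One-hot position of the comparison identical to the sample,
    or None unless the sample appears exactly once among the comparisons."""
    hits = [int(c == sample) for c in comparisons]
    return hits if sum(hits) == 1 else None

print(reflexivity_target("B2", ["A3", "A3", "B2"]))  # [0, 0, 1]
print(reflexivity_target("B2", ["B2", "A1", "B2"]))  # None (sample appears twice)
print(reflexivity_target("B2", ["A1", "A3", "C4"]))  # None (sample absent)
```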

In [10]:
reflexivity_labels=[]
reflexivity_values=[]
reflexivity_y=[]
for lab in stims.keys():
    # Trials whose sample (position 0) is `lab` and where `lab` also appears among the comparisons
    sample_match=(labels[:,0]==lab)&((labels[:,1]==lab)|(labels[:,2]==lab)|(labels[:,3]==lab))
    rflxvt_labels=labels[sample_match]
    rflxvt_values=values_x[sample_match]

    # Keep only trials where the sample appears in exactly one comparison position
    exactly_one=np.sum(rflxvt_labels[:,1:]==lab, axis=1)==1
    rflxvt_values=rflxvt_values[exactly_one]
    rflxvt_labels=rflxvt_labels[exactly_one]
    # Target: one-hot position of the comparison that matches the sample
    rflxvt_y=(rflxvt_labels[:,1:]==lab)*1

    reflexivity_labels.extend(rflxvt_labels)
    reflexivity_values.extend(rflxvt_values)
    reflexivity_y.extend(rflxvt_y)

reflexivity_labels=np.array(reflexivity_labels)
reflexivity_values=np.array(reflexivity_values)
reflexivity_y=np.array(reflexivity_y)

In [11]:
def reflexivity_view():
    n_dat=random.randrange(len(reflexivity_values))
    print(n_dat)
    print(reflexivity_values[n_dat,0:12])
    print(reflexivity_values[n_dat,12:24])
    print(reflexivity_values[n_dat,24:36])
    print(reflexivity_values[n_dat,36:48])
    print(reflexivity_y[n_dat,:])
    print(reflexivity_labels[n_dat,:])

In [12]:
reflexivity_view()

1861
[0 0 0 0 0 1 0 0 0 0 0 0]
[0 0 1 0 0 0 0 0 0 0 0 0]
[0 0 1 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 1 0 0 0 0 0 0]
[0 0 1]
['B2' 'A3' 'A3' 'B2']


In [13]:
X_train_reflexivity, X_test_reflexivity, y_train_reflexivity, y_test_reflexivity = train_test_split(reflexivity_values, reflexivity_y, test_size=.4, random_state=42)

In [14]:
avg_scores_reflexivity=[]
for name, clf in zip(names, classifiers):
    clasif=OneVsRestClassifier(clf).fit(X_train_reflexivity, y_train_reflexivity)
    scr=average_precision_score(y_test_reflexivity,clasif.predict(X_test_reflexivity))
    avg_scores_reflexivity.append(scr)
    print(name,scr)

('Nearest Neighbors', 0.9965674967022462)
('Linear SVM', 0.33333333333333331)
('RBF SVM', 0.33333333333333331)
('Decision Tree', 0.38267232996054451)
('Random Forest', 0.3364817666656601)
('Neural Net', 1.0)
('AdaBoost', 0.33333333333333331)
('Naive Bayes', 0.32228568956277076)
('QDA', 0.70881861116483824)


# Results

In [15]:
pd.DataFrame(np.column_stack([avg_scores,avg_scores_reflexivity]),index=names, columns=["Identify", "Reflexivity"])

| Classifier | Identify | Reflexivity |
|---|---|---|
| Nearest Neighbors | 0.860241 | 0.996567 |
| Linear SVM | 1.0 | 0.333333 |
| RBF SVM | 0.437219 | 0.333333 |
| Decision Tree | 1.0 | 0.382672 |
| Random Forest | 0.324532 | 0.336482 |
| Neural Net | 1.0 | 1.0 |
| AdaBoost | 1.0 | 0.333333 |
| Naive Bayes | 1.0 | 0.322286 |
| QDA | 0.72035 | 0.708819 |
