# Using Simulated Data

This notebook shows how to generate your own synthetic data to test estimator performance under a variety of controlled conditions. The simulated datasets are created with the `make_classification_labels` function.

In [1]:
import numpy as np
from statswag.datasets import make_classification_labels

### Set number of samples and labelers

In [2]:
num_instances = 1000
num_labelers = 6

To use the ``make_classification_labels`` function, we supply a set of ground truth labels (``y``), the desired number of labelers (``n_labelers``), and the confusion matrices (``confusion``). For the ``confusion`` parameter, you can provide either a single confusion matrix, which is then used for every labeler, or a list containing one (potentially different) matrix per labeler.

In the example below, we build one confusion matrix for the first labeler and a second confusion matrix that is shared by all other labelers. This represents a scenario with one outlying good labeler among several poorly performing labelers.

You can use this function to generate labels representative of your own scenarios of interest.

_Note: In the simulated data, all labelers are conditionally independent.  This may be updated in future versions._
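To make the role of the confusion matrices concrete, here is a minimal NumPy sketch of how conditionally independent labels could be drawn from them. It is not the `statswag` implementation: the helper name `sample_labels_from_confusion` is hypothetical, and it assumes that row `k` of each matrix holds the probabilities of a labeler reporting each class when the true class is `k`, so every row sums to 1.

```python
import numpy as np

def sample_labels_from_confusion(y, confusions, seed=None):
    """Hypothetical helper: draw one label per instance and per labeler.

    Assumes confusions[j][k][l] = P(labeler j reports l | true class is k).
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    confusions = np.asarray(confusions, dtype=float)
    n_labelers, n_classes, _ = confusions.shape
    labels = np.empty((y.size, n_labelers), dtype=int)
    for j in range(n_labelers):
        for k in range(n_classes):
            rows = np.flatnonzero(y == k)
            # Given the true class, each labeler draws independently of the others
            labels[rows, j] = rng.choice(n_classes, size=rows.size, p=confusions[j, k])
    return labels

y_demo = np.array([0, 1, 2, 2, 1, 0])
perfect = np.eye(3)                                 # never makes a mistake
noisy = np.full((3, 3), 0.225) + 0.325 * np.eye(3)  # 0.55 on the diagonal
demo_labels = sample_labels_from_confusion(y_demo, [perfect, noisy], seed=0)
print(demo_labels.shape)  # (6, 2)
```

The first column reproduces `y_demo` exactly because the identity matrix puts all probability on the true class; the second column is correct about 55% of the time on average.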

In [3]:
# Generate random ground truth labels (0, 1, or 2)
y = np.random.randint(0, 3, num_instances)

# Create a confusion matrix for the first labeler (3 classes)
diagonal_first = 0.9
off_diagonal_first = (1.0-diagonal_first)/2.0
confusion_mat_first = [[diagonal_first,off_diagonal_first,off_diagonal_first],
                       [off_diagonal_first,diagonal_first,off_diagonal_first],
                       [off_diagonal_first,off_diagonal_first,diagonal_first]]

# Create a confusion matrix for the other labelers
diagonal_other = 0.55
off_diagonal_other = (1.0-diagonal_other)/2.0
confusion_mat_other = [[diagonal_other,off_diagonal_other,off_diagonal_other],
                       [off_diagonal_other,diagonal_other,off_diagonal_other],
                       [off_diagonal_other,off_diagonal_other,diagonal_other]]

# From the ground truth labels, simulate the labels assigned by each labeler
labels = make_classification_labels(y=y,
                                    n_labelers=num_labelers,
                                    confusion=[confusion_mat_first] +
                                    [confusion_mat_other for _ in range(num_labelers - 1)])

# Display the first 5 ground truth labels and the simulated labels (one column per labeler)
print(np.expand_dims(y[0:5],axis=1))
print('')
print(labels[0:5,:])

[[2]
 [0]
 [1]
 [2]
 [0]]

[[2 2 0 2 2 0]
 [0 2 0 0 0 2]
 [1 0 2 1 1 2]
 [2 2 2 2 2 1]
 [0 0 0 2 2 2]]


### Display the true labeler accuracies

In [4]:
from statswag.metrics import nan_accuracy
true_labeler_accuracy = np.asarray([nan_accuracy(y, labels[:, col])
                                    for col in range(labels.shape[1])])

true_labeler_accuracy

array([0.915, 0.512, 0.563, 0.556, 0.579, 0.545])
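These empirical accuracies sit close to the diagonal entries of the confusion matrices (0.9 for the first labeler, 0.55 for the others). As a quick sanity check, the expected accuracy of a labeler is the prior-weighted sum of its confusion matrix diagonal. The sketch below assumes a uniform class prior, which matches ground truth drawn with `np.random.randint(0, 3, ...)`; the helper name `expected_accuracy` is hypothetical.

```python
import numpy as np

def expected_accuracy(confusion, class_prior):
    """Hypothetical helper: sum over classes of P(class) * P(correct | class)."""
    confusion = np.asarray(confusion, dtype=float)
    class_prior = np.asarray(class_prior, dtype=float)
    return float(np.sum(class_prior * np.diag(confusion)))

uniform_prior = np.full(3, 1.0 / 3.0)
good = np.full((3, 3), 0.05) + 0.85 * np.eye(3)    # 0.9 on the diagonal
poor = np.full((3, 3), 0.225) + 0.325 * np.eye(3)  # 0.55 on the diagonal

print(round(expected_accuracy(good, uniform_prior), 3))  # 0.9
print(round(expected_accuracy(poor, uniform_prior), 3))  # 0.55
```

With 1000 instances, deviations of a few percentage points from these expected values are ordinary sampling noise.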

### Run estimators on this dataset

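Note that Majority Vote produces noticeably worse estimates of labeler accuracy than the other estimators in this notebook (a mean absolute error of 0.093 in this run), particularly for the first (outlying) labeler, whose accuracy it underestimates by 0.232.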

In [5]:
# Estimate the accuracies using the MV estimator
from statswag.estimators import MajorityVote

mv = MajorityVote()
results = mv.fit(labels)
print('Errors')
print(true_labeler_accuracy - results['accuracies'])
print('Mean Absolute Error')
print(np.mean(np.abs(true_labeler_accuracy - results['accuracies'])))

Errors
[0.232 0.055 0.074 0.057 0.082 0.058]
Mean Absolute Error
0.09300000000000001
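The large positive error for the first labeler is what one would expect from a plain majority vote: the consensus label is dominated by the five weak labelers, so whenever they outvote the good labeler, its correct answer is scored as a mistake. The sketch below illustrates that mechanism on a toy array. It is an assumption about how a basic majority-vote estimate works, not the code behind `statswag`'s `MajorityVote`, and the helper name is hypothetical.

```python
import numpy as np

def majority_vote_accuracies(labels, n_classes):
    """Hypothetical helper: score each labeler against the majority label."""
    labels = np.asarray(labels)
    # votes[i, k] = number of labelers that assigned class k to instance i
    votes = np.stack([(labels == k).sum(axis=1) for k in range(n_classes)], axis=1)
    consensus = votes.argmax(axis=1)  # ties go to the lowest class index
    # Estimated accuracy = fraction of instances where a labeler matches the consensus
    accuracies = (labels == consensus[:, None]).mean(axis=0)
    return consensus, accuracies

toy_labels = np.array([[0, 1, 1],
                       [2, 2, 0],
                       [1, 1, 1],
                       [0, 2, 2]])
consensus, est = majority_vote_accuracies(toy_labels, n_classes=3)
print(consensus)  # [1 2 1 2]
print(est)
```

Because every vote carries equal weight, a labeler that is right while the majority is wrong is penalized; weighted schemes reduce this effect by trusting reliable labelers more.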


In [6]:
# Estimate the accuracies using the IWMV estimator
from statswag.estimators import IWMV

iwmv = IWMV()
results = iwmv.fit(labels)
print('Errors')
print(true_labeler_accuracy - results['accuracies'])
print('Mean Absolute Error')
print(np.mean(np.abs(true_labeler_accuracy - results['accuracies'])))

Errors
[-0.038 -0.013 -0.014 -0.01  -0.008 -0.014]
Mean Absolute Error
0.016166666666666645


In [7]:
# Estimate the accuracies using the Spectral estimator
from statswag.estimators import Spectral

spectral = Spectral()
results = spectral.fit(labels)
print('Errors')
print(true_labeler_accuracy - results['accuracies'])
print('Mean Absolute Error')
print(np.mean(np.abs(true_labeler_accuracy - results['accuracies'])))

Errors
[-0.06545671  0.013402    0.016388    0.01128057  0.02237434  0.01188404]
Mean Absolute Error
0.02346427672216254


In [8]:
# Estimate the accuracies using the Agreement estimator
from statswag.estimators import Agreement

agreement = Agreement()
results = agreement.fit(labels)
print('Errors')
print(true_labeler_accuracy - results['accuracies'])
print('Mean Absolute Error')
print(np.mean(np.abs(true_labeler_accuracy - results['accuracies'])))

Errors
[-0.06357893 -0.01760571 -0.0569633  -0.08961293 -0.04720874 -0.08206775]
Mean Absolute Error
0.059506225311042105


In [9]:
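# Estimate the accuracies using the MLE estimator (one parameter per labeler).
# Fit 5 times and keep the fit with the highest log-likelihood.
from statswag.estimators import MLEOneParameterPerLabeler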

MLE = MLEOneParameterPerLabeler()
ll_list = []
results_list = []
for i in range(5):
    results_list.append(MLE.fit(labels))
    ll_list.append(MLE.expert_models.log_likelihood(labels,results_list[i]['class_names']))
index = np.argmax(ll_list)
print('Errors')
print(true_labeler_accuracy-results_list[index]['accuracies'])
print('Mean Absolute Error')
print(np.mean(np.abs(true_labeler_accuracy-results_list[index]['accuracies'])))

Errors
[-0.03930892  0.00889757 -0.00353322  0.00949756  0.00673285  0.00724001]
Mean Absolute Error
0.012535018639368475
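To compare the estimators at a glance, the mean absolute errors printed above can be collected and ranked. The values below are copied (rounded to four decimals) from the outputs of this single run; they will change with a different random draw of `y` and `labels`, so the ordering should be read as illustrative rather than definitive.

```python
# Mean absolute errors reported in the cells above (rounded to 4 decimals)
reported_mae = {
    "MajorityVote": 0.0930,
    "IWMV": 0.0162,
    "Spectral": 0.0235,
    "Agreement": 0.0595,
    "MLEOneParameterPerLabeler": 0.0125,
}

# Rank from smallest to largest error
ranking = sorted(reported_mae.items(), key=lambda item: item[1])
for name, mae in ranking:
    print(f"{name:<26} {mae:.4f}")
```

In this run the MLE and IWMV estimators have the smallest errors and Majority Vote the largest.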
