This repository contains codes and collected user study data for the paper Classification from Ambiguity Comparisons.
Python 3.6.2
is used. Dependencies are specified in the requirements.txt
file.
Options of the execution file simulation_study.py
are
--setting
: Settingpassive
for the sufficient budget case andactive
for the insufficient budget case.--dataset
: Selections for sufficient budget aremnist1v7
,mnist3v5
,fashion0v6
,fashion2v4
,kuzushi1v7
,kuzushi2v6
,cifar1v9
andcifar4v7
. Dataset for insufficient budget isgaussian
.--run
: Times for independent repetitions.-m
: The hyper-parameter for repetition number of a single query.-t
: The hyper-parameter for the size of the delegation subset.--noise
: The noise rate for both oracles.--eps
: The precision hyper-parameter for active learning.-n
: Number of generated data points for active learning.
Command python simulation_study.py --setting passive --dataset mnist1v7 --run 1 -m 5 -t 20 --noise 0.3
will generate the following results.
run 1/10: label 0.9665, knn 0.9881
Command python simulation_study.py --setting passive --dataset cifar1v9 --run 1 -m 5 -t 20 --noise 0.3
will finally generate the following results.
run 1/1: label 0.9655, knn 0.9980, co 0.8440
Command python simulation_study.py --setting active --dataset gaussian --run 1 --noise 0.3 --eps 0.1 -n 10000
will generate the following results.
Step 1: HS 655, ACC 0.9767
Step 2: HS 460, ACC 0.9950
Step 3: HS 31, ACC 0.9990
The data files are data/user/collect25_1.csv
and data/user/collect25_2.csv
.
Command python user_study_kuzushiji_difficulty.py
will generate the following results.
mean 2.7486, std 0.3421
Ttest_1sampResult(statistic=-5.145177699062356, pvalue=4.693217551403353e-06)
The data file is data/user/kuzushiji-medoids-simulation.csv
.
It has 20
rows, with each row indicating the results from one user.
It has 100
columns, as the explicit labeling and its difficulty are collected for 50
medoids.
The data file is data/user/kuzushiji-medoids-annotation.npz
and data/user/kuzushiji-uniform-annotation.npz
.
Inside the files, the array 'pos_res'
has size (25*25*10)
and containes the pairwise annotation for all possible pairs among 25
selected images from 10
users.
Similarly, the array 'amb_res'
in the files has the same size.
The file user_study_kuzushiji_feedback.py
has only one option --setting
.
This option can be set as medoids
or uniform
.
Command python user_study_kuzushiji_feedback.py --setting medoids
will evaluate methods on the medoids feedback.
The data file is data/user/car-simulation-annotation.npz
.
In the file, the array 'abso'
has shape (4*150)
as four users are queried on 150
images.
In user_study_car.py
, the function sim()
constructs pairwise comparison results from the 'abso'
array.
Command python user_study_car.py
will evalute methods on the data.
If you use the proposed method or any of the data in your work, we would appreciate a reference to our paper:
@online{cui2020,
author = {Zhenghang Cui and Issei Sato},
title = {Classification from Ambiguity Comparisons},
year = {2020},
eprinttype = {arXiv:2008.00645},
}