# Tutorial: loading and exploring the LabelMe dataset

## Basic imports

In [1]:
import numpy as np
from pathlib import Path

In [2]:
DIR = Path().cwd()
DIRlabelme = (DIR / ".." / "datasets" / "labelme").resolve()
DIR_module = DIRlabelme / "labelme.py"
print(DIRlabelme)

/home/tlefort/Documents/peerannot/peerannot/datasets/labelme


## Install dataset

Run this command only once.

In [3]:
# ! peerannot install $DIR_module

## Majority vote

In [4]:
! peerannot aggregate $DIRlabelme -s MV

Running aggregation mv with options {}
Aggregated labels stored at /home/tlefort/Documents/peerannot/peerannot/datasets/labelme/labels/labels_labelme_mv.npy with shape (1000,)


## Naive soft labelling

In [5]:
! peerannot aggregate $DIRlabelme -s NaiveSoft

Running aggregation naivesoft with options {}
Aggregated labels stored at /home/tlefort/Documents/peerannot/peerannot/datasets/labelme/labels/labels_labelme_naivesoft.npy with shape (1000, 8)
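The `(1000, 8)` shape above corresponds to one distribution over the 8 classes per task. As a rough illustration only, here is a minimal sketch that assumes naive soft labelling amounts to normalizing per-task vote counts, and that votes are stored as a `{task: {worker: label}}` mapping. The toy `votes` dictionary and the helper name `naive_soft` are hypothetical and not part of peerannot.

```python
import numpy as np


def naive_soft(votes, n_classes):
    """Return an (n_tasks, n_classes) array of per-task vote frequencies."""
    soft = np.zeros((len(votes), n_classes))
    for i, task_votes in enumerate(votes.values()):
        for label in task_votes.values():
            soft[i, label] += 1  # count one vote for this class
    # Normalize each row so it sums to one (every task has at least one vote here).
    return soft / soft.sum(axis=1, keepdims=True)


# Hypothetical toy votes: 3 tasks, 3 classes.
votes = {
    "0": {"w1": 0, "w2": 0, "w3": 2},
    "1": {"w1": 1},
    "2": {"w2": 2, "w3": 1},
}
soft_labels = naive_soft(votes, n_classes=3)
print(soft_labels.shape)
```

Each row is a probability vector, so a hard label can be recovered with `soft_labels.argmax(axis=1)` when needed.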


## Loading datasets

When votes are tied, majority voting returns one of the tied classes uniformly at random. The naive soft aggregation does not sample: its accuracy on aggregation is computed with a plain `np.argmax`, which always picks the first of the tied classes.
The two reported aggregation accuracies can therefore differ slightly in practice.
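The difference between the two tie-breaking rules can be seen on a toy example. This is a sketch under stated assumptions, not peerannot's implementation: the `counts` array and the `majority_vote` helper are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def majority_vote(counts, rng):
    """Pick the most voted class per task, breaking ties uniformly at random."""
    labels = np.empty(len(counts), dtype=int)
    for i, row in enumerate(counts):
        winners = np.flatnonzero(row == row.max())  # all classes with the top count
        labels[i] = rng.choice(winners)
    return labels


# Hypothetical vote counts: one row per task, one column per class.
counts = np.array([
    [2, 1, 0],  # clear winner: class 0
    [1, 1, 0],  # tie between classes 0 and 1
    [0, 2, 2],  # tie between classes 1 and 2
])
mv_labels = majority_vote(counts, rng)
argmax_labels = counts.argmax(axis=1)  # always the first maximal index
print(mv_labels, argmax_labels)
```

On tied rows the sampled label may or may not match the `argmax` label, which is enough to shift the aggregation accuracy by a few tasks.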

In [7]:
from peerannot.runners.train import load_all_data

labels_path_mv = DIRlabelme / "labels" / "labels_labelme_mv.npy"
trainset, valset, testset = load_all_data(DIRlabelme,
                                          labels_path_mv,
                                          path_remove=None,
                                          labels=labels_path_mv,
                                          data_augmentation=False)

Loading datasets
Accuracy on aggregation: 77.200%


In [11]:
labels_path_soft = DIRlabelme / "labels" / "labels_labelme_naivesoft.npy"
trainset, valset, testset = load_all_data(DIRlabelme,
                                          labels_path_soft,
                                          path_remove=None,
                                          labels=labels_path_soft,
                                          data_augmentation=False)

Loading datasets
Accuracy on aggregation: 76.900%
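The "accuracy on aggregation" lines can be read as the share of training tasks whose aggregated label matches the ground truth. A minimal sketch of that computation, assuming soft labels are first reduced with `argmax`; the helper `aggregation_accuracy` and the toy arrays are hypothetical.

```python
import numpy as np


def aggregation_accuracy(labels, truth):
    """Fraction of tasks whose aggregated label equals the true label."""
    labels = np.asarray(labels)
    if labels.ndim == 2:  # soft labels: keep the most probable class
        labels = labels.argmax(axis=1)
    return float(np.mean(labels == np.asarray(truth)))


# Hypothetical ground truth and aggregated labels for 4 tasks.
truth = np.array([0, 1, 2, 1])
hard = np.array([0, 1, 1, 1])
soft = np.array([
    [0.6, 0.4, 0.0],
    [0.1, 0.8, 0.1],
    [0.0, 0.5, 0.5],  # tie: argmax keeps class 1
    [0.2, 0.7, 0.1],
])
acc_hard = aggregation_accuracy(hard, truth)
acc_soft = aggregation_accuracy(soft, truth)
print(f"{acc_hard:.3%} {acc_soft:.3%}")
```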


## Train a network

In [8]:
! peerannot train $DIRlabelme -o labelme_mv \
            -K 8 --labels=$labels_path_mv \
            --model resnet18 --n-epochs=150 --lr=0.1 --scheduler -m 50 -m 100 \
            --num-workers=8 --pretrained

Running the following configuration:
----------
- Data at /home/tlefort/Documents/peerannot/peerannot/datasets/labelme will be saved with prefix labelme_mv
- number of classes: 8
- labels: /home/tlefort/Documents/peerannot/peerannot/datasets/labelme/labels/labels_labelme_mv.npy
- model: resnet18
- n_epochs: 150
- lr: 0.1
- scheduler: True
- milestones: (50, 100)
- num_workers: 8
- pretrained: True
- optimizer: SGD
- img_size: 224
- data_augmentation: False
- path_remove: None
- momentum: 0.9
- decay: 0.0005
- n_params: 3072
- lr_decay: 0.1
- batch_size: 64
----------
Loading datasets
Accuracy on aggregation: 77.200%
Train set: 1000 tasks
Test set: 1188 tasks
Validation set: 500 tasks
Using cache found in /home/tlefort/.cache/torch/hub/pytorch_vision_main
Using cache found in /home/tlefort/.cache/torch/hub/pytorch_vision_main
Using cache found in /home/tlefort/.cache/torch/hub/pytorch_vision_main
Successfully loaded resnet18 with n_classes=8
Training epoch:   1%|▎                       
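The configuration above combines `lr: 0.1`, `milestones: (50, 100)` and `lr_decay: 0.1`. Assuming this corresponds to a multi-step decay in the style of PyTorch's `MultiStepLR` (an assumption, not confirmed by the output), the learning rate per epoch can be sketched in plain Python. The function name `multistep_lr` is hypothetical.

```python
def multistep_lr(epoch, base_lr=0.1, milestones=(50, 100), gamma=0.1):
    """Learning rate at a 0-indexed epoch: multiply by gamma at each milestone reached."""
    n_decays = sum(epoch >= m for m in milestones)
    return base_lr * gamma ** n_decays


# Print the rate just before and after each milestone.
for epoch in (0, 49, 50, 99, 100, 149):
    print(epoch, multistep_lr(epoch))
```

Under this assumption the rate stays at 0.1 for the first 50 epochs, then is divided by 10 at epoch 50 and again at epoch 100.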

In [17]:
! peerannot train $DIRlabelme -o labelme_soft \
            -K 8 --labels=$labels_path_soft \
            --model resnet18 --n-epochs=150 --lr=0.1 --scheduler -m 50 -m 100 \
            --scheduler --num-workers=8 --pretrained

Running the following configuration:
----------
- Data at /home/tlefort/Documents/peerannot/peerannot/datasets/labelme will be saved with prefix labelme_soft
- number of classes: 8
- labels: /home/tlefort/Documents/peerannot/peerannot/datasets/labelme/labels/labels_labelme_naivesoft.npy
- model: resnet18
- n_epochs: 150
- lr: 0.1
- scheduler: True
- milestones: (50, 100)
- num_workers: 8
- pretrained: True
- optimizer: SGD
- img_size: 224
- path_remove: None
- momentum: 0.9
- decay: 0.0005
- n_params: 3072
- lr_decay: 0.1
- batch_size: 64
----------
Loading datasets
Accuracy on aggregation: 76.900%
Train set: 1000 tasks
Test set: 1188 tasks
Validation set: 500 tasks
Using cache found in /home/tlefort/.cache/torch/hub/pytorch_vision_main
Using cache found in /home/tlefort/.cache/torch/hub/pytorch_vision_main
Using cache found in /home/tlefort/.cache/torch/hub/pytorch_vision_main
Successfully loaded resnet18 with n_classes=8
Training epoch:  33%|████████▋                 | 50/150 [09:30<

KeyboardInterrupt: 

## WAUM stacked identification

In [8]:
path_votes = DIRlabelme / "answers.json"

In [5]:
! peerannot identify $DIRlabelme -K 8 --method WAUMstacked --labels $path_votes \
    --model res --n-epochs 2 --lr=0.1 \
    --maxiter-DS=50 --pretrained

Running the following configuration:
----------
- Data at /home/tlefort/Documents/peerannot/peerannot/datasets/labelme
- number of classes: 8
- labels: /home/tlefort/Documents/peerannot/peerannot/datasets/labelme/answers.json
- model: vgg16_bn
- n_epochs: 2
- lr: 0.1
- maxiter_ds: 50
- pretrained: True
- use_pleiss: False
- alpha: 0.01
- n_params: 3072
- momentum: 0.9
- decay: 0.0005
- img_size: 224
----------
Train set: 2547 tasks
Using cache found in /home/tlefort/.cache/torch/hub/pytorch_vision_main
Using cache found in /home/tlefort/.cache/torch/hub/pytorch_vision_main
Using cache found in /home/tlefort/.cache/torch/hub/pytorch_vision_main
Successfully loaded vgg16_bn with n_classes=8
Running identification with method: WAUMstacked
Finished: 100%|█████████████████████████████████| 50/50 [00:07<00:00,  6.56it/s]
epoch:   0%|                                              | 0/2 [00:03<?, ?it/s]
Traceback (most recent call last):
  File "/home/tlefort/condaenvs/phd/bin/peerannot", line 