The very first publicly available chicken re-identification dataset is hosted on 🤗 Hugging Face: huggingface.co/datasets/dariakern/Chicks4FreeID
Install the 🤗 Datasets library:

```bash
pip install datasets
```
Load the data:

```python
from datasets import load_dataset

train_ds = load_dataset("dariakern/Chicks4FreeID", split="train")
train_ds[0]
```

Output:

```
{'crop': <PIL.PngImagePlugin.PngImageFile image mode=RGB size=2630x2630 at 0x7AA95E7D1720>, 'identity': 43}
```
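Continuing from the snippet above, a sample can be inspected directly. Note that the `int2str` mapping below is an assumption that only holds if the `identity` column is stored as a `ClassLabel` feature; the integer label itself is always available:

```python
sample = train_ds[0]

sample["crop"].save("chick_0.png")  # "crop" is a PIL image; save it to disk for a look
print(sample["identity"])           # integer identity label, e.g. 43

# If "identity" is a ClassLabel feature, the integer maps back to a string name:
# print(train_ds.features["identity"].int2str(sample["identity"]))
```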
Tip

Find more information on how to work with 🤗 Datasets at huggingface.co/docs/datasets
To establish a baseline on the dataset, we explore three approaches:
- We evaluate the state-of-the-art (SotA) model in animal re-identification, MegaDescriptor-L-384, a feature extractor pre-trained on many species and identities (see the embedding sketch after this list): `timm.create_model("hf-hub:BVRA/MegaDescriptor-L-384", pretrained=True)`
- We train MegaDescriptor-L-384's underlying architecture, a Swin Transformer, in the same way it was used to build MegaDescriptor-L-384, but now on our own dataset: `timm.create_model('swin_large_patch4_window12_384')`
- We train a Vision Transformer (ViT-B/16) as a fully supervised baseline and focus on embeddings by replacing the classifier head with a linear layer (see the head-swap sketch after this list): `from torchvision.models import vit_b_16`
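As an illustration of the first approach, here is a minimal sketch (not the baseline script itself) of pulling embeddings out of the frozen MegaDescriptor-L-384 with timm; `num_classes=0` is the timm convention for removing the classifier head so the model returns pooled features:

```python
import timm
import torch
from PIL import Image
from timm.data import resolve_data_config, create_transform

# Frozen feature extractor: num_classes=0 strips the classifier head.
model = timm.create_model("hf-hub:BVRA/MegaDescriptor-L-384", pretrained=True, num_classes=0)
model.eval()

# Preprocessing derived from the model's pretrained config (resize, crop, normalize).
transform = create_transform(**resolve_data_config({}, model=model))

img = Image.open("chick_0.png").convert("RGB")  # e.g. the crop saved earlier
with torch.no_grad():
    embedding = model(transform(img).unsqueeze(0))  # shape (1, embedding_dim)
print(embedding.shape)
```

And for the third approach, a sketch of swapping the torchvision ViT-B/16 classifier head for a linear embedding layer; the embedding size here is a hypothetical placeholder, the baseline script defines its own:

```python
import torch.nn as nn
from torchvision.models import vit_b_16, ViT_B_16_Weights

vit = vit_b_16(weights=ViT_B_16_Weights.IMAGENET1K_V1)  # fully supervised starting point
embedding_dim = 128  # hypothetical; not taken from the baseline configuration
vit.heads = nn.Sequential(nn.Linear(vit.hidden_dim, embedding_dim))
```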
We evaluate in two settings, k-NN and linear probing. Metrics are computed with torchmetrics:
- mAP: `MulticlassAveragePrecision(average="macro")`
- top-1: `MulticlassAccuracy(top_k=1)`
- top-5: `MulticlassAccuracy(top_k=5)`
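A self-contained sketch of how these three metrics are instantiated and called; the tensors are random placeholders and `num_classes` stands in for the number of chicken identities:

```python
import torch
from torchmetrics.classification import MulticlassAccuracy, MulticlassAveragePrecision

num_classes = 50                       # placeholder for the number of identities
logits = torch.randn(8, num_classes)   # model outputs for a batch of 8 samples
targets = torch.randint(0, num_classes, (8,))

mAP = MulticlassAveragePrecision(num_classes=num_classes, average="macro")
top1 = MulticlassAccuracy(num_classes=num_classes, top_k=1)
top5 = MulticlassAccuracy(num_classes=num_classes, top_k=5)

print(mAP(logits, targets), top1(logits, targets), top5(logits, targets))
```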
Below are the metrics for the test set. Standard deviations are based on 3 runs:
| Setting | Evaluation | mAP | top-1 | top-5 |
|---|---|---|---|---|
| MegaDescriptor-L-384 (frozen) | k-NN | 0.649 ± 0.044 | 0.709 ± 0.026 | 0.924 ± 0.027 |
| MegaDescriptor-L-384 (frozen) | Linear | 0.935 ± 0.005 | 0.883 ± 0.009 | 0.985 ± 0.003 |
| Swin-L-384 | k-NN | 0.837 ± 0.062 | 0.881 ± 0.041 | 0.983 ± 0.010 |
| Swin-L-384 | Linear | 0.963 ± 0.022 | 0.922 ± 0.042 | 0.987 ± 0.012 |
| ViT-B/16 | k-NN | 0.893 ± 0.010 | 0.923 ± 0.005 | 0.985 ± 0.019 |
| ViT-B/16 | Linear | 0.976 ± 0.007 | 0.928 ± 0.002 | 0.990 ± 0.012 |
The most interesting observation in this table is that, even though the MegaDescriptor-L-384 feature extractor has never seen our dataset, its embeddings are still relatively helpful for identifying the chickens, even when compared to the fully supervised approaches.
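To make the two evaluation settings concrete, here is a hedged sketch using scikit-learn (the actual baseline implementation may differ): k-NN classifies a test embedding by its nearest training embeddings, while the linear probe trains a linear classifier on top of the frozen embeddings. The arrays below are random placeholders standing in for feature-extractor outputs.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
train_emb = rng.normal(size=(200, 1536))  # placeholder train embeddings
train_y = rng.integers(0, 50, size=200)   # placeholder identity labels
test_emb = rng.normal(size=(40, 1536))
test_y = rng.integers(0, 50, size=40)

# k-NN evaluation: the nearest training embedding predicts the identity.
knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(train_emb, train_y)
print("k-NN top-1:", knn.score(test_emb, test_y))

# Linear evaluation: a linear classifier trained on frozen embeddings.
linear = LogisticRegression(max_iter=1000)
linear.fit(train_emb, train_y)
print("linear top-1:", linear.score(test_emb, test_y))
```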
To reproduce the baseline results, clone the repository, install the requirements, and run the baseline script:

```bash
git clone https://github.com/DariaKern/Chicks4FreeID
cd Chicks4FreeID
pip install -r requirements.txt
python run_baseline.py
```
You can pass different options, depending on your hardware configuration:

```bash
python run_baseline.py --devices=4 --batch-size-per-device=128
```
For a full list of arguments, type:

```bash
python run_baseline.py --help
```
In a separate shell, open TensorBoard to view progress and results:

```bash
tensorboard --logdir baseline_logs
```
Note

Different low-level accelerator implementations (TPU, MPS, CUDA) yield different results. The reported results were obtained with the MPS implementation on a 64 GB Apple M3 Max chip (2023) 💻. It is recommended to run the baseline script with at least 64 GB of VRAM / shared RAM. On this device, one run takes around 9.5 hours.
- [2024/05/30] DOI created: https://doi.org/10.57967/hf/2345
- [2024/05/23] The first version of the dataset was uploaded to Hugging Face: https://huggingface.co/datasets/dariakern/Chicks4FreeID
coming soon ...
```bibtex
@misc{kern2024Chicks4FreeID,
  title={Chicks4FreeID},
  author={Daria Kern and Tobias Schiele and Ulrich Klauck and Winfred Ingabire},
  year={2024},
  doi={10.57967/hf/2345},
  note={under review}
}
```