# ePillID-benchmark

* https://github.com/usuyama/ePillID-benchmark
* https://arxiv.org/abs/2005.14288


```
@inproceedings{usuyama2020epillid,
  title={ePillID Dataset: A Low-Shot Fine-Grained Benchmark for Pill Identification},
  author={Usuyama, Naoto and Delgado, Natalia Larios and Hall, Amanda K and Lundin, Jessica},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops},
  year={2020}
}
```

In [2]:
import os

if not os.path.exists("ePillID-benchmark"):
  !git clone https://github.com/usuyama/ePillID-benchmark.git

## Download the data and unzip

In [3]:
if not os.path.exists("ePillID_data.zip"):
  !curl -o ePillID_data.zip "https://pillidwus2.blob.core.windows.net/data/ePillID_data.zip?st=2020-04-30T09%3A34%3A02Z&se=2031-05-01T02%3A38%3A00Z&sp=rl&sv=2018-03-28&sr=b&sig=Xvmr0RFWOeOSw74iyQso%2F7WQHs6lTxbtyQ9jrBTuLKI%3D"

In [4]:
if not os.path.exists("ePillID_data"):
  !unzip ePillID_data.zip

Archive:  ePillID_data.zip
  End-of-central-directory signature not found.  Either this file is not
  a zipfile, or it constitutes one disk of a multi-part archive.  In the
  latter case the central directory and zipfile comment will be found on
  the last disk(s) of this archive.
unzip:  cannot find zipfile directory in one of ePillID_data.zip or
        ePillID_data.zip.zip, and cannot find ePillID_data.zip.ZIP, period.


In [6]:
import zipfile
with zipfile.ZipFile("ePillID_data.zip", 'r') as zip_ref:
    zip_ref.extractall("/content/")
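
The `unzip` error above usually points to an incomplete or otherwise invalid download. As a minimal sketch (the helper name `check_zip` is hypothetical and not part of the ePillID repo), the archive can be validated with the standard-library `zipfile` module before extraction:

```python
import os
import zipfile


def check_zip(path):
    """Return (ok, message) describing whether `path` looks like a usable zip archive."""
    if not os.path.exists(path):
        return False, "file not found"
    if not zipfile.is_zipfile(path):
        # Often a truncated download or an error page saved under the .zip name.
        size = os.path.getsize(path)
        return False, f"not a zip archive ({size} bytes)"
    with zipfile.ZipFile(path) as zf:
        bad = zf.testzip()  # name of the first member failing its CRC check, or None
    if bad is not None:
        return False, f"corrupt member: {bad}"
    return True, "ok"


print(check_zip("ePillID_data.zip"))
```

`testzip()` reads every member, so it can take a while on a large archive. If the check fails, deleting `ePillID_data.zip` and re-running the download cell is likely the simplest remedy.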

In [7]:
if not os.path.exists("train_cv.py"):
  %cd ePillID-benchmark/src

!echo `pwd`
!ls

/content/ePillID-benchmark/src
/content/ePillID-benchmark/src
arguments.py	  image_augmentators.py  metric_utils.py       pillid_datasets.py  train_nocv.py
classif_utils.py  metrics.py		 models		       sanitytest_eval.py
configs		  metric_test_eval.py	 multihead_trainer.py  train_cv.py


## Enabling GPU on Colab

A GPU runtime must be enabled in the notebook settings:

- Open the Edit > Notebook settings menu
- Select GPU from the Hardware accelerator dropdown list


In [8]:
import torch

if not torch.cuda.is_available():
  raise Exception("GPU not available. CPU training will be too slow.")

print("device name", torch.cuda.get_device_name(0))

device name Tesla T4


## Install dependencies

The library versions on Colab may differ from the ones used in our experiments. See the repo for the full list of dependencies:
https://github.com/usuyama/ePillID-benchmark/tree/master/docker

In [9]:
!pip install azureml-sdk

Collecting azureml-sdk
  Downloading azureml_sdk-1.55.0-py3-none-any.whl (2.7 kB)
Collecting azureml-core~=1.55.0 (from azureml-sdk)
  Downloading azureml_core-1.55.0-py3-none-any.whl (3.3 MB)
Collecting azureml-dataset-runtime[fuse]~=1.55.0 (from azureml-sdk)
  Downloading azureml_dataset_runtime-1.55.0-py3-none-any.whl (2.3 kB)
Collecting azureml-train-core~=1.55.0 (from azureml-sdk)
  Downloading azureml_train_core-1.55.0-py3-none-any.whl (8.6 MB)
Collecting azureml-train-automl-client~=1.55.0 (from azureml-sdk)
  Downloading azureml_train_automl_client-1.55.0-py3-none-any.whl (137 kB)
[?25hCollecting azureml-pipeli

In [10]:
# check library versions
!pip freeze

absl-py==1.4.0
adal==1.2.7
aiohttp==3.9.3
aiosignal==1.3.1
alabaster==0.7.16
albumentations==1.3.1
altair==4.2.2
annotated-types==0.6.0
anyio==3.7.1
appdirs==1.4.4
applicationinsights==0.11.10
argcomplete==3.2.2
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
array-record==0.5.0
arviz==0.15.1
astropy==5.3.4
astunparse==1.6.3
async-timeout==4.0.3
atpublic==4.0
attrs==23.2.0
audioread==3.0.1
autograd==1.6.2
azure-common==1.1.28
azure-core==1.30.0
azure-graphrbac==0.61.1
azure-identity==1.15.0
azure-mgmt-authorization==4.0.0
azure-mgmt-containerregistry==10.3.0
azure-mgmt-core==1.4.0
azure-mgmt-keyvault==10.3.0
azure-mgmt-network==25.2.0
azure-mgmt-resource==23.0.1
azure-mgmt-storage==21.1.0
azureml-automl-core==1.55.0
azureml-core==1.55.0
azureml-dataprep==5.1.4
azureml-dataprep-native==41.0.0
azureml-dataprep-rslex==2.22.2
azureml-dataset-runtime==1.55.0
azureml-pipeline==1.55.0
azureml-pipeline-core==1.55.0
azureml-pipeline-steps==1.55.0
azureml-sdk==1.55.0
azureml-telemetry==1.55.0
a
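
The full `pip freeze` listing is long. To compare only a handful of packages against the versions pinned in the repo, a small sketch using the standard-library `importlib.metadata` may be easier to read. The helper name `installed_versions` and the package list below are assumptions, not taken from the repo:

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is not installed."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Assumed list of packages of interest; adjust it to match the repo's docker setup.
for name, version in installed_versions(["torch", "torchvision", "azureml-sdk"]).items():
    print(f"{name}: {version or 'not installed'}")
```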

## Run training and evaluation using resnet50

In [12]:
!python train_nocv.py \
  --appearance_network resnet50 \
  --pooling GAvP \
  --max_epochs 50 \
  --data_root_dir /content/ePillID_data

Streaming output truncated to the last 5000 lines.
Attempted to log image metric holdout_12:02:09_PR-curve:
<module 'matplotlib.pyplot' from '/usr/local/lib/python3.10/dist-packages/matplotlib/pyplot.py'>
Attempted to log scalar metric holdout_f_top1-acc:
0.7428115010261536
Attempted to log scalar metric holdout_f_top5-acc:
0.9217252135276794
Attempted to log scalar metric holdout_f_map:
0.8238388414025868
Attempted to log scalar metric holdout_f_map_at_1:
0.7428115015974441
Attempted to log scalar metric holdout_f_gap:
0.595498160462138
Attempted to log scalar metric holdout_f_gap_at_1:
0.8878331681093763
Attempted to log scalar metric holdout_f_micro-ap:
0.595498160462138
Attempted to log scalar metric holdout_b_top1-acc:
0.7380191683769226
Attempted to log scalar metric holdout_b_top5-acc:
0.9041533470153809
Attempted to log scalar metric holdout_b_map:
0.8204273537021196
Attempted to log scalar metric holdout_b_map_at_1:
0.7380191693290735
Attempted to log scalar metr
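
The scalar metrics in this log follow a two-line pattern: a header line naming the metric, then the value on the next line. As a rough sketch (the function name `parse_scalar_metrics` is hypothetical, and the pattern is inferred only from the output shown above), they can be collected into a dictionary for easier comparison between runs:

```python
import re

SCALAR_LINE = re.compile(r"^Attempted to log scalar metric (?P<name>\S+):\s*$")


def parse_scalar_metrics(log_text):
    """Collect {metric_name: float} from header/value line pairs in the training log."""
    metrics = {}
    lines = log_text.splitlines()
    for header, value in zip(lines, lines[1:]):
        match = SCALAR_LINE.match(header)
        if match is None:
            continue
        try:
            metrics[match.group("name")] = float(value.strip())
        except ValueError:
            # Value line missing or truncated; skip it rather than guess.
            continue
    return metrics


sample = """Attempted to log scalar metric holdout_f_top1-acc:
0.7428115010261536
Attempted to log scalar metric holdout_f_top5-acc:
0.9217252135276794
Attempted to log scalar metr"""

print(parse_scalar_metrics(sample))
```

If a metric name is logged more than once, the last value wins. In Colab, the log text could be obtained by capturing the command output, e.g. `out = !python train_nocv.py ...` followed by `"\n".join(out)`.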

In [13]:
# Check predictions
!ls outputs

eval_predictions_pilltypeid_nih_sidelbls0.01_metric_5folds_3.csv
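
To look inside the predictions file without assuming its schema, a small sketch with the standard-library `csv` module can list the column names, the row count, and the first few rows. The helper `summarize_csv` is hypothetical, and it assumes the file starts with a header row:

```python
import csv
import os


def summarize_csv(path, preview_rows=3):
    """Return (column_names, row_count, first_rows) for a CSV file with a header row."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, [])
        preview, count = [], 0
        for row in reader:
            if count < preview_rows:
                preview.append(row)
            count += 1
    return header, count, preview


# File name taken from the `ls outputs` listing above.
path = "outputs/eval_predictions_pilltypeid_nih_sidelbls0.01_metric_5folds_3.csv"
if os.path.exists(path):
    columns, n_rows, first_rows = summarize_csv(path)
    print(columns, n_rows)
    for row in first_rows:
        print(row)
```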


## Next steps

* Try train_cv.py for cross-validation
* Try different models, e.g. another ResNet variant such as resnet18, or BCNN
* Increase max_epochs (the run above used 50; roughly 200-300 epochs are needed to converge)