The dataset consists of ~22k images of predominantly front-facing dog heads, each 256x256 pixels in size. It was derived from the Tsinghua Dog Dataset.
- Download the Tsinghua Dog Dataset here.
- Install libs for preprocessing:
pip install -r requirements.txt
- Switch to the scripts folder:
cd scripts
- Run preprocessing steps 1-3:
python step_1.py
and likewise for steps 2 and 3.
- Alternatively, you can start JupyterLab and run the provided notebooks:
jupyter-lab
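If you prefer to run all preprocessing steps in one go, a small driver along the following lines could do it. This is only a sketch: the `run_steps` helper and the `step_*.py` glob pattern are assumptions made for illustration (only `step_1.py` is named above), so adjust the pattern to the actual script names.

```python
import subprocess
import sys
from pathlib import Path


def run_steps(script_dir, pattern="step_*.py"):
    """Run every script matching `pattern` in `script_dir`, in name order.

    `pattern` is an assumed naming scheme, not something the repository
    guarantees. Returns the names of the scripts that were run.
    """
    script_dir = Path(script_dir)
    # Plain lexicographic sort is enough for single-digit step numbers.
    scripts = sorted(script_dir.glob(pattern))
    executed = []
    for script in scripts:
        # check=True raises CalledProcessError on a non-zero exit status,
        # so a failing step stops the run before later steps start.
        subprocess.run([sys.executable, script.name], cwd=script_dir, check=True)
        executed.append(script.name)
    return executed
```

Calling `run_steps("scripts")` from the repository root would then run the matching scripts one after another, using the same Python interpreter that runs the driver.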
- During step 1, images are cropped according to the bounding boxes provided by the creators of the Tsinghua Dog Dataset.
- During step 2, a second dog face detector is used to extract more accurate bounding boxes and keypoints (eyes, nose, forehead). Both are used to refine the final cropped image. In addition, small images and images showing a dog from the side are filtered out.
- During step 3, the dataset is split into a training set and a validation set.
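
The logic behind the three steps can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the code in the provided scripts: the helper names, the `min_side` threshold, the `val_fraction` ratio and the fixed seed are all hypothetical, and the detector itself is left out.

```python
import random


def clamp_box(box, img_w, img_h):
    """Clip an (x_min, y_min, x_max, y_max) box to the image bounds (steps 1 and 2)."""
    x0, y0, x1, y1 = box
    return (max(0, x0), max(0, y0), min(img_w, x1), min(img_h, y1))


def is_large_enough(box, min_side=256):
    """Return True if both sides of the crop reach `min_side` pixels.

    min_side=256 is an assumed threshold, chosen only because the final
    images are 256x256; the real filter may use a different value.
    """
    x0, y0, x1, y1 = box
    return (x1 - x0) >= min_side and (y1 - y0) >= min_side


def split_train_val(items, val_fraction=0.1, seed=0):
    """Shuffle reproducibly and split into (train, val) lists (step 3).

    val_fraction and seed are assumptions, not the values used by the scripts.
    """
    items = sorted(items)  # make the result independent of the input order
    random.Random(seed).shuffle(items)
    n_val = int(len(items) * val_fraction)
    return items[n_val:], items[:n_val]
```

A clamped box uses the same (left, upper, right, lower) order that Pillow's `Image.crop` expects, so it could be passed to an image library for the actual cropping. Sorting before the seeded shuffle keeps the split stable across runs, which makes validation results comparable.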