oxford_flowers102 bad splits #3022
Comments
This is not an issue within TFDS itself, but perhaps a bug in the dataset itself: the TFDS split sizes also match Table 6 in the paper. In my opinion, the fix for this should come from the dataset itself.
A workaround would be to do something like this:

```python
>>> import tensorflow_datasets as tfds
>>> # Deliberate reordering: the 6149-example 'test' split is bound to `train`
>>> test, train, validation = tfds.load('oxford_flowers102', split=['train', 'test', 'validation'])
>>> sum(1 for _ in train)
6149
```

Perhaps we can add a warning to the dataset description. For an example of such a warning, see `tensorflow_datasets/image_classification/imagenet_resized.py`, lines 41 to 46 (at d9b91c5).
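A warning along these lines could be added. The wording below is a hypothetical sketch modeled on the `imagenet_resized` note mentioned above, not actual TFDS source; the split sizes come from the TFDS catalog:

```python
# Hypothetical warning text for the oxford_flowers102 dataset description.
# The split sizes (1020/1020/6149) are from the TFDS catalog; the wording
# is a sketch, not the real TFDS source.
SPLIT_WARNING = (
    "Note: as released by the Oxford authors, the 'train' and 'validation' "
    "splits contain 1020 examples each, while the 'test' split contains "
    "6149 examples. Papers reporting ~98% accuracy typically train on the "
    "larger split."
)

print(SPLIT_WARNING)
```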
TFDS provides the datasets as close to the original authors' releases as possible. As pointed out above, the TFDS splits match the splits as defined by the Oxford authors, so I'm marking this bug as working-as-intended.

Note: our documentation already provides the number of examples: https://www.tensorflow.org/datasets/catalog/oxford_flowers102

Or programmatically:

```python
info = tfds.builder('oxford_flowers102').info
info.splits['test'].num_examples
```

Or:

```python
test, train, validation = tfds.load('oxford_flowers102', split=['train', 'test', 'validation'])
print(len(train))
```
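As a download-free stand-in for the remapping discussed above, the swap can be illustrated with plain dictionaries (split sizes taken from the TFDS catalog; this is an illustration, not tfds API):

```python
# Split sizes as listed in the TFDS catalog for oxford_flowers102.
tfds_splits = {"train": 1020, "validation": 1020, "test": 6149}

# Remap to the convention most papers use: train on the large split,
# evaluate on the small ones.
remapped = {
    "train": tfds_splits["test"],          # 6149 examples
    "validation": tfds_splits["validation"],  # 1020 examples
    "test": tfds_splits["train"],          # 1020 examples
}

print(remapped["train"])  # matches the sum(1 for _ in train) count above
```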
Thanks all, sounds sensible. I don't know the history of when/how/why the dataset splits evolved, but I wanted to document it somewhere.
You can find the splitting and slicing doc here.
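The slicing API referenced above accepts strings like `'train+validation'` or `'test[:80%]'`. As a rough sketch of how a percent slice maps to example counts for the 6149-example 'test' split (assuming simple proportional truncation; see the linked doc for the exact rounding semantics):

```python
# Approximate the index range a percent slice covers, in the spirit of
# tfds slicing strings such as 'test[:80%]'. This is an assumption-laden
# sketch; TFDS defines the exact boundary rule in its splits doc.
def percent_slice(pct_lo, pct_hi, n):
    """Return (start, stop) indices for a [pct_lo%, pct_hi%) slice of n examples."""
    start = n * pct_lo // 100
    stop = n * pct_hi // 100
    return start, stop

start, stop = percent_slice(0, 80, 6149)
print(start, stop, stop - start)
```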
The train/val/test splits in the TFDS oxford_flowers102 don't match the established splits.

Training on the train and val splits, I only achieve around 91% accuracy with fine-tuning; it should be possible to achieve ~98%, e.g. Table 6 here. If one reads in the entire dataset and creates a random split, this is achievable. This has also been noted on Stack Overflow here.