Skip to content

Conversation

vballoli
Copy link
Contributor

Fixes #2630

@codecov
Copy link

codecov bot commented Sep 13, 2020

Codecov Report

Merging #2673 into master will decrease coverage by 0.97%.
The diff coverage is 33.33%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2673      +/-   ##
==========================================
- Coverage   73.38%   72.40%   -0.98%     
==========================================
  Files          99       95       -4     
  Lines        8791     8247     -544     
  Branches     1389     1310      -79     
==========================================
- Hits         6451     5971     -480     
+ Misses       1915     1860      -55     
+ Partials      425      416       -9     
Impacted Files Coverage Δ
torchvision/datasets/mnist.py 53.84% <33.33%> (-1.41%) ⬇️
torchvision/ops/feature_pyramid_network.py 75.82% <0.00%> (-15.39%) ⬇️
torchvision/io/video.py 69.82% <0.00%> (-10.66%) ⬇️
torchvision/io/image.py 71.73% <0.00%> (-10.52%) ⬇️
torchvision/transforms/functional_tensor.py 66.84% <0.00%> (-7.31%) ⬇️
torchvision/transforms/transforms.py 76.53% <0.00%> (-3.99%) ⬇️
torchvision/ops/roi_align.py 68.96% <0.00%> (-2.01%) ⬇️
torchvision/ops/roi_pool.py 73.07% <0.00%> (-1.93%) ⬇️
torchvision/ops/ps_roi_pool.py 73.07% <0.00%> (-1.93%) ⬇️
torchvision/ops/ps_roi_align.py 71.42% <0.00%> (-1.91%) ⬇️
... and 23 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3756b60...9c6a2d8. Read the comment docs.

Copy link
Member

@fmassa fmassa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

I've one comment, let me know what you think

Comment on lines 268 to 269
if self.target_transform is None and self.split == 'letters':
self.target_transform = lambda x: x - 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me I think it might be preferable to change classes_split_dict case for letters, if it's not compatible with the current annotations.

I think that implicitly changing in the target_transform the labels might be misleading to users, and would make previously trained models to not be valid anymore.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did think of that initially while fixing this, but the issue is the class to letter mapping is still offset by 1. The only other solution would be to add a dummy letter to position 0 to fix the class offset. For ex: 1 is A while loading the dataset, but classes_split_dict would return b. I added this fix since transforms is a cleaner approach, but adding a dummy letter to the list is preferred, I'm happy to change it to that too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a dummy letter to position 0

I think this would be the preferred approach IMO. Maybe add something like __unused__ ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I picked up @vballoli's great work to make the final changes and merge.

@fmassa I used N/A instead because COCO uses this value for the unused categories.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using N/A is fine with me

# Merged Classes assumes Same structure for both uppercase and lowercase version
_merged_classes = set(['C', 'I', 'J', 'K', 'L', 'M', 'O', 'P', 'S', 'U', 'V', 'W', 'X', 'Y', 'Z'])
_all_classes = set(list(string.digits + string.ascii_letters))
_merged_classes = {'c', 'i', 'j', 'k', 'l', 'm', 'o', 'p', 's', 'u', 'v', 'w', 'x', 'y', 'z'}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One remark concerning cosmetics, not actual functionality. The EMNIST paper describes removing the capital letters. We can actually write the classes in a way that achieves this but then the labels will look weird; something like: A, B, c, D, .... So I think @vballoli's solution is fine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think this is fine as is, but thanks for pointing it out

@datumbox datumbox requested a review from fmassa October 23, 2020 11:42
Copy link
Member

@fmassa fmassa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @vballoli and @datumbox !

@fmassa fmassa merged commit cffac64 into pytorch:master Oct 23, 2020
bryant1410 pushed a commit to bryant1410/vision-1 that referenced this pull request Nov 22, 2020
* Add float support to ColorJitter

* Fix byclass EMNIST

* Fix bymerge, balance, letters EMNIST

* Fix whitespace indent

* Revert unrelated file changes

* Revert unrelated file changes

* Removing unnecessary type conversions.

* Removing the transform and adding dummy class instead.

Co-authored-by: Vasileios Vryniotis <vvryniotis@fb.com>
vfdev-5 pushed a commit to Quansight/vision that referenced this pull request Dec 4, 2020
* Add float support to ColorJitter

* Fix byclass EMNIST

* Fix bymerge, balance, letters EMNIST

* Fix whitespace indent

* Revert unrelated file changes

* Revert unrelated file changes

* Removing unnecessary type conversions.

* Removing the transform and adding dummy class instead.

Co-authored-by: Vasileios Vryniotis <vvryniotis@fb.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Among emnist dataset splits, there are cases where labels do not match or out of list occurs.
3 participants