Skip to content

Conversation

oke-aditya
Copy link
Contributor

@oke-aditya oke-aditya commented Mar 15, 2021

Fixes #3569 .

Fix redirect for

  • Caltech
  • Omniglot
  • CelebA
  • WIDERFace

For google drive in the case of CelebA and WIDERFace it simply redirects to www.drive.google.com/<url>/edit instead of www.drive.google.com/<url>

I have modified the file id and added /edit to it. Hope it works.

@codecov
Copy link

codecov bot commented Mar 15, 2021

Codecov Report

Merging #3574 (f22016a) into master (331f126) will not change coverage.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #3574   +/-   ##
=======================================
  Coverage   79.06%   79.06%           
=======================================
  Files         105      105           
  Lines        9786     9786           
  Branches     1571     1571           
=======================================
  Hits         7737     7737           
  Misses       1568     1568           
  Partials      481      481           
Impacted Files Coverage Δ
torchvision/datasets/caltech.py 83.90% <ø> (ø)
torchvision/datasets/celeba.py 19.75% <ø> (ø)
torchvision/datasets/widerface.py 86.31% <ø> (ø)
torchvision/datasets/omniglot.py 80.85% <100.00%> (ø)
torchvision/datasets/sbd.py 81.66% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 331f126...a3df58a. Read the comment docs.

Copy link
Collaborator

@pmeier pmeier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One URL is not quite right and a few minor nitpicks. Otherwise LGTM! Thanks @oke-aditya!

@oke-aditya
Copy link
Contributor Author

Feel free to merge this. I have fixed the lint and changes look fine from my side 😄

Copy link
Collaborator

@pmeier pmeier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @oke-aditya, we had some internal discussion about this and we need to revert some changes you made:

  1. URLs that redirect to a completely other location should probably stay the same as before. For example that redirection from Caltech to Google Drive is an implementation detail on there side. They still list the original URL on the website and thus we should probably keep it.
  2. I'm not sure if adding /edit to ID for files on Google Drive works for us (we don't have a test for this), but it is at least conceptually wrong. This should probably be fixed directly in datasets.utils.download_file_from_google_drive().

I'm sorry for the back and forth.

@oke-aditya
Copy link
Contributor Author

No worries @pmeier .
I will revert changes and try to fix datasets.utils.download_file_from_google_drive() to accept /edit route too.

@pmeier
Copy link
Collaborator

pmeier commented Mar 22, 2021

I will [...] try to fix datasets.utils.download_file_from_google_drive() to accept /edit route too.

Since datasets.utils.download_file_from_google_drive() builds the correct URL from the ID, maybe we can simply add it there.

@oke-aditya
Copy link
Contributor Author

oke-aditya commented Apr 6, 2021

I had a second thought on adding /edit
If the dataset owner removes edit access and makes it read-only (no access to /edit) then it might cause future problems.
So let's play safe and avoid it!

@oke-aditya oke-aditya requested a review from pmeier April 6, 2021 19:04
@pmeier
Copy link
Collaborator

pmeier commented Apr 6, 2021

@oke-aditya Thanks for looking into that!

If the dataset owner removes edit access and makes it read-only (no access to /edit) then it might cause future problems.

I didn't know that! Is there documentation that describes this, i.e. the /edit postfix in case the file is hosted as read-only?

So let's play safe and avoid it!

Completely agree.

@pmeier pmeier requested a review from fmassa April 6, 2021 19:36
Copy link
Member

@fmassa fmassa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@fmassa fmassa merged commit d9ec551 into pytorch:master Apr 7, 2021
@oke-aditya oke-aditya deleted the fix_urls branch April 7, 2021 12:05
@oke-aditya
Copy link
Contributor Author

oke-aditya commented Apr 7, 2021

Hey @pmeier I did a little experiment if /edit would mean edit access. Actually no, it does not mean it.

I will show an example here.
Here is file on Gdrive

Link1 = https://drive.google.com/drive/folders/1rb8fyciuy-uYIPoK2D_wKtaJBGLDxiuH?usp=sharing
Link2 = https://drive.google.com/drive/folders/1rb8fyciuy-uYIPoK2D_wKtaJBGLDxiuH/edit
Link3 = https://drive.google.com/drive/folders/1rb8fyciuy-uYIPoK2D_wKtaJBGLDxiuH/view
Link4 = https://drive.google.com/drive/folders/1rb8fyciuy-uYIPoK2D_wKtaJBGLDxiuH

Try opening all the links in incognito / private windows (it is completely safe the photo in it is Ubuntu Wallpaper permissive license).

Link1 and Link4 would open and work file. But Link2 and Link3 won't. So it always safer to redirect to link4 which is the base url.

Notice that you won't be able to edit this folder, it is view only!.

@pmeier
Copy link
Collaborator

pmeier commented Apr 7, 2021

@oke-aditya Awesome, thanks a lot for the investigation!

facebook-github-bot pushed a commit that referenced this pull request Apr 13, 2021
Summary:
* redirect the urls

* adds /edit

* Apply suggestions from code review

* fix lint

* undo urls

Reviewed By: NicolasHug

Differential Revision: D27706939

fbshipit-source-id: a4a35dc5d3e873c6bb7e3d713b03d6097520c0fc

Co-authored-by: Philip Meier <github.pmeier@posteo.de>
Co-authored-by: Francisco Massa <fvsmassa@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Update datasets URLs

5 participants