
Improve LFW download error message with alternative manual download link (Kaggle) #9463

Merged
zy1git merged 8 commits into pytorch:main from wei06159:fix-dataset-link
Apr 9, 2026

Conversation

@wei06159
Contributor

Summary

torchvision.datasets.LFWPeople and torchvision.datasets.LFWPairs currently raise a ValueError stating that the LFW dataset (http://vis-www.cs.umass.edu/lfw/) is no longer available for download and must be obtained manually.
This PR keeps the existing behavior intact but improves the error message by adding a pointer to a commonly used dataset mirror (Kaggle), so users can find the dataset more easily.

Changes

Existing behavior preserved; only message text and link updated.
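
As a rough illustration of the change described above (a minimal sketch, not the actual torchvision code): requesting a download still raises a ValueError, and only the message text gains a pointer to the mirror. The class name `LFWLikeDataset` and the constant `KAGGLE_MIRROR` are hypothetical; the message wording follows the text reviewed in this PR.

```python
# Hypothetical constant; the URL itself is the mirror discussed in this PR.
KAGGLE_MIRROR = "https://www.kaggle.com/datasets/jessicali9530/lfw-dataset"


class LFWLikeDataset:
    """Hypothetical stand-in for a dataset whose automatic download is gone."""

    def __init__(self, root: str, download: bool = False) -> None:
        self.root = root
        if download:
            # Behavior is unchanged: asking for a download still fails.
            # Only the message is more helpful than before.
            raise ValueError(
                "The LFW dataset is no longer available for automatic download. "
                "Please download it manually and place it in the specified directory. "
                f"A commonly used mirror is available at: {KAGGLE_MIRROR}"
            )


try:
    LFWLikeDataset("data", download=True)
except ValueError as err:
    print(err)  # the message now names the mirror
```

Passing `download=False` (the default in this sketch) constructs the object without raising, which mirrors the "existing behavior preserved" claim.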

Related Issue

#8888

@pytorch-bot

pytorch-bot Bot commented Mar 31, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/9463

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla Bot added the cla signed label Mar 31, 2026
@spzala

spzala commented Apr 1, 2026

cc @NicolasHug @atalman - review request. Thanks!

Member

@NicolasHug NicolasHug left a comment


Hi @wei06159, thanks for the PR. Happy to add a note in the docstring and in the error message that https://www.kaggle.com/datasets/jessicali9530/lfw-dataset is a common mirror, but I think it's best to keep the original link for reference. We can mention that it is broken, though.

Comment thread torchvision/datasets/lfw.py Outdated

  base_folder = "lfw-py"
- download_url_prefix = "http://vis-www.cs.umass.edu/lfw/"
+ download_url_prefix = "https://www.kaggle.com/datasets/jessicali9530/lfw-dataset"
Member


This should probably not be changed: it alters the download URL logic, yet we are still raising a ValueError on download.

Contributor Author


Hi @NicolasHug, thank you for your review. I put back the original link and added a note to the docstring and the error message that a common mirror is available at https://www.kaggle.com/datasets/jessicali9530/lfw-dataset.
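
To illustrate the review point in this thread, here is a small sketch under a stated assumption: that download URLs are built by appending an archive file name to the prefix. The helper `archive_url`, the constant names, and the file name `lfw.tgz` are illustrative and not taken from this PR. Under that assumption, swapping the prefix for the Kaggle page URL would change every constructed URL, whereas mentioning the mirror only in the message leaves that logic alone.

```python
ORIGINAL_PREFIX = "http://vis-www.cs.umass.edu/lfw/"  # kept for reference, currently broken
KAGGLE_PAGE = "https://www.kaggle.com/datasets/jessicali9530/lfw-dataset"


def archive_url(prefix: str, filename: str) -> str:
    """Hypothetical helper: join a URL prefix and an archive file name."""
    return f"{prefix}{filename}"


# With the original prefix the result has the expected shape.
print(archive_url(ORIGINAL_PREFIX, "lfw.tgz"))
# With the Kaggle page as prefix the result is not a usable archive URL.
print(archive_url(KAGGLE_PAGE, "lfw.tgz"))
```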

@wei06159
Contributor Author

wei06159 commented Apr 8, 2026

Hi @NicolasHug, could I get another review on this? I put back the original link and added a note to the docstring and the error message that a common mirror is available at https://www.kaggle.com/datasets/jessicali9530/lfw-dataset.
Thanks.

Contributor

@zy1git zy1git left a comment


@wei06159 Hi, thanks a lot for creating this PR.
I left some review comments. Feel free to take a look. Nicolas is at a conference right now, so he might take a look once he is back.

Comment thread torchvision/datasets/lfw.py Outdated

The LFW dataset is no longer available for automatic download. Please
download it manually and place it in the specified directory.
A commonly used mirror is available at: https://www.kaggle.com/datasets/jessicali9530/lfw-dataset
Contributor


This part looks good to me.

However, the warning message does not show up on the rendered docs page (you can check via the "Preview Python docs built from this PR" link). This is a pre-existing bug, not one introduced by this PR.

If you want, feel free to fix in this PR, or open a new PR to fix it. Otherwise, you don't need to do anything on this part and I will fix it after merging your PR.

Contributor Author


Thanks, I will open a new PR to fix it.



wei06159 and others added 2 commits April 9, 2026 15:39
@zy1git zy1git merged commit e160dc8 into pytorch:main Apr 9, 2026
33 of 39 checks passed
@github-actions

github-actions Bot commented Apr 9, 2026

Hey @zy1git!

You merged this PR, but no labels were added.
The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py


4 participants